Abstract
Single nucleotide polymorphisms (SNPs) in DNA repair genes can impair protein structure and function, contributing to disease development, including cancer. Non-synonymous SNPs (nsSNPs) in the LIG3 gene are linked to genomic instability and increased cancer risk, particularly acute myeloid leukemia (AML). This study aims to identify the most deleterious nsSNPs in the LIG3 gene and potential therapeutic targets for DNA repair restoration in AML. We employed several computational approaches to analyze LIG3 nsSNPs and their pathogenicity. Subsequently, molecular docking, molecular dynamics simulation (MDS), gene expression analysis, and clinical validation of LIG3 were performed to evaluate ligand-binding affinities and protein stability and to identify discriminatory gene signatures. Out of the 12,191 mapped SNPs, 132 were nsSNPs located in the coding region. Among these, 18 nsSNPs were identified as detrimental, including 12 destabilizing and 6 stabilizing nsSNPs. Nine cancer-associated nsSNPs, including L381R and R528C, were predicted on the basis of their structural and functional impacts. Further analysis revealed key phosphorylation and methylation sites, such as 529S and 224R. MDS highlighted stable interactions of compounds AHP-MPC and DM-BFC with wild-type and R528C mutant LIG3 proteins, while R671G and V781M mutants showed instability. Protein-protein interaction networks and functional enrichment linked LIG3 to DNA repair pathways. Kaplan-Meier analysis associated high LIG3 expression with improved survival in breast cancer and AML, suggesting its role as a prognostic biomarker. This study emphasizes the mutation-specific effects of LIG3 nsSNPs on protein stability and ligand interactions. We recommend DM-BFC as a candidate for advancing personalized medicine approaches that target deleterious variants in AML treatment, pending in-vitro and in-vivo validation.
Citation: Hossen MA, Jahan UMS, Hossain MA, Asif KH, Rahman A, Ahmed S, et al. (2025) Computational investigation unveils pathogenic LIG3 non-synonymous mutations and therapeutic targets in acute myeloid leukemia. PLoS One 20(6): e0320550. https://doi.org/10.1371/journal.pone.0320550
Editor: Rituraj Purohit, CSIR-IHBT: Institute of Himalayan Bioresource Technology CSIR, INDIA
Received: February 20, 2025; Accepted: May 16, 2025; Published: June 10, 2025
Copyright: © 2025 Hossen et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: The data supporting the findings of this study are included in the article. The AlphaFold-predicted structural models of the LIG3 protein have been deposited in Figshare (DOI: 10.6084/m9.figshare.28844990).
Funding: The author(s) received no specific funding for this work.
Competing interests: The authors have declared that no competing interests exist.
1. Introduction
Acute myeloid leukemia (AML) is a prevalent and often fatal bone marrow malignancy characterized by the rapid expansion of myeloid progenitor cells in the bone marrow, which severely disrupts normal hematopoiesis [1,2]. AML has an annual incidence of around 2.4 per 100,000, which rises steadily with age and peaks at 12.6 per 100,000 among persons in the US aged 65 years or older [3]. DNA repair is a crucial molecular defense system against chemicals that cause cancer, degenerative diseases, and aging. Various repair systems exist in humans to defend the genome by fixing altered bases, DNA adducts, cross-links, and double-strand breaks (DSBs) [4]. DSB repair is crucial for maintaining genomic integrity and preventing mutations that can lead to cancer and other diseases. In higher eukaryotes, DSBs are predominantly repaired by non-homologous end joining (NHEJ), a simple mechanism that ligates broken ends without requiring homology [5]. Chromosomal translocations are facilitated by alternative non-homologous end joining (alt-NHEJ), a more recently discovered process for repairing DSBs [6]. All kinds of leukemia exhibit impaired DSB repair, with several key components of DSB repair especially affected. The Ku70/80 complex and DNA-dependent protein kinase (DNA-PK) play a role in the non-homologous end-joining process [7].
The LIG3 gene, which encodes DNA ligase III, is crucial in the emergence of AML because of its role in DNA repair mechanisms, especially NHEJ. This pathway is frequently upregulated in cancers marked by genomic instability, such as AML, in which deficiencies in DNA repair play a vital part in cancer progression and resistance to treatment [8,9]. LIG3 becomes more active when the normal system, which depends on DNA ligase IV, is not functioning properly [10]. LIG3 exists in two isoforms, LIG3-α and LIG3-β, generated by alternative splicing. LIG3-α participates in the repair of nucleic acids through the DNA repair protein XRCC1, whereas LIG3-β is found in male germ cells [11]. LIG3 is necessary for the metabolism of mitochondrial DNA. LIG3 interacts with the single-strand break repair protein XRCC1 through its C-terminal BRCT domain. Using plasmid substrates, LIG3 has been shown to possess end-joining activity in cellular extracts and in LIG4-deficient cells depleted of LIG3. This suggests that LIG3 is involved in a secondary mechanism of NHEJ for repairing DSBs [12].
SNPs are variations in DNA sequence that result from the change of a single nucleotide at the genomic level. The human genome is estimated to contain at least 3 million SNPs, occurring on average once every 300 base pairs [13]. SNP technologies are valuable for studying differences in treatment responsiveness between individuals and finding genes that cause human diseases. Moreover, the biological mechanisms behind sequence evolution can be understood by utilizing SNPs [14]. SNPs play a crucial role as markers in numerous studies that establish connections between variations in DNA sequences and changes in observable traits [15]. nsSNPs and mutations have been associated with human traits and diseases [16]. SNPs can also affect gene expression and protein function and are found throughout the genome, including in promoters, exons, and introns. Identifying SNPs may help anticipate disease severity and guide personalized therapeutic strategies. The expression levels of the LIG3 gene are associated with prognosis in several cancers, suggesting that both SNPs and expression may function as markers of clinical outcome. In particular, LIG3 SNPs have been linked to somatic mutations in numerous malignancies, underscoring their potential prognostic significance [17]. LIG3 can facilitate NHEJ even when LIG4 is absent, and it also contributes to nucleotide excision repair (NER) and homologous recombination repair (HRR) [18]. Gene polymorphisms associated with DNA repair pathways, such as those in LIG3, might contribute to the initiation and progression of Alzheimer’s disease [19]. The LIG3-XRCC1 pathway recognizes ADP-ribosylation and is necessary for the joining of Okazaki fragments in the final stages of DNA replication [18]. A number of studies have examined mutations in the LIG3 gene; however, the prediction of harmful SNPs in the LIG3 gene linked to AML has not yet been performed.
Further investigation is required to figure out the most deleterious and disease-associated SNPs in the LIG3 gene associated with other malignancies, including AML. Our computational analysis points out that the R528C, R671G, and V781M mutations in the LIG3 gene significantly affect protein structure and function. These insights might shed light on why some mutations are associated with an increased risk of disease. Therefore, our objective was to identify the most deleterious nsSNPs in the LIG3 gene and potential therapeutic targets for DNA repair restoration in AML. Additionally, we identified therapeutic targets with potential to mitigate the effects of these mutations while improving protein composition, stability, and function, opening new possibilities for cancer treatment.
2. Materials and methods
2.1 Retrieval of LIG3 nsSNPs dataset
The dataset of SNPs for the human LIG3 gene and its protein sequence (Uniprot ID: P49916) was obtained from the NCBI dbSNP (https://www.ncbi.nlm.nih.gov/snp/) (Accessed on: 5 May, 2024) and UniProtKB (https://www.uniprot.org/) (Accessed on: 5 May, 2024) databases, respectively. A total of 12,191 SNPs that belong to different functional classifications (Fig 1) were mapped within the LIG3 gene sequence. Among 12,191 SNPs, 132 were non-synonymous SNPs (nsSNPs) located in the coding area, which may result in missense or nonsense mutations, hence influencing the structure and function of the protein. In this study, our focus was on the coding region of the LIG3 protein, where we evaluated the nsSNPs. Subsequently, the 132 identified nsSNPs were extracted and subjected to detailed analysis.
The structure illustrates the superimposition of the wild-type and mutant LIG3 proteins, with mutations at positions 528 (arginine to cysteine) in panel A and B, 671 (arginine to glycine) in panel C and D, and 781 (valine to methionine) in panel E and F, respectively. Proteins are labeled with colors: orange for wild-type, cyan for mutant, and a yellow box representing the sites of mutations in three mutant proteins relative to one wild-type protein.
2.2 Screening the highly deleterious nsSNPs
We used several bioinformatics tools to assess nsSNP variations, thoroughly screening and prioritizing alterations predicted to have detrimental effects. To predict the consequences of detrimental SNPs in the human genome, we used PhD-SNP (https://snps.biofold.org/phd-snp/phd-snp.html) (Accessed on 10 May, 2024), which uses Support Vector Machines (SVMs) as classifiers and considers alterations in protein sequence, mutation locations, and mutated residues [20]. PANTHER, a biological and evolutionary database of all protein-coding genes, was used to categorize genes according to their evolutionary trajectory and functional attributes [21]. Based on the occurrence or frequency of amino acid substitutions in the query protein sequence, PolyPhen-2 categorizes mutations as possibly damaging (>0.15), probably damaging (>0.85), or benign, depending on their predicted impact on protein structure and function [22,23]. Subsequently, Predict-SNP, a consensus algorithm that incorporates MAPP, SNAP, and PolyPhen-1, provides a prediction for each mutation and improves predictive performance, showing that consensus prediction is a reliable and accurate alternative to predictions generated by individual tools [24]. SIFT was used to examine mutant protein function and to determine whether each nsSNP was likely to be tolerated or to affect protein function [25]. Integrated computational methods, including functional annotation of SNPs through conservation profiling, analysis of protein structural and functional data, and linkage of coding SNPs to gene transcripts, enable a comprehensive evaluation of the likelihood of harmful missense mutations.
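The PolyPhen-2 cut-offs and the idea of consensus prediction described above can be illustrated with a minimal Python sketch. It assumes a PolyPhen-2 probability between 0 and 1 and applies the 0.15 and 0.85 thresholds quoted in the text, labelled with PolyPhen-2's usual category names. The consensus_call helper is a hypothetical plain majority vote, not the weighted scheme implemented by Predict-SNP, and the tool outputs in the usage example are invented for illustration only.

```python
def polyphen2_class(score: float) -> str:
    """Map a PolyPhen-2 probability (0..1) to a category using the 0.15/0.85 cut-offs."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("PolyPhen-2 scores lie between 0 and 1")
    if score > 0.85:
        return "probably damaging"
    if score > 0.15:
        return "possibly damaging"
    return "benign"


def consensus_call(calls: dict) -> tuple:
    """Plain majority vote over per-tool labels ('deleterious' or 'neutral').

    Returns the consensus label and the fraction of tools voting 'deleterious'.
    A variant is called deleterious only if strictly more than half of the tools agree.
    """
    if not calls:
        raise ValueError("at least one tool call is required")
    n_deleterious = sum(1 for label in calls.values() if label == "deleterious")
    fraction = n_deleterious / len(calls)
    return ("deleterious" if fraction > 0.5 else "neutral", fraction)


# Hypothetical tool outputs for a single variant (not taken from the study).
example_calls = {
    "PhD-SNP": "deleterious",
    "PANTHER": "deleterious",
    "SIFT": "neutral",
    "PolyPhen-2": "deleterious",
}
print(polyphen2_class(0.92), consensus_call(example_calls))
```

A simple vote treats every tool equally; weighting each tool by its reported reliability would be a natural refinement.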
2.3 Functional consequences of nsSNPs on protein
We evaluated protein stability using the SVM-based web server I-Mutant 2.0 (Accessed on May 30, 2024). This method was pivotal to our study due to its capability to predict stability changes resulting from mutations [26]. MUpro (Accessed on May 31, 2024) was employed to analyze changes in protein sequences by comparing residues between the wild-type and mutant proteins [27]. Additionally, we utilized the ΔΔG free energy change values to assess protein stability through the mutation cutoff scanning matrix (mCSM) method (Accessed on May 31, 2024) [28]. A ΔΔG value greater than 0 signifies enhanced protein stability, while a value below 0 indicates that the mutation destabilizes the protein [29,30].
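The sign rule for ΔΔG stated above can be written as a short Python sketch. It assumes ΔΔG values in kcal/mol and follows the convention given in the text (positive means stabilizing, negative means destabilizing). The variant names and values below are hypothetical placeholders and are not results from this study.

```python
from collections import Counter


def stability_effect(ddg: float) -> str:
    """Classify a predicted ddG (kcal/mol): > 0 stabilizing, < 0 destabilizing."""
    if ddg > 0:
        return "stabilizing"
    if ddg < 0:
        return "destabilizing"
    return "neutral"


def summarize_stability(ddg_by_variant: dict) -> Counter:
    """Count how many variants fall into each stability class."""
    return Counter(stability_effect(value) for value in ddg_by_variant.values())


# Hypothetical ddG predictions, for illustration only.
example_ddg = {"variant_A": -1.25, "variant_B": 0.40, "variant_C": -0.10}
print(summarize_stability(example_ddg))
```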
2.4 Assessment of nsSNPs linked to cancer
We analyzed a set of amino acid substitutions arising from somatic cancer mutations using Mutation 3D (http://www.mutation3d.org/) (accessed on June 3, 2024). This tool is widely utilized to evaluate the effects of cancer-associated nsSNPs on protein function and disease progression. By employing a 3D clustering approach, the tool identifies potential cancer-driving alterations in a protein’s amino acid composition. The analysis required input data comprising the target protein and its associated mutations [31].
2.5 Structural and functional changes prediction
MutPred2 (accessed on June 7, 2024) was utilized as a method for predicting structural and functional alterations induced by amino acid variants [32]. This tool enhances the detection of harmful variants by simulating how mutations impact protein structure and function, aiding in the understanding of disease mechanisms. It also provides insights into the specific biological pathways involved in disease progression. Protein FASTA sequences, along with amino acid variations, were input into MutPred2 for analysis. The method emphasizes the integration of genetic and molecular data through machine learning techniques [33].
2.6 Structural modeling of wild-type and mutant protein using AlphaFold
The three-dimensional structure of human LIG3 was predicted using AlphaFold2, a deep learning-based protein structure prediction tool developed by DeepMind [34]. The amino acid sequence of LIG3 was retrieved from the UniProt database and submitted to the AlphaFold Protein Structure Database [35]. The model confidence was evaluated using the per-residue predicted Local Distance Difference Test (pLDDT) score, where values above 90 indicate very high confidence. The Predicted Aligned Error (PAE) matrix was used to assess inter-residue distance reliability. Additionally, the predicted TM-score (pTM) was reported to evaluate the overall accuracy of domain packing within the model. No inter-chain predicted TM-score (ipTM) was reported, indicating the prediction was monomeric or not modeled as part of a complex [36].
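As an illustration of how per-residue pLDDT scores can be summarized, the following Python sketch bins scores into confidence bands and reports the mean. The "very high" band (above 90) is stated in the text; the remaining bands (70 to 90 confident, 50 to 70 low, below 50 very low) follow the colour legend of the AlphaFold Protein Structure Database. The score list is hypothetical.

```python
from collections import Counter
from statistics import mean


def plddt_band(score: float) -> str:
    """Assign a pLDDT score (0..100) to an AlphaFold confidence band."""
    if not 0 <= score <= 100:
        raise ValueError("pLDDT lies between 0 and 100")
    if score > 90:
        return "very high"
    if score >= 70:
        return "confident"
    if score >= 50:
        return "low"
    return "very low"


def summarize_plddt(scores: list) -> dict:
    """Return the mean pLDDT and the fraction of residues in each band."""
    counts = Counter(plddt_band(s) for s in scores)
    return {
        "mean": mean(scores),
        "fraction_by_band": {band: n / len(scores) for band, n in counts.items()},
    }


# Hypothetical per-residue scores, for illustration only.
example_scores = [95, 85, 60, 40]
print(summarize_plddt(example_scores))
```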
2.7 Anticipating the modifications of protein 3D structure resulting from mutation
To assess the impact of residue substitutions on protein structure, we utilized the Project Hope server (https://www3.cmbi.umcn.nl/hope/) (accessed on June 15, 2024). This tool integrates 3D structural data and provides detailed insights into the structural differences between native and mutant protein residues. Structural impact analysis was performed using the protein sequence of LIG3 and its mutations (nsSNPs) [37]. We examined how alterations in amino acid composition affect native structures, focusing on differences in hydrophobicity, charge, and size between wild-type and mutant residues. Understanding these modifications’ influence on the protein’s three-dimensional structure is vital for elucidating its function, guiding future experimental studies, and developing novel treatments and diagnostic tools [38].
2.8 Estimation of post translation modification sites
Several computational tools, such as NetPhos 3.1 (https://services.healthtech.dtu.dk/services/NetPhos-3.1/) (accessed on June 20, 2024) and GPS-MSP 1.0 (http://msp.biocuckoo.org/), have made significant advances in identifying post-translational modification (PTM) sites, particularly for phosphorylation and methylation. These tools use machine learning and deep learning techniques to enhance prediction accuracy by incorporating sequence and structural data. We employed the NetPhos 3.1 tool, which utilizes multiple neural networks to predict phosphorylation sites on tyrosine, threonine, and serine residues, identifying potential locations for these modifications [39]. In addition, we used GPS-MSP 1.0 to predict potential methylation sites along the protein chain [40].
2.9 Protein-protein interaction network prediction
Protein-protein interactions (PPIs) were analyzed using the STRING database (v12.0) (accessed on June 22, 2024) [38]. Interaction networks were constructed based on experimental data, co-expression patterns, curated databases, and text mining. A confidence score threshold of ≥0.7 was applied to ensure high-confidence interactions. The generated networks were visualized and exported for further analysis. Functional enrichment and clustering tools within STRING were employed to identify key pathways and interaction modules [38]. Analyzing the PPI data is essential, as mutant proteins can continuously influence other proteins in the diseased state. Understanding these interactions provides insights into the underlying mechanisms of clinical conditions and aids in identifying the source protein and its associated network [41,42].
2.10 Exploring pathways with gene ontology and KEGG enrichment
Gene Ontology (GO) is a widely used knowledge-based resource that provides organized and computable information on gene functions, focusing on biological processes (BP), cellular components (CC), and molecular functions (MF). These three ontologies are integral to GO enrichment analysis [43]. We performed GO analysis using EnrichR (https://maayanlab.cloud/enrichr-kg) (accessed on June 30, 2024) to identify statistically significant associations (P < 0.05) between the input gene set and curated databases covering CC, MF, and BP [43]. Following this, we utilized SRplot (accessed on June 30, 2024) to create graphical summaries that visualize the enriched analysis results [44].
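To make the notion of a statistically significant association concrete, the sketch below computes a generic over-representation p-value from the upper tail of the hypergeometric distribution, which is equivalent to a one-sided Fisher's exact test. It is offered as an assumption-laden illustration of the idea and is not claimed to reproduce EnrichR's exact scoring; all counts in the example are hypothetical.

```python
from math import comb


def enrichment_pvalue(hits: int, set_size: int, term_size: int, background: int) -> float:
    """P(X >= hits) for X ~ Hypergeometric(background, term_size, set_size).

    hits       : input genes annotated to the term
    set_size   : number of genes in the input set
    term_size  : number of background genes annotated to the term
    background : total number of genes in the background
    """
    upper = min(set_size, term_size)
    total = comb(background, set_size)
    tail = sum(
        comb(term_size, i) * comb(background - term_size, set_size - i)
        for i in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical counts: 6 of 11 input genes fall in a 150-gene term, 20,000-gene background.
p = enrichment_pvalue(hits=6, set_size=11, term_size=150, background=20000)
print(p, p < 0.05)
```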
2.11 Superimposition and molecular layering of wild-type and mutant-type proteins
Protein structure superimposition is a key technique in structural biology for comparing protein structures to understand their evolutionary relationships, functions, and dynamics. By aligning protein structures, it reveals similarities, differences, and dynamic changes over time, highlighting functional patterns [45]. Chimera 1.16 was used to superimpose the native LIG3 protein and its mutant variants [46].
2.12 Molecular docking and pharmacokinetic profiling
Molecular docking is a key technique in drug discovery, facilitating virtual screening and drug repurposing [47]. Using AutoDock Vina (v1.2.1), we evaluated how detrimental mutations affected the binding affinity of LIG3 (UniProt ID: P49916) [48]. AutoDock Vina employed a Lamarckian Genetic Algorithm (LGA) and a semi-empirical free energy force field [49]. The LIG3 crystal structure complex, obtained from RCSB (https://www.rcsb.org/; Accessed on July 10, 2024) and analyzed with Phyre2 [50], was energy-minimized using Swiss-Pdb Viewer [51] and used to generate the mutant forms. We focused on four proteins: the wild-type and three mutant LIG3 variants with nsSNPs. Sixteen PubChem ligands (CIDs: 707801, 70687578, 116535, 59937, 408383, 676443, 718154, 722325, 609964, and 684700) were selected based on their structural similarity to known inhibitors, predicted binding affinity, and reported bioactivity [52]. These ligands were then docked against both wild-type and mutant protein structures. Ligands were converted to pdbqt format using AutoDock Vina, and grid boxes (x = 77.5488082123; y = 72.4516607666; z = 73.5725282574) were optimized for docking efficiency [53]. Docking results and ligand-protein interactions were visualized using BIOVIA Discovery Studio (v21.1.0) [54].
Pharmacokinetic properties (absorption, distribution, metabolism, and excretion) together with toxicity (ADMET) are essential parameters in the design and development of new drugs.
An in-silico pharmacokinetics approach was used to determine the ADMET properties of AHP-MPC (CID: 70687578) and DM-BFC (CID: 707801). In AML, breast cancer, hepatocellular carcinoma, and other diseases, these compounds may be evaluated as possible therapeutic agents that target LIG3 [52]. Drug-likeness and pharmacokinetic parameters such as Absorption, Distribution, Metabolism, and Excretion (ADME) of the compounds were evaluated with the SwissADME [55] and pkCSM [56] web tools. For toxicity prediction, we used the Protox III online server [57]. For this analysis, the simplified molecular input line entry system (SMILES) strings of both compounds were retrieved from the PubChem database. Lipinski’s rule of five was used to assess the drug-likeness of the compounds [58].
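Lipinski's rule of five can be expressed as a short check. The Python sketch below uses the standard thresholds (molecular weight at most 500 Da, logP at most 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors). Allowing one violation is a common convention and is treated here as an adjustable assumption; the descriptor values in the example are hypothetical and are not those of AHP-MPC or DM-BFC.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    mol_weight: float        # molecular weight in Da
    logp: float              # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def lipinski_violations(d: Descriptors) -> list:
    """List which of the four rule-of-five criteria a compound violates."""
    checks = {
        "molecular weight > 500 Da": d.mol_weight > 500,
        "logP > 5": d.logp > 5,
        "H-bond donors > 5": d.h_bond_donors > 5,
        "H-bond acceptors > 10": d.h_bond_acceptors > 10,
    }
    return [name for name, violated in checks.items() if violated]


def passes_rule_of_five(d: Descriptors, max_violations: int = 1) -> bool:
    """Assumed convention: a compound passes with at most `max_violations` violations."""
    return len(lipinski_violations(d)) <= max_violations


# Hypothetical descriptor values, for illustration only.
example = Descriptors(mol_weight=412.5, logp=3.1, h_bond_donors=2, h_bond_acceptors=6)
print(lipinski_violations(example), passes_rule_of_five(example))
```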
2.13 Molecular dynamics simulation
Molecular dynamics simulations (MDS) were performed using Desmond v24, developed by Schrödinger LLC, to validate the interactions predicted during docking analysis [59]. MDS applies Newton’s classical laws of motion to compute atomic positions and velocities over time, generating new configurations at small intervals. This approach enables the prediction of each wild-type and mutant LIG3 protein’s ligand-binding behavior under near-physiological conditions, providing dynamic insights into the stability and interaction patterns of the ligand-protein complexes [60,61]. The ligand-receptor complexes were preprocessed using the Protein Preparation Wizard, which facilitated optimization, energy minimization, and the addition of any missing residues to the protein complexes. The System Builder application was subsequently employed to construct the simulation system. The TIP3P solvent model, in an orthorhombic box, was used along with the OPLS_2005 force field at a temperature of 300 K and a pressure of 1 atm to ensure a realistic simulation environment [62–64]. Each complex was neutralized using counter ions and 0.15 M sodium chloride to replicate physiological conditions. Running on an NVIDIA GeForce RTX 4070 GPU, each 100 ns simulation took approximately 8 h. Frames were collected and analyzed using a simulation interaction diagram to examine trajectories and fluctuations [59,65].
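The statement that MDS integrates Newton's equations of motion in small time steps can be illustrated with a velocity Verlet integrator, a scheme widely used in molecular dynamics. The Python sketch below is a toy example: a single particle on a one-dimensional harmonic spring in arbitrary units. It is not Desmond's integrator, force field, or thermostat, and all parameters are assumptions chosen for illustration.

```python
def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate Newton's equation m*a = F(x) and return the list of (x, v) states."""
    trajectory = [(x, v)]
    a = force(x) / mass
    for _ in range(n_steps):
        x = x + v * dt + 0.5 * a * dt * dt      # position update
        a_new = force(x) / mass                  # force at the new position
        v = v + 0.5 * (a + a_new) * dt           # velocity update with averaged acceleration
        a = a_new
        trajectory.append((x, v))
    return trajectory


def harmonic_force(x, k=1.0):
    """Restoring force of a spring with (hypothetical) stiffness k."""
    return -k * x


# Toy system: unit mass, unit stiffness, released from x = 1 at rest.
traj = velocity_verlet(x=1.0, v=0.0, force=harmonic_force, mass=1.0, dt=0.01, n_steps=1000)
x_end, v_end = traj[-1]
total_energy = 0.5 * v_end ** 2 + 0.5 * x_end ** 2
print(x_end, v_end, total_energy)
```

For this toy system the total energy should stay close to its initial value of 0.5, which is the property that makes the scheme attractive for long trajectories.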
2.14 Analysis of gene expression discrimination using area under the curve
To evaluate the discriminatory potential of the identified gene signatures, we performed receiver operating characteristic (ROC) curve analysis using AML blast samples and mixed lineage leukemia (MLL) datasets obtained from the NCBI GEO Profiles database (https://www.ncbi.nlm.nih.gov/geoprofiles/). The SRplot ROC curve server (https://www.bioinformatics.com.cn/en?keywords=roc) was used for this analysis. Gene expression values were input into the platform, which computed the sensitivity and specificity values across thresholds and generated the ROC curve accordingly. The area under the curve (AUC) was calculated to determine the model’s ability to distinguish between the two leukemic conditions [66].
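The ROC construction described above (sensitivity and specificity across thresholds, followed by the area under the curve) can be sketched in plain Python. This is a generic illustration and not the SRplot implementation; the expression values are hypothetical, and higher values are assumed to indicate the positive class.

```python
def roc_points(positive_scores, negative_scores):
    """Return (false positive rate, true positive rate) pairs over all thresholds."""
    thresholds = sorted(set(positive_scores) | set(negative_scores), reverse=True)
    points = [(0.0, 0.0)]
    for t in thresholds:
        tpr = sum(s >= t for s in positive_scores) / len(positive_scores)  # sensitivity
        fpr = sum(s >= t for s in negative_scores) / len(negative_scores)  # 1 - specificity
        points.append((fpr, tpr))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Hypothetical expression values for two groups, for illustration only.
group_a = [0.9, 0.8, 0.4]
group_b = [0.7, 0.3, 0.2]
print(auc(roc_points(group_a, group_b)))
```

An AUC of 0.5 corresponds to no discrimination between the two groups, and 1.0 to perfect separation.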
2.15 Clinical validation of LIG3
To assess the prognostic significance of LIG3 gene expression, survival analysis was performed using the Kaplan-Meier Plotter, an established online tool that integrates clinical and gene expression data across various cancer types [67]. Patients were stratified into high- and low-expression groups based on median expression levels. Kaplan-Meier survival curves were generated to evaluate the relationship between LIG3 mRNA expression levels (high vs. low) and overall survival (OS). Hazard ratios (HR) with 95% confidence intervals (CI) and log-rank P-values were calculated, and P < 0.05 was considered statistically significant.
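The Kaplan-Meier estimate underlying such survival curves is the running product of (1 - deaths / number at risk) over the observed event times. The Python sketch below is a minimal implementation of that product-limit estimator for one group; it is not the Kaplan-Meier Plotter's code, and the follow-up times and event flags are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  : follow-up time for each patient
    events : 1 if the event (death) was observed, 0 if the patient was censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed  # deaths and censored patients leave the risk set
    return curve


# Hypothetical follow-up data (months), for illustration only.
print(kaplan_meier(times=[1, 2, 3, 4], events=[1, 0, 1, 1]))
```

In practice the estimator would be applied separately to the high- and low-expression groups and the two curves compared with a log-rank test.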
2.16 Ethical statement
This in-silico computational study did not involve human subjects and therefore did not require ethical approval. Furthermore, the authors declare that this manuscript, submitted to PLOS ONE, has been prepared with full adherence to responsible research practices and in accordance with the guidelines of publication ethics.
3. Results
3.1 Assessment of deleterious nsSNPs
We retrieved 12,191 SNPs within the LIG3 gene from dbSNP. Of these, 902 (7.40%) were missense variations, 9,685 (79.44%) were intronic variants, 398 (3.26%) were synonymous variants, 132 (1.08%) were somatic missense variants, and 1,074 (8.81%) belonged to other categories (S1 Fig). The missense variations underwent further analysis to pinpoint the most harmful SNPs. Notably, 132 nsSNPs were identified and categorized as somatic (S1 Table). Among these, we identified 18 detrimental nsSNPs that potentially affect the overall structure or function of the LIG3 protein (S2 Table).
3.2 Prediction of the effects of nsSNPs and post-translational modification on protein stability
To evaluate protein stability, we examined the 18 detrimental nsSNPs and found that 12 of them significantly reduced protein stability, while the other variants improved it. These findings were supported by reliability index (RI) values and ΔΔG free energy change values (S3 Table). A decrease in stability indicates protein destabilization, while an increase suggests stabilization. For further analysis, we concentrated solely on the missense variants of nsSNPs. Using the GPS-MSP 1.0 tool, we identified 224R as a potential methylation site on the LIG3 protein. Additionally, phosphorylation site predictions from NetPhos 3.1 identified 529S in the native protein and 666Y in the mutant as potential phosphorylation sites (S2 Fig, S4 Table).
3.3 Evaluation of cancer-associated nsSNPs and their structural and functional changes
We further predicted an increased likelihood of cancer development linked to nine specific mutations, including L381R, A432T, R614G, G799R, R806H, R528C, R528H, V781M, and R671G. These nsSNPs were divided into two groups. The first group, referred to as “covered mutations”, included L381R, A432T, R614G, G799R, and R806H. The second group, known as “clustered mutations”, consisted of R528C, R528H, V781M, and R671G (S3 Fig). These nine nsSNPs were prioritized for further investigation due to their potential cancer association. Additionally, we identified two nsSNPs (Y316C and R643W) as potentially harmful. The results included g-scores, which indicate pathogenicity, and p-values. A g-score above 0.50 suggests a mutation is likely pathogenic. Both Y316C and R643W demonstrated significant pathogenic potential, with g-scores greater than 0.80 and p-values below 0.05, highlighting their importance for further research (Table 1).
3.4 Estimating the effects of high risk nsSNPs on the structure of protein
The structural effects of high-risk nsSNPs on the LIG3 protein revealed distinct physicochemical changes between the wild-type and mutant amino acids, including variations in size, charge, and hydrophobicity. Among the nine identified nsSNPs, four mutations (V781M, L381R, A432T, and G799R) led to an increase in amino acid size, while five mutations (R528C, R671G, R528H, R614G, and R806H) resulted in size reductions. Eight of these mutations also altered the charges of the amino acids. Furthermore, mutations R528C, R671G, R528H, R614G, and R806H exhibited reduced hydrophobicity compared to their wild-type counterparts, while L381R, A432T, and G799R showed increased hydrophobicity, which could potentially influence hydrophobic interactions within the protein structure (Table 2). These results revealed that the mutant-type amino acids diverged markedly from the wild-type proteins (S5 Table). To further investigate, we generated three-dimensional (3D) models of the nine mutant LIG3 proteins. These 3D models, displayed with ribbon representations (S3 Fig), clearly highlight the structural changes induced by the mutations, providing valuable insights into the critical aspects of the LIG3 protein.
3.5 Evaluation of predicted structural reliability using confidence metrics
The AlphaFold-predicted structure of LIG3 (see DOI: 10.6084/m9.figshare.28844990) reveals a well-folded core domain with high confidence (pLDDT > 90), indicating accurate modeling of the structured regions (Table 3). In contrast, lower confidence scores (pLDDT < 70) observed at the terminal and loop regions suggest flexibility or intrinsic disorder. The predicted aligned error (PAE) plot further supports this, showing low inter-residue error within the core and increased uncertainty in peripheral segments. The predicted TM-score (pTM = 0.58) indicates moderate confidence in the global domain packing, while the absence of an inter-chain TM-score (ipTM) suggests a monomeric prediction or lack of multimer modelling. Variants such as R528C, R614G, and R671G show high pTM values (~0.59) and mean pLDDT >70, indicating preserved global structure with minor local perturbations (Table 3). In contrast, mutations like G799R, R806H, and V781M exhibit lower pTM (≤0.32) and mean pLDDT <65, suggesting significant structural destabilization and possible interface disruption (Table 3). Overall, the structure offers a reliable basis for functional and mechanistic studies of LIG3.
3.6 Functional enrichment and signaling pathways analysis
We assessed the biological characteristics of the LIG3 gene by functionally annotating its principal targets through GO enrichment analysis. A total of 23 GO terms were generated, with nine related to biological processes (BP), seven to cellular components (CC), and seven to molecular functions (MF). The sizes of the nodes represented the associated target genes, while the color gradient, ranging from green to red, indicated p-values from high to low (S4 Fig). KEGG provides a comprehensive pathway database, widely used as a knowledge resource for analyzing biological pathways and cellular activities. Using a p-value threshold of less than 0.05, KEGG enrichment analysis revealed several enriched pathways strongly associated with 11 key targets. The network view of significant KEGG pathways linked to LIG3 highlights its role in DNA repair and cellular processes (S5 Fig). LIG3, as a central node, connects to pathways involved in mitochondrial DNA repair (GO:0043504), DNA ligation (GO:0051103, GO:0006266), and mitochondrial DNA metabolic processes (GO:0032042). It is also associated with V(D)J recombination (GO:0033151), which is crucial for immune diversity. Additionally, phenotypic outcomes, such as embryonic lethality (MP:0011106, MP:0011107), reduced embryo size (MP:0001698), increased mitotic sister chromatid exchange (MP:0003701), and growth retardation (MP:0003984), further highlight LIG3’s critical role in genomic stability and development (S5 Fig).
3.7 Prediction of protein-protein interaction
The PPI network analysis revealed that LIG3 interacts with ten other proteins: APLF, PRKDC, NHEJ1, XRCC6, XRCC4, LIG4, ATM, DCLRE1C, PAXX, and PARP1 (S6 Fig). This network analysis was performed using the STRING database with a high confidence score threshold (≥0.7) to identify key interacting partners of the studied gene variants (S7 Table). The interaction network, which includes 11 nodes and 55 edges, is highly interconnected, with LIG3 at its center. The network’s PPI enrichment p-value of 1.11e-16 and an average node degree of 9.64 suggest significant functional interactions among these proteins, likely involving mutual regulation. Different edge colors were used to visually represent the protein-protein connections (S6 Fig).
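For an undirected network, the average node degree equals twice the number of edges divided by the number of nodes. The short Python sketch below applies this relation to the network size quoted above; it is a generic arithmetic check and makes no assumption about how STRING counts edges. Under this formula, 11 nodes with 55 edges form a fully connected network with an average degree of 10, whereas an average degree of 9.64 corresponds to 53 edges.

```python
def average_degree(n_nodes: int, n_edges: int) -> float:
    """Average degree of an undirected network: each edge contributes to two nodes."""
    return 2 * n_edges / n_nodes


def max_edges(n_nodes: int) -> int:
    """Number of edges in a fully connected undirected network without self-loops."""
    return n_nodes * (n_nodes - 1) // 2


print(max_edges(11))                      # upper bound on edges for 11 nodes
print(average_degree(11, 55))             # average degree with 55 edges
print(round(average_degree(11, 53), 2))   # average degree with 53 edges
```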
3.8 Superimposition of wild-type and mutant proteins
The superimposition of wild-type and mutant LIG3 proteins, performed using the Chimera tool, revealed structural changes induced by mutations (Fig 1). The R528C mutation, where arginine is replaced by cysteine, led to deviations in the loop region (Figs 1A-B). The R671G mutation, involving the substitution of arginine with glycine, resulted in a reduction of side-chain bulkiness, affecting the local structure (Figs 1C-D). The V781M mutation, where valine is replaced by methionine, caused alterations in side-chain length and packing (Figs 1E-F). These mutations induced localized structural shifts, which may impact protein stability and ligand interactions. The observed structural deviations, particularly in key residues, suggest significant consequences for LIG3 functionality and its binding dynamics.
3.9 Binding interactions
All 16 ligands were subjected to molecular docking studies against LIG3 to identify potential hit molecules for subsequent drug discovery experiments. The docking grid was manually defined to encompass the entire binding site of LIG3 based on visual inspection of the protein structure. Our analysis revealed a notable decrease in binding affinity for specific compounds due to the presence of three nsSNPs (S7 Table). The docked complexes were carefully analyzed for their binding affinity (kcal/mol) and interaction structures (Fig 2, Table 2). The R528C, V781M, and R671G mutations showed reduced binding affinities for disease-associated compounds, including AHP-MPC (CID: 70687578) and DM-BFC (CID: 707801). S8 Table presents the binding affinities (kcal/mol) of 16 AML/cancer-associated compounds against wild-type LIG3 and three mutant variants (R528C, R671G, V781M). The results demonstrate significant mutation-dependent effects on ligand binding, with CID 707801 showing the strongest wild-type affinity (−10.2 kcal/mol) but suffering substantial reductions with the R671G (−8.4) and V781M (−7.8) mutations. While most compounds exhibited decreased binding across mutants, CID 70687578 maintained relatively stable affinities (wild-type: −9.2; mutants: −9.3 to −9.9), suggesting its potential as a robust therapeutic candidate. The R671G mutation caused some of the most severe affinity losses (e.g., CID 59937: −7.8 → −5.5; CID 707801: −10.2 → −8.4), highlighting its particularly disruptive effect on LIG3’s binding pocket. Interestingly, CID 749518 showed improved binding with the R528C mutation (−6.9 → −8.6), indicating mutation-specific interactions (Table 2). In its native form, the protein formed substantial hydrogen bonds with the ligands, yet its binding affinity remains below −10.0 kcal/mol. In contrast, the mutant forms exhibited decreased binding affinities and formed fewer hydrogen bonds with the target molecules.
The upper side illustrates the interactions involving (A) R528C with DM-BFC, (B) R671G with DM-BFC, (C) V781M with DM-BFC, (D) the wild-type protein with DM-BFC, (E) the wild-type protein with AHP-MPC, (F) Y316C with AHP-MPC, (G) R643W with AHP-MPC, and (H) the wild-type protein with AHP-MPC.
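The mutation-dependent changes in docking score described in this section can be expressed as a shift relative to wild-type. The sketch below uses only the DM-BFC (CID 707801) scores quoted in the text; the function name is hypothetical, and a positive shift means a less favorable (weaker) predicted binding.

```python
# Docking scores (kcal/mol) for DM-BFC (CID 707801) as quoted in the text.
DM_BFC_SCORES = {"wild-type": -10.2, "R671G": -8.4, "V781M": -7.8}


def affinity_shift(scores, reference="wild-type"):
    """Return score(variant) - score(reference) for every non-reference variant."""
    ref = scores[reference]
    return {
        name: round(score - ref, 2)
        for name, score in scores.items()
        if name != reference
    }


shifts = affinity_shift(DM_BFC_SCORES)
for variant, delta in shifts.items():
    print(f"{variant}: {delta:+.1f} kcal/mol relative to wild-type")
```

Docking scores are approximate rankings rather than measured free energies, so shifts of this size are best read as a qualitative indication of weakened binding.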
3.10 Pharmacokinetics and toxicity profiles of ten selected compounds
The drug-likeness and ADMET profiles of the AHP-MPC and DM-BFC compounds are presented in Table 4. Drug-likeness was evaluated using Lipinski’s “rule of five,” which requires a molecular weight (MW) ≤ 500 daltons (Da), an octanol-water partition coefficient (LOGPo/w) ≤ 5, no more than 5 hydrogen bond donors, and no more than 10 hydrogen bond acceptors. The analysis confirmed that both compounds complied with Lipinski’s guidelines. Furthermore, additional physicochemical properties of the selected compounds, such as the number of rotatable bonds, heavy atoms, aromatic heavy atoms, hydrogen bond acceptors, and hydrogen bond donors (detailed in Table 4), indicate that these compounds hold promise as safe candidates for therapeutic applications. Protox III was utilized to evaluate the toxicological profiles of the screened compounds. The analysis revealed that both compounds are free from AMES toxicity, hepatotoxicity, and skin sensitization. They also showed acceptable safety in the oral acute toxicity (LD50) test conducted on rats, although the two compounds differed. DM-BFC exhibits moderate acute toxicity (LD₅₀: 0.684 mol/kg) with potential cardiotoxicity (hERG II inhibitor) and a low chronic toxicity threshold (LOAEL: 0.684 mg/kg/day). In contrast, AHP-MPC demonstrates lower acute toxicity (LD₅₀: 3.29 mol/kg), no hERG inhibition, and a significantly higher LOAEL (14.01 mg/kg/day), suggesting a safer profile (Table 4). Notably, the favorable ADME profiles and physicochemical properties of these compounds highlight their potential as promising candidates for the development of new medications (Table 4).
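A rule-of-five screen of this kind reduces to four threshold checks. The following is a minimal sketch using the conventional cutoffs (MW ≤ 500 Da, logP ≤ 5, at most 5 hydrogen bond donors, at most 10 hydrogen bond acceptors); the class, function, and descriptor values are hypothetical placeholders and are not the Table 4 values for AHP-MPC or DM-BFC.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    name: str
    mol_weight: float   # Da
    logp: float         # octanol-water partition coefficient
    h_donors: int       # hydrogen bond donors
    h_acceptors: int    # hydrogen bond acceptors


def lipinski_violations(compound):
    """Return the list of rule-of-five criteria that the compound violates."""
    rules = {
        "MW <= 500": compound.mol_weight <= 500,
        "logP <= 5": compound.logp <= 5,
        "HBD <= 5": compound.h_donors <= 5,
        "HBA <= 10": compound.h_acceptors <= 10,
    }
    return [rule for rule, passed in rules.items() if not passed]


# Hypothetical descriptor values, for illustration only.
candidate = Compound("example-ligand", mol_weight=350.4, logp=3.1, h_donors=2, h_acceptors=5)
violations = lipinski_violations(candidate)
print(candidate.name, "passes" if not violations else f"violates: {violations}")
```

Returning the list of violated criteria, rather than a single boolean, makes it easy to apply the common relaxation that tolerates one violation.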
3.11 Molecular dynamics simulation
To improve the accuracy of molecular docking, we performed 100 ns MDS to assess dynamic stability and binding free energy. Through 100 ns MDS, we characterized the interactions of AHP-MPC and DM-BFC with wild-type LIG3 and three pathogenic variants (R528C, V781M, R671G). Structural analyses revealed ligand-specific and mutation-dependent effects on protein stability. RMSD analysis demonstrated that AHP-MPC maintained stable binding with wild-type (4–6 Å) and R528C variants (Figs 3A-B), while inducing conformational instability in R671G and V781M mutants. In contrast, DM-BFC exhibited stabilizing effects across all systems (3–6 Å), with particularly enhanced stability observed for wild-type and R528C complexes (Figs 3A-B). RMSF analysis further revealed that DM-BFC binding conferred structural rigidity to wild-type LIG3, while R671G and V781M mutants displayed increased backbone flexibility, particularly within residues 300–400 and the C-terminal domain (residues 500+) (Figs 3C-D). Notably, the V781M-AHP-MPC complex exhibited pronounced C-terminal destabilization. The R528C mutant maintained wild-type-like stability under both ligand conditions, as evidenced by consistently low RMSF values. These findings demonstrate that while wild-type and R528C LIG3 form stable complexes with both ligands, the R671G and V781M mutations confer structural instability that is particularly pronounced with AHP-MPC binding (Figs 3C-D). The differential stabilization effects observed between ligands and variants suggest that DM-BFC may represent a more robust therapeutic scaffold for targeting both wild-type and R528C LIG3 proteins (Figs 3C-D).
The root mean square fluctuations (RMSF) values of the wild-type LIG3 protein, three mutant-type LIG3 proteins (R528C, R671G, and V781M), and the two ligands (DM-BFC and AHP-MPC) are utilized to evaluate the structural changes of proteins.
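For readers unfamiliar with the two metrics, RMSD measures how far a whole structure has moved from a reference frame, whereas RMSF measures how much each atom fluctuates around its own average position. The sketch below is a simplified illustration on a toy trajectory; it assumes the frames have already been aligned to the reference and is not the analysis pipeline used in this study.

```python
import math


def rmsd(frame, reference):
    """RMSD between two pre-aligned frames, each a list of (x, y, z) tuples."""
    squared = sum(
        (a - b) ** 2
        for atom, ref_atom in zip(frame, reference)
        for a, b in zip(atom, ref_atom)
    )
    return math.sqrt(squared / len(reference))


def rmsf(trajectory):
    """Per-atom RMSF over a trajectory given as a list of frames."""
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    values = []
    for i in range(n_atoms):
        # Time-averaged position of atom i.
        mean = [sum(frame[i][d] for frame in trajectory) / n_frames for d in range(3)]
        # Mean squared deviation of atom i from its average position.
        msf = sum(
            sum((frame[i][d] - mean[d]) ** 2 for d in range(3))
            for frame in trajectory
        ) / n_frames
        values.append(math.sqrt(msf))
    return values


# Toy trajectory: two atoms, two frames; only the second atom moves.
toy_trajectory = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
]
print("RMSD of frame 2 vs frame 1:", rmsd(toy_trajectory[1], toy_trajectory[0]))
print("Per-atom RMSF:", rmsf(toy_trajectory))
```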
The rGyr analysis provided insights into the compactness and structural integrity of the wild-type LIG3 protein and its mutant variants (R528C, V781M, R671G) in the presence of DM-BFC and AHP-MPC (Fig 4A). With DM-BFC, the wild-type protein exhibited the highest rGyr values (~32–36 Å), reflecting considerable structural expansion, particularly in the early phase. In contrast, the R528C and V781M mutants displayed consistently lower rGyr values (~26–28 Å), indicating a more compact and stable conformation, while the R671G mutant showed intermediate values (~28–32 Å). When bound to AHP-MPC, all variants exhibited reduced rGyr values (~26.5–28.5 Å) with minimal variation, suggesting enhanced stability (Fig 4B). Notably, the R528C mutant demonstrated the lowest rGyr values under both ligand conditions, highlighting its superior compactness and structural integrity. SASA analysis further elucidated the solvent exposure of the wild-type and mutant proteins (Figs 4C, D). In the presence of DM-BFC, the wild-type protein had the highest SASA values (~30,000–31,500 Ų), indicative of greater solvent accessibility and an expanded structure. Conversely, the R528C and V781M mutants exhibited significantly lower SASA values (~26,000–28,000 Ų), suggesting a more compact conformation with reduced solvent exposure. The R671G mutant displayed intermediate values (~28,000–29,000 Ų). With AHP-MPC, all variants showed decreased SASA values compared to DM-BFC, with the wild-type retaining higher exposure (~29,000 Ų) and the R528C mutant displaying the lowest, reinforcing its structural compactness. PSA and MolSA analyses revealed distinct differences between the wild-type and mutant proteins (Fig 5). For PSA, the wild-type exhibited the highest values with DM-BFC (~15,000 Ų), suggesting greater polar solvent accessibility, while the R528C and V781M mutants showed lower values (~13,500–14,000 Ų), consistent with a more compact structure. 
The R671G mutant had intermediate PSA values (~14,200 Ų) (Fig 5A). Upon binding AHP-MPC, PSA values decreased across all variants (~13,200–14,000 Ų), indicating ligand-induced stabilization (Fig 5B). Similarly, MolSA analysis demonstrated that the wild-type protein had the highest MolSA value (~27,000 Ų) with DM-BFC, whereas the R528C mutant exhibited the lowest (~24,000 Ų) (Fig 5C). This trend persisted with AHP-MPC, where further reductions in MolSA values underscored the compact structural conformations (Fig 5D).
The wild-type LIG3 protein and three mutant-type LIG3 proteins (R528C, R671G, and V781M) were assessed using solvent accessible surface area (SASA) values when interacting with two ligands (DM-BFC and AHP-MPC).
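The radius of gyration (rGyr) used above as a compactness measure is the mass-weighted root mean square distance of atoms from the center of mass. The following minimal sketch illustrates the calculation on toy coordinates; the function name and coordinates are hypothetical, and equal masses are assumed when none are given.

```python
import math


def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration for a list of (x, y, z) coordinates."""
    if masses is None:
        masses = [1.0] * len(coords)  # assumption: equal masses
    total_mass = sum(masses)
    center = [
        sum(m * c[d] for m, c in zip(masses, coords)) / total_mass
        for d in range(3)
    ]
    weighted_sq = sum(
        m * sum((c[d] - center[d]) ** 2 for d in range(3))
        for m, c in zip(masses, coords)
    )
    return math.sqrt(weighted_sq / total_mass)


# Toy structures: the second is the first scaled by a factor of two.
compact = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, -1.0, 0.0)]
expanded = [(2.0, 0.0, 0.0), (-2.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, -2.0, 0.0)]
print("compact:", radius_of_gyration(compact))
print("expanded:", radius_of_gyration(expanded))
```

A lower value over the course of a simulation indicates a more tightly packed structure, which is how the lower rGyr values of the R528C mutant are interpreted above.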
3.12 Predictive utility of gene signatures
The ROC analysis yielded an AUC value of 0.683, indicating a moderate classification performance of the selected biomarkers in distinguishing between AML blast samples and mixed lineage leukemia profiles (Fig 6, S9 Table). This result suggests that the analyzed gene expression patterns hold predictive value and could serve as potential diagnostic indicators with further validation. Although the model does not achieve perfect discrimination, the AUC above 0.65 reflects a meaningful level of separation between the two conditions, supporting its utility as a preliminary screening tool in leukemia subtyping studies.
The area under the curve (AUC) was 0.683, indicating moderate discriminatory power of the selected gene expression signatures between the two leukemia types. Analysis was performed using the SRplot ROC curve database based on data retrieved from NCBI GEO profiles.
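An AUC can be read as the probability that a randomly chosen sample from one class receives a higher score than a randomly chosen sample from the other, with ties counted as one half. The sketch below computes it directly from that definition; the scores are hypothetical and are unrelated to the GEO data analyzed here, and the SRplot tool may compute the value differently.

```python
def roc_auc(scores_positive, scores_negative):
    """AUC as the fraction of positive-negative pairs ranked correctly (ties = 0.5)."""
    wins = 0.0
    for p in scores_positive:
        for n in scores_negative:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))


# Hypothetical classifier scores for two groups of samples.
group_a_scores = [0.9, 0.8, 0.4]
group_b_scores = [0.7, 0.3, 0.2]
auc = roc_auc(group_a_scores, group_b_scores)
print(f"AUC = {auc:.3f}")
```

On this scale 0.5 corresponds to chance and 1.0 to perfect separation, which is why a value of 0.683 is described as moderate.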
3.13 Association of LIG3 gene in various cancers
We further explored the association between the LIG3 gene and the survival rate for patients with breast cancer, bladder cancer, AML, and hepatocellular carcinoma. The Kaplan-Meier analysis demonstrated a significant correlation between high LIG3 expression and improved survival in breast cancer (HR: 0.81, p = 0.00093) and AML (HR: 0.67, p = 3e-05). However, no significant association was observed in bladder cancer (HR: 0.93, p = 0.16) or hepatocellular carcinoma (HR: 0.93, p = 0.32) (Fig 7). These findings suggest that LIG3 may serve as a prognostic biomarker in breast cancer and AML but not in bladder or liver cancers, underscoring its potential cancer-type-specific relevance in survival outcomes.
“P” denotes p-values, while “HR” stands for hazard ratio.
The statistical power of Kaplan-Meier survival analysis is significantly influenced by sample size, as larger cohorts enhance the accuracy and reliability of survival estimates. A sufficient sample size ensures narrower confidence intervals, reduces variability, and increases the likelihood of detecting true survival differences between groups. In contrast, smaller sample sizes may lead to wide confidence intervals and reduced statistical power, making it challenging to identify subtle but meaningful survival trends. To address this, we evaluated our sample size in comparison to previous studies and ensured adequate statistical significance using the log-rank test and hazard ratio analysis. These considerations strengthen the robustness and interpretability of our survival findings.
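The Kaplan-Meier estimator underlying these curves multiplies, at each observed event time, the fraction of at-risk patients who survive that time. The sketch below is a simplified illustration with hypothetical follow-up data; it omits confidence intervals and the log-rank test and is not the tool used for Fig 7.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times:  follow-up time for each patient
    events: 1 if the event (death) was observed, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed  # events and censored patients both leave the risk set
    return curve


# Hypothetical cohort of five patients (time in months).
follow_up = [2, 3, 3, 5, 8]
observed = [1, 1, 0, 1, 0]
for t, s in kaplan_meier(follow_up, observed):
    print(f"t = {t}: S(t) = {s:.2f}")
```

Because each step divides by the number still at risk, the later steps of a curve rest on few patients in a small cohort, which is the source of the wide confidence intervals discussed above.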
4. Discussion
This study aimed to uncover the potential influence of genetic variants (nsSNPs) on gene function, particularly their effects on protein structure, function, and their relevance to AML through an analysis of the LIG3 gene. We identified 12,191 SNPs in the LIG3 gene, with 902 (7.4%) being missense variants. Among these, 18 were classified as detrimental due to their potential to disrupt protein structure or function. This finding aligns with previous studies that highlight the importance of missense mutations in altering protein function, particularly in DNA repair genes like LIG3 [68,69]. The identification of 132 somatic missense variants further underscores the relevance of LIG3 in cancer biology, as somatic mutations are often drivers of oncogenesis [70]. We further predicted that 12 of the 18 detrimental nsSNPs significantly reduced protein stability, while the remaining six increased stability. We excluded the stabilizing mutations because they were outside the scope of our research; changes in protein stability nonetheless affect structural conformation and function [71]. Protein stability is crucial for maintaining functional integrity, and destabilizing mutations can lead to loss of function or misfolding, which is often associated with disease [72]. The identification of potential methylation and phosphorylation sites (224R, 529S, and 666Y) further highlights the role of post-translational modifications in regulating LIG3 gene activity. These findings align with previous research indicating that phosphorylation and methylation can regulate the activity of DNA repair proteins, highlighting their crucial role in connecting genetic variations to phenotypic outcomes [68,69,73]. Nine nsSNPs were linked to an increased risk of AML, showing significant potential for oncogenic transformation. 
The mutant residue was larger than the wild-type residue for mutations V781M, L381R, A432T, and G799R and smaller for mutations R528C, R671G, R528H, R614G, and R806H. Changes in residue size due to mutations can disrupt the folding and spatial arrangement of the protein, altering its stability and functionality [74]. Categorizing mutations as “covered” or “clustered” helps interpret their structural and functional consequences. Covered mutations lie within key functional sites, whereas clustered mutations are spatially adjacent in the 3D structure, potentially influencing local folding, stability, or molecular interactions [72,75]. Notably, Y316C and R643W were identified as highly pathogenic, with g-scores > 0.80, suggesting their potential as biomarkers for cancer risk assessment. This aligns with studies that have identified specific nsSNPs in DNA repair genes as predictive markers for cancer susceptibility [76,77].
The study revealed significant physicochemical changes in the LIG3 protein due to high-risk nsSNPs, including alterations in size, charge, and hydrophobicity. These changes can disrupt PPI and ligand binding, which are critical for LIG3’s role in DNA repair. Charge modifications may affect active sites along with PPI, resulting in functional impairments, such as ineffective DNA repair, which may play a role in diseases like cancer [78,79]. The 3D models of mutant proteins provided visual evidence of structural deviations, supporting the hypothesis that these mutations impair protein function. Similar structural analyses have been used to elucidate the impact of nsSNPs in other DNA repair proteins, such as BRCA1 [80]. GO and KEGG pathway analyses highlighted the involvement of LIG3 gene in DNA repair, mitochondrial DNA metabolism, and V(D)J recombination. The interaction of the LIG3 protein with other ligase proteins, such as LIG1 and LIG4, suggests a multifaceted role in DNA repair pathways that may influence the formation and progression of AML [10]. Furthermore, GO enrichment analysis of the LIG3 gene provided valuable insights into its function in DNA repair and genomic integrity by identifying over-represented GO terms associated with LIG3 [81]. These findings are consistent with LIG3’s known role in maintaining genomic stability [82]. The PPI network analysis further revealed the interactions of LIG3 protein with key DNA repair proteins, including XRCC4, PARP1, and ATM. These interactions are critical for NHEJ and base excision repair (BER), pathways essential for maintaining genomic integrity [83]. The high PPI enrichment value (1.11e-16) suggests that LIG3 is a central player in these pathways, further emphasizing its importance in cancer biology [84]. The superimposition of wild-type and mutant LIG3 proteins revealed significant structural deviations, particularly in loop regions and side-chain packing. 
These changes can affect protein stability and ligand binding, as seen in other DNA repair proteins like XRCC1 [85]. The observed structural shifts provide a mechanistic basis for the functional impairments associated with these mutations. We found that mutations like R528C, V781M, and R671G reduced binding affinity for disease-associated compounds. This is consistent with previous research showing that nsSNPs can disrupt ligand binding, leading to functional deficits [86]. The reduced hydrogen bonding in mutant forms further supports the idea that these mutations impair protein-ligand interactions. The drug-likeness and ADMET profiles of AHP-MPC and DM-BFC suggested that these compounds are promising candidates (molecular weights <500 g/mol) for therapeutic development. Their compliance with Lipinski’s rule of five and favorable toxicity profiles align with criteria for drug development [62]. These findings are significant for developing targeted therapies in cancers linked to LIG3 dysfunction, suggesting that the identified compounds could serve as promising candidates for future in vivo drug evaluations for AML. The selection of the 16 ligands was based on their structural resemblance to known inhibitors, predicted binding affinities, and documented bioactivity. Key selection criteria included favorable ADMET profiles, adherence to Lipinski’s rule of five, and the presence of essential functional groups for protein-ligand interactions [87]. Additionally, molecular docking pre-screening was conducted to identify ligands with high binding potential. This systematic approach ensured the selection of promising candidates for further computational and functional analysis [49]. To enhance the predictive accuracy of molecular docking, we performed 100 ns MDS. These refinements enabled a more reliable evaluation of ligand-protein interactions by incorporating molecular flexibility and solvent effects, which are often overlooked in rigid docking protocols. 
Key parameters, including interaction energy, conformational stability, and hydrogen bond persistence, were monitored throughout the simulations. The resulting binding affinities showed strong agreement with initial docking predictions, validating the robustness of our integrative computational workflow [88,89]. MDS provided mechanistic insights into the dynamic behavior of the LIG3-ligand complexes, revealing that the R671G and V781M mutations significantly destabilize protein-ligand interactions, particularly with AHP-MPC. In contrast, both the wild-type and R528C variants maintained stable binding conformations across ligands, with DM-BFC conferring enhanced stability. These observations highlight the ligand- and mutation-specific nature of structural perturbations and suggest that DM-BFC may serve as a more effective scaffold for therapeutic development targeting wild-type and R528C LIG3. Our findings underscore the importance of post-docking refinement in structure-based drug discovery and in elucidating mutation-induced changes in protein dynamics and ligand responsiveness [90–92]. Consistent with previous reports [91–93], the R528C variant’s preserved structural integrity suggests it may retain residual function, thus representing a viable target for precision therapy. In addition, ROC curve analysis demonstrated an AUC of 0.683, reflecting a moderate ability of the selected gene expression signatures to differentiate between AML blast samples and mixed lineage leukemia (S9 Table). Although not indicative of perfect classification, this result underscores the potential diagnostic value of LIG3-associated expression patterns [94]. Further refinement and integration with additional molecular markers could enhance predictive accuracy in future studies. The Kaplan-Meier analysis demonstrated that high LIG3 expression correlates with improved survival in breast cancer and AML but not in bladder or liver cancers. 
This cancer-type-specific association highlights the complex role of LIG3 in different malignancies. Our findings highlighted the cancer-type-specific relevance of LIG3 expression in survival outcomes, warranting further investigation into its functional mechanisms. Similar findings have been reported for other DNA repair genes, such as BRCA1 and BRCA2, which show tissue-specific effects in cancer prognosis [80,89].
5. Conclusion
AML is a genetically heterogeneous malignancy driven by multiple mutations that influence its initiation, progression, and therapeutic response. This study presents a comprehensive analysis of the LIG3 gene, a key component of the NHEJ pathway, which plays a critical role in DNA repair, particularly in the repair of DNA double-strand breaks (DSBs). Through the analysis of 132 missense SNPs, we identified 12 destabilizing mutations, nine of which were associated with cancer, suggesting their potential pathogenic impact on protein structure and function. Molecular docking studies identified two promising ligands, DM-BFC and AHP-MPC, exhibiting high binding affinity for both mutant and wild-type LIG3 proteins, indicating their potential as therapeutic agents. Pharmacokinetic profiling and molecular dynamics simulations further supported the suitability of these compounds for future preclinical evaluation. These findings highlight the critical role of the LIG3 gene in maintaining genomic integrity and suggest its potential as a therapeutic target for AML. However, further experimental validation is required to confirm these in-silico predictions, elucidate the molecular mechanisms underlying the identified mutations, and evaluate the clinical applicability of the proposed ligands. This study provides a foundation for the development of targeted therapeutics and personalized treatment strategies for AML and other diseases associated with LIG3 gene dysfunction.
Supporting information
S1 Table. List of 132 nsSNPs + Somatic variants of the LIG3 gene in the NCBI dbSNP database.
https://doi.org/10.1371/journal.pone.0320550.s001
(DOCX)
S2 Table. High risk nsSNPs identified by eight computational tools.
https://doi.org/10.1371/journal.pone.0320550.s002
(DOCX)
S3 Table. List of nsSNPs affecting protein stability detected using I-Mutant 2.0 and MUpro.
https://doi.org/10.1371/journal.pone.0320550.s003
(DOCX)
S4 Table. Estimation of LIG3 phosphorylation sites utilizing NetPhos 3.1 in both wild-type and mutant-type variants.
https://doi.org/10.1371/journal.pone.0320550.s004
(DOCX)
S5 Table. Structural alterations, mutations in conserved domains, and amino acid characteristics of the wild-type and mutant-type amino acids predicted using Project HOPE.
https://doi.org/10.1371/journal.pone.0320550.s005
(DOCX)
S6 Table. Functions of proteins linked with LIG3 gene in PPI.
https://doi.org/10.1371/journal.pone.0320550.s006
(DOCX)
S7 Table. Analysis of the binding affinity of wild-type LIG3 assessed to its mutant variants, along with the associated interacting residues.
https://doi.org/10.1371/journal.pone.0320550.s007
(DOCX)
S8 Table. Screening of 16 compounds that are associated with AML and several cancers.
https://doi.org/10.1371/journal.pone.0320550.s008
(DOCX)
S9 Table. Summary of GEO profiles datasets used for ROC curve analysis comparing AML blast samples and mixed lineage leukemia (MLL) expression profiles.
https://doi.org/10.1371/journal.pone.0320550.s009
(DOCX)
S1 Fig. A clustered pyramid visually depicts the quantity and arrangement of SNPs within the human LIG3 gene, sourced from the dbSNP database (nsSNPs: 902; synonymous SNPs: 398; intronic SNPs: 9685; nsSNPs +Somatic: 132; others: 1074).
https://doi.org/10.1371/journal.pone.0320550.s010
(DOCX)
S2 Fig. Possible targeted phosphorylation and methylation sites as anticipated by GPS-MSP 1.0 and NetPhos 3.1 (using IBS software).
https://doi.org/10.1371/journal.pone.0320550.s011
(DOCX)
S3 Fig. The mutation 3D server identified certain nsSNPs as potential cancer-causing mutations (red mark).
Red indicates clustered mutations, whereas blue signifies covered mutations. A represents cluster-1, which contains five cancer-related nsSNPs (L381R, A432T, R614G, G799R, and R806H), and B represents cluster-2, which contains four cancer-related nsSNPs (R528C, R528H, V781M, and R671G) (red mark).
https://doi.org/10.1371/journal.pone.0320550.s012
(DOCX)
S4 Fig. Assessment of the LIG3 gene with a deep focus on Gene Ontology (GO) pathways, specifically Biological Process (BP), Cellular Component (CC), and Molecular Function (MF).
https://doi.org/10.1371/journal.pone.0320550.s013
(DOCX)
S5 Fig. Significant KEGG pathways of LIG3 were represented in network view.
The findings for the pathway term results were sorted based on the combined score (P-value).
https://doi.org/10.1371/journal.pone.0320550.s014
(DOCX)
S6 Fig. PPI network of the LIG3 protein generated with the STRING database.
Lines represent the connections between proteins, while circles represent the interacting proteins.
https://doi.org/10.1371/journal.pone.0320550.s015
(DOCX)
References
- 1. Shimony S, Stahl M, Stone RM. Acute myeloid leukemia: 2023 update on diagnosis, risk-stratification, and management. Am J Hematol. 2023;98(3):502–26. pmid:36594187
- 2. Löwenberg B, Downing JR, Burnett A. Acute myeloid leukemia. N Engl J Med. 1999;341(14):1051–62. pmid:10502596
- 3. Johansson B, Harrison CJ. Acute myeloid leukemia. Cancer Cytogenetics: Chromosomal and Molecular Genetic Aberrations of Tumor Cells. 2015. p. 62–125.
- 4. Barnes DE, Lindahl T. Repair and genetic consequences of endogenous DNA base damage in mammalian cells. Annu Rev Genet. 2004;38(1):445–76.
- 5. Paul K, Wang M, Mladenov E, Bencsik-Theilen A, Bednar T, Wu W, et al. DNA ligases I and III cooperate in alternative non-homologous end-joining in vertebrates. PLoS One. 2013;8(3):e59505. pmid:23555685
- 6. Pashaiefar H, Yaghmaie M, Tavakkoly-Bazzaz J, Hamidollah Ghaffari S, Alimoghaddam K, Izadi P, et al. The Association between PARP1 and LIG3 Expression Levels and Chromosomal Translocations in Acute Myeloid Leukemia Patients. Cell J. 2018;20(2):204–10. pmid:29633598
- 7. Nilles N, Fahrenkrog B. Taking a Bad Turn: Compromised DNA Damage Response in Leukemia. Cells. 2017;6(2):11. pmid:28471392
- 8. Caracciolo D, Juli G, Riillo C, Coricello A, Vasile F, Pollastri S, et al. Exploiting DNA Ligase III addiction of multiple myeloma by flavonoid Rhamnetin. J Transl Med. 2022;20(1):482. pmid:36273153
- 9. Hua R-X, Zhuo Z, Zhu J, Zhang S-D, Xue W-Q, Li X-Z, et al. LIG3 gene polymorphisms and risk of gastric cancer in a Southern Chinese population. Gene. 2019;705:90–4. pmid:31034940
- 10. Tomkinson AE, Sallmyr A. Structure and function of the DNA ligases encoded by the mammalian LIG3 gene. Gene. 2013;531(2):150–7. pmid:24013086
- 11. Sun L, Liu X, Song S, Feng L, Shi C, Sun Z, et al. Identification of LIG1 and LIG3 as prognostic biomarkers in breast cancer. Open Med (Wars). 2021;16(1):1705–17. pmid:34825062
- 12. Simsek D, Brunet E, Wong SY-W, Katyal S, Gao Y, McKinnon PJ, et al. DNA ligase III promotes alternative nonhomologous end-joining during chromosomal translocation formation. PLoS Genet. 2011;7(6):e1002080. pmid:21655080
- 13. Chen Y, Mei Y, Jiang X. Universal and high-fidelity DNA single nucleotide polymorphism detection based on a CRISPR/Cas12a biochip. Chem Sci. 2021;12(12):4455–62. pmid:34163711
- 14. Shastry BS. SNPs in disease gene mapping, medicinal drug development and evolution. J Hum Genet. 2007;52(11):871–80. pmid:17928948
- 15. Kim S, Misra A. SNP genotyping: technologies and biomedical applications. Annu Rev Biomed Eng. 2007;9:289–320. pmid:17391067
- 16. Dantzer J, Moad C, Heiland R, Mooney S. MutDB services: interactive structural analysis of mutation data. Nucleic Acids Res. 2005;33(Web Server issue):W311-4. pmid:15980479
- 17. Kwon NS, Baek KJ, Kim D-S, Yun H-Y. Leucine-rich glioma inactivated 3: Integrative analyses reveal its potential prognostic role in cancer. Mol Med Rep. 2018;17(3):3993–4002. pmid:29257304
- 18. Kumamoto S, Nishiyama A, Chiba Y, Miyashita R, Konishi C, Azuma Y, et al. HPF1-dependent PARP activation promotes LIG3-XRCC1-mediated backup pathway of Okazaki fragment ligation. Nucleic Acids Res. 2021;49(9):5003–16. pmid:33872376
- 19. Kwiatkowski D, Czarny P, Toma M, Korycinska A, Sowinska K, Galecki P, et al. Association between Single-Nucleotide Polymorphisms of the hOGG1, NEIL1, APEX1, FEN1, LIG1, and LIG3 Genes and Alzheimer’s Disease Risk. Neuropsychobiology. 2016;73(2):98–107. pmid:27010693
- 20. Capriotti E, Fariselli P. PhD-SNPg: a webserver and lightweight tool for scoring single nucleotide variants. Nucleic Acids Res. 2017;45(W1):W247–52. pmid:28482034
- 21. Tang H, Thomas PD. PANTHER-PSEP: predicting disease-causing genetic variants using position-specific evolutionary preservation. Bioinformatics. 2016;32(14):2230–2. pmid:27193693
- 22. Adzhubei I, Schmidt S, Peshkin L, Ramensky V, Gerasimova A, Bork P. PolyPhen-2: prediction of functional effects of human nsSNPs. Nat Methods. 2010;7(4).
- 23. Adzhubei I, Jordan DM, Sunyaev SR. Predicting functional effect of human missense mutations using PolyPhen‐2. Current protocols in human genetics. 2013;76(1):7.20.1-7.
- 24. Bendl J, Stourac J, Salanda O, Pavelka A, Wieben ED, Zendulka J, et al. PredictSNP: robust and accurate consensus classifier for prediction of disease-related mutations. PLoS Comput Biol. 2014;10(1):e1003440. pmid:24453961
- 25. Kumar P, Henikoff S, Ng PC. Predicting the effects of coding non-synonymous variants on protein function using the SIFT algorithm. Nat Protoc. 2009;4(7):1073–81. pmid:19561590
- 26. Capriotti E, Calabrese R, Casadio R. Predicting the insurgence of human genetic diseases associated to single point protein mutations with support vector machines and evolutionary information. Bioinformatics. 2006;22(22):2729–34. pmid:16895930
- 27. Marabotti A, Scafuri B, Facchiano A. Predicting the stability of mutant proteins by computational approaches: an overview. Briefings in Bioinformatics. 2021;22(3):bbaa074.
- 28. Pires DEV, Ascher DB, Blundell TL. mCSM: predicting the effects of mutations in proteins using graph-based signatures. Bioinformatics. 2014;30(3):335–42. pmid:24281696
- 29. Hasnain MJU, Shoaib M, Qadri S, Afzal B, Anwar T, Abbas SH, et al. Computational analysis of functional single nucleotide polymorphisms associated with SLC26A4 gene. PLoS One. 2020;15(1):e0225368. pmid:31971949
- 30. Wang Z, Huang C, Lv H, Zhang M, Li X. In silico analysis and high-risk pathogenic phenotype predictions of non-synonymous single nucleotide polymorphisms in human Crystallin beta A4 gene associated with congenital cataract. PLoS One. 2020;15(1):e0227859. pmid:31935276
- 31. Meyer MJ, Lapcevic R, Romero AE, Yoon M, Das J, Beltrán JF, et al. mutation3D: Cancer Gene Prediction Through Atomic Clustering of Coding Variants in the Structural Proteome. Hum Mutat. 2016;37(5):447–56. pmid:26841357
- 32. Pejaver V, Urresti J, Lugo-Martinez J, Pagel KA, Lin GN, Nam HJ. Nature Communications. 2020;11(1):5918.
- 33. Das SC, Rahman MdA, Das Gupta S. In-silico analysis unravels the structural and functional consequences of non-synonymous SNPs in the human IL-10 gene. Egypt J Med Hum Genet. 2022;23(1).
- 34. Jumper J, Evans R, Pritzel A, Green T, Figurnov M, Ronneberger O, et al. Highly accurate protein structure prediction with AlphaFold. Nature. 2021;596(7873):583–9. pmid:34265844
- 35. Varadi M, Anyango S, Deshpande M, Nair S, Natassia C, Yordanova G, et al. AlphaFold Protein Structure Database: massively expanding the structural coverage of protein-sequence space with high-accuracy models. Nucleic Acids Res. 2022;50(D1):D439–44. pmid:34791371
- 36. Mirdita M, Schütze K, Moriwaki Y, Heo L, Ovchinnikov S, Steinegger M. ColabFold: making protein folding accessible to all. Nat Methods. 2022;19(6):679–82. pmid:35637307
- 37. Islam R, Rahaman M, Hoque H, Hasan N, Prodhan SH, Ruhama A, et al. Computational and structural based approach to identify malignant nonsynonymous single nucleotide polymorphisms associated with CDK4 gene. PLoS One. 2021;16(11):e0259691. pmid:34735543
- 38. Venselaar H, Te Beek TAH, Kuipers RKP, Hekkelman ML, Vriend G. Protein structure analysis of mutations causing inheritable diseases. An e-Science approach with life scientist friendly interfaces. BMC Bioinformatics. 2010;11:548. pmid:21059217
- 39. Blom N, Gammeltoft S, Brunak S. Sequence and structure-based prediction of eukaryotic protein phosphorylation sites. J Mol Biol. 1999;294(5):1351–62. pmid:10600390
- 40. Xue Y, Zhou F, Zhu M, Ahmed K, Chen G, Yao X. GPS: a comprehensive www server for phosphorylation sites prediction. Nucleic Acids Res. 2005;33(Web Server issue):W184–7. pmid:15980451
- 41. Hossain MA, Al Amin M, Khan MA, Refat MRR, Sohel M, Rahman MH, et al. Genome-Wide Investigation Reveals Potential Therapeutic Targets in Shigella spp. Biomed Res Int. 2024;2024:5554208. pmid:38595330
- 42. Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, et al. STRING v10: protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res. 2015;43(Database issue):D447–52. pmid:25352553
- 43. Kuleshov MV, Jones MR, Rouillard AD, Fernandez NF, Duan Q, Wang Z, et al. Enrichr: a comprehensive gene set enrichment analysis web server 2016 update. Nucleic Acids Res. 2016;44(W1):W90–7. pmid:27141961
- 44. Tang D, Chen M, Huang X, Zhang G, Zeng L, Zhang G, et al. SRplot: A free online platform for data visualization and graphing. PLoS One. 2023;18(11):e0294236. pmid:37943830
- 45. Wu D, Wu Z. Superimposition of protein structures with dynamically weighted RMSD. J Mol Model. 2010;16(2):211–22. pmid:19568776
- 46. Pettersen EF, Goddard TD, Huang CC, Couch GS, Greenblatt DM, Meng EC, et al. UCSF Chimera--a visualization system for exploratory research and analysis. J Comput Chem. 2004;25(13):1605–12. pmid:15264254
- 47. Al Mamun Khan MA, Ahsan A, Khan MA, Sanjana JM, Biswas S, Saleh MA, et al. In-silico prediction of highly promising natural fungicides against the destructive blast fungus Magnaporthe oryzae. Heliyon. 2023;9(4):e15113. pmid:37123971
- 48. Eberhardt J, Santos-Martins D, Tillack AF, Forli S. AutoDock Vina 1.2.0: New Docking Methods, Expanded Force Field, and Python Bindings. J Chem Inf Model. 2021;61(8):3891–8. pmid:34278794
- 49. Guleria V, Pal T, Sharma B, Chauhan S, Jaiswal V. Pharmacokinetic and molecular docking studies to design antimalarial compounds targeting Actin I. Int J Health Sci (Qassim). 2021;15(6):4–15. pmid:34916893
- 50. Rose PW, Prlić A, Altunkaya A, Bi C, Bradley AR, Christie CH, et al. The RCSB protein data bank: integrative view of protein, gene and 3D structural information. Nucleic Acids Res. 2017;45(D1):D271–81. pmid:27794042
- 51. Guex N, Peitsch MC. SWISS-MODEL and the Swiss-PdbViewer: an environment for comparative protein modeling. Electrophoresis. 1997;18(15):2714–23. pmid:9504803
- 52. Peterson LE. Small Molecule Docking of DNA Repair Proteins Associated with Cancer Survival Following PCNA Metagene Adjustment: A Potential Novel Class of Repair Inhibitors. Molecules. 2019;24(3):645. pmid:30759820
- 53. Cotner-Gohara E, Kim I-K, Hammel M, Tainer JA, Tomkinson AE, Ellenberger T. Human DNA ligase III recognizes DNA ends by dynamic switching between two DNA-bound states. Biochemistry. 2010;49(29):6165–76. pmid:20518483
- 54. Šudomová M, Hassan STS, Khan H, Rasekhian M, Nabavi SM. A Multi-Biochemical and In Silico Study on Anti-Enzymatic Actions of Pyroglutamic Acid against PDE-5, ACE, and Urease Using Various Analytical Techniques: Unexplored Pharmacological Properties and Cytotoxicity Evaluation. Biomolecules. 2019;9(9):392. pmid:31438631
- 55. Daina A, Michielin O, Zoete V. SwissADME: a free web tool to evaluate pharmacokinetics, drug-likeness and medicinal chemistry friendliness of small molecules. Sci Rep. 2017;7:42717. pmid:28256516
- 56. Pires DEV, Blundell TL, Ascher DB. pkCSM: Predicting Small-Molecule Pharmacokinetic and Toxicity Properties Using Graph-Based Signatures. J Med Chem. 2015;58(9):4066–72. pmid:25860834
- 57. Banerjee P, Kemmler E, Dunkel M, Preissner R. ProTox 3.0: a webserver for the prediction of toxicity of chemicals. Nucleic Acids Res. 2024;52(W1):W513–20. pmid:38647086
- 58. Lipinski CA. Lead- and drug-like compounds: the rule-of-five revolution. Drug Discov Today Technol. 2004;1(4):337–41. pmid:24981612
- 59. Hasnat S, Hoque MN, Mahbub MM, Sakif TI, Shahinuzzaman ADA, Islam T. Pantothenate kinase: A promising therapeutic target against pathogenic Clostridium species. Heliyon. 2024;10(14):e34544. pmid:39130480
- 60. Jorgensen WL, Tirado-Rives J. Monte Carlo vs Molecular Dynamics for Conformational Sampling. J Phys Chem. 1996;100(34):14508–13.
- 61. Hossain MS, Hasnat S, Akter S, Mim MM, Tahcin A, Hoque M, et al. Computational identification of Vernonia cinerea-derived phytochemicals as potential inhibitors of nonstructural protein 1 (NSP1) in dengue virus serotype-2. Front Pharmacol. 2024;15:1465827. pmid:39474614
- 62. Islam MA, Hossain MS, Hasnat S, Shuvo MH, Akter S, Maria MA, et al. In-silico study unveils potential phytocompounds in Andrographis paniculata against E6 protein of the high-risk HPV-16 subtype for cervical cancer therapy. Sci Rep. 2024;14(1):17182. pmid:39060289
- 63. Hasnat S, Rahman S, Alam MB, Suin FM, Yeasmin F, Suha T. High-throughput screening reveals potential inhibitors targeting trimethoprim-resistant dfrA1 protein in Klebsiella pneumoniae and Escherichia coli. bioRxiv. 2024;2024:18.624070.
- 64. Hasnat S, Rahman S, Alam MB, Suin FM, Yeasmin F, Suha T, et al. High throughput screening identifies potential inhibitors targeting trimethoprim resistant DfrA1 protein in Klebsiella pneumoniae and Escherichia coli. Sci Rep. 2025;15(1):7141. pmid:40021806
- 65. Hasnat S, Hoque MN, Mahbub MM, Jummah JB, Ali J, Sakif TI, et al. In-silico identification and characterization of effector proteins in the rice blast pathogen Magnaporthe oryzae. Computational and Structural Biotechnology Reports. 2025;2:100028.
- 66. Bullinger L, Döhner K, Bair E, Fröhling S, Schlenk RF, Tibshirani R, et al. Use of gene-expression profiling to identify prognostic subclasses in adult acute myeloid leukemia. N Engl J Med. 2004;350(16):1605–16. pmid:15084693
- 67. Zwyea S, Naji L, Almansouri S. Kaplan-Meier plotter data analysis model in early prognosis of pancreatic cancer. In: Journal of Physics: Conference Series, 2021.
- 68. Köberle B, Koch B, Fischer BM, Hartwig A. Single nucleotide polymorphisms in DNA repair genes and putative cancer risk. Arch Toxicol. 2016;90(10):2369–88. pmid:27334373
- 69. Vignal A, Milan D, SanCristobal M, Eggen A. A review on SNP and other types of molecular markers and their use in animal genetics. Genet Sel Evol. 2002;34(3):275–305. pmid:12081799
- 70. Vogelstein B, Papadopoulos N, Velculescu VE, Zhou S, Diaz LA Jr, Kinzler KW. Cancer genome landscapes. Science. 2013;339(6127):1546–58. pmid:23539594
- 71. Deller MC, Kong L, Rupp B. Protein stability: a crystallographer’s perspective. Acta Crystallogr F Struct Biol Commun. 2016;72(Pt 2):72–95. pmid:26841758
- 72. Wang Z, Moult J. SNPs, protein structure, and disease. Hum Mutat. 2001;17(4):263–70. pmid:11295823
- 73. Lukas J, Lukas C, Bartek J. More than just a focus: The chromatin response to DNA damage and its role in genome integrity maintenance. Nat Cell Biol. 2011;13(10):1161–9. pmid:21968989
- 74. Dobson CM. Protein folding and misfolding. Nature. 2003;426(6968):884–90. pmid:14685248
- 75. Zhang Y, Leung AK, Kang JJ, Sun Y, Wu G, Li L, et al. A multiscale functional map of somatic mutations in cancer integrating protein structure and network topology. Nat Commun. 2025;16(1):975. pmid:39856048
- 76. Tan CSH, Bodenmiller B, Pasculescu A, Jovanovic M, Hengartner MO, Jørgensen C, et al. Comparative analysis reveals conserved protein phosphorylation networks implicated in multiple diseases. Sci Signal. 2009;2(81):ra39. pmid:19638616
- 77. Goode EL, Ulrich CM, Potter JD. Polymorphisms in DNA repair genes and associations with cancer risk. Cancer Epidemiol Biomarkers Prev. 2002;11(12):1513–30. pmid:12496039
- 78. Tjong H, Zhou HX. Displar: an accurate method for predicting DNA-binding sites on protein surfaces. Nucleic Acids Res. 2007;35(5):1465–77.
- 79. Dill KA, MacCallum JL. The protein-folding problem, 50 years on. Science. 2012;338(6110):1042–6. pmid:23180855
- 80. Williams RS, Green R, Glover JM. Crystal structure of the BRCT repeat region from the breast cancer-associated protein BRCA1. Nat Struct Biol. 2001;8(10):838–42.
- 81. Haber J. Genome stability: DNA repair and recombination. Garland Science. 2013.
- 82. Simsek D, Furda A, Gao Y, Artus J, Brunet E, Hadjantonakis A-K, et al. Crucial role for DNA ligase III in mitochondria but not in Xrcc1-dependent repair. Nature. 2011;471(7337):245–8. pmid:21390132
- 83. Caldecott KW. Single-strand break repair and genetic disease. Nat Rev Genet. 2008;9(8):619–31. pmid:18626472
- 84. Pannunzio NR, Watanabe G, Lieber MR. Nonhomologous DNA end-joining for repair of DNA double-strand breaks. J Biol Chem. 2018;293(27):10512–23. pmid:29247009
- 85. Nissar S, Sameer AS, Rasool R, Rashid F. DNA repair gene--XRCC1 in relation to genome instability and role in colorectal carcinogenesis. Oncol Res Treat. 2014;37(7–8):418–22. pmid:25138303
- 86. Yadav AK, Murthy TPK, Divyashri G, Prasad N D, Prakash S, Vaishnavi V V, et al. Computational screening of pathogenic missense nsSNPs in heme oxygenase 1 (HMOX1) gene and their structural and functional consequences. J Biomol Struct Dyn. 2024;42(10):5072–91. pmid:37434323
- 87. Sharma B, Jaiswal V, Khan MA. In silico Approach for Exploring the Role of AT1R Polymorphism on its Function, Structure and Drug Interactions. Curr Comput Aided Drug Des. 2021;17(7):927–35. pmid:33100208
- 88. Ganesan A, Coote ML, Barakat K. Molecular dynamics-driven drug discovery: leaping forward with confidence. Drug Discov Today. 2017;22(2):249–69. pmid:27890821
- 89. O’Donovan PJ, Livingston DM. BRCA1 and BRCA2: breast/ovarian cancer susceptibility gene products and participants in DNA double-strand break repair. Carcinogenesis. 2010;31(6):961–7. pmid:20400477
- 90. Devidas SB, Rahmatkar SN, Singh R, Sendri N, Purohit R, Singh D, et al. Amelioration of cognitive deficit in zebrafish by an undescribed anthraquinone from Juglans regia L.: An in-silico, in-vitro and in-vivo approach. Eur J Pharmacol. 2021;906:174234.
- 91. Singh R, Manna S, Nandanwar H, Purohit R. Bioactives from medicinal herb against bedaquiline resistant tuberculosis: removing the dark clouds from the horizon. Microbes Infect. 2024;26(3):105279. pmid:38128751
- 92. Kumar S, Bhardwaj VK, Singh R, Purohit R. Structure restoration and aggregate inhibition of V30M mutant transthyretin protein by potential quinoline molecules. Int J Biol Macromol. 2023;231:123318. pmid:36681222
- 93. Ray A, Di Felice R. Protein-Mutation-Induced Conformational Changes of the DNA and Nuclease Domain in CRISPR/Cas9 Systems by Molecular Dynamics Simulations. J Phys Chem B. 2020;124(11):2168–79. pmid:32079396
- 94. Zhou H, Gao M, Skolnick J. Comprehensive prediction of drug-protein interactions and side effects for the human proteome. Sci Rep. 2015;5:11090. pmid:26057345