Abstract
Fibroblast growth factor receptor 1 (FGFR1) is recognized as an oncogene that fosters tumor development, playing a vital role in cancer progression. This has established it as a promising target for cancer drug development. However, existing FGFR1 inhibitors are often limited by drug resistance and lack of specificity, emphasizing the need for more selective and potent alternatives. To address this challenge, the present study employed an AI-driven virtual screening approach, integrating molecular docking (MD) and molecular dynamics simulations (MDS) to discover novel FGFR1 inhibitors. A voting classifier integrating three machine learning classifiers was utilized to screen 10 million compounds from the eMolecules database, yielding 44 promising candidates with a prediction probability exceeding 80%. MD identified the compound with PubChem Compound Identifier (CID) 165426608 (−10.8 kcal/mol) as the highest-scoring ligand, while the compounds with CID 145940129 (−9.8 kcal/mol), CID 131910163 (−9.4 kcal/mol), CID 155915988 (−9.2 kcal/mol), and CID 132423733 (−9.1 kcal/mol) exhibited binding affinities comparable to or slightly lower than that of the native ligand (−10.4 kcal/mol). MDS further revealed that all these compounds, except CID 131910163, maintained structural stability over time. Thermodynamic stability assessment confirmed the spontaneity and feasibility of their complex formation, with negative ΔGBFE values ranging from −21.87 to −12.76 kcal/mol. Decomposition of the binding free energy change further identified key stabilizing residues. The heatmaps and histograms of the interactions over the full 200 ns simulation period highlighted the prominent interaction profiles. Structural similarity analysis of the four MDS-stable compounds yielded Dice similarity scores of 0.200000 to 0.452830 with known FGFR1 inhibitors.
Additionally, the pIC50 prediction using a voting regressor indicated promising pIC50 values (7.07 to 7.47), highlighting their potential as hit candidates for further structural optimization and therapeutic development. Further, this study underscores the efficiency of machine learning-based virtual screening and in silico analysis as a cost-effective and reliable strategy for accelerating hit drug discovery from large datasets, even with limited resources and time.
Citation: Shrestha RL, Tamang A, Poudel Chhetri S, Parajuli N, Poudel M, M. C. S, et al. (2025) AI-assisted discovery of potent FGFR1 inhibitors via virtual screening and in silico analysis. PLoS One 20(9): e0331837. https://doi.org/10.1371/journal.pone.0331837
Editor: Ahmed A. Al-Karmalawy, University of Mashreq, IRAQ
Received: April 21, 2025; Accepted: August 21, 2025; Published: September 11, 2025
Copyright: © 2025 Shrestha et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the paper and its Supporting Information files. Further raw data can be made available on request.
Funding: The author(s) received no specific funding for this work.
Competing interests: The authors have declared that no competing interests exist.
1. Introduction
Cancer represents a critical global health challenge, accounting for one in every six deaths worldwide. In 2022, around 20 million individuals were newly diagnosed with cancer, and approximately 9.7 million deaths were attributed to cancer-related diseases [1]. Despite significant advancements in different therapeutic approaches such as chemotherapy, hormone therapy, and immunotherapy, cancer mortality rates remain alarmingly high. This is mainly because of the complex genetic and phenotypic diversity of cancer, and the emergence of drug-resistant phenotypes [2].
The fibroblast growth factor receptor (FGFR) signaling axis plays a crucial role in transducing signals that govern various cellular processes, including proliferation, angiogenesis, differentiation, embryonic development, migration, organogenesis, and survival [3]. Members of the FGFR family, including FGFR1, frequently undergo genomic alterations such as mutations, amplifications, and gene fusions across various cancer types [4]. FGFR1, in particular, has been extensively studied and recognized as an oncogene that fosters tumor development, underscoring its critical role in cancer progression [5]. Overexpression of FGFR1 has been observed in cancers such as breast [6,7], lung [8], ovarian [9,10], bladder [11], prostate [12,13], and gastric cancers [14], among others. Consequently, targeting FGFR1 has become an appealing therapeutic strategy [15]. To date, drugs such as Regorafenib, Nintedanib, Sorafenib, Lenvatinib, Erdafitinib, Pemigatinib, Infigratinib, and Futibatinib have received FDA approval for FGFR1 inhibition [16–18]. These inhibitors work by reducing the activity of FGFR1, which is often overexpressed in certain cancers, thereby impeding tumor growth. However, their efficacy is limited by challenges such as drug resistance and lack of specificity [19]. Therefore, the development of novel inhibitors with enhanced effectiveness and reduced side effects remains a significant challenge.
Traditional drug discovery methods rely heavily on in vivo experiments and in vitro screening, which are both costly and labor-intensive [20]. Preclinical drug discovery constitutes approximately one-third of the drug development expenses and usually takes nearly five and a half years [21,22]. The high failure rate during drug development further exacerbates the costs. As a result, methodologies that can reliably predict success at early stages are critically valuable. Computer-aided drug design (CADD) has emerged as a transformative approach in this domain [23]. By employing in silico techniques, CADD accelerates drug discovery and reduces the time required for identifying leads and introducing new drugs. These methods also enable the prediction of biological activity for chemical compounds against specific targets [24]. In this study, an Artificial Intelligence (AI)-driven virtual screening approach of millions of molecules was adopted to advance FGFR1-targeted drug discovery [25,26]. By leveraging AI’s ability to analyze vast datasets and predict drug efficacy in a relatively short time span, this study aims to streamline the drug development process and increase the likelihood of identifying alternate treatments [27]. Fig 1 outlines the detailed workflow adopted in this study.
2. Materials and methods
2.1. Data collection and curation
The Chemical European Molecular Biology Laboratory (ChEMBL) database was used to obtain the Simplified Molecular Input Line Entry System (SMILES) representations and half-maximal inhibitory concentration (IC50) values for 2,153 FGFR1 inhibitors [28]. The IC50 value represents the concentration of a compound required to inhibit a specific biological process or activity by 50%, which serves as a preliminary guide for selecting efficient and biologically active molecules. After removing entries without IC50 values, retaining only bioactivity data measured in nanomolar (nM), and removing duplicates, 1,876 data points remained. The IC50 values were transformed into pIC50 values (the negative base-10 logarithm of the molar IC50) to standardize the data. Lipinski’s Rule of Five (RO5) was applied to assess drug-likeness and exclude compounds that are not drug-like [29,30], resulting in 1,523 data points for model training. Radar plots depicting the physicochemical properties of the filtered dataset are shown in Fig 2.
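The IC50-to-pIC50 transformation described above can be sketched as follows. This is a minimal illustration, assuming IC50 values expressed in nanomolar; the function name `ic50_nm_to_pic50` is hypothetical and is not taken from the study's code.

```python
import math

def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar to pIC50 = -log10(IC50 in molar)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    # 1 nM = 1e-9 M, so -log10(ic50_nm * 1e-9) = 9 - log10(ic50_nm)
    return 9.0 - math.log10(ic50_nm)

# Illustrative values only: a lower IC50 gives a higher pIC50
for ic50 in (1.0, 100.0, 10000.0):
    print(ic50, "nM ->", ic50_nm_to_pic50(ic50))
```

On this scale, an IC50 of 100 nM corresponds to a pIC50 of 7, so each tenfold drop in IC50 raises pIC50 by one unit.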
2.2. Model building and database screening
Molecular fingerprints [31], encoded as numerical vectors or bit-strings, facilitate rapid similarity evaluations critical for virtual screening [31,32], structure-activity relation studies, and chemical space mapping [33]. Using the RDKit toolkit [34], fingerprints from SMILES entries were computed, and the dataset was classified into 813 active and 710 inactive compounds (1523 total) using a pIC50 threshold of 7.0, as a cut-off ranging from 5 to 7 has been recommended [26]. Based on the Morgan3 protocol, which employs 2048 bits as a circular fingerprint [35], machine learning models were constructed using Scikit-learn and XGBoost. The Morgan3 fingerprints (radius = 3) encode molecular features extending up to three bonds from each atom, allowing the capture of broader substructural patterns that may play a vital role in determining biological activity. Scikit-learn is a versatile machine-learning library that provides a diverse range of algorithms for classification, regression, clustering, and dimensionality reduction tasks [36]. XGBoost is an advanced library tailored for the fast and scalable execution of gradient-boosting algorithms [37]. Twenty classification models were trained, and the best-performing models were fine-tuned to create a voting classifier, amplifying accuracy and robustness in comparison with the individual models [38]. A similar approach was applied for building a voting regressor. The voting classifier was then employed to screen 10 million compounds from the eMolecules database [39,40]. Compounds with invalid SMILES, those violating the Rule of Five (RO5), and Pan-Assay Interference Compounds (PAINS) were excluded prior to screening.
Classification models’ performance was evaluated using accuracy, precision, sensitivity, specificity, and Area Under Curve (AUC) metrics, calculated based on the confusion (error) matrix [41]. Regressors were assessed according to mean absolute error (MAE), root-mean-squared error (RMSE), and R2 scores [42]. Multiple steps were used systematically for model training and database screening to identify potential molecules.
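The evaluation metrics named above follow directly from the confusion matrix counts and the prediction errors. The sketch below shows one way they could be computed in plain Python; the function names and the example counts are illustrative assumptions and do not reproduce the study's results or its Scikit-learn implementation.

```python
import math

def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Metrics from confusion-matrix counts (assumes no denominator is zero)."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),   # true positive rate (recall)
        "specificity": tn / (tn + fp),   # true negative rate
    }

def regression_metrics(y_true: list, y_pred: list) -> dict:
    """MAE, RMSE, and R2 for paired observed and predicted values."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(ss_res / n),
        "r2": 1.0 - ss_res / ss_tot,
    }

# Made-up numbers for illustration only
print(classification_metrics(tp=80, fp=20, tn=70, fn=30))
print(regression_metrics([6.0, 7.0, 8.0], [6.5, 7.0, 7.5]))
```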
2.3. Molecular docking calculations
The 44 potential inhibitors with prediction probabilities above 80% obtained from the screening of 10 million compounds from the eMolecules database were selected as hit ligands. The 3D structures of 35 compounds available in the PubChem database (https://pubchem.ncbi.nlm.nih.gov/) [43] were retrieved in SDF format, while the remaining 9 compounds were built from their SMILES strings. The molecular formulas were verified using the Avogadro program (v1.2.0) [44] after adding the hydrogen atoms. Energy minimization was carried out using the Universal Force Field (UFF) with 5000 steps of a conjugate gradient algorithm and an energy convergence criterion of 1.0 × 10⁻⁸ kcal/mol. This process was repeated until the global minimum was reached. The bond orders, including double bond positions, were examined, and steric hindrance or strain was removed. Finally, the optimized ligands were converted to PDBQT format with Gasteiger charges using AutoDock Tools [45].
The 3D crystal structure of FGFR1 (PDB ID: 4ZSA, DOI: https://doi.org/10.2210/pdb4ZSA/pdb) with an X-ray crystallographic resolution of 2.00 Å was obtained from the RCSB database (https://www.rcsb.org/) [46]. Missing amino acid residues were repaired using the SWISS-MODEL server (https://swissmodel.expasy.org/) [47], where model_01 of template 5B7V.1.A (global model quality estimate: 0.88, qualitative model energy analysis with distance constraints: 0.83 ± 0.05) was selected due to its 100% sequence identity. The finalized protein structure was converted to PDBQT format with the addition of polar hydrogens and Kollman charges using AutoDock Tools. The apo form of the protein was then utilized as the target for computational analyses.
The molecular docking calculations of the ligands with the FGFR1 protein were performed using AutoDock Vina [48], following the protocol outlined by Phunyal et al. [49] with slight modifications of the parameters. A grid box of 50 × 50 × 50 Å³ centered at (x: 5.060, y: −0.501, z: 16.013), an energy range of 4 kcal/mol, and 20 binding modes were used with an exhaustiveness of 64 (converged). The five protein-ligand complexes with the top binding affinities were saved in PDB format and used for MDS. The binding interactions between the protein and ligands were visualized using the PyMOL [50] program and the protein-ligand interaction profiler (PLIP) [51] web server.
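For readers who wish to reproduce the search settings, the docking parameters reported above map onto the keys of an AutoDock Vina configuration file. The sketch below renders such a file as text; the helper name `vina_config` and the receptor and ligand file names are hypothetical placeholders, and only the numerical values are taken from the text.

```python
def vina_config(receptor: str, ligand: str) -> str:
    """Render AutoDock Vina 'key = value' settings using the reported search parameters."""
    settings = {
        "receptor": receptor,      # prepared receptor in PDBQT format
        "ligand": ligand,          # prepared ligand in PDBQT format
        "center_x": 5.060,         # grid center (Angstrom)
        "center_y": -0.501,
        "center_z": 16.013,
        "size_x": 50,              # grid box edge lengths (Angstrom)
        "size_y": 50,
        "size_z": 50,
        "exhaustiveness": 64,
        "num_modes": 20,
        "energy_range": 4,         # kcal/mol
    }
    return "\n".join(f"{key} = {value}" for key, value in settings.items())

# File names below are placeholders, not the study's actual files
config_text = vina_config("fgfr1_receptor.pdbqt", "ligand_M34.pdbqt")
print(config_text)
```

The resulting text could be saved to a file and passed to Vina through its `--config` option.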
2.4. Molecular dynamics simulations
The GROMACS (version 2021.2) software [52] was used to simulate the protein-ligand complex, with the CHARMM36 force field [53] applied to the receptor, while the ligand parameters were derived from the SwissParam server [54]. The system was solvated using the TIP3P water model in a triclinic box with a 12 Å spacing at the sides to prevent artifacts arising from interactions between periodic images of the simulation box. Neutralization was achieved by adding counter ions, followed by the inclusion of an isotonic NaCl solution (0.15 M). Equilibration was carried out in four steps of 1 ns each at 310 K and 1 bar, with the first two using the NVT ensemble and the last two using the NPT ensemble. The V-rescale thermostat, which is a modified version of the Berendsen method, was used for the temperature coupling, while pressure coupling was applied through the isotropic Parrinello-Rahman approach. The final 200 ns production run, with a 2 fs step size, was performed without constraints on the protein-ligand complex. Additional parameter details for different system setups can be found in the literature [55–58]. The complex was centered and analyzed using GROMACS built-in modules to obtain snapshots and geometric parameters such as the root mean square deviation (RMSD) of the ligand and protein backbone, root mean square fluctuation (RMSF), radial pair distribution function (RPDF), radius of gyration (Rg), and solvent-accessible surface area (SASA).
The thermodynamic stability of the protein-ligand complexes was evaluated using the Molecular Mechanics Poisson-Boltzmann Surface Area (MMPBSA) binding free energy calculations [59], following the parameters as utilized by Shrestha et al. [60]. A 20 ns equilibrated segment from the 200 ns molecular dynamics trajectory was used for this purpose. Calculations were carried out using the MMPBSA module [61], which applies the Poisson–Boltzmann solvation model. Binding free energy was computed at 100 ps intervals to assess the stability, spontaneity, and feasibility of complex formation over time. The spontaneity and viability of the forward reaction were evaluated based on the sign of the free energy changes. To investigate the contribution of individual amino acid residues to binding free energy change, decomposition analysis was performed on the same 20 ns equilibrated segment using the g_mmpbsa tool [62], which allowed the decomposition of the total binding energy to identify key stabilizing and destabilizing residues. The associated gmx_MMPBSA_ana subprogram was used for data analysis and visualization.
Additionally, to understand residue-level interaction dynamics over the entire course of the simulation, amino acid interaction heatmaps and histograms were generated using the full 200 ns trajectory (20,000 frames).
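The binding free energy evaluation described above rests on the standard end-point relation ΔG_bind = G_complex − G_receptor − G_ligand, evaluated per frame and then averaged. The sketch below illustrates that bookkeeping only; it assumes the per-frame free energies are already available (in practice they come from the MMPBSA tool), omits any entropy term, and uses made-up numbers and hypothetical function names.

```python
from statistics import mean, stdev

def binding_free_energies(g_complex: list, g_receptor: list, g_ligand: list) -> list:
    """Per-frame binding free energy: G_complex - G_receptor - G_ligand (kcal/mol)."""
    return [c - r - l for c, r, l in zip(g_complex, g_receptor, g_ligand)]

def summarize(delta_g: list) -> dict:
    """Average, spread, and sign check over the sampled frames."""
    avg = mean(delta_g)
    return {"mean": avg, "sd": stdev(delta_g), "spontaneous": avg < 0}

# Illustrative per-frame energies in kcal/mol, not values from the study
dg = binding_free_energies(
    g_complex=[-1520.0, -1518.0, -1522.0],
    g_receptor=[-1400.0, -1399.0, -1401.0],
    g_ligand=[-100.0, -101.0, -99.0],
)
print(dg, summarize(dg))
```

A negative mean indicates that complex formation is thermodynamically favorable under the model's approximations.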
2.5. Similarity analysis
Based on the principle that structurally similar compounds often share chemical and biological properties, a similarity analysis between the potential and known FGFR1 inhibitors [63] was conducted. The relevance of this analysis is closely tied to the nature of structure-activity relationships (SARs) that define biologically active molecules, serving as key factors for the success of ligand-based virtual screening, regardless of the methods employed [64]. Using Morgan2 fingerprints, similarity maps based on the Dice similarity metric were generated, highlighting structural features influencing biological activity [65,66]. Morgan2 fingerprints (radius = 2) encode molecular features extending up to two bonds from each atom, emphasizing localized structural variations. In similarity maps, Morgan2 is beneficial for highlighting key local substructures that impact activity, enhancing the interpretability of the visualization.
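The Dice similarity between two bit-vector fingerprints is twice the number of shared on-bits divided by the total number of on-bits in both fingerprints. A minimal sketch follows, assuming each fingerprint is represented as a set of on-bit indices; the function name and the toy bit sets are illustrative, and the study's own calculation would operate on Morgan2 fingerprints generated by a cheminformatics toolkit.

```python
def dice_similarity(bits_a: set, bits_b: set) -> float:
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two sets of on-bit indices."""
    if not bits_a and not bits_b:
        return 0.0  # convention chosen here for two empty fingerprints
    return 2 * len(bits_a & bits_b) / (len(bits_a) + len(bits_b))

# Toy fingerprints: two shared on-bits out of five each
candidate = {1, 2, 3, 4, 5}
reference = {4, 5, 6, 7, 8}
print(dice_similarity(candidate, reference))
```

The score ranges from 0 (no shared features) to 1 (identical on-bits), so low-to-moderate scores indicate scaffolds that differ from the reference inhibitors.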
2.6. Computational resources
All the calculations, including plot generation, were executed on high-performance multiprocessor systems. The machine learning computations were conducted on a system featuring 96 cores, 256 GB of RAM, a 16 GB GPU accelerator, and running Ubuntu 20.04 LTS. Meanwhile, MD and MDS were performed on a system with 24 cores, a 24 GB GPU accelerator, and running Ubuntu 20.04 LTS. The analysis and the visualization of the data were done on a personal computer with Windows 11 operating system.
3. Results and discussion
3.1. Model evaluation and screening results
Twenty classification models were evaluated, and their performance metrics are summarized in Table 1.
The Support Vector Classifier (SVC), ExtraTreesClassifier (ET), and Extreme Gradient Boosting Classifier (XGB) demonstrated superior accuracy and AUC scores, leading to their integration into a voting classifier with a soft voting mechanism. The soft voting mechanism is an ensemble learning technique that predicts the final class by averaging the probability estimates from multiple models and selecting the class with the highest mean probability [67]. For SVC, the parameters were set to C = 2.0 and probability = True; for ET, n_estimators = 400, criterion = 'log_loss', and max_features = 'log2'; for XGB, n_estimators = 1000, max_depth = 5, and learning_rate = 0.04; all other parameters were left at their default values.
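The soft voting rule described above can be illustrated in a few lines. The sketch assumes each model returns class probabilities ordered as [inactive, active]; the function name and the probability values are hypothetical and do not come from the trained models.

```python
def soft_vote(probabilities: list) -> tuple:
    """Average per-model class probabilities and pick the class with the highest mean."""
    n_models = len(probabilities)
    n_classes = len(probabilities[0])
    mean_probs = [
        sum(model[k] for model in probabilities) / n_models
        for k in range(n_classes)
    ]
    label = max(range(n_classes), key=lambda k: mean_probs[k])
    return label, mean_probs

# Made-up outputs from three models for one compound: [P(inactive), P(active)]
label, mean_probs = soft_vote([[0.3, 0.7], [0.1, 0.9], [0.4, 0.6]])
print(label, mean_probs)
```

Because the averaged probability is retained, a screening cut-off can be applied to it directly instead of relying on the predicted label alone.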
Fig 3 presents confusion matrices for individual and voting classifiers, while Table 2 summarizes the five-fold cross-validation results.
ROC curves in Fig 4 illustrate the excellent discrimination ability of these models.
The AUC scores show that all the classifiers discriminate well between active and inactive compounds. Additionally, the voting classifier was tested on an external test set consisting of the FDA-approved selective FGFR1 inhibitors Erdafitinib, Pemigatinib, Infigratinib, and Futibatinib. The model classified all these drugs as active with high prediction probabilities (> 90%), further supporting the reliability of the model. Based on these results, the voting classifier was used to screen the eMolecules database, identifying 44 compounds with prediction probabilities above 80% as potential active inhibitors of the FGFR1 protein.
3.2. Docking score comparison and interaction analysis
The molecular docking protocol was validated by docking the native ligand into the apo protein’s active site. The docked pose of the native ligand was superimposed on its pose from the crystal structure, as shown in Fig 5, resulting in a heavy atom RMSD of 0.397 Å (< 2 Å) [68]. This low RMSD indicates that the docking parameters and algorithm can reproduce the experimentally observed binding pose, supporting the use of the protocol for the screened ligands.
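The redocking check above compares two poses of the same ligand in the same receptor frame, so the heavy atom RMSD can be computed without any fitting. The sketch below assumes both poses list the same heavy atoms in the same order and ignores symmetry-equivalent atoms; the function name and coordinates are illustrative only.

```python
import math

def heavy_atom_rmsd(pose_a: list, pose_b: list) -> float:
    """RMSD (in the input units) between two poses with matching atom order, no fitting."""
    if len(pose_a) != len(pose_b) or not pose_a:
        raise ValueError("poses must be non-empty and contain the same number of atoms")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(pose_a, pose_b)
    )
    return math.sqrt(squared / len(pose_a))

# Toy two-atom example in Angstrom, not the actual ligand coordinates
crystal = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
docked = [(0.0, 0.0, 0.3), (1.0, 0.4, 0.0)]
rmsd = heavy_atom_rmsd(crystal, docked)
print(rmsd, rmsd < 2.0)
```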
The binding affinities and poses of ligands interacting with the FGFR1 protein were obtained from molecular docking calculations. The 44 ligands obtained by the virtual screening of 10 million compounds were docked against the FGFR1 receptor to assess their potential for competitive inhibition. Ligand M34 exhibited the highest binding affinity (−10.8 kcal/mol), outperforming the native ligand (−10.4 kcal/mol). Additionally, four ligands, M29 (−9.8 kcal/mol), M26 (−9.4 kcal/mol), M32 (−9.2 kcal/mol), and M28 (−9.1 kcal/mol), demonstrated binding affinities comparable to or slightly lower than that of the native ligand. The binding affinities of the 44 ligands are presented in Table 3 along with their molecular formulas, PubChem Compound Identifiers (CIDs), and parent IDs.
The modes of interaction were studied for the complexes of M26, M28, M29, M32, and M34 with the protein. The 3D representations of the complexes, as shown in Fig 6, demonstrated that all top five ligands were bound at the catalytic site of the protein, suggesting that they may act as competitive inhibitors.
To better understand the details at the molecular level, the bonding interactions between the top five docked ligands and key amino acid residues, along with the distances, were studied (Table 4) and the interaction profiles are presented in Fig 7. The interaction analysis revealed several key hydrophobic interactions between the ligands and the protein’s amino acid residues, along with hydrogen bonds.
The molecular interaction analysis revealed that all ligand-protein complexes (M26-M34) were primarily stabilized by hydrophobic interactions, especially with residues Leu27, Val35, Val104, Leu173, and Phe185, which appeared frequently across the top complexes and the native complex, except M29-complex. Among these, residues Leu27, Leu173, and Phe185 were the most consistently conserved hydrophobic residues, underlining their importance in ligand anchoring. Notably, residue Asp184 (3–4 Å) contributed to hydrogen bonding in M26 and M32, while M26 further exhibited additional hydrogen bonds with Phe185 (3.93 Å) and Gly186 (4.08 Å), suggesting a stabilizing role of Asp184 that was not observed in the native complex.
In contrast, the native complex formed distinct hydrogen bonds with Glu105 (2.80 Å) and Ala107 (2.87, 2.98 Å), interactions not replicated by the ligand-bound complexes. Despite this, the native complex shared hydrophobic contacts with residues Leu27, Val104, Leu173, and Phe185, which were also involved in complexes of M26, M28, M32, and M34, indicating partial overlap in binding site occupation. M29-complex exhibited a unique interaction pattern, including hydrogen bonds (Lys57: 3.06 Å, Asp195: 2.90 Å), a π-stacking interaction with Phe32, and distinct hydrophobic residues, implying a different binding orientation. Additionally, M28 uniquely formed a halogen bond with residue Ile88 (3.51 Å) via a fluorine atom.
Overall, hydrophobic interactions emerged as the key driving force for ligand binding, with residues Leu27, Val104, Leu173, and Phe185 serving as conserved contributors across multiple complexes. In addition, hydrogen bond distances ranging from 2.90 to 4.03 Å indicated strong to moderate binding affinity, as shorter bond lengths generally correlate with stronger interaction [69]. While most protein-ligand complexes exhibited similar hydrophobic interaction patterns with the native complex, M29 displayed a distinct binding profile, which may influence its receptor modulation potential.
3.3. Adduct stability with time (spatial and energetic)
Understanding the spatial and energetic stability of the adduct is crucial for evaluating the inhibitory potential of ligands on the FGFR1 protein. To achieve this, MDS was performed for 200 ns. The structural and interaction stability of the protein-ligand complexes for the top five ligands was analyzed by examining various time-dependent parameters. The spontaneity and feasibility of the complex formation reactions for the top five ligands were determined in terms of changes in binding free energy. Both geometric and thermodynamic parameters are discussed in the following sections.
3.3.1. Structural stability assessment.
The binding of the ligand to the protein can induce structural alterations in both the protein and the ligand, which may affect the stability of the complex and hence the inhibition mechanism [70]. The stability of the top five ligand complexes was evaluated by analyzing various computational metrics from the MDS trajectories, which provided insight into the structural integrity of the protein-ligand complexes. The metrics include the ligand pose (snapshots), the RMSD of the ligands and protein backbone relative to the protein backbone, the RMSF of the α-carbon atoms, RPDF, Rg, and SASA, which are discussed next.
Dynamic insights into ligand behavior at the protein active site through MDS: Snapshots were taken at various time intervals during the MDS to examine the orientation and position of the docked ligands, providing insight into the stability of the complex’s geometry over time. Detailed images of the top four complexes at five distinct instances are presented in Fig 8.
Snapshots taken at 1, 50, 100, 150, and 200 ns showed that most ligands remained at the same location but with variations in orientation, except for a few cases. For the M28-complex, the ligand exhibited distinct rotational motion starting at 1 ns, accompanied by a slight upward position shift from 100 ns till the end due to translational movement. The protein backbone displayed some motion, particularly in the α-helix and loops (on the right side) from 50 ns onward, with subtle movement in the central β-sheet. In the case of M29-complex, the ligand underwent significant positional and orientational changes at the active site, with pronounced rotation at 50 ns and minimal rotational motion thereafter. From 100 ns onward, the ligand moved slightly upward, eventually returning to its position at 50 ns by 200 ns. The protein backbone showed the movement of the right α-helix (absent before but observed after 50 ns till the end), along with motions in the left β-sheet and central loops. In the M32-complex, the ligand depicted minimal delocalization and rotation until 100 ns, followed by rotation and a slight downward shift in the position along with the noticeable motion of the α-helix and loops (150 ns). In the M34-complex, the ligand remained stable with some rotation for the first 50 ns. After this period, it slightly shifted upward along with the protein backbone, maintaining minimal rotational and translational movement until the end of the simulation. The protein backbone showed notable dynamics, including the disappearance of the left α-helix after 1 ns and upward movement of the top lying α-helix (100 ns). The loops fluctuated throughout the simulation. The M26-complex was excluded from Fig 8 as the ligand showed displacement from the orthosteric site after 150 ns, suggesting weak binding or instability that compromised complex integrity. 
This indicated that the binding affinity of −9.4 kcal/mol, although better than those observed for the M32 and M28 complexes, was not sufficient for the ligand to retain its pose and position at the active site. This implies that MD scores alone do not necessarily provide information about the stability of the complexes over time.
Periodic monitoring of adducts’ dynamical behavior provided valuable insights into molecular evolution, which could be linked to specific structural descriptors. Overall, the results demonstrated that the ligand’s pose and the protein backbone’s structural integrity were nearly preserved across the top four complexes (M28, M29, M32, and M34) with minimal structural changes and no major disruptions, suggesting the stability of the complexes. On the other hand, the M26-complex was unstable due to ligand displacement. These findings can be correlated with the RMSD and RPDF curves, which will be discussed next.
RMSD of ligand and protein backbone in the complex: The stability and dynamic behavior of the protein-ligand complexes were analyzed by examining the RMSD of the ligands and protein backbones. The RMSD for both the ligands and protein backbones with respect to the protein backbone was calculated from the MDS trajectories of various complexes and is displayed in Figs 9 and 10, respectively.
The RMSD of ligands relative to the protein backbone (Fig 9) provides insight into the extent to which the pose is conserved over time. The RMSD profiles of M28 (blue) and M29 (red) in their respective complexes exhibited smooth trajectories, with average RMSD values of 0.38 ± 0.11 nm and 0.56 ± 0.08 nm, respectively. A slight increase in fluctuation was observed for M29 after 155 ns, attributed to an orientation shift of the ligand beyond 150 ns, as illustrated in Fig 8. In the case of M34 (magenta), the RMSD trajectory was relatively flat after 95 ns, whereas M32 (maroon) displayed a moderate RMSD curve throughout the simulation period with some fluctuation after 120 ns. The fluctuations before 95 ns in the M34-complex and after 120 ns in the M32-complex are consistent with the ligand orientations seen in the snapshots (Fig 8). The average RMSD values for M34 and M32 were 0.63 ± 0.16 nm and 0.49 ± 0.16 nm, respectively. Conversely, ligand M26 (green) initially displayed a stable trajectory up to 170 ns, followed by a sharp increase in RMSD, reaching approximately 2.0 nm. This trend suggested an unstable complex, which aligns with the interpretation from the snapshots.
The backbones (Fig 10) of the M28 (blue) and M34 (magenta) complexes followed similar trajectories, with average RMSD values of 0.31 ± 0.07 nm and 0.34 ± 0.08 nm, respectively, closely matching that of the apo protein (black = 0.30 ± 0.05 nm). In contrast, the backbones of the M29 (red) and M32 (maroon) complexes exhibited slightly lower average RMSD values of 0.26 ± 0.04 nm and 0.29 ± 0.03 nm, respectively. The observed spikes in the RMSD plots corresponded to minor backbone adjustments, as depicted in the snapshots (Fig 8). These findings suggest that the protein backbone remained largely stable across the four complexes, indicating that ligand binding had minimal impact on its overall structure compared to the apo form. Since the M26-complex was unstable, its protein backbone is not discussed.
From the analysis of the RMSD profiles, it was found that all ligands except M26 remained bound at the protein’s active site until the end of the simulation, narrowing the selection of top candidates from five to four. Among the four, ligands M28 and M29 formed the most stable complexes, as reflected by their low and consistent RMSD. The M34-complex exhibited good stability, whereas the M32-complex demonstrated moderate stability. In contrast, the complex with M26 showed significant instability, with a sharp increase in RMSD after 170 ns, indicating a loss of stable binding, and therefore, its protein backbone analysis was omitted. The protein maintained a stable geometry across the top four complexes, capable of holding the ligand at its catalytic site. Hence, the four ligands other than M26 could potentially inhibit the functioning of the FGFR1 protein.
Radial pair distribution function (RPDF): The radial pair distribution function (RPDF) describes how the separation between two entities is distributed over the course of the simulation. In its reduced form [g(r)], it represents the probability of finding the ligand’s center of mass at a distance r from the protein’s center of mass [71]. Fig 11 presents the RPDF plots for the different protein-ligand complexes, derived from the MDS trajectories.
The RPDF plot revealed distinct binding behavior of the ligands relative to the protein’s center of mass. M32 (maroon), M28 (blue), and M34 (magenta) displayed two peaks at different distances, whereas M29 (red) exhibited a single peak. For M32, two peaks with a taller one at ca. 0.9 nm and a shorter one at ca. 1.2 nm were observed. Similarly, a tall peak at ca. 1.1 nm and a short peak at ca. 1.4 nm were observed for M28. The two peaks indicated occupancy at two distinct locations, with the taller peak suggesting a preference for the shorter ligand-protein distance. On the other hand, M34 displayed two peaks of comparable height at ca. 1.0 nm and 1.2 nm, implying that it occupied two distinct locations within the protein’s active site for most of the simulation period. The presence of these peaks supported the minor variations in ligand position and orientation within the complex, as observed in Fig 8. In contrast, the M29 (red) complex exhibited a single peak at ca. 1.4 nm, indicating the localization of the ligand’s center of mass relative to the protein’s center of mass throughout the simulation. The occurrence of RPDF maxima between ca. 0.9 and 1.4 nm for the top four ligands indicates that the orthosteric site remained occupied throughout most of the simulation period. These results indicate that after binding to the orthosteric pocket of the receptor protein, the ligands remained largely localized within the site, possibly inhibiting the protein’s regular function. Thus, the RPDF analysis effectively evaluated ligand stability over time, reinforcing earlier conclusions drawn from the structural snapshots.
For M26 (green), a single peak was observed at ca. 1.0 nm, but the smaller, broad peaks at longer distances indicated delocalization of the ligand, supporting its instability as previously noted in the snapshots and RMSD analysis. Since ligand M26 was unstable, no further geometrical parameter analysis was conducted for it.
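As a minimal sketch of the idea behind the RPDF analysis, the snippet below bins per-frame distances between the protein and ligand centers of mass into a normalized histogram. The function name, bin width, and toy coordinates are assumptions for illustration and are not taken from the study. A full radial distribution function would additionally normalize each bin by its spherical shell volume, which this simplified version omits.

```python
import math
from collections import Counter


def com_distance_histogram(protein_com, ligand_com, bin_width=0.1):
    """Return {bin_start_nm: probability} for protein-ligand COM distances.

    protein_com and ligand_com hold one (x, y, z) tuple in nm per frame.
    This is a plain distance histogram, not a shell-normalized g(r).
    """
    if len(protein_com) != len(ligand_com):
        raise ValueError("need one center of mass per frame for both molecules")
    counts = Counter()
    for p, l in zip(protein_com, ligand_com):
        r = math.dist(p, l)                 # COM-COM distance in this frame
        counts[int(r // bin_width)] += 1    # index of the bin containing r
    n_frames = len(protein_com)
    return {round(i * bin_width, 6): c / n_frames
            for i, c in sorted(counts.items())}


# Toy trajectory (hypothetical values): protein COM fixed at the origin.
protein = [(0.0, 0.0, 0.0)] * 3
ligand = [(0.95, 0.0, 0.0), (0.95, 0.0, 0.0), (1.25, 0.0, 0.0)]
hist = com_distance_histogram(protein, ligand)
```

In such a histogram, a single dominant bin corresponds to a ligand that stays at one distance from the protein center, while two populated bins correspond to the two-peak behavior described above.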
Fluctuation of α-carbon in the protein backbone of the complex: The root mean square fluctuation (RMSF) of the α-carbon atoms was calculated from the MDS trajectory to identify the flexible and rigid regions of the FGFR1 protein after ligand binding. The RMSF curves for the five ligand-protein complexes and the apo form (Fig 12) showed similar profiles.
The RMSF was below ca. 0.8 nm for all top four complexes, whereas it reached ca. 1.0 nm for the apo form, indicating the stability of protein geometry [72]. Higher RMSF values were observed at the terminal ends and within three specific loop regions around residues 45, 130, and 200. The increased flexibility in these regions can be attributed to the absence of α-helix or β-sheet structures, which typically restrict molecular motion and reduce degrees of freedom [55]. Since these flexible regions do not play a significant role in ligand interactions or disruptions, their high RMSF does not indicate structural instability. Therefore, for clarity in the plot, only residues 12 to 300 were considered, excluding the terminal ends. The RMSF profiles indicated that the α-carbon atom fluctuations exert minimal influence on the ligand’s binding affinity within the active site. Consequently, the stability of the adduct remained unaffected, which may potentially lead to the inhibition of protein activity.
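The RMSF of an atom is conventionally the square root of its time-averaged squared deviation from its mean position. The sketch below illustrates that definition for pre-aligned frames. The function name and the toy coordinates are assumptions, and the study itself obtained RMSF from the MDS trajectory with its own tooling.

```python
import math


def rmsf(trajectory):
    """Per-atom RMSF for trajectory[t][i] = (x, y, z) of atom i in frame t.

    Frames are assumed to be already superimposed on a common reference.
    """
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    values = []
    for i in range(n_atoms):
        # Time-averaged position of atom i.
        mean = [sum(frame[i][k] for frame in trajectory) / n_frames
                for k in range(3)]
        # Mean squared deviation from that average position.
        msd = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3))
            for frame in trajectory
        ) / n_frames
        values.append(math.sqrt(msd))
    return values


# Toy example (hypothetical): atom 0 is rigid, atom 1 moves along x.
toy = [
    [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (0.2, 0.0, 0.0)],
]
per_atom = rmsf(toy)
```

Residues in loops and at the termini would show larger values in such a calculation than residues locked into α-helices or β-sheets.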
Gyradius (Rg) and solvent-accessible surface area (SASA): The gyradius (radius of gyration, Rg), derived from the MDS trajectory, was used to evaluate the protein’s compactness and backbone conformational changes. It represents the root-mean-square distance of the macromolecule’s atoms from its center of mass, which is a crucial indicator of the stability of the protein-ligand complex [73]. The Rg plot for the protein backbone of the four ligand-protein complexes (Fig 13) revealed similar stability patterns across all systems, with Rg values ranging from ca. 2.00 to 2.15 nm, indicating no significant structural expansion or contraction during the simulation period. Within this range, the M29 (red = 2.05 ± 0.01 nm) and M32 (maroon = 2.06 ± 0.02 nm) complexes displayed some fluctuations, particularly between 60 ns and 145 ns, similar to those of the apo form (black = 2.06 ± 0.02 nm) and M34 (magenta = 2.07 ± 0.02 nm). A pronounced fluctuation in the M34 complex can be correlated with minor changes in ligand orientation and α-helix positioning at 100 ns (Fig 8). Overall, the Rg of the top four complexes closely matched that of the apo form, suggesting no significant receptor expansion or shrinkage upon ligand binding. These findings indicate that the protein maintained structural integrity even after ligand binding, suggesting that the ligands may contribute to target protein inhibition.
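For reference, the radius of gyration is conventionally computed as the mass-weighted root-mean-square distance of the atoms from their center of mass. The following sketch illustrates that formula for a single frame. The function name and the two-atom toy system are assumptions and do not reproduce the study's workflow.

```python
import math


def radius_of_gyration(coords, masses):
    """Mass-weighted Rg for one frame.

    coords: list of (x, y, z) positions; masses: matching list of atomic masses.
    """
    total_mass = sum(masses)
    # Center of mass of the selection.
    com = [sum(m * c[k] for m, c in zip(masses, coords)) / total_mass
           for k in range(3)]
    # Mass-weighted mean squared distance from the center of mass.
    weighted_sq = sum(
        m * sum((c[k] - com[k]) ** 2 for k in range(3))
        for m, c in zip(masses, coords)
    )
    return math.sqrt(weighted_sq / total_mass)


# Toy example (hypothetical): two equal masses 2 nm apart give Rg = 1 nm.
rg = radius_of_gyration([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)], [12.0, 12.0])
```

Evaluating this quantity frame by frame gives the Rg time series; a flat series, as reported here, indicates that the protein neither expands nor collapses.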
Solvent-accessible surface area (SASA) is a crucial parameter for evaluating protein-solvent interactions, as it measures the exposure of protein residues to water molecules [74]. Changes in SASA can influence the protein’s structure, dynamics, and function [75]. SASA analysis was conducted to evaluate the effect of ligand binding on the conformational behavior of the FGFR1 protein over the 200 ns MDS (Fig 14). The SASA ranged from 155 to 185 nm² for most complexes, showing trends similar to that of the apo form (black), with the exception of the M34 complex (magenta = 172.94 ± 4.48 nm²), which exhibited slightly higher SASA due to minor surface adjustments after 80 ns. The M28 (blue = 166.00 ± 3.32 nm²), M29 (red = 166.83 ± 2.59 nm²), and M32 (maroon = 168.59 ± 3.75 nm²) complexes displayed average SASA comparable to that of the apo form (165.47 ± 3.13 nm²), supporting a stable surface geometry. The minimal variations observed (below 5 nm²) suggest that ligand binding did not significantly alter the protein’s hydrophobic regions or shape, ensuring consistent solvent accessibility and reinforcing the structural stability of most of the complexes, as discussed previously.
3.3.2. Thermodynamic stability assessment of the protein-ligand complexes.
The spontaneity and feasibility of complex formation reactions were evaluated by analyzing the binding free energy changes in the equilibrated segment of the MDS trajectory (20 ns, 200 frames) for the top four (MDS stable) ligand adducts, as outlined in Table 5. The table reflects the degree of spontaneity of complex formation from the separate protein and ligand. A negative ΔGBFE (ΔGBFE < 0) signifies a spontaneous complex formation reaction, and a more negative value corresponds to higher stability [76].
All top-ranked protein-ligand complexes exhibited negative ΔGBFE (ranging from −21.87 to −12.76 kcal/mol), affirming the spontaneity and feasibility of the complex formation reactions. Among them, the M34-complex demonstrated the highest thermodynamic stability, with the lowest ΔGBFE of −21.87 ± 3.98 kcal/mol. Analysis of the thermodynamic components revealed that the polar solvation contribution from the Poisson-Boltzmann model (pb) had a significant destabilizing influence across all the complexes. Nevertheless, this adverse effect was outweighed by substantial favorable (negative) contributions from electrostatic (el), van der Waals (vdW), and non-polar (np) interactions. These findings suggest that the top four ligands exhibit a natural propensity to associate with the FGFR1 receptor, forming energetically favorable and stable complexes throughout the simulation period. Notably, M34 stood out as the most stable adduct based on its overall energy profile.
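The interplay of the energy components can be illustrated with a short sketch, assuming the common end-point decomposition in which the binding free energy is the sum of gas-phase terms (vdW and el) and solvation terms (pb and np), with the entropic term omitted. The function name and all numbers below are placeholders chosen for illustration only; they are not values from Table 5.

```python
def binding_free_energy(vdw, el, pb, nonpolar):
    """Sum end-point energy components (kcal/mol) into a binding free energy.

    Assumes dG = (vdW + el) + (pb + nonpolar), i.e. no entropy term.
    """
    gas_phase = vdw + el        # favorable terms are negative
    solvation = pb + nonpolar   # pb is typically positive (unfavorable)
    return gas_phase + solvation


# Placeholder components, NOT taken from the study.
dg = binding_free_energy(vdw=-40.0, el=-20.0, pb=45.0, nonpolar=-5.0)
is_spontaneous = dg < 0
```

With these placeholder values, the unfavorable pb term (+45) is outweighed by the three favorable terms (−65 in total), so the net value is negative, mirroring the qualitative pattern described above.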
To further dissect the energetic contributions at the residue level, decomposition analysis was performed. M29 exhibited the most favorable binding energy (−13.57 ± 2.00 kcal/mol), primarily stabilized by residues such as Phe32, Gly33, and Val35. Minor destabilizing effects were noted from residues Asp184 and Lys57. Similarly, M28 showed strong binding affinity (−10.12 ± 1.98 kcal/mol), with stabilizing contributions from Asn111, Val35, and Leu173, while residues Lys57, Glu74, and Glu105 showed unfavorable effects. The M34-complex displayed moderate binding energy (−9.87 ± 2.00 kcal/mol), supported by residues Val35, Val104, Asn171, Leu173, and Ala183, and opposed by interactions with Lys57 and Asp184. In contrast, M32 demonstrated the weakest binding (−6.57 ± 1.27 kcal/mol), with stabilizing hydrophobic contacts at residues Leu27, Val35, and Leu173, but significant destabilization from Asp184 and Lys57. Detailed residue-wise free energy contributions are presented in Supplementary Information S1 Table in S1 File. Corresponding bar plots and heatmaps highlighting the interactions between active site residues of FGFR1 and the respective ligands are shown in Supplementary Information S1-S5 Figs in S1 File. In these heatmaps, blue shades denote favorable (negative value) contributions, while red shades represent unfavorable (positive value) ones [62].
Overall, the results indicate that hydrophobic and polar residues, particularly Val35 and Leu173, played dominant roles in stabilizing the ligand-FGFR1 complexes. M29 emerged as the most promising ligand candidate based on per-residue energetic contributions, while M34 demonstrated the greatest thermodynamic stability based on ΔGBFE. Collectively, these findings suggest that all top ligands formed energetically stable complexes with FGFR1, with M34 and M29 showing particularly strong and favorable interactions. These results highlight their potential as promising FGFR1-targeted inhibitors, although further experimental validation is required to confirm their therapeutic efficacy.
3.4. Protein-ligand interaction heatmaps and histograms
To understand the residue-level binding dynamics, amino acid interaction heatmaps and histograms were generated for the 200 ns (20,000 frames) simulation period and provided in Supplementary Information S6-S13 Figs in S1 File. The analyses revealed that all four ligands complexed with the FGFR1 protein predominantly exhibited stable van der Waals (vdW) interactions, with sporadic hydrogen bonding observed in some cases.
In the M28-complex, the ligand maintained strong and consistent vdW interactions with residues Leu27, Glu105, Ala107, and Ser108. Among these, Ser108 and Glu105 accounted for the highest number of contacts, followed by Ala107 and Leu27, suggesting their central role in ligand stabilization. M29 showed stable vdW contacts with residues Gly33, Gln34, and Met58, while transient interactions were observed with residues Gly28 and Glu29. A hydrogen bond with residue Asn20 emerged after frame 13,600. Gly33 and Gln34 were the most frequently engaged residues in the M29-complex, with moderate contacts involving residues Met58, Glu29, Gly28, and Asn202. In the M32-complex, the ligand demonstrated stable vdW contacts with residues Ala107 and Ser108 after frame 11,300, along with moderate transient interactions involving residues Leu27 and Glu29. Residue Leu27 also formed intermittent hydrogen bonds, further contributing to ligand retention. Ser108 was the most frequently interacting residue in this complex, followed by Ala107 and Leu27. For the M34-complex, stable vdW interactions were primarily observed with residues Glu29, Leu27, Gly18, and Arg170. Occasional hydrogen bonding was also noted with residue Leu27. The corresponding histogram indicated Glu29 as the most engaged residue, followed by Leu27, with moderate contributions from residues Arg170, Glu105, and Gly28.
Collectively, these findings underscore the importance of stable vdW contacts, particularly with residues Glu29, Gly33, Glu105, and Ser108 in securing ligand binding and maintaining complex stability. Their frequent engagement across multiple complexes highlights their potential relevance as key anchoring sites in FGFR1-ligand interactions.
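The histograms summarize how often each residue contacts the ligand across frames. A minimal sketch of that bookkeeping is given below, assuming the per-frame contacts have already been extracted as sets of residue labels. The function name and the four toy frames are hypothetical and only reuse residue names from the text for readability.

```python
from collections import Counter


def contact_frequencies(frames):
    """Fraction of frames in which each residue contacts the ligand.

    frames: iterable of per-frame collections of residue labels.
    Returns a dict ordered from most to least frequently engaged residue.
    """
    counts = Counter()
    n_frames = 0
    for contacts in frames:
        n_frames += 1
        counts.update(set(contacts))  # count each residue once per frame
    return {res: n / n_frames for res, n in counts.most_common()}


# Toy input (hypothetical frames, not simulation data).
toy_frames = [
    {"Glu29", "Leu27"},
    {"Glu29"},
    {"Glu29", "Arg170"},
    {"Leu27"},
]
freq = contact_frequencies(toy_frames)
```

Residues with a fraction close to 1 would correspond to the stable contacts described above, whereas low fractions indicate transient interactions.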
3.5. Similarity measure analysis
The chemical similarity of four top candidate compounds (as suggested by MDS) against known FGFR1 drugs was evaluated. Fig 15 shows how structural modifications influence similarities, providing insights into the potential effectiveness of new compounds using a two-color scheme to highlight conserved and divergent features [77]. Green regions represent structural elements shared with reference drugs, suggesting retention of critical pharmacophores needed for FGFR1 inhibition, while pink areas indicate novel modifications or distinct scaffolds [66].
The analysis revealed that M32 exhibits strong green overlap with infigratinib, implying that it likely maintains a similar binding mode and could serve as a high-priority candidate for lead optimization. Compounds M28 and M29 displayed a mix of green and pink regions, indicating partial structural conservation with opportunities for hybrid scaffold development to fine-tune selectivity or potency. In contrast, M34 showed prominent pink regions, suggesting unique structural features that may either introduce novel binding mechanisms or require further validation to confirm target specificity. The overall results suggest that M32 is the most promising candidate due to its strong structural similarity to known FGFR1 inhibitors, while M28, M29, and M34 offer opportunities for optimization or novel scaffold exploration.
The Dice similarity index calculated between the known and potential inhibitors of FGFR1 is shown in Table 6.
The Dice similarity index revealed that M32 has the strongest structural resemblance to infigratinib (0.452830), making it the most promising candidate. M28 showed moderate similarity across all inhibitors, with the highest value for infigratinib (0.313043). M29 exhibited comparable similarity to erdafitinib (0.251656) and pemigatinib (0.294872), indicating shared structural features. Meanwhile, M34 had the lowest overall similarity, peaking with futibatinib (0.263158). Overall, these results suggest that all hit compounds are viable leads, with M32 standing out as the most promising, warranting further optimization to enhance potency.
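For binary fingerprints, the Dice index is defined as twice the number of shared on-bits divided by the total number of on-bits in both fingerprints. The sketch below shows that formula on sets of bit positions. The function name and toy bit sets are assumptions; the fingerprint type used in the study is not restated here, and in practice a cheminformatics toolkit such as RDKit [34] would generate the fingerprints and compute the score.

```python
def dice_similarity(bits_a, bits_b):
    """Dice index for two fingerprints given as sets of on-bit positions."""
    a, b = set(bits_a), set(bits_b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    return 2 * len(a & b) / (len(a) + len(b))


# Toy fingerprints (hypothetical bit positions, not real molecules).
candidate = {1, 2, 3, 4}
reference = {3, 4, 5, 6, 7, 8}
score = dice_similarity(candidate, reference)  # 2 * 2 / (4 + 6) = 0.4
```

The index runs from 0 (no shared bits) to 1 (identical fingerprints), so the values in Table 6 indicate partial rather than close structural overlap with the reference drugs.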
3.6. Prediction of pIC50 values
In addition to classification models, a regression model was used to estimate the pIC50 values of the candidate compounds. Table 7 summarizes the performance metrics of the top 20 regression models.
Light Gradient Boosting Machine (LGBM), Hist Gradient Boosting (HGB), and Gamma Regressor (GR) were combined into a voting regressor for superior performance. For LGBM, the parameters were set to n_estimators = 600 and learning_rate = 0.05; for HGB, max_iter = 400 and learning_rate = 0.05; for GR, max_iter = 500 and alpha = 0.02; all other parameters were set to their default values.
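Conceptually, a voting regressor returns the average of the predictions made by its base models. The sketch below illustrates only that averaging step with stand-in models, assuming equal weights. The function name and the three lambda "models" are hypothetical placeholders and do not represent the trained LGBM, HGB, and GR estimators.

```python
def voting_predict(models, x):
    """Average the predictions of several regressors for one input x.

    models: callables that map a feature vector to a predicted pIC50.
    Equal weights are assumed for all base models.
    """
    predictions = [model(x) for model in models]
    return sum(predictions) / len(predictions)


# Stand-in base models returning fixed values (purely illustrative).
toy_models = [
    lambda x: 7.0,
    lambda x: 7.2,
    lambda x: 7.4,
]
ensemble_prediction = voting_predict(toy_models, x=[0, 1, 1, 0])
```

Averaging tends to reduce the variance of the individual estimators, which is the usual motivation for combining models of different types.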
The five-fold cross-validation of individual regression models and the voting regressor is shown in Table 8.
Experimental versus predicted pIC50 values are shown in Fig 16, while Table 9 compares these values for known and potential inhibitors.
The predicted pIC50 values of the hit candidate compounds (7.07–7.47), while lower than those of FDA-approved selective FGFR1 inhibitors, demonstrate promising lead-like potency. These values suggest that the compounds possess potential as FGFR1 inhibitors and could benefit from further structural optimization. Collectively, the findings support their candidacy for continued validation and development in FGFR1-targeted drug discovery.
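To put the predicted values on a concentration scale, recall that pIC50 is the negative base-10 logarithm of the IC50 expressed in mol/L. The helper below, whose name is chosen here for illustration, converts a pIC50 to nanomolar.

```python
def pic50_to_ic50_nm(pic50):
    """Convert pIC50 (= -log10 of IC50 in mol/L) to IC50 in nanomolar."""
    ic50_molar = 10 ** (-pic50)
    return ic50_molar * 1e9


# The reported range of predicted pIC50 values for the hit compounds.
lower_potency = pic50_to_ic50_nm(7.07)   # roughly 85 nM
higher_potency = pic50_to_ic50_nm(7.47)  # roughly 34 nM
```

On this scale, the predicted range of 7.07 to 7.47 corresponds to IC50 values of roughly 85 nM down to 34 nM, that is, sub-100 nM predicted potency pending experimental confirmation.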
Overall, the study identified four hit drug candidates as FGFR1 inhibitors by screening a large dataset. Combining machine learning-based virtual screening with in silico methods accelerated the preliminary drug discovery process, enabling efficient analysis of an extensive dataset within a short period despite resource constraints. This integration supports the reliable identification of high-potential drug candidates for FGFR1 inhibition and streamlines their selection for further studies.
4. Conclusion
The study demonstrates the effectiveness of combining AI-guided screening with molecular docking and dynamics simulations in identifying structurally stable and energetically favorable FGFR1 inhibitors. Four hit molecules were identified from a pool of 10 million molecules using this approach. Per-residue energy decomposition and long-term interaction histogram profiling offered valuable mechanistic insights into ligand-residue interactions, reinforcing the reliability of the identified candidates. These computational results offer a cost-effective and quick foundation for further investigation. Future work will focus on pharmacophore modelling, structural optimization of the hit compounds, and their biological evaluation through in vitro and in vivo studies to support preclinical development.
Supporting information
S1 File. Supplementary data.
This file contains additional data supporting the findings of this study.
https://doi.org/10.1371/journal.pone.0331837.s001
(DOCX)
Acknowledgments
The author(s) hereby declare that Artificial Intelligence (AI) technology (ChatGPT) has been used during the preparation of the work to improve the readability and language of the manuscript. After using this tool/service, the author(s) reviewed and edited the content as needed and take(s) full responsibility for the content of the published article.
References
- 1. Bray F, Laversanne M, Sung H, Ferlay J, Siegel RL, Soerjomataram I, et al. Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2024;74(3):229–63. pmid:38572751
- 2. Urbach D, Lupien M, Karagas MR, Moore JH. Cancer heterogeneity: origins and implications for genetic association studies. Trends Genet. 2012;28(11):538–43. pmid:22858414
- 3. Ornitz DM, Itoh N. The fibroblast growth factor signaling pathway. Wiley Interdiscip Rev Dev Biol. 2015;4(3):215–66. pmid:25772309
- 4. Perez-Garcia J, Muñoz-Couselo E, Soberino J, Racca F, Cortes J. Targeting FGFR pathway in breast cancer. Breast. 2018;37:126–33. pmid:29156384
- 5. Liu Q, Huang J, Yan W, Liu Z, Liu S, Fang W. FGFR families: biological functions and therapeutic interventions in tumors. MedComm (2020). 2023;4(5):e367. pmid:37750089
- 6. Turner N, Pearson A, Sharpe R, Lambros M, Geyer F, Lopez-Garcia MA, et al. FGFR1 amplification drives endocrine therapy resistance and is a therapeutic target in breast cancer. Cancer Res. 2010;70(5):2085–94. pmid:20179196
- 7. Shi Y, Ma Z, Cheng Q, Wu Y, Parris AB, Kong L, et al. FGFR1 overexpression renders breast cancer cells resistant to metformin through activation of IRS1/ERK signaling. Biochim Biophys Acta Mol Cell Res. 2021;1868(1):118877. pmid:33007330
- 8. Wang K, Ji W, Yu Y, Li Z, Niu X, Xia W, et al. FGFR1-ERK1/2-SOX2 axis promotes cell proliferation, epithelial-mesenchymal transition, and metastasis in FGFR1-amplified lung cancer. Oncogene. 2018;37(39):5340–54. pmid:29858603
- 9. Gorringe KL, Jacobs S, Thompson ER, Sridhar A, Qiu W, Choong DYH, et al. High-resolution single nucleotide polymorphism array analysis of epithelial ovarian cancer reveals numerous microdeletions and amplifications. Clin Cancer Res. 2007;13(16):4731–9. pmid:17699850
- 10. Lee Y-Y, Ryu J-Y, Cho Y-J, Choi J-Y, Choi J-J, Choi CH, et al. The anti-tumor effects of AZD4547 on ovarian cancer cells: differential responses based on c-Met and FGF19/FGFR4 expression. Cancer Cell Int. 2024;24(1):43. pmid:38273381
- 11. Ross JS, Wang K, Al-Rohil RN, Nazeer T, Sheehan CE, Otto GA, et al. Advanced urothelial carcinoma: next-generation sequencing reveals diverse genomic alterations and targets of therapy. Mod Pathol. 2014;27(2):271–80. pmid:23887298
- 12. Edwards J, Krishna NS, Witton CJ, Bartlett JMS. Gene amplifications associated with the development of hormone-resistant prostate cancer. Clin Cancer Res. 2003;9(14):5271–81. pmid:14614009
- 13. Ko J, Meyer AN, Haas M, Donoghue DJ. Characterization of FGFR signaling in prostate cancer stem cells and inhibition via TKI treatment. Oncotarget. 2021;12(1):22–36. pmid:33456711
- 14. Schäfer MH, Lingohr P, Sträßer A, Lehnen NC, Braun M, Perner S, et al. Fibroblast growth factor receptor 1 gene amplification in gastric adenocarcinoma. Hum Pathol. 2015;46(10):1488–95. pmid:26239623
- 15. Katoh M. Therapeutics Targeting FGF Signaling Network in Human Diseases. Trends Pharmacol Sci. 2016;37(12):1081–96. pmid:27992319
- 16. Zhang P, Yue L, Leng Q, Chang C, Gan C, Ye T, et al. Targeting FGFR for cancer therapy. J Hematol Oncol. 2024;17(1):39. pmid:38831455
- 17. Touat M, Ileana E, Postel-Vinay S, André F, Soria J-C. Targeting FGFR signaling in cancer. Clin Cancer Res. 2015;21(12):2684–94. pmid:26078430
- 18. Du S, Zhang Y, Xu J. Current progress in cancer treatment by targeting FGFR signaling. Cancer Biol Med. 2023;20:490–9.
- 19. Facchinetti F, Hollebecque A, Braye F, Vasseur D, Pradat Y, Bahleda R, et al. Resistance to selective FGFR inhibitors in FGFR-driven urothelial cancer. Cancer Discov. 2023;13(9):1998–2011. pmid:37377403
- 20. Hinkson IV, Madej B, Stahlberg EA. Accelerating therapeutics for opportunities in medicine: a paradigm shift in drug discovery. Front Pharmacol. 2020;11:770. pmid:32694991
- 21. Paul SM, Mytelka DS, Dunwiddie CT, Persinger CC, Munos BH, Lindborg SR, et al. How to improve R&D productivity: the pharmaceutical industry’s grand challenge. Nat Rev Drug Discov. 2010;9(3):203–14. pmid:20168317
- 22. Shah FA, Qadir H, Khan JZ, Faheem M. A review: from old drugs to new solutions: the role of repositioning in alzheimer’s disease treatment. Neuroscience. 2025;576:167–81. pmid:40164279
- 23. Yu W, MacKerell AD. Computer-Aided Drug Design Methods. In: Sass P, editor. Antibiotics: Methods and Protocols. New York, NY: Springer; 2017. pp. 85–106.
- 24. Sydow D, Morger A, Driller M, Volkamer A. TeachOpenCADD: a teaching platform for computer-aided drug design using open source packages and data. J Cheminform. 2019;11(1):29. pmid:30963287
- 25. Chhetri SP, Bhandari VS, Maharjan R, Lamichhane TR. Identification of lead inhibitors for 3CLpro of SARS-CoV-2 target using machine learning based virtual screening, ADMET analysis, molecular docking and molecular dynamics simulations. RSC Adv. 2024;14(40):29683–92. pmid:39297030
- 26. Salimi A, Lim JH, Jang JH, Lee JY. The use of machine learning modeling, virtual screening, molecular docking, and molecular dynamics simulations to identify potential VEGFR2 kinase inhibitors. Sci Rep. 2022;12(1):18825. pmid:36335233
- 27. Schneider P, Walters WP, Plowright AT, Sieroka N, Listgarten J, Goodnow RA Jr, et al. Rethinking drug design in the artificial intelligence era. Nat Rev Drug Discov. 2020;19(5):353–64. pmid:31801986
- 28. Zdrazil B, Felix E, Hunter F, Manners EJ, Blackshaw J, Corbett S, et al. The ChEMBL Database in 2023: a drug discovery platform spanning multiple bioactivity data types and time periods. Nucleic Acids Res. 2024;52(D1):D1180–92. pmid:37933841
- 29. Lipinski CA, Lombardo F, Dominy BW, Feeney PJ. Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Adv Drug Delivery Rev. 1997;23(1–3):3–25.
- 30. Doak BC, Over B, Giordanetto F, Kihlberg J. Oral druggable space beyond the rule of 5: insights from drugs and clinical candidates. Chem Biol. 2014;21(9):1115–42. pmid:25237858
- 31. Cereto-Massagué A, Ojeda MJ, Valls C, Mulero M, Garcia-Vallvé S, Pujadas G. Molecular fingerprint similarity search in virtual screening. Methods. 2015;71:58–63. pmid:25132639
- 32. Willett P. Similarity-based virtual screening using 2D fingerprints. Drug Discov Today. 2006;11(23–24):1046–53. pmid:17129822
- 33. Boldini D, Ballabio D, Consonni V, Todeschini R, Grisoni F, Sieber SA. Effectiveness of molecular fingerprints for exploring the chemical space of natural products. J Cheminform. 2024;16(1):35. pmid:38528548
- 34. Landrum G. RDKit: Open-source cheminformatics software. 2016. Available from: https://www.rdkit.org/
- 35. Kwon S, Bae H, Jo J, Yoon S. Comprehensive ensemble in QSAR prediction for drug discovery. BMC Bioinformatics. 2019;20(1):521. pmid:31655545
- 36. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, et al. Scikit-learn: Machine Learning in Python. J Mach Learn Res. 2011;12: 2825–30.
- 37. Chen T, Guestrin C. XGBoost: A Scalable Tree Boosting System. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York, NY, USA: Association for Computing Machinery; 2016. pp. 785–94.
- 38. Ruta D, Gabrys B. Classifier selection for majority voting. Inf Fusion. 2005;6(1):63–81.
- 39. Lima AN, Philot EA, Trossini GHG, Scott LPB, Maltarollo VG, Honorio KM. Use of machine learning approaches for novel drug discovery. Expert Opin Drug Discov. 2016;11(3):225–39. pmid:26814169
- 40. eMolecules. [cited 4 Jan 2024]. Available from: https://search.emolecules.com/
- 41. Luque A, Carrasco A, Martín A, de las Heras A. The impact of class imbalance in classification performance metrics based on the binary confusion matrix. Pattern Recogn. 2019;91:216–31.
- 42. Chicco D, Warrens MJ, Jurman G. The coefficient of determination R-squared is more informative than SMAPE, MAE, MAPE, MSE and RMSE in regression analysis evaluation. PeerJ Comput Sci. 2021;7:e623. pmid:34307865
- 43. Kim S, Chen J, Cheng T, Gindulyte A, He J, He S, et al. PubChem 2023 update. Nucleic Acids Res. 2023;51(D1):D1373–80. pmid:36305812
- 44. Hanwell MD, Curtis DE, Lonie DC, Vandermeersch T, Zurek E, Hutchison GR. Avogadro: an advanced semantic chemical editor, visualization, and analysis platform. J Cheminform. 2012;4(1):17. pmid:22889332
- 45. Morris GM, Huey R, Lindstrom W, Sanner MF, Belew RK, Goodsell DS, et al. AutoDock4 and AutoDockTools4: Automated docking with selective receptor flexibility. J Comput Chem. 2009;30(16):2785–91. pmid:19399780
- 46. Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, et al. The Protein Data Bank. Nucleic Acids Res. 2000;28(1):235–42. pmid:10592235
- 47. Waterhouse A, Bertoni M, Bienert S, Studer G, Tauriello G, Gumienny R, et al. SWISS-MODEL: homology modelling of protein structures and complexes. Nucleic Acids Res. 2018;46(W1):W296–303. pmid:29788355
- 48. Trott O, Olson AJ. AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. J Comput Chem. 2010;31(2):455–61. pmid:19499576
- 49. Phunyal A, Adhikari A, Adhikari Subin J. In silico exploration of potent flavonoids for dengue therapeutics. PLoS One. 2024;19(12):e0301747. pmid:39666626
- 50. Yuan S, Chan HCS, Hu Z. Using PyMOL as a platform for computational drug design. WIREs Comput Mol Sci. 2017;7(2).
- 51. Adasme MF, Linnemann KL, Bolz SN, Kaiser F, Salentin S, Haupt VJ, et al. PLIP 2021: expanding the scope of the protein-ligand interaction profiler to DNA and RNA. Nucleic Acids Res. 2021;49(W1):W530–4. pmid:33950214
- 52. Abraham MJ, Murtola T, Schulz R, Páll S, Smith JC, Hess B, et al. GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX. 2015;1–2:19–25.
- 53. Huang J, MacKerell AD Jr. CHARMM36 all-atom additive protein force field: validation based on comparison to NMR data. J Comput Chem. 2013;34(25):2135–45. pmid:23832629
- 54. Zoete V, Cuendet MA, Grosdidier A, Michielin O. SwissParam: a fast force field generation tool for small organic molecules. J Comput Chem. 2011;32(11):2359–68. pmid:21541964
- 55. Sharma BP, Adhikari Subin J, Marasini BP, Adhikari R, Pandey SK, Sharma ML. Triazole based Schiff bases and their oxovanadium(IV) complexes: Synthesis, characterization, antibacterial assay, and computational assessments. Heliyon. 2023;9(4):e15239. pmid:37089299
- 56. Neupane P, Adhikari Subin J, Adhikari R. Assessment of iridoids and their similar structures as antineoplastic drugs by in silico approach. J Biomol Struct Dyn. 2024:1–16. pmid:38345021
- 57. Subin JA, Shrestha RLS. Computational Assessment of the Phytochemicals of Panax ginseng C.A. Meyer Against Dopamine Receptor D1 for Early Huntington’s Disease Prophylactics. Cell Biochem Biophys. 2024;82(4):3413–23. pmid:39046621
- 58. Lal Swagat Shrestha R, Marasini BP, Adhikari Subin J. Phytochemicals of Swertia chirayita Roxb. ex Fleming against malarial dihydroorotate dehydrogenase: an in silico study. Discov Mol. 2024;1(1).
- 59. Wang E, Sun H, Wang J, Wang Z, Liu H, Zhang JZH, et al. End-Point Binding Free Energy Calculation with MM/PBSA and MM/GBSA: Strategies and Applications in Drug Design. Chem Rev. 2019;119(16):9478–508. pmid:31244000
- 60. Lal Swagat Shrestha R, Maharjan B, Shrestha T, Prasad Marasini B, Adhikari Subin J. Geometrical and thermodynamic stability of govaniadine scaffold adducts with dopamine receptor D1. Results Chem. 2024;7:101363.
- 61. Valdés-Tresanco MS, Valdés-Tresanco ME, Valiente PA, Moreno E. gmx_MMPBSA: A New Tool to Perform End-State Free Energy Calculations with GROMACS. J Chem Theory Comput. 2021;17(10):6281–91. pmid:34586825
- 62. Kumari R, Kumar R, Open Source Drug Discovery Consortium, Lynn A. g_mmpbsa--a GROMACS tool for high-throughput MM-PBSA calculations. J Chem Inf Model. 2014;54(7):1951–62. pmid:24850022
- 63. Bender A, Glen RC. Molecular similarity: a key technique in molecular informatics. Org Biomol Chem. 2004;2(22):3204–18. pmid:15534697
- 64. Eckert H, Bajorath J. Molecular similarity analysis in virtual screening: foundations, limitations and novel approaches. Drug Discov Today. 2007;12(5–6):225–33. pmid:17331887
- 65. Bero SA, Muda AK, Choo YH, Muda NA, Pratama SF. Similarity measure for molecular structure: a brief review. J Phys: Conf Ser. 2017;892:012015.
- 66. Riniker S, Landrum GA. Similarity maps - a visualization strategy for molecular fingerprints and machine-learning methods. J Cheminform. 2013;5(1):43. pmid:24063533
- 67. Kumari S, Kumar D, Mittal M. An ensemble approach for classification and prediction of diabetes mellitus using soft voting classifier. Int J Cogn Comput Eng. 2021;2:40–6.
- 68. Ramírez D, Caballero J. Is it reliable to take the molecular docking top scoring position as the best solution without considering available structural data? Molecules. 2018;23(5):1038. pmid:29710787
- 69. Fargher HA, Sherbow TJ, Haley MM, Johnson DW, Pluth MD. C-H⋯S hydrogen bonding interactions. Chem Soc Rev. 2022;51(4):1454–69. pmid:35103265
- 70. Almoyad MAA, Wahab S, Ansari MN, Ahmad W, Hani U, Chandra S. Predictive insights into plant-based compounds as fibroblast growth factor receptor 1 inhibitors: a combined molecular docking and dynamics simulation study. J Biomol Struct Dyn. 2024;:1–10. pmid:38669200
- 71. Mandle RJ. Implementation of a cylindrical distribution function for the analysis of anisotropic molecular dynamics simulations. PLoS One. 2022;17(12):e0279679. pmid:36584026
- 72. Aljarba NH, Hasnain MS, Bin-Meferij MM, Alkahtani S. An in-silico investigation of potential natural polyphenols for the targeting of COVID main protease inhibitor. J King Saud Univ Sci. 2022;34(7):102214. pmid:35811756
- 73. Falsafi-Zadeh S, Karimi Z, Galehdari H. VMD DisRg: New User-Friendly Implement for calculation distance and radius of gyration in VMD program. Bioinformation. 2012;8(7):341–3. pmid:22553393
- 74. Das R, Bhattarai A, Karn R, Tamang B. Computational investigations of potential inhibitors of monkeypox virus envelope protein E8 through molecular docking and molecular dynamics simulations. Sci Rep. 2024;14(1):19585. pmid:39179615
- 75. Ghahremanian S, Rashidi MM, Raeisi K, Toghraie D. Molecular dynamics simulation approach for discovering potential inhibitors against SARS-CoV-2: a structural review. J Mol Liq. 2022;354:118901. pmid:35309259
- 76. Dhital S, Parajuli N, Poudel M, Shrestha T, Bharati S, Maharjan B, et al. Spatial and Energetic Stability Assessment of the Adducts of Phytocompounds of Piper longum L. with α-amylase by Computational Approach. Biointerface Res Appl Chem. 2024;14(6):126.
- 77. Riniker S, Landrum GA. Open-source platform to benchmark fingerprints for ligand-based virtual screening. J Cheminform. 2013;5(1):26. pmid:23721588