Abstract
Human Immunodeficiency Virus type 1 protease (HIV-1 PR) is one of the most challenging targets of the antiretroviral therapy used to treat people living with HIV/AIDS. The performance of protease inhibitors (PIs) is limited by the emergence of protease mutations that promote resistance to treatment. The current study was carried out using statistical and bioinformatics tools. A series of thirty-three compounds with known enzymatic inhibitory activities against HIV-1 protease was used to build a mathematical model relating structure to biological activity. The compounds were designed using software, and their descriptors were computed using Gaussian, Chem3D, ChemSketch and MarvinSketch. The best model was selected on the basis of its statistical parameters, and its applicability domain (AD) was defined. Furthermore, one compound is proposed as an efficient HIV-1 protease inhibitor with biological activity comparable to that of the existing ones; this drug candidate was evaluated using ADMET properties and Lipinski’s rule. Molecular docking performed on wild-type and mutant-type HIV-1 proteases allowed the investigation of the types of interaction between the proteases and the ligands, Darunavir (DRV) and the new drug (ND). Molecular dynamics simulation was also used to investigate the stability of the complexes, allowing a comparative study of the performance of both ligands (DRV & ND). Our study suggests that the new molecule shows results comparable to those of Darunavir and may be used for further experimental studies. Our study may also be used as a pipeline to search for and design new potential inhibitors of HIV-1 protease.
Citation: Baassi M, Moussaoui M, Soufi H, Rajkhowa S, Sharma A, Sinha S, et al. (2023) Towards designing of a potential new HIV-1 protease inhibitor using QSAR study in combination with Molecular docking and Molecular dynamics simulations. PLoS ONE 18(4): e0284539. https://doi.org/10.1371/journal.pone.0284539
Editor: Arabinda Ghosh, Gauhati University, INDIA
Received: November 9, 2022; Accepted: April 1, 2023; Published: April 20, 2023
Copyright: © 2023 Baassi et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the paper.
Funding: The authors received no specific funding for this work.
Competing interests: The authors have declared that no competing interests exist.
Introduction
Human Immunodeficiency Virus (HIV) is one of the most challenging viruses in medicine, causing severe complications for human health [1]. HIV, which is responsible for Acquired Immunodeficiency Syndrome (AIDS), still has no cure after more than three decades [2]. This is the main reason why synthesized drugs are used in combination to treat HIV infection [3,4]. Highly active antiretroviral therapy (HAART) attacks multiple stages of the HIV life cycle and stops the virus from making copies of itself in the body, thus reducing the mortality and morbidity rates of HIV/AIDS [3,5–7].
Antiretroviral therapy plays an essential role in the treatment of HIV/AIDS, but the rapid evolution of multidrug-resistant (MDR) strains of HIV-1 protease (PR), together with poor oral bioavailability and side effects, has severely restricted long-term treatment benefits [8,9].
PIs are intended to suppress viral replication. However, some residual viral activity persists throughout therapy, leading to the development of drug-resistant strains with various mutations that decrease the affinity of the protease for the inhibitors. These mutations are found not only inside the active site, where they directly affect inhibitor binding, but also outside the binding site [10–12].
According to the International AIDS Society, 23 mutations in 16 codons of the protease gene associated with significant resistance to PIs have been identified using phenotypic resistance assays [13].
Therefore, the design of new HIV-1 PIs has become a necessity. With the aim of increasing inhibitory activity and overcoming drug resistance, a series of 33 compounds was synthesized and evaluated for antiretroviral activity in previous work [14]. The primary purpose of this study is to develop a Quantitative Structure Activity Relationship (QSAR) model that relates the structural features (descriptors) of these drug candidates to their biological activity against HIV-1 protease.
The QSAR method is based on computational methods, aiming at relating the activity (y) to the chemical properties (x), y = f(x) [15]. To achieve this, we need a series of compounds with well-known biological activities (y), and for each compound, we compute several descriptors (x) using various software, incorporating the DFT method [16,17].
Once the QSAR model is built and statistically validated, it can be used to predict and estimate the activity of new compounds, saving time, effort, and cost [18]. The flow chart (Fig 1) gives an overview of the main steps followed in this research.
Material and methods
Chemical compounds and descriptors
Ten HIV-1 protease inhibitors have been approved by the Food and Drug Administration (FDA), but the emergence of multidrug-resistant (MDR) strains has limited long-term treatment options [19–22]; therefore, the search for new efficient drugs has become a necessity.
Thirty-three new compounds were synthesized and evaluated in a previous study to determine their optimal biological activity [14] (S1 Table in S1 File).
Meanwhile, this work is based on computing various descriptors (Topological, Constitutional, Geometrical, Physicochemical, and Quantum) of the compounds mentioned above using several software packages (Gaussian, Chem3D, ChemSketch, and MarvinSketch).
Descriptive analysis
The computed descriptors must be analyzed to generate a computational model that relates the biological activity of these compounds to the structure (descriptors).
To do so, we used two methods. The first is Principal Component Analysis (PCA), whose main purpose is to remove correlated descriptors and thereby reduce the dimension of the data representation space. The second is a clustering method, k-means partitioning, used to split the dataset into a training set for model generation and a test set for validation.
Statistical analysis
Multiple linear regression.
Multiple linear regression (MLR) is a statistical technique that uses several independent variables (descriptors) to predict the outcome of a response variable (biological activity); it is used here to establish a linear model relating the activity (dependent variable) to the descriptors that correlate most strongly with it [23].
The linear model takes the following form: Y = a0 + a1x1 + a2x2 + … + anxn, where Y represents the biological activity (dependent variable), a0 is the intercept of the equation, xi are the molecular descriptors, and ai are their coefficients.
Model generation.
A QSAR model was generated using XLSTAT software after analyzing the data with both methods (PCA and k-means) [24]; after validation, the model was used to predict the activity of new compounds that may be more efficient HIV-1 protease inhibitors.
In order to assess the physicochemical influence of the substituents (structure/descriptors) on the biological activity, we submitted the dataset, consisting of the descriptors of the 33 compounds listed previously and their biological activities, to an MLR analysis.
To select the best-performing regression, we used several coefficients: r, r2, r2adj, MSE and the P value [25,26], where r is the correlation coefficient, r2 is the coefficient of determination, r2adj is the coefficient of determination adjusted for degrees of freedom, MSE is the mean squared error, and the P value is the probability associated with the Fisher statistic.
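As a minimal sketch of how these coefficients relate to one another, the function below computes r2, adjusted r2 and MSE from observed and predicted activities. The function name, the example values and the convention MSE = SSE/(n − p − 1) are assumptions made for illustration; the XLSTAT output remains the reference for the values reported in this study.

```python
def regression_stats(y_obs, y_pred, n_descriptors):
    """Return (r2, r2_adj, mse) for a fitted linear model."""
    n = len(y_obs)
    mean_y = sum(y_obs) / n
    sse = sum((o - p) ** 2 for o, p in zip(y_obs, y_pred))  # residual sum of squares
    sst = sum((o - mean_y) ** 2 for o in y_obs)              # total sum of squares
    dof = n - n_descriptors - 1                              # residual degrees of freedom
    r2 = 1.0 - sse / sst
    r2_adj = 1.0 - (1.0 - r2) * (n - 1) / dof
    mse = sse / dof
    return r2, r2_adj, mse


# Hypothetical observed and predicted pIC50 values (not taken from the paper).
y_obs = [1.0, 2.0, 3.0, 4.0, 5.0]
y_pred = [1.1, 1.9, 3.2, 3.8, 5.0]
r2, r2_adj, mse = regression_stats(y_obs, y_pred, n_descriptors=1)
```

The adjusted coefficient penalizes r2 for the number of descriptors, which is why it is always the lower of the two for a model with at least one descriptor.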
Model validation.
The model generated by MLR analysis must be validated to evaluate its significance and confirm its predictive ability. To this end, we used internal and external validation.
Internal validation. In leave-one-out cross-validation (LOOCV), one compound is removed from the training set and a model is rebuilt from the remaining compounds; this model is used to predict the activity of the removed compound, which is then returned to the training set. The cycle is repeated until every compound has been left out once, and a cross-validated correlation coefficient Q2 is then computed [27].
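The leave-one-out cycle can be sketched as follows. For brevity the sketch fits a single-descriptor line rather than a full multi-descriptor MLR model, and it assumes the common definition Q2 = 1 − PRESS/SST; the function names and data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a0 + a1 * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a1 = sxy / sxx
    return my - a1 * mx, a1


def q2_loo(xs, ys):
    """Leave-one-out cross-validated Q2 = 1 - PRESS / SST."""
    n = len(ys)
    mean_y = sum(ys) / n
    press = 0.0
    for i in range(n):
        # Refit without compound i, then predict the left-out compound.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        a0, a1 = fit_line(train_x, train_y)
        press += (ys[i] - (a0 + a1 * xs[i])) ** 2
    sst = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / sst


# Hypothetical descriptor and activity values.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.2, 1.9, 3.1, 3.9, 5.2, 5.8]
q2 = q2_loo(xs, ys)
```

Because each prediction is made by a model that never saw the compound in question, Q2 is generally lower than the fitted R2 of the same model.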
External validation. Besides internal validation, external validation is essential. The k-means clustering method allowed us to divide the dataset into a training set and a test set; the test set is used at this stage. The obtained model is used to predict the activities of the test set compounds, and the regression coefficient (R2cv) is computed [28].
Applicability domain.
The model was built from the training set, so it is valid only for compounds similar to those included in the training set. Therefore, new molecules must belong to the domain covered by the training set. A model without an applicability domain would predict the activity of any compound, regardless of how its features differ from those of the training set compounds. The AD is thus a tool for detecting compounds that fall outside the applicability domain of the QSAR model, as well as outliers in the training set [29].
Molecular docking
Molecular docking is an important technique used to predict the binding affinities of a large number of small molecules for the protease; it generates several conformations of the ligand-protease complex, which are ranked according to their affinity [30].
The main purpose of the molecular docking study is to assess the binding energy as well as the types of interaction between the ligands and the protease [31].
ADMET properties
The Absorption, Distribution, Metabolism, Excretion, and Toxicity (ADMET) properties are crucial for the effectiveness and safety of a therapeutic compound. More than 50% of clinical trials fail because of inadequate ADMET properties [32]. Therefore, computing ADMET properties with dedicated servers can significantly reduce the probability of failure during drug development.
These properties can be predicted using many servers, such as pkCSM [33] and SwissADME [34]. The predicted properties include a drug-likeness assessment based on Lipinski’s rule. Compounds that meet Lipinski’s rule and have a bioavailability score of 0.55 are considered sufficiently absorbable via the oral route [35,36].
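For illustration, Lipinski’s rule of five can be checked with a few comparisons. The sketch below uses the standard thresholds (molecular weight ≤ 500, logP ≤ 5, at most 5 hydrogen-bond donors and 10 acceptors) and assumes that a single violation is tolerated; the function names and the example compound are hypothetical and are not output from SwissADME or pkCSM.

```python
def lipinski_violations(mol_weight, log_p, h_donors, h_acceptors):
    """Count violations of Lipinski's rule of five."""
    checks = [
        mol_weight > 500,   # molecular weight above 500 Da
        log_p > 5,          # octanol-water partition coefficient above 5
        h_donors > 5,       # more than 5 hydrogen-bond donors
        h_acceptors > 10,   # more than 10 hydrogen-bond acceptors
    ]
    return sum(checks)


def is_drug_like(mol_weight, log_p, h_donors, h_acceptors, max_violations=1):
    """Accept a compound with at most `max_violations` violations."""
    return lipinski_violations(mol_weight, log_p, h_donors, h_acceptors) <= max_violations


# Hypothetical compound slightly above the molecular-weight limit.
n_viol = lipinski_violations(mol_weight=560.0, log_p=3.2, h_donors=3, h_acceptors=9)
ok = is_drug_like(560.0, 3.2, 3, 9)
```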
Molecular dynamics simulation
Molecular dynamics simulation is a powerful tool for predicting the properties and motion of molecular systems [37]. In this work, we aim to predict the dynamic behavior of the complex formed by HIV-1 protease and the proposed ligand in order to check the stability of the docked complex [38]. For the molecular dynamics simulations and MM-PBSA calculations, we adopted a methodology similar to that of a previous study [39].
Results and discussions
Chemical compounds
A series of thirty-three compounds (inhibitors with purine base amine-acetamide as P2-ligands), synthesized and evaluated for their biological activities in previous work, are the key elements of the current research; their molecular structures are listed in S1 Table in S1 File.
Dependent variable values
The experimental IC50 values (biological activity) were transformed to the negative logarithm of IC50 using the following equation: pIC50 = -log (IC50). The results are listed in Table 1.
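The transformation is a one-line computation. The sketch below assumes a base-10 logarithm and an IC50 expressed in the same unit as in the source table; the example IC50 value is hypothetical and was chosen only because it maps to a pIC50 of about 1.37.

```python
import math


def pic50(ic50):
    """Convert an IC50 value to pIC50 = -log10(IC50)."""
    if ic50 <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(ic50)


# Hypothetical IC50 value (unit as used in the source table).
value = pic50(0.0427)
```

A lower IC50 (a more potent inhibitor) gives a higher pIC50, which is why the model is built on pIC50 rather than on IC50 directly.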
Descriptors generation
Several software packages (Gaussian, Chem3D, ChemSketch and MarvinSketch) were used to compute various descriptors, but only descriptors correlated with the activity were retained, in order to reduce the size of the data representation space. The quantum descriptors were obtained with the DFT approach implemented in the Gaussian 09 program package, using the hybrid B3LYP method, which combines Becke’s three-parameter exchange functional with the Lee-Yang-Parr correlation functional, and the 6-31G(d,p) basis set to optimize the geometries of the compounds. All other parameters were computed using Chem3D, ChemSketch and MarvinSketch (S2 Table in S1 File).
Principal component analysis
Using Principal Component Analysis, the size of the data representation space was reduced by retaining the descriptors whose correlation coefficient with the activity is higher than 0.1 in absolute value (Table 2); the absence of collinearity between the descriptors used to build the model was checked with the correlation matrix.
K-Means Cluster Analysis (k-MCA)
A clustering method, k-means partitioning, was used to split the dataset into a training set for model generation and a test set for its validation (Table 3). The dataset was divided into five clusters. Five compounds were selected randomly, one from each cluster, to form the test set (16f, 17g, 18d, 16b and 17h), while the remaining compounds form the training set. The training set was used to generate the model, and the test set was used to validate it.
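The selection of one compound per cluster can be sketched as follows, assuming that cluster labels have already been obtained from k-means. The compound names, labels, seed and function name below are hypothetical; the actual cluster assignments are those of Table 3.

```python
import random


def split_by_cluster(cluster_of, seed=0):
    """Pick one compound per cluster for the test set; the rest form the training set."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    clusters = {}
    for compound, cluster in cluster_of.items():
        clusters.setdefault(cluster, []).append(compound)
    test = [rng.choice(sorted(members)) for _, members in sorted(clusters.items())]
    train = [c for c in cluster_of if c not in test]
    return train, test


# Hypothetical cluster labels for seven compounds in three clusters.
cluster_of = {"c1": 1, "c2": 1, "c3": 2, "c4": 2, "c5": 3, "c6": 3, "c7": 3}
train, test = split_by_cluster(cluster_of, seed=42)
```

Drawing the test compounds from every cluster keeps the test set inside the chemical space spanned by the training set, which matters for the applicability domain discussed later.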
Multiple linear regression (MLR)
Model generation.
The model was built using XLSTAT, a statistical add-on for Excel.
MLR equation:
pIC50 = −3 − 0.59*EGap + 1.27*HLC − 0.033*PSA − 0.015*DE
Statistical parameters:
R2 = 0.66; R2Adj = 0.60; MSE = 0.18; P value < 10^-4; F = 11.23
For the model above, the P value is lower than 0.0001, which means that we take a risk of less than 0.01% in rejecting the null hypothesis (no effect of the descriptors on the activity); we can therefore assume that the variables included in the model carry a significant amount of information. The values of R2 and R2Adj, together with the low MSE, support the predictive ability and reliability of the proposed model.
Multi-collinearity among the descriptors was investigated with the variance inflation factor (VIF); the highest value (VIF = 0.62) is less than ten, which further confirms the absence of a multi-collinearity problem [40,41]. The VIF values are shown in Table 4.
Model interpretation:
In the proposed model, descriptors that are influencing the activity negatively are the Energy Gap (EGap), the Polar Surface Area (PSA) and the Dreiding Energy (DE), while only one parameter has a positive influence on the activity, which is Henry’s Law Constant (HLC).
- EGap displays a negative sign in the model, which means that increasing the activity requires minimizing EGap value, as well as PSA and DE.
- HLC shows a positive sign in the model, allowing us to conclude that increasing the activity is achieved by increasing HLC.
To sum up, the biological activity is influenced by four variables (EGap, HLC, PSA and DE). To increase the biological activity, EGap, PSA and DE must be decreased while HLC is increased.
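The sign analysis can be checked numerically by evaluating the MLR equation. In the sketch below, the coefficients are those of the reported equation, with the intercept read as −3; the descriptor values and the function name are hypothetical and serve only to show the direction of each effect.

```python
# Coefficients of the reported MLR equation (intercept read as -3).
INTERCEPT = -3.0
COEFFS = {"EGap": -0.59, "HLC": 1.27, "PSA": -0.033, "DE": -0.015}


def predict_pic50(descriptors):
    """Evaluate pIC50 = a0 + sum(ai * xi) for one compound."""
    return INTERCEPT + sum(COEFFS[name] * descriptors[name] for name in COEFFS)


# Hypothetical descriptor values, not those of any compound in the series.
base = {"EGap": 4.0, "HLC": 8.0, "PSA": 150.0, "DE": 100.0}
lower_gap = dict(base, EGap=3.5)   # smaller energy gap
higher_hlc = dict(base, HLC=8.5)   # larger Henry's law constant

baseline = predict_pic50(base)
```

With these inputs, lowering EGap or raising HLC both increase the predicted pIC50, in line with the interpretation given above.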
Internal and external validation.
The model proposed, despite its statistical parameters, must be validated following two steps:
- Internal validation (leave-one-out cross-validation and Y-randomization test):
The leave-one-out cross-validation technique yields the model’s cross-validation coefficient, Q2LOO, which is used as evidence of both the robustness and the predictive capacity of the model [42]. The robustness of the model was confirmed with a cross-validation value of 0.53 (Q2LOO = 0.53).
Y-randomization test
Y-scrambling is performed on the training set; it is used to confirm that the developed model was not a result of random correlation between the biological activity and the descriptors. In this analysis, the dataset is permuted; the biological activity values were randomly distributed while the descriptors matrix was unchanged, followed by MLR analysis generating new models [43].
For each randomization and subsequent MLR analysis, we obtain a new set of values for R2Rand and Q2Rand [44] (S3 Table in S1 File). If the new QSAR models have lower determination coefficients (R2Rand) and lower leave-one-out determination coefficients (Q2LOO) over several trials (100 in this study), the proposed QSAR model is considered robust. Moreover, if cRp2 is greater than 0.5, this confirms that the model is not the result of chance correlation [45,46].
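As a worked example of the chance-correlation criterion, the sketch below assumes the commonly used definition cRp2 = R × sqrt(R2 − mean R2Rand), which is not spelled out in the text. It is evaluated with the rounded values reported for this model (R2 = 0.66 and an average R2Rand of 0.14); the function name is hypothetical.

```python
import math


def crp2(r2_model, r2_rand_mean):
    """cRp2 = R * sqrt(R2 - mean R2 of the randomized models)."""
    return math.sqrt(r2_model) * math.sqrt(r2_model - r2_rand_mean)


# Rounded values reported for the model in this study.
value = crp2(r2_model=0.66, r2_rand_mean=0.14)
```

With these rounded inputs the result is about 0.59, close to the reported cRp2 of 0.60 and above the 0.5 threshold; the small difference is consistent with rounding of the inputs.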
For the current work, the average values of RRand, R2Rand and Q2cv (Rand) are 0.35, 0.14 and -0.29 respectively, and the cRp2 value equals 0.60, which is higher than 0.5 (S3 Table in S1 File); all the new QSAR models show significantly lower R2Rand and Q2cv (Rand) values over the 100 trials. Therefore, the Y-randomization results show that there is no chance correlation between the activity and the descriptors, and that the developed QSAR model is robust.
- External validation:
The model must then be externally validated using the test set mentioned above. At this stage, the proposed model must predict the activities of the test set compounds in agreement with the experimental values (Table 5), as presented graphically in Fig 2. The predictive ability was confirmed with a test coefficient of 0.64 (R2Test = 0.64).
Applicability domain (AD)
The standardized residuals and the leverages were combined to illustrate the applicability domain. The Williams plot for the QSAR model is shown in Fig 3. The warning leverage (h*) was found to be 0.45 for the developed QSAR model. Based on the leverages, all compounds were found to be inside the defined AD.
New drugs elaboration
In order to suggest new efficient compounds, we selected from the series used in the present work the compounds with the highest pIC50 values (1.37, 1.38, 1.17 and 1.10), corresponding to 16a, 16f, 16j and 16k respectively. These compounds were structurally modified in order to design new molecules; the descriptor values of the new molecules were determined using the same tools, and their pIC50 values were predicted by the proposed MLR model. In total, 24 candidate compounds were designed and their parameters were computed. The leverage values (hi) were computed using Matlab with the following equation: hi = xi^T (X^T X)^-1 xi (i = 1, 2 … n) (S4 Table in S1 File),
where xi is the descriptor vector of the proposed compound, X is the test set descriptor matrix and X^T is its transpose.
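The leverage formula can be sketched in plain Python as follows (the authors used Matlab). The small reference matrix and the function names are hypothetical, and the warning leverage is computed with the common convention h* = 3(p + 1)/n, which is an assumption here rather than a statement of the authors' method.

```python
def solve(a, b):
    """Solve a @ z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (m[r][n] - sum(m[r][c] * z[c] for c in range(r + 1, n))) / m[r][r]
    return z


def leverage(x_new, X):
    """h = x^T (X^T X)^-1 x for one descriptor vector x_new."""
    p = len(x_new)
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    z = solve(xtx, x_new)  # z = (X^T X)^-1 x, without forming the inverse
    return sum(xi * zi for xi, zi in zip(x_new, z))


def warning_leverage(n_compounds, n_descriptors):
    """Common threshold h* = 3 (p + 1) / n (assumed convention)."""
    return 3.0 * (n_descriptors + 1) / n_compounds


# Hypothetical reference descriptor matrix (3 compounds, 2 descriptors).
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
h = leverage([1.0, 0.0], X)
```

A candidate is kept inside the applicability domain when its leverage is below h*. Under the assumed convention, four descriptors and 33 compounds give a threshold of about 0.45.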
Among the 24 compounds, only one compound (16th) has a leverage value (hi = 0.43) lower than h* (h* = 0.45) and a biological activity higher than the known ones (pIC50 = 1.58) (Fig 4).
ADMET properties
On the one hand, the drug-likeness of the proposed compound was verified against Lipinski’s rule, with only one violation (MW>500) (Table 6); together with a bioavailability score of 0.55, this means that the proposed compound is considered sufficiently absorbable via the oral route. On the other hand, ADMET property predictions for the selected compound were performed using the SwissADME and pkCSM web servers.
The pharmacokinetic parameters (absorption, distribution, metabolism, excretion, and toxicity; ADMET) of the new drug were computed using pkCSM.
The absorption of the drug is assessed through water solubility, membrane permeability (Caco-2), intestinal absorption (human), skin permeability and P-glycoprotein interactions. The distribution properties are estimated from the steady-state volume of distribution (VDss), the fraction of unbound drug, the blood-brain barrier (BBB) permeability and the central nervous system (CNS) permeability. For the evaluation of biotransformation, members of the cytochrome P450 (CYP) superfamily are considered (CYP2D6 and CYP3A4 as substrate models; CYP1A2, CYP2C19, CYP2C9, CYP2D6 and CYP3A4 as inhibition models), while excretion is described by the total clearance and by renal clearance via organic cation transporter 2 (OCT2). Toxicity is investigated using AMES toxicity, the maximum tolerated dose, inhibition of the human Ether-a-go-go Related Gene (hERG) potassium channel, oral rat acute toxicity, oral rat chronic toxicity, skin sensitization, T. pyriformis toxicity and minnow toxicity. Only a few of the most important factors are discussed in the present study, notably:
- Water solubility
For the discovery of orally administered drugs, water solubility prediction is essential. The decimal logarithm of the molar solubility in water is -3.224 (log mol/L). On the following scale (insoluble < -10 < poorly soluble < -6 < moderately soluble < -4 < soluble < -2 < very soluble < 0 < highly soluble) [47], the compound has good solubility in water; the development and production of an oral solid dosage form is therefore possible.
- Caco2 permeability
A compound is considered to have high Caco-2 permeability if its predicted log Papp value (Papp in 10^-6 cm/s) is higher than 0.90 [47]. The drug candidate has a value of 1.098, so it is predicted to have high Caco-2 permeability.
- Intestinal absorption (human)
The fraction of the drug candidate absorbed by the intestine is one of the major factors in oral bioavailability [48]. For the proposed compound, the predicted intestinal absorption (human, % absorbed) is 74.616%.
- BBB permeability
The BBB permeability of the drug candidate has a value of -1.118 log BB. According to [33], a compound readily crosses the blood-brain barrier if its log BB value is higher than 0.3 and is poorly distributed to the brain if its log BB value is lower than -1. Therefore, the drug candidate is not expected to cross the blood-brain barrier.
- CYP2D6 substrate
Drugs that inhibit or compete for CYP2D6 can cause clinical problems; this isoenzyme is highly polymorphic and is responsible for metabolizing approximately 25% of known pharmaceuticals [49]. In the current study, the drug candidate is not predicted to be an inhibitor of CYP2D6.
- Total Clearance
The compound has a total clearance of 0.288 log ml/min/kg; it could therefore be excreted quickly [47].
- AMES toxicity
The compound is AMES negative, which suggests that it is not mutagenic [47].
- hERG inhibitor
Drugs that block hERG K+ channels are likely to cause cardiac toxicity [50].
The safe range for an ideal drug is -5 or higher; a value below this level is predicted to cause cardiac toxicity [47].
- Oral Rat Acute Toxicity (LD50)
The proposed compound is dangerous only at very high doses, given its high LD50 value (2.259 mol/kg) [50].
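The threshold rules quoted in the list above can be expressed as simple classification functions. The sketch below encodes only the solubility scale, the Caco-2 cutoff and the log BB limits given in the text, and applies them to the values reported for the drug candidate; the function names and the label wording are hypothetical.

```python
def solubility_class(log_s):
    """Map log S (log mol/L) onto the solubility scale quoted in the text."""
    bounds = [
        (-10, "insoluble"),
        (-6, "poorly soluble"),
        (-4, "moderately soluble"),
        (-2, "soluble"),
        (0, "very soluble"),
    ]
    for upper, label in bounds:
        if log_s < upper:
            return label
    return "highly soluble"


def caco2_is_high(log_papp):
    """High Caco-2 permeability if log Papp (Papp in 10^-6 cm/s) exceeds 0.90."""
    return log_papp > 0.90


def bbb_class(log_bb):
    """Classify blood-brain barrier permeability from log BB."""
    if log_bb > 0.3:
        return "crosses the BBB"
    if log_bb < -1:
        return "poorly distributed to the brain"
    return "intermediate"


# Values reported for the drug candidate in this study.
sol = solubility_class(-3.224)
caco = caco2_is_high(1.098)
bbb = bbb_class(-1.118)
```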
Molecular docking
A molecular docking study was carried out to predict the best binding conformations in both types of HIV-1 protease (wild and mutant) for, on the one hand, the compound proposed as a new drug candidate (ND) and, on the other hand, an FDA-approved drug, Darunavir (DRV). Both types of HIV-1 protease (WT and MT) were selected as receptors. The structures of the wild-type (WT) and mutant-type (MT) proteases were downloaded from the Protein Data Bank (PDB); their PDB codes are, respectively, 4LL3 (structure of wild-type HIV-1 protease in complex with Darunavir) (Fig 5) and 3TTP (structure of multiresistant HIV-1 protease in complex with Darunavir) (Fig 6). Their original ligands were removed using Discovery Studio, polar hydrogens were added, and the proteins were saved in PDB format and then converted to PDBQT format using AutoDock MGL Tools. The ligand proposed as a new drug was first designed and optimized using Gaussian, then saved in PDBQT format with AutoDock MGL Tools (Fig 7); DRV was taken from the crystal structures downloaded from the Protein Data Bank (Fig 8).
The docking was run with AutoDock Vina from the command prompt. After completion of the docking runs, different conformations of the ligand binding modes were obtained for both protease types, with their respective binding energies (kcal/mol); the best pose is the one with the lowest (most negative) binding energy.
The best-ranked poses, selected on the basis of their binding affinities, were retained for further analysis. Figs 9–12 show the 2D binding interactions in the active site of the wild-type and mutant-type proteases with Darunavir and the new drug. Figs 13–16 show the 3D interactions for the same complexes (WT-ND, MT-ND, WT-DRV & MT-DRV).
The interactions between the ligands (ND & DRV) and the proteases were visualized using Discovery Studio (Table 7). Active residues interacting with the ligands (ND & DRV) are also disclosed (S5 Table in S1 File). Moreover, atoms from ligands and residues interacting with each other to form hydrogen bonds are mentioned (S8 Table in S1 File).
Based on the molecular docking analysis, we conclude that the complexes WT-ND and MT-ND, with binding affinities of -10.2 kcal/mol and -10.4 kcal/mol respectively, display higher stability than WT-DRV and MT-DRV.
Molecular dynamics simulation
To evaluate the stability of the native proteins (WT & MT) and of the docked complexes (WT-DRV, WT-ND, MT-DRV & MT-ND), a molecular dynamics (MD) simulation study was carried out, allowing structural analysis at the atomic level and investigation of the motion of the four complexes and the native proteins.
MD simulations were carried out in nine plots, with 100 ns for each, using the best poses generated by the docking study; each system was simulated separately in water. The stability analysis was then performed using several measures, namely the Root Mean Square Deviation (RMSD), the Root Mean Square Fluctuation (RMSF) and the Radius of Gyration (Rg).
Root Mean Square Deviation (RMSD).
The Root Mean Square Deviation (RMSD) is a numerical measure of the average distance between a group of atoms, usually the backbone atoms of a protein, and their reference positions; it is plotted against time. The RMSD value typically measures how much the structure of the protein has changed over time relative to the starting point. If the RMSD of the protein shows considerable fluctuations, equilibrium has not been reached and more simulation time is required for better results.
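For a single frame, the RMSD with respect to a reference structure can be sketched as below. The sketch assumes that the frame has already been superimposed on the reference (no fitting is performed) and uses hypothetical coordinates; it is not the analysis tool used in this study.

```python
import math


def rmsd(coords, ref):
    """Root mean square deviation between two equally sized coordinate sets."""
    if len(coords) != len(ref):
        raise ValueError("coordinate sets must have the same number of atoms")
    # Sum of squared distances between matching atoms.
    sq = sum((a - b) ** 2 for p, q in zip(coords, ref) for a, b in zip(p, q))
    return math.sqrt(sq / len(coords))


# Hypothetical two-atom example: every atom is displaced by 1 unit along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frame = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
value = rmsd(frame, reference)
```

Repeating this calculation for every frame of the trajectory gives the RMSD-versus-time curve.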
As the RMSD plots show (Fig 17), the native proteins (WT and MT) do not show promising stability within the simulation time, especially the wild-type protease. For the two complexes WT-DRV and WT-ND (Fig 17 (WT)), it is clear that they show lower fluctuations than the native protein (WT) within the simulation time. The complexes MT-ND and MT-DRV likewise show lower fluctuations than the native protein (MT) within the simulation time (Fig 17 (MT)). Overall, the WT-ND and MT-ND complexes show promising results, comparable to those of Darunavir in terms of fluctuations.
Root Mean Square Fluctuation (RMSF).
The Root Mean Square Fluctuation (RMSF) measures the average deviation of a particle over time from a reference position at a specific temperature and pressure. The RMSF analysis shows the fluctuations of the residues during the MD simulation time.
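A per-atom RMSF can be sketched as follows, taking the time-averaged position of each atom as the reference position. That choice of reference, the function name and the coordinates are assumptions made for illustration.

```python
import math


def rmsf(frames):
    """Per-atom RMSF; `frames` is a list of frames, each a list of (x, y, z)."""
    n_frames = len(frames)
    n_atoms = len(frames[0])
    result = []
    for i in range(n_atoms):
        # Time-averaged position of atom i, used as the reference position.
        mean = [sum(f[i][k] for f in frames) / n_frames for k in range(3)]
        msf = sum(
            sum((f[i][k] - mean[k]) ** 2 for k in range(3)) for f in frames
        ) / n_frames
        result.append(math.sqrt(msf))
    return result


# Hypothetical trajectory: atom 0 is fixed, atom 1 oscillates between x = -1 and x = +1.
frames = [
    [(0.0, 0.0, 0.0), (-1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
]
values = rmsf(frames)
```

Unlike the RMSD, which gives one value per frame, the RMSF gives one value per atom (or residue), which is why it is plotted against residue number.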
In the plots for both chains of the wild-type and mutant-type proteases (Fig 18), chains A and B display similar fluctuations in some regions and markedly different fluctuations in others. We conclude that, for all complexes (WT-DRV, WT-ND, MT-DRV & MT-ND) and regardless of the chain, the new drug and Darunavir significantly influence the fluctuations of the protein residues in most regions.
Radius of gyration (Rg).
The radius of gyration is another useful parameter for investigating the motion and stability of a protein; it describes the compactness of the protein during the simulation time.
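The compactness measure can be sketched as a mass-weighted radius of gyration about the center of mass. The coordinates, masses and function name below are hypothetical, and mass weighting is an assumed convention.

```python
import math


def radius_of_gyration(coords, masses):
    """Mass-weighted radius of gyration about the center of mass."""
    total = sum(masses)
    com = [sum(m * p[k] for m, p in zip(masses, coords)) / total for k in range(3)]
    weighted = sum(
        m * sum((p[k] - com[k]) ** 2 for k in range(3))
        for m, p in zip(masses, coords)
    )
    return math.sqrt(weighted / total)


# Hypothetical system: two equal masses placed 2 units apart.
coords = [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
masses = [12.0, 12.0]
rg = radius_of_gyration(coords, masses)
```

A smaller value means that the mass is concentrated closer to the center, that is, a more compact structure.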
For the mutant-type protease (Fig 19 (MT)), the radius of gyration of the MT-ND complex is higher than that of the native MT protein and of the complex with DRV, which indicates higher flexibility of the MT-ND complex. For the wild-type protease (Fig 19 (WT)), the plots show that the WT-ND complex is more compact, with lower radius of gyration values than the WT-DRV complex and the native WT protein within the simulation time; this implies less flexibility and therefore a higher potential stability for the WT-ND complex.
Hydrogen bonds.
Hydrogen bonds are essential for drug specificity and stability, so determining the number of H-bonds in each complex is important for assessing their contribution to the overall stability of each system and for a comparative study of all the complexes in question.
Fig 20 shows that, during the MD simulation period (100 ns), the MT-ND complex displays up to seven hydrogen bonds by the end of the simulation, while the MT-DRV complex displays fewer hydrogen bonds than MT-ND during the first 40 ns, then increases significantly to ten hydrogen bonds at 60 ns before decreasing to seven by the end of the simulation (Fig 20 (MT)). In contrast, for the WT-DRV complex the number of hydrogen bonds decreases steadily from 8 to 5, while the WT-ND complex displays up to five hydrogen bonds with no significant decrease during the simulation time (Fig 20 (WT)).
We can conclude that, for both the wild-type and the mutant-type proteases, the number of hydrogen bonds formed with the new drug remains roughly constant, whereas the complexes with Darunavir show a decreasing number of hydrogen bonds during the simulation time.
Hydrophobic interactions.
Hydrophobic interactions are non-bonded interactions between the protein and the ligand, which play a major role in the stability of complexes.
As shown for the wild-type protease (Fig 21 (WT)), the WT-DRV and WT-ND complexes display very similar numbers of hydrophobic interactions during the simulation time. In contrast, for the mutant-type protease (Fig 21 (MT)), the number of hydrophobic interactions in the MT-DRV complex is significantly higher than in the complex with the new drug, MT-ND.
We can conclude that, for the wild-type protease, the new drug competes with Darunavir, displaying similar numbers of hydrophobic interactions at every 20 ns of the simulation time. However, Darunavir shows more promising results than the new drug for the mutant-type protease in terms of hydrophobic interactions.
Solvent Accessible Surface Area (SASA).
The accessible surface area (ASA) or solvent-accessible surface area (SASA) is the surface area of a biomolecule that is accessible to a solvent.
Based on the plots (Fig 22), the new drug in complex with the wild-type protease shows promising results, with a significant decrease in ASA values from 40 ns to the end of the simulation (Fig 22 (WT)); for the mutant type, however, the ASA values are not promising, since they increase from 60 ns onwards (Fig 22 (MT)).
We can conclude that the new drug is comparable to Darunavir during the last 30 ns of the simulation for the wild-type protease, whereas it does not compete with Darunavir for the mutant type, since the MT-ND complex shows significantly higher ASA values than MT-DRV, mainly during the last 40 ns of the simulation.
Binding free energy calculation.
Molecular dynamics simulations were used to calculate the binding free energy with the MM-PBSA method. Snapshots were extracted every 1 ns from the stable 70–100 ns interval of the MD trajectory. The binding free energy and its components obtained from the MM-PBSA calculations are listed in Table 8.
The results indicate that Darunavir has binding free energies of -173.323 kJ/mol and -190.868 kJ/mol for the wild-type and mutant-type proteases, respectively, corresponding to slightly stronger binding than the new drug (-170.903 kJ/mol and -187.521 kJ/mol, respectively).
The van der Waals, electrostatic and SASA energy terms contributed favorably to the binding energy and complex stability. In contrast, the polar solvation energy is positive and therefore opposes binding. Among the energy terms, the van der Waals energy makes the largest contribution to the total binding energy.
Taken together, the data show that although Darunavir binds better to both the wild-type and mutant HIV proteases, the binding of the new drug is comparable to that of Darunavir in both cases. This is supported by the different analyses presented above. Thus, the new drug may also be considered a potential inhibitor against multi-drug resistant HIV and may be tested experimentally.
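The size of the gap between the two ligands can be checked with simple arithmetic on the four values quoted above. The short Python sketch below uses only those values; the dictionary layout and the function name `energy_gap` are illustrative choices, not part of the original analysis.

```python
# Binding free energies (kJ/mol) as quoted in the text.
binding_energy = {
    ("WT", "DRV"): -173.323,
    ("WT", "ND"): -170.903,
    ("MT", "DRV"): -190.868,
    ("MT", "ND"): -187.521,
}


def energy_gap(protease: str) -> float:
    """Return E(ND) - E(DRV) in kJ/mol.

    A positive value means the Darunavir complex has the more
    negative (more favorable) binding free energy.
    """
    gap = binding_energy[(protease, "ND")] - binding_energy[(protease, "DRV")]
    return round(gap, 3)


for protease in ("WT", "MT"):
    gap = energy_gap(protease)
    relative = 100 * gap / abs(binding_energy[(protease, "DRV")])
    print(f"{protease}: ND - DRV = {gap:+.3f} kJ/mol "
          f"({relative:.1f}% of the DRV value)")
```

The differences are 2.420 kJ/mol for the wild type and 3.347 kJ/mol for the mutant type, that is, below 2% of the Darunavir value in both cases.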
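The role of each term can be made explicit by writing the decomposition out. The expression below is a sketch that assumes the total binding energy is the sum of the four components named above, with no separate entropy term, which is a common convention in MM-PBSA analyses; the symbols are introduced here for illustration only.

```latex
\[
\Delta G_{\text{bind}}
  = \underbrace{\Delta E_{\text{vdW}} + \Delta E_{\text{elec}}}_{\text{favorable}}
  + \underbrace{\Delta G_{\text{polar}}}_{\text{unfavorable, } > 0}
  + \underbrace{\Delta G_{\text{SASA}}}_{\text{favorable}}
\]
```

Under this reading, the total is negative only because the three favorable terms outweigh the positive polar solvation term.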
Conclusion
Various software packages were used in this study to generate a reliable model relating the biological activity of new HIV-1 protease inhibitors to their physicochemical parameters. Judged by its statistical parameters, the generated model showed high predictive ability. The applicability domain was also determined to delimit the model's scope: only compounds whose features are similar to those of the training set can be reliably predicted. According to the proposed model, the biological activity of new HIV-1 protease inhibitors can be increased by increasing the values of three variables that are positively related to the activity, namely the Energy Gap (EGap), the Polar Surface Area (PSA) and the Dreiding Energy (DE), and by decreasing the Henry’s Law Constant, which is negatively related to the activity. A new drug was proposed based on the generated model, with a predicted biological activity higher than those of the known compounds. Subsequently, a molecular docking study was performed on the wild-type and mutant-type HIV-1 proteases to predict the best conformation adopted by two ligands, the new drug and Darunavir, an FDA-approved drug. Moreover, molecular dynamics simulations were performed to study the stability of the complexes (WT-DRV, WT-ND, MT-DRV and MT-ND). The results were encouraging for the new drug; it may therefore be considered a potential inhibitor of multi-drug-resistant (MDR) HIV-1 protease (PR) and may be tested experimentally.
Supporting information
S1 File. Supporting information contains all the supporting tables and figures.
https://doi.org/10.1371/journal.pone.0284539.s001
(DOCX)
Acknowledgments
We would like to thank all the members of the Ben M’Sick Faculty of Science, the physical chemistry of materials laboratory group and the heads of the chemistry department for their motivation and assistance in carrying out this work.
References
- 1. Shattock R. J., Warren M., McCormack S., and Hankins C. A., "Turning the tide against HIV," Science (80-.)., vol. 333, no. 6038, pp. 42–43, 2011, pmid:21719662
- 2. Vasavi C. S., Tamizhselvi R., and Munusami P., "Drug Resistance Mechanism of L10F, L10F/N88S and L90M mutations in CRF01_AE HIV-1 protease: Molecular dynamics simulations and binding free energy calculations," J. Mol. Graph. Model., vol. 75, pp. 390–402, 2017, pmid:28645089
- 3. Montaner J. S. et al., "Association of highly active antiretroviral therapy coverage, population viral load, and yearly new HIV diagnoses in British Columbia, Canada: A population-based study," Lancet, vol. 376, no. 9740, pp. 532–539, 2010, pmid:20638713
- 4. Murphy E. L. et al., "Highly active antiretroviral therapy decreases mortality and morbidity in patients with advanced HIV disease," Ann. Intern. Med., vol. 135, no. 1, pp. 17–26, 2001, pmid:11434728
- 5. Egger M., "Mortality of HIV-1-infected patients in the first year of antiretroviral therapy: Comparison between low-income and high-income countries," Lancet, vol. 367, no. 9513, pp. 817–824, 2006, pmid:16530575
- 6. Esté J. A. and Cihlar T., "Current status and challenges of antiretroviral research and therapy," Antiviral Res., vol. 85, no. 1, pp. 25–33, 2010, pmid:20018390
- 7. Zhan P., Pannecouque C., De Clercq E., and Liu X., “Anti-HIV Drug Discovery and Development: Current Innovations and Future Trends,” J. Med. Chem., vol. 59, no. 7, pp. 2849–2878, 2016, pmid:26509831
- 8. Moyle G. J. and Back D., "Principles and practice of HIV-protease inhibitor pharmacoenhancement," HIV Med., vol. 2, no. 2, pp. 105–113, 2001, pmid:11737387
- 9. Ali A. et al., "Molecular basis for drug resistance in HIV-1 protease," Viruses, vol. 2, no. 11, pp. 2509–2535, 2010, pmid:21994628
- 10. C S V. and Munusami P., "Revealing the drug resistance mechanism of saquinavir due to G48V and V82F mutations in subtype CRF01_AE HIV-1 protease: molecular dynamics simulation and binding free energy calculations," J. Biomol. Struct. Dyn., vol. 41, no. 3, pp. 1000–1017, 2023, pmid:34919029
- 11. C.S V., Tamizhselvi R., and Munusami P., "Exploring the drug resistance mechanism of active site, non-active site mutations and their cooperative effects in CRF01_AE HIV-1 protease: molecular dynamics simulations and free energy calculations," J. Biomol. Struct. Dyn., vol. 37, no. 10, pp. 2608–2626, 2019, pmid:30051758
- 12. Kožíšek M., Lepšík M., Grantz Šašková K., Brynda J., Konvalinka J., and Řezáčová P., "Thermodynamic and structural analysis of HIV protease resistance to darunavir—Analysis of heavily mutated patient-derived HIV-1 proteases," FEBS J., vol. 281, no. 7, pp. 1834–1847, 2014, pmid:24785545
- 13. Antunes D. A. et al., "New insights into the in silico prediction of HIV protease resistance to nelfinavir," PLoS One, vol. 9, no. 1, 2014, pmid:24498124
- 14. Zhu M., Dong B., Zhang G. N., Wang J. X., Cen S., and Wang Y. C., "Synthesis and biological evaluation of new HIV-1 protease inhibitors with purine bases as P2-ligands," Bioorganic Med. Chem. Lett., vol. 29, no. 12, pp. 1541–1545, 2019, pmid:31014912
- 15. Barril X. and Morley S. D., "Unveiling the full potential of flexible receptor docking using multiple crystallographic structures," J. Med. Chem., vol. 48, no. 13, pp. 4432–4443, 2005, pmid:15974595
- 16. Rajkhowa S. and Deka R., “DFT Based QSAR/QSPR Models in the Development of Novel Anti-tuberculosis Drugs Targeting Mycobacterium tuberculosis,” Curr. Pharm. Des., vol. 20, no. 27, pp. 4455–4473, 2014, pmid:24245759
- 17. Rajkhowa S., Hussain I., Hazarika K., Sarmah P., and Deka R., "Quantitative Structure-Activity Relationships of the Antimalarial Agent Artemisinin and Some of its Derivatives–A DFT Approach," Comb. Chem. High Throughput Screen., vol. 16, no. 8, pp. 590–602, 2013, pmid:23597248
- 18. Tonmunphean S., Parasuk V., and Kokpol S., "QSAR study of antimalarial activities and artemisinin-heme binding properties obtained from docking calculations," Quant. Struct. Relationships, vol. 19, no. 5, pp. 475–483, 2000,
- 19. Ghosh A. K., Osswald H. L., and Prato G., “Recent Progress in the Development of HIV-1 Protease Inhibitors for the Treatment of HIV/AIDS,” J. Med. Chem., vol. 59, no. 11, pp. 5172–5208, 2016, pmid:26799988
- 20. Gupta R., Hill A., Sawyer A. W., and Pillay D., "Emergence of drug resistance in HIV type 1-infected patients after receipt of first-line highly active antiretroviral therapy: A systematic review of clinical trials," Clin. Infect. Dis., vol. 47, no. 5, pp. 712–722, 2008, pmid:18662137
- 21. Wainberg M. A. and Friedland G., "Public health implications of antiretroviral therapy and HIV drug resistance," J. Am. Med. Assoc., vol. 279, no. 24, pp. 1977–1983, 1998, pmid:9643862
- 22. Koh Y. et al., "In Vitro Selection of Highly Darunavir-Resistant and Replication-Competent HIV-1 Variants by Using a Mixture of Clinical HIV-1 Isolates Resistant to Multiple Conventional Protease Inhibitors," J. Virol., vol. 84, no. 22, pp. 11961–11969, 2010, pmid:20810732
- 23. Roy K., Kar S., and Das R. N., Selected Statistical Methods in QSAR. 2015.
- 24. Paris A. and Târcolea C., "COMPUTER AIDED SELECTION IN DESIGN PROCESSES WITH MULTIVARIATE STATISTICS measured by correlations, and maximizes the signal, an iterative algorithm. The technique begins by finding a maximized. Next it is find the second direction along previous sele," vol. 4, pp. 4–7, 2009.
- 25. Tropsha A., "Best practices for QSAR model development, validation, and exploitation," Mol. Inform., vol. 29, no. 6–7, pp. 476–488, 2010, pmid:27463326
- 26. Chtita S., Hmamouchi R., Larif M., Ghamali M., Bouachrine M., and Lakhlifi T., "QSPR studies of 9-aniliioacridine derivatives for their DNA drug binding properties based on density functional theory using statistical methods: Model, validation and influencing factors," J. Taibah Univ. Sci., vol. 10, no. 6, pp. 868–876, 2016,
- 27. Maurya A., Khan F., Bawankule D. U., Yadav D. K., and Srivastava S. K., "QSAR, docking and in vivo studies for immunomodulatory activity of isolated triterpenoids from Eucalyptus tereticornis and Gentiana kurroo," Eur. J. Pharm. Sci., vol. 47, no. 1, pp. 152–161, 2012, pmid:22659375
- 28. Savjani K. T., Gajjar A. K., and Savjani J. K., "Drug Solubility: Importance and Enhancement Techniques," ISRN Pharm., vol. 2012, no. 100 mL, pp. 1–10, 2012, pmid:22830056
- 29. Ousaa A. et al., "Quantitative structure-toxicity relationship studies of aromatic aldehydes to Tetrahymena pyriformis based on electronic and topological descriptors," J. Mater. Environ. Sci., vol. 9, no. 1, pp. 256–265, 2018,
- 30. Ouassaf M., Belaidi S., Lotfy K., Daoud I., and Belaidi H., "Molecular docking studies and ADMET properties of new 1.2.3 triazole derivatives for anti-breast cancer activity," J. Bionanoscience, vol. 12, no. 1, pp. 26–36, 2018,
- 31. B. M. Belhassan A, Chtita S, Zaki H, Tahar L, "Molecular docking analysis of N-substituted oseltamivir derivatives with the SARS-Cov-2 main protease," Bioinformation, vol. 16, no. 5, pp. 404–410, 2020, pmid:32831522
- 32. Fortmeyer R., "The zero effect," Archit. Rec., vol. 195, no. 3, p. 153, 2007,
- 33. Pires D. E. V., Blundell T. L., and Ascher D. B., "pkCSM: Predicting small-molecule pharmacokinetic and toxicity properties using graph-based signatures," J. Med. Chem., vol. 58, no. 9, pp. 4066–4072, 2015, pmid:25860834
- 34. Daina A., Michielin O., and Zoete V., "SwissADME: A free web tool to evaluate pharmacokinetics, drug-likeness and medicinal chemistry friendliness of small molecules," Sci. Rep., vol. 7, no. January, pp. 1–13, 2017, pmid:28256516
- 35. Lipinski C. A., Lombardo F., Dominy B. W., and Feeney P. J., "Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings," Adv. Drug Deliv. Rev., vol. 64, no. SUPPL., pp. 4–17, 2012,
- 36. Veber D. F., Johnson S. R., Cheng H. Y., Smith B. R., Ward K. W., and Kopple K. D., "Molecular properties that influence the oral bioavailability of drug candidates," J. Med. Chem., vol. 45, no. 12, pp. 2615–2623, 2002, pmid:12036371
- 37. Zhang H., Zhang L., Gao C., Yu R., and Kang C., "Pharmacophore screening, molecular docking, ADMET prediction and MD simulations for identification of ALK and MEK potential dual inhibitors," J. Mol. Struct., vol. 1245, p. 131066, 2021,
- 38. Mali S. N. and Pandey A., "Multiple QSAR and molecular modelling for identification of potent human adenovirus inhibitors," J. Indian Chem. Soc., vol. 98, no. 6, p. 100082, 2021,
- 39. Rajkhowa S., Pathak U., and Patgiri H., "Elucidating the Interaction and Stability of Withanone and Withaferin‐A with Human Serum Albumin, Lysozyme and Hemoglobin Using Computational Biophysical Modeling," ChemistrySelect, vol. 7, no. 12, 2022,
- 40. Sivakumar P. M., Geetha Babu S. K., and Mukesh D., "QSAR Studies on chalcones and flavonoids as anti-tuberculosis agents using genetic function approximation (GFA) method," Chem. Pharm. Bull., vol. 55, no. 1, pp. 44–49, 2007, pmid:17202700
- 41. Beheshti A., Pourbasheer E., Nekoei M., and Vahdani S., "QSAR modeling of antimalarial activity of urea derivatives using genetic algorithm-multiple linear regressions," J. Saudi Chem. Soc., vol. 20, no. 3, pp. 282–290, 2016,
- 42. Linear A., Edache E. I., Arthur D. E., and Abdulfatai U., "Quantitative Structure-Activity Relationship Analysis of the Anti- tyrosine Activity of Some Tetraketone and Benzyl-benzoate Derivatives Based on Genetic Quantitative Structure-Activity Relationship Analysis of the Anti- tyrosine Activity of Some Tetraket," no. January, pp. 2–13, 2016.
- 43.
Q. Christoph, M. Meringer, M. Biometry, M. Informatics, and M. Chemistry, "of Random Pseudomodels By Calculation Rather Than By Tedious Multiple Simulations on Random Number Variables.," pp. 1–42, 2007.
- 44. Edache E. I., Uzairu A., and Abechi S. E., "Investigation of 5,6-dihydro-2-pyrones derivatives as potent anti-HIV agents inhibitors," J. Comput. Methods Mol. Des. Sch. Res. Libr., vol. 5, no. 3, pp. 135–149, 2015, [Online]. Available: http://scholarsresearchlibrary.com/archive.html.
- 45. Adedirin O., Uzairu A., Shallangwa G. A., and Abechi S. E., “Computational studies on α-aminoacetamide derivatives with anticonvulsant activities,” Beni-Suef Univ. J. Basic Appl. Sci., vol. 7, no. 4, pp. 709–718, 2018,
- 46. Roy K., "On some aspects of validation of predictive quantitative structure-activity relationship models," Expert Opin. Drug Discov., vol. 2, no. 12, pp. 1567–1577, 2007, pmid:23488901
- 47. Ouassaf M., Belaidi S., Mogren Al Mogren M., Chtita S., Ullah Khan S., and Thet Htar T., "Combined docking methods and molecular dynamics to identify effective antiviral 2, 5-diaminobenzophenonederivatives against SARS-CoV-2," J. King Saud Univ.—Sci., vol. 33, no. 2, p. 101352, 2021, pmid:33558797
- 48. Barthe L., Woodley J., and Houin G., "Gastrointestinal absorption of drugs: Methods and studies," Fundam. Clin. Pharmacol., vol. 13, no. 2, pp. 154–168, 1999, pmid:10226759
- 49. Ogu C. C. and Maxa J. L., "Drug Interactions Due to Cytochrome P450," Baylor Univ. Med. Cent. Proc., vol. 13, no. 4, pp. 421–423, 2000, pmid:16389357
- 50. Lee H. M. et al., "Computational determination of hERG-related cardiotoxicity of drug candidates," BMC Bioinformatics, vol. 20, no. Suppl 10, 2019, pmid:31138104