Linear Interaction Energy Based Prediction of Cytochrome P450 1A2 Binding Affinities with Reliability Estimation

doi:10.1371/journal.pone.0142232

Table 1.

Comparison of IC₅₀ and ΔG_bind values for CYP 1A2 as determined in-house (inhouse) and gathered from literature sources (lit), expressed in μM and kJ mol⁻¹, respectively.

ΔΔG_bind refers to the difference in ΔG_bind between literature and in-house determined values.
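
The relation between the two quantities in Table 1 can be sketched as follows. This is a minimal illustration, not the procedure used for the table: it assumes IC₅₀ can be treated as an approximate inhibition constant, a 1 M standard state, and a temperature of 298.15 K, none of which is stated in this section. The function names are hypothetical, and the sign convention for ΔΔG_bind (literature minus in-house) is read from the caption.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ mol^-1 K^-1


def delta_g_from_ic50(ic50_uM, temperature_K=298.15):
    """Approximate binding free energy (kJ/mol) from an IC50 given in micromolar.

    Assumption: IC50 is used as a stand-in for K_i (standard state 1 M).
    """
    if ic50_uM <= 0:
        raise ValueError("IC50 must be positive")
    return R_KJ * temperature_K * math.log(ic50_uM * 1e-6)


def delta_delta_g(ic50_lit_uM, ic50_inhouse_uM, temperature_K=298.15):
    """Difference in binding free energy, literature minus in-house (kJ/mol)."""
    return (delta_g_from_ic50(ic50_lit_uM, temperature_K)
            - delta_g_from_ic50(ic50_inhouse_uM, temperature_K))


if __name__ == "__main__":
    # Illustrative values only, not taken from Table 1.
    print(round(delta_g_from_ic50(1.0), 2))
    print(round(delta_delta_g(10.0, 1.0), 2))
```

Under these assumptions a tenfold difference in IC₅₀ corresponds to roughly 5.7 kJ mol⁻¹, which gives a sense of scale for the ΔΔG_bind column.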

Table 2.

Comparison between predicted inhibition mechanisms and experimentally determined inhibition mechanisms reported in literature.

Fig 1.

Correlation between calculated (∆G_bind^Calc) and observed (∆G_bind^Obs) binding free energies obtained for the CYP 1A2 LIE model (Eq 4, α = 0.587 and β = 0.267).

The solid line indicates ideal correlation between ∆G_bind^Obs and ∆G_bind^Calc, and dashed lines represent deviations between calculated and experimental values of ±5 kJ mol⁻¹ (corresponding to an error well within 1.0 pK_i units). Compounds from the training set are represented in black. Test-set compounds that were found to be outliers in 0, 1, 2, and 3 analyses are represented in green, yellow, orange, and red, respectively.
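
As a rough check on the numbers in this caption, the sketch below evaluates a basic two-parameter LIE expression and converts a free-energy deviation into pK_i units. Eq 4 itself is not reproduced in this section, so the single-simulation form α·ΔV^VdW + β·ΔV^Ele, the conventional pairing of α with the van der Waals term, and the temperature of 298.15 K are all assumptions. The function names are hypothetical.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ mol^-1 K^-1

ALPHA = 0.587  # value quoted in the caption
BETA = 0.267   # value quoted in the caption


def lie_delta_g(d_v_vdw, d_v_ele, alpha=ALPHA, beta=BETA):
    """Basic LIE estimate (kJ/mol) from interaction-energy differences.

    Assumption: alpha scales the van der Waals term and beta the
    electrostatic term, as in the conventional LIE formulation.
    """
    return alpha * d_v_vdw + beta * d_v_ele


def kj_to_pk_units(delta_kj, temperature_K=298.15):
    """Convert a free-energy difference (kJ/mol) into pK units via RT ln 10."""
    return abs(delta_kj) / (R_KJ * temperature_K * math.log(10))


if __name__ == "__main__":
    # Illustrative interaction energies only, not taken from the paper.
    print(round(lie_delta_g(-100.0, -20.0), 2))
    print(round(kj_to_pk_units(5.0), 2))
```

At this assumed temperature, 5 kJ mol⁻¹ comes out at a little under 0.9 pK units, consistent with the caption's statement that the dashed lines lie within 1.0 pK_i units.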

Table 3.

Calculated (ΔG_bind^Calc) and observed (ΔG_bind^Obs) free energies of binding, and corresponding residuals (ΔG_bind^Obs − ΔG_bind^Calc) for the training-set compounds (kJ mol⁻¹).

Table 4.

Calculated (ΔG_bind^Calc) and observed (ΔG_bind^Obs) free energies of binding (kJ mol⁻¹), and residuals (ΔG_bind^Obs − ΔG_bind^Calc) for the test-set compounds.

Results from the reliability analyses are also given. A score of 1 in columns (A)-(D) marks a compound identified as an outlier by the corresponding analysis: (A) chemical similarity analysis; (B) average interaction energy distribution analysis; (C) ligand-residue electrostatic interaction analysis; (D) ligand-residue van der Waals interaction analysis. The last column (Total) reports the number of analyses in which a compound is identified as an outlier.

Fig 2.

Similarity matrix of the data set.

Heat map of the compounds included in the training and test set, colored according to percent similarity expressed in terms of Tanimoto scores (TSs) between pairs of structural fingerprints (white = 100% similarity (TS = 1.00); black = 0% similarity (TS = 0.00)).
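
The Tanimoto score behind such a heat map can be sketched as below. The fingerprint type used for Fig 2 is not specified in this section, so fingerprints are represented here simply as collections of "on" bit indices. The function names and the convention that two empty fingerprints score 1.0 are assumptions.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto score between two fingerprints given as iterables of on-bit indices."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    if union == 0:
        return 1.0  # assumed convention for two empty fingerprints
    return len(a & b) / union


def similarity_matrix(fingerprints):
    """Symmetric matrix of pairwise Tanimoto scores (list of lists)."""
    n = len(fingerprints)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            score = tanimoto(fingerprints[i], fingerprints[j])
            matrix[i][j] = score
            matrix[j][i] = score
    return matrix


if __name__ == "__main__":
    # Toy fingerprints, not derived from the compounds in the data set.
    fps = [{1, 2, 3}, {2, 3, 4}, {7, 8}]
    for row in similarity_matrix(fps):
        print([round(v, 2) for v in row])
```

In the color scheme of the caption, a score of 1.00 on the diagonal would be drawn white and a score of 0.00 black.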

Fig 3.

Distribution of ΔV^Ele and ΔV^VdW values (Eq 2) for training-set (black circles) and test-set (white squares) MD simulations.

The dashed line marks the 95th-percentile confidence boundary of the training-set distribution. Test-set simulations that fall outside this boundary are labeled with the corresponding compound ID.
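
One common way to draw such a boundary is a Mahalanobis-distance ellipse fitted to the training-set points. The sketch below uses that approach as a stand-in: the section does not say how the dashed line was actually computed, so the bivariate-normal assumption, the chi-square threshold, and all function names are assumptions.

```python
import math


def fit_distribution(points):
    """Mean and sample covariance of 2D points given as (ele, vdw) tuples."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def mahalanobis_sq(point, mean, cov):
    """Squared Mahalanobis distance using the closed-form 2x2 inverse."""
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det


def outside_confidence(point, mean, cov, level=0.95):
    """True if the point lies outside the confidence ellipse at the given level.

    For two degrees of freedom the chi-square quantile is -2 ln(1 - level).
    """
    threshold = -2.0 * math.log(1.0 - level)
    return mahalanobis_sq(point, mean, cov) > threshold


if __name__ == "__main__":
    # Toy training points, not taken from the simulations in Fig 3.
    train = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
    mean, cov = fit_distribution(train)
    print(outside_confidence((0.6, 0.4), mean, cov))
    print(outside_confidence((5.0, 5.0), mean, cov))
```

Test-set points for which `outside_confidence` returns True would be the ones labeled by compound ID in a plot of this kind.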

Fig 4.

Per-residue decomposition analysis of the electrostatic interaction energies between the ligand and its surroundings in the protein-ligand simulations.

(A) PCA loading plot for training-set electrostatic interaction energies; (B) Active site of CYP 1A2 from the crystallographic structure; heme group (purple carbon atoms), co-crystallized ligand α-naphthoflavone (yellow carbon atoms), and amino acids with high loading on the first two PCs (in red) are explicitly represented. (C) PCA score plot for the training-set (black circles) and test-set (white squares) compounds for the first two PCs. (D) Orthogonal distance (OD) of the compounds of the training set (black circles) and test set (white squares) from the model with 2 PCs. The dashed horizontal line represents the critical orthogonal distance, calculated for the training-set distribution.
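
The orthogonal distance in panel (D) measures how far a compound's per-residue energy profile lies from the PCA subspace fitted on the training set. A minimal NumPy sketch of that idea follows. It assumes mean-centering without any further scaling, and the helper names are hypothetical. The way the critical distance was derived is not given in this section, so no cutoff is computed here.

```python
import numpy as np


def fit_pca(x_train, n_components):
    """Fit PCA on training rows; return the mean and a loading matrix (features x PCs)."""
    mean = x_train.mean(axis=0)
    _, _, vt = np.linalg.svd(x_train - mean, full_matrices=False)
    return mean, vt[:n_components].T


def orthogonal_distance(x, mean, loadings):
    """Euclidean norm of the part of each row not explained by the PCA model."""
    centered = x - mean
    scores = centered @ loadings
    residual = centered - scores @ loadings.T
    return np.linalg.norm(residual, axis=1)


if __name__ == "__main__":
    # Toy data: rows are compounds, columns are per-residue energies.
    train = np.array([[0.0, 0.0, 0.0],
                      [1.0, 0.0, 0.0],
                      [0.0, 1.0, 0.0],
                      [1.0, 1.0, 0.0],
                      [2.0, 1.0, 0.0]])
    test = np.array([[0.5, 0.5, 0.0],
                     [0.0, 0.0, 3.0]])
    mean, loadings = fit_pca(train, n_components=2)
    print(orthogonal_distance(test, mean, loadings))
```

A test-set compound whose interactions involve residues that contribute little in the training set ends up with a large orthogonal distance, like the second toy row above.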

Fig 5.

Per-residue decomposition analysis of the van der Waals interaction energies between the ligand and its surroundings in the protein-ligand simulations.

(A) PCA loading plot for training-set van der Waals interaction energies; (B) Active site of CYP 1A2 from the crystallographic structure; heme group (purple carbon atoms), the co-crystallized ligand α-naphthoflavone (yellow carbon atoms), and amino acids with high loadings in the PCA are explicitly represented. Residues with high positive loadings on the first PC are depicted in green; residues with high loadings on the second PC are also represented, both for positive (blue) and negative (red) values. (C) PCA score plot for the training-set (black circles) and test-set (white squares) compounds for the first two PCs. (D) Orthogonal distance (OD) of the compounds of the training set (black circles) and test set (white squares) from the model with 4 PCs. The dashed horizontal line represents the critical orthogonal distance, calculated for the training-set distribution.

Fig 6.

Prediction errors obtained for the external test set compounds.

The compounds were grouped into categories according to the number of analyses (A)-(D) in Table 4 in which they were identified as outliers. Horizontal lines represent the standard error (SDEP) for a given category, while the boxes represent the standard deviation around this average.
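
The grouping described here can be sketched as follows. The code assumes SDEP is the root-mean-square of the residuals (ΔG_bind^Obs − ΔG_bind^Calc) within each category. The function names and the example records are hypothetical and are not values from Table 4.

```python
import math
from collections import defaultdict


def sdep(residuals):
    """Root-mean-square prediction error for a list of residuals (kJ/mol)."""
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))


def sdep_by_outlier_count(records):
    """Group (outlier_count, residual) pairs by count and compute SDEP per group."""
    groups = defaultdict(list)
    for outlier_count, residual in records:
        groups[outlier_count].append(residual)
    return {count: sdep(values) for count, values in sorted(groups.items())}


if __name__ == "__main__":
    # Illustrative records: (number of analyses flagging the compound, residual).
    records = [(0, 1.0), (0, -1.0), (1, 3.0), (1, -4.0), (2, 6.0)]
    for count, value in sdep_by_outlier_count(records).items():
        print(count, round(value, 2))
```

If the reliability analyses are informative, the per-category SDEP should tend to grow with the number of analyses that flag a compound.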
