Fig 1.
Genome used by ADFR to encode the docking of a flexible ligand into a receptor with two flexible side-chains.
This figure illustrates the genome optimized by the GA implemented in ADFR when docking a flexible ligand with two rotatable bonds into a receptor with two flexible side-chains. The genome is the set of variables to optimize. A given set of values for these variables constitutes a docking solution, also called an individual. Variables are grouped into the following genes: the ligand translation (3 values: x, y, z), the ligand rotation (4 values: quaternion), the ligand conformation (1 torsion angle per ligand rotatable bond), and the receptor conformation (χ angles for each flexible receptor side-chain).
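The gene layout described in this caption can be sketched as a small data structure. This is an illustrative sketch only, not ADFR's internal representation: the class name `Genome`, its field names, and the χ-angle counts chosen for the two side-chains are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Genome:
    """Hypothetical container for one docking solution (an individual)."""
    translation: List[float]         # ligand translation: x, y, z
    rotation: List[float]            # ligand rotation: quaternion, 4 values
    ligand_torsions: List[float]     # one torsion angle per ligand rotatable bond
    receptor_chi: List[List[float]]  # chi angles, one list per flexible side-chain

    def n_variables(self) -> int:
        # Total number of values the optimizer has to adjust.
        return (len(self.translation) + len(self.rotation)
                + len(self.ligand_torsions)
                + sum(len(chi) for chi in self.receptor_chi))


# Ligand with two rotatable bonds, receptor with two flexible side-chains.
# The number of chi angles per side-chain (2 and 4) is an assumed example.
individual = Genome(
    translation=[0.0, 0.0, 0.0],
    rotation=[1.0, 0.0, 0.0, 0.0],
    ligand_torsions=[0.0, 0.0],
    receptor_chi=[[60.0, 180.0], [-60.0, 180.0, 60.0, 180.0]],
)
print(individual.n_variables())  # 3 + 4 + 2 + (2 + 4) = 15
```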
Fig 2.
A) A cross-section of the AutoDock carbon affinity map. B) The same cross-section after processing the map to create a gradient inside the protein. Besides creating a potential gradient inside the receptor, this processing also removes the local minima inside the receptor volume. The color gradient outside the protein surface indicates favorable interactions going from weak (green) to strong (blue). Inside the protein surface the color gradient indicates unfavorable interactions going from mildly unfavorable (yellow) to highly unfavorable (red).
Fig 3.
The flexibility information of the ligand (i.e. rotatable bonds) and receptor (i.e. flexible side-chains) is used to assemble the genome from which an initial list of solutions (i.e. population) is created. The population is scored, sorted, and top-ranking solutions are clustered. The GA seeds the next generation with the best solution of each cluster and completes it by crossing-over, mutating, and minimizing individuals from the mating population. The optimization stops when one of the termination criteria (maximum number of generations or evaluations) is reached or the search converges, at which point the solutions within 1 kcal/mol of the best solution are written out.
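The workflow in this caption can be illustrated with a toy genetic algorithm. The sketch below is not ADFR's implementation: the scoring function is a stand-in (sum of squares), the clustering uses a simple per-variable distance rather than an RMSD, the local minimization and convergence test are omitted, and every name and parameter value is an assumption chosen for the example.

```python
import random


def score(ind):
    # Toy stand-in for a docking scoring function (lower is better).
    return sum(v * v for v in ind)


def cluster_seeds(ranked, cutoff):
    # Greedy clustering over a best-first list: a solution starts a new
    # cluster only if it differs from every existing seed by more than
    # `cutoff` in at least one variable. Because the list is sorted,
    # each seed is the best member of its cluster.
    seeds = []
    for ind in ranked:
        if all(max(abs(a - b) for a, b in zip(ind, s)) > cutoff for s in seeds):
            seeds.append(ind)
    return seeds


def run_ga(n_vars=4, pop_size=30, max_gens=50, max_evals=5000, seed=0):
    rng = random.Random(seed)
    pop = [[rng.uniform(-5.0, 5.0) for _ in range(n_vars)]
           for _ in range(pop_size)]
    evals, history, best = 0, [], None
    for _ in range(max_gens):               # termination: max generations
        ranked = sorted(pop, key=score)     # score and sort the population
        evals += len(pop)
        best = ranked[0]
        history.append(score(best))
        if evals >= max_evals:              # termination: max evaluations
            break
        top = ranked[: pop_size // 2]       # top-ranking solutions
        # Seed the next generation with the best solution of each cluster.
        next_gen = cluster_seeds(top, cutoff=1.0)
        # Complete it by crossing over and mutating individuals from `top`.
        while len(next_gen) < pop_size:
            p1, p2 = rng.sample(top, 2)
            cut = rng.randrange(1, n_vars)
            child = p1[:cut] + p2[cut:]             # one-point crossover
            child[rng.randrange(n_vars)] += rng.gauss(0.0, 0.3)  # mutation
            next_gen.append(child)
        pop = next_gen
    return best, evals, history


best, evals, history = run_ga()
print(evals, history[0], history[-1])
```

Because the best individual always seeds its own cluster, it is carried over unchanged, so the best score in `history` never gets worse from one generation to the next.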
Fig 4.
A) Translational points. The surface enclosing points of the carbon affinity map located outside the protein and with carbon affinity less than or equal to -0.3 kcal/mol is shown in blue; protein atomic spheres (with reduced vdW radii) are shown in green. The set of translational points covers grooves and cavities that can accommodate a ligand and provides sensible initial placement points for the ligand root atom. B) Translational points cutoff value selection. For energy cutoff values varying from 0.0 to -0.6 kcal/mol in decrements of 0.1 kcal/mol, the average number of grid points retained is plotted against the average number of retained grid points within 1Å of the ligand root atom. The averages are computed over the 85 systems in the Astex Diverse set. Lower energy cutoff values produce fewer translational points; however, they increase the chance of discarding points surrounding the ligand root atom (i.e. reducing coverage of the root). The value of -0.3 kcal/mol is the closest to the curve’s inflection point and was selected as the cutoff that best balances reducing the number of retained points against preserving coverage of the ligand root atom.
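The cutoff selection described in this caption amounts to filtering grid points and counting those near the ligand root atom. The following is a minimal sketch under stated assumptions: the function names, the flat list representation of the map, and the toy data are hypothetical and do not reflect how ADFR stores affinity maps.

```python
import math


def translational_points(points, affinities, outside, cutoff=-0.3):
    # Keep grid points located outside the protein whose carbon affinity
    # is less than or equal to the cutoff (kcal/mol).
    return [p for p, e, out in zip(points, affinities, outside)
            if out and e <= cutoff]


def root_coverage(points, root, radius=1.0):
    # Number of retained points within `radius` (in Å) of the ligand root atom.
    return sum(1 for p in points if math.dist(p, root) <= radius)


# Toy map with four grid points (coordinates in Å, affinities in kcal/mol).
points = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 0.5, 0.0)]
affinities = [-0.5, -0.2, -0.4, -0.35]
outside = [True, True, True, False]
root = (0.0, 0.0, 0.0)

for cutoff in (0.0, -0.3):
    kept = translational_points(points, affinities, outside, cutoff)
    print(cutoff, len(kept), root_coverage(kept, root))
```

In this toy example, lowering the cutoff from 0.0 to -0.3 kcal/mol reduces the retained points from 3 to 2 and the points near the root from 2 to 1, which is the trade-off the plot in panel B explores.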
Fig 5.
A) The bars depict the energy differences between the lowest-energy solutions found by ADFR and AD2.5M (dark), and by ADFR and AD25M (light). Negative values indicate a lower energy for the ADFR solution. Only complexes with at least one of the two differences larger than 0.5 kcal/mol are shown. 1R1H is the only complex where ADFR finds a significantly better solution than AutoDock (i.e. difference > 2 kcal/mol). B) Histogram of the number of scoring-function evaluations performed by ADFR in the GA evolution leading to the lowest-energy solution. C) Each docking consists of 50 GA evolutions, each producing a solution. The 50 solutions are clustered with an RMSD cutoff of 2Å. In this diagram the 85 complexes are binned by the size of the cluster containing the lowest-energy solution, which indicates how many of the 50 GA runs identified the pose corresponding to the lowest-energy pose found across the 50 runs, i.e. the reliability of the GA.
Table 1.
SEQ17 cross-docking into apo conformations with receptor side-chains.
Table 2.
Cross-docking results comparison between ADFR and AutoDock Vina with 0, 4, 10, 12 flexible receptor side-chains.
Fig 6.
Scaling of docking runtimes as function of the number of flexible receptor side-chains.
The Y-axis represents multiples of the rigid cross-docking runtimes. The times used in this graph are averages taken over all docking runs for the 52 complexes of the CDK2 cross-docking experiments. For AutoDock Vina the times corresponding to the default exhaustiveness 8 are used. The X-axis indicates the number of flexible receptor side-chains. ADFR scales by a factor of 2, while Vina8 scales by a factor of 62, when 12 protein side-chains are made flexible.
Fig 7.
Frequency of receptor side-chain changes in the GA population during a successful docking of the 4EK6 ligand docked into the corresponding 4EK3 apo receptor with 12 flexible side-chains.
The figure plots the evolution of the average number of receptor side-chains with a modified conformation over successive generations of the GA optimization. In the initial population all receptor side-chains are in the apo conformation. The number of side-chains changing rotameric state in individuals of the optimized population quickly increases in the first few generations and reaches a plateau. This profile is typical and is observed in all runs for all systems.
Fig 8.
Comparison of side-chain conformations between apo, holo, and successfully docked solution.
This figure provides a pairwise comparison of the conformations of the apo receptor (4EK3), the holo complex (1YKR), and the docked solution for the 1YKR ligand, with the 12 flexible receptor side-chains displayed as ball-and-sticks. A) Apo vs. holo. The native bound ligand is displayed as sticks with green carbon atoms along with a partially transparent green molecular surface. The 2 lysine side-chains in the apo conformation severely overlap with the space occupied by the ligand. B) Docked vs. apo. The docked solution is shown with purple carbon atoms and a partially transparent ligand molecular surface. The apo structure is shown with orange carbon atoms. All 12 side-chains in the docked solution adopt conformations different from the initial apo conformation. Most undergo only small adjustments, while others (Lys33 and Lys89) adopt substantially different conformations to resolve steric clashes. C) Docked vs. holo. The docked solution (purple carbon atoms) is shown with the holo receptor (green carbon atoms). The ligand is docked accurately (RMSD from the crystallographic structure is 0.34Å) and the receptor side-chains changed their conformations to accommodate the ligand in the correct binding mode.
Fig 9.
Heat map of ligand-flexible receptor atomic contacts reproduced in docked poses.
The 43 systems reported here are the ones for which ADFR reports a correct docked solution (i.e. ligand RMSD < 2.5Å). The rank of the solution among 50 GA runs is reported. White cells correspond to flexible side-chains not interacting with the ligand in either the holo or the docked complex. Grey cells indicate interactions formed in the docked solution that do not exist in the holo complex. The remaining cells are colored using a red-to-green color scale indicating the percentage of holo interacting atomic pairs reproduced by the docked solution. A green cell (rate of 100%) indicates that every pairwise atomic interaction between ligand atoms and the side-chain atoms of the residue corresponding to that cell is reproduced in the docked solution. The histogram displays the percentage of holo interactions that are reproduced across all 12 side-chains for every ligand. Every docked ligand reproduces at least 57.1% of the interacting pairs in the holo complex, with an average of 79.8%.
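The cell coloring and the histogram values described in this caption reduce to set operations on interacting atom pairs. The sketch below assumes the interacting pairs have already been determined (the criterion used to define a contact is not given here); the function names and atom labels are hypothetical.

```python
def classify_cell(holo_pairs, docked_pairs):
    # Returns (cell type, percentage of holo pairs reproduced or None).
    holo, docked = set(holo_pairs), set(docked_pairs)
    if not holo and not docked:
        return "white", None   # no interaction in holo or docked complex
    if not holo:
        return "grey", None    # interaction exists only in the docked solution
    return "colored", 100.0 * len(holo & docked) / len(holo)


def overall_rate(cells):
    # Percentage of holo pairs reproduced across all flexible side-chains.
    # `cells` maps a residue name to (holo_pairs, docked_pairs).
    n_holo = sum(len(set(h)) for h, _ in cells.values())
    n_hit = sum(len(set(h) & set(d)) for h, d in cells.values())
    return 100.0 * n_hit / n_holo if n_holo else None


# Hypothetical (ligand atom, side-chain atom) pairs for two residues.
cells = {
    "Lys33": ([("C1", "NZ"), ("O2", "CE"), ("N3", "CD"), ("C4", "CG")],
              [("C1", "NZ"), ("O2", "CE"), ("N3", "CD"), ("C9", "CB")]),
    "Lys89": ([("O5", "NZ"), ("C6", "CE")],
              [("O5", "NZ"), ("C6", "CE")]),
}
print(classify_cell(*cells["Lys33"]))  # ('colored', 75.0)
print(classify_cell(*cells["Lys89"]))  # ('colored', 100.0)
print(overall_rate(cells))
```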
Fig 10.
Impact of down-weighting the receptor internal energy.
A) and B) Sorted ranks of the correct docked solutions without scaling the receptor energy (blue) and with a scaling factor of 1/NFS (green), where NFS is the number of flexible receptor side-chains, for the SEQ17 and CDK2 FS12 cross-docking calculations, respectively. Overall, down-weighting the receptor energy improves the rank of the lowest-energy correct solution. The top horizontal line (Rank 51) in the plots represents cases in which no correct solution was found in the 50 docking runs. C) and D) Distributions of improvements in receptor-ligand interaction energies (ER-L), in kcal/mol, when the internal energy of the receptor is down-weighted in the scoring function, for the SEQ17 and CDK2 FS12 calculations, respectively.
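The down-weighting by 1/NFS can be sketched as follows. The decomposition of the score into three additive terms, the function name, and the energy values are assumptions made for illustration; only the 1/NFS scaling factor applied to the receptor internal energy comes from the caption.

```python
def total_score(e_receptor_ligand, e_ligand_internal, e_receptor_internal,
                n_flexible_sidechains, downweight=True):
    # Scale the receptor internal energy by 1/NFS when down-weighting is on.
    # With no flexible side-chains the term is left unscaled (assumption).
    if downweight and n_flexible_sidechains > 0:
        scale = 1.0 / n_flexible_sidechains
    else:
        scale = 1.0
    return e_receptor_ligand + e_ligand_internal + scale * e_receptor_internal


# Hypothetical energies in kcal/mol with 12 flexible side-chains.
print(total_score(-10.0, 1.0, 6.0, 12))                    # -8.5
print(total_score(-10.0, 1.0, 6.0, 12, downweight=False))  # -3.0
```

With the scaling applied, a strained receptor conformation is penalized less, so the search is freer to trade receptor internal energy for better receptor-ligand interactions.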
Fig 11.
Impact of making 12 receptor side-chains flexible when docking ligands into the native holo receptor and the apo receptor.
An expected loss of accuracy is observed when making the native holo receptor flexible, reflecting shortcomings in the scoring function and search method. Adding flexibility to the apo receptor, however, improves the docking success rate. Holo docking success rates are shown for ligand RMSD < 2Å. The success rate for apo cross-docking increases from 17.3% to 36.5% with a 2Å RMSD cutoff, and from 23.1% to 44.2% with a 2.5Å RMSD cutoff (darker shade bars).