Fig 1.
Implicit and explicit treatment of water in Rosetta.
Implicit water score function potentials, panels A-D. Potential plots were generated by orienting the N-H and C=O groups of two ALA residues along the same axis with an H–O distance of 1.3 Å (origin). The donor residue is then shifted ± 7 Å to generate a planar cut of the solvation potentials between the N and O atoms. All plots have units of kcal/mol [13, 14]. (A) fa_sol term: isotropic desolvation penalty implemented in Rosetta using the Lazaridis-Karplus model. (B) lk_ball term: anisotropic correction for polar atom types, first introduced in the REF2015 score function. (C) lk_bridge term: anisotropic solvation reward introduced in the Rosetta-ICO score function. (D) Composite of panels A-C, using the finalized Rosetta-ICO score term weights. Explicit water placement with Rosetta-ECO, panels E-H. (E) Initial possible solvation sites (blue) are based on statistics of water positions around backbone polar atoms, in addition to sites around side-chain polar atoms, considering all possible non-clashing rotamers. Pictured is the interface of PDB ID 1P57, between the N-terminal (pink) and catalytic (teal) domains of hepsin, with crystallographic waters in transparent grey. (F) After an initial stage of Monte Carlo packing of both the possible water sites and the surrounding protein side chains, a cutoff is applied based on the water occupancy of each site over the simulation (blue = 0% occupancy, green = 25%, red = 50%). (G) Remaining water sites are clustered, and a second cutoff is applied to the cumulative dwell time of each cluster. (H) The final predicted water sites are converted into three-atom water molecules, and their orientations are reoptimized together with nearby side-chain conformations using the Rosetta all-atom energy function.
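The filtering logic of panels F and G can be illustrated with a minimal sketch. This is not the Rosetta-ECO implementation: the function name, the greedy clustering scheme, and all threshold values (`site_cutoff`, `cluster_radius`, `cluster_cutoff`) are assumptions chosen for illustration only. It assumes each candidate site has a position and an occupancy, defined here as the fraction of Monte Carlo steps in which the site held a water.

```python
import numpy as np

def filter_and_cluster_sites(coords, occupancy,
                             site_cutoff=0.25,    # hypothetical per-site occupancy cutoff
                             cluster_radius=1.0,  # hypothetical clustering radius (Å)
                             cluster_cutoff=0.5): # hypothetical cumulative dwell cutoff
    """Return a list of (center, dwell) tuples for the surviving clusters."""
    coords = np.asarray(coords, dtype=float)
    occupancy = np.asarray(occupancy, dtype=float)

    # Stage 1 (panel F): discard sites whose occupancy is below the cutoff.
    keep = occupancy >= site_cutoff
    coords, occupancy = coords[keep], occupancy[keep]

    # Stage 2 (panel G): greedy clustering, seeded at the most occupied site.
    order = np.argsort(-occupancy)
    assigned = np.zeros(len(coords), dtype=bool)
    waters = []
    for i in order:
        if assigned[i]:
            continue
        dist = np.linalg.norm(coords - coords[i], axis=1)
        members = (~assigned) & (dist <= cluster_radius)
        assigned |= members

        # Cumulative dwell time of the cluster: sum of member occupancies.
        dwell = occupancy[members].sum()
        if dwell >= cluster_cutoff:
            # Occupancy-weighted centroid as the predicted water position.
            center = np.average(coords[members], axis=0,
                                weights=occupancy[members])
            waters.append((center, float(dwell)))
    return waters
```

In this sketch, two neighboring sites with moderate occupancy can jointly pass the second cutoff even when neither would pass it alone, which is the motivation for applying the dwell-time cutoff after clustering rather than before.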
Table 1.
Classification of predicted native waters (test set of 123).
Table 2.
Performance of solvation schemes on protein-protein and protein-small molecule docking discrimination.
Fig 2.
Protein-protein docking results.
(A) Scatter plot comparing results of 53 cases between REF2015 and Rosetta-ECO. Values are the average Boltzmann-weighted discrimination score ± 1σ from three independent runs. (B) Energy funnels for PDB ID 1E6E, adrenodoxin reductase bound to adrenodoxin (red data point in 2A), plotting computed ΔGbind vs. RMSD from the native binding conformation for three different scoring methods. Discrimination scores for each distribution are noted in the bottom right of each plot. (C) Explicitly solvated near-native docking pose (RMSD = 0.14 Å; pink data point in 2B) with the reductase in grey and adrenodoxin in rainbow (N- to C-terminus colored blue to red). (D) Coordination of some predicted interface waters.
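The caption does not define the discrimination score. As a rough illustration of what a Boltzmann-weighted measure over an energy funnel can look like, the sketch below computes the Boltzmann-weighted fraction of near-native decoys. This is one common formulation and an assumption here, not necessarily the definition used for this figure; the function name, the RMSD cutoff, and the temperature factor `kT` are all hypothetical.

```python
import numpy as np

def boltzmann_near_native_fraction(energies, rmsds, rmsd_cutoff=2.0, kT=1.0):
    """Boltzmann-weighted fraction of decoys within rmsd_cutoff of native.

    Values near 1 mean the lowest-energy decoys are near-native (a good
    funnel); values near 0 mean low-energy decoys are far from native.
    """
    energies = np.asarray(energies, dtype=float)
    rmsds = np.asarray(rmsds, dtype=float)

    # Subtract the minimum energy before exponentiating for numerical stability.
    weights = np.exp(-(energies - energies.min()) / kT)

    near_native = rmsds <= rmsd_cutoff
    return float(weights[near_native].sum() / weights.sum())
```

For example, a decoy set whose single lowest-energy member lies at 0.5 Å RMSD and sits 5 kT below two distant decoys gives a value close to 1, while reversing the energies gives a value close to 0.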
Fig 3.
Protein-ligand docking results.
(A) Scatter plot comparing results of 46 cases between baseline (REF2015) and Rosetta-ECO. Values are the average Boltzmann-weighted discrimination score ± 1σ from three independent runs. (B) Energy funnels, similar to Fig 2B, for PDB ID 1X8X, tyrosyl-tRNA synthetase bound to tyrosine (red data point in 3A). (C) Explicitly solvated near-native docking pose in pink (RMSD = 0.43 Å; pink data point in 3B) with the native ligand in transparent blue. (D) Explicitly solvated decoy binding pose (RMSD = 6.57 Å; yellow data point in 3B). (E-H) A comparison of recovered waters (red) to high-resolution crystallographic waters (green spheres) from PDB ID 1N2J (panels E-G) and PDB ID 1U4D (panel H).