Efficient consideration of coordinated water molecules improves computational protein-protein and protein-ligand docking discrimination
Fig 1
Implicit and explicit treatment of water in Rosetta.
Implicit water score function potentials, panels A-D. Potential plots were generated by orienting the N-H and C=O groups of two ALA residues along the same axis with an H–O distance of 1.3 Å (origin). The donor residue is then shifted ±7 Å to generate a planar cut of the solvation potentials between the N and O atoms. All plots have units of kcal/mol [13, 14]. (A) fa_sol term: isotropic desolvation penalty implemented in Rosetta using the Lazaridis-Karplus model. (B) lk_ball term: anisotropic correction for polar atom types, first introduced in the REF2015 score function. (C) lk_bridge term: anisotropic solvation reward introduced in the Rosetta-ICO score function. (D) Composite of panels A-C, using the finalized Rosetta-ICO score term weights. Explicit water placement with Rosetta-ECO, panels E-H. (E) Initial possible solvation sites (blue) are based on statistics of water positions around backbone polar atoms, together with sites around side-chain polar atoms, considering all possible non-clashing rotamers. Pictured is the interface of PDB ID 1P57, between the N-terminal (pink) and catalytic (teal) domains of hepsin, with crystallographic waters in transparent grey. (F) After an initial stage of Monte Carlo packing of both the possible water sites and the surrounding protein side chains, a cutoff is applied based on the water occupancy of each site over the simulation (blue = 0% occupancy, green = 25%, red = 50%). (G) Remaining water sites are clustered, and a second cutoff is applied to the cumulative dwell time. (H) The final predicted water sites are converted into three-atom water molecules, and their orientation is reoptimized together with nearby side-chain conformations using the Rosetta all-atom energy function.
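The filtering logic of panels F and G can be illustrated with a minimal Python sketch. It is not the Rosetta-ECO implementation: the `Site` class, the greedy clustering scheme, the occupancy-weighted centroid, and all three threshold values (`min_occupancy`, `radius`, `min_dwell`) are hypothetical choices made for illustration, and "cumulative dwell time" is assumed here to mean the summed occupancy of the sites in a cluster.

```python
from dataclasses import dataclass
from math import dist


@dataclass
class Site:
    """A candidate solvation site (hypothetical container, not a Rosetta type)."""
    xyz: tuple        # (x, y, z) position in angstroms
    occupancy: float  # fraction of Monte Carlo steps in which the site held a water


def filter_by_occupancy(sites, min_occupancy):
    """Panel F: keep only sites whose occupancy reaches the cutoff."""
    return [s for s in sites if s.occupancy >= min_occupancy]


def cluster_sites(sites, radius):
    """Panel G, step 1: greedy clustering (an assumed scheme).

    Sites are visited in order of decreasing occupancy. Each site joins the
    first cluster whose seed lies within `radius`; otherwise it seeds a new one.
    """
    clusters = []
    for site in sorted(sites, key=lambda s: s.occupancy, reverse=True):
        for cluster in clusters:
            if dist(site.xyz, cluster[0].xyz) <= radius:
                cluster.append(site)
                break
        else:
            clusters.append([site])
    return clusters


def predict_waters(sites, min_occupancy=0.25, radius=1.0, min_dwell=0.5):
    """Apply both cutoffs; return (centroid, cumulative dwell) per kept cluster.

    All default thresholds are placeholders, not values from the paper.
    """
    kept = filter_by_occupancy(sites, min_occupancy)
    waters = []
    for cluster in cluster_sites(kept, radius):
        dwell = sum(s.occupancy for s in cluster)  # assumed definition of dwell time
        if dwell >= min_dwell:
            # Panel G, step 2: the cluster passes the second cutoff, so it is
            # summarized by its occupancy-weighted centroid.
            centroid = tuple(
                sum(s.occupancy * s.xyz[i] for s in cluster) / dwell
                for i in range(3)
            )
            waters.append((centroid, dwell))
    return waters
```

In this sketch, two nearby sites that each pass the first cutoff can jointly pass the second one even if neither would alone, which is one plausible reason for applying the dwell time cutoff only after clustering. The final reoptimization of panel H depends on the full energy function and is not modeled here.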