Protein Structure Validation and Refinement Using Amide Proton Chemical Shifts Derived from Quantum Mechanics

We present the ProCS method for the rapid and accurate prediction of protein backbone amide proton chemical shifts - sensitive probes of the geometry of key hydrogen bonds that determine protein structure. ProCS is parameterized against quantum mechanical (QM) calculations and reproduces high-level QM results obtained for a small protein with an RMSD of 0.25 ppm (r = 0.94). ProCS is interfaced with the PHAISTOS protein simulation program and is used to infer statistical protein ensembles that reflect experimentally measured amide proton chemical shift values. Such chemical shift-based structural refinements, starting from high-resolution X-ray structures of Protein G, ubiquitin, and SMN Tudor Domain, result in average chemical shifts, hydrogen bond geometries, and trans-hydrogen bond ($^{h3}J_{NC'}$) spin-spin coupling constants that are in excellent agreement with experiment. We show that the structural sensitivity of the QM-based amide proton chemical shift predictions is needed to obtain this agreement. The ProCS method thus offers a powerful new tool for refining the structures of hydrogen bonding networks to high accuracy, with many potential applications such as protein flexibility in ligand binding.


S2 Bonds to carboxylic acids and alcohols
In this section, a model of the chemical shift contributions due to hydrogen bonding to carboxylic acid and alcohol side chains is described. These are common hydrogen bond acceptors in proteins. Since the C-terminus and the aspartic acid and glutamic acid side chains are, in the current Phaistos framework, always given in the deprotonated state, only this state is considered in this section.
An approach similar to the amide-amide model due to Barfield[1] is used. The model system consists of two molecules, modeling the amide hydrogen bonding donor and acceptor complex, and the geometric dependence is modeled by scanning conformations over relevant angles and distances.
As an approximation to the carboxylic acid functional groups found in the protein (aspartate, glutamate and the C-terminus), an acetate anion is used. The alcohol functional groups of threonine, serine and tyrosine are approximated by a methanol molecule. As a backbone amide model, an N-methylacetamide (NMA) molecule is used.
Using minimal amide, carboxylic acid and alcohol models, a scan over a range of bond angles and distances is carried out. The hydrogen bonding distance is modeled in a range from 1.5Å to 2. To allow for the prediction of any geometry between the grid points, a linear interpolation algorithm is used.
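The interpolation step can be illustrated with a minimal sketch. It assumes a regular two-dimensional grid (one distance axis, one angle axis) and bilinear interpolation, which is one common reading of "linear interpolation" on a grid; the actual number of scanned coordinates is not specified here. The function names, the grid axes and the tabulated values below are hypothetical placeholders, not the scan range or QM data of this work.

```python
from bisect import bisect_right


def _bracket(axis, x):
    """Return (i, t) so that x = axis[i] + t * (axis[i + 1] - axis[i])."""
    if not axis[0] <= x <= axis[-1]:
        raise ValueError("point lies outside the scanned grid")
    # Clamp so that the last grid point falls in the final interval.
    i = min(bisect_right(axis, x) - 1, len(axis) - 2)
    t = (x - axis[i]) / (axis[i + 1] - axis[i])
    return i, t


def interpolate_shift(distances, angles, table, r, theta):
    """Bilinear interpolation of table[i][j] given at (distances[i], angles[j])."""
    i, t = _bracket(distances, r)
    j, u = _bracket(angles, theta)
    return ((1 - t) * (1 - u) * table[i][j]
            + t * (1 - u) * table[i + 1][j]
            + (1 - t) * u * table[i][j + 1]
            + t * u * table[i + 1][j + 1])


# Placeholder grid: distances in Å, angles in degrees, shifts in ppm.
# These numbers are purely illustrative and are NOT taken from the QM scan.
distances = [1.5, 2.0, 2.5]
angles = [150.0, 180.0]
table = [[4.0, 5.0],
         [2.0, 3.0],
         [1.0, 1.5]]

midpoint = interpolate_shift(distances, angles, table, 1.75, 165.0)
on_node = interpolate_shift(distances, angles, table, 2.0, 180.0)
```

At a grid node the sketch returns the tabulated value itself, and a geometry outside the scanned range raises an error rather than being extrapolated; how out-of-range geometries are handled in practice is a separate design choice.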

S3 Solvent exposed amide protons
Accurate modelling of the solvent around a protein is an extremely difficult problem in computational chemistry. Since Phaistos generates structures that do not contain explicit solvent (e.g. crystallographic water molecules), the simplistic model used in this work implicitly assumes the presence of water molecules whenever the amide proton is considered to be solvent exposed.
The chemical shift contribution from solvent exposure of the amide proton is here assumed to be equivalent to the contribution from a hydrogen bond to a water molecule. A typical contribution from a water molecule in an energy-minimized hydrogen bonding conformation is found in section S3.1 to be +2.07 ppm. As a crude approximation, solvent exposed amide protons are assigned a fixed primary bond contribution of +2.07 ppm.

S3.1 Hydrogen bonding to a water molecule
A water molecule is placed near the amide hydrogen atom of a probe NMA molecule and a B3LYP/6-311++G(d,p) minimization is carried out. The resulting structure is a local minimum of an amide hydrogen bonded to a water molecule. From this geometry the chemical shift of the entire dimer is then calculated.
Using four different starting geometries, the water molecule was minimized into two different conformations (see Fig. S5A and S5B). The hydrogen bonding geometries of these two conformations were similar in several respects: in both cases the water oxygen was aligned with the N-H bond axis, and the hydrogen bonding distances were similar at 2.04 Å and 2.07 Å, respectively. To separate out the change in chemical shift due to changes in the internal geometry of the NMA molecule, another NMR calculation was carried out on the optimized systems with the water molecule removed. One minimization done without the geometry restriction led to a 60° rotation of a methyl group. The subsequent analysis of the systems with the water molecule removed revealed that this rotation caused an extra shielding of about 0.2 ppm. However, by using the NMA geometries from the optimized dimers as the reference, this artifact was removed. The resulting chemical shift contributions due to the water molecule turned out to be very similar, at +2.04 ppm and +2.09 ppm, respectively. The chemical shifts in these minima serve as rough figures for the chemical shift due to solvent exposure. We thus assign a contribution of +2.07 ppm to the total chemical shift of solvent exposed amide protons.
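The arithmetic behind these figures can be checked with a short sketch using the values reported in Table 1. It assumes that the assigned +2.07 ppm is approximately the mean of the two reported differences; the text does not state this explicitly. The variable names are illustrative only.

```python
# Amide proton chemical shifts as reported in Table 1 (ppm).
dimer = {"A": 6.23, "B": 6.45}        # optimized NMA-water dimer
nma_alone = {"A": 4.19, "B": 4.37}    # same NMA geometry, water removed
reported_difference = {"A": 2.04, "B": 2.09}

# Contribution of the water hydrogen bond, recomputed from the rounded entries.
recomputed = {k: dimer[k] - nma_alone[k] for k in dimer}

# Mean of the two reported differences (assumed origin of the +2.07 ppm value).
mean_reported = sum(reported_difference.values()) / len(reported_difference)

for k in sorted(dimer):
    print(f"Minimum {k}: recomputed {recomputed[k]:.2f} ppm, "
          f"reported {reported_difference[k]:.2f} ppm")
print(f"Mean of reported differences: {mean_reported:.3f} ppm")
```

Recomputing from the rounded table entries gives 2.04 ppm for minimum A and 2.08 ppm for minimum B; the 0.01 ppm gap to the reported +2.09 ppm is presumably a rounding effect in the tabulated shifts. The mean of the two reported differences is 2.065 ppm, consistent with the assigned +2.07 ppm.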

Chemical shift               Minimum A    Minimum B
Optimized NMA-Water Dimer    6.23 ppm     6.45 ppm
Optimized NMA alone          4.19 ppm     4.37 ppm
Difference                   +2.04 ppm    +2.09 ppm

Table 1: The chemical shift of the amide proton in two local energy minima. "NMA-Water Dimer" is the chemical shift of the NMA amide proton in the dimer. "NMA alone" is the chemical shift of the NMA amide proton in the optimized configuration when the water molecule is removed and no further optimization is carried out. The resulting difference corresponds to the change in chemical shift due to a hydrogen bond to a water molecule.
Figure S5: Two different local energy minima, (A) and (B), of the NMA-water dimer. The hydrogen bonding distances are almost identical. The water molecule is rotated 90° between A and B, and one methyl group has a 60° difference in rotation between A and B.