13C NMR Reveals No Evidence of n−π* Interactions in Proteins

An interaction between neighboring carbonyl groups has been postulated to stabilize protein structures. Such an interaction would affect the C chemical shielding of the carbonyl groups, whose paramagnetic component is dominated by and excitations. Model compound calculations indicate that both the interaction energetics and the chemical shielding of the carbonyl group are instead dominated by a classical dipole-dipole interaction. A set of high-resolution protein structures with associated carbonyl C chemical shift assignments verifies this correlation and provides no evidence for an inter-carbonyl interaction.


Introduction
Proteins exhibit a diversity of structures, with 1,282 folds or topologies present in the CATH database [1]. Each unique structure is defined by its amino acid composition, where sequence identities greater than 40% imply homologous structures [2]. Predicting the three-dimensional conformation of a protein from its primary sequence is a fundamental challenge of structural biology, and achieving this goal requires a thorough understanding of the underlying interactions and forces that stabilize protein structures [3].
Hydrophobic interactions and hydrogen bonds are two of the most common forces attributed to the overall stability of protein structures [4,5]. The burial of hydrophobic residues is generally considered a major driving force in protein folding [6] and has been predicted to contribute roughly 8 kJ/mol per buried residue. Conversely, the contribution of hydrogen bonds to protein structure stability has been controversial [7]. Hydrogen bonds have been described as destabilizing, partially stabilizing or important driving forces. Of course, hydrogen bonds are a defining feature of a-helices, b-sheets and turns. Thus, the generally accepted view is that hydrogen bonds within a protein structure are marginally favored over hydrogen bonds to water. Hydrogen bonds are estimated to contribute roughly 4 kJ/mol to protein stability, but can vary based on the polarity of the microenvironment [8]. Despite these observations, a satisfying general mechanism for protein folding has not been described [9,10]. This implies that our understanding of the factors involved in protein folding and stability is incomplete.
In a recent paper, Bartlett et. al. proposed a new and potentially important interaction analogous to the hydrogen bond [11]. Unfortunately, the predicted n{p Ã interaction was based on density functional theory and a relatively low-level basis set.
However, conventional Kohn-Sham density functional theory does not properly model virtual orbitals [12] such as the p Ã orbital proposed by Bartlett et. al. to have a role in protein stabilization. Moreover, the relatively low-level basis set used by the authors is inadequate to model such orbitals, and likely gives rise to substantial basis-set superposition errors. Also, experimental data in support of this prediction was not presented. Nevertheless, the predicted n{p Ã interaction was suggested to aid in the stabilization of protein structures and contribute roughly 0.4 to 5.4 kJ/ mol. This stabilization is predicted to occur through the electron delocalization of the lone pair (n) of a carbonyl oxygen atom to the antibonding orbital (p Ã ) of a neighboring carbonyl carbon atom. An optimal n{p Ã interaction was predicted to be restricted to a specific range of structural parameters ( Figure 1A) corresponding roughly to the Bürgi-Dunitz trajectory [13]. The distance (d) between the donor oxygen and acceptor carbon must be ƒ3.2 Å , and the angle between the (donor O)Á Á Á(acceptor C) vector and the acceptor carbonyl vector (h) must lie between 99u and 119u. Interestingly, the structural parameters required for an optimal n{p Ã interaction are prevalent in a wide variety of common secondary structures, including a-helices, 3 10 -helices and twisted b-sheets, suggesting a potential alternative explanation.
Despite the presence of numerous conformations consistent with the n{p Ã interaction in protein structures, no experimental evidence has been presented that supports the actual existence of this interaction. NMR chemical shifts of sp 2 -hybridized groups contain a paramagnetic component caused by mixing of excited states with non-zero orbital angular momentum into the diamagnetic ground state [14]. These excitations are predominantly n{p Ã and p{p Ã and are therefore highly sensitive to perturbations of these orbitals. The predicted n{p Ã interactions between neighboring carbonyls would be expected to modify the local electronic environment of the acceptor carbonyl carbon nucleus, and the NMR 13 C chemical shift of the acceptor carbonyl carbon would experience a significant chemical shift change in the presence of an n{p Ã interaction [15]. A large (roughly 20 ppm range) linear relationship has been previously observed between carbonyl 13 C chemical shifts and the carbonyl n{p Ã transition energy [16,17].
An extensive analysis of 13 C chemical shifts correlated to highresolution x-ray structures combined with a detailed analysis of the molecular orbitals of a formamide trimer model complex does not support the postulated n p {p Ã interaction. In fact, our model indicates that an n p {p Ã interaction is implausible. Instead, a simple dipole-dipole interaction better explains the observed effects used in support of the n{p Ã interaction. A prior manuscript by the same authors dismissed the dipole-dipole interaction explanation without elaboration [18].

Results
A total of 2,516,360 residue pairs from a set of 164 highresolution (v1.6 Å ) x-ray crystal structures with assigned and uniformly referenced carbonyl 13 C chemical shifts were analyzed for potential n{p Ã interactions. Setting a maximal distance of 6.0 Å between the donor oxygen and acceptor carbon yielded 45,792 potential acceptor carbonyl carbon atoms. The carbonyl 13 C chemical shift differences relative to random coil values for each of the 45,792 potential acceptor carbonyls were plotted against the d and h values for each carbonyl pair ( Figure 1B). These chemical shift differences represent the contribution from the local structural environment, and potentially the n{p Ã interaction. The two-dimensional contour plot indicates a maximal downfield shift of 2.9 ppm centered on the optimal structural parameters predicted for an n{p Ã interaction.
Of the 45,792 carbonyls, 5,378 had optimal d and h values for an n{p Ã interaction and 40,414 were outside this optimal range. The mean carbonyl 13 C chemical shift difference for the 40,414 carbonyls labeled as non-interactors is 0.58 6 1.98 ppm. In contrast, the mean carbonyl 13 C chemical shift difference for the 5,378 interactors is 2.93 6 2.41 ppm. A Student's t-test indicates the difference of 2.35 ppm between the two means is statistically significant at the 99.9% confidence level. To address possible errors introduced into the analysis by highly dynamic residues in the x-ray structures, possible carbonyl interactors with B-factors greater than two standard deviations above the mean were omitted from the dataset. In the resultant set of 44,302 potential carbonyl interactors, the 2.33 ppm chemical shift difference was statistically indistinguishable from the original analysis. Similarly, possible interactors predicted at a 95% confidence level to participate in crystal-packing interfaces were also omitted from the dataset. Again, the corresponding set of interactors yielded a chemical shift difference of 2.33 ppm, which is still statistically significant at the 99.9% confidence level.
To address the possiblity that differences between x-ray crystal structures and NMR solution structures may lead to errors in the analysis, a replicate analysis was performed on a set of 137 NMR solution structures corresponding to the same set of 13 C and 15 N chemical shifts used previously. Structural alignments using MAMMOTH showed a mean rmsd of 1.87 6 0.57 Å between the pairs of x-ray and NMR structures. Of the 1,419,547 resulting carbonyl pairs from the NMR structures, 38,534 pairs were found to be potential interactors. Of the carbonyls in that set, 2,510 interactors were found with a mean carbonyl 13 C chemical shift difference of 2.84 6 1.71 ppm. The remaining 36,024 noninteractors had a mean carbonyl 13 C chemical shift difference of 1.0262.02 ppm. Again, the 1.82 ppm difference between the two means is statistically significant at the 99.9% confidence level, indicating that differences between x-ray and NMR structures cannot account for the observed downfield 13 C chemical shift.
As predicted, a clear correlation is observed between structural regions consistent with an optimal n{p Ã interaction and a downfield shift of the accepting carbonyl 13 C resonance. Interestingly, the potential n{p Ã interactions were primarily observed between sequential (Di{jD~1) carbonyls. Out of the 164 structures and 2,516,360 residue pairs, only four pairs of carbonyls exhibited a through-space (Di{jDw5) arrangement consistent with an optimal n{p Ã interaction. This result implies any protein stabilization energy obtained from the proposed n{p Ã interaction is opportunistic, as opposed to a driving force for protein folding. Apparently, the formation of through-space n{p Ã interactions is simply less favorable than for other interactions, such as hydrogen bonds or salt-bridges. This also implies that the predicted energy of 5.4 kJ/mol for an optimal n{p Ã interaction is an over-estimate.
In actuality, an n{p Ã interaction that imparts a stability of 5.4 kJ/mol would likely fix adjacent pairs of carbonyl groups to preferred torsional angles in order to maximize this interaction. Correspondingly, the existence of these highly energetic n{p Ã interactions would likely be detrimental to properly folding a protein structure. Folding a protein to its native fold would require distorting the majority of carbonyl pairs away from the ideal torsion angles for a proper n{p Ã interaction. Only 12% (5,378 out of 45,792) of carbonyls from the 164 x-ray structures adopted conformations with optimal d and h values for an n{p Ã interaction. As a result, folding every protein structure would incur an initial energetic penalty of nearly 5.4 kJ/mol per carbonyl pair.
A predominant number of the carbonyls consistent with an optimal n{p Ã interaction and with a downfield shift of roughly 2.5 ppm fall within the typical a-helical region of a Ramachandran plot, where the remaining residues are near the twisted b-sheet region (see Figure S2). Significant chemical shift changes for carbonyl residues within secondary structures are well documented [19]. Previous analyses of structural factors contributing to carbonyl 13 C chemical shifts have implicated hydrogen bond formation [20][21][22] or excluded hydrogen bond formation [23][24][25], have implicated w, y, and x dihedral angles [24] or have excluded secondary structure parameters [20,23]. Thus, other factors, such as hydrogen bonds or dipole-dipole interactions, may explain the apparent correlation between carbonyl 13 C shifts and the optimal d and h values for an n{p Ã interaction. This is probable given the association of n{p Ã interactions with secondary structure elements. The contribution of a dipole-dipole interaction to carbonyl 13 C chemical shifts is illustrated in Figure 2. The dipoledipole potentials were calculated using the high-resolution x-ray structures for each of the 45,792 carbonyl pairs with a maximal distance of 6.0 Å between the donor oxygen and acceptor carbon. While there is significant scatter in the data, there is also a clear trend between a downfield carbonyl 13 C chemical shift and an increasing dipole-dipole energy. Importantly, the cluster of acceptor carbonyls in Figure 2 with the largest 13 C chemical shift difference (3.15 + 2.44 ppm) and positive dipole-dipole potentials also conforms to the optimal d and h values for the predicted n{p Ã interaction.
The contribution of a hydrogen-bond interaction to the carbonyl 13 C chemical shift was similarly evaluated by calculating the shortest oxygen-hydrogen distance (d O{H ) for each donor carbonyl. Again, the distances were calculated using the highresolution x-ray structures for each of the 45,792 carbonyl pairs. A three-dimensional plot comparing the dipole-dipole potentials, oxygen-hydrogen distances, and the associated carbonyl 13 C chemical shifts is very revealing. It can be clearly seen from Figure 3 that any contribution from a hydrogen bond to the 13 C carbonyl chemical shift is minimal relative to the dipole-dipole contribution. Both the a-helical and b-sheet regions, which obviously contain hydrogen bond interactions, have distinctly different 13 C carbonyl chemical shifts. The a-helical region corresponds to a positive dipole-dipole interaction, and correspondingly to a large carbonyl 13 C chemical shift difference. Conversely, the b-sheet region has a negative dipole-dipole interaction and a near zero carbonyl 13 C chemical shift difference.
These results further indicate a consistency with a dipole-dipole interaction as opposed to the predicted n{p Ã interaction.
It is important to note that there is a second cluster of carbonyls in Figure 2 with low 13 C chemical shifts and negative dipole-dipole potentials that are also consistent with the optimal d and h values for the predicted n{p Ã interaction. A visual inspection of the xray structures indicates that these carbonyl pairs are actually pointing away from each other and do not form the configuration for an n{p Ã interaction illustrated in Figure 1A. Clearly, d and h values alone fail to adequately define the optimal geometry of the dipole-dipole interaction that is apparently responsible for the observed downfield 13 C chemical shifts.
To further examine the origin of these effects, quantum chemical calculations were conducted on a model system, a formamide trimer in which molecules 2 and 3 form an approximately planar, head-to-tail hydrogen bonded dimer, and molecule 3 acts as a putative n{p Ã donor, with the n p 'donor' oxygen fixed at a distance d which ranges from 2.9 Å and 3.3 Å from the carbonyl carbon of molecule 2, with the O 3 Á Á ÁC 2 vector also fixed at angles h from 70u to 120u from the C 2 = O 2 vector. To avoid problems with the use of density functional theory to model virtual orbitals, Möller-Plesset second order perturbation theory (MP2) was instead used, with a substantially larger basis set than in the previous work. The geometry and relevant Hartree-Fock orbitals of the complex used is shown in Figure 4, for d = 2.9 Å and h~100 0 . The computed chemical shielding is shown in Figure 5A as a function of d and h. The shielding decreases monotonically with h, but, in contrast, the slope of the shielding surface with respect to d changes sign between h~70 0 and h~120 0 . This shielding surface does not have the geometry expected if the chemical shielding dependencies on h and d were dominated by an n p {p Ã interaction, where shielding should be maximum at h slightly larger than 90u and d = 2.9 Å , decreasing rapidly with increasing values of d.
However, the shielding surface does show a remarkable similarity to the dipole-dipole energy between the putative donor and acceptor, as shown in Figure 5B. This energy was computed using a very simple model assuming the electric dipole vector lies along the carbonyl bond for both molecules and has a value of 2.34 D or 7:81|10 {30 C : m. As can be seen, the dipole energy closely parallels the chemical shielding surface, monotonically increasing with h and inverting its slope with respect to d as h increases. This indicates the major influence on the carbonyl 13 C chemical shielding is not an n p {p Ã interaction but rather the electrostatic field from the neighboring carbonyl dipole. The correspondence is not, however, exact: the chemical shielding surface shows a small negative inflection around h~90 0 , which is actually slightly reversed in the dipolar energy plot.
In order to examine whether an n p {p Ã interaction might be responsible for this inflection, the chemical shielding surface was fit to a function proportional to the dipolar energy, under the assumption the dipole moment vector lies along the C = O bond direction, and the best fit subtracted from the chemical shielding surface (see Figure S3A). The residual shows a minimum at h*95 0 , as would be expected for an n p {p Ã interaction, but the dip does not appear to decrease rapidly as d increases, as an orbital overlap term would. In fact, the residual is slightly larger at d = 3.3 Å than at 2.9 Å (1.3 ppm vs. 1.1 ppm).
From the fit of the shielding surface to the estimated dipole interaction energy, with the assumption the magnitude of the electric dipole moment is that of a formamide monomer (3.7 D), a dependence of chemical shielding on field of {190 ppm/a.u. was obtained (1 atomic unit (a.u.) of electric field equals 5:142|10 11 V/m). Direct calculations of the dependence of the shielding of an isolated formamide on an external applied field along the C = O bond direction gave a value of {150 ppm/a.u. However, it is highly likely that this estimation of the dipole-dipole interaction for two amides is low. Firstly, higher electric multipole terms were neglected in the calculation, and these are likely to be substantial for a moiety as asymmetric as a peptide linkage, at these close proximities. Second, the interaction of the dipoles is likely to be enhanced by the highly polarizable hydrogen bond, which is necessarily omitted in the monomer model. Agreement of the model with direct estimates of the effect of electric field on shielding is therefore rather good.
The dependence of chemical shielding on hydrogen bonding strength for all combinations of d and h was examined as a function of the hydrogen bond distance d O{N (see Figure S3B). In accordance with the results of Wishart and others [23][24][25], and  contrary to initial naïve expectations, the effect was very small and independent of the position of the putative n p {p Ã donor carbonyl.
For the sake of completeness, the effect of an 'end-on' carbonylcarbonyl interaction was examined, using a dimeric cluster in which the 'donor' carbonyl bond was parallel to the 'donor' oxygen-'acceptor' carbon vector, resulting in a possible n s {p Ã interaction. As can be seen in Figure 6, for values of d ranging from 2.9 Å to 4.1 Å , the chemical shielding also follows the negative of the dipolar interaction energy over the range 70 0 vhv120 0 , with little evidence of any effect of orbital overlap on chemical shielding.
One other outcome of the calculations is of possible note. While there was little discernible effect of the proposed n s {p Ã or n p {p Ã interactions on the shielding of the carbonyl carbon or the length of the carbonyl bond, substantial pyramidalization of the amide nitrogen was observed at low values of d and values of h close to 90u. This would indicate that the primary effect of the 'donor' carbonyl might not be on the carbonyl p bond per se, but on its delocalization over the entire amide group. There was also a substantial lengthening of the carbon-nitrogen bond -consistent with a reduced bond order -accompanied by substantial changes in the computed 15 N chemical shielding. Thus, while no evidence was found of effects from n s {p Ã interactions on the 13 C NMR spectroscopy or the energetics of the system, such interactions might be detectable in 15 N chemical shifts. Unfortunately, 15 N shifts are known to be much more dispersed than carbonyl 13 C shifts and are susceptible to a wide range of influences, so disentangling the interaction in real proteins might be a Herculean task.

Discussion
When the molecular orbitals for the trimeric complex are examined in detail, the above results become clear. It is in fact misleading to think of amide groups as being dominated by the carbonyl p bond. The highest occupied molecular orbital (HOMO) of the formamide trimer in fact consists almost entirely of p z orbitals on the N and O, with wavefunctions of opposite sign. This is depicted in Figure 4A for the Hartree-Fock HOMO of the hydrogen bond donor (energy = {0:377 Ha). The orbital is slightly bonding with respect to the carbonyl, but the carbonyl carbon overall has very little contribution to the molecular orbital.  The equivalent orbital of the putative acceptor ( Figure 4B) has somewhat lower energy ({0:438 Ha) but shows remarkably little mixing with other molecular orbitals, and in particular little mixing with the n p orbital of the putative n p {p Ã donor ( Figure 4C). That orbital has in fact a very similar energy ({0:465 Ha), and at other geometries -specifically lower values of h, mixes with the HOMO of the acceptor. The reason for this is quite simple; because the HOMO has only a very small contribution for carbonyl carbon orbitals, bringing the n p orbital closer to it has very little effect. The mixing that is present at smaller values of h in fact seems to be partly responsible for the increased pyramidalization of the nitrogen of the acceptor at those orientations. We see no evidence of any orbital mixing that could be attributed to n p {p Ã interactions. Given the weakness of the mixing with orbitals that are very close in energy to n p it is implausible that substantial mixing would be observed with an orbital almost a Hartree higher in energy.
In conclusion, quantum chemical calculations, experimental carbonyl 13 C chemical shifts and structural data indicate that a simple electrostatic dipole-dipole interaction explains the large downfield carbonyl 13 C chemical shift in an a-helix. There is no evidence for a significant contribution from an n{p Ã interaction to the carbonyl bond. The single indication of n{p Ã interactions seems to be a substantial lengthening of the carbon-nitrogen bond and pyramidalization of the nitrogen at h angles favorable for these interactions. In fact, such pyramidalization seems to be a logical consequence of the electronic structure of amides, whose p orbitals are delocalized over the whole system.

Analysis of Experimental Structures
A detailed statistical analysis was performed to correlate experimentally observed carbonyl 13 C chemical shifts with structural parameters between all possible pairs of carbonyls. Specifically, the angle between the carbonyls (h) and the distance (d) between the oxygen and carbon were compared to experimental carbonyl 13 C chemical shifts. The PISCES [26] (http://dunbrack.fcc.edu/pisces) set of 2,885 high-resolution (v1.6 Å ) x-ray crystal structures with less than 30% pairwise sequence identity selected from the RCSB Protein Data Bank (PDB) [27] was used for this analysis. Each structure was associated with assigned NMR 13 C and 15 N chemical shifts from the Biological Magnetic Resonance Bank (BMRB: www. bmrb.wisc.edu) [28] by FASTA [29] sequence alignments. The best match with an E-value ƒ1:0|10 {9 and sequence identity §95% was chosen, where the median E-value was 3:8|10 {40 . The aligned 13 C and 15 N chemical shifts were uniformly referenced with the SHIFTCOR software tool [30]. The protein interfaces, surfaces and assemblies software tool (PISA, http:// www.ebi.ac.uk/pdbe/prot_int/pistart.html) [31] was used to predict residues involved in crystal packing interfaces. Residues with B-factors two standard deviations from the mean within each structure were identified as dynamic. Also, 3,699 NMR solution structures with PDB depositions cross-linked to the BMRB were used as a validation dataset, with alignments performed in an identical fashion to the analyzed x-ray structures.
A set of Perl scripts was written to extract structural parameters from the x-ray structures. For each structure in the selected set, all pairs of residues were analyzed for the possibility of an n{p Ã interaction. Values of d and h were calculated for each residue pair, and torsional angles w and y were calculated for the 'acceptor' residue of each pair. Pairs of carbonyls with d and h values within the optimal limits for an n{p Ã interaction were labeled as interactors (see Figure S1). Standard random-coil chemical shifts were subtracted from the experimental carbonyl 13 C chemical shifts for each residue.
For all pairs of residues, a dipole-dipole potential (V dd ) was calculated from the high-resolution x-ray structures using equation 1: wherem m 1 andm m 2 are the two C = O bond vectors,r r is the vector between the centers of the C = O bonds, andr r is its unit vector. The nominal value of 2.34 Debye was taken for the carbonyl dipole moment. Similarly, for all residue pairs, the minimum possible hydrogen bond length (d O{H ) was calculated from the high-resolution x-ray structures. Hydrogen bond lengths were calculated based on the nearest non-neighboring backbone amide hydrogen, with a maximal bonding angle of 60u.

Model Compound Calculations
Quantum chemical calculations were done using the program Gaussian-09 [32]. A nearly planar formamide head-to-tail dimer, composed of a formamide monomer (molecule 1) hydrogen bonded through its C = O group to the N-H group of a second, nearly parallel formamide (molecule 2) was chosen to approximate the hydrogen bonding motif found in both a-helices and antiparallel b-sheets. The dimer was fully optimized at the MP2/6-311++G(2d,p) level; Möller-Plesset second order perturbation theory (MP2) was chosen because it is superior in modeling long-range and dispersive contributions to the electron correlation Hamiltonian. A third formamide (molecule 3) was then added to generate the putative n{p Ã interaction with molecule 1, imposing the following constraints: (1) C 3 = O 3 Á Á ÁC 1 angle fixed at 90u, to ensure the n p orbital of molecule 3 points toward the carbonyl of molecule 1 (2) O 3 Á Á ÁC 1 = O 1 constrained to a set of fixed angles h, ranging from 70u to 120u (3) r O3 Á Á ÁC 1 constrained to a set of fixed distances d, ranging from 2.9 Å to 3.3 Å (5) r O1 Á Á ÁN 2 constrained to a set of fixed distances, ranging from 2.8 Å to 3.2 Å , to vary the strength of the hydrogen bond. The system of three molecules was then subjected to constrained optimization at the same level as before. The optimized trimolecular system at an angle h~90 0 is shown in Figure 5. Finally, a full set of shielding tensors was computed using standard GIAO methods. Figure S1 d,h-space Analysis of Experimental Structures. Plot of the distance (d) and angle (h) measured between each of the 45,792 pairs of carbonyls with a potential n{p Ã interaction. The relative density of points in the occupied d and h space was used to generate a transparency mask for Figure 1. (TIF) Figure S2 Ramachandran-space Analysis of Experimental Structures. Ramachandran plot of carbonyls with 13 C chemical shift differences relative to random coil that are w2.5 ppm. The acceptor carbonyls from each pair of carbonyls with d and h values within the optimal limits for an n{p Ã interaction are colored red.