Random Coil to Globular Thermal Response of a Protein (H3.1) with Three Knowledge-Based Coarse-Grained Potentials

The effect of temperature on the conformation of a histone (H3.1) is studied by a coarse-grained Monte Carlo simulation based on three knowledge-based contact potentials (MJ, BT, BFKV). Despite unique energy and mobility profiles of its residues, the histone H3.1 undergoes a systematic (possibly continuous) structural transition from a random coil to a globular conformation on reducing the temperature. The range over which such a systematic response in variation of the radius of gyration (Rg) with the temperature (T) occurs, however, depends on the potential, i.e. ΔTMJ ≈ 0.013–0.020, ΔTBT ≈ 0.018–0.026, and ΔTBFKV ≈ 0.006–0.013 (in reduced unit). Unlike MJ and BT potentials, results from the BFKV potential show an anomaly where the magnitude of Rg decreases on raising the temperature in a range ΔTA ≈ 0.015–0.018 before reaching its steady-state random coil configuration. Scaling of the structure factor, S(q) ∝ q−1/ν, with the wave vector, q = 2π/λ, and the wavelength, λ, reveals a systematic change in the effective dimension (De∼1/ν) of the histone with all potentials (MJ, BT, BFKV): De∼3 in the globular structure with De∼2 for the random coil. Reproducibility of the general yet unique (monotonic) structural transition of the protein H3.1 with the temperature (in contrast to non-monotonic structural response of a similar but different protein H2AX) with three interaction sets shows that the knowledge-based contact potential is viable tool to investigate structural response of proteins. Caution should be exercise with the quantitative comparisons due to differences in transition regimes with these interactions.


Introduction
The structures of proteins have been a subject of extensive investigation for decades particularly using computer simulations (with overwhelming amount of literature, the list is too large to cite) .Accurate potentials based on the structural details of atoms, molecules, and amino acids are of particular interest in modeling proteins.Incorporation of good potentials or force fields based on the fundamental interaction is highly desirable particularly in probing the structural details at small scales.With a welldefined force field based on the basic interactions (involving few fundamental parameters), it is easier to implement standard tools of statistical mechanics (e.g., molecular dynamics, Monte Carlo and variants).As a result, it is feasible to probe the effects of thermodynamic parameters such as temperature, concentration of solvent, molecular weight, substrate, etc. on the structure of protein.Due to the enormity of the time scale, it is not feasible to incorporate such elaborate force fields (involving electronic structures at atomic scales) in probing the conformational ensemble of larger proteins that can undergo dramatic structural transformation.
In order to carry out large-scale computer simulations and draw meaningful conclusions, some degree of approximation is unavoidable in almost all models involving all-atom details to minimalist coarse-grained descriptions.Some of the approximations and coarse-graining procedures include devising interaction potentials, exploring the phase space selectively, resorting to efficient and effective methods, etc.A considerable part of phenomenological modeling of proteins relies on the native structure, which is critical in performing its major function.Interaction among the amino acids (AAs), arising from fundamental atomic interactions and covalent bonding, and with the underlying matrix is critical in understanding the structure of the protein.What distinguishes one protein from another is the size of the protein (number of AAs) and the sequence.In many investigations, simplified phenomenological energy functions are considered to explain the native structure as the minimum energy state.The lowest energy configuration may not be the most probable configuration due to frustration caused by the competing effects of steric constraints (covalent bonding), interactions and temperature.
One approach used extensively to understand the structure of the protein involves residue-residue interactions based on the contact matrix, which is derived from an ensemble of frozen structures of protein available at the protein data bank (PDB) using a number of assumptions and approximations.Early knowledgebased interaction potential proposed by Tanaka and Scheraga 1 was further developed by Miyazawa and Jernigan (MJ) [2,3] in a mean-field spirit of an effective medium.
Over the years, a number of knowledge-based contact potentials [1][2][3][4][5][6][7][8][9][10][11] have been re-examined and redeveloped to understand the folding dynamics of a protein.For example, Betancourt and Thirumalai [7] have examined the MJ contact matrix and the potential matrix by Skolnick et al. [11] and proposed their own contact potential matrix (BT).By selecting an appropriate reference solvent (Thr) within the Miyazawa and Jernigan scheme [2,3], they find [7] that the BT interaction matrix gives 'hydrophobicities that are in very good agreement with experiment.' Bastolla et al. (BFKV) [8] have examined some of these knowledge-based interaction potentials and presented a scheme to guarantee optimal stability for most representative structures.They have pointed [8] out that 'the optimized energy function guarantees high stability and a well-correlated energy landscape to most representative proteins in the PDB database.'We have recently implemented [33] the classical MJ interaction matrix to examine the thermal response of two histones (H3.1, H2AX) [36,37].These proteins are of comparable size (with 136 and 143 residues, respectively) but respond very differently to temperature.Whereas H3.1 exhibits a systematic transition from a random coil to globular conformation (see below), H2AX shows non-monotonic dependence (expanded conformations followed by compact structures) with a maximum as a function of temperature [33].
Because of the phenomenological nature of the knowledge-based interactions, it is worth re-analyzing the thermal response again with the tested and improved potentials such as BT [7] and BFKV [8] potentials in addition to classical MJ potential to understand similarity and differences in the conformational response to temperature.
In context to extracting the optimal weight associated with the knowledge-based residue-residue contact matrix elements, Pokarowski et al [35] have analyzed 29 contact potentials and concluded that 'one-body approximations of the contact potentials could be useful in some applications.'They have pointed out the 'opportunities' to develop different further types of potentials (perhaps multibody)'.Such an extensive analysis clearly underscores the fact that the knowledge-based contact potentials are phenomenological measures and are somewhat adhoc and that the reliability of results about the general features and specific response should be carefully examined.In absence of comprehensive analysis based on fundamental hierarchical interactions, knowledge-based interaction do provide a feasible mechanism to incorporate some specificity of residues in understanding the structure of a protein.We focus here on the conformation of histone H3.1 [36,37] as a function of temperature.In order to identify common results (reproducible by different potentials) and differences, we use three contact matrices as an input to a phenomenological potential (see below) to investigate the effect of temperature on the conformation of histone H3.1:

Model and Methods
The histone H3.1 consists of 136 residues in a unique sequence [36,37].It is represented [31][32][33] by 136 nodes tethered together by the fluctuating bonds [38] on a cubic lattice in our coarsegrained description.A node (residue) is represented by a unit cubic cell (occupying its eight lattice sites) with excluded volume constraints [39].The bond length between consecutive nodes can vary between 2 and !10.Such a bond fluctuation description is known to capture the computational efficiency while incorporating ample degrees of freedom and extensively used in complex polymer systems [39], multi-component nano-composites [39,40] and protein chains [31][32][33]41].Note that the lattice model with fluctuating (i.e., expanding and contracting) covalent bonds between consecutive residues has more degrees of freedom that the minimalist HP model used for sensitivity test by Betancourt and Thirumalai [7].The large-scale simulations become feasible with such simplified coarse-grained representation of a residue without the all-atom details while capturing the specificity of each residue via residue-residue interactions.Each residue interacts with the neighboring residues within a range (r c ) with a generalized Lennard-Jones potential, where r ij is the distance between the residues at site i and j; r c = !8 and s = 1 in units of lattice constant.The potential strength e ij is unique for each interaction pair with appropriate positive (repulsive) and negative (attractive) values (see below).
The Metropolis algorithm is used to move each tethered residue randomly.For example, a residue, for instance, at a site i, is selected randomly to move to a neighboring lattice site, j.As long as the excluded volume constraints and the limitations on changes in the covalent bond length are satisfied, the residue is moved from site i to site j with the Boltzmann probability exp(2DE ij /T), where DE ij = E j -E i is the change in energy between its new (E j ) and old (E i ) configuration and T is the temperature in reduced units of the Boltzmann constant and the energy (e ij ).A unit Monte Carlo step (MCS) is defined as attempts to move each residue once.During the course of simulation, we keep track of a number of local and global physical quantities including energy of each residue, its mobility, mean square displacement of the center of mass of the protein, radius of gyration and its structure factor.Simulations at each temperature are performed for a sufficiently long time (typically for ten million time steps) with many independent samples (typically 150 samples for long runs and 1000 samples for short runs) to estimate these quantities.The data presented here are generated on a 64 3 lattice although different lattice sizes are used to assure that there is no finite size effect on the qualitative variations of the physical quantities and our conclusions.
As mentioned above, we use three knowledge-based contact matrices (figure 1), i.e., the classical MJ [2], BT [7], and BFKV [8] for the residue-residue pair interaction (e ij ).On a first look, these matrices appear somewhat similar in general apart from the difference in magnitudes.There are however differences that are easier to spot with a closer look, e.g.elements 1-10, 190-200, etc., which may show in the final results on the thermal response (see below).

Results
Some insight into the structural relaxations and equilibration of the conformation at local and global scales of protein can be gained from the snapshots and animations during the course of simulation.For example, see a typical snapshot of the histone at a representative temperature resulting from MJ interaction in Figure 2 at the end of the simulation, i.e., at t = 10 7 steps.
Note that the conformation of the protein undergoes numerous configurations (better seen in animations) and each of the snapshots is an instantaneous configuration.At low temperature (T = 0.010), most of the residues are organized into a compact globular conformation which opens up as the temperature is increased while maintaining some degree of local assembly.The overall size of the protein increases as the temperature increases.Which sections of the local segments coagulate while others elongate depend on the specific residues in the sequence in corresponding segments.The interactions among the residues compete with the thermal energy and this competition leads to a vast ensemble of configurations.The ensemble averaging of the local and global physical quantities provides us the trend in variation of the equilibrium structure and size of the protein as a function of the temperature.
The residue map of the protein structures at representative temperatures (low to high on a relative basis) with BT and BFKV potentials are presented in figures 3 and 4 respectively.Despite the difference in range of temperature in BT and BFKV schemes, we see a general systematic conformational crossover, i.e. from a compact to an elongated conformation on raising the temperature.Thus the visual inspections of the snapshots as a result of three potentials (MJ, BT, BFKV) lead to a general feature of histone H3.1, that it opens up on raising the temperature and collapses into a compact form on lowering the temperature.This may not appear dramatic, but it is particularly noteworthy that the residues in a specific sequence in H3.1 experience a diverse range of interactions but respond collectively in such an organized fashion to bring about a systematic change in global conformation (somewhat similar to homo-polymers).However, another histone (H2AX) of comparable size exhibits non-monotonic structural response [33] where the competing and cooperative effect of interacting residue leads to a very different result.
Variation of the average radius of gyration (R g ) with the temperature is presented in Figure 5 as a result of the MJ potential.Despite fluctuations in the data points, a systematic variation of R g with temperature seems to suggest a rather smooth transition around T c ,0.013-0.015from an extended structure at high temperature (T$0.020) to a compact morphology at the low temperature (T#0.013).The dependence of the root mean square (RMS) displacement (R c ) of the center of mass of the protein with the time steps (t) also exhibits a systematic change in the global dynamics of the protein characterized by a power-law R c /t k .For example, it changes from a nearly standstill (frozen-in) state (kR0) at low temperatures (T#0.013) to a highly mobile protein with diffusive motion (k = 1/2) at high temperatures with a range of subdiffusive (1/2,k,0) dynamics at the intermediate temperatures in the transition regime.Regardless of appreciable fluctuations, the radius of gyration seems to reach steady-state values at most temperatures (inset figure 5).It should be pointed out that MC step (t) is not the real time but provides a mean to test stochastic dynamics.We also know that the asymptotic dynamics of a chain are diffusive at high temperature as for a gas molecule in a dilute gas.The approach to diffusive motion of the protein chain at high temperature is clearly seen (the inset in figure 5 for R c ). Reducing the temperature leads to slow dynamics; a systematic slowing down of the protein dynamics as observed here (figure 5 inset) is thus consistent with the expectation.These trends support the reliability of such a coarse-grained approach in gaining the global insight into the structural evolution of the protein at large scales.It should be pointed out that not every sequence of 136 amino acids can lead to such transition from a random coil to a globular conformation.As mentioned above, a similar histone (H2AX) of comparable size exhibits [33] very different thermal response, i.e., a nonmonotonic dependence of its gyration radius with the temperature with the same MJ potential.
The thermal response of the radius of gyration of the protein with all three potentials, i.e., MJ, BT, and BFKV, is presented in figure 6 for comparison.We see both similarity and differences.The protein expands on raising the temperature within a range (DT), a general cooperative characteristics of the histone H3.1 results from all potentials.The range over which a systematic (monotonic) response occurs, however, depends on the potential matrix.The range of temperatures is DT MJ <0.013-0.020with the MJ potential.The range shifts towards higher temperatures DT BT <0.018-0.026with the BT potential and towards lower temperatures DT BFKV <0.006-0.013with the BFKV potential.At the high temperature regime, the magnitudes of R g converge to a steady-state value with a random coil structure (see below) with all potentials.There is an anomaly, however, with the BFKV potential where the magnitude of R g decreases on raising the temperature in a range DT A <0.015-0.018before reaching its saturation.It is difficult to know which potential is better than another over the entire range of temperatures due to lack of explicit experimental data on the variation of the R g of the histone H3.1 with the temperature.The potential matrices BT and BFKV proposed by Betancourt and Thirumalai [7] and Bastolla et al. [8], respectively, seem to be an improvement over the classical MJ potential [2].Results from both potentials, BT and BFKV, show opposite shift with respect to MJ potential.Despite similar statistics regarding the sampling, the data for R g in figure 6 with BFKV appear smoother than that with MJ and BT which is primarily due to differences in its contact matrices.
The reduced unit of temperature (with Boltzmann constant and interaction energy) used here appears somewhat arbitrary but it is kept the same for all potential matrices in our coarse-grained MC approach as are the potentials which are phenomenological as described above.Therefore, some guidance will be welcome from clean experiments to identify the thermal response of histone H3.1.Such a calibration may help identify the range of validity or reliability of different potentials.The common features (e.g., random coil to globular transition) in thermal response resulting from all potentials may also help with understanding and interpreting the universal characteristics of the histone H3.1 in future experiments.
In order to examine the spatial distribution of residues, i.e., the shape of the protein, we have analyzed the structure factor S(q) (Figure 7): e {iq q : r r D 2 T Dq qD where r j is the position of each residue and |q| = 2p/l is the wave vector of wavelength, l.From the power-law scaling of the structure factor with the wave vector, S(q)/q 21/n , one can estimate the spatial distribution of residues in the protein (R g / N n ).Of particular interest is the range of the wave vector, q < 0.35-0.75corresponding to the average size of the protein, R g <10-26 (see Similar scaling fits are also consistent with the results of BT and BFKV potentials (figure 7).Note that at high temperatures, thermal energy dominates over residue-residue interaction.Therefore, the protein chain behaves like an ideal polymer chain (with excluded volume constraints) as it should.

Discussion and Conclusions
A coarse-grained Monte Carlo simulation is used to investigate the effect of temperature on the conformation of a protein, histone H3.1, using three knowledge-based potentials, the classical MJ, BT and BFKV.The protein is described by a coarse-grained chain of residues (nodes) tethered together by fluctuating covalent bonds.Each residue interacts with other residues within a range of interaction using a generalized LJ potential where knowledgebased contact matrices (MJ, BT, BFKV) are used as an input for the residue-residue interaction.The coarse-grained interaction matrix thus captures the specificity of each residue.Extensive simulations are performed for a range of temperatures for sufficiently long time steps to identify changes in conformation and stability.We have examined a number of local and global physical quantities including the energy of each residue, its mobility, mean square displacement of the center of the mass of the protein, radius of gyration and its structure factor.Thermal responses resulting from the three potentials are compared and similarities and differences are pointed out.
Global conformation (measured by the radius of gyration and the structure factor) resulting from the collective dynamics of residues in histone H3.1 depends on temperature.How it changes depends on the interaction potential and the range of temperature.One of the general characteristics common to results from all three potentials (MJ, BT, BFKV) is that the protein undergoes a systematic conformational transformation: globular conformation at low temperature to a random coil at high temperatures (simple but unique to H3.1).The range over which such a systematic response occurs, however, depends on the potential matrix.For example, DT MJ < 0.013-0.020with the MJ potential, and shifts towards higher temperatures DT BT < 0.018-0.026with the BT potential and lower temperatures DT BFKV < 0.006-0.013with the BFKV potential.The magnitudes of R g converge to a steady-state value with a random coil structure with all potentials in the high temperature regime.Variation of R g with the temperature shows an anomaly with the BFKV potential where the magnitude of R g decreases on raising the temperature in a range DT A < 0.015-0.018before reaching its saturation.Despite the improved potentials suggested by both groups, Betancourt and Thirumalai [7] and Bastolla et al. [8], it is not clear which potential is better than the other over the entire range of temperatures.Results from both potentials, BT and BFKV, show opposite shift with respect to the MJ potential.One of the main problems is the lack of explicit experimental data on the variation of the R g of the histone H3.1 with temperature.The power-law scaling of the structure factor with the wave vector, S(q) / q 21/n , a consequence of the spatial distribution of residues (R g / N n with N being the number of residues) in the protein provides an insight into the overall morphology.The wave vector in the range of the radius of gyration leads to an effective dimension D e < 1/n of protein.Results of all three potentials (MJ, BT, BFKV) are consistent with our assessment about the global structure of the protein, i.e., the globular conformation D e <3 in low temperature regime and a random coil (ideal chain) D e <2 at the high temperatures.
Because of the unique sequence of interacting residues, the segmental morphology is heterogeneous and shows enormous variability.The cooperative and competing effect of these segments is expected to exhibit complex temperature dependence as seen with other histones [33].It is remarkable to observe such a continuous global transformation from a random coil to a globular structure on cooling.The cooperative assembly of residues seems to propagate on larger (genetic) length scales smoothly despite a rather random but unique distribution of attractive and repulsive segments (residues) of the protein.The structural response of the histone H3.1 to temperature is very different from that of the histone H2AX, which shows non-linear (non-monotonic) thermal response [33].It must be pointed out that the histone H3.1 plays a critical role in response to the cell's cycle in the structural response of DNA in translation, transcription, and replication while the histone H2AX is believed to be crucial in mounting the response to repairing damaged DNA [42][43][44][45][46][47][48].Therefore, the differences in thermal response of the global structures of two different histones seem appropriate for performing the distinct functions of each histone.
A protein can thus undergo a continuous conformational transition.Why is such a systematic thermal response in the structure of a protein important?First, to our knowledge, it is not common in such a complex protein.Second, it provides insight into the global response with respect to local characteristics of residues with its multi-scale hierarchical structural evolution.One may question that the size of the protein H3.1 (with 136 residues) is too small to observe a possible continuous change in conformation in thermodynamic limit (i.e., a protein chain with infinite size).While one cannot rule out such a possibility, completely different thermal response [33] (non-monotonic) of dissimilar proteins (e.g., H2AX) of comparable size (e.g., 143 residues) leads us to believe that the competition between the temperature and the characteristic interactions among the residues in a specific sequence is critical.
Coarse graining in modeling proteins is not unique; there can be many possibilities to develop alternate and new methods that may verify, complement or provide evidence against our findings.The variety of proteins and their characteristics is so huge (even in the histone family) that identifying their common characteristics is not feasible but highly desirable.If many proteins are found to exhibit such a thermodynamic transition, then one may be able to identify their common characteristics and special features, which may help identify universality in the non-universal world of peptides and proteins.  .Variation of the structure factor, S(q), of histone H3.1 with the wave vector, q with MJ, BT, and BFKV potentials from simulations on a 64 3 lattice with 150 independent samples with t = 10 7 MCS time.doi:10.1371/journal.pone.0049352.g007