Ensemble-Based Computational Approach Discriminates Functional Activity of p53 Cancer and Rescue Mutants

The tumor suppressor protein p53 can lose its function upon single-point missense mutations in the core DNA-binding domain (“cancer mutants”). Activity can be restored by second-site suppressor mutations (“rescue mutants”). This paper relates the functional activity of p53 cancer and rescue mutants to their overall molecular dynamics (MD), without focusing on local structural details. A novel global measure of protein flexibility for the p53 core DNA-binding domain, the number of clusters at a certain RMSD cutoff, was computed by clustering over 0.7 µs of explicitly solvated all-atom MD simulations. For wild-type p53 and a sample of p53 cancer or rescue mutants, the number of clusters was a good predictor of in vivo p53 functional activity in cell-based assays. This number-of-clusters (NOC) metric was strongly correlated (r2 = 0.77) with reported values of experimentally measured ΔΔG protein thermodynamic stability. Interpreting the number of clusters as a measure of protein flexibility: (i) p53 cancer mutants were more flexible than wild-type protein, (ii) second-site rescue mutations decreased the flexibility of cancer mutants, and (iii) negative controls of non-rescue second-site mutants did not. This new method reflects the overall stability of the p53 core domain and can discriminate which second-site mutations restore activity to p53 cancer mutants.


Introduction
The tumor suppressor protein p53 is a transcription factor that plays a major role in preventing cancer initiation and progression. Cellular stress conditions such as hypoxia or DNA damage activate p53, which induces cell cycle arrest, DNA repair, senescence, or apoptosis [1,2,3]. In most, if not all, human cancers, the p53 apoptosis pathway is inactivated, and p53 itself is mutated in about half of all human cancers. About three-quarters of tumors with mutant p53 express full-length p53 with single missense mutations in the p53 DNA-binding core domain. These mutations may cause partial or global protein destabilization, loss of zinc coordination, or disruption of DNA contacts, and thus inactivate the tumor suppressor function of p53 (www-p53.iarc.fr) [4]. These missense mutations (''cancer mutations'' or ''oncogenic mutations'') are widely distributed throughout the core domain ( Figure 1). They have been classified based on their physical location within the protein: (i) DNA-contact mutants (e.g., R248Q, R273H), (ii) structural mutants in the DNA binding surface (e.g., R175H, G245S, R249S, R282W), (iii) b-sandwich mutants (e.g., Y220C), and (iv) zinc-binding domain mutants (e.g., C242S, R175H).
Pharmacological rescue of p53 function in cancer tissues is an attractive therapeutic target [5]. Recently, two independent studies on transgenic mice demonstrated that restoration of p53 activity enables tumor regression in vivo [6,7]. p53 reactivation is especially promising in regression of advanced stage cancers [8,9]. The p53 function of some oncogenic mutants has been rescued in vivo by a handful of small molecules [10,11,12,13,14] as well as by second-site suppressor (''cancer rescue'') mutations [15,16,17,18]. The secondsite mutations provide easily-studied cases of p53 cancer rescue.
The effect of oncogenic and rescue mutations in p53 has been of great interest. Many detailed structural studies have been pursued, including X-ray crystal structures of individual oncogenic and rescue mutants of p53 [19,20,21]. The loss or gain of hydrogenbonding interactions, salt bridges and other minute stabilizing or destabilizing effects upon different missense mutations have been investigated to develop a more complete understanding of the inactivation mechanisms by the oncogenic missense mutations and, correspondingly, the mechanisms by which restoration of activity for rescue mutations occur [22,23]. At 310 K, wild-type p53 is estimated to be only 3.0 kcal/mol more stable than the denatured state [23], and thus missense mutations can easily shift the delicate balance of p53 stability.
The present study quantifies the effect of oncogenic and rescue mutations on the overall dynamics of p53 without focusing on local structural details. The core DNA-binding domain of p53 was used, as it dictates the stability of the overall protein [22]. The overall protein flexibility of the p53 DNA-binding domain for the wild-type, cancer mutants, rescue mutants and non-rescue mutants was compared in explicitly-solvated all-atom molecular dynamics (MD) trajectories, which are well suited to investigate the local conformational space sampled by each particular mutant. A single discriminating metric, the measure of flexibility of p53 in terms of the number of clusters obtained at a certain RMSD cutoff, was able to predict the functional activity of various mutant p53 proteins.

System preparation
The coordinates for the starting structure were obtained from the wild-type p53 coordinates of chain B in pdbID 1TSR [4]. Each mutant system was prepared from this structure by rebuilding the mutated side chain(s) with the AMBER suite [24]. Crystallographic waters were retained. Histidine, asparagine and glutamine side chains which were mis-fit during structure characterization were determined and flipped by 180u using the Molprobity web server [25]. Histidine protonation states were determined using the Whatif Web Interface [26] and manually verified. Zinc coordination residues (Cys176, Cys238, Cys242 and His179) were modeled following the cationic dummy atom method of Pang et al [27]. Missing atoms and hydrogens were added using the Leap module of Amber10 [24]. Each system was solvated in a TIP3P [28] water box. The buffer between the protein and the periodic boundary was not closer than 8 Å in any direction. The wild-type p53 system has a charge of +1. Chloride ions were added as needed to neutralize the different mutant systems studied. The topology and coordinate files of the systems were constructed using Amber FF99SB force field [29]. The final wild-type p53 system consisted of 27,264 atoms.

MD simulations
Each system was first relaxed by 36,000 steps of minimization and a standard relaxation procedure using restrained MD. In the first 2,000 steps of minimization only the hydrogen atoms were relaxed, leaving all other atoms fixed. In the second 2,000 steps, all water atoms and ions were minimized in addition to the hydrogen atoms. In the third 2,000 steps, zinc-coordinating residues Cys176, Cys238, Cys242 and His179 as well as all hydrogens, water atoms, Author Summary p53 is a tumor suppressor protein that controls a central apoptotic pathway (programmed cell death). Thus, it is the most-mutated gene in human cancers. Due to the marginal stability of p53, a single mutation can abolish p53 function (''cancer mutants''), while a second mutation (or several) can restore it (''rescue mutants''). Restoring p53 function is a promising therapeutic goal that has been strongly supported by recent experimental results on mice. Understanding of the effects of p53 cancer and rescue mutations would be helpful for designing drugs that are able to achieve the same goal. The challenge is that cancer and rescue mutations are distributed widely in the protein, and experimental testing of all possible combinations of mutations is not feasible. This paper describes a simple computational metric that reflects the overall stability of the p53 core domain and can discriminate which second-site mutations restore activity to p53 cancer mutants. and ions were minimized. In the following 10,000 steps, all atoms were minimized except backbone atoms, which were held fixed. In the last 20,000 steps, the entire system was minimized. Following the minimizations, restrained MD simulation at 310 K was carried out for 1 nanosecond to prevent structural artifacts from introducing kinetic energy into the system. For this purpose, positional restraints for the heavy atoms of the protein backbone were gradually decreased from 4.0 to 1.0 kcal/(mol * Å 2 ) in four consecutive 250-picosecond-long MD simulations.
Thereafter, unrestrained MD was performed in explicit solvent for 30 nanoseconds at 310 K using a time step of 1 femtosecond. Temperature was maintained constant at 310 K by Langevin dynamics with a collision frequency of 5 ps 21 , and pressure was maintained at 1 atm by the Nose Hoover-Langevin piston method [30,31] using period and decay times of 100 and 50 femtoseconds, respectively. Long-range electrostatics was treated by the Particle Mesh Ewald method [32] and a nonbonded cutoff of 10 Å was used. The interatomic distances within the water molecules were fixed using the SHAKE algorithm [33,34]. A multiple-time step algorithm was employed, in which bonded interactions were computed at every time step, short-range non-bonded interactions were computed at every second time step, and full electrostatics was computed at every fourth time step.
All minimizations and MD simulations were performed using NAMD2.7 [35] on the Teragrid Ranger cluster. The simulations scaled as 0.10 days per nanosecond using 64 processors. Rootmean-square-deviation (RMSD) traces over the course of the MD trajectories are depicted in Figure S1.

Mutants considered
We considered all four structural classes of p53 mutants: (i) DNA-contact mutants, (ii) structural mutants in the DNA binding surface, (iii) b-sandwich mutants, and (iv) zinc-binding domain mutants. We did not attempt to characterize any direct zincbinding residue mutations (e.g., C242S), as rigorous parameterization of the partial charges on the metal ion and coordinating groups would be required for proper treatment of any mutations in this area.

MD analysis
Conformational clustering was performed using the gromos algorithm [39] with GROMACS4.0.5 analysis software [40]. For each of the mutants, atomic coordinates were extracted at 10 ps intervals over the 30 ns MD simulation. The resulting 3000 structures were superimposed with respect to all C a atoms to remove overall translation and rotation, then clustered at various RMSD cutoff values (i.e., 0.95, 1.05, and 1.15 Å ) based on atomic coordinates of all C a atoms of the protein. After calculating an RMSD-distance matrix of atomic positions between all pairs of MD snapshots in a trajectory, the gromos clustering algorithm counts the number of similar MD snapshots for which the calculated RMSD is less than or equal to the determined RMSD cutoff for each MD snapshot. The MD snapshot with the highest number of neighbors (e.g. the structure with the smallest RMSD between all the other structures in the cluster) is determined to be the center of the first cluster. Thus this structure is also referred to as the ''cluster centroid.'' Subsequently, this entire cluster (i.e. the cluster centroid and its neighbors) is eliminated from the pool of MD snapshots, and the same process is repeated until all MD snapshots are assigned to a cluster.
As another potential flexibility metric, root-mean-squarefluctuation of all C a atoms of p53 in the trajectories were calculated using AMBER suite.
Two alternative clustering methods available in GROMACS package, namely single-linkage clustering and Jarvis-Patrick clustering, were also performed for comparison. A cutoff of 0.65 Å was used for single-linkage clustering. In Jarvis-Patrick clustering, the RMSD cutoff used to determine the number of nearest neighbors considered for Jarvis-Patrick algorithm was set to 0.80 Å , and the snaphots that have at least 3 identical nearest neighbors were assigned to the same cluster.

Biological assays
Functional activities of rescue and non-rescue mutants for which no published experimental p53 activity result exists (R273H_N239S, R273H_R282S, R273H_L114G, G245S_E286D, Y220C_A138G, Y220C_L137R and Y220C_L114G) were verified using yeast assays ( Figure 2). Wild-type p53 and relevant cancer mutants R273H and Y220C were also included in the assays as controls. For this purpose, p53-tester yeast strain RBy379 (1cUASp53::URA3 his3D200 a/ alpha) [17,41] expressing the URA3 gene under control of a p53dependent promoter was transformed with centromeric pTW300 plasmids [41] (HIS3 selection marker) expressing either wild-type human p53 or the mutants indicated under control of the ADH1 promoter. Yeast strains were grown in YEPD (10% yeast extract, 20% pepton, 20% dextrose) and transformed with the relevant plasmids using a LiAc-based transformation protocol [42]. Transformants were selected on SC plates lacking histidine and incubated at 30uC. Serial dilutions of mid-log phase cells (10,000; 2,000; and 400 cells) were spotted onto agar plates lacking either histidine or uracil. Plates were incubated for 2 days at 37uC. The growth on plates lacking histidine is selective only for the presence of the plasmid, while growth on plates lacking uracil is dependent on expression of the URA3 gene and is a measure of p53 activity [43].

Results
In order to compare the dynamical effects of different mutations on p53, MD-generated trajectories of various p53 mutants were clustered based on overall structural similarity. Explicitly solvated MD simulations were run in the isothermal-isobaric (NPT) ensemble for 30 ns, after which RMSD-based clustering was performed on the resulting trajectories with the gromos clustering algorithm [39]. The RMSD distance matrix was computed in a pairwise fashion over all of the alpha carbons for each structure extracted every 10 ps from a particular trajectory (i.e., 3000 structures representing each trajectory). A large range of RMSD cutoff values were tested, including 0.95, 1.05, 1.15, 1.25, 1.35 and 1.60 Å . RMSD cutoffs larger than 1.15 Å caused loss of sensitivity of NOCs to the effect p53 mutations. The low optimal RMSD cutoff is an indication of a well-behaved system sampling configurations within a single energetically low-lying substate, as well as a reflection of the small size and low flexibility of the p53 core domain. The NOCs observed for p53 wild-type and its various mutants using a cutoff of 1.15 Å are shown in Tables 1  through 3. Clustering results at several other RMSD cutoff values are presented in Table S1.
Cancer mutants are more flexible than wild-type p53 The number of clusters was significantly higher for the cancer mutants (in bold in Table 1) compared to the wild-type p53, which suggests that oncogenic mutations increase the overall plasticity of the p53 core domain. This result is consistent with Rauf et al. [44], who investigated the effects of different oncogenic mutations on the flexibility of the p53 DNA-binding domain using a graph theoretical approach. Here, the oncogenic property for all four structural classes of p53 mutants has been quantified by a single metric.
The flexibility of the structural and zinc-binding domain mutant R175H, as characterized by the number of clusters, was remarkably higher compared to the wild-type and other cancer mutants ( Table 1). The especially high degree of flexibility exhibited by this system may explain why so far it has not been possible to rescue the R175H mutant with second-site suppressor mutations, even though all possible single point core domain mutations of R175H were tested exhaustively for p53 function [18].
The number-of-clusters (NOC) metric presented here correctly locates the DNA-contact mutant R273H as the closest cancer mutant to the wild-type in terms of structural variability over the 30 ns trajectories. R273H has been previously demonstrated to be the easiest to rescue among the most-frequent cancer mutants [23]. The thermodynamic stability of the R273H mutant is the closest to the wild-type p53 among the 19 cancer mutants considered by Bullock et al [23]. The number of rescue mutants A p53 tester yeast strain expressing the URA3 gene under control of a p53-dependent promoter was transformed with centromeric plasmids (HIS3 selection marker) expressing either wild-type human p53, or the mutants R273H, R273H_N239S, R273H_R282S, R272H_L114G, G245S_E286D, Y220C, Y220C_A138G, Y220C_L137R, and Y220C_L114G. Serial dilutions cells (10,000; 2,000; and 400) grown to mid-log phase were spotted onto agar plates lacking either histidine or uracil. Plates were incubated for 2 days at 37uC. Growth on plates lacking uracil is dependent on expression of the URA3 gene and is a measure of p53 activity, while the growth on plates lacking histidine is selective for the presence of the plasmid. doi:10.1371/journal.pcbi.1002238.g002 Cancer mutants are typed in bold letters, rescue mutants are italicized, and nonrescue mutants are underlined. Wild-type p53 is presented as the first line of known to reactivate R273H mutant is large compared to few or no rescue mutants known to reactivate each of the other hot spot cancer mutants [17,18,37,38].

Rescue mutations decrease the flexibility of cancer mutants
Comparison of the number of clusters for the R273H rescue mutants (in italics in Table 2) with those for the R273H cancer mutant (in bold in Table 2) indicated a significant decrease in flexibility for the rescue mutants. The restoration of stability to the protein was especially remarkable in the case of the R273H_N263V and R273H_N200Q_D208T rescue mutants, for which the number of clusters was even lower than the wild-type p53. Although thermodynamic stability data is not available for these rescue mutations, our results suggest that such values would be lower than wild-type p53.
As there are no single point mutations that can strongly rescue cancer mutants R175H, R248Q, R249S or R282W, the generality of this finding was tested on rescue mutants for which experimental data is available (in italics in Table 2). More specifically, the known rescue mutants G245S_N239Y and G245S_T123P were considered for the class of structural mutants in the DNA binding surface. Similarly, two rescue mutants for the b-sandwich mutant Y220C, Y220C_A138G and Y220C_L137R, were investigated with the same approach. Functional activity of the latter two rescue mutants, for which no published experimental p53 activity results exist, was verified using yeast assays and depicted in Figure 2. Three out of these four rescue mutants showed decreased flexibility compared to their relevant cancer mutant ( Table 2). The only exception in this test set was G245S_T123P, which exhibited more clusters than its cancer mutant G245S.

Nonrescue mutations introduce even more flexibility to cancer mutants
As negative controls, we tested experimentally confirmed nonrescue second-site mutations (underlined in Table 2) of relevant cancer mutants with the same approach. Functional inactivity of all the nonrescue mutants was verified using yeast assays and depicted in Figure 2. All non-rescue mutants that we simulated were more flexible, as compared to their relevant cancer mutant, indicating destabilization introduced to the cancer mutant by these second-site mutations. Thus, our method can successfully discriminate rescue mutants from non-rescue mutants.

More stable p53 mutants follow a similar trend
We extended the same analysis on stabilizing mutant N239Y and the ''superstable'' quadruple mutant M133L_V203A_ N239Y_N268D [38] ( Table 3). The N239Y mutant exhibited a significant decrease in flexibility compared to the wild-type. In contrast, the superstable mutant did not follow the same trend (Table 3). This may be due to the need for longer relaxation in MD simulations in order to account for the greater extent of structural change introduced by four point mutations. To explore this hypothesis, we extended the quadruple mutant MD simulation for an additional 30 ns (for a total of 60 ns of production dynamics). In the second 30 ns, its number of clusters decreased significantly to a value much lower than that of the wild-type p53 as hypothesized.

At least 30 ns MD simulation is required
The data presented in tables 1-3 relies on the full 30 ns trajectories of p53. In order to determine what is the shortest MD simulation necessary to discriminate between the functional and nonfunctional forms of p53 mutants, shorter segments of the full production MD trajectories were analyzed. At the RMSD cutoff of 1.15 Å , the number of clusters for each p53 mutant calculated at 5, 10, 20, 25 and 30 ns of MD simulations were separately depicted as column graphs in Figure S2. In this set of graphs, active and inactive p53 mutants were grouped and designated with a green arrow and a red arrow, respectively. In Figure 3, the percentage of mutants for which p53 function was correctly predicted by our flexibility metric are depicted for 5, 10, 20, 25 and 30 ns of MD simulations. The success of function prediction increased from 74% to 91% while our simulation time increased from 5 ns to 30 ns. This analysis indicated that at least 30 ns of MD simulation is required for a successful prediction of function of p53 mutants.

NOC metric correlates with experimental thermodynamic stability data
The thermodynamic stability values of several p53 cancer mutants are available in the literature (Table 1), as measured by urea-induced unfolding experiments at 283 K [23,37,38]. There are no comparable experimental data for the rescue or non-rescue mutants, which were not included in this part of the analysis. All cancer mutants evaluated experimentally exhibit differential experimental destabilization compared to wild-type p53 (Table 1). Figure 4 depicts the correlation between the available thermodynamic stability values of p53 cancer mutants and the number of clusters observed in the MD simulations at the RMSD cutoff of 1.15 Å (r 2 = 0.77). The r 2 values for p53 single mutants at RMSD cutoff values of 0.95 Å and 1.05 Å are both 0.74. The number of clusters at these cutoffs for each mutant is tabulated in Table S1. The number of clusters in the second 30 ns MD simulation of the superstable mutant was used for this correlation analysis. If the NOCs in the initial 30 ns MD simulation of the superstable mutants was used instead, the r 2 value decreased from To compare the NOC metric with another simple flexibility metric, the root-mean-square-fluctuation (RMSF) values of all C a atoms of p53 were calculated using the AMBER suite for each mutant (Table S2). The correlation of the RMSF values with the thermodynamic data gave an r 2 value of 0.62, which showed the superiority of the NOC method comparing to its r 2 value of 0.77.
To compare the performance of other clustering methods, singlelinkage clustering and Jarvis-Patrick clustering were performed on the MD trajectories (Table S3). Both methods resulted in a lower correlation with the thermodynamic stability values versus the RMSD-based clustering method, with r 2 values of 0.44 and 0.45, respectively Discussion p53 is an inherently unstable protein, as reflected by its low melting temperature of ,42-44uC [45]. It has been shown that the main reason of p53 instability is neither poor packing density nor the presence of unusually large void volumes in the protein.
Instead, an analysis of the solution structure of p53 core domain obtained by NMR has revealed several reasons for instability of p53 [46]. First, this study indicated that p53 has buried hydroxyl and sulfhydryl groups that form sub-optimal hydrogen-bonding networks. Second, high flexibility of loop regions, especially of L1 loop, is observed in p53. Lastly, some buried tyrosine residues were found to be involved in temperature-dependent dynamic processes possibly indicating presence of alternative hydrogen-bonding networks in p53. Based on all of these factors, Canadillas et al concluded that ''the p53 structure is more flexible than is apparent from crystal structures'' [46].
In an effort to capture this intrinsic structural flexibility, we have focused on finding a computational method to measure the overall flexibility of the p53 core domain and the effect of mutations, be they cancer mutations, rescue mutations or non-rescue mutations, on the flexibility. This work presents a new method in which the number of structural clusters representing an explicitly solvated allatom MD trajectory can be used as a single robust measure of overall flexibility in the p53 core domain. All hot-spot cancer mutants we studied demonstrated higher flexibility compared to the wild-type p53, in line with the results of an earlier graphtheoretical approach that assessed the flexibilities of wild-type p53 and several cancer mutants [44]. Testing rescue and non-rescue mutants for particular cancer mutants, the number of clusters for functional p53 mutants was found to differ significantly from the nonfunctional p53 mutants. Remarkably, the NOC metric is able to predict which second-site mutations may restore p53 activity to cancer mutants and which will leave p53 functionally defective.
It is also notable that such a simple metric reflecting system flexibility or entropy can account for three-quarters of the variance in experimentally measured thermodynamic stability values of p53 mutants. MD simulations thus promise to be a robust tool to predict thermodynamic stability of p53 mutants of interest. The NOC metric could further be used to discover new rescue mutants that restore p53 activity, and thus kill the cancer cell. Additionally, whether binding of a small-molecule can achieve enough stabilization to restore p53 function to cancer mutants could be tested with this metric. The computational cost of performing classical MD simulations could be decreased by using alternative methods such as accelerated MD, which may achieve increased sampling of conformational states over significantly shorter simulation timescales. Experimental efficiencies could be achieved through an integrated strategy that is guided by use of the NOC metric as a predictive measure for p53 function.    Figure S2 Time evolution of the number of clusters during molecular dynamics simulations. Shorter segments of MD trajectories were analyzed in order to find out what is the shortest MD simulation necessary to discriminate between the functional and nonfunctional forms of p53 mutants. Number of clusters for the p53 mutants calculated at each of 5, 10, 20, 25 and 30 ns of MD simulations are graphed separately in the form of column graphs. Functionally active p53 mutants are grouped and designated with a green arrow while nonfunctional p53 mutants were designated with a red arrow. The clustering results at RMSD cutoff of 1.15 Å indicates that at least 30 ns of MD simulation is required. (PDF)