Molecular Origin of Polyglutamine Aggregation in Neurodegenerative Diseases

Expansion of polyglutamine (polyQ) tracts in proteins results in protein aggregation and is associated with cell death in at least nine neurodegenerative diseases. Disease age of onset is correlated with the polyQ insert length above a critical value of 35–40 glutamines. The aggregation kinetics of isolated polyQ peptides in vitro also shows a similar critical-length dependence. While recent experimental work has provided considerable insights into polyQ aggregation, the molecular mechanism of aggregation is not well understood. Here, using computer simulations of isolated polyQ peptides, we show that a mechanism of aggregation is the conformational transition in a single polyQ peptide chain from random coil to a parallel β-helix. This transition occurs selectively in peptides longer than 37 glutamines. In the β-helices observed in simulations, all residues adopt β-strand backbone dihedral angles, and the polypeptide chain coils around a central helical axis with 18.5 ± 2 residues per turn. We also find that mutant polyQ peptides with proline-glycine inserts show formation of antiparallel β-hairpins in their ground state, in agreement with experiments. The lower stability of mutant β-helices explains their lower aggregation rates compared to wild type. Our results provide a molecular mechanism for polyQ-mediated aggregation.


Introduction
The appearance of polyglutamine (polyQ)-containing aggregates [1-3] is a hallmark of disease progression in all diseases in which CAG expansions occur in genes [2]. Intranuclear inclusion bodies containing polyQ aggregates have been found in vitro [4,5], in cell cultures, animal models, and affected patients [6,7]. The aggregates are known to have a characteristic amyloid topology [8]. The inhibition of oligomerization by the azo-dye Congo red, or by the Hsp70/Hsp40 chaperone system, exerts marked protective effects in vivo and in vitro [9,10]. Aggregation and disease are observed if the number of glutamines in the expansion, n, exceeds a critical value, n_C (i.e., n_C = 35–40) [3]. The nearly universal existence of this criticality in all polyQ-related diseases (except in spinocerebellar ataxia 6) suggests that when the polyQ insert length exceeds a critical value (n > n_C), a pathological change, largely independent of the host protein, occurs in the polyQ insert itself. Therefore, isolated polyQ peptides (Qn) have been used as model systems for studying polyQ aggregation [4,11,12], and it is known that: (a) the nuclear uptake of polyQ peptide aggregates prepared in vitro is cytotoxic in cell cultures [13], (b) isolated polyQ peptides have in vitro aggregation properties similar to the corresponding full-length proteins containing the polyQ insert [4,14], (c) peptide aggregation follows a nucleated mechanism showing characteristic lag and growth phases [5,11], and (d) the glutamine tract-length dependence of the lag-time interval correlates well with the age of onset of disease [11]. Peptides of subcritical lengths (n < 35–40) have long lag times of aggregation and a corresponding (predicted) age of onset later than the typical life span of a person. Longer peptides (n > 35–40) have progressively smaller lag times of aggregation, and a correspondingly early age of onset of the disease [11].
Unaggregated polyQ peptides form random coil structures, whereas aggregates are composed of amyloid-like β-strands [15]. The conversion of random coil to β-strand occurs in an individual polyQ chain [11], and fibril formation occurs by addition of other polyQ chains to these monomeric β-strand nuclei. Therefore, the conformational dynamics of an individual polyQ chain determines both its aggregation mechanism and the structure of the final aggregates. The details of the conformational dynamics of polyQ and the length dependence of the dynamics are not well understood [6].

Results/Discussion
To elucidate the structural dynamics of single polyQ chains, we performed molecular dynamics (MD) simulations of simplified models of polyQ. An atomic-level representation of polyQ limits sampling efficiency in simulations, making it unsuitable for studying the dynamics of aggregation. Therefore, we used a simplified pseudo-atom representation to capture the relevant degrees of freedom for aggregation. We introduced three types of nonbonded interactions between the glutamine pseudo-atoms: hydrophobic interactions between the glutamine methylene groups, geometrically determined hydrogen bonds between backbone NH and O atoms, and sidechain-backbone polar interactions between the sidechain carboxamide group and the backbone NH or O atoms. Protein-solvent interactions play an important role in protein folding and aggregation. However, in simplified models of protein folding and aggregation, the solvent interactions are treated implicitly [16]. In the interaction models employed in our study, the solvent effects were captured by the effective hydrophobic interactions between the methylene groups in the sidechain. We used the discrete molecular dynamics (DMD) algorithm [17] to study polyQ dynamics.
The first question that we address is the following: What is the underlying minimal set of interactions responsible for the experimentally observed conformational transitions in polyQ aggregation? Since the conformational transition from random-coil polyQ to β-strand is known to be a nucleated process [11], we expect that an energy barrier is crossed during β-strand formation. Barrier crossing is enhanced as the system temperature is increased. Therefore, we study the dynamics of model polyQ peptides as a function of temperature. We hypothesized that the physical basis of the conformational change from random coil to β-strands is the presence of unique sidechain-backbone hydrogen bonding interactions in polyQ. To test this hypothesis, we performed simulations of a 37-mer (Q37) polyQ peptide with and without sidechain-backbone hydrogen bonding interactions. It is known that homopolymeric peptides with no sidechain-backbone interactions, e.g., polyalanine, form α-helices in their ground state [18,19], and at higher temperatures the helices melt to form a random coil [20] that then aggregates into a β-rich structure. We found that in the absence of sidechain-backbone interactions, polyQ dynamics are similar to polyalanine: It forms α-helices at low temperature (Figure 1A), and a random coil as the temperature is increased. A monomer peptide in this polyQ model does not form β-strands. In contrast, when sidechain-backbone hydrogen bonding is present, polyQ is a random coil at low temperatures, adopts a β-strand conformation in an intermediate range of temperatures, and is again a random coil at higher temperatures (Figure 1B). Thus, sidechain-backbone interactions lead to the formation of β-strands by a single polyQ peptide, which is the nucleating structural transition observed in polyQ-peptide aggregation.
Strikingly, the conformation adopted by a Q37 chain under conditions in which it adopts β-strand topologies (T = 0.72 to T = 0.78, in units of ε/k_B, where ε is the energy unit and k_B is the Boltzmann constant) is a parallel β-helix (Figure 1C). In these β-helices, all residues adopt β-strand backbone dihedral angles, and the polypeptide chain coils around a central helical axis. Several examples of such parallel β-helices (reviewed by Wetzel [21]) are found in the Protein Data Bank (http://www.rcsb.org/pdb/). For amyloid fibrils formed by polyQ, Perutz previously proposed a β-helix model based on X-ray diffraction data [8]. However, in contrast with Perutz's model, which has a central aqueous pore, the β-helices observed in our simulations are well packed, exclude the solvent, and are stabilized by buried sidechain-backbone and sidechain-sidechain hydrogen bonds (Figure 1D-1F). To evaluate whether these β-helix structures, once formed, have residence times long enough to propagate further aggregation, and to obtain better-defined thermodynamic ensembles, we evaluated the stability of β-helices at 300 K using all-atom MD simulations. As shown in Figure 2A, the polyQ structure remains stable on the nanosecond time scale accessible in all-atom simulations. If the formation of β-helices corresponds to the nucleation step in the aggregation reaction [5], and if further elongation of the aggregate is assumed to be diffusion-limited, the average time between protein collisions at a concentration of 100 μM is expected to be about 10 ns. Therefore, the observed stability of the polyQ β-helix on the nanosecond time scale is expected to be sufficient for further propagation of the aggregate.
Apart from sidechain-backbone interactions, a number of sidechain-sidechain interactions persisted throughout the MD simulation. We did not use sidechain-sidechain hydrogen bonding interactions in the DMD simulations used to generate β-helices. Thus, even though sidechain-sidechain interactions were not required for the formation of β-helices, they were nevertheless formed in β-helices. Further, these interactions were persistent throughout the length of the all-atom MD simulations, suggesting that they do play a significant role in the stabilization of the β-helices.

Synopsis
Nine human diseases, including Huntington's disease, are associated with an expanded trinucleotide sequence CAG in genes. Since CAG codes for the amino acid glutamine, these disorders are collectively known as polyglutamine diseases. Although the genes (and proteins) involved in different polyglutamine diseases have little in common, the disorders they cause follow a strikingly similar course: If the length of the expansion exceeds a critical value of 35–40, the greater the number of glutamine repeats in a protein, the earlier the onset of disease and the more severe the symptoms. This fact suggests that abnormally long glutamine tracts render their host protein toxic to nerve cells, and all polyglutamine diseases are hypothesized to progress via common molecular mechanisms. One possible mechanism of cell death is that the abnormally long sequence of glutamines acquires a shape that prevents the host protein from folding into its proper shape. What is the structure acquired by polyglutamine and what is the molecular basis of the observed threshold repeat length? Using computer models of polyglutamine, the authors show that if, and only if, the length of polyglutamine repeats is longer than the critical value found in disease, it acquires a specific shape called a β-helix. The longer the glutamine tract length, the higher is the propensity to form β-helices. This length-dependent formation of β-helices by polyglutamine stretches may provide a unified molecular framework for understanding the structural basis of different trinucleotide repeat-associated diseases.

By characterizing the length dependence of β-helix formation, we uncovered the molecular basis of the observed glutamine length dependence of polyQ aggregation. Since the average number of residues per turn of the β-helix in our simulations is 18.5 ± 2 residues, we expected that about 33–40 glutamines would be required for its formation. To test this hypothesized length dependence of β-helix formation, we studied the conformational dynamics of 25-mer and 45-mer polyQ peptides and found that β-helices were absent at all temperatures when the repeat length was 25. Moreover, the β-helix topology was stable in a broader range of temperatures for the 45-mer (Figure 2B), demonstrating that β-helix formation increases with the length of polyQ. The formation of a β-helix from a random coil was accompanied by entropy loss, leading to a free energy barrier. This barrier results in the lag times observed in experiments of polyQ peptide aggregation [11]. Barrier crossing is enhanced for longer peptides, because the enthalpy gain upon β-helix formation compensates for the entropy loss in the transition. Therefore, β-helices are formed only by peptides longer than a critical length (n > n_C). Recently, Stork et al. [22] found that the dimerization of two Q37 β-helices resulted in a stabilization of the (preformed, in their study) β-helix conformation. The stabilization of the β-helix upon dimerization shows that the dimerization is a downhill process (i.e., there is no energy barrier) on the free-energy landscape. Thus, we propose that length-dependent β-helix formation may be the molecular origin of polyglutamine-mediated aggregation. We also propose that once a β-helix is formed by a monomer, the elongation of the aggregate may involve the conversion of other chains to β-helices induced by the β-helix nucleus. A fibril may be formed by stacking of multiple β-helices, as suggested by Stork et al. [22], and these fibrils may arrange further to form larger fibers.
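The length estimate quoted in the Results (18.5 ± 2 residues per turn, hence roughly 33–40 glutamines) can be checked with a few lines of arithmetic. The sketch below assumes that a β-helix needs two complete turns; that factor of two is our reading of the estimate and is not stated explicitly in the text, and the function names are illustrative only.

```python
# Back-of-the-envelope check of the chain length needed for a beta-helix.
# Assumption (not stated in the source): two complete turns are required.

RESIDUES_PER_TURN = 18.5  # mean residues per turn reported from the simulations
SPREAD = 2.0              # reported +/- spread on that mean
MIN_TURNS = 2             # assumed minimum number of turns


def required_length_range(per_turn, spread, turns):
    """Return (low, high) residue counts needed to complete `turns` turns."""
    return (turns * (per_turn - spread), turns * (per_turn + spread))


def can_form_helix(n_residues, per_turn=RESIDUES_PER_TURN, turns=MIN_TURNS):
    """True if the chain is at least as long as `turns` average turns."""
    return n_residues >= turns * per_turn


low, high = required_length_range(RESIDUES_PER_TURN, SPREAD, MIN_TURNS)
print(f"Estimated minimum length: {low:.0f}-{high:.0f} glutamines")
for n in (25, 37, 45):
    print(n, can_form_helix(n))
```

Under this two-turn assumption the range comes out as 33 to 41 residues, close to the 33–40 quoted in the text. The 25-mer falls below the two-turn mean of 37, while the 37-mer and 45-mer reach it.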
The formation of β-helices by a single polyQ chain can be used to rationalize the aggregation of experimentally characterized mutant polyQ peptides. Previously, Thakur and Wetzel [5] found that inserting the turn-inducing amino acid sequence proline-glycine at different sequence intervals, e.g., (Q9-PG-Q9)3 (PG-Q9) and (Q10-PG-Q10)3 (PG-Q10), modulated the aggregation kinetics of polyQ peptides. The mutant PG-Q9 was found to aggregate at a marginally smaller rate than the wild-type polyQ, and the critical nucleus size for aggregation, as for the wild type, was one. The existence of nucleated aggregation kinetics of the mutant suggests that, similar to the wild type, barrier crossing from the ground state to an aggregation-prone state occurs. Therefore, we hypothesized that, similar to wild-type polyQ, a thermodynamically unfavorable nucleating conformational transition occurs in a single β-hairpin-forming PGQ9 peptide. We studied the conformational dynamics of the PGQ9 mutant peptide and found that, in agreement with Thakur and Wetzel's prediction, this peptide forms a four-stranded antiparallel β-sheet. The antiparallel structures are formed at low temperatures in our simulation (Figure 3A and 3B). Do these mutants aggregate through antiparallel β-hairpin structures or by wild-type-like β-helix formation? Thakur and Wetzel's data [5] are compatible with either scenario. If β-hairpin formation is rate-limiting (i.e., nucleates aggregation), then, because β-hairpin formation is thermodynamically more favorable in the mutants than in the wild type, the mutants should aggregate faster than the wild type. In contrast, if wild-type-like β-helix formation in the mutants nucleates aggregation, their aggregation rate compared to the wild type is expected to be determined by the relative stabilities of the metastable mutant and wild-type β-helices.
To understand the nucleating conformational transition in these mutant peptides, we performed all-atom MD simulations of a PGQ9 sequence in a β-helix conformation (Figure 3C). We found that, similar to the wild-type β-helix, the β-helix formed by PGQ9 remained stable on the nanosecond time scale, but showed a greater root-mean-square deviation (RMSD) compared to a wild-type β-helix of identical length (see Figure 2A). We compared the RMSD per residue of the wild-type and mutant structures (unpublished data) during MD simulations and found that the destabilization induced by the proline-glycine insert is not limited to the proline-glycine residues; it is transduced across the whole peptide, leading to an overall higher RMSD. Thus, we propose that the differential stability of the transiently formed β-helix by the mutant peptide compared to wild-type polyQ may underlie the experimentally observed slower rate of aggregation of the mutant.
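For readers unfamiliar with the stability measure used here, the following is a minimal sketch of how an overall RMSD and a per-residue RMSD can be computed from coordinates. It is not the authors' analysis script. It assumes that the structures have already been superimposed (no fitting is performed) and that each residue is represented by a single atom; the function names and the toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally sized lists of (x, y, z) points.

    Assumes the two structures are already superimposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and the same size")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def per_residue_rmsd(reference, frames):
    """RMSD of each residue over all frames, relative to the reference.

    `reference` is a list of (x, y, z); `frames` is a list of such lists.
    One representative atom per residue is assumed.
    """
    if not frames:
        raise ValueError("at least one frame is required")
    result = []
    for i, ref in enumerate(reference):
        squared = sum(
            sum((frame[i][k] - ref[k]) ** 2 for k in range(3))
            for frame in frames
        )
        result.append(math.sqrt(squared / len(frames)))
    return result


# Toy data: residue 0 moves by 5 length units, residue 1 does not move.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frame = [(3.0, 4.0, 0.0), (1.0, 0.0, 0.0)]
print(rmsd(reference, frame))
print(per_residue_rmsd(reference, [frame]))
```

A per-residue profile of this kind is what allows a local perturbation, such as a proline-glycine insert, to be distinguished from a destabilization that spreads across the whole chain.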
Mechanisms of protein aggregation [23] are increasingly being sought as a framework for understanding and, importantly, therapeutically interfering with, the fundamental events that underlie misfolding diseases [24]. The common underlying basis of protein aggregation has been demonstrated by the discovery of antibodies that can cross-react with early aggregates of different peptides and proteins [25]. Further, the early oligomers themselves, rather than the final fibrils, have been shown to be toxic [26]. Thus, the conversion to specific β-strand topologies is a common central feature associated with cytotoxicity in all aggregation-linked diseases. The structural basis of the mechanism of polyQ peptide aggregation that we present here may thus aid the understanding and development of rational therapies to modulate protein aggregation in these debilitating neurodegenerative diseases.

Materials and Methods
We modeled the polyQ chain as "beads on a string," where each glutamine is represented by six pseudo-atoms: four corresponding to the backbone NH, C′, Cα, and O atoms, and two sidechain atoms, one for the methylene (-CH2-CH2-) groups and another for the carboxamide (-CONH2) group. Neighboring residues in this peptide representation are covalently constrained to mimic the peptide flexibility in real proteins.
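The six-bead representation can be made concrete with a small data structure. This is only an illustrative sketch: the bead labels, the `Bead` class, and the `build_polyq` function are hypothetical names chosen here, and the bonded constraints between beads are not modeled.

```python
from dataclasses import dataclass

# Four backbone pseudo-atoms and two sidechain pseudo-atoms per glutamine.
# The labels are illustrative and are not taken from the original model files.
BACKBONE_BEADS = ("NH", "C'", "CA", "O")
SIDECHAIN_BEADS = ("CH2", "CONH2")  # methylene groups, terminal amide group


@dataclass(frozen=True)
class Bead:
    residue: int  # 1-based residue index along the chain
    name: str     # pseudo-atom label
    kind: str     # "backbone" or "sidechain"


def build_polyq(n_residues):
    """Return the list of pseudo-atoms for a Qn chain (six per residue)."""
    if n_residues < 1:
        raise ValueError("a chain needs at least one residue")
    beads = []
    for res in range(1, n_residues + 1):
        beads.extend(Bead(res, name, "backbone") for name in BACKBONE_BEADS)
        beads.extend(Bead(res, name, "sidechain") for name in SIDECHAIN_BEADS)
    return beads


q37 = build_polyq(37)
print(len(q37))  # 37 residues x 6 beads = 222
```

Under this representation a Q37 chain has 222 pseudo-atoms. An all-atom glutamine residue has 17 atoms including hydrogens, so the coarse-grained chain is roughly a third of the size before any solvent is added.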
To study the conformational dynamics of polyQ, we introduce simplified amino acid interactions: hydrophobic interactions between the methylene groups, polar interactions between sidechains and backbone NH or O beads, and nonspecific backbone hydrogen bonds as described in [16]. The interaction strengths of the hydrophobic interactions, sidechain-backbone polar interactions, and nonspecific backbone hydrogen bonds are assigned as 0.7ε, 5ε, and 5ε, respectively. These interaction strengths were used to successfully fold a miniprotein, the Trp-cage, to within 1 Å of its native structure [16]. Interactions in proline and glycine were also modeled as in [16]. We used the rapid DMD algorithm to perform simulations on our model proteins [17,27,28].
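To illustrate how stepwise interactions of these strengths enter a discrete MD simulation, the sketch below applies the generic velocity-update rule used when two beads reach a step in a square-well potential. It conserves momentum and total energy. It is a generic sketch and not the implementation of [17]: event scheduling, well ranges, bead masses, and the geometric hydrogen-bond criteria are all omitted, and the names `WELL_DEPTH` and `step_velocity_update` are hypothetical.

```python
import math

EPS = 1.0  # energy unit (epsilon)

# Well depths quoted in the text, in units of epsilon.
WELL_DEPTH = {
    "hydrophobic": 0.7 * EPS,
    "sidechain_backbone": 5.0 * EPS,
    "backbone_hbond": 5.0 * EPS,
}


def step_velocity_update(r1, r2, v1, v2, m1, m2, delta_u):
    """Velocities of two beads after they reach a potential step.

    delta_u is U(after crossing) - U(before crossing): negative when
    entering an attractive well, positive when leaving it.
    """
    rel = [a - b for a, b in zip(r1, r2)]
    dist = math.sqrt(sum(c * c for c in rel))
    n = [c / dist for c in rel]                        # unit vector from 2 to 1
    v_n = sum((a - b) * c for a, b, c in zip(v1, v2, n))  # normal relative speed
    mu = m1 * m2 / (m1 + m2)                           # reduced mass
    disc = v_n * v_n - 2.0 * delta_u / mu
    if disc >= 0.0:
        # Enough kinetic energy along the normal: the step is crossed.
        new_v_n = math.copysign(math.sqrt(disc), v_n)
    else:
        # Not enough energy: the beads are reflected at the step.
        new_v_n = -v_n
    dv = new_v_n - v_n
    new_v1 = [a + (mu / m1) * dv * c for a, c in zip(v1, n)]
    new_v2 = [b - (mu / m2) * dv * c for b, c in zip(v2, n)]
    return new_v1, new_v2


# Two unit-mass beads approaching head-on enter a sidechain-backbone well.
depth = WELL_DEPTH["sidechain_backbone"]
a1, a2 = step_velocity_update(
    (1.0, 0.0, 0.0), (0.0, 0.0, 0.0),
    (-1.0, 0.0, 0.0), (1.0, 0.0, 0.0),
    1.0, 1.0, -depth,
)
print(a1, a2)
```

In the usage example the pair gains kinetic energy equal to the well depth on entering the well. If the same pair later tries to leave with too little kinetic energy along the line joining the beads, the rule reflects it back, which is how a bound contact persists in this kind of model.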
We used the snapshots collected from DMD simulations to perform all-atom MD simulations using standard MD protocols. Using a three-step algorithm [29], we reconstructed all atoms of the polyQ chain from the snapshots taken from simulations of coarse-grained protein models. All-atom MD simulations were performed using the package AMBER 7, with the AMBER parm99 force field [30,31] at a temperature of 300 K and a pressure of one atmosphere, in an octahedral periodic box of water. The protocol for MD simulations involved equilibration of the solvent and the peptide, and production as described by Urbanc et al. [29]. The trajectory was recorded for 3 ns after equilibration.
Sidechain-backbone interactions have been identified as playing important roles in the formation of protein structures. It has been pointed out that hydrogen bonds between polar sidechains and the backbone are important for the initiation and termination of α-helices [32,33] and for the formation of turns in proteins [34]. To evaluate how sensitive our results were to the relative strengths of sidechain and backbone hydrogen bonds, we performed DMD simulations with varying relative strengths of sidechain and backbone hydrogen bonds (ε_sidechain/ε_backbone). We found that when sidechain hydrogen bonds were weaker than backbone hydrogen bonds, i.e., ε_sidechain/ε_backbone < 1, polyQ formed α-helices at low temperatures, as opposed to the random coil structures observed at ε_sidechain/ε_backbone = 1. The observation of random coils at low temperatures is in agreement with the experiments in [35], and therefore we chose ε_sidechain/ε_backbone = 1.