Protein Aggregation and Protein Instability Govern Familial Amyotrophic Lateral Sclerosis Patient Survival

The nature of the “toxic gain of function” that results from amyotrophic lateral sclerosis (ALS)-, Parkinson-, and Alzheimer-related mutations is a matter of debate. As a result no adequate model of any neurodegenerative disease etiology exists. We demonstrate that two synergistic properties, namely, increased protein aggregation propensity (increased likelihood that an unfolded protein will aggregate) and decreased protein stability (increased likelihood that a protein will unfold), are central to ALS etiology. Taken together these properties account for 69% of the variability in mutant Cu/Zn-superoxide-dismutase-linked familial ALS patient survival times. Aggregation is a concentration-dependent process, and spinal cord motor neurons have higher concentrations of Cu/Zn-superoxide dismutase than the surrounding cells. Protein aggregation therefore is expected to contribute to the selective vulnerability of motor neurons in familial ALS.


Introduction
Amyotrophic lateral sclerosis (ALS) is an adult-onset neurodegenerative disease, with roughly 10% of cases being inherited or familial [1]. The cause of sporadic ALS (sALS) is unknown, while familial ALS (fALS) is known to be caused by mutations in six different genes and six different chromosomal loci [2][3][4]. One of these genes, encoding Cu/Zn-superoxide dismutase (SOD1), was found to associate with 20% of fALS, and at least 119 fALS-associated SOD1 mutations have been characterized in humans [1,5].

SOD1 Mutations Have Inherently Different Toxicities
The goal of this study is to discover the mechanisms of toxicity of fALS SOD1 mutations. Neurologists often publish the age at onset and the time from disease onset to death (also termed survival or disease duration) for their ALS patients, thereby enabling epidemiological studies that assess the risk of a given variable [63][64][65][66], which for this study include given mutations' relative toxicity and physical characteristics (physicochemical parameters). Previous studies revealed that different SOD1 mutations have inherently different toxicities (encode different mean disease durations) [67]. We expanded upon these studies with a larger set of fALS-causing SOD1 mutations as well as larger patient cohorts. Hazard ratios (relative risk of dying at a given time) of fALS SOD1 mutations and non-SOD1-related fALS compared to that of sALS were obtained from the Cox proportional hazard model (Table 1). From this result, fALS SOD1 mutations with data from at least five individual patients (full criteria for inclusion are defined in the Materials and Methods section) were significantly related to different hazards. Kaplan-Meier survival curves from patients with fALS-causing SOD1 mutations, non-SOD1-related fALS, and sALS were generated. Figure 1 illustrates that fALS SOD1 mutations encode different prognoses, ranging from considerably better (e.g., H46R, hazard rate = 0.075 × sALS) to considerably worse (e.g., A4V, hazard rate = 5.7 × sALS) than sALS. Moreover, the log rank, Breslow, and Tarone-Ware tests, which also compare patient survival rates (i.e., each mutation versus every other fALS-causing SOD1 mutation, non-SOD1-related fALS, and sALS) using different mathematical functions, confirm that different SOD1 mutations have inherently different prognoses (Table 2).
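The Kaplan-Meier curves mentioned above can be illustrated with a minimal sketch of the product-limit estimator. This is not the authors' SPSS workflow; the function name `kaplan_meier` and the example durations are hypothetical and serve only to show how censored patients leave the risk set without counting as deaths.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each time with >= 1 death.

    durations: time from onset to death or to last follow-up (years)
    observed:  True if death was observed, False if the patient was censored
    """
    deaths = Counter(t for t, d in zip(durations, observed) if d)
    exits = Counter(durations)  # deaths and censorings both leave the risk set
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        if deaths[t]:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - deaths[t] / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical durations (years); the last patient is censored at 5 years.
example_curve = kaplan_meier(
    [1.0, 2.0, 2.0, 3.0, 5.0],
    [True, True, True, True, False],
)
```

Comparing such curves between mutations is what the log rank, Breslow, and Tarone-Ware tests formalize; those tests are not implemented in this sketch.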

SOD1 Variants' Gain of Hydrophobicity, Loss of α-Helix, and Gain of β-Sheet Propensity Are fALS Risk Factors, While Loss of Net Charge Is Protective
To test the hypothesis that changes to the physicochemical properties of SOD1 variants are toxic, specifically those properties known to influence protein aggregation, physicochemical properties for each protein variant (hydrophobicity, propensity to lose α-helices, form β-sheets, protein net charge, etc.) were evaluated in a Cox proportional hazard model (Table 3). The hazard ratios were significantly higher than 1.0 for mutations that either increase hydrophobicity, lose α-helices, or form β-sheets. In contrast, mutations that decrease the magnitude of the protein net charge correlate with hazard ratios significantly smaller than 1.0. These results indicate that changes in the SOD1 variants' properties, specifically increases in hydrophobicity and propensity to lose α-helices and to form β-sheets, correlate with decreased fALS patient survival, while decreases of the magnitude of net charge correlate with increased fALS patient survival, in contradiction with previous reports [15,68].
Dobson and co-workers [69] introduced an equation (termed the Chiti-Dobson equation herein) to predict the changes of aggregation rates of unfolded peptides or proteins upon point mutations by their physicochemical properties. This equation was derived empirically by modeling how three physicochemical properties, hydrophobicity, secondary structure (including loss of α-helix and gain of β-sheet), and protein net charge, change upon mutations (the hazard analyses for each of these properties were reported in the previous paragraph). These physicochemical property changes then were related to changes in protein aggregation rate, yielding an equation that predicts how any mutation will change the rate of protein aggregation (the predicted change of aggregation rate is referred to as the aggregation propensity). Although this equation is empirical, it is based upon first physical/chemical principles and approximates how a given mutation will change the energy (and thus the equilibrium) between a solvated and an aggregated protein. The Chiti-Dobson equation is

$$\ln(\nu_{mut}/\nu_{wt}) = 0.633\,\Delta\mathrm{Hydr} + 0.198\,(\Delta\Delta G_{coil-\alpha} + \Delta\Delta G_{\beta-coil}) - 0.491\,\Delta\mathrm{charge},$$

in which $\ln(\nu_{mut}/\nu_{wt})$ represents the change of aggregation rate upon mutation, and $\Delta\mathrm{Hydr}$, $\Delta\Delta G_{coil-\alpha}$, $\Delta\Delta G_{\beta-coil}$, and $\Delta\mathrm{charge}$ represent the changes of hydrophobicity, free energy change for the process from α-helix to random coil, free energy change for the process from random coil to β-sheet, and protein net charge from the mutation, respectively. In their landmark study, it was demonstrated that increases in hydrophobicity, losses of α-helices, gains of β-sheets, and decreases in the magnitude of protein net charge increase the rate of protein aggregation.

Author Summary
Amyotrophic lateral sclerosis (ALS), also known in America as Lou Gehrig's disease, is a fatal neurodegenerative disease with no effective treatment. Paralysis occurs as the result of the death of cells that connect the brain to various muscles, namely, the motor neurons of the brain and spinal cord. Ninety percent of ALS is sporadic and of unknown cause. A landmark discovery in ALS research was that mutations in the gene coding for Cu/Zn-superoxide dismutase cause at least 2% of ALS, and researchers have since discovered at least 119 such mutations. Neurologists also discovered that different mutations have remarkably different prognoses. For example, patients with the A4V mutation survive an average of 1 year after diagnosis, whereas patients with the H46R mutation survive an average of 18 years. Biochemists discovered that different mutations result in remarkably different physical properties, for example, stability of Cu/Zn-superoxide dismutase. In this article we apply an algorithm that predicts how fast a given Cu/Zn-superoxide dismutase will aggregate (stick to other proteins) and demonstrate that faster aggregation relates to faster death of ALS patients. We also demonstrate that loss of Cu/Zn-superoxide dismutase stability relates to faster ALS patient death. Our findings imply that aggregation of unfolded SOD1 is toxic for ALS patients, and in fact accounts for 69% of the variability in mutant Cu/Zn-superoxide-dismutase-linked familial ALS patient survival times.

Protein Aggregation Propensity Is a Risk Factor of fALS
The Chiti-Dobson equation and the many equations it inspired are robust and versatile, having successfully predicted aggregation rates of diverse disease-associated proteins [70], including amyloid β-peptide [69], tau [69], α-synuclein [69], amylin [69], lysozyme [71], etc. Moreover, increases in the predicted rates of aggregation of various mutations in amyloid β-peptide were shown to relate to increased neuronal dysfunction and degeneration in a Drosophila model of Alzheimer's disease [72]. To test the hypothesis that protein aggregation propensity is related to fALS patient survival, the Chiti-Dobson equation was used to predict the aggregation propensities of fALS-causing SOD1 mutations. We started this study by validating the Chiti-Dobson equation, taking all experimental protein aggregation rate data available at the inception of our study (data reported as of 2005, listed in Table 4) and recalibrating the equation. The detailed results of the validation are reported in Figure 2. In summary, the Chiti-Dobson equation was verified for use in fALS, and the statistical correlation between the physicochemical parameters (hydrophobicity, net charge, and secondary structure) and the aggregation propensity remained and changed only marginally. Since the time we validated the Chiti-Dobson equation, a number of papers also validated their general approach [73][74][75]. Even so, we have included our own analysis since it provides exposure to the physical basis of aggregation propensity. Furthermore, inclusion of this data makes this study self-contained so that all of the data necessary to support or disprove our model are contained herein. Notably, this paper's conclusions were the same using both the original and the recalibrated Chiti-Dobson equation.
The average patient survival times for different SOD1 variants with measured thermodynamic stabilities were plotted against corresponding predicted aggregation propensities, and linear regression analysis weighted by the number of patients for each mutation yielded R (multiple correlation coefficient, with a larger value indicating a stronger relationship) and P (value less than 0.05 implies a significant result) values of 0.58 and <0.001, respectively (Figure 3A). The severity of fALS thus is related to mutation-induced increases in SOD1 aggregation propensity. The same plot was performed with the linear regression analysis not weighted by the number of patients (Figure 4A), yielding R and P values of 0.23 and 0.2, respectively. Unfortunately, the published epidemiology data do not provide the information necessary to stratify for known ALS covariates, including lifestyle (diet and smoking) [76][77][78][79], palliative care [80], bulbar onset, etc., and weighted data are more likely to account for differences in these factors. The Chiti-Dobson equation results for all fALS-causing SOD1 mutations with patients' survival data also were evaluated in a univariate Cox proportional hazard model (Table 3). The hazard ratio for the Chiti-Dobson equation result was significantly higher than 1.0, which also indicates that aggregation propensity is a risk factor for fALS. Previous studies of Huntington's disease revealed an inverse relationship between the length of glutamine repeat of huntingtin and age of disease onset. The authors of this previous study concluded that disease onset correlates with rate of nucleation of aggregation [81]. We demonstrate here an inverse relationship between the rate of aggregation elongation after nucleation and the disease duration after onset.
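The difference between the patient-weighted and unweighted regressions described above can be sketched as follows. This is an illustrative weighted least-squares fit, not a reproduction of the authors' statistics software; the function name `weighted_linear_fit` and all data values are hypothetical. Each mutation contributes one point whose weight is its number of patients, so well-sampled mutations dominate the fit.

```python
import math


def weighted_linear_fit(x, y, w):
    """Weighted least-squares line y = slope * x + intercept, plus weighted R.

    x: predictor per mutation (e.g., normalized aggregation propensity)
    y: mean survival per mutation (years)
    w: weight per mutation (e.g., number of patients)
    """
    total = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    syy = sum(wi * (yi - mean_y) ** 2 for wi, yi in zip(w, y))
    sxy = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)  # signed; abs(r) matches a reported R
    return slope, intercept, r


# Hypothetical mutations: propensity, mean survival (years), patient count.
propensity = [0.1, 0.4, 0.6, 0.9]
survival_years = [15.0, 9.0, 6.0, 1.5]
patients = [8, 12, 5, 60]
slope, intercept, r = weighted_linear_fit(propensity, survival_years, patients)
```

Setting every weight to 1 recovers the ordinary unweighted regression, which is the comparison the text draws between Figures 3 and 4.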

Protein Instability Is a Risk Factor for fALS
On the basis of our observation that predicted increased protein aggregation correlates with increased disease severity and previous data indicating that protein unfolding or misfolding promotes aggregation [82][83][84][85], we tested the hypothesis that a loss of protein stability also could be a risk factor for ALS. For the sake of simplicity, we use the term instability throughout this article, with instability defined as the inverse of either the normalized ΔΔG (unfolding free energy change difference between mutant and wild-type SOD1) or normalized ΔTm (melting point difference between mutant and wild-type SOD1). Instability was considered for two reasons: (1) the Chiti-Dobson equation predicts the aggregation rates of unfolded proteins (it was derived from the aggregation rates of proteins in high trifluoroethanol concentrations that contained secondary but no tertiary structure), and therefore, formally, unfolding must occur prior to aggregation, and (2) unfolding is known to speed protein aggregation in vitro to the extent that without chemically induced unfolding induction periods extend from months to years, as demonstrated for SOD1 [32]. Aggregation in vivo therefore may require protein unfolding. Before using stability data published by different laboratories using different methods (melting point, which yields ΔTm, or chaotrope-induced unfolding, which yields ΔΔG), we sought to determine the reliability of the data. If different laboratories reported similar values of stability for the same mutants, then the data could be deemed reliable.
Therefore, all published measurements of apo SOD1 stability (metallated SOD1 calorimetry data often bear the characteristics of irreversible denaturation, probably via Cu-catalyzed disulfide bond formation, and are therefore less reliable) [15,31,60,86-88] were compiled, and the experimental values of ΔΔG and ΔTm were normalized to the range from 0 to 1 (described in the Materials and Methods section), with 0 representing the least stable and 1 representing the most stable (highest stability) variant. Through the use of all of the data from mutants where ΔΔG and ΔTm were measured by different laboratories, a plot of normalized ΔΔG versus normalized ΔTm was created. Good interlaboratory correlation of measured stability values was observed (slope = 0.94, R = 0.90, P = 0.002; Figure 5), and we therefore deemed the stability data reliable for use.
Next, patient survival data for fALS-causing SOD1 variants were plotted against corresponding instability values, and linear regression analysis weighted by the number of patients for each mutation yielded R and P values of 0.71 and <0.001, respectively (Figure 3B). The same plot was performed with the linear regression analysis not weighted by the number of patients (Figure 4B), yielding R = 0.34 and P = 0.07. A gain of SOD1 instability (loss of stability) upon mutation therefore is related to decreased fALS patient survival. Increased in vitro instability is consistent with previous findings that the in vivo half-lives of SOD1 variants are decreased [89].
Previous results from computer simulations indicate a multistep process for aggregation via destabilization [90], encouraging us to understand the combined effect of aggregation propensity and protein instability upon ALS patient survival. On the basis of their respective multiple correlation coefficients and slopes, aggregation propensity and instability are equal contributors to fALS patients' survival. Moreover, no obvious correlation between protein instability and aggregation propensity was observed for the SOD1 variants used in Figure 3 (Figure S1), indicating that increased instability is not responsible for the increased predicted protein aggregation propensity. The combination of instability and aggregation propensity represents the relative energy in proceeding from folded to unfolded apo SOD1 and then from unfolded to aggregated states. Patient survival was plotted against corresponding summed instability and aggregation propensity values. A linear regression analysis weighted by the number of patients for each mutation yielded R and P values of 0.83 and <0.001, respectively (Figure 3C). The same plot was performed with the linear regression analysis not weighted by the number of patients (Figure 4C), yielding R = 0.47 and P = 0.01.
The improved statistical result of predicting patient survival after combining instability and aggregation propensity indicates that aggregation occurs from unfolded or partially unfolded SOD1. The stability data used herein were for apo SOD1, and therefore the absence of metals is implicit. The R² value was 0.69 from the weighted data, indicating that 69% of the intrinsic variability in these fALS patients' survival resulted from the combination of increased aggregation propensity and instability. Additionally, aggregation propensity and instability were evaluated in a Cox proportional hazard model (Table 5). The hazard ratios were significantly higher than 1.0 for both factors. The sum of aggregation propensity and instability also was evaluated in a univariate Cox proportional hazard model (Table 5). The hazard ratio for this sum was also significantly higher than 1.0, further indicating that aggregation propensity and instability are synergistic risk factors for fALS. Note that the aggregation propensity and instability tested in Table 5 were normalized to the range from 0 to 1 (as in Figures 3 and 4), while the values tested in Table 3 were not normalized. As a result of normalization, which decreased the value range of tested factors, the hazard ratios of Table 5 are much larger than those of Table 3, and therefore the large values of hazard ratios reported in Table 5 should not be overinterpreted. Significantly, a fALS patient with an SOD1 mutation of relatively low aggregation propensity and high stability is expected to survive longer after disease onset. It has not escaped our attention that the rate of protein aggregation has implications in both sporadic diseases and aging; for example, the toxicity of a given posttranslational modification is a function of its effect on protein stability and aggregation propensity.

Figure 2. [...] (Table 4). The dependence of observed $\ln(\nu_{mut}/\nu_{wt})$ on hydrophobicity, secondary structure, and charge changes was still observed after the addition of extra protein aggregation data. (A) The relationship between observed $\ln(\nu_{mut}/\nu_{wt})$ and Δhydrophobicity. To ensure that the effect of hydrophobicity change was considered independent of other physicochemical properties, only mutations that had a Δcharge of 0 and a $|\Delta\Delta G_{coil-\alpha} + \Delta\Delta G_{\beta-coil}|$ of less than 2.5 kJ/mol were considered. (B) The relationship between observed $\ln(\nu_{mut}/\nu_{wt})$ and $\Delta\Delta G_{coil-\alpha} + \Delta\Delta G_{\beta-coil}$. To ensure that the effect of secondary structure change was considered independent of other physicochemical properties, only mutations that had a Δcharge of 0 and a |Δhydrophobicity| of less than 3 kcal/mol were considered. (C) The relationship between the observed $\ln(\nu_{mut}/\nu_{wt})$ and Δcharge. To ensure that the effect of charge change was considered independent of other physicochemical properties, only mutations that had a |Δhydrophobicity| of less than 3 kcal/mol and a $|\Delta\Delta G_{coil-\alpha} + \Delta\Delta G_{\beta-coil}|$ of less than 2.5 kJ/mol were considered. Wild-type protein was used as a data point at (0,0) in all three graphs. The rederived slopes from this figure for the three factors, 0.95 for hydrophobicity, 0.18 for secondary structure, and −0.78 for charge, were applied to calculate aggregation propensities of fALS-causing SOD1 variants presented in Figure 4. Patient survival times were plotted against these aggregation propensities; the corresponding slope and R values differ less than 5% compared to the results in Figure 4.

Discussion
We describe here synergistic gains of toxic functions of SOD1 in ALS. These are the first results in any neurodegenerative disease demonstrating that protein instability and aggregation propensity are synergistic risk factors. The fact that there are two synergistic risk factors rather than a single toxic gain of function probably has delayed the discovery of the mechanisms of fALS mutant SOD1 toxicity. The SOD1 stability data used in this paper were measured from apo SOD1, and the aggregation rate data used to create the Chiti-Dobson model were from in vitro unfolded proteins. Therefore, formally, the combination of instability and aggregation propensity represents the relative energy in proceeding from apo folded to unfolded SOD1 and then from unfolded to aggregated states. It has been demonstrated experimentally that apo SOD1 has a faster rate of aggregation than that of holo forms [32]. Partial unfolding/misfolding also can lead to aggregation [28,91-96], and our results cannot rule out a role for the aggregation of partially folded, including metallated, SOD1. Previous studies revealed a correlation [15] and conversely a lack of correlation [86] between SOD1 variant stability and patient disease duration. Correlation between SOD1 variant stability and patient disease duration, however, required that the authors omit stability data of 4 of the 15 variants from their regression analysis (on the basis that these variants change the net charge of SOD1).
As presented in Table 3, SOD1 variants' loss of net charge correlates with increased patient survival, while gain of hydrophobicity, loss of α-helix, and gain of β-sheet propensity are ALS risk factors. On the basis of Dobson and co-workers' related work [69,73,97], a loss of net charge is predicted to increase the aggregation propensity of unfolded proteins. If aggregation is toxic, then one would expect loss of net charge to be toxic. In contrast to the synergistic effects for aggregation propensity and instability presented in Table 5, the correlation of loss of net charge with increased survival has an effect of decreasing the hazard ratio presented in the univariate model presented in Table 3. We demonstrate that mutations causing the entire protein to approach neutrality are protective in the context of fALS (Table 3) rather than deleterious as proposed by Oliveberg and co-workers [15,68]. These results should be cautiously interpreted since, in contrast to our Cox proportional hazard model result that loss of net charge is protective, the mean patient survival times for loss of net charge and gain of net charge mutations, unweighted by the number of patients, are 7.1 and 6.9 years, respectively. Further study clearly is required to understand the role of charge in ALS etiology.
In contrast with the strong familiality shown for disease duration after onset (Table 1), SOD1-mediated ALS showed modest familiality with respect to onset, accounting for only 42% of the variability in A4V and D90A fALS patients [98], and with only G37R and L38V mutations of SOD1 being significant covariates of age of onset [67]. The same analysis shown in Figures 3 and 4 was performed using age at disease onset rather than disease duration as the dependent variable (Figure S2), and little or no relationship between disease onset and aggregation propensity or instability was observed. The Chiti-Dobson equation predicts the rate of aggregation after nucleation (rate of elongation). It is tempting therefore to speculate that the rate of nucleation is a determinant of age at onset. Testing this hypothesis would require the development of a model that can predict nucleation rates based upon physicochemical parameters, a task that is hampered by the stochastic nature of in vitro nucleation times [99,100] but that should now be possible given our recent development of methods for modeling in vitro nucleation kinetics [101].
Although our model accounts for 69% of the variability in fALS patient survival after onset, there are clearly genetic components of fALS that our model cannot account for. For example, while D90A is normally a dominantly inherited mutation in North America, 2.5% of people in Sweden and Finland are heterozygous asymptomatic carriers of the D90A SOD1 mutation [102,103] and require two mutant alleles before presenting ALS symptoms. Notably, our results and conclusions were unaffected by including or excluding D90A survival times during data analysis.
It is postulated that diseases for which protein aggregation contributes to patient death will (1) develop in cells with the highest concentration of the aggregation-prone protein in accordance with the concentration dependence of aggregation rates [101,104] and (2) have a prognosis influenced by the aggregation propensity of the aggregating protein, in accordance with the results reported herein. Motor neurons are the cells in the ventral horn of the spinal cord with the highest SOD1 concentration [39,105], perhaps explaining an aspect of the selective vulnerability of these cells.

Materials and Methods
Familial ALS patients' disease duration and age of onset. Familial ALS patients' data were taken from all of the available literature. Disease duration was measured from onset of the first symptoms until the patient's death or until respiratory assistance was required for the patient's survival. The average duration and onset for each mutation were calculated as the weighted average based on the number of patients (Table 6). If the patients were still reported to be alive without respiratory assistance at the end of a study, then their disease durations were not used to calculate the average unless the known duration value was larger than the average calculated with only durations from patients deceased or with respiratory assistance. For studies reporting average disease duration and Kaplan-Meier curves, the reported average durations were used to calculate the weighted averages. The current unavailability of http://www.alsod.org/ made it impossible to review the references provided by the website (from which we had taken survival times before it became unavailable), which created the risk of counting a patient's disease onset or survival twice, and made reproducing our study impossible for other groups. We therefore opted not to report data from this website in this study, thereby eliminating no more than 67 (there were 67 http://www.alsod.org/ patients' data without accompanying literature references that may, or may not, have been represented by our literature search) of 1319 patients' data. However, we did perform a complete, alternative set of analyses that did include http://www.alsod.org/ data (unpublished data), and the statistical correlations in the figures and tables shown herein persisted. Mean values of disease durations also were obtained from Kaplan-Meier curves and tested on SOD1 mutations with known experimental thermodynamic stabilities, and the results were comparable to those in Figure 3. Since the weighted average method can provide disease duration regardless of the number of patients, we opted for its use.
Kaplan-Meier survival curves, the log rank tests, and the Cox proportional hazard model. Kaplan-Meier curves of survival for different fALS-causing SOD1 mutations, non-SOD1-related fALS, and sALS were generated. The hazard ratios of different fALS-causing SOD1 mutations and non-SOD1-related fALS compared to sALS were tested as a category variable by Cox proportional hazard model analysis. For studies reporting Kaplan-Meier curves but without individual patients' data, the Engauge Digitizer 4.1 software was used to obtain coordinates for cumulative survival at each time point. This information was used to calculate the number of patients not surviving at each time point under the assumption that there is no censored patient (with unknown exact survival time because of being alive at the end of study, lost to follow-up, or withdrawal from the study) within the course of survival curves. For cumulative survival not reaching 0 at the end of study, those fractions of patients were treated as censored. The error of the estimated number of patients is less than 5% of the number reported. To eliminate the chance that one or two patients' survival data bias the analysis result, a rule of thumb [132] requiring that each tested fALS-related SOD1 mutation includes at least five noncensored patients was applied. Since patients' survival was reported only as an average from a group of patients and individual patient's survival information was not described in some publications, one or two publications' survival data might bias the analysis result. To eliminate this chance of bias, the rule of thumb was modified as requiring at least five independent descriptions of noncensored patients' survival data (a reported average without individual patient's survival information was treated as one description). The statistical analysis was performed with the software SPSS 15.0 (SPSS, Inc.).

Table 5 (footnote). A multivariate survival model was tested with aggregation propensity and instability. A univariate survival model was tested with the sum of aggregation propensity and instability. A significance level of 0.05 was used. Aggregation propensity and instability as well as the sum of aggregation propensity and instability are shown as significantly related to patients' survival. Disease durations from 614 patients with 29 different fALS-causing SOD1 mutations with reported stability values were used in this analysis. Because Cox proportional hazard models can handle censored data, these analyses include not only the dataset presented in Figure 3 but also extra censored patients' survival data. doi:10.1371/journal.pbio.0060170.t005

Aggregation propensity calculated from the Chiti-Dobson equation. The hydrophobicity, β-sheet propensity, and charge values for the amino acid residues were obtained from the Supplementary Information of [69]. When applying the AGADIR algorithm at http://www.embl-heidelberg.de/Services/serrano/agadir/agadir-start.html to obtain the α-helical propensities of wild-type (wt) and mutant (mut) human SOD1 ($P_{\alpha}^{wt}$ and $P_{\alpha}^{mut}$) for the $\Delta\Delta G_{coil-\alpha}$ calculation, the parameters of pH 7, 310 K, and ionic strength of 0.100 were used. For the protein human SOD1, the N terminus is acetylated, and the C terminus is free in vivo. After the prediction at the residue level was output, the value in the column "Hel" at a specific residue was taken as $P_{\alpha}$. If a value of 0 for $P_{\alpha}$ was obtained, then 0.1 was added to both $P_{\alpha}^{wt}$ and $P_{\alpha}^{mut}$ values for the correct mathematical meaning of $\ln(P_{\alpha}^{wt}/P_{\alpha}^{mut})$ (F. Chiti, personal communication). The Chiti-Dobson equation terms, Δhydrophobicity, Δcharge, $\Delta\Delta G_{coil-\alpha}$, and $\Delta\Delta G_{\beta-coil}$, were calculated based on equations illustrated in the legend of Table 1 of [69]. The $\ln(\nu_{mut}/\nu_{wt})$ values were calculated based on Equation 1 from [69] and normalized from 0 to 1 using the equation normalized aggregation propensity = (aggregation propensity before normalization − $\mathrm{MIN}_{ap}$)/($\mathrm{MAX}_{ap}$ − $\mathrm{MIN}_{ap}$), with $\mathrm{MIN}_{ap}$ and $\mathrm{MAX}_{ap}$ as the minimum and maximum aggregation propensities of fALS-causing mutations with known thermodynamic stabilities, respectively, so that the larger normalized values correlate to larger aggregation propensities.
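The calculation described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: the coefficients are the original Chiti-Dobson values quoted in the Results, the per-mutation input terms are assumed to have been computed beforehand as described in [69], and the function names and example numbers are hypothetical.

```python
import math

# Coefficients of the Chiti-Dobson equation as quoted in the Results section.
A_HYDR, B_STRUCT, C_CHARGE = 0.633, 0.198, -0.491


def helix_log_ratio(p_alpha_wt, p_alpha_mut):
    """ln(P_alpha_wt / P_alpha_mut), with the 0.1 offset rule for zero values.

    Following the text, 0.1 is added to BOTH values when a zero occurs, so
    that the logarithm is defined.
    """
    if p_alpha_wt == 0 or p_alpha_mut == 0:
        p_alpha_wt += 0.1
        p_alpha_mut += 0.1
    return math.log(p_alpha_wt / p_alpha_mut)


def aggregation_propensity(d_hydr, ddg_coil_alpha, ddg_beta_coil, d_charge):
    """Predicted ln(nu_mut / nu_wt) from precomputed per-mutation terms."""
    return (A_HYDR * d_hydr
            + B_STRUCT * (ddg_coil_alpha + ddg_beta_coil)
            + C_CHARGE * d_charge)


def min_max_normalize(values):
    """Rescale values to [0, 1]; requires at least two distinct values."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical term values for three mutations (not taken from the paper).
raw = [
    aggregation_propensity(1.2, 0.5, 0.3, 0.0),
    aggregation_propensity(-0.4, 0.0, 1.0, -1.0),
    aggregation_propensity(0.0, -0.2, 0.0, 1.0),
]
normalized = min_max_normalize(raw)
```

Swapping in the recalibrated slopes reported in the Figure 2 legend would only require changing the three constants.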
Normalized ΔΔG. The free energy change difference (ΔΔG) and melting temperature difference (ΔTm) of unfolding a pathogenic variant and wild-type protein are parameters used to characterize the thermodynamic stability of a protein. ΔΔG values were taken from Table 2 of [15]. To graph with other protein stability data, the ΔΔG values were normalized by applying the equation normalized ΔΔG = (ΔΔG values before normalization − $\mathrm{MIN}_{\Delta\Delta G}$)/($\mathrm{MAX}_{\Delta\Delta G}$ − $\mathrm{MIN}_{\Delta\Delta G}$), with $\mathrm{MIN}_{\Delta\Delta G}$ and $\mathrm{MAX}_{\Delta\Delta G}$ as the minimum and maximum values of ΔΔG in this dataset, respectively.
Normalized ΔTm. ΔTm values were taken from Table 1 of [86], Table 1 of [60], Table 2 of [87], Table 3 of [88], and Table II of [31]. ΔTm values from [60,87,88] were averaged for each mutation. Those results then were averaged with the ΔTm values from [31,86].
Thermodynamic instability of SOD1 variants. The instability values for SOD1 variants were obtained from the equation normalized instability = 1 − (average of normalized ΔΔG and normalized ΔTm) for each mutation, so instability values are simply (1 − normalized ΔΔG or ΔTm), and larger values correlate to less stable variants.
The normalized aggregation propensity and instability for each variant were summed and normalized to the range from 0 to 1 to consider the two factors together.
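As a rough illustration of how the two normalized factors might be combined, the sketch below computes instability as one minus the mean of whichever normalized stability values are available, then sums it with normalized aggregation propensity and rescales the sum to the range from 0 to 1. The function names and numbers are hypothetical, and the handling of variants with only one stability measurement is an assumption based on the wording of the instability definition.

```python
def min_max_normalize(values):
    """Rescale values to [0, 1]; requires at least two distinct values."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def instability(norm_ddg=None, norm_dtm=None):
    """1 - mean of the available normalized stability values.

    Either value may be missing (None) when a variant was characterized by
    only one method; this fallback is an assumption of the sketch.
    """
    available = [v for v in (norm_ddg, norm_dtm) if v is not None]
    if not available:
        raise ValueError("at least one normalized stability value is required")
    return 1.0 - sum(available) / len(available)


def combined_risk(norm_aggregation, instabilities):
    """Sum the two normalized factors per variant and rescale to [0, 1]."""
    sums = [a + i for a, i in zip(norm_aggregation, instabilities)]
    return min_max_normalize(sums)


# Hypothetical variants: normalized aggregation propensity and stabilities.
agg = [0.0, 0.6, 1.0]
inst = [instability(0.9, 1.0), instability(norm_dtm=0.5), instability(0.0, 0.2)]
risk = combined_risk(agg, inst)
```

Because both inputs share the same 0 to 1 scale, a plain sum gives each factor equal weight before the final rescaling.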

Supporting Information
Alternative Language Abstract S1. Translation of the Abstract into Chinese by Qi Wang
Found at doi:10.1371/journal.pbio.0060170.sd001 (62 KB PDF).

Figure S2. [...] The linear regressions presented in (D-F) were treated equally regardless of the number of patients for each mutation (unweighted) using the software SigmaPlot 9.0 (Systat Software, Inc.). The age of onset data presented in these six graphs are from 649 patients with 29 different fALS-causing SOD1 mutations with reported stability values. Aggregation propensity, instability, and sum of aggregation propensity and instability were obtained as described in the Materials and Methods section. Aggregation propensity, instability, or sum of aggregation propensity and instability has little or no correlation with patients' age of onset.
Found at doi:10.1371/journal.pbio.0060170.sg002 (261 KB AI).

Acknowledgments. [...] the manuscript. We thank J. Lani from Statistics Solutions for advice on statistical analysis. We especially thank the neurologists and patients worldwide for the patient outcome data used in this study, while noting that improving upon this study and extending this study to sporadic disease requires that physicians begin to include epidemiological information, including patient lifestyle and palliative care for all neurodegenerative disease patients.
Author contributions. JNA and NYRA conceived the hypotheses. QW and JNA designed the analysis. QW, JLJ, and JNA performed the analysis. QW and JNA wrote the paper.
Funding. This work was made possible by award W81XWH-04-0158 from the Department of Defense and grant 1392 from the ALS Association. NYRA was supported by a research award from the [...]

Table 6 (footnotes). The duration and onset data in bold were not used to calculate the average. a Survival data were taken from the time of first disease symptom until respiratory assistance was needed if clearly mentioned in reference. b When the number of patients was not clearly mentioned, a value of three was used as the estimated value. doi:10.1371/journal.pbio.0060170.t006