
Conceived and designed the experiments: NLK. Performed the experiments: NLK CJT. Analyzed the data: NLK CJT. Contributed reagents/materials/analysis tools: NLK CJT. Wrote the paper: NLK.

The authors have declared that no competing interests exist.

There have been several reports on the varying rates of progression among Alzheimer's Disease (AD) patients; however, there has been no quantitative study of the amount of heterogeneity in AD. Obtaining a reliable quantitative measure of AD progression rates and their variances among patients at each stage of AD is essential for evaluating the results of any clinical study. The Global Deterioration Scale (GDS) and Functional Assessment Staging procedure (FAST) characterize seven stages in the course of AD from normal aging to severe dementia. Each GDS/FAST stage has a published mean duration, but the variance is unknown. We use statistical analysis to reconstruct GDS/FAST stage durations in a cohort of 648 AD patients with an average follow-up time of 4.78 years. Calculations for GDS/FAST stages 4–6 reveal that the standard deviations of the stage durations are comparable with their mean values, indicating large variations in AD progression among patients. Such a degree of heterogeneity in the course of AD progression is consistent with the existence of several sub-groups of AD patients, which differ in their patterns of decline.

In recent decades, our understanding of Alzheimer's disease (AD) has increased; however, some basic questions remain unresolved. One of them is: how homogeneous is AD? Is the course of progression more or less the same for most patients, or are there large variations? Our paper studies a large cohort of AD patients drawn from a 23-year-long study and performs a statistical analysis of progression speed. We quantify the amount of spread in the durations of GDS/FAST stages (GDS/FAST is a staging system widely used by clinicians). We arrive at the striking conclusion that the mean length of AD stages is comparable with their standard deviation. This means that individual courses of AD progression may differ greatly from each other and from the textbook mean values. This has implications both for clinical trials (how do we assess whether a new drug is effective if the amount of natural spread is so large in untreated patients?) and for our understanding of this disease, which appears to comprise sub-diseases with different patterns of decline.

The temporal progression of Alzheimer's Disease (AD) shows a pattern of high variability, with patients transiting the stages of the disease having time-courses ranging from months to decades

Global Deterioration Scale (GDS) was proposed in

Functional Assessment Staging procedure (FAST) was proposed in Ref.

In the literature, the course of AD as characterized by GDS/FAST staging system has been described in quantitative terms. In particular, the stages are thought to follow in a sequential fashion and are characterized by certain stage durations

While this quantification is a useful diagnostic tool, it reflects the average course of the disease and provides no information about possible heterogeneity of AD progression. At the same time, quantifying the variance of GDS/FAST stage durations is essential, as one needs to compare the delay gained by a treatment strategy with the amount of natural variation in stage durations, to be able to judge whether there is significance to any improvements observed. In this paper we investigate the heterogeneity of AD by studying the distribution of GDS/FAST stage durations of AD patients. We ask: how much variability is there in the course of AD, and how well do the average values for GDS/FAST stage durations reflect the disease course of individual patients?

The estimates for the cumulative probability distributions of GDS/FAST stage durations are presented in

The black bars represent GDS stages, and the gray bars – FAST stages. The mean stage values reported in

A striking observation can be made by looking at the calculated values for the standard deviations of the stage durations. In

Analysis of a large longitudinal dataset has revealed a significant degree of variation in the lengths of GDS/FAST stages 4–6 of AD. In particular, the calculated standard deviations for GDS/FAST stage durations turned out to have values similar to their mean durations. This is an indication that the patterns of cognitive and functional decline vary significantly from patient to patient.

The suggestion that AD is a genuinely heterogeneous disease has been made in the literature

The patient data used here come from a longitudinal study conducted between 1983 and 2006. It is theoretically possible that the large variation observed in the cohort of patients is a consequence of a change in lifestyle factors, which affected the course of AD progression. To explore this possibility, we split the cohort of patients into two subgroups based on their dates of visit, and calculated the statistics of stage durations for both the “earlier” and the “later” parts of the cohort. We found that within the subgroups, the variances of the stage durations were as large as the ones reported here, and further, the mean values of stage durations were not significantly different.

Note, however, that the analysis performed here was not specifically designed to discern slight trends in disease progression over the decades. We cannot perform such an analysis with the data at hand because of data scarcity (using smaller sub-groups of patients necessarily jeopardizes the reliability of the statistics). More data would be needed to detect trends related to changes in lifestyle and other generational effects. Here we could only conclude that in both the early and late halves of the cohort, the variances were large and the stage durations were not statistically different.

Given the high variability of progression patterns, an important question is which variables correlate with progression rates. We have attempted to relate the rate of progression to demographic factors, and to determine whether it correlates with age at baseline, sex, education, or the age of onset of AD (which was back-calculated by using the information on the estimated stage durations). No significant correlations with these factors have been found, which is consistent with several previous papers

Our main finding is the large heterogeneity in the duration of GDS/FAST stages in AD, which is consistent with the reports cited above. Our methods, however, are very different. In this study we use a very extensive (23-year-long) longitudinal dataset of AD patients, in which patients at GDS/FAST stages 4–7 of AD are represented. We calculate the amount of variance among patients explicitly, and demonstrate a large spread in the GDS/FAST stage durations for stages 4, 5, and 6. There are several applications of our results.

Most immediately, having standard deviation values (and not just the mean values) for GDS/FAST stage durations is important for those scientists and clinicians who use the GDS/FAST staging system.

Such large values of variance in GDS/FAST stage durations caution against interpreting the GDS/FAST system as a prognostic tool: the course of decline of individual patients can be very different from the mean.

The estimates of GDS/FAST stage durations calculated from such an extensive longitudinal dataset show the amount of heterogeneity in the course of progression of AD. This is consistent with the existence of several sub-groups of AD patients, which differ in their patterns of decline, see also

Knowledge of stage durations together with their natural variance is a necessary tool for clinical trials. It allows quantitative judgments to be made about the efficacy of new drugs.
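To illustrate why the variance matters for trial design, the following minimal sketch uses the textbook normal-approximation formula for comparing two group means, n = 2((z_{1-α/2} + z_{power}) · SD / delay)^2 per arm. It is not the analysis performed in this paper: it ignores censoring, the function name `patients_per_arm` is hypothetical, and the numerical inputs are placeholders rather than values estimated from the cohort.

```python
import math
from statistics import NormalDist


def patients_per_arm(sd_years, delay_years, alpha=0.05, power=0.8):
    """Approximate number of patients per arm needed to detect a mean
    delay of `delay_years` in a stage duration whose natural standard
    deviation is `sd_years` (two-sided test, equal arms, normal
    approximation, no censoring)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) * sd_years / delay_years) ** 2)


# Placeholder inputs (assumptions, not results from the paper):
# a hoped-for delay of half a year, with two candidate spreads.
for sd in (1.0, 2.0):
    print(sd, patients_per_arm(sd_years=sd, delay_years=0.5))
```

Because the required sample size grows with the square of the standard deviation, doubling the natural spread roughly quadruples the number of patients needed to detect the same delay.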

To conclude, we analyzed a longitudinal dataset to extract the mean and the standard deviation of GDS/FAST stage durations for stages 4–6 of AD. Applying similar methodology to larger datasets with more frequent assessments should yield more accurate estimates.

In order to calculate the probability distribution of stage durations in AD, we used a longitudinal dataset of AD patients, which is an outcome of a longitudinal study performed between the years 1983 and 2006

(a) A histogram showing the number of records per patient. (b) A histogram showing patient inter-visit times.
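The per-patient quantities behind such histograms can be derived directly from raw visit records. The sketch below assumes a hypothetical input format, a list of `(patient_id, visit_time_in_years)` pairs, and a hypothetical helper name `visit_summaries`; the toy records are made up for illustration and are not taken from the study.

```python
from collections import defaultdict
from statistics import mean, stdev


def visit_summaries(records):
    """Return (records per patient, all inter-visit times,
    total observation time per patient) from (patient_id, time) pairs."""
    by_patient = defaultdict(list)
    for patient_id, time in records:
        by_patient[patient_id].append(time)

    counts, gaps, spans = {}, [], {}
    for patient_id, times in by_patient.items():
        times.sort()
        counts[patient_id] = len(times)
        # Time between consecutive visits of the same patient.
        gaps.extend(later - earlier for earlier, later in zip(times, times[1:]))
        # Total observation time: first visit to last visit.
        spans[patient_id] = times[-1] - times[0]
    return counts, gaps, spans


# Toy records (illustrative only).
records = [("a", 0.0), ("a", 2.0), ("a", 5.0), ("b", 1.0), ("b", 4.0)]
counts, gaps, spans = visit_summaries(records)
print(counts, sorted(gaps))
print(mean(spans.values()), stdev(spans.values()))
```

The mean and standard deviation of the per-patient spans correspond to the kind of "mean ± SD" summary of total observation time quoted for the cohort.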

Extracting accurate estimates of the standard deviations from longitudinal datasets is complicated by the practical realities of how the data are collected. First, we only know the current stage at the times of assessment; we have no information on when each stage actually starts, or when it ends and the next one begins (in other words, the data are left- and right-censored). A further complication comes from the fact that the patients' total observation time (time from first to last visit) was 4.78 ± 2.94 years, see the histogram of

|   | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|
| 3 | 3/0 | 8/1 | 8/1 | 7/0 | 3/0 |
| 4 | 0 | 50/73 | 58/90 | 108/88 | 75/81 |
| 5 | 0 | 0 | 33/34 | 98/95 | 60/90 |
| 6 | 0 | 0 | 0 | 58/23 | 67/61 |
| 7 | 0 | 0 | 0 | 0 | 12/11 |

Analysis of long, multistage disease processes has been addressed in the literature in many different contexts

We view the beginning and the end of each stage as censored events. For each stage, the beginning lies in an interval [X_{L}, X_{R}] and the end lies in an interval [Z_{L}, Z_{R}]
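As an illustration of how such censoring intervals arise from visit data, the sketch below computes bounds on the beginning of a stage (written `x_l`, `x_r`, mirroring X_L and X_R) and on its end (`z_l`, `z_r`, mirroring Z_L and Z_R) for a single patient. It is only a sketch under stated assumptions: stages are assumed never to decrease between visits, the function names and the toy visit history are hypothetical, and this is not the iterative estimation procedure used in the paper.

```python
import math


def censoring_intervals(visits, stage):
    """visits: list of (time_in_years, observed_stage) for one patient.
    Returns ((x_l, x_r), (z_l, z_r)): bounds on when `stage` began and
    when it ended. Unknown bounds are reported as -inf or +inf."""
    before = [t for t, s in visits if s < stage]
    at_or_after = [t for t, s in visits if s >= stage]
    at_or_before = [t for t, s in visits if s <= stage]
    after = [t for t, s in visits if s > stage]

    x_l = max(before) if before else -math.inf            # last visit in an earlier stage
    x_r = min(at_or_after) if at_or_after else math.inf   # first visit in this stage or later
    z_l = max(at_or_before) if at_or_before else -math.inf  # last visit not yet past this stage
    z_r = min(after) if after else math.inf               # first visit in a later stage
    return (x_l, x_r), (z_l, z_r)


def duration_bounds(onset, end):
    """Shortest and longest stage duration compatible with the intervals."""
    (x_l, x_r), (z_l, z_r) = onset, end
    return max(0.0, z_l - x_r), z_r - x_l


# Toy visit history (illustrative only): stage 4 seen twice, then 5, 5, 6.
visits = [(0.0, 4), (1.5, 4), (3.0, 5), (4.2, 5), (6.0, 6)]
onset, end = censoring_intervals(visits, stage=5)
print(onset, end)                   # (1.5, 3.0) (4.2, 6.0)
print(duration_bounds(onset, end))
```

For the first stage observed in this toy history (stage 4) the lower bound on the beginning is unknown, and for the last one (stage 6) the upper bound on the end is unknown, so the longest compatible duration is infinite in both cases; this is the left- and right-censoring that an estimation procedure has to account for.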

We used the iterative approach developed in

Supporting information.

(PDF)