Identification of preclinical dementia according to ATN classification for stratified trial recruitment: A machine learning approach

  • Ivan Koychev,

    Roles Conceptualization, Project administration, Supervision, Writing – original draft, Writing – review & editing

    ivan.koychev@psych.ox.ac.uk

    Affiliation Department of Psychiatry, University of Oxford, Oxford, United Kingdom

  • Evgeniy Marinov,

    Roles Formal analysis, Software, Visualization, Writing – original draft, Writing – review & editing

    Affiliation Big Data for Smart Society (GATE) Institute, Sofia University, Sofia, Bulgaria

  • Simon Young,

    Roles Writing – original draft, Writing – review & editing

    Affiliation Department of Psychiatry, University of Oxford, Oxford, United Kingdom

  • Sophia Lazarova,

    Roles Data curation, Writing – original draft, Writing – review & editing

    Affiliation Big Data for Smart Society (GATE) Institute, Sofia University, Sofia, Bulgaria

  • Denitsa Grigorova,

    Roles Data curation, Writing – original draft, Writing – review & editing

    Affiliations Big Data for Smart Society (GATE) Institute, Sofia University, Sofia, Bulgaria, Faculty of Mathematics and Informatics, Sofia University, Sofia, Bulgaria

  • Dean Palejev

    Roles Conceptualization, Methodology, Project administration, Software, Supervision, Writing – original draft, Writing – review & editing

    Affiliations Big Data for Smart Society (GATE) Institute, Sofia University, Sofia, Bulgaria, Institute of Mathematics and Informatics, Bulgarian Academy of Sciences, Sofia, Bulgaria

Abstract

Introduction

The Amyloid/Tau/Neurodegeneration (ATN) framework was proposed to identify the preclinical biological state of Alzheimer’s disease (AD). We investigated whether ATN phenotype can be predicted using routinely collected research cohort data.

Methods

927 EPAD LCS cohort participants free of dementia or Mild Cognitive Impairment were separated into 5 ATN categories. We used machine learning (ML) methods to identify a set of significant features separating each neurodegeneration-related group from controls (A-T-(N)-). Random Forest and linear-kernel SVM with stratified 5-fold cross-validation were used to optimize the models, whose performance was then tested in the ADNI database.

Results

Our optimal results outperformed ATN cross-validated logistic regression models by between 2.2% and 8.3%. The optimal feature sets were not consistent across the 4 models with the AD pathologic change vs controls set differing the most from the rest. Because of that we have identified a subset of 10 features that yield results very close or identical to the optimal.

Discussion

Our study demonstrates the gains offered by ML in generating ATN risk prediction over logistic regression models among pre-dementia individuals.

Introduction

It is now widely recognized that dementia is a clinical syndrome that is a complication of a variety of pathophysiological processes that typically run a lengthy preclinical course. The Amyloid/Tau/Neurodegeneration (ATN) biomarker framework [1] was proposed as a means of evidencing the biological state of Alzheimer’s disease (AD) and non-AD pathophysiology that is independent of clinical manifestation. This novel definition of neurodegenerative states has operationalized the recent shift in interventional dementia trials from the syndromal stages of disease to biologically defined preclinical states [2].

The scalability of the ATN framework that is required for large interventional trials is limited by the availability, invasiveness and cost of the biomarker investigations. Prediction of ATN status through known risk factors [3] and dementia scores [4] may help address this limitation through empowering targeted recruitment. A substantial body of literature exists demonstrating the utility of both regression and machine learning based methods to detect amyloid positivity in cognitively healthy individuals, highlighting the role of APOE4 carriership, demographic factors [5,6] as well as neurodegeneration plasma biomarkers [7,8]. ATN status determination has the additional benefit of indicating the stage of disease (i.e. presence of neurodegeneration) as well as identifying non amyloid driven pathology. We have previously shown that ATN prediction is feasible using regression-based modelling [9]. We found that continuous constitutional and modifiable risk factors for late-life dementia outperform the best established future dementia risk scores (e.g. Cardiovascular Risk Factors, Aging and Dementia (CAIDE) [4] and Framingham risk scores [10,11]).

In this study, we sought to explore whether further gains in ATN prediction could be achieved by artificial intelligence in the form of machine learning (ML) modelling relative to the best regression models. The rationale for this approach is the evidence that data-driven approaches such as ML can outperform classical statistical methods in the field of diagnostics. In addition, such algorithms can continuously ‘learn’ and improve over time as data accumulate. We approached the research question by generating the ATN ML model in the European Prevention of Alzheimer’s Dementia Longitudinal Cohort Study (EPAD LCS) dataset [12] in order to compare it with our previous regression model [9], and then validating it in the Alzheimer’s Disease Neuroimaging Initiative (ADNI) dataset [13].

Methods

Study design and aims

The present study extends previous work [9] that identified 7 risk factors for AD in relation to prediction of ATN pathology in EPAD. We aimed to improve on the previously reported results by using machine learning (ML) methods instead of logistic regression. We planned to do this by applying feature selection methods aimed at identifying additional significant factors for prediction of ATN pathology. Finally, we planned to validate the selected features on independent data from ADNI.

Study cohorts

Training and testing dataset: EPAD LCS.

EPAD LCS is a multicenter, longitudinal cohort study recruiting participants across 21 European sites [12]. We used its V1500.0 dataset to obtain data from 1500 adult participants aged over 50 years. To replicate the previous study [9], we excluded 82 individuals due to a diagnosis of dementia or MCI, and an additional 171 participants were excluded due to a Clinical Dementia Rating (CDR) score ≥ 0.5. A further 320 participants were omitted due to missing data. The sample sizes of the groups used in this work are identical to those reported for the logistic regression analysis published previously [9] (legend of Table 4 of the original publication), resulting in a total of 927 individuals (age 64.5 ± 6.77 years, 58.4% female).

Validation dataset: ADNI.

The Alzheimer’s Disease Neuroimaging Initiative (ADNI) is a multicenter longitudinal study launched in 2004. ADNI follows a population of volunteers classified as either cognitively normal (CN), living with significant memory concern (SMC), mild cognitive impairment (MCI), or Alzheimer’s disease (AD). Full details and protocols can be found at http://adni.loni.usc.edu/.

Out of 3285 ADNI participants, 2578 participants had missing data and were omitted. This led to a sample of 709 individuals (72.2 ± 7.27 years, 47% female) that consisted of 109 (15.37%) AD, 143 (20.17%) CN, 367 (51.76%) MCI and 90 (12.69%) SMC individuals. As applying the exclusion criteria used for the EPAD dataset (i.e. exclusion of all MCI or dementia cases) produced a dataset of insufficient size, we opted not to apply them.

As ADNI was a validation dataset, we selected features that we found to be significant in predicting ATN pathology in EPAD. Due to the established relevance of plasma biomarkers in AD prediction [14], we also included concentrations of plasma phosphorylated tau181 (p-tau181).

Ethics.

The EPAD and ADNI studies received appropriate ethical approval in line with the Declaration of Helsinki. Participants provided written consent prior to any study procedures.

Study assessments

Constitutional risk factors.

Constitutional factors included age, gender, years of education and family history of dementia. Family history of dementia was included as a binary variable showing the presence or absence of diagnosed parental dementia for ADNI and first-degree relative dementia for EPAD.

Cardiovascular variables.

Systolic blood pressure (BP) and body-mass index (BMI, derived from height and weight) were included as continuous variables for both ADNI and EPAD cohorts. Smoking score was encoded as a categorical variable with 5 levels (Table A from S1 Appendix) derived from gender, age and smoking status.

The EPAD sample also includes physical activity encoded as an ordinal variable with 6 levels (Table B from S1 Appendix).

For both EPAD and ADNI cohorts the proportion of white matter lesion (WML) volume per whole brain volume was derived from volumetric MRI data. Brain imaging protocols for EPAD have been previously described [15] and further details on protocols and procedures for ADNI can be found here http://adni.loni.usc.edu/methods/documents/.

Cognition.

Global cognitive function was derived from total Mini-Mental State Examination (MMSE) score [16]. Performance scores from Repeatable Battery for the Assessment of Neuropsychological Status (RBANS) tests were also included in the EPAD cohort [9]. For the purpose of this study our ADNI sample contained only the results from a semantic fluency assessment task that was identical to the one featured in RBANS.

APOE4 carriership.

Genetic risk was encoded as a binary variable denoting whether a participant was an APOE4 carrier or not. An individual was designated an APOE4 carrier if they had one or more copies of allele 4. A non-carrier was considered to be an individual who had no allele 4 in their genotype. Details on apolipoprotein E (APOE ε4) genotyping methods and collection protocols for EPAD and ADNI can be found at https://ep-ad.org/ and http://adni.loni.usc.edu/, respectively.

CSF Biomarkers.

Cerebrospinal fluid (CSF) concentrations of amyloid beta 42 (Aβ42) and phosphorylated tau (p-Tau) were included in both samples. For the ADNI sample we also included CSF concentrations of total tau (t-Tau) as a neurodegeneration marker [17]. The cut-offs defining Aβ42 pathology (A+) and p-Tau pathology (T+) for EPAD were < 1025 pg/ml and > 24 pg/ml, respectively [9]. These cut-offs were derived by the authors using Gaussian Mixture Modeling. For ADNI we used cut-offs that were previously validated against amyloid PET visual read: (A+) < 1000 pg/ml, (T+) > 27 pg/ml and (N+) > 300 pg/ml ([18–20], Clinical Values sections).

Neurodegeneration biomarker.

For the EPAD cohort neurodegeneration was estimated with the Scheltens’ visual rating scale [9]. Scheltens’ scale is widely used to assess medial temporal atrophy from structural MRI images [21]. Neurodegenerative pathology (N+) was defined using the following cut-offs: scores > 1 for participants less than 65 years old, > 1.5 for 65 to 74-year-olds, and > 2 for participants older than 75 years [22].

The presence of neurodegeneration in the ADNI sample was inferred from the CSF concentrations of t-Tau. Neurodegenerative pathology (N+) was defined as t-Tau > 300 pg/ml [20].
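The cut-offs above can be written down as a small sketch. The function and argument names are hypothetical and are not taken from the study code. Two points are assumptions on our part: participants aged exactly 75 are placed in the oldest band (the text gives "65 to 74" and "older than 75"), and the Scheltens score is passed in as a single number per participant.

```python
def epad_atn_flags(abeta42_pg_ml, ptau_pg_ml, scheltens_score, age_years):
    """Return (A+, T+, N+) booleans using the EPAD cut-offs quoted in the text."""
    a_pos = abeta42_pg_ml < 1025.0   # low CSF Abeta42 marks amyloid pathology
    t_pos = ptau_pg_ml > 24.0        # high CSF p-Tau marks tau pathology
    # Age-dependent Scheltens cut-off; age 75 is assigned to the oldest band (assumption).
    if age_years < 65:
        n_cut = 1.0
    elif age_years < 75:
        n_cut = 1.5
    else:
        n_cut = 2.0
    n_pos = scheltens_score > n_cut
    return a_pos, t_pos, n_pos


def adni_atn_flags(abeta42_pg_ml, ptau_pg_ml, ttau_pg_ml):
    """Return (A+, T+, N+) booleans using the ADNI cut-offs quoted in the text."""
    return abeta42_pg_ml < 1000.0, ptau_pg_ml > 27.0, ttau_pg_ml > 300.0


# Illustrative values only, not study data.
print(epad_atn_flags(900.0, 30.0, 1.5, 70))
print(adni_atn_flags(1200.0, 20.0, 350.0))
```

Note that the same participant could be N+ under one definition and N− under the other, because EPAD uses an imaging-based score and ADNI a CSF concentration.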

Plasma biomarkers.

A constantly accumulating body of literature suggests that levels of p-tau181 measured in blood plasma might be useful as a biomarker for AD-related neurodegeneration [23–26]. Higher levels of p-tau181 in plasma have been shown to correlate with neurodegeneration and even to predict decline in aging and Alzheimer’s disease [23,25,27]. Furthermore, measuring p-tau181 in blood plasma is much more cost-effective and less invasive than a lumbar puncture or a neuroimaging procedure.

ATN framework.

Participants were categorized into five subgroups in accordance with the ATN Framework [1]:

  1. Normal AD biomarkers: A−T−(N)−
  2. Alzheimer’s pathologic change: A+T−(N)−
  3. Alzheimer’s disease: A+T+(N)±
  4. Alzheimer’s and concomitant non-Alzheimer’s pathologic change: A+T−(N)+
  5. Non-AD pathologic change: A−T±(N)+; A−T+(N)−
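The five categories listed above cover all eight combinations of A, T and N status. A minimal sketch of the mapping follows; the function name and the returned label strings are our own illustrative choices, not identifiers from the study code.

```python
from itertools import product


def atn_group(a_pos: bool, t_pos: bool, n_pos: bool) -> str:
    """Map binary A/T/(N) status to one of the five ATN categories listed above."""
    if a_pos:
        if t_pos:
            # A+T+(N)±: neurodegeneration status does not change the label
            return "Alzheimer's disease"
        if n_pos:
            # A+T-(N)+
            return "Alzheimer's and concomitant non-Alzheimer's pathologic change"
        # A+T-(N)-
        return "Alzheimer's pathologic change"
    if t_pos or n_pos:
        # A-T±(N)+ or A-T+(N)-
        return "Non-AD pathologic change"
    # A-T-(N)-
    return "Normal AD biomarkers"


# Enumerate every profile to confirm that each one receives exactly one label.
for a, t, n in product([False, True], repeat=3):
    print(a, t, n, "->", atn_group(a, t, n))
```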

Analysis

Feature and model selection.

We used Information Value (IV) and Weight of Evidence (WOE) [28,29] to determine the most significant features from the EPAD data set. The WOE measures the predictive ability of a feature with respect to the dependent variable. The values of a variable are encoded into discrete categories, and a WOE is computed for each category. The formulas are shown in [28], Section 2 and the heuristics are described in [29]. The larger the absolute value of WOE, the more discriminative the corresponding category is among the values of the considered variable with respect to the dependent variable. An important assumption is that the dependent variable should be binary, denoting the occurrence or non-occurrence of an event. The formulas for IV are again shown in [28], Section 2, and we have used a modified and extended version of the implementations from [30]. Based on these methods, we selected a set of 13 features, including the 7 used in [9] (female, age years, family history, APOE4, Body Mass Index (BMI), white matter lesion volume, MMSE score) and 6 that were not previously included: years of education, RBANS semantic fluency, RBANS delayed memory index, smoking score (ever smoked) Framingham I, physical activity, systolic blood pressure. Summaries of these 13 features from the EPAD and two ADNI datasets that we consider (see the Validation Procedure section) are shown in Table 1. All of these features with the exception of RBANS delayed memory index and physical activity were also present in the ADNI dataset.
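For readers unfamiliar with WOE and IV, the sketch below uses one common textbook convention: for each category, WOE is the log ratio of the share of events to the share of non-events, and IV sums the difference of the two shares weighted by WOE. This is not necessarily the exact variant of [28] or the implementation from [30]; in particular the smoothing constant `eps`, which avoids division by zero for empty cells, is our assumption. The toy data are made up.

```python
import math
from collections import Counter


def woe_iv(categories, outcomes, eps=0.5):
    """Return ({category: WOE}, IV) for a discretized feature and a binary outcome (1 = event)."""
    events = Counter(c for c, y in zip(categories, outcomes) if y == 1)
    non_events = Counter(c for c, y in zip(categories, outcomes) if y == 0)
    levels = sorted(set(categories))
    k = len(levels)
    total_events = sum(events.values())
    total_non_events = sum(non_events.values())
    woe, iv = {}, 0.0
    for level in levels:
        # Smoothed share of all events / non-events that fall in this category
        p_event = (events[level] + eps) / (total_events + eps * k)
        p_non_event = (non_events[level] + eps) / (total_non_events + eps * k)
        woe[level] = math.log(p_event / p_non_event)
        iv += (p_event - p_non_event) * woe[level]
    return woe, iv


# Made-up example: a binary feature against a binary outcome.
feature = ["carrier", "carrier", "carrier", "non-carrier", "non-carrier", "non-carrier"]
outcome = [1, 1, 0, 0, 0, 1]
woe, iv = woe_iv(feature, outcome)
print(woe, iv)
```

Under this convention the sign of WOE depends on which class is called the event, while IV is unaffected by that choice, which is why IV is convenient for ranking features.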

Table 1. Descriptive statistics (mean±sd for continuous variables, percentage for binary) of the variables included in the models for European Prevention of Alzheimer’s Dementia Longitudinal Cohort Study (EPAD) and Alzheimer’s Disease Neuroimaging Initiative (ADNI).

https://doi.org/10.1371/journal.pone.0288039.t001

In addition to the logistic regression considered in [9], we also applied several tree-based and kernel-based methods. The tested ML methods include eXtreme Gradient Boosting (XGBoost [31,32]) and the following methods implemented in the scikit-learn Python library [33]: random forest (RandomForest [34,35] and [36], Chapter 15), extremely randomized trees (ExtraTrees [37]), Bagging [38], adaptive boosting (AdaBoost [39] and [36], Chapter 10) and Support Vector Machines with different kernels ([36], Chapter 6). We used stratified 5-fold cross-validation to balance model complexity against the quality of predictions and to avoid overfitting ([36], p. 596; p. 613). Cross-validation addresses the bias-variance tradeoff in ML by running the same model on different splits of the same data into training and test sets [40]. Stratified cross-validation is needed because of the imbalanced classes in some of the comparisons.
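As an illustration of the setup described above, the following sketch runs a random forest under stratified 5-fold cross-validation in scikit-learn and scores each fold by AUC. The data are synthetic, and the hyperparameter values and class imbalance are arbitrary assumptions rather than the settings used in the study.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic, imbalanced stand-in for one "pathology vs controls" comparison.
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=4,
    weights=[0.8, 0.2], random_state=0,
)

# Stratification keeps the class ratio roughly equal in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
model = RandomForestClassifier(n_estimators=200, random_state=0)

# One AUC per held-out fold.
fold_auc = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
print(fold_auc, fold_auc.mean())
```

Swapping `RandomForestClassifier` for `sklearn.svm.SVC(kernel="linear")` would give the linear-kernel SVM variant under the same cross-validation scheme.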

Similarly to [9] we derive models for 4 comparisons, each classifying a specific ATN pathology against normal controls. As shown in Table C in S1 Appendix, each optimal model utilizes a different subset of the 13 most informative features as predictors. However, for utility purposes, we prefer models that have the same subset of predictor variables. Therefore, we applied a method based on lattices in order to obtain such a minimal set of features. Through the lattice method we identified 10 features (the 13 without the following three: RBANS delayed memory index, physical activity, smoking score) as the optimal choice unifying the feature sets for all 4 models. Interestingly, these ten features are also present in the ADNI dataset.

The optimal methods and features were determined by maximizing the Area Under the Curve (AUC) [41].

To optimize computational time and resources, we used a two-step process. In the first step we ran 1 000 iterations and selected the 100 combinations of features and hyperparameters with the highest AUC for each group and type of model. Only these best initial combinations were carried into the second step, which used 10 000 iterations to find the optimal models. We computed the AUC on every fold of the 5-fold cross-validation (CV). The final receiver operating characteristic (ROC) [42] curve is the mean of the separate ROC curves from each fold. To explore the variation in the estimates we also computed a 95% confidence interval (CI) for the AUC metric for each of the 5 CV folds.
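Averaging per-fold ROC curves requires putting them on a common axis first. The sketch below shows one standard way to do this: interpolate each fold's true-positive rate onto a shared false-positive-rate grid, average, and integrate with the trapezoidal rule. The grid size and helper names are our assumptions, the helper does not treat tied scores specially, and this is not necessarily how the study computed its mean curves.

```python
import numpy as np


def roc_points(y_true, scores):
    """Return (fpr, tpr), sweeping the decision threshold from high to low scores."""
    y_true = np.asarray(y_true)
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    y_sorted = y_true[order]
    tps = np.cumsum(y_sorted == 1)   # true positives accepted so far
    fps = np.cumsum(y_sorted == 0)   # false positives accepted so far
    tpr = np.concatenate([[0.0], tps / tps[-1]])
    fpr = np.concatenate([[0.0], fps / fps[-1]])
    return fpr, tpr


def mean_roc(fold_results, grid_size=101):
    """Average per-fold ROC curves on a common FPR grid; return (grid, mean_tpr, auc)."""
    grid = np.linspace(0.0, 1.0, grid_size)
    tprs = []
    for y_true, scores in fold_results:
        fpr, tpr = roc_points(y_true, scores)
        interp_tpr = np.interp(grid, fpr, tpr)
        interp_tpr[0] = 0.0          # every ROC curve starts at the origin
        tprs.append(interp_tpr)
    mean_tpr = np.mean(tprs, axis=0)
    mean_tpr[-1] = 1.0               # and ends at (1, 1)
    auc = float(np.sum(np.diff(grid) * (mean_tpr[1:] + mean_tpr[:-1]) / 2.0))
    return grid, mean_tpr, auc


# Two made-up folds of held-out labels and predicted scores.
folds = [
    ([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9]),
    ([0, 1, 0, 1], [0.3, 0.6, 0.4, 0.7]),
]
grid, mean_tpr, auc = mean_roc(folds)
print(auc)
```

Because the curve is sampled on a finite grid, the AUC of the mean curve is slightly below 1 even for perfectly separated folds; a finer grid reduces this effect.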

Validation procedure.

First, we created a subset of the ADNI data using identical filtering as the original EPAD data, excluding AD and MCI patients (‘ADNI reduced’). It included controls (n = 112) and a small number of patients with ATN pathology (n = 60 Alzheimer’s pathologic change, n = 24 Alzheimer’s disease, n = 37 Non-AD pathologic change). Because of the small sample size, the AUC results with and without cross validation were not reliable and therefore could not be used to validate the EPAD-based model. Because of this, we validated the models using the larger ADNI dataset with 709 individuals that did not exclude individuals diagnosed with AD or MCI. As there were significant differences between the distributions of the features in the EPAD and ADNI datasets, we could not use the EPAD-derived models on ADNI data directly. Instead, we validated our feature selection by taking the features and hyper-parameters of the optimal EPAD models (e.g. number of estimators, maximal depth of tree, minimal number of samples required to split an internal node, minimal impurity decrease) and training the models on the ADNI data using stratified 5-fold cross-validation.
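The distinction between reusing a fitted model and reusing only its features and hyper-parameters can be made concrete with a short scikit-learn sketch. Everything here is a stand-in: the two datasets are synthetic, the hyper-parameter values are arbitrary, and the variable names are hypothetical. `sklearn.base.clone` copies the hyper-parameters of an estimator but none of its fitted state.

```python
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-ins for a derivation cohort and a differently distributed validation cohort.
X_deriv, y_deriv = make_classification(n_samples=300, n_features=10, random_state=1)
X_valid, y_valid = make_classification(n_samples=300, n_features=10, shift=0.5, random_state=2)

# Pretend these hyper-parameters were found to be optimal on the derivation cohort.
tuned = RandomForestClassifier(
    n_estimators=100, max_depth=4, min_samples_split=10,
    min_impurity_decrease=0.001, random_state=0,
).fit(X_deriv, y_deriv)

# Keep the configuration, discard the fitted trees, and retrain within the validation cohort.
fresh = clone(tuned)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
valid_auc = cross_val_score(fresh, X_valid, y_valid, cv=cv, scoring="roc_auc")
print(valid_auc.mean())
```

This checks whether the selected features and configuration carry signal in the second cohort, which is a weaker claim than showing that the fitted derivation model transfers unchanged.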

Results

According to the ATN-defined criteria all participants were distributed in one of five groups: Normal AD biomarkers, Alzheimer’s pathologic change, Alzheimer’s disease, Alzheimer’s and concomitant non-Alzheimer’s pathologic change and non-AD pathologic change. The final distribution of both EPAD and ADNI participants is shown in Table 2.

Table 2. Distribution of participants according to the ATN criteria.

https://doi.org/10.1371/journal.pone.0288039.t002

We identified two tree-based methods, random forest and XGBoost, that consistently maximized the AUC for the EPAD data, with random forest performing better in most cases. When subsequently analyzing the ADNI data, we found that linear-kernel SVM outperforms logistic regression and, in almost all cases, random forest and XGBoost. The logistic regression work published previously [9] did not use cross validation. We therefore calculated the AUC for logistic regression with cross validation using the original 7 parameters, and compared both these values and the original results without cross validation to our results, which use cross validation as is typical for ML-based methods. For the EPAD data, we were able to either outperform or achieve similar results to the logistic regression with cross validation. In most cases we were even able to outperform or match the logistic regression results without cross validation. The improvements ranged from 2.2% to 8.3% in absolute terms as shown in Table 3 and Figs 1 and 2. We should note that the 2.2% increase represents almost 20% of the gap to the theoretically possible maximum. The results for XGBoost are not shown, as they were similar to those for random forest.
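The "share of the gap" figure is the absolute improvement divided by the distance between the baseline AUC and the maximum of 1. The sketch below works this through with an illustrative baseline of 0.89; this value is an assumption chosen to match the AUC quoted for the earlier regression model in the Discussion, and the exact baselines are those in Table 3.

```python
def share_of_gap_closed(baseline_auc, improved_auc, ceiling=1.0):
    """Fraction of the distance from the baseline AUC to the ceiling covered by the improvement."""
    return (improved_auc - baseline_auc) / (ceiling - baseline_auc)


# Illustrative numbers: a 0.022 absolute gain on a baseline of 0.89 leaves a gap of 0.11.
share = share_of_gap_closed(0.89, 0.89 + 0.022)
print(f"{share:.0%}")
```

The same absolute gain therefore counts for more when the baseline is already high: 2.2 points on a baseline of 0.60 would close only about 5.5% of the remaining gap.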

Fig 1. ROC curves and AUC metrics for EPAD dataset using 10 features (female, age years, family history, APOE4, BMI, white matter lesion volume, MMSE score, years of education, RBANS semantic fluency, systolic blood pressure).

Each pane shows one comparison indicated on the top of the pane.

https://doi.org/10.1371/journal.pone.0288039.g001

Fig 2. ROC curves and AUC metrics for ADNI dataset using 10 features (female, age years, family history, APOE4, BMI, white matter lesion volume, MMSE score, years of education, RBANS semantic fluency, systolic blood pressure).

Each pane shows one comparison indicated on the top of the pane. The bottom-left pane is empty due to insufficient sample size.

https://doi.org/10.1371/journal.pone.0288039.g002

Table 3. AUC metric for the corresponding groups and classifiers based on the original 7 features, the selected 10 optimal minimal subset of features and all 13 selected significant features.

https://doi.org/10.1371/journal.pone.0288039.t003

Although there are no universal AUC thresholds for determining the quality of a binary classification, the original logistic regression with 7 features for the Alzheimer’s disease vs normal AD biomarkers groups comparison yields an AUC that is very close to the top of the “excellent” discrimination band as defined previously [43]. Nevertheless, with the original 7 features we improved the already well-performing AUC scores for both random forest and linear-kernel SVM, when using cross validation. Adding additional features resulted in AUC scores for the same comparison being in the highest band, labeled “outstanding” [43].

Using the original 7 features with cross-validation we improved by about 4.5% the comparisons between AD and non-AD pathologic change, and non-AD pathologic change on one hand and normal AD biomarkers on the other. In these cases, the logistic regression results were within the “poor” discrimination band [43] and we were able to achieve results that are well within or at least practically within the “acceptable” bracket [43]. When comparing Alzheimer’s pathologic change to Normal AD biomarkers using random forest we were able to achieve similar results to the logistic regression when used with cross validation. Adding the 6 additional features to the random forest models led to further improvements of between approximately 0.8% and 3.5% in AUC.

For the Alzheimer’s pathologic change vs Normal AD biomarkers comparison, adding the 6 additional features and utilizing random forest resulted in an improvement of about 2.2% compared to the original cross-validated logistic regression. Overall, the AUC for the resulting set of 13 features outperformed the scores for the original logistic regression with 7 features for all comparisons, with differences of between 2.2% and 8.3%. By design, the results of the reduced set of 10 features were similar to those of 13 features, making the set of 10 features a good candidate for practical applications.

As we have used the ADNI dataset as a validation one, we are interested in comparing the best ML-based results for the 10-feature models for both datasets. The results when comparing either Alzheimer’s disease or Non-AD pathologic change to Normal AD biomarkers were very similar for EPAD and ADNI. When comparing Alzheimer’s pathologic change to Normal AD biomarkers, the AUC scores for the ADNI logistic regression and linear SVM models even outperformed the respective ones for EPAD by 11%. This validates the feature selection using the EPAD dataset to derive models on the ADNI dataset. Adding plasma p-tau181 yields an increase of almost 3% between the best ML models when comparing Alzheimer’s disease to Normal AD biomarkers without it and an improvement of 1.7% of the ML-based 10-feature result for EPAD. It also adds over 1.4% of improvement for the Non-AD pathologic change to Normal AD biomarkers comparison. Figs 1 and 2 show ROC curves and AUC results for the four comparisons using different features and datasets.

Discussion

In this analysis, we found that machine learning offers an advantage over logistic regression when predicting Alzheimer’s and non-Alzheimer’s pathology as defined through the ATN classification in pre-dementia individuals. While gains of 1–2% were evident for Alzheimer’s disease and Alzheimer’s pathologic change, we found that the improvement in performance was most pronounced for isolated non-Alzheimer’s disease pathologic change (i.e. tau and/or neurodegeneration change in the context of normal amyloid) or concomitant Alzheimer’s and non-Alzheimer’s pathology states (defined as abnormal amyloid and neurodegeneration biomarkers in the absence of tau change).

The ATN framework was developed to provide an unbiased descriptive approach to defining neurodegenerative states based on biomarker evidence that transcends the syndromal stages of the disease [44]. This allowed a reformulation of the development of disease modification agents towards individuals in preclinical disease. The biomarker-based approach allows for trials to not only test individuals with confirmed target pathology but also to modify the disease process before irreversible neurodegeneration sets in [2]. The advantages conferred by the ATN framework are however limited by the reliance on costly and invasive biomarker data being available in individuals with no cognitive change. Accurate prediction of ATN status through routinely available information is therefore a priority. In line with a previous analysis [9], we found that both Alzheimer’s disease and Alzheimer’s pathologic change can be predicted with a high degree of accuracy (AUC of 0.89 and 0.66 respectively) through a combination of 7 variables (age, sex, APOE4 carriership, family history of dementia, BMI, MMSE score and white matter lesion hyperintensities). In the currently reported analysis, we show that ML methods are not only equivalent to a cross-validated version of the logistic regression but also improve on the results when an expanded dataset is used (7 original variables plus education, semantic fluency, delayed recall, smoking, physical activity and blood pressure). The validation of the 10-feature ML model in ADNI demonstrated the stability of the model through AUCs of 0.89 and 0.75 for AD and AD pathological change respectively. These results are in line with other recent publications which have shown that amyloid positivity (consistent with Alzheimer’s disease and Alzheimer’s pathologic change) can be effectively predicted.
In a recent analysis from the BioFINDER cohort, amyloid positivity in cognitively unimpaired individuals was successfully predicted through a model combining plasma Aβ42/40, APOE4 carriership and age [45]. In addition, plasma biomarkers such as p-tau181 have a now well-evidenced utility in identifying AD pathology across the disease spectrum [6]. ML has also been demonstrated to have the capability to improve amyloid prediction models in cognitively unimpaired individuals: a report from the A4 study showed that an ML model combining age, demographics, cognition and APOE4 carriership reached an AUC of 0.73 [6]. While the degree of improvement of using ML over statistical methods can be considered modest in absolute terms for the already well-optimized AD discrimination, it bridges almost 20% of the gap to the theoretically possible maximum, and thus the class of ML models holds significant advantages over standard statistical methods. As shown above, the significant features are not necessarily consistent over the different comparisons and utilizing ML methods allows us to include dependent variables without overfitting. ML methods such as random forest do not necessarily require linear separation of the categories, are typically more robust in terms of assumptions about the feature distributions, and allow greater flexibility.

We found that ML methods had a particularly pronounced superiority over logistic regression when predicting non-AD pathology (AUC improvement of 5% for non-AD and mixed changes in the 10-feature models). This substantial improvement likely reflects that, unlike Alzheimer’s pathology where the pathology is well defined, suspected non-Alzheimer’s disease pathophysiology (SNAP [46]) in pre-dementia individuals is likely underpinned by a heterogeneous group of pathophysiological processes, including cerebrovascular disease, Lewy body pathology, primary age-related tauopathy and argyrophilic grain disease. This lack of clear unifying features favours data-driven methods such as ML. We found that non-AD pathologic change can be predicted through ML with AUCs of up to 0.69 in both EPAD and ADNI, which opens the route for research of this group of disorders in their pre-dementia stages. The performance of the models in the mixed pathology group was similarly improved by ML in EPAD, but the low number of individuals in this group (27 in EPAD and 2 in ADNI) limits the utility and performance of these algorithms.

The ADNI dataset allowed for the exploration of the added value that plasma p-tau181 offers in prediction of pre-dementia ATN status. This biomarker is one of the best validated predictors of amyloid pathology in individuals with MCI and dementia, with some evidence for its utility in preclinical disease. We found that adding p-tau181 marginally improved the prediction of Alzheimer’s disease and non-AD pathologic change but not of Alzheimer’s pathologic change. These data add to the evidence that p-tau181 is relevant to the neurodegeneration stage of disease [14]. Other biomarkers in development such as p-tau217, p-tau231 or GFAP show promise in the preclinical stages of the disease [47] and thus may potentially offer more value in AD pathologic change stages.

Limitations

Our study has several limitations. Firstly, in contrast to the EPAD dataset, our ADNI sample included individuals that were diagnosed with AD or MCI. The reason for this was the limited availability of ATN positive individuals who were pre-diagnosis. This resulted in different distributions of the model features for the two datasets. Because of that we performed feature validation, rather than the commonly used model validation. Future work will need to focus on further validating the models we obtained in a pre-diagnosis sample.

As our primary and validation samples were drawn from different studies, in some instances different methodologies were used to assess the same variable. For example, for EPAD we were able to use the medial temporal lobe atrophy score (Scheltens’ scale) to assess the level of neurodegeneration, while in ADNI we used the concentrations of t-Tau in CSF. While these different means of defining neurodegeneration are recognized, this approach may still have impacted model performance.

On a similar note, some of the variables present in both data samples exhibited significant differences in their distributions. While some distributional differences were expected, WML volumes in EPAD and ADNI had notably different distributions, which may be accounted for by the different MRI processing methodologies employed in the two studies. Q-Q plots showing these differences are provided in the Supplementary Material. Despite these differences, the WML variable was a consistent feature in the EPAD model when validated in ADNI.

We also observe that there is no single class of methods that reliably optimizes the comparisons for both datasets. It appears that the ADNI data is better linearly separated than the EPAD one and because of that SVM with linear kernel (shown to be similar to logistic regression [48]) performs better on it even when adding the observations with AD and MCI. In contrast, there is no such separation in the EPAD data and because of that random forest performs better than SVM.

Finally, we note that ethnicity may conceivably be a risk factor that can be utilized in ATN prediction. However, the EPAD dataset on which this analysis and the preceding regression analysis were based consists almost exclusively of Caucasian individuals (785 out of the 791 individuals for whom that information was available, with no information available for the remaining 136 individuals). For this reason we were not able to include this factor in our analysis. The ADNI sample is only marginally more diverse: more than 92% of the individuals in that sample are Caucasian. This limitation further underlines the need for research cohorts to diversify to achieve representativeness.

Future directions

Our work is being expanded by the development of a pragmatic algorithm that allows investigators recruiting from brain health volunteer registers, such as the Great Minds register [49], to select participants on the basis of their likelihood of being in a specific ATN pathological state. In addition to accelerating preclinical dementia research, this application will generate further data on the accuracy of the models at the point of trial recruitment, which will help improve the algorithm further. Such approaches can also, over time, yield data on the translation of ATN prediction into clinical progression.
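
One way such likelihood-based selection could work is sketched below. This is an assumption-laden illustration rather than the algorithm under development: the `Volunteer` record, the `shortlist` function, the probability threshold, and the invitation cap are all hypothetical, and the probabilities are invented values standing in for model output.

```python
from dataclasses import dataclass


@dataclass
class Volunteer:
    volunteer_id: str
    predicted_probability: float  # hypothetical model output for one ATN state


def shortlist(volunteers, threshold, max_invites):
    """Keep volunteers at or above the threshold, highest predicted probability first."""
    eligible = [v for v in volunteers if v.predicted_probability >= threshold]
    ranked = sorted(eligible, key=lambda v: v.predicted_probability, reverse=True)
    return ranked[:max_invites]


# Invented register entries, for illustration only.
register = [
    Volunteer("V001", 0.82),
    Volunteer("V002", 0.35),
    Volunteer("V003", 0.91),
    Volunteer("V004", 0.67),
    Volunteer("V005", 0.74),
]

invited = shortlist(register, threshold=0.6, max_invites=3)
print([v.volunteer_id for v in invited])  # ['V003', 'V001', 'V005']
```

Recording the biomarker status later confirmed for invited volunteers alongside these predicted probabilities would provide the accuracy data at the point of recruitment that the paragraph above refers to.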

A further future direction is the incorporation into the model of scalable means of AD risk monitoring such as blood biomarkers [14] and digital technology [50]. These data either add biological signal directly relevant to the core pathology [51] or have a high degree of granularity that can detect subtle changes in cognitive function [52].

Conclusion

ML methods offer an opportunity to detect both AD and non-AD pathology through routinely collected research data. Such algorithms can be used to accelerate research into preclinical dementia states through their application in brain health volunteer registers.

Acknowledgments

Data used in preparation of this article were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in analysis or writing of this report. A complete listing of ADNI investigators can be found at: http://adni.loni.usc.edu/wp-content/uploads/how_to_apply/ADNI_Acknowledgement_List.pdf. The extensive numerical calculations were done on the Avitohol supercomputer that is described in [53].

A complete list of EPAD Investigators can be found at: http://ep-ad.org/wp-content/uploads/2020/12/202010_List-of-epadistas.pdf.

References

  1. Jack CR, Bennett DA, Blennow K, Carrillo MC, Dunn B, Haeberlein SB, et al. NIA‐AA Research Framework: Toward a biological definition of Alzheimer’s disease. Alzheimers Dement 2018;14:535–62. pmid:29653606
  2. Sperling RA, Jack CR, Aisen PS. Testing the Right Target and Right Drug at the Right Stage. Sci Transl Med 2011;3. pmid:22133718
  3. Livingston G, Huntley J, Sommerlad A, Ames D, Ballard C, Banerjee S, et al. Dementia prevention, intervention, and care: 2020 report of the Lancet Commission. The Lancet 2020;396:413–46. pmid:32738937
  4. Kivipelto M, Ngandu T, Laatikainen T, Winblad B, Soininen H, Tuomilehto J. Risk score for the prediction of dementia risk in 20 years among middle aged people: a longitudinal, population-based study. Lancet Neurol 2006;5:735–41. pmid:16914401
  5. Park CJ, Seo Y, Choe YS, Jang H, Lee H, Kim JP, et al. Predicting conversion of brain β-amyloid positivity in amyloid-negative individuals. Alzheimers Res Ther 2022;14:129. https://doi.org/10.1186/s13195-022-01067-8.
  6. Petersen KK, Lipton RB, Grober E, Davatzikos C, Sperling RA, Ezzati A. Predicting Amyloid Positivity in Cognitively Unimpaired Older Adults: A Machine Learning Approach Using A4 Data. Neurology 2022;98:e2425–35. pmid:35470142
  7. Shen X-N, Huang Y-Y, Chen S-D, Guo Y, Tan L, Dong Q, et al. Plasma phosphorylated-tau181 as a predictive biomarker for Alzheimer’s amyloid, tau and FDG PET status. Transl Psychiatry 2021;11:585. pmid:34775468
  8. Doecke JD, Pérez-Grijalba V, Fandos N, Fowler C, Villemagne VL, Masters CL, et al. Total Aβ42/Aβ40 ratio in plasma predicts amyloid-PET status, independent of clinical AD diagnosis. Neurology 2020;94:e1580–91. https://doi.org/10.1212/WNL.0000000000009240.
  9. Calvin CM, de Boer C, Raymont V, Gallacher J, Koychev I, The European Prevention of Alzheimer’s Dementia (EPAD) Consortium. Prediction of Alzheimer’s disease biomarker status defined by the ‘ATN framework’ among cognitively healthy individuals: results from the EPAD longitudinal cohort study. Alzheimers Res Ther 2020;12:143. pmid:33168064
  10. Kaffashian S, Dugravot A, Nabi H, Batty GD, Brunner E, Kivimäki M, et al. Predictive utility of the Framingham general cardiovascular disease risk profile for cognitive function: evidence from the Whitehall II study. Eur Heart J 2011;32:2326–32. pmid:21606085
  11. Elias MF, Sullivan LM, D’Agostino RB, Elias PK, Beiser A, Au R, et al. Framingham Stroke Risk Profile and Lowered Cognitive Performance. Stroke 2004;35:404–9. pmid:14726556
  12. Solomon A, Kivipelto M, Molinuevo JL, Tom B, Ritchie CW. European Prevention of Alzheimer’s Dementia Longitudinal Cohort Study (EPAD LCS): study protocol. BMJ Open 2018;8:e021017. https://doi.org/10.1136/bmjopen-2017-021017.
  13. Petersen RC, Aisen PS, Beckett LA, Donohue MC, Gamst AC, Harvey DJ, et al. Alzheimer’s Disease Neuroimaging Initiative (ADNI): Clinical characterization. Neurology 2010;74:201–9. pmid:20042704
  14. Koychev I, Jansen K, Dette A, Shi L, Holling H. Blood-Based ATN Biomarkers of Alzheimer’s Disease: A Meta-Analysis. J Alzheimers Dis 2021;79:177–95. pmid:33252080
  15. Ritchie CW, Muniz-Terrera G, Kivipelto M, Solomon A, Tom B, Molinuevo JL. The European Prevention of Alzheimer’s Dementia (EPAD) Longitudinal Cohort Study: Baseline Data Release V500.0. J Prev Alzheimers Dis 2019:1–7. https://doi.org/10.14283/jpad.2019.46.
  16. Folstein MF, Folstein SE, McHugh PR. “Mini-mental state.” J Psychiatr Res 1975;12:189–98. https://doi.org/10.1016/0022-3956(75)90026-6.
  17. Granadillo E, Paholpak P, Mendez MF, Teng E. Visual Ratings of Medial Temporal Lobe Atrophy Correlate with CSF Tau Indices in Clinical Variants of Early-Onset Alzheimer Disease. Dement Geriatr Cogn Disord 2017;44:45–54. pmid:28675901
  18. Roche Diagnostics International Ltd. Elecsys® Phospho-Tau (181P) CSF 2017. https://www.roche.at/content/dam/rochexx/roche-at/roche_diagnostics/documents/Roche-Factsheet-Elecsys-Phospho-Tau-181P-CSF-EN-web.pdf.
  19. Roche Diagnostics International Ltd. Elecsys® β-Amyloid (1–42) CSF 2018. https://www.roche.at/content/dam/rochexx/roche-at/roche_diagnostics/documents/Roche-Factsheet-Elecsys-beta-Amyloid-(1-42)-CSF-EN-web.pdf.
  20. Roche Diagnostics International Ltd. Elecsys® Total-Tau CSF 2017. https://www.roche.at/content/dam/rochexx/roche-at/roche_diagnostics/documents/Roche-Factsheet-Elecsys-Total-Tau-CSF-EN-web.pdf.
  21. Scheltens P, Leys D, Barkhof F, Huglo D, Weinstein HC, Vermersch P, et al. Atrophy of medial temporal lobes on MRI in “probable” Alzheimer’s disease and normal ageing: diagnostic value and neuropsychological correlates. J Neurol Neurosurg Psychiatry 1992;55:967–72. pmid:1431963
  22. Claus JJ, Staekenborg SS, Holl DC, Roorda JJ, Schuur J, Koster P, et al. Practical use of visual medial temporal lobe atrophy cut-off scores in Alzheimer’s disease: Validation in a large memory clinic population. Eur Radiol 2017;27:3147–55. pmid:28083697
  23. Mielke MM, Hagen CE, Xu J, Chai X, Vemuri P, Lowe VJ, et al. Plasma phospho‐tau181 increases with Alzheimer’s disease clinical severity and is associated with tau‐ and amyloid‐positron emission tomography. Alzheimers Dement 2018;14:989–97. pmid:29626426
  24. Moscoso A, Grothe MJ, Ashton NJ, Karikari TK, Rodriguez JL, Snellman A, et al. Time course of phosphorylated-tau181 in blood across the Alzheimer’s disease spectrum. Brain 2021;144:325–39. pmid:33257949
  25. Tissot C, Benedet AL, Therriault J, Pascoal TA, Lussier FZ, Saha-Chaudhuri P, et al. Plasma pTau181 predicts cortical brain atrophy in aging and Alzheimer’s disease. Alzheimers Res Ther 2021;13:69. pmid:33781319
  26. Thijssen EH, La Joie R, Wolf A, Strom A, Wang P, Iaccarino L, et al. Diagnostic value of plasma phosphorylated tau181 in Alzheimer’s disease and frontotemporal lobar degeneration. Nat Med 2020;26:387–97. pmid:32123386
  27. Wang Y-L, Chen J, Du Z-L, Weng H, Zhang Y, Li R, et al. Plasma p-tau181 Level Predicts Neurodegeneration and Progression to Alzheimer’s Dementia: A Longitudinal Study. Front Neurol 2021;12:695696. pmid:34557143
  28. Lin AZ, Hsieh T-Y. Expanding the Use of Weight of Evidence and Information Value to Continuous Dependent Variables for Variable Reduction and Scorecard Development. Paper SD-84 2014.
  29. Weed DL. Weight of Evidence: A Review of Concept and Methods: Weight of Evidence. Risk Anal 2005;25:1545–57. https://doi.org/10.1111/j.1539-6924.2005.00699.x.
  30. Nazarko K. Churn analysis using IV and WOE in Python 2020.
  31. Chen T, Guestrin C. XGBoost: A Scalable Tree Boosting System. Proc. 22nd ACM SIGKDD Int. Conf. Knowl. Discov. Data Min., San Francisco California USA: ACM; 2016, p. 785–94. https://doi.org/10.1145/2939672.2939785.
  32. He Z, Lin D, Lau T, Wu M. Gradient Boosting Machine: A Survey. ArXiv190806951 Cs Stat 2019.
  33. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, et al. Scikit-learn: Machine Learning in Python. J Mach Learn Res 2011;12:2825–30.
  34. Breiman L. Random Forests. Mach Learn 2001;45:5–32. https://doi.org/10.1023/A:1010933404324.
  35. Genuer R, Poggi J-M, Tuleau C. Random Forests: some methodological insights. ArXiv08113619 Stat 2008.
  36. Hastie T, Tibshirani R, Friedman J. The Elements of Statistical Learning. 2nd ed. New York, NY: Springer; n.d.
  37. Geurts P, Ernst D, Wehenkel L. Extremely randomized trees. Mach Learn 2006;63:3–42. https://doi.org/10.1007/s10994-006-6226-1.
  38. Breiman L. Bagging predictors. Mach Learn 1996;24:123–40. https://doi.org/10.1007/BF00058655.
  39. Kégl B. The return of AdaBoost.MH: multi-class Hamming trees. ArXiv13126086 Cs 2013.
  40. Kohavi R. A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection. Proc. 14th Int. Jt. Conf. Artif. Intell.—Vol. 2, San Francisco, CA, USA: Morgan Kaufmann Publishers Inc.; 1995, p. 1137–43.
  41. Sun X, Xu W. Fast Implementation of DeLong’s Algorithm for Comparing the Areas Under Correlated Receiver Operating Characteristic Curves. IEEE Signal Process Lett 2014;21:1389–93. https://doi.org/10.1109/LSP.2014.2337313.
  42. Fawcett T. An introduction to ROC analysis. Pattern Recognit Lett 2006;27:861–74. https://doi.org/10.1016/j.patrec.2005.10.010.
  43. Hosmer DW, Lemeshow S, Sturdivant RX. Applied Logistic Regression. 1st ed. Wiley; 2013. https://doi.org/10.1002/9781118548387.
  44. Jack CR, Bennett DA, Blennow K, Carrillo MC, Feldman HH, Frisoni GB, et al. A/T/N: An unbiased descriptive classification scheme for Alzheimer disease biomarkers. Neurology 2016;87:539–47. pmid:27371494
  45. Cullen NC, Janelidze S, Stomrud E, Bateman RJ, Palmqvist S, Hansson O, et al. Plasma amyloid-β42/40 and apolipoprotein E for amyloid PET pre-screening in secondary prevention trials of Alzheimer’s disease. Brain Commun 2023;5:fcad015. https://doi.org/10.1093/braincomms/fcad015.
  46. Jack CR, Knopman DS, Chételat G, Dickson D, Fagan AM, Frisoni GB, et al. Suspected non-Alzheimer disease pathophysiology—concept and controversy. Nat Rev Neurol 2016;12:117–24. pmid:26782335
  47. Suárez‐Calvet M, Karikari TK, Ashton NJ, Lantero Rodríguez J, Milà‐Alomà M, Gispert JD, et al. Novel tau biomarkers phosphorylated at T181, T217 or T231 rise in the initial stages of the preclinical Alzheimer’s continuum when only subtle changes in Aβ pathology are detected. EMBO Mol Med 2020;12. https://doi.org/10.15252/emmm.202012921.
  48. James G, Witten D, Hastie T, Tibshirani R. An introduction to statistical learning: with applications in R. Second edition. New York: Springer; 2021.
  49. Koychev I, Young S, Holve H, Ben Yehuda M, Gallacher J. Dementias Platform UK Clinical Studies and Great Minds Register: protocol of a targeted brain health studies recontact database. BMJ Open 2020;10:e040766. pmid:33247021
  50. Chinner A, Blane J, Lancaster C, Hinds C, Koychev I. Digital technologies for the assessment of cognition: a clinical review. Evid Based Ment Health 2018;21:67–71. pmid:29678927
  51. Hansson O. Biomarkers for neurodegenerative diseases. Nat Med 2021;27:954–63. pmid:34083813
  52. Lancaster C, Koychev I, Blane J, Chinner A, Chatham C, Taylor K, et al. Gallery Game: Smartphone-based assessment of long-term memory in adults at risk of Alzheimer’s disease. J Clin Exp Neuropsychol 2020;42:329–43. pmid:31973659
  53. Atanassov E, Gurov T, Ivanovska S, Karaivanova A. Parallel Monte Carlo on Intel MIC Architecture. Procedia Comput Sci 2017;108:1803–10. https://doi.org/10.1016/j.procs.2017.05.149.