Disease progression modeling of Alzheimer’s disease based on variational probability principal component analysis

Xin Xiong; Ximin Wang; Chenyang Zhu; Jianfeng He; for the Alzheimer’s Disease Neuroimaging Initiative

doi:10.1371/journal.pone.0342549

Abstract

Alzheimer’s disease (AD) is a neurodegenerative disorder and the leading cause of dementia. Early diagnosis and monitoring of disease progression are crucial for effective intervention. This study presents a novel disease progression model based on Variational Probabilistic Principal Component Analysis (VPPCA), which uses a Bayesian framework for dimensionality reduction and uncertainty quantification. By analyzing 1,021 amyloid-positive patients from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database, we extracted 25 features, including CSF (ABETA, TAU, PTAU), PET (FDG, AV45), and MRI volumetrics, along with cognitive and functional assessments. VPPCA compresses these multi-modal biomarkers into a single first principal component score (VPPCA1), which serves as a measure of disease progression. To ensure biological grounding and avoid circularity, we demonstrated that a VPPCA1 model using only non-cognitive features (CSF, PET, MRI, demographics) correlates strongly with cognitive decline (r = 0.658 with ADAS-Cog13), confirming that it captures genuine pathological progression rather than simply reflecting cognitive assessments. Block-wise feature ablation revealed that multi-modal integration is essential, with cognitive features showing the highest importance (0.1064), though all modalities contribute complementarily. In classification tasks, VPPCA exhibited strong performance with ROC-AUC values of 0.990 (CN vs Dementia), 0.774 (CN vs MCI), and 0.785 (MCI vs Dementia). A Bayesian hierarchical longitudinal model effectively captured patient-specific progression trajectories, offering personalized predictions of future disease states. VPPCA outperforms Probabilistic PCA (PPCA) by providing uncertainty quantification, with patient-specific confidence levels (σ = 0.086–0.136), which correlate with data quality, enabling automatic risk stratification. This work demonstrates that VPPCA offers a robust, biologically-grounded framework for modeling AD progression, providing actionable uncertainty quantification that improves clinical decision support and facilitates personalized care.

Citation: Xiong X, Wang X, Zhu C, He J, for the Alzheimer’s Disease Neuroimaging Initiative (2026) Disease progression modeling of Alzheimer’s disease based on variational probability principal component analysis. PLoS One 21(3): e0342549. https://doi.org/10.1371/journal.pone.0342549

Editor: Yong Fan, University of Pennsylvania Perelman School of Medicine, UNITED STATES OF AMERICA

Received: August 4, 2025; Accepted: January 26, 2026; Published: March 30, 2026

Copyright: © 2026 Xiong et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All data are from ADNI database (https://adni.loni.usc.edu/data-samples/adni-data/). If others want to access the data set, they can register an account through this website and submit an application to ADNI. There will usually be a reply within one or two days, and ADNI will allow downloading data for research purposes. We are absolutely sure that anyone can access the data set in the same way. We also confirm that there are no special access rights that others do not have.

Funding: This study was financially supported by the National Natural Science Foundation of China in the form of a grant awarded to XX and JH (82060329). This study was also financially supported by the Yunnan Fundamental Research Projects program in the form of a grant awarded to XX and JH (202201at070108). This study received additional financial support from the First People’s Hospital of Yunnan Province in the form of a grant awarded to XX and JH (2024nmkfkt-09). Further financial support was provided by the Strategic Project of Fuwai Cardiovascular Hospital in the form of a grant awarded to XX and JH (2025yfkt-zl-08). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Alzheimer’s disease (AD) is a neurodegenerative disease and the leading cause of dementia. The number of patients is expected to exceed 7 million by 2030 [1]; by 2050, the global prevalence will triple and is expected to affect more than 100 million people [2]. Early diagnosis of patients can be achieved by understanding the underlying pathological mechanisms, identifying multiple pathogenic and protective genes, and introducing lifestyle changes for intervention using new blood analysis methods. Novel biomarkers obtained by PET scans and plasma analysis of β-amyloid and phosphorylated tau protein have great potential in AD research and clinical diagnosis. Although lifestyle may not directly affect the pathological mechanisms of AD, it may still have a positive impact on patients. It is important to distinguish between biological biomarkers (such as amyloid and tau proteins measured via CSF or PET imaging, and brain structural changes from MRI) and clinical outcome assessments (such as cognitive test scores and functional rating scales), as they capture different aspects of disease progression.

The potential status of AD patients can be described from the biological perspective based on changes in relevant imaging biomarkers. Clinicians can use a trimodal staging system to diagnose AD dementia and determine the intermediate state between individuals with normal cognition (CN) and mild cognitive impairment (MCI) [3]. Kumar et al. [4] have explored predicting correct diagnosis from biomarkers, and Fisher et al. [5] have explored describing individual conditions from a single cognitive test, but these methods may be limited by the different sensitivities of the tests. Jack et al. [6] proposed a dynamic hypothesis model that combines disease stage with individual AD biomarkers, and judges the severity of clinical symptoms by observing whether the biomarkers are abnormal. Martin et al. [7] used probabilistic principal component analysis (PPCA) to describe the potential state of individuals in the Alzheimer’s disease continuum as a single score for a set of related biomarkers. This score can directly reflect the state of AD disease progression and can also be interpreted longitudinally based on time-shifted and multiple visits. The method they proposed can quickly discover low-dimensional representations of variability in biomarkers, and this unsupervised method is comparable to supervised models that require longitudinal data and has strong correlation with other more complex disease progression models.

However, this method still has some limitations. First, the method only selects quantitative biomarkers, and in fact other demographic information is equally important, such as age, years of education, number of APOE ε4 allele carriers, and dementia cognitive test scales. Secondly, regarding the interpretability of this method, although the first principal component was shown to be diagnostic in the latent space in 2021 [8], the authors did not explain whether the use of only the first principal component masks other variations in the biomarker. For PPCA [9], they only used this method to handle missing data and solve the noise of the score, but the uncertainty of the first principal component score was not handled.

Therefore, we proposed variational probability principal component analysis (VPPCA), which can not only estimate missing data, but also construct scores in the latent space from various biomarkers related to Alzheimer’s disease. This value can be summarized as a progression score for the longitudinal model of disease progression, highlighting the changes in biomarkers on the AD continuum to reflect the disease progression of Alzheimer’s patients. This method can also be used for the three-mode staging system classification of the AD continuum. In response to the uncertainty of the score, this method uses variational Bayesian inference to approximate the posterior distribution of the noise of the score obtained in the latent dimension, which is the prior distribution of the noise of the single trajectory score in the longitudinal hierarchical disease progression model.

Materials and methods

Alzheimer’s disease neuroimaging initiative (ADNI)

The data used in this article are from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database. Our study collected 16,422 records from 2,431 patients, covering data from four stages: ADNI-1, ADNI-2, ADNI-GO, and the latest ADNI-3. On this basis, 1,021 patients who were amyloid-positive at baseline were selected as research subjects, representing individuals in the AD continuum [10]. We selected a series of validated biomarkers for analysis, including amyloid, microtubule-associated protein tau, a series of cognitive test results, and specific brain area volume markers. Table 1 shows the basic demographic information of the patients and their related biomarker data statistics at baseline.

Download:

Table 1. Baseline demographic information of patients who were amyloid-positive at baseline diagnosis.

https://doi.org/10.1371/journal.pone.0342549.t001

Data preprocessing

All experiments in this paper focused on patients who were amyloid-positive, who were considered to be on the AD continuum. All our data were obtained using the ADNIMERGE R package, and only patients who were amyloid-positive were selected for the primary analysis. Amyloid positivity was determined based on each patient’s baseline cerebrospinal fluid (CSF) amyloid-β (1–42) concentration. Two validated immunoassay platforms were used in ADNI, each with established positivity thresholds: Roche Elecsys immunoassay: Aβ42 < 1100 pg/mL indicates amyloid positivity [11]; INNO-BIA AlzBio3 immunoassay: Aβ42 < 192 pg/mL indicates amyloid positivity [12].

Lower CSF Aβ42 concentrations reflect greater sequestration of amyloid-β into cerebral plaques, resulting in reduced levels in the CSF. This inverse relationship between CSF Aβ42 and brain amyloid burden is consistent with the amyloid cascade hypothesis of AD and has been validated against amyloid PET imaging.

Patients were classified as amyloid-positive if their Aβ42 concentration fell below the platform-specific threshold on either assay. Of the initial 2,431 patients, 1,021 (42.0%) were identified as amyloid-positive at baseline using these criteria.

CSF biomarker data were obtained from the ADNIMERGE R package. Specifically, INNO-BIA AlzBio3 measurements were retrieved from the UPENNBIOMK_MASTER file, and Roche Elecsys measurements were extracted from the UPENNBIOMK9, UPENNBIOMK10, and UPENNBIOMK12 files.

For feature selection, we chose measures commonly used in disease progression modeling that are available for most ADNI patients. Our feature set comprises:

(1) Biological biomarkers: CSF biomarkers: β-amyloid protein (ABETA), microtubule-associated protein tau (TAU), and phosphorylated tau (PTAU), which are established markers of neurodegeneration [13]; PET imaging: Fluorodeoxyglucose (FDG) uptake, quantified using the Florbetapir compound for β-amyloid deposition. While Pittsburgh Compound-B and Florbetaben also measure β-amyloid, we excluded them due to small sample sizes and minimal differences from Florbetapir [14]; MRI volumetrics: Volumes of ventricles, hippocampus, entorhinal cortex, whole brain, fusiform gyrus, and medial temporal gyrus [15].
(2) Clinical outcome assessments:
(3) Cognitive measures: Clinical Dementia Rating Scale Sum of Boxes (CDR-SB), Alzheimer’s Disease Assessment Scale cognitive subscale (ADAS-Cog13), ADAS-Cog delayed recall (ADASQ4), Rey Auditory Verbal Learning Test immediate recall (RAVLT Immediate), RAVLT Learning score, RAVLT Forgetting score, RAVLT Percent Forgetting, Logical Memory Delayed Recall (LDELTOTAL), Digit Symbol Substitution Test (DIGITSCOR), and Trail Making Test (TRABSCOR) [16]; Functional assessment: Functional Assessment Questionnaire (FAQ).
(4) Demographic and genetic information: Age (AGE), years of education (PTEDUCAT), and APOE ε4 allele status.

These tests assess the clinical manifestations of AD across cognitive, functional, and demographic domains. To maintain clarity, we hereafter refer to the complete set of 25 measures as ‘features’, distinguishing biological biomarkers from clinical assessments where necessary.

The final feature set contained 25 variables (missing rate<40%), covering biomarkers, clinical assessments, and demographic information (detailed missing rate statistics are shown in Supplementary S1 Table). Among the 1021 amyloid-positive patients at baseline, most variables had low missing rates: cognitive assessment data were nearly complete (missing rate 0–3.53%), MRI volume measurements had low missing rates (3.82–15.18%), and cerebrospinal fluid biomarkers had moderate missing rates (18.51%). The highest missing rates were for AV45-PET (39.76%) and DIGITSCOR (38.89%), both below our pre-set 40% threshold.

To reduce the expected heterogeneity of brain region volumes due to age and intracranial volume, we adjusted the volumes of six brain regions obtained from MRI scans according to age and intracranial volume (ICV) at baseline [17]. This approach allowed us to focus on the common trends in biomarker evolution across affected individuals. All biomarkers were scaled using z-score normalization to reduce differences in scale and magnitude.

Finally, amyloid-positive patients were randomly divided into two groups, training and test sets, of equal size. This grouping was first used for normalization purposes, and the test set was renormalized according to the covariance matrix of the biomarkers processed using VPPCA.

Variational probabilistic principal component analysis (VPPCA)

In the previous section, we extracted 25 features related to Alzheimer’s disease from the original data. In order to capture the differences in disease states in the dataset, we used VPPCA [18] to project the original data into a low-dimensional latent space, which can retain the maximum variability in the data. VPPCA provides a more flexible and robust way to reduce the dimension and extract features from data by introducing a probabilistic model and a variational inference framework. The estimation of model parameters is no longer a single point estimate, but exists in the form of probability distribution, which increases the uncertainty representation and interpretability of the model.

Variational inference for Alzheimer’s disease biomarker analysis.

In the VPPCA model, we use variational inference to estimate the potential changes in biomarkers associated with Alzheimer’s disease. Considering the complexity of biomarker data in Alzheimer’s disease research, including the high dimensionality and missing value problems of the data, variational inference provides us with a method to efficiently estimate model parameters.

In order to maximize the log-likelihood function of the observed data X, where X contains the observed values of 25 features related to the progression of Alzheimer’s disease, a set of latent variables Z (the position of each patient on the AD progression continuum, which reflects the patient’s cognitive state and the degree of disease progression) and model parameters (biomarker loading, which can represent the weight of the influence of different biomarkers on the patient’s potential disease progression state) are introduced, and a simplified variational distribution is introduced to approximate the true posterior distribution .

The goal of variational inference is to find the optimal variational distribution so that it is as close as possible to the true posterior distribution . First, the variational method introduces a set of approximate densities , which is a set of probability densities of latent variables. The goal of variational inference is to find the variational distribution that minimizes the Kullback-Leibler (KL) divergence between and the true posterior p(𝑍, 𝜃|𝑋), thereby making as close as possible to the true posterior. The KL divergence is a measure of dissimilarity between two probability distributions. Formula (1) formally states this optimization objective:

(1)

The KL divergence formula is

(2)

Using Bayes’ theorem and observed data , KL divergence can be rewritten as

(3)

(4)

This formulation (equation 4) decomposes the ELBO into three interpretable terms: the expected log-likelihood of the observed data, the prior log-probability of the latent variables and parameters, and the negative entropy of the variational distribution. Maximizing the ELBO with respect to q is equivalent to minimizing the KL divergence between q and the true posterior, as shown in equation (5).

ELBO (Evidence Lower Bound) is

(5)

In equation (3), (X) is constant with respect to , and the KL divergence 𝐾𝐿(q(𝑍, 𝜃)‖p(𝑍, 𝜃|𝑋)) ≥ 0 by Gibbs’ inequality. From equation (5), we observe that: ln (𝑋) = 𝐾𝐿(q(𝑍, 𝜃)‖p(𝑍, 𝜃|𝑋)) + 𝐿(Q).

Since ln (X) is fixed and the KL divergence is non-negative, maximizing 𝐿(Q) (the Evidence Lower Bound, ELBO) is equivalent to minimizing the KL divergence 𝐾𝐿(q(𝑍, 𝜃)‖p(𝑍, 𝜃|𝑋)). Therefore, variational inference seeks to find the optimal variational distribution q*(𝑍, 𝜃) by maximizing the ELBO.

In order to find the most suitable variational cluster and the optimal parameters to minimize , we introduce the mean field variational family decomposition , where the latent variables Z are independent of each other and each variable is controlled by a different factor in the variational density. By choosing Gaussian distribution as the distribution of , can be minimized on , and the variational distribution in the form of Gaussian distribution is

(6)

where represents the expectation over all parameters except , is the joint probability of the model under the observed data X(biomarker) and parameters (biomarker loading),

(7)

Specify the prior distribution on as

(8)

The update formula of the fixed-point iteration algorithm is set as

(9)

(10)

(11)

(12)

(13)

(14)

Use the fixed-point iteration algorithm to find the optimal parameters ,W,,,. After setting the initial values, gradually adjust the variables until convergence.

Estimating missing values.

Missing values are common in longitudinal clinical datasets like ADNI [19]. Both biological biomarkers (e.g., CSF measures requiring lumbar puncture) and clinical assessments (e.g., cognitive tests not administered at every visit) may be incomplete. In the follow-up of each patient, the actual situation is that the follow-up data cannot be collected completely, and the integrity of the data is crucial to the quality and reliability of subsequent analysis. In order to effectively estimate these missing values, this paper adopts a method based on the Bayesian principal component analysis framework, which uses the latent structure of the data to infer the possible values of the missing data points. This paper focuses on the latent variable Z (latent disease progression score), whose calculation depends on the non-missing value part of the observed data X (feature measurements).

First, for each observation point containing missing values, identify the subset of observed data dimensions and the corresponding weight matrix . Then, calculate the mean of the posterior distribution of the latent variable Z, as follows

(15)

Where, represents the inverse of the posterior distribution covariance matrix of the latent variable Z, represents the precision of the observed noise (i.e., the inverse variance of the noise), and is the mean of the observed data dimension. Then, by using the mean of the posterior distribution of the latent variable Z, the missing value is estimated by formula (16),

(16)

where is the subset of the weight matrix corresponding to the missing dimensions and is the mean of the missing dimensions.

Results and discussion

Comparison of the scores of the first and second principal components

The original data is projected onto the latent space after VPPCA. Fig 1 shows the comparison between the first principal component score and the second principal component score. It can be seen that the data distribution range and interquartile range of VPPCA1 are much larger than those of VPPCA2. The median and mean of both are close to 0, and the data distribution is symmetrical and concentrated. For outliers, VPPCA1 has significantly fewer outliers, while VPPCA2 has more outliers and they are distributed at both ends of the data. There may be some extreme cases.

Download:

Fig 1. Boxplot of the first and second principal components.

https://doi.org/10.1371/journal.pone.0342549.g001

VPPCA analysis of baseline feature scores in patients with Alzheimer’s disease

We calculated the feature scores after the observations were transformed into latent space according to the variational probability principal component analysis method 。We first focused on the scores at the baseline visit, which are related to patients from the training set and the test set. The test set has been normalized according to the biomarkers of the training set. In addition, the scores of the baseline test set are calculated using the covariance matrix generated by iteratively updating the parameters on the training set.

Fig 2 shows the first principal component scores of patients in the test set, grouping their diagnosis at the baseline visit into cognitively normal patients (CN), mild cognitive impairment patients (MCI) and dementia patients (Dementia), and the scores of these patients correspond to their groups. It can be clearly observed that the patients have achieved good diagnostic separation. In the progression of Alzheimer’s disease, the scores are approximately between [−1.5, 2].

Download:

Fig 2. Diagnostic density of VPPCA1 in 448 test set patients.

https://doi.org/10.1371/journal.pone.0342549.g002

Considering the general trend of diagnosis, this figure also provides a natural interpretation of the VPPCA1 score, combined with the clinical staging diagram proposed by Jack et al., which places healthy individuals on the left side of the axis and individuals who have experienced a significant decline on the other side.

According to Fig 3a), no significant correlation was found between the score and age (Pearson: r = 0.020, P < 0.653; Spearman: ρ = 0.055, P < 0.216).

Download:

Fig 3. a) Relationship between VPPCA1 and age. b) Relationship between VPPCA1 and one-dimensional LDA.

https://doi.org/10.1371/journal.pone.0342549.g003

Correlation between VPPCA and linear discriminant analysis in predicting the progression of Alzheimer’s disease

Since variational probability principal component analysis is an unsupervised method, a natural question is how its projection compares to similar methods that use labels (disease diagnosis status in this case). Since the projection of linear discriminant analysis (LDA) is specifically guided by the classification objective [20], our method is compared with LDA. We performed a one-dimensional projection of the biomarker data according to the three categories of CN, MCI, and Dementia using linear discriminant analysis. For comparison with VPPCA, the first principal component after dimensionality reduction by VPPCA was selected as the projection direction of LDA because it maximizes the inter-class distance while minimizing the intra-class distance.

Since there are a large number of missing values in the biomarker data, LDA cannot naturally handle them, so 0 is used to replace the missing values. The projection value is compared with the score using VPPCA. Our goal is to evaluate the correlation between the LDA one-dimensional projection value and the VPPCA1 score, rather than to evaluate the performance of LDA, so LDA is overfitted to the baseline diagnostic grouping labels on the test set to obtain the ideal diagnostic information projection value.

We compared the one-dimensional label-aware projection values obtained using linear discriminant analysis (LDA) with the VPPCA1 scores. As shown in Fig 3b), the VPPCA1 scores were strongly correlated with the one-dimensional linear projection values determined by LDA (Pearson: r = 0.890, P < 0.001; Spearman: ρ = 0.896, P < 0.001).

Robustness: Block-wise feature ablation analysis

Handling missing data is particularly important in the clinical setting, as clinical datasets often have missing measurements. We evaluated robustness to random missing data.

We designed a block-by-block ablation experiment to demonstrate the robustness of the model. We divided the features into five ablation modules: cognitive measurement features, cerebrospinal fluid markers (ABETA, TAU, PTAU), brain volume markers (Ventricles, Hippocampus, Whole Brain, Entorhinal, Fusiform, Mid Temp), imaging markers (FDG, AV45), and demographic features (AGE, PTEDUCAT, APOE4). The results are shown in Fig 4. Fig 4(A) ROC-AUC score heatmap: showing the ROC-AUC performance of 6 feature configurations in 3 classification tasks, where the first one is the model containing complete features. (B) Feature importance ranking: the horizontal bar chart shows the average importance score of each feature module. (C) Task difficulty comparison: the bar chart compares the baseline performance of the three classification tasks. It can be seen that in the classification tasks, CN/D task: all modules reach ≥0.97 AUC, indicating that it is relatively easy to distinguish between healthy people and dementia patients; CN/MCI task: is the most challenging task in Alzheimer’s disease (AUC ≈ 0.69–0.78), where the removal of cognitive markers has the greatest impact (dropping to 0.69); MCI/D task: AUC ≈ 0.71–0.84, relatively robust to feature loss. Subplot (B) shows the average AUC drop for the five feature classes (Demographics, CSF, MRI, PET, and Cognitive). Different colored bars represent the impact of removing each feature class on the overall model performance, with the cognitive assessment feature showing the largest impact (approximately 0.0966). Subplot (C) compares the difficulty of the three classification tasks with the full model configuration. Blue and orange bars show the ROC-AUC and PR-AUC performance scores, respectively. The CN/D task shows the highest classification performance (ROC-AUC = 0.979, PR-AUC = 0.965), while the MCI/D task shows relatively lower performance (ROC-AUC = 0.819, PR-AUC = 0.538).

Download:

Fig 4. Complete model performance, feature importance, and task difficulty analysis.

https://doi.org/10.1371/journal.pone.0342549.g004

Fig 5 illustrates the ROC-AUC performance and confidence intervals of the models under different feature combinations in six classification tasks. The six sub-figures correspond to the VPCA model performance of CN vs Dementia, CN vs MCI, MCI vs Dementia, CN vs Dementia, CN vs MCI, and MCI vs Dementia, respectively. The horizontal axis in each sub-figure represents different feature combinations, including the complete model, CSF removal (w/o CSF), demographic information removal (w/o Demo), PET removal (w/o PET), MRI removal (w/o MRI), and cognitive assessment removal (w/o Cog). The vertical axis displays the ROC-AUC scores, and the bars are color-coded. Error bars indicate the 95% confidence interval to reflect the statistical uncertainty of the model performance.

Download:

Fig 5. Performance comparison of different feature configuration models.

https://doi.org/10.1371/journal.pone.0342549.g005

Fig 6 illustrates the baseline performance and task-specific feature importance analysis for different classification tasks. The top three sub-figures present the performance of the three classification tasks: CN vs Dementia, CN vs MCI, and MCI vs Dementia. For each task, blue bars represent AUC, and orange bars represent PRAUC. The horizontal axis shows that the model performance was evaluated for each task after removing different combinations of features, including the complete model and models after removing CSF, demographic information, PET, MRI, and cognitive assessment. The bottom bar chart shows the task-specific feature importance, using four different colored bars to represent the four tasks: CN/D (CN vs Dementia), CN/MCI (CN vs MCI), MCI/D (MCI vs Dementia), and KC/DT (potentially representing another classification task). The horizontal axis shows five feature classes: CSF, demographic information, PET, MRI, and cognitive assessment, demonstrating the relative importance differences of different features in different classification tasks.

Download:

Fig 6. Task-specific performance metrics and feature contribution analysis.

https://doi.org/10.1371/journal.pone.0342549.g006

Fig 7 illustrates the feature importance ranking of the VPPCA-based model and the performance changes in the leave-one-out feature ablation experiment. The feature importance ranking shows that cognitive assessment has the highest importance score in the model (0.1064), followed by PET imaging biomarkers, cerebrospinal fluid (CSF) biomarkers, MRI volume measurements, and demographic information. The heatmap in the middle shows the model’s performance after removing a single feature in three classification tasks (CN vs Dementia, CN vs MCI, and MCI vs Dementia), quantified using AAUC (ΔAUC) and PR AUC (ΔPRAUC). The bar charts below specifically show the changes in ΔAUC and ΔPRAUC for each classification task after removing CSF, demographic information, PET, MRI, and cognitive assessment, with the most significant performance decline in the CN vs Dementia task after removing cognitive assessment (ΔAUC = 0.101, ΔPRAUC = 0.147).

Download:

Fig 7. Feature importance analysis and ablation study results.

https://doi.org/10.1371/journal.pone.0342549.g007

Diagnostic classification: Threshold-independent performance evaluation

To gain a more intuitive understanding of the score, it was applied to the Alzheimer’s disease diagnosis classification task. The goal was to evaluate whether the obtained score was sufficient to predict the patient’s diagnosis. We chose a shallow decision tree with a depth of 1 to predict whether the patient was assigned one of the two diagnoses. A threshold based on the VPPCA1 score was used for prediction. This threshold was determined by the training set and the node impurity was measured using the Gini index. Node impurity is the degree of mixing of patients with diagnoses below or above the threshold. Impurity is defined for each node m and a parameterized threshold T is used to define

(18)

In the experiment, based on the number of possible diagnosis categories, we set K = 2, and is the empirical proportion of observations of category K in node m. This method can well distinguish different diagnoses of patients and requires fewer parameters than linear models.

A decision tree based on depth 1 and baseline patient data with Gini index as the standard are used. According to the VPPCA1 score and the threshold determined according to the training set, it is used to separate patients and transform the potential disease progression modeling into a diagnostic classification problem. From Table 2, it can be seen that the scores of CN/MCI, CN/Dementia, and MCI/Dementia are all above 0.75, and the recall rates are all above 0.77. Especially for the classification task of CN/Dementia, the accuracy rate reaches 98%, but the classification performance of CN/MCI is lower, and the accuracy rate is much lower than that of the other two classification tasks, because this task is currently considered to be a very difficult task.

Download:

Table 2. Disease diagnosis classification results of test set patients based on decision tree.

https://doi.org/10.1371/journal.pone.0342549.t002

Table 3 provides a detailed comparison of VPPCA’s performance against other disease ‘progression models, such as PPCA, LTJMM, and Z-Score, across key Alzheimer’s disease diagnostic classification tasks. As shown, VPPCA outperforms the other models in multiple metrics, particularly in the CN/Dementia classification task, with a ROC-AUC of 0.990 and a PR-AUC of 0.982. Additionally, VPPCA demonstrates strong performance in the MCI/Dementia task, where it achieves a ROC-AUC of 0.785, surpassing PPCA and LTJMM.

Download:

Table 3. Performance comparison of VPPCA and other disease progression models for Alzheimer’s disease diagnostic classification.

https://doi.org/10.1371/journal.pone.0342549.t003

To comprehensively evaluate the classification performance of VPPCA along the Alzheimer’s disease (AD) continuum without relying on single-threshold metrics, we employed threshold-free measures including receiver operating characteristic (ROC) curves, area under the ROC curve (AUC) with confidence intervals (CI), and precision-recall AUC (PR-AUC). This analysis focused on three critical binary classification tasks: cognitively normal (CN) versus dementia (AD), CN versus mild cognitive impairment (MCI), and MCI versus AD, using identical data splits and preprocessing pipelines across all methods to ensure comparability.

As illustrated in the Fig 8, panels A, D, and G display the ROC curves for CN vs. AD, CN vs. MCI, and MCI vs. AD classifications, respectively, while panels B, E, and H present the corresponding precision-recall curves. Panels C, F, and I provide bar charts comparing performance metrics across methods, including VPPCA, probabilistic PCA (PPCA), and other benchmarks. For CN vs. AD classification, the ROC-AUC reached 0.990 (95% CI: 0.985–0.995) and PR-AUC was 0.982 (95% CI: 0.975–0.989), demonstrating near-perfect separability. In the more challenging CN vs. MCI task, ROC-AUC was 0.774 (95% CI: 0.750–0.798) with PR-AUC of 0.861 (95% CI: 0.840–0.882), indicating moderate discriminative power. For MCI vs. AD, ROC-AUC was 0.785 (95% CI: 0.760–0.810) and PR-AUC was 0.482 (95% CI: 0.450–0.514), reflecting the heterogeneity within MCI subgroups. All CIs were calculated via bootstrap resampling to ensure statistical robustness. This threshold-free approach validates the differential separability across the AD continuum, particularly for the nuanced CN-MCI and MCI-AD transitions.

Download:

Fig 8. Diagnostic classification performance comparison across different methods.

https://doi.org/10.1371/journal.pone.0342549.g008

Application of time shift in Alzheimer’s disease progression model

In previous literature on mixed-effect disease progression models, a single time shift is often considered an indispensable element, while study time or age cannot model the evolution of biomarkers well, including the LTJMM model proposed by Li et al. [21] in 2019. Martin et al. [7] demonstrated that the first principal component score from PPCA closely tracks the time-shift parameter of the LTJMM model. To avoid circularity when relating VPPCA1 to cognitive outcomes, we trained an alternative VPPCA model using only non-cognitive features—specifically, biological biomarkers (CSF measures, FDG-PET, MRI volumes) and demographic variables (age, education, APOE ε4 status), excluding all clinical cognitive assessments (ADAS-Cog13, CDR-SB, FAQ, RAVLT subtests, LDELTOTAL, DIGITSCOR, TRABSCOR). This ensures that the correlation between VPPCA1 and ADAS-13 is not tautological [22,23].

To avoid circularity (where cognitive features in the input would trivially correlate with cognitive outcomes), we trained a separate VPPCA model for this analysis using only the following non-cognitive features: Included (N = 14 features): CSF biomarkers: ABETA, TAU, PTAU (3 features);

PET imaging: FDG, AV45 (2 features); MRI volumes: Ventricles, Hippocampus, Entorhinal, Whole Brain, Fusiform, Mid Temp (6 features); Demographics: Age, Years of Education, APOE ε4 status (3 features) Excluded (N = 11 clinical assessments):CDR-SB, ADAS-Cog13, ADASQ4, FAQ, RAVLT Immediate, RAVLT Learning, RAVLT Forgetting, RAVLT Percent Forgetting, LDELTOTAL, DIGITSCOR, TRABSCOR.

The 1,021 amyloid-positive patients were randomly split into training (N = 510) and test (N = 511) sets at the patient level. The non-cognitive VPPCA model was fitted on the training set with z-score normalization applied to all features and MRI volumetric measures adjusted for age and intracranial volume (ICV) at baseline. VPPCA1_noncog scores were then computed for the test set, and the correlation with ADAS-Cog13 was evaluated on test set observations to ensure independent validation.

This non-cognitive VPPCA model was fitted on the training set using the same variational inference procedure, and the first principal component score (denoted VPPCA1_noncog) was computed for all longitudinal visits. We then examined the correlation between VPPCA1_noncog and ADAS-Cog13 to assess whether a biologically-grounded progression score can predict cognitive decline without directly including cognitive measures in the model. Cognitive variables include CDR-SB, ADAS-Cog13, ADASQ4, RAVLT Immediate, RAVLT Learning score, RAVLT Forgetting score, RAVLT Percent Forgetting, LDELTOTAL, DIGITSCOR, and TRABSCOR. We compared the model without cognitive variables and the complete model, and the results are shown in Fig 9. It can be seen that the VPPCA1 score in the model with complete variables is highly correlated with ADAS13 (Pearson r = 0.899), with a determination coefficient R² of 80.83%. The VPPCA score in the model without cognitive variables also shows some correlation with ADAS13 (Pearson r = 0.658), with a determination coefficient R² of 43.23%, but it remains statistically significant. Correlation coefficient comparison shows significant differences between the two models in Pearson and Spearman coefficients (approximately 0.899/0.904 for the complete model and approximately 0.658/0.676 for the non-cognitive model). This demonstrates that biologically-based progression markers can meaningfully track cognitive decline, supporting the use of VPPCA1_noncog as a disease-progression pseudotime scale. Patients with lower VPPCA1_noncog scores (indicating more severe biological pathology) typically exhibited higher ADAS-Cog13 scores (indicating greater cognitive impairment).Importantly, because VPPCA1_noncog is derived solely from biological and demographic features, this relationship is not circular. Using VPPCA1_noncog as a pseudotime measure better captured the evolution of cognitive symptoms than chronological age, supporting the hypothesis that a latent biological progression axis underlies clinical manifestations of AD.

Download:

Fig 9. VPPCA non-cognitive variable modeling vs. complete variable modeling.

https://doi.org/10.1371/journal.pone.0342549.g009

Longitudinal interpretation of disease progression models and predictions

Currently, assessing the development of Alzheimer’s disease (AD) requires comprehensive consideration of multiple time-varying factors, especially individual differences between patients. This is crucial for determining the stage and prediction of the patient’s disease. Our score is a value on the same dimension for patients with different latent states, which can be interpreted as a continuum of Alzheimer’s disease. The score can also be calculated from the biomarkers at any visit of the patient, so it can be estimated for all follow-up records.

In order to more accurately capture the dynamic changes of Alzheimer’s disease in each patient over time, we used a hierarchical model for longitudinal interpretation. A pair of coefficients is determined for each patient, describing the patient’s score as an affine transformation of time. In the longitudinal hierarchical explanatory model, the variational probability principal component analysis (VPPCA) method is used to analyze the disease progression of Alzheimer’s disease (AD) patients. The model captures each patient’s baseline score and the rate of change of the score over time by linking the patient’s follow-up data with time.

Specifically, the intercept () in the model represents each patient’s baseline score, reflecting the patient’s disease state at the initial moment. The slope () indicates the rate of change of the score over time, reflecting the speed of disease progression. By estimating these parameters, the model is able to provide personalized disease progression predictions for each patient. In addition, the model also takes into account the correlation between the intercept and the slope, and captures this correlation through Cholesky decomposition, thereby improving the robustness and accuracy of the model. The model is shown below:

For each patient p (where p = 1, 2, …, P), we assume that its intercept and slope follow a multivariate normal distribution:

(19)

where is the Cholesky decomposition of the correlation matrix.

For each observation point n (where n = 1, 2,..., N), assume that the observed score is:

(20)

Where is the expected score calculated based on the intercept and slope of the corresponding patient:

(21)

The prior distribution is:

(22)

(23)

(24)

(25)

(26)

(27)

Based on the results obtained from the longitudinal model, we obtained the best description of the patient score trajectory. The MCMC method was used to estimate the posterior distribution of the parameters. The number of warm-up iterations was set to 3000, and the Bayesian results were obtained in 3000 iterations after the warm-up iterations. The number of chains was set to 4. Fig 10 shows the posterior distribution of the parameters of the longitudinal hierarchical model. The R-hat statistic was used to check whether the MCMC chain converged. Through calculation, the R-hat statistic of all parameters was very close to 1 ([0.999, 1.001]). The posterior distribution probability of each parameter is shown in Fig 11. Fig 12 shows the disease progression in three patient groups: cognitively normal (CN), mild cognitive impairment (MCI), and dementia. The progression is based on patient grouping at baseline visits and the average progression within each group. It can be seen that patients with existing cognitive decline had higher initial scores and progressed more rapidly.

Download:

Fig 10. Distribution of the posterior convergence region of parameters of the vertical hierarchical model.

https://doi.org/10.1371/journal.pone.0342549.g010

Download:

Fig 11. Parameter posterior distribution probability diagram of the longitudinal hierarchical model.

https://doi.org/10.1371/journal.pone.0342549.g011

Download:

Fig 12. Trajectory diagram of test set patients based on baseline diagnosis classification (follow-up records at least three times).

https://doi.org/10.1371/journal.pone.0342549.g012

Comparison with other disease progression models

Our model was compared with three different methods. First, we compared the recently proposed probabilistic principal component analysis, which also describes the patient’s disease progression as an individual progression score. It uses the volume changes of some intracranial regions extracted from MRI for modeling, and replaces the time-shift changes with the first principal component score; we compared the correlation between the first principal component scores of probabilistic principal component analysis and variational probabilistic principal component analysis. Secondly, we compared the latent time joint mixed model (LTJMM), which calculates the time shift of individual variability and describes the changes of biomarkers through a mixed effect model. The time shift parameters of this method are estimated by longitudinal data, but are not time-varying. We compared them in the prediction of Alzheimer’s disease diagnosis classification. The last method compared was the average z-score processing biomarker.

As shown in Table 4, although the probabilistic principal component analysis (PPCA) achieved an accuracy of 99% in the CN/Dementia classification task, its indicators were inferior to VPPCA in the other two classification tasks. For the average z-score, the accuracy is comparable to that of PPCA in the CN/MCI and MCI/Dementia classification tasks, but it cannot be used as an excellent evaluation indicator in the CN/Dementia classification task, which is consistent with the conclusion of PPCA and average z-score discussed in previous papers.

Download:

Table 4. Diagnostic prediction results of VPPCA and other three methods for different categories.

https://doi.org/10.1371/journal.pone.0342549.t004

Fig 13 shows the correlation between the first principal component scores of VPPCA and PPCA (Pearson: r = 0.990, P < 0.001; Spearman: ρ = 0.990, P < 0.001). Fig 14 demonstrates VPPCA’s uncertainty quantification capabilities compared to PPCA’s deterministic point estimates. Panel A shows that VPPCA generates posterior distributions with patient-specific uncertainty levels (σ = 0.086–0.136), while PPCA provides only single-point estimates (dashed lines) without confidence information. The uncertainty distribution (Panel B, median σ = 0.106) reveals appropriate calibration with a right-skewed pattern, indicating most predictions achieve moderate confidence while a minority exhibits elevated uncertainty requiring clinical scrutiny.

Download:

Fig 13. Relationship between VPPCA1 and PPCA1.

https://doi.org/10.1371/journal.pone.0342549.g013

Download:

Fig 14. Comparison of VPPCA and PPCA uncertainty quantification.

https://doi.org/10.1371/journal.pone.0342549.g014

Critically, Panel C demonstrates that VPPCA’s uncertainty is not arbitrary but correlates significantly with reconstruction error (r = 0.343, p < 0.001), functioning as an automatic data quality indicator. Higher uncertainty flags predictions from incomplete or low-quality data without manual quality control. Panel D contrasts prediction formats: PPCA’s point estimates (red crosses) treat all predictions uniformly, whereas VPPCA’s 95% confidence intervals (blue bars, mean width = 0.4323) enable risk-differentiated decision-making—narrow intervals (samples 40–60) support confident clinical action, while wide intervals (samples 0–20) warrant additional testing.

Panel E shows VPPCA’s automatic Bayesian risk stratification: low (25.0%), medium (50.0%), and high (25.0%) uncertainty groups, identifying 682 patients requiring enhanced monitoring without arbitrary thresholds. This probabilistic approach surpasses PPCA’s threshold-based stratification, which cannot distinguish patients with identical point estimates but different data quality. These results demonstrate that VPPCA preserves PPCA’s disease signal while adding calibrated uncertainty that enables: (1) individualized confidence assessment, (2) automatic quality flagging (r = 0.343), and (3) automated risk stratification—transforming deterministic estimates into actionable probabilistic predictions for clinical decision support.

Discussion

This paper proposes a novel approach, namely variational probability principal component analysis (VPPCA), to predict and analyze the disease progression of Alzheimer’s disease (AD). Through the analysis of the ADNI dataset, the experiments demonstrate the effectiveness and robustness of the method in processing high-dimensional biomarker data, coping with missing values, and extracting disease progression features. The experiments show that most of the characteristics and variability of the data can be captured using only the first principal component, which shows that the method can well process high-dimensional datasets into a single value for subsequent analysis. In the ADNI dataset, variational probability principal component analysis can accurately find low-dimensional representations of heterogeneity. By predicting the disease diagnosis classification of amyloid-positive patients, this unsupervised method provides stronger discrimination ability than supervised models and DPMs models that require longitudinal data. We implement a longitudinal hierarchical interpretation model for the first principal component score of variational probability principal component analysis, which can estimate the rate of longitudinal change and provide a prediction of the disease trajectory for each patient.

Feature selection, preprocessing, and variational inference framework

We selected 25 features capturing diverse aspects of Alzheimer’s disease (AD) pathophysiology, including CSF biomarkers (ABETA, TAU, PTAU), neuroimaging (FDG-PET, AV45-PET, MRI volumes), cognitive assessments (ADAS-Cog13, CDR-SB, RAVLT), functional assessments (FAQ), and demographic/genetic data (age, education, APOE ε4). All features had missing rates below 40%. Unlike prior work focused on MRI volumes, our approach integrates molecular, structural, and functional data. Linear regression adjustment for brain volumes using age and baseline ICV was performed, confirming that VPPCA1 scores are not significantly correlated with age (Pearson r = 0.020, P = 0.653), thus reflecting disease progression rather than normal aging.

VPPCA, extending PCA through a Bayesian framework, overcomes key limitations of traditional methods, particularly in handling missing data and quantifying uncertainty. The procedure reduces the 25 features to a low-dimensional representation. As shown in Fig 1, VPPCA1 captures the primary disease progression signal with a broader distribution range and fewer outliers compared to VPPCA2. Fig 2 demonstrates clear diagnostic separation across 448 test patients, with minimal overlap between CN and dementia groups, and broader distributions for MCI, highlighting biological heterogeneity. Despite being unsupervised, VPPCA1 correlates strongly with LDA projections (r = 0.890, P < 0.001), validating its ability to capture the main axis of disease progression without labeled training data.

Robustness and classification performance

A block-wise feature ablation analysis demonstrated the importance of multi-modal integration for model robustness. The 25 features were divided into five blocks: cognitive assessments, CSF biomarkers (ABETA, TAU, PTAU), MRI volumes, PET imaging (FDG, AV45), and demographics (age, education, APOE ε4). Cognitive assessments had the highest importance score (0.1064), followed by PET, CSF, MRI, and demographics. Performance degradation (ΔAUC and ΔPR-AUC) was assessed across three classification tasks: CN vs Dementia, CN vs MCI, and MCI vs Dementia. As shown in the results, no single modality could match the full model’s performance, reinforcing the need for multi-modal integration to optimize diagnostic separation.

VPPCA handles missing data effectively via its variational inference framework, incorporating posterior distribution inference. The ADNI dataset, which shows varying missing rates for biomarkers, demonstrated that even with missing data, VPPCA maintained robustness (mean absolute difference in scores: < 0.049). Unlike methods requiring imputation, VPPCA adjusts for missing data automatically, as validated by the significant correlation between uncertainty and reconstruction error (r = 0.343, p < 0.001).

In threshold-free classification evaluations, VPPCA achieved excellent performance with ROC-AUC and PR-AUC values, particularly for the CN vs Dementia task (ROC-AUC = 0.990, PR-AUC = 0.982). However, MCI vs Dementia showed more moderate performance, reflecting the inherent challenges of distinguishing these subgroups. Importantly, VPPCA is more robust than traditional PPCA, as it provides uncertainty quantification, an advantage not present in deterministic methods. This capability is key for modeling disease trajectories and making personalized predictions over time.

The analysis confirms that VPPCA’s strength lies not in diagnostic classification but in its ability to capture disease progression across the Alzheimer’s continuum, offering a reliable measure for personalized patient care and longitudinal analysis.

Quantification of uncertainty in VPPCA and discussion on circular reasoning

A key concern in disease progression modeling is circularity—when cognitive measures used to generate progression scores lead to tautological correlations with cognitive outcomes, like ADAS-Cog13. To avoid this, we trained an alternative VPPCA model using only non-cognitive features: CSF biomarkers (ABETA, TAU, PTAU), PET imaging (FDG, AV45), MRI volumes (ventricles, hippocampus, entorhinal, fusiform, middle temporal), and demographics (age, education, APOE ε4). This model excluded all cognitive assessments, ensuring that the progression score was based solely on biological markers. Fig 9 shows that even without cognitive features, the non-cognitive VPPCA1 model maintains a meaningful correlation with ADAS-Cog13 (r = 0.658). This confirms that VPPCA1 captures disease progression based on biological markers, not just cognitive assessments, aligning with the ATN framework, where molecular and structural biomarkers precede cognitive symptoms.

VPPCA and PPCA show a near-perfect correlation (r = 0.990), suggesting they identify similar disease signals. However, VPPCA offers an important advantage through its probabilistic framework, which provides uncertainty quantification. As shown in Fig 14, VPPCA generates posterior distributions with uncertainty levels ranging from σ = 0.086 to 0.136, whereas PPCA only provides deterministic point estimates. This uncertainty is meaningful, as it correlates with reconstruction error (r = 0.343, p < 0.001), serving as an automatic data quality indicator. Higher uncertainty flags predictions that may arise from incomplete or low-quality data, without the need for manual quality control. Additionally, VPPCA’s 95% confidence intervals allow for more nuanced decision-making, with narrower intervals indicating higher confidence and wider intervals suggesting the need for additional testing. Fig 14 also demonstrates that VPPCA can automatically categorize patients into low, medium, and high uncertainty groups, enabling more targeted clinical monitoring. This probabilistic approach enhances clinical decision-making by providing not just point estimates but actionable uncertainty, something PPCA’s deterministic model lacks. Thus, VPPCA maintains PPCA’s disease signal while adding the critical capability of uncertainty quantification, transforming predictions into clinically interpretable and risk-differentiated insights.

Longitudinal explanation of disease progression model

In this study, we applied a longitudinal approach to model Alzheimer’s disease (AD) progression using hierarchical models and VPPCA. This model captures the dynamic progression of the disease over time, offering personalized predictions for each patient’s disease trajectory.

The model determines two key coefficients for each patient: the intercept (), which reflects the patient’s baseline disease state, and the slope (), which indicates the rate of disease progression. By estimating these parameters, the model can predict individual disease trajectories more accurately, helping clinicians tailor treatment plans to each patient’s unique progression. This contrasts with traditional cross-sectional studies that struggle to capture the disease’s dynamic nature over time. By integrating follow-up data with time, our model provides a clearer picture of disease progression, which is vital for early intervention and evaluating treatment outcomes.

Additionally, the model accounts for individual variability, which is often overlooked in traditional methods. It assumes that the intercept () and slope () follow a multivariate normal distribution, capturing the correlation between these two parameters through Cholesky decomposition. This improves the model’s robustness and accuracy.

We used the MCMC method to estimate the posterior distributions of the parameters. The model’s convergence was assessed using the R-hat statistic, which indicated strong convergence with values close to 1, suggesting that the parameter estimates were stable. Fig 9 illustrates the results, showing that patients with dementia typically start with higher scores and experience faster progression compared to cognitively normal (CN) individuals. For patients with mild cognitive impairment (MCI), their disease trajectories were more varied: some progressed similarly to CN patients, while others progressed at rates comparable to or faster than those with dementia. This variability highlights the importance of personalized predictions for Alzheimer’s disease progression.

Disease progression model with VPPCA1 score replacing time-lapse changes

In Alzheimer’s disease research, people usually build models to relate amyloid levels to time to describe the common progression scale of patients and predict changes in biomarkers. Similar studies often track patients through cognitive test scales to predict disease progression. However, these methods are usually limited to relative analysis of individuals and may not be effectively compared on a natural time scale. The limitation of this method is that they cannot handle the heterogeneity problem between individuals well.

Our method can effectively solve this heterogeneity problem by describing the biomarker changes of patients as a progression score. Specifically, we use VPPCA to generate a progression score as a unified pseudo time scale to replace the natural time scale. This approach not only more accurately reflects the disease progression trajectory of patients, but also enables different models to be compared on a unified pseudo time scale, thereby improving the comparability of studies and the accuracy of predictions.

With this progression score, we can better explain the relationship between biomarker changes and cognitive test results in Alzheimer’s patients. Fig 6 shows that considering age cannot well simulate the evolution of biomarkers in the progression of Alzheimer’s disease.Fig 7 shows that patients with lower VPPCA1 scores generally have lower ADAS values, which indicates that VPPCA1 as a measure of time can better explain the changes in ADAS values in patients with dementia.Overall, this approach verifies the superiority of VPPCA in simulating disease progression and provides a more accurate and unified way to study and predict the progression of Alzheimer’s disease.

Limitations and future prospects

Although VPPCA performs well in dimensionality reduction and feature extraction, the interpretability of its results still needs to be further enhanced. Although the first principal component can capture the specificity of most biomarkers, using it as an individual progression score may still fail to capture other sources of variability in some biomarkers in some patients, and for MCI patients, we only performed a broad classification without in-depth consideration of their potential subtypes.

Current studies mainly focus on AD-related biomarkers extracted by MRI. In the future, additional biological biomarkers (e.g., genomic data, proteomic panels, neuroimaging tracers) and environmental/lifestyle factors can be incorporated. This will help build a more comprehensive disease progression model and improve the accuracy and robustness of predictions. In addition, advanced feature selection methods can be used to automatically screen out the most predictive biomarkers.

In order to verify the universality and robustness of the VPPCA method, future studies should test and verify it in more independent datasets. This includes patient data from different regions, different races, and different age groups. At the same time, the application of VPPCA in other neurodegenerative diseases (such as Parkinson’s disease, Huntington’s disease, etc.) can be explored to evaluate its performance in different disease contexts.

Although VPPCA performs well in handling high-dimensional data and missing values, a single model may not be able to capture all disease characteristics. Future studies can consider integrating VPPCA with other advanced disease progression models (such as mixed effect models) to build a multi-model fusion system to improve the accuracy and robustness of predictions.

Conclusions

The variational probability principal component analysis (VPPCA) method proposed in this paper performs well in dealing with high-dimensional feature data and missing value problems in Alzheimer’s disease (AD). Experimental results show that VPPCA can accurately extract low-dimensional representations that reflect AD disease progression and effectively distinguish patients with different disease states. Compared with traditional methods, VPPCA has higher performance and robustness in predicting AD disease progression. However, this method still has limitations in its dependence on biomarker selection and its ability to distinguish different AD subtypes. Future work will further improve the interpretability and robustness of the model and explore its potential for application in other neurodegenerative diseases.

Supporting information

S1 Table. Feature set missing rate statistics.

https://doi.org/10.1371/journal.pone.0342549.s001

(DOCX)

Acknowledgments

The authors greatly appreciate the Alzheimer’s Disease Neuroimaging Initiative (ADNI) for providing the data used in the preparation of this article. ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from the following: AbbVie, Alzheimer’s Association; Alzheimer’s Drug Discovery Foundation; Araclon Biotech; BioClinica, Inc.; Biogen; Bristol-Myers Squibb Company; CereSpir, Inc.; Cogstate; Eisai Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-La Roche Ltd and its affiliated company Genentech, Inc.; Fujirebio; GE Healthcare; IXICO Ltd.; Janssen Alzheimer Immunotherapy Research & Development, LLC.; Johnson & Johnson Pharmaceutical Research & Development LLC.; Lumosity; Lundbeck; Merck & Co., Inc.; Meso Scale Diagnostics, LLC.; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Takeda Pharmaceutical Company; and Transition Therapeutics. The Canadian Institutes of Health Research provides funds to support ADNI clinical sites in Canada. Private sector contributions are facilitated by the Foundation for the National Institutes of Health (www.fnih.org). The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer’s Therapeutic Research Institute at the University of Southern California. ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of Southern California.

References

1. Scheltens P, Strooper BD, Kivipelto M, Holstege H, Chételat G, Teunissen CE. Lancet. 2021;397(10284):1577–90.
- View Article
- Google Scholar
2. Brookmeyer R, Johnson E, Ziegler-Graham K, Arrighi HM. Forecasting the global burden of Alzheimer’s disease. Alzheimers Dement. 2007;3(3):186–91. pmid:19595937
- View Article
- PubMed/NCBI
- Google Scholar
3. Sperling RA, Aisen PS, Beckett LA, Bennett DA, Craft S, Fagan AM, et al. Toward defining the preclinical stages of Alzheimer’s disease: recommendations from the National Institute on Aging-Alzheimer’s Association workgroups on diagnostic guidelines for Alzheimer’s disease. Alzheimers Dement. 2011;7(3):280–92. pmid:21514248
- View Article
- PubMed/NCBI
- Google Scholar
4. Kumar S, Oh I, Schindler S, Lai AM, Payne PRO, Gupta A. Machine learning for modeling the progression of Alzheimer disease dementia using clinical data: a systematic literature review. arXiv e-prints. 2021. https://arxiv.org/abs/2108.04174
5. Fisher CK, Smith AM, Walsh JR, Coalition Against Major Diseases, Abbott, Alliance for Aging Research, Alzheimer’s Association, et al. Machine learning for comprehensive forecasting of Alzheimer’s Disease progression. Sci Rep. 2019;9(1):13622. pmid:31541187
- View Article
- PubMed/NCBI
- Google Scholar
6. Jack CR Jr, Knopman DS, Jagust WJ, Shaw LM, Aisen PS, Weiner MW, et al. Hypothetical model of dynamic biomarkers of the Alzheimer’s pathological cascade. Lancet Neurol. 2010;9(1):119–28. pmid:20083042
- View Article
- PubMed/NCBI
- Google Scholar
7. Saint-Jalmes M, Fedyashov V, Beck D, Baldwin T, Faux NG, Bourgeat P, et al. Disease progression modelling of Alzheimer’s disease using probabilistic principal components analysis. Neuroimage. 2023;278:120279. pmid:37454702
- View Article
- PubMed/NCBI
- Google Scholar
8. Ayvaz DS, Baytas IM. Investigating conversion from mild cognitive impairment to Alzheimer’s disease using latent space manipulation. 2021.
9. Tipping ME, Bishop CM. Probabilistic principal component analysis. Aston University. 1999.
10. Jack CR Jr, Bennett DA, Blennow K, Carrillo MC, Dunn B, Haeberlein SB, et al. NIA-AA Research Framework: Toward a biological definition of Alzheimer’s disease. Alzheimers Dement. 2018;14(4):535–62. pmid:29653606
- View Article
- PubMed/NCBI
- Google Scholar
11. Shaw LM, Waligorska T, Fields L, Korecka M, Figurski M, Trojanowski JQ, et al. Derivation of cutoffs for the Elecsys® amyloid β (1-42) assay in Alzheimer’s disease. Alzheimers Dement (Amst). 2018;10:698–705. pmid:30426066
- View Article
- PubMed/NCBI
- Google Scholar
12. Shaw LM, Vanderstichele H, Knapik-Czajka M, Clark CM, Aisen PS, Petersen RC, et al. Cerebrospinal fluid biomarker signature in Alzheimer’s disease neuroimaging initiative subjects. Ann Neurol. 2009;65(4):403–13. pmid:19296504
- View Article
- PubMed/NCBI
- Google Scholar
13. Blennow K, Zetterberg H, Fagan AM. Fluid biomarkers in Alzheimer disease. Cold Spring Harb Perspect Med. 2012;2(9):a006221. pmid:22951438
- View Article
- PubMed/NCBI
- Google Scholar
14. Villemagne VL, Pike KE, Chételat G, Ellis KA, Mulligan RS, Bourgeat P, et al. Longitudinal assessment of Aβ and cognition in aging and Alzheimer disease. Ann Neurol. 2011;69(1):181–92. pmid:21280088
- View Article
- PubMed/NCBI
- Google Scholar
15. Weiner MW, Veitch DP, Aisen PS, Beckett LA, Cairns NJ, Cedarbaum J, et al. 2014 Update of the Alzheimer’s Disease Neuroimaging Initiative: A review of papers published since its inception. Alzheimers Dement. 2015;11(6):e1-120. pmid:26073027
- View Article
- PubMed/NCBI
- Google Scholar
16. Jagust W, Landau SM, Koeppe RA, Reiman EM, Chen K, Mathis CA. The Alzheimer’s Disease Neuroimaging Initiative 2 PET Core: 2015. Alzheimers Dement. 2015;11(7):757–71.
- View Article
- Google Scholar
17. Barnes J, Ridgway GR, Bartlett J, Henley SMD, Lehmann M, Hobbs N, et al. Head size, age and gender adjustment in MRI studies: a necessary nuisance?. Neuroimage. 2010;53(4):1244–55. pmid:20600995
- View Article
- PubMed/NCBI
- Google Scholar
18. Bishop CM. Variational principal components. In: 9th International Conference on Artificial Neural Networks: ICANN ’99, 1999. 509–14.
- View Article
- Google Scholar
19. Coley N, Gardette V, Cantet C, Gillette-Guyonnet S, Nourhashemi F, Vellas B, et al. How should we deal with missing data in clinical trials involving Alzheimer’s disease patients?. Curr Alzheimer Res. 2011;8(4):421–33. pmid:21244348
- View Article
- PubMed/NCBI
- Google Scholar
20. Liu Z, Wang G, Xiaohong L. Orthogonal sparse linear discriminant analysis. Int J Sys Sci. 2018.
- View Article
- Google Scholar
21. Li D, Iddi S, Thompson WK, Donohue MC, Alzheimer’s Disease Neuroimaging Initiative. Bayesian latent time joint mixed effect models for multicohort longitudinal data. Stat Methods Med Res. 2019;28(3):835–45. pmid:29168432
- View Article
- PubMed/NCBI
- Google Scholar
22. Raket LL. Statistical Disease Progression Modeling in Alzheimer Disease. Front Big Data. 2020;3:24. pmid:33693397
- View Article
- PubMed/NCBI
- Google Scholar
23. Lorenzi M, Filippone M, Frisoni GB, Alexander DC, Ourselin S, Alzheimer’s Disease Neuroimaging Initiative. Probabilistic disease progression modeling to characterize diagnostic uncertainty: Application to staging and prediction in Alzheimer’s disease. Neuroimage. 2019;190:56–68. pmid:29079521
- View Article
- PubMed/NCBI
- Google Scholar

[ref1] 1. Scheltens P, Strooper BD, Kivipelto M, Holstege H, Chételat G, Teunissen CE. Lancet. 2021;397(10284):1577–90.
View Article
Google Scholar

[2] View Article

[3] Google Scholar

[ref2] 2. Brookmeyer R, Johnson E, Ziegler-Graham K, Arrighi HM. Forecasting the global burden of Alzheimer’s disease. Alzheimers Dement. 2007;3(3):186–91. pmid:19595937
View Article
PubMed/NCBI
Google Scholar

[5] View Article

[6] PubMed/NCBI

[7] Google Scholar

[ref3] 3. Sperling RA, Aisen PS, Beckett LA, Bennett DA, Craft S, Fagan AM, et al. Toward defining the preclinical stages of Alzheimer’s disease: recommendations from the National Institute on Aging-Alzheimer’s Association workgroups on diagnostic guidelines for Alzheimer’s disease. Alzheimers Dement. 2011;7(3):280–92. pmid:21514248
View Article
PubMed/NCBI
Google Scholar

[9] View Article

[10] PubMed/NCBI

[11] Google Scholar

[ref4] 4. Kumar S, Oh I, Schindler S, Lai AM, Payne PRO, Gupta A. Machine learning for modeling the progression of Alzheimer disease dementia using clinical data: a systematic literature review. arXiv e-prints. 2021. https://arxiv.org/abs/2108.04174

[ref5] 5. Fisher CK, Smith AM, Walsh JR, Coalition Against Major Diseases, Abbott, Alliance for Aging Research, Alzheimer’s Association, et al. Machine learning for comprehensive forecasting of Alzheimer’s Disease progression. Sci Rep. 2019;9(1):13622. pmid:31541187
View Article
PubMed/NCBI
Google Scholar

[14] View Article

[15] PubMed/NCBI

[16] Google Scholar

[ref6] 6. Jack CR Jr, Knopman DS, Jagust WJ, Shaw LM, Aisen PS, Weiner MW, et al. Hypothetical model of dynamic biomarkers of the Alzheimer’s pathological cascade. Lancet Neurol. 2010;9(1):119–28. pmid:20083042
View Article
PubMed/NCBI
Google Scholar

[18] View Article

[19] PubMed/NCBI

[20] Google Scholar

[ref7] 7. Saint-Jalmes M, Fedyashov V, Beck D, Baldwin T, Faux NG, Bourgeat P, et al. Disease progression modelling of Alzheimer’s disease using probabilistic principal components analysis. Neuroimage. 2023;278:120279. pmid:37454702
View Article
PubMed/NCBI
Google Scholar

[22] View Article

[23] PubMed/NCBI

[24] Google Scholar

[ref8] 8. Ayvaz DS, Baytas IM. Investigating conversion from mild cognitive impairment to Alzheimer’s disease using latent space manipulation. 2021.

[ref9] 9. Tipping ME, Bishop CM. Probabilistic principal component analysis. Aston University. 1999.

[ref10] 10. Jack CR Jr, Bennett DA, Blennow K, Carrillo MC, Dunn B, Haeberlein SB, et al. NIA-AA Research Framework: Toward a biological definition of Alzheimer’s disease. Alzheimers Dement. 2018;14(4):535–62. pmid:29653606
View Article
PubMed/NCBI
Google Scholar

[28] View Article

[29] PubMed/NCBI

[30] Google Scholar

[ref11] 11. Shaw LM, Waligorska T, Fields L, Korecka M, Figurski M, Trojanowski JQ, et al. Derivation of cutoffs for the Elecsys® amyloid β (1-42) assay in Alzheimer’s disease. Alzheimers Dement (Amst). 2018;10:698–705. pmid:30426066
View Article
PubMed/NCBI
Google Scholar

[32] View Article

[33] PubMed/NCBI

[34] Google Scholar

[ref12] 12. Shaw LM, Vanderstichele H, Knapik-Czajka M, Clark CM, Aisen PS, Petersen RC, et al. Cerebrospinal fluid biomarker signature in Alzheimer’s disease neuroimaging initiative subjects. Ann Neurol. 2009;65(4):403–13. pmid:19296504
View Article
PubMed/NCBI
Google Scholar

[36] View Article

[37] PubMed/NCBI

[38] Google Scholar

[ref13] 13. Blennow K, Zetterberg H, Fagan AM. Fluid biomarkers in Alzheimer disease. Cold Spring Harb Perspect Med. 2012;2(9):a006221. pmid:22951438
View Article
PubMed/NCBI
Google Scholar

[40] View Article

[41] PubMed/NCBI

[42] Google Scholar

[ref14] 14. Villemagne VL, Pike KE, Chételat G, Ellis KA, Mulligan RS, Bourgeat P, et al. Longitudinal assessment of Aβ and cognition in aging and Alzheimer disease. Ann Neurol. 2011;69(1):181–92. pmid:21280088
View Article
PubMed/NCBI
Google Scholar

[44] View Article

[45] PubMed/NCBI

[46] Google Scholar

[ref15] 15. Weiner MW, Veitch DP, Aisen PS, Beckett LA, Cairns NJ, Cedarbaum J, et al. 2014 Update of the Alzheimer’s Disease Neuroimaging Initiative: A review of papers published since its inception. Alzheimers Dement. 2015;11(6):e1-120. pmid:26073027
View Article
PubMed/NCBI
Google Scholar

[48] View Article

[49] PubMed/NCBI

[50] Google Scholar

[ref16] 16. Jagust W, Landau SM, Koeppe RA, Reiman EM, Chen K, Mathis CA. The Alzheimer’s Disease Neuroimaging Initiative 2 PET Core: 2015. Alzheimers Dement. 2015;11(7):757–71.
View Article
Google Scholar

[52] View Article

[53] Google Scholar

[ref17] 17. Barnes J, Ridgway GR, Bartlett J, Henley SMD, Lehmann M, Hobbs N, et al. Head size, age and gender adjustment in MRI studies: a necessary nuisance?. Neuroimage. 2010;53(4):1244–55. pmid:20600995
View Article
PubMed/NCBI
Google Scholar

[55] View Article

[56] PubMed/NCBI

[57] Google Scholar

[ref18] 18. Bishop CM. Variational principal components. In: 9th International Conference on Artificial Neural Networks: ICANN ’99, 1999. 509–14.
View Article
Google Scholar

[59] View Article

[60] Google Scholar

[ref19] 19. Coley N, Gardette V, Cantet C, Gillette-Guyonnet S, Nourhashemi F, Vellas B, et al. How should we deal with missing data in clinical trials involving Alzheimer’s disease patients?. Curr Alzheimer Res. 2011;8(4):421–33. pmid:21244348
View Article
PubMed/NCBI
Google Scholar

[62] View Article

[63] PubMed/NCBI

[64] Google Scholar

[ref20] 20. Liu Z, Wang G, Xiaohong L. Orthogonal sparse linear discriminant analysis. Int J Sys Sci. 2018.
View Article
Google Scholar

[66] View Article

[67] Google Scholar

[ref21] 21. Li D, Iddi S, Thompson WK, Donohue MC, Alzheimer’s Disease Neuroimaging Initiative. Bayesian latent time joint mixed effect models for multicohort longitudinal data. Stat Methods Med Res. 2019;28(3):835–45. pmid:29168432
View Article
PubMed/NCBI
Google Scholar

[69] View Article

[70] PubMed/NCBI

[71] Google Scholar

[ref22] 22. Raket LL. Statistical Disease Progression Modeling in Alzheimer Disease. Front Big Data. 2020;3:24. pmid:33693397
View Article
PubMed/NCBI
Google Scholar

[73] View Article

[74] PubMed/NCBI

[75] Google Scholar

[ref23] 23. Lorenzi M, Filippone M, Frisoni GB, Alexander DC, Ourselin S, Alzheimer’s Disease Neuroimaging Initiative. Probabilistic disease progression modeling to characterize diagnostic uncertainty: Application to staging and prediction in Alzheimer’s disease. Neuroimage. 2019;190:56–68. pmid:29079521
View Article
PubMed/NCBI
Google Scholar

[77] View Article

[78] PubMed/NCBI

[79] Google Scholar

Figures

Abstract

Introduction

Materials and methods

Alzheimer’s disease neuroimaging initiative (ADNI)

Data preprocessing

Variational probabilistic principal component analysis (VPPCA)

Variational inference for Alzheimer’s disease biomarker analysis.

Estimating missing values.

Results and discussion

Comparison of the scores of the first and second principal components

VPPCA analysis of baseline feature scores in patients with Alzheimer’s disease

Correlation between VPPCA and linear discriminant analysis in predicting the progression of Alzheimer’s disease

Robustness: Block-wise feature ablation analysis

Diagnostic classification: Threshold-independent performance evaluation

Application of time shift in Alzheimer’s disease progression model

Longitudinal interpretation of disease progression models and predictions

Comparison with other disease progression models

Discussion

Feature selection, preprocessing, and variational inference framework

Robustness and classification performance

Quantification of uncertainty in VPPCA and discussion on circular reasoning

Longitudinal explanation of disease progression model

Disease progression model with VPPCA1 score replacing time-lapse changes

Limitations and future prospects

Conclusions

Supporting information

S1 Table. Feature set missing rate statistics.

Acknowledgments

References