Machine learning-based risk prediction model for cognitive dysfunction in elderly individuals

  • Lei Zhang,

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing

    Affiliation Department of Geriatrics, First Affiliated Hospital of Xinjiang Medical University, Urumqi, Xinjiang, China

  • Xuan Xiang,

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing

    Affiliation Department of Geriatrics, First Affiliated Hospital of Xinjiang Medical University, Urumqi, Xinjiang, China

  • Wei Chen,

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing

    Affiliation Department of Geriatrics, First Affiliated Hospital of Xinjiang Medical University, Urumqi, Xinjiang, China

  • Haijun Miao,

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing

    Affiliation Department of Geriatrics, First Affiliated Hospital of Xinjiang Medical University, Urumqi, Xinjiang, China

  • Ting Zou,

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing

    Affiliation Department of Geriatrics, First Affiliated Hospital of Xinjiang Medical University, Urumqi, Xinjiang, China

  • Ruikai Wu,

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing

    Affiliation Department of Geriatrics, First Affiliated Hospital of Xinjiang Medical University, Urumqi, Xinjiang, China

  • Xiaohui Zhou

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing

    zhouxiaohui858@sina.com

    Affiliation Department of Geriatrics, First Affiliated Hospital of Xinjiang Medical University, Urumqi, Xinjiang, China

Abstract

Background

With the advancement of globalization, the prevalence of cognitive dysfunction in the elderly population has risen significantly. Early intervention may dramatically alleviate the disease burden and reduce economic costs associated with cognitive impairment. This study aims to construct a risk prediction model for cognitive dysfunction based on machine learning (ML) algorithms, providing healthcare professionals and patients with a more accurate and effective tool for risk assessment.

Methods

This study included 1,325 elderly participants who completed cognitive assessments and comprehensive laboratory blood tests. Risk factors for cognitive dysfunction were identified through univariate analysis, multivariate logistic regression, LASSO regression, and the Boruta algorithm. Nine ML methods—Random Forest (RF), Light Gradient Boosting Machine (LightGBM), Extreme Gradient Boosting (XGBoost), Logistic Regression, K-Nearest Neighbor (KNN), Support Vector Machine (SVM), Artificial Neural Network (ANN), Decision Tree, and Elastic Net—were employed to construct the prediction models. The Shapley Additive Explanations (SHAP) algorithm was utilized to interpret the final model.

Results

The Random Forest model exhibited the highest predictive performance, with an AUC value exceeding those of other models. SHAP analysis identified age, race, education level, diabetes, and depression as the primary predictors of cognitive dysfunction in the elderly. The calibration curve indicated a strong alignment between the model’s predictions and actual outcomes, while the decision curve confirmed the model’s clinical applicability.

Conclusion

Age, race, education level, diabetes, and depression are significant influencing factors of cognitive dysfunction in the elderly. Among the ML algorithms evaluated, the Random Forest model exhibited the best predictive performance.

1. Introduction

The global population is aging rapidly, and increased life expectancy has made cognitive dysfunction in the elderly a major public health concern [1]. Cognitive impairment not only diminishes cognitive function and quality of life but also imposes substantial disease and economic burdens on patients and their families [2–4]. According to the World Alzheimer Report, an estimated 46.8 million individuals worldwide were affected by dementia in 2015, with projections indicating a rise to 131.5 million by 2050 [5]. Every three seconds, someone is diagnosed with dementia, and the annual cost of dementia is estimated at $1 trillion, a figure expected to double by 2030. Alzheimer’s disease (AD) is the most common form of dementia [6]. Primary prevention of AD holds significant potential, as one-third of global AD cases are attributable to modifiable risk factors. The World Health Organization (WHO) Guidelines for Risk Reduction of Cognitive Decline and Dementia [7] and the 2020 Lancet Commission report [8] identified several modifiable risk factors for dementia, including low education, advanced age, smoking, excessive alcohol consumption, obesity, depression, physical inactivity, hearing impairment, hypertension, diabetes, social isolation, traumatic brain injury, and air pollution.

Growing evidence suggests a link between cognitive dysfunction and inflammation [9]. Immunosenescence and inflammaging are hallmark features of aging [10], with inflammation playing a pivotal role in the pathogenesis of cognitive decline and dementia [11–14]. The systemic immune-inflammation index (SII) and systemic inflammation response index (SIRI), recently developed composite inflammatory markers, are widely used to assess systemic inflammation [15,16]. Studies by Wang et al. [17–22] have demonstrated a significant association between SII, SIRI, and cognitive impairment. Insulin resistance (IR) is also significantly associated with an increased risk of cognitive decline [23]. The triglyceride-glucose (TyG) index, derived from fasting triglyceride (TG) and blood glucose (FBG) levels, is a cost-effective and readily available surrogate marker for IR [24]. Research [25–28] indicates that a higher TyG index is significantly correlated with an elevated risk of dementia. Investigating the relationship between SII, SIRI, the TyG index, and cognitive function may provide a basis for early detection of cognitive impairment.

With the rapid advancement of artificial intelligence, risk models incorporating demographic, behavioral, and psychosocial factors have emerged. Machine learning (ML) offers unique advantages in medical prediction by automatically identifying key predictors and their interactions through feature importance ranking and decision-splitting mechanisms [29]. However, existing studies are often limited to single-algorithm validation, lacking multi-model performance comparisons and interpretability analyses. Therefore, this study leverages data from the 2011–2014 National Health and Nutrition Examination Survey (NHANES) to explore risk factors for cognitive dysfunction in the elderly. By employing nine ML algorithms to construct predictive models and comparing their performance using calibration and decision curve analyses, this study aims to identify the optimal model, offering new insights for healthcare professionals in predicting cognitive impairment risk among the elderly.

2. Materials and methods

2.1 Study population

This study utilized cross-sectional data from the 2011–2014 National Health and Nutrition Examination Survey (NHANES), conducted by the National Center for Health Statistics (NCHS) and the Centers for Disease Control and Prevention (CDC) to assess the health and nutritional status of individuals across all age groups in the United States. The survey was approved by the NCHS Ethics Review Board, and all participants provided written informed consent (https://www.cdc.gov/nchs/nhanes/about/erb.html?CDC_AAref_Val=https://www.cdc.gov/nchs/nhanes/irba98.htm). Inclusion criteria were: (1) age ≥ 60 years; and (2) complete responses to all survey items. A total of 1,325 elderly participants were included in the final analysis (Fig 1).

2.2 Study variables and definitions

2.2.1 Cognitive function assessment.

Cognitive ability in participants aged ≥60 years was evaluated using three standardized tests:

  1. Consortium to Establish a Registry for Alzheimer’s Disease (CERAD): Including immediate recall (CERAD-IR) and delayed recall (CERAD-DR) tests.
  2. Animal Fluency Test (AFT).
  3. Digit Symbol Substitution Test (DSST).

These tools are widely used in studies analyzing cognitive function and its risk factors [30–32]. A composite Z-score, termed the Overall cognitive ability score, was calculated by averaging the standardized scores of the CERAD, AFT, and DSST tests [33–36]. Although no definitive threshold for cognitive impairment has been established in prior studies, the 25th percentile of the Overall cognitive ability score was used as the cutoff in this study [37–39]. Participants were categorized into two groups: normal cognitive function and cognitive impairment.
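The composite score and cutoff described above can be sketched as follows. This is a minimal illustration rather than the study's code: the function names and the scores are made up, the population standard deviation is assumed for standardization, and treating scores strictly below the sample's 25th percentile as impaired is an assumption (the text does not say how ties or the CERAD subtests were handled).

```python
from statistics import mean, pstdev, quantiles

def z_scores(values):
    """Standardize raw test scores to mean 0 and SD 1 (population SD assumed)."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd for v in values]

def composite_cognition(cerad, aft, dsst):
    """Average the per-test z-scores into one composite score per participant."""
    standardized = [z_scores(cerad), z_scores(aft), z_scores(dsst)]
    return [mean(triple) for triple in zip(*standardized)]

def flag_impairment(composite):
    """Label scores below the sample's 25th percentile as impaired (1)."""
    cutoff = quantiles(composite, n=4, method="inclusive")[0]
    return cutoff, [int(score < cutoff) for score in composite]

# Made-up raw scores for eight hypothetical participants.
cerad = [18, 22, 25, 15, 27, 20, 24, 12]
aft = [14, 18, 20, 10, 22, 16, 19, 9]
dsst = [40, 52, 60, 30, 65, 45, 55, 25]

composite = composite_cognition(cerad, aft, dsst)
cutoff, impaired = flag_impairment(composite)
```

Because each test is standardized before averaging, tests with different score ranges (for example DSST versus AFT) contribute equally to the composite.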

2.2.2 Depression.

The Patient Health Questionnaire-9 (PHQ-9), a self-report tool widely used in clinical practice and research, was employed to screen, diagnose, and assess depression. The questionnaire consists of nine items covering core depressive symptoms, including low mood, loss of interest, sleep disturbances, appetite changes, fatigue, feelings of worthlessness, poor concentration, psychomotor retardation, and suicidal ideation. Each item is scored from 0 (“not at all”) to 3 (“nearly every day”), with a maximum total score of 27. A PHQ-9 score ≥10 was considered indicative of depression [40].
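As a small worked example of the scoring rule above, the sketch below sums nine item scores and applies the ≥10 cutoff. The function names and the sample responses are hypothetical.

```python
def phq9_total(item_scores):
    """Sum the nine PHQ-9 items, each scored 0 (not at all) to 3 (nearly every day)."""
    if len(item_scores) != 9:
        raise ValueError("PHQ-9 requires exactly nine item scores")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("each item must be scored 0, 1, 2, or 3")
    return sum(item_scores)

def has_depression(item_scores, threshold=10):
    """Apply the cutoff used in this study: a total score of 10 or more."""
    return phq9_total(item_scores) >= threshold

# Hypothetical responses from one participant.
responses = [1, 2, 1, 0, 3, 1, 1, 1, 0]
total = phq9_total(responses)
depressed = has_depression(responses)
```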

2.2.3 Immune-inflammatory indices and triglyceride-glucose index.

The systemic immune-inflammation index (SII), systemic inflammation response index (SIRI), and triglyceride-glucose (TyG) index were calculated using complete blood count (CBC) laboratory results from the NHANES database [18]. The following measurements were used (cell counts reported in 1,000 cells/μL; TG and FPG in mg/dL): platelet count (PC), neutrophil count (NC), monocyte count (MC), lymphocyte count (LC), triglycerides (TG), and fasting plasma glucose (FPG). The indices were derived as follows:

  1. SII = (platelet count × neutrophil count)/lymphocyte count;
  2. SIRI = (neutrophil count × monocyte count)/lymphocyte count;
  3. TyG = ln[fasting triglycerides (TG, mg/dL) × fasting plasma glucose (FPG, mg/dL)/2].
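The three indices can be computed directly from the listed measurements. The sketch below follows the formulas above; the function names and the input values are illustrative only, with cell counts in 1,000 cells/μL and TG and FPG in mg/dL.

```python
import math

def sii(platelets, neutrophils, lymphocytes):
    """Systemic immune-inflammation index: (PC x NC) / LC."""
    return platelets * neutrophils / lymphocytes

def siri(neutrophils, monocytes, lymphocytes):
    """Systemic inflammation response index: (NC x MC) / LC."""
    return neutrophils * monocytes / lymphocytes

def tyg(triglycerides, fasting_glucose):
    """Triglyceride-glucose index: ln(TG x FPG / 2), both in mg/dL."""
    return math.log(triglycerides * fasting_glucose / 2)

# Illustrative values for one hypothetical participant.
sii_value = sii(platelets=250, neutrophils=4.0, lymphocytes=2.0)    # 500.0
siri_value = siri(neutrophils=4.0, monocytes=0.5, lymphocytes=2.0)  # 1.0
tyg_value = tyg(triglycerides=150, fasting_glucose=100)             # ln(7500)
```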

2.2.4 Covariates.

Based on study design requirements, variables were assessed across different dimensions, with categorical variables assigned numerical values. Binary variables were coded as 0 or 1, while multi-category variables were assigned incremental values (e.g., 0, 1, 2). The selected covariates included:

Continuous variables: Age, minutes of sedentary activity.

Categorical variables: Gender (male, female); Race/ethnicity (Mexican American, Other Hispanic, Non-Hispanic White, Non-Hispanic Black, Other Race, including multiracial); Education level (<9th grade, 9–11th grade [including 12th grade without diploma], high school graduate/GED or equivalent, some college or AA degree, college graduate or above); Marital status (married, widowed, divorced, separated, never married, living with partner); BMI (<25, 25 to <30, ≥30); Self-reported sleep trouble (no, yes); Diabetes (no, borderline, yes); Heart disease (no, yes); Stroke (no, yes); Depression (no, yes).

2.3 Model development

Risk factors for cognitive impairment were screened using univariate analysis, multivariate logistic regression, Least Absolute Shrinkage and Selection Operator (LASSO) regression, and the Boruta algorithm. The dataset was randomly split into training (70%) and testing (30%) sets. Nine supervised machine learning (ML) algorithms were employed to construct prediction models:

  1. Random Forest (RF)
  2. Light Gradient Boosting Machine (LightGBM)
  3. Extreme Gradient Boosting (XGBoost)
  4. Logistic Regression
  5. K-Nearest Neighbor (KNN)
  6. Support Vector Machine (SVM)
  7. Artificial Neural Network (ANN)
  8. Decision Tree
  9. Elastic Net

Hyperparameter optimization is crucial for model performance. We employed grid search with 5-fold cross-validation to identify the optimal hyperparameter combination for each algorithm. The hyperparameter space for each algorithm is detailed in Table 1. The configuration yielding the best average performance across the folds was selected for model building.

Table 1. Hyperparameter values for each machine learning algorithm.

https://doi.org/10.1371/journal.pone.0336058.t001
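The 70/30 split and the grid search with 5-fold cross-validation can be illustrated with a small standard-library sketch. It is not the study's pipeline: the toy one-feature KNN classifier, the synthetic data, the `n_neighbors` grid, and all function names are assumptions chosen to keep the example self-contained, and the actual search spaces are those in Table 1.

```python
import random
from itertools import product
from statistics import mean

def train_test_split(rows, test_fraction=0.3, seed=0):
    """Shuffle the rows and hold out a fraction of them for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def k_folds(rows, k=5):
    """Yield (train, validation) pairs; each row is validated exactly once."""
    for i in range(k):
        validation = rows[i::k]
        train = [row for j, row in enumerate(rows) if j % k != i]
        yield train, validation

def knn_predict(train, x, n_neighbors):
    """Majority vote among the nearest training points (one feature only)."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:n_neighbors]
    votes = sum(label for _, label in nearest)
    return int(votes * 2 > len(nearest))

def accuracy(train, validation, n_neighbors):
    """Share of validation rows that the classifier labels correctly."""
    hits = sum(knn_predict(train, x, n_neighbors) == y for x, y in validation)
    return hits / len(validation)

def grid_search(rows, grid, k=5):
    """Return the parameter combination with the best mean score across folds."""
    best = None
    for values in product(*grid.values()):
        params = dict(zip(grid.keys(), values))
        score = mean([accuracy(tr, va, **params) for tr, va in k_folds(rows, k)])
        if best is None or score > best[1]:
            best = (params, score)
    return best

# Synthetic data: one feature, label 1 when the feature exceeds 50.
rows = [(x, int(x > 50)) for x in range(100)]
train_rows, test_rows = train_test_split(rows)
best_params, best_score = grid_search(train_rows, {"n_neighbors": [1, 3, 5]})
test_accuracy = accuracy(train_rows, test_rows, **best_params)
```

Keeping the test rows out of the grid search means that `test_accuracy` is computed on data that played no part in choosing the hyperparameters.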

2.4 Evaluation metrics

Performance metrics included accuracy, recall, specificity, positive predictive value (PPV), negative predictive value (NPV), area under the receiver operating characteristic curve (AUC-ROC), and F1-score. Calibration curves were used to assess model consistency, with the Brier score quantifying calibration performance (lower scores indicate better accuracy: 0–0.1 = excellent, 0.1–0.25 = good, >0.25 = poor). Decision curve analysis (DCA) evaluated clinical utility, and the Shapley Additive Explanations (SHAP) algorithm interpreted feature contributions to model predictions.
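The metrics listed above can be computed from predicted probabilities as sketched below. The labels and probabilities are made up, the 0.5 classification threshold is an assumption, and the net benefit uses the standard decision-curve formula (true positives minus false positives weighted by the threshold odds, per participant); the study's own implementation is not shown in the text.

```python
from statistics import mean

def confusion_counts(y_true, y_prob, threshold):
    """Count TP, TN, FP, FN when predicting 1 for probabilities >= threshold."""
    pairs = [(t, int(p >= threshold)) for t, p in zip(y_true, y_prob)]
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    return tp, tn, fp, fn

def classification_metrics(y_true, y_prob, threshold=0.5):
    """Threshold-based metrics; assumes no denominator is zero."""
    tp, tn, fp, fn = confusion_counts(y_true, y_prob, threshold)
    recall, ppv = tp / (tp + fn), tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "recall": recall,
        "specificity": tn / (tn + fp),
        "ppv": ppv,
        "npv": tn / (tn + fn),
        "f1": 2 * ppv * recall / (ppv + recall),
    }

def auc(y_true, y_prob):
    """Chance that a random positive outranks a random negative (ties count half)."""
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return mean([(p - t) ** 2 for t, p in zip(y_true, y_prob)])

def net_benefit(y_true, y_prob, threshold):
    """Decision-curve net benefit at one risk threshold."""
    tp, _, fp, _ = confusion_counts(y_true, y_prob, threshold)
    n = len(y_true)
    return tp / n - fp / n * threshold / (1 - threshold)

# Made-up outcomes (1 = impaired) and predicted probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_prob = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]

metrics = classification_metrics(y_true, y_prob)
auc_value = auc(y_true, y_prob)
brier = brier_score(y_true, y_prob)
benefit = net_benefit(y_true, y_prob, threshold=0.5)
```

Evaluating `net_benefit` over a range of thresholds and plotting the result against the threshold gives a decision curve.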

2.5 Statistical analysis

Analyses were conducted using R 4.3 and Python 3.11.5. Non-normally distributed continuous variables were expressed as median (interquartile range) [M(P25, P75)], with group comparisons performed using nonparametric tests. Categorical variables were reported as frequencies and percentages (n, %), with group comparisons assessed via Z-tests. A two-tailed P < 0.05 was considered statistically significant.

3. Results

3.1 Participant characteristics

Among the 1,325 participants, 1,092 (82.42%) had normal cognitive function, while 233 (17.58%) exhibited cognitive impairment. Significant differences (P < 0.05) were observed between groups for age, race, education level, marital status, diabetes, stroke, and depression. No significant differences (P > 0.05) were found for sedentary activity, SII, SIRI, TyG index, gender, BMI, self-reported sleep trouble, or heart disease (Table 2).

Table 2. Basic characteristics of study participants (n = 1325).

https://doi.org/10.1371/journal.pone.0336058.t002

3.2 Feature selection

Multivariate logistic regression, LASSO regression, and the Boruta algorithm were used to screen risk factors closely related to cognitive impairment in the elderly, taking as input the variables that showed statistically significant differences in univariate analysis. The five important predictors obtained from these three methods were age, race, education level, diabetes, and depression (Table 3, Fig 2).

Fig 2. Lasso regression and Boruta algorithm for screening predictive factors.

(A) Coefficient path of Lasso regression. (B) Cross-validation results of Lasso regression. (C) Boruta algorithm results.

https://doi.org/10.1371/journal.pone.0336058.g002
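To show how LASSO screens out variables, the sketch below runs coordinate descent with soft-thresholding on toy data. It is a simplification under stated assumptions: it uses a squared-error loss rather than the logistic loss that a binary outcome would normally call for, the data and the penalty value `lam` are made up, and the function names are hypothetical.

```python
def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values within [-lam, lam] become 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0

def lasso_coordinate_descent(columns, y, lam, n_iter=100):
    """Minimize (1/2n)*||y - Xb||^2 + lam*||b||_1 for centered columns and y."""
    n = len(y)
    coefs = [0.0] * len(columns)
    for _ in range(n_iter):
        for j, col in enumerate(columns):
            # Partial residual: y minus the fit of every feature except j.
            partial = [
                y[i] - sum(coefs[k] * columns[k][i]
                           for k in range(len(columns)) if k != j)
                for i in range(n)
            ]
            rho = sum(col[i] * partial[i] for i in range(n)) / n
            scale = sum(v * v for v in col) / n
            coefs[j] = soft_threshold(rho, lam) / scale
    return coefs

# Toy, already-centered data: the outcome depends on x1 only; x2 is unrelated.
x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
x2 = [1.0, -2.0, 2.0, -2.0, 1.0]
y = [3.0 * v for v in x1]

coefs = lasso_coordinate_descent([x1, x2], y, lam=1.0)
selected = [name for name, b in zip(["x1", "x2"], coefs) if b != 0.0]
```

The coefficient of the unrelated feature is set exactly to zero, while the informative feature keeps a shrunken coefficient; with a larger penalty every coefficient is eventually zeroed, which is what the coefficient path in a LASSO plot traces.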

3.3 Performance comparison of nine prediction models

The nine ML algorithms were trained using the selected predictors. Random Forest demonstrated the highest performance. On the training set, it achieved AUC = 0.872 (95% CI: 0.854–0.890), accuracy = 0.787, sensitivity = 0.795, specificity = 0.780, PPV = 0.786, NPV = 0.792, and F1-score = 0.789. On the testing set, it achieved AUC = 0.870 (95% CI: 0.850–0.890), accuracy = 0.770, sensitivity = 0.778, specificity = 0.762, PPV = 0.768, NPV = 0.775, and F1-score = 0.772 (Table 4, Fig 3).

Table 4. Performance comparison of 9 machine learning prediction models.

https://doi.org/10.1371/journal.pone.0336058.t004

Fig 3. ROC curves of 9 machine learning prediction models.

(A) Training set. (B) Test set.

https://doi.org/10.1371/journal.pone.0336058.g003

The accuracy and clinical practicality of the nine models (Random Forest, LightGBM, XGBoost, Logistic Regression, KNN, SVM, ANN, Decision Tree, and Elastic Net) were then evaluated on the test set. The calibration curves for the test set show that the Brier scores of all nine models are below 0.20, indicating that the models have good accuracy and that the predicted results are consistent with the actual results (Fig 4).

Fig 4. Calibration curves of 9 machine learning prediction models.

(A) Training set. (B) Test set.

https://doi.org/10.1371/journal.pone.0336058.g004

The DCA curves showed that, when the risk threshold was between 0.1 and 0.8, the nine models (Random Forest, LightGBM, XGBoost, Logistic Regression, KNN, SVM, ANN, Decision Tree, and Elastic Net) obtained a better clinical net benefit, indicating good clinical applicability (Fig 5).

Fig 5. DCA curves of 9 machine learning prediction models.

(A) Training set. (B) Test set.

https://doi.org/10.1371/journal.pone.0336058.g005

Based on the above analysis, the Random Forest model performs the best in predicting the risk of cognitive impairment in the elderly, with high prediction accuracy and good clinical practicality. Therefore, the Random Forest model was chosen as the final model for predicting the risk of cognitive impairment in the elderly.

3.4 Feature importance analysis

SHAP analysis of the Random Forest model ranked predictor importance as: education level > age > race > diabetes > depression (Fig 6A). The beeswarm plot revealed a negative association between education level and cognitive impairment, and positive associations for age, diabetes, and depression (Fig 6B).

Fig 6. SHAP’s Visual Explanation of the Global Model.

(A) Bar chart. (B) Beeswarm plot.

https://doi.org/10.1371/journal.pone.0336058.g006

For a single case, SHAP force plots, waterfall plots, and decision plots can explain the prediction. For example, Case 1 has an education level of less than 9th grade, is Mexican American, is 60 years old, and has no diabetes and no depression; for this case, the Random Forest risk model predicts a probability of cognitive impairment of 0.96 (Fig 7).

Fig 7. Visual interpretation of SHAP for single-sample cases.

(A) Force Plot. (B) Waterfall Plot. (C) Decision Plot.

https://doi.org/10.1371/journal.pone.0336058.g007
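The idea behind the per-case SHAP plots is that a prediction is split into additive feature contributions (Shapley values). The sketch below computes exact Shapley values by enumerating feature coalitions for a toy scoring function. Everything in it is hypothetical: `toy_risk` and its coefficients stand in for the fitted Random Forest, race is omitted for brevity, and replacing absent features with a single baseline value is a simplification of how SHAP handles missing features.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, instance, baseline):
    """Exact Shapley values of each feature for one instance."""
    names = list(instance)
    n = len(names)

    def value(coalition):
        # Features in the coalition take the instance's values; others stay at baseline.
        mixed = {k: (instance[k] if k in coalition else baseline[k]) for k in names}
        return model(mixed)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                total += weight * (value(members | {name}) - value(members))
        phi[name] = total
    return phi

def toy_risk(f):
    """Hypothetical risk score with one interaction term (not the study's model)."""
    score = 0.02 * (f["age"] - 60) - 0.05 * f["education"]
    score += 0.10 * f["diabetes"] + 0.15 * f["depression"]
    score += 0.05 * f["diabetes"] * f["depression"]
    return score

instance = {"age": 75, "education": 1, "diabetes": 1, "depression": 1}
baseline = {"age": 60, "education": 3, "diabetes": 0, "depression": 0}

phi = shapley_values(toy_risk, instance, baseline)
gap = toy_risk(instance) - toy_risk(baseline)
```

The contributions add up to the gap between the instance's score and the baseline score, which is the property that force and waterfall plots visualize; the interaction between diabetes and depression is shared equally between the two features.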

4. Discussion

In recent years, machine learning (ML), deep learning, artificial intelligence, and statistical analysis have been increasingly applied to medical research [41–45]. ML algorithms can leverage large datasets for training, thereby enhancing the accuracy and predictive power of models. These algorithms autonomously learn patterns and relationships from data to construct predictive models without manual intervention, improving efficiency. Moreover, they can continuously update and optimize models with new data, ensuring adaptability to evolving environments and datasets [46]. ML excels at processing high-dimensional data and complex relationships, uncovering nonlinear associations and patterns that may elude traditional methods. Its capacity to handle large-scale data enables the extraction of actionable insights, while its interpretability clarifies model mechanics and decision-making processes, offering precise predictive and decision-support tools in medicine [47].

During model development, nine ML algorithms were evaluated, with the Random Forest (RF) model demonstrating superior performance in predicting cognitive impairment risk among elderly individuals. As a classical ML algorithm, RF efficiently handles high-dimensional data by employing an ensemble of decision trees, which mitigates overfitting and captures complex feature interactions [48–50]. Its robustness to noise and outliers further enhances reliability in real-world applications. To improve transparency, the Shapley Additive Explanations (SHAP) algorithm was employed for model interpretation. Globally, SHAP quantified the relative contribution of each feature to cognitive impairment risk; locally, it elucidated how individual predictors influenced specific cases. This dual interpretability strengthens the model’s clinical utility [51].

Feature or variable selection is the core of predictive model development [52,53]. This study used three methods (multivariate logistic regression, LASSO regression, and the Boruta algorithm) to identify age, race, education level, diabetes, and depression as predictors. A risk prediction model built on these five predictors showed good predictive performance, accuracy, and clinical benefit. Previous studies have focused on age [54,55], education level [56,57], race [58], diabetes [59], and depression [44]. The literature indicates that the predictors we selected are readily available and reliable.

SHAP analysis showed that education level is the primary predictor and that a higher education level can reduce the risk of cognitive impairment. Education can help improve memory, cognitive stimulation, and cognitive abilities [60]. Cognitive stimulation activities may slow the rate of hippocampal atrophy during normal aging [61] and may even prevent the accumulation of amyloid plaques [62]. The deposition of amyloid beta (Aβ) is a biomarker for cognitive impairment. Education mainly strengthens process control and conceptual understanding in cognitive function. Compared with those with fewer years of education, those with more years of education have an 85% lower risk of mild cognitive impairment (MCI) and Alzheimer’s disease [63]. The cognitive reserve hypothesis [64] proposes that a stimulating environment promotes neurogenesis, the growth of new neurons, and thereby neural plasticity. As educational attainment and cognitive reserve increase, the expression of cognitive decline may be delayed [65].

Guidelines and expert consensus on cognitive disorders [66–69] indicate that age is one of the predictive factors for the risk of cognitive impairment. As the body ages, organs and tissues deteriorate, and the functional connections of the brain network selectively weaken, inevitably leading to a decline in cognitive ability. Lee et al. [70,71] found that hippocampal neurons located deep in the temporal lobe of the brain help us classify and understand human perception and experience, from the most basic to the highly complex. As we age, the balance between pattern separation and pattern completion is disrupted, and memory is impaired. Moreover, the hippocampus is highly susceptible to hypoxic/ischemic damage, and the function of the hippocampal vascular system is crucial for maintaining neurocognitive health. A decrease in hippocampal blood flow occurs during healthy aging and can lead to neuronal atrophy and memory decline in the hippocampus [72,73].

There is a close relationship between diabetes and cognitive impairment [74]. Type 2 diabetes can increase the incidence of Alzheimer’s disease (AD) by 1.5–2.5 times [71]. Many scholars have found that diabetes and cognitive impairment share many common pathophysiological bases. Diabetes can cause an inflammatory reaction, metabolic disorder, microvascular disease, oxidative stress, Aβ deposition, and neurofibrillary tangles, leading to insulin resistance, damage to synaptic plasticity, synaptic degeneration, and cell death. Patients with diabetes show abnormal insulin signal transduction pathways, weakened mitochondrial function, autonomic nervous dysfunction, and neurocellular inflammation, which can affect brain tissue and structure and ultimately lead to cognitive decline [75,76].

Depression increases the risk of cognitive impairment through multiple mechanisms at the neurobiological, endocrine, and behavioral levels. First, depression leads to abnormal levels of key neurotransmitters such as serotonin and dopamine in the brain, impairing synaptic transmission efficiency and directly affecting memory encoding and cognitive flexibility [77]. Long-term depression may interfere with glutamate-mediated neuronal excitability, inhibit hippocampal neural plasticity, and accelerate cognitive decline [78]. Depressive states activate microglia, promote the release of pro-inflammatory factors such as IL-6 and TNF-α, damage neurons, and hinder synaptic remodeling [79]. Depression leads to hyperfunction of the hypothalamic-pituitary-adrenal (HPA) axis, sustained elevation of cortisol, and direct toxicity to hippocampal neurons [80]. Depression is often accompanied by abnormal glucose metabolism, and an increase in the TyG index reflects insulin resistance, which can reduce brain glucose utilization and impair cognitive function [29]. Depression-related stress hormones can promote abnormal phosphorylation of tau protein and accelerate the formation of neurofibrillary tangles, and are directly associated with cognitive symptoms of Alzheimer’s disease [81]. There is a bidirectional relationship between cognitive impairment and depression [82]. On the one hand, cognitive impairment leads to a decrease in social participation and emotional regulation ability, which in turn triggers depression and exacerbates depressive symptoms [83,84]. On the other hand, depression accelerates cognitive decline by promoting neuroinflammation and abnormal brain function, and this vicious cycle accelerates the transition to dementia.

Previous studies have found that as the SII, SIRI, and TyG indices increase, the risk of cognitive impairment in the elderly increases [14,18,85–92]. However, after removing records with missing complete blood count (CBC) values from the 2011–2014 NHANES laboratory data, this study did not find statistically significant differences in SII, SIRI, or TyG between the cognitively normal and cognitively impaired groups. Several explanations are possible. First, removing records with missing values may have reduced the sample size, lowering the power of the tests and preventing the detection of actual differences. Second, the SII and SIRI inflammatory markers mainly reflect systemic inflammatory status, and their specificity for cognitive impairment may not be high. Although inflammation is associated with cognitive decline, the SII and SIRI indices are not sensitive biomarkers for cognitive impairment, especially in the elderly population, which is influenced by multiple confounding factors such as coexisting chronic diseases and drug treatment. Third, the TyG index evaluates insulin resistance. Although there is a theoretical association between insulin resistance and cognitive impairment (such as Alzheimer’s disease), individual variation in metabolic factors (such as lifestyle and genetic background) in the actual population may dilute the differences between the cognitively normal and cognitively impaired groups. Together, these factors may explain the lack of statistical significance of SII, SIRI, and TyG as predictive factors in this study.

This study has several limitations. First, the sample available for the SII, SIRI, and TyG indices was relatively small after records with missing complete blood count (CBC) values were removed, which limits the learning ability of the ML models. Second, feature selection was limited and did not fully explore all potential factors that affect the risk of cognitive impairment in the elderly. Third, model selection and parameter tuning need further optimization. As a result, indicators such as AUC and recall leave significant room for improvement, although they are within an acceptable range. Future research should expand the sample size for indices such as SII, SIRI, and TyG to improve statistical power, and combine multimodal evaluation (such as imaging and genetic markers) to reduce confounding bias. More efficient algorithms will also be explored to improve model performance and expand the applicability of the model.

In summary, Age, Race, Education level, Diabetes, and Depression are the influencing factors of cognitive impairment in the elderly. This study constructs a prediction model for cognitive impairment risk in the elderly based on machine learning algorithms. Among them, the random forest algorithm has the best prediction performance and certain predictive value, which can provide new ideas and methods for early identification and intervention of cognitive impairment risk in the elderly.

References

  1. Partridge L, Deelen J, Slagboom PE. Facing up to the global challenges of ageing. Nature. 2018;561(7721):45–56. pmid:30185958
  2. Pan C-W, Wang X, Ma Q, Sun H-P, Xu Y, Wang P. Cognitive dysfunction and health-related quality of life among older Chinese. Sci Rep. 2015;5:17301. pmid:26601612
  3. Jia J, Wei C, Chen S, Li F, Tang Y, Qin W, et al. The cost of Alzheimer’s disease in China and re-estimation of costs worldwide. Alzheimers Dement. 2018;14(4):483–91. pmid:29433981
  4. Academy of Cognitive Disorders of China (ACDC), Han Y, Jia J, Li X, Lv Y, Sun X, et al. Expert consensus on the care and management of patients with cognitive impairment in China. Neurosci Bull. 2020;36(3):307–20. pmid:31792911
  5. Broadhouse KM, Singh MF, Suo C, Gates N, Wen W, Brodaty H, et al. Hippocampal plasticity underpins long-term cognitive gains from resistance exercise in MCI. Neuroimage Clin. 2020;25:102182. pmid:31978826
  6. Li J-Q, Tan L, Wang H-F, Tan M-S, Tan L, Xu W, et al. Risk factors for predicting progression from mild cognitive impairment to Alzheimer’s disease: a systematic review and meta-analysis of cohort studies. J Neurol Neurosurg Psychiatry. 2016;87(5):476–84. pmid:26001840
  7. Chowdhary N, Barbui C, Anstey KJ, Kivipelto M, Barbera M, Peters R, et al. Reducing the risk of cognitive decline and dementia: WHO recommendations. Front Neurol. 2022;12:765584. pmid:35082745
  8. Livingston G, Huntley J, Sommerlad A, Ames D, Ballard C, Banerjee S, et al. Dementia prevention, intervention, and care: 2020 report of the lancet commission. Lancet. 2020;396(10248):413–46. pmid:32738937
  9. Barter J, Kumar A, Bean L, Ciesla M, Foster TC. Adulthood systemic inflammation accelerates the trajectory of age-related cognitive decline. Aging (Albany NY). 2021;13(18):22092–108. pmid:34587117
  10. Aiello A, Farzaneh F, Candore G, Caruso C, Davinelli S, Gambino CM, et al. Immunosenescence and its hallmarks: how to oppose aging strategically? A review of potential options for therapeutic intervention. Front Immunol. 2019;10:2247. pmid:31608061
  11. Irwin MR, Vitiello MV. Implications of sleep disturbance and inflammation for Alzheimer’s disease dementia. Lancet Neurol. 2019;18(3):296–306. pmid:30661858
  12. Wang M, Zeng X, Liu Q, Yang Z, Li J. The association between sleep duration and cognitive function in the U.S. elderly from NHANES 2011-2014: A mediation analysis for inflammatory biomarkers. J Affect Disord. 2025;375:465–71. pmid:39900296
  13. Tondo G, Aprile D, De Marchi F, Sarasso B, Serra P, Borasio G, et al. Investigating the prognostic role of peripheral inflammatory markers in mild cognitive impairment. J Clin Med. 2023;12(13):4298. pmid:37445333
  14. Zheng Y, Yu Y, Gao L, Yu M, Jiang L, Zhu Q. Association of red blood cell count, hemoglobin concentration, and inflammatory indices with cognitive impairment severity in Alzheimer’s disease. Sci Rep. 2025;15(1):17425. pmid:40394088
  15. Wang Q, Ma J, Jiang Z, Ming L. Prognostic value of neutrophil-to-lymphocyte ratio and platelet-to-lymphocyte ratio in acute pulmonary embolism: a systematic review and meta-analysis. Int Angiol. 2018;37(1):4–11. pmid:28541022
  16. Gasparyan AY, Ayvazyan L, Mukanova U, Yessirkepov M, Kitas GD. The platelet-to-lymphocyte ratio as an inflammatory marker in rheumatic diseases. Ann Lab Med. 2019;39(4):345–57. pmid:30809980
  17. Wang X, Li T, Li H, Li D, Wang X, Zhao A, et al. Association of dietary inflammatory potential with blood inflammation: the prospective markers on mild cognitive impairment. Nutrients. 2022;14(12):2417. pmid:35745147
  18. Lu W, Zhang K, Chang X, Yu X, Bian J. The association between systemic immune-inflammation index and postoperative cognitive decline in elderly patients. Clin Interv Aging. 2022;17:699–705. pmid:35535363
  19. Xiao Y, Teng Z, Xu J, Qi Q, Guan T, Jiang X, et al. Systemic immune-inflammation index is associated with cerebral small vessel disease burden and cognitive impairment. Neuropsychiatr Dis Treat. 2023;19:403–13. pmid:36852257
  20. Guo Z, Zheng Y, Geng J, Wu Z, Wei T, Shan G, et al. Unveiling the link between systemic inflammation markers and cognitive performance among older adults in the US: A population-based study using NHANES 2011-2014 data. J Clin Neurosci. 2024;119:45–51. pmid:37979310
  21. Nolasco-Rosales GA, Alonso-García CY, Hernández-Martínez DG, Villar-Soto M, Martínez-Magaña JJ, Genis-Mendoza AD, et al. Aftereffects in epigenetic age related to cognitive decline and inflammatory markers in healthcare personnel with post-COVID-19: A cross-sectional study. Int J Gen Med. 2023;16:4953–64. pmid:37928957
  22. van der Willik KD, Koppelmans V, Hauptmann M, Compter A, Ikram MA, Schagen SB. Inflammation markers and cognitive performance in breast cancer survivors 20 years after completion of chemotherapy: a cohort study. Breast Cancer Res. 2018;20(1):135. pmid:30442190
  23. 23. Hong S, Han K, Park C-Y. The insulin resistance by triglyceride glucose index and risk for dementia: population-based study. Alzheimers Res Ther. 2021;13(1):9. pmid:33402193
  24. 24. Minh HV, Tien HA, Sinh CT, Thang DC, Chen C-H, Tay JC, et al. Assessment of preferred methods to measure insulin resistance in Asian patients with hypertension. J Clin Hypertens (Greenwich). 2021;23(3):529–37. pmid:33415834
  25. 25. Wei B, Dong Q, Ma J, Zhang A. The association between triglyceride-glucose index and cognitive function in nondiabetic elderly: NHANES 2011-2014. Lipids Health Dis. 2023;22(1):188. pmid:37932783
  26. 26. Nayak SS, Kuriyakose D, Polisetty LD, Patil AA, Ameen D, Bonu R, et al. Diagnostic and prognostic value of triglyceride glucose index: a comprehensive evaluation of meta-analysis. Cardiovasc Diabetol. 2024;23(1):310. pmid:39180024
  27. 27. Zhang Z, Sheng Z, Liu J, Zhang D, Wang H, Wang L, et al. The association of the triglyceride-glucose index with Alzheimer’s disease and its potential mechanisms. J Alzheimers Dis. 2024;102(1):77–88. pmid:39497312
  28. 28. Bai W, An S, Jia H, Xu J, Qin L. Relationship between triglyceride-glucose index and cognitive function among community-dwelling older adults: a population-based cohort study. Front Endocrinol (Lausanne). 2024;15:1398235. pmid:39104819
  29. 29. Ding C, Lu R, Kong Z, Huang R. Exploring the triglyceride-glucose index’s role in depression and cognitive dysfunction: Evidence from NHANES with machine learning support. J Affect Disord. 2025;374:282–9. pmid:39805501
  30. 30. Pang K, Liu C, Tong J, Ouyang W, Hu S, Tang Y. Higher total cholesterol concentration may be associated with better cognitive performance among elderly females. Nutrients. 2022;14(19):4198. pmid:36235850
  31. 31. Lee S, Min J-Y, Kim B, Ha S-W, Han JH, Min K-B. Serum sodium in relation to various domains of cognitive function in the elderly US population. BMC Geriatr. 2021;21(1):328. pmid:34030649
  32. 32. Botelho J, Leira Y, Viana J, Machado V, Lyra P, Aldrey JM, et al. The role of inflammatory diet and vitamin D on the link between periodontitis and cognitive function: a mediation analysis in older adults. Nutrients. 2021;13(3):924. pmid:33809193
  33. 33. Scherr M, Kunz A, Doll A, Mutzenbach JS, Broussalis E, Bergmann HJ, et al. Ignoring floor and ceiling effects may underestimate the effect of carotid artery stenting on cognitive performance. J Neurointerv Surg. 2016;8(7):747–51. pmid:26063796
  34. 34. Lim CR, Harris K, Dawson J, Beard DJ, Fitzpatrick R, Price AJ. Floor and ceiling effects in the OHS: an analysis of the NHS PROMs data set. BMJ Open. 2015;5(7):e007765. pmid:26216152
  35. 35. Wang X, Wen Q, Li Y, Zhu H, Zhang F, Li S, et al. Systemic inflammation markers (SII and SIRI) as predictors of cognitive performance: evidence from NHANES 2011-2014. Front Neurol. 2025;16:1527302. pmid:40417122
  36. 36. Liao X, Li Y, Zhang Z, Xiao Y, Yu X, Huang R, et al. Associations of the body roundness index with cognitive function in US older adults and the mediating role of depression: a cross-sectional study from the NHANES 2011-2014. Sci Rep. 2025;15(1):16884. pmid:40374704
  37. 37. Nie W, Hu J. The relationship between grip strength and cognitive impairment: evidence from NHANES 2011-2014. Brain Behav. 2025;15(3):e70381. pmid:40103203
  38. 38. Qian Y, Liu Q, Li T. Association between composite dietary antioxidant index and cognitive function impairment in the elderly: evidence from NHANES 2011-2014. Front Neurol. 2025;16:1529989. pmid:40352776
  39. 39. Wu J, Shi M, Wang C. Association between tinnitus and cognitive impairment: analysis of National Health and Nutrition Examination Survey 2011-2014. Front Neurol. 2025;16:1533821. pmid:40520604
  40. 40. Qato DM, Ozenberger K, Olfson M. Prevalence of prescription medications with depression as a potential adverse effect among adults in the United States. JAMA. 2018;319(22):2289–98. pmid:29896627
  41. 41. Zhou Y, Han W, Yao X, Xue J, Li Z, Li Y. Developing a machine learning model for detecting depression, anxiety, and apathy in older adults with mild cognitive impairment using speech and facial expressions: A cross-sectional observational study. Int J Nurs Stud. 2023;146:104562. pmid:37531702
  42. 42. Vermeulen RJ, Andersson V, Banken J, Hannink G, Govers TM, Rovers MM, et al. Limited generalizability and high risk of bias in multivariable models predicting conversion risk from mild cognitive impairment to dementia: A systematic review. Alzheimers Dement. 2025;21(4):e70069. pmid:40189799
  43. 43. Wang X, Zhou S, Ye N, Li Y, Zhou P, Chen G, et al. Predictive models of Alzheimer’s disease dementia risk in older adults with mild cognitive impairment: a systematic review and critical appraisal. BMC Geriatr. 2024;24(1):531. pmid:38898411
  44. 44. Grueso S, Viejo-Sobera R. Machine learning methods for predicting progression from mild cognitive impairment to Alzheimer’s disease dementia: a systematic review. Alzheimers Res Ther. 2021;13(1):162. pmid:34583745
  45. 45. Nagaraj S, Duong TQ. Deep learning and risk score classification of mild cognitive impairment and Alzheimer’s disease. J Alzheimers Dis. 2021;80(3):1079–90. pmid:33646166
  46. 46. Chen Y, Qian X, Zhang Y, Su W, Huang Y, Wang X, et al. Prediction models for conversion from mild cognitive impairment to Alzheimer’s disease: A systematic review and meta-analysis. Front Aging Neurosci. 2022;14:840386. pmid:35493941
  47. 47. Oh SS, Kang B, Hong D, Kim JI, Jeong H, Song J, et al. A multivariable prediction model for mild cognitive impairment and dementia: algorithm development and validation. JMIR Med Inform. 2024;12:e59396. pmid:39576972
  48. 48. Velazquez M, Lee Y, Alzheimer’s Disease Neuroimaging Initiative. Random forest model for feature-based Alzheimer’s disease conversion prediction from early mild cognitive impairment subjects. PLoS One. 2021;16(4):e0244773. pmid:33914757
  49. 49. Couronné R, Probst P, Boulesteix A-L. Random forest versus logistic regression: a large-scale benchmark experiment. BMC Bioinformatics. 2018;19(1):270. pmid:30016950
  50. 50. Li J, Tian Y, Zhu Y, Zhou T, Li J, Ding K, et al. A multicenter random forest model for effective prognosis prediction in collaborative clinical research network. Artif Intell Med. 2020;103:101814. pmid:32143809
  51. 51. Song Y, Yuan Q, Liu H, Gu K, Liu Y. Machine learning algorithms to predict mild cognitive impairment in older adults in China: A cross-sectional study. J Affect Disord. 2025;368:117–26.
  52. 52. Deo RC. Machine learning in medicine. Circulation. 2015;132(20):1920–30. pmid:26572668
  53. 53. Liu Y, Zhou S, Wei H, An S. A comparative study of forest methods for time-to-event data: variable selection and predictive performance. BMC Med Res Methodol. 2021;21(1):193. pmid:34563138
  54. 54. Hu M, Shu X, Yu G, Wu X, Välimäki M, Feng H. A risk prediction model based on machine learning for cognitive impairment among Chinese community-dwelling elderly people with normal cognition: development and validation study. J Med Internet Res. 2021;23(2):e20298. pmid:33625369
  55. 55. Livingston G, Sommerlad A, Orgeta V, Costafreda SG, Huntley J, Ames D, et al. Dementia prevention, intervention, and care. Lancet. 2017;390(10113):2673–734. pmid:28735855
  56. 56. Na K-S. Prediction of future cognitive impairment among the community elderly: A machine-learning based approach. Sci Rep. 2019;9(1):3335. pmid:30833698
  57. 57. Sattler C, Toro P, Schönknecht P, Schröder J. Cognitive activity, education and socioeconomic status as preventive factors for mild cognitive impairment and Alzheimer’s disease. Psychiatry Res. 2012;196(1):90–5. pmid:22390831
  58. 58. Aschwanden D, Aichele S, Ghisletta P, Terracciano A, Kliegel M, Sutin AR, et al. Predicting cognitive impairment and dementia: a machine learning approach. J Alzheimers Dis. 2020;75(3):717–28. pmid:32333585
  59. 59. Casagrande SS, Lee C, Stoeckel LE, Menke A, Cowie CC. Cognitive function among older adults with diabetes and prediabetes, NHANES 2011-2014. Diabetes Res Clin Pract. 2021;178:108939. pmid:34229005
  60. 60. Casemiro FG, Quirino DM, Diniz MAA, Rodrigues RAP, Pavarini SCI, Gratão ACM. Effects of health education in the elderly with mild cognitive impairment. Rev Bras Enferm. 2018;2:801–10. pmid:29791634
  61. 61. Valenzuela MJ, Sachdev P, Wen W, Chen X, Brodaty H. Lifespan mental activity predicts diminished rate of hippocampal atrophy. PLoS One. 2008;3(7):e2598. pmid:18612379
  62. 62. Landau SM, Marks SM, Mormino EC, Rabinovici GD, Oh H, O’Neil JP, et al. Association of lifetime cognitive engagement and low β-amyloid deposition. Arch Neurol. 2012;69(5):623–9. pmid:22271235
  63. 63. Lachman ME, Agrigoroaei S, Murphy C, Tun PA. Frequent cognitive activity compensates for education differences in episodic memory. Am J Geriatr Psychiatry. 2010;18(1):4–10. pmid:20094014
  64. 64. Stern Y. Cognitive reserve in ageing and Alzheimer’s disease. Lancet Neurol. 2012;11(11):1006–12. pmid:23079557
  65. 65. Le Carret N, Lafont S, Mayo W, Fabrigoule C. The effect of education on cognitive performances and its implication for the constitution of the cognitive reserve. Dev Neuropsychol. 2003;23(3):317–37. pmid:12740188
  66. 66. Stern Y, Arenaza-Urquijo EM, Bartrés-Faz D, Belleville S, Cantilon M, Chetelat G, et al. Whitepaper: Defining and investigating cognitive reserve, brain reserve, and brain maintenance. Alzheimers Dement. 2020;16(9):1305–11. pmid:30222945
  67. 67. Różyk-Myrta A. Guidelines for prevention and treatment of cognitive impairment in the elderly. Med Sci Monit. 2015;21:585–97.
  68. 68. Sadowsky CH, Galvin JE. Guidelines for the management of cognitive and behavioral problems in dementia. J Am Board Fam Med. 2012;25(3):350–66. pmid:22570399
  69. 69. Ngo J, Holroyd-Leduc JM. Systematic review of recent dementia practice guidelines. Age Ageing. 2015;44(1):25–33. pmid:25341676
  70. 70. Lee H, Wang Z, Tillekeratne A, Lukish N, Puliyadi V, Zeger S, et al. Loss of functional heterogeneity along the CA3 transverse axis in aging. Curr Biol. 2022;32(12):2681-2693.e4. pmid:35597233
  71. 71. Anacker C, Hen R. Adult hippocampal neurogenesis and cognitive flexibility - linking memory and mood. Nat Rev Neurosci. 2017;18(6):335–46. pmid:28469276
  72. 72. Johnson AC. Hippocampal vascular supply and its role in vascular cognitive impairment. Stroke. 2023;54(3):673–85. pmid:36848422
  73. 73. Lisman J, Buzsáki G, Eichenbaum H, Nadel L, Ranganath C, Redish AD. Viewpoints: how the hippocampus contributes to memory, navigation and cognition. Nat Neurosci. 2017;20(11):1434–47. pmid:29073641
  74. 74. Black S, Kraemer K, Shah A, Simpson G, Scogin F, Smith A. Diabetes, depression, and cognition: a recursive cycle of cognitive dysfunction and glycemic dysregulation. Curr Diab Rep. 2018;18(11):118. pmid:30267224
  75. 75. Gispen WH, Biessels GJ. Cognition and synaptic plasticity in diabetes mellitus. Trends Neurosci. 2000;23(11):542–9. pmid:11074263
  76. 76. Artola A. Diabetes-, stress- and ageing-related changes in synaptic plasticity in hippocampus and neocortex--the same metaplastic process? Eur J Pharmacol. 2008;585(1):153–62. pmid:18395200
  77. 77. Liu X, Hao J, Yao E, Cao J, Zheng X, Yao D, et al. Polyunsaturated fatty acid supplement alleviates depression-incident cognitive dysfunction by protecting the cerebrovascular and glymphatic systems. Brain Behav Immun. 2020;89:357–70. pmid:32717402
  78. 78. McEwen BS, Nasca C, Gray JD. Stress effects on neuronal structure: hippocampus, amygdala, and prefrontal cortex. Neuropsychopharmacology. 2016;41(1):3–23. pmid:26076834
  79. 79. Jin K, Lu J, Yu Z, Shen Z, Li H, Mou T, et al. Linking peripheral IL-6, IL-1β and hypocretin-1 with cognitive impairment from major depression. J Affect Disord. 2020;277:204–11. pmid:32829196
  80. 80. Reppermund S, Zihl J, Lucae S, Horstmann S, Kloiber S, Holsboer F, et al. Persistent cognitive impairment in depression: the role of psychopathology and altered hypothalamic-pituitary-adrenocortical (HPA) system regulation. Biol Psychiatry. 2007;62(5):400–6. pmid:17188252
  81. 81. Garcia MJ, Leadley R, Ross J, Bozeat S, Redhead G, Hansson O, et al. Prognostic and predictive factors in early Alzheimer’s disease: a systematic review. J Alzheimers Dis Rep. 2024;8(1):203–40. pmid:38405341
  82. 82. Rubin R. Exploring the relationship between depression and dementia. JAMA. 2018;320(10):961.
  83. 83. Mohan D, Iype T, Varghese S, Usha A, Mohan M. A cross-sectional study to assess prevalence and factors associated with mild cognitive impairment among older adults in an urban area of Kerala, South India. BMJ Open. 2019;9(3):e025473. pmid:30898818
  84. 84. Song D, Yu DSF, Li PWC, Sun Q. Identifying the factors related to depressive symptoms amongst community-dwelling older adults with mild cognitive impairment. Int J Environ Res Public Health. 2019;16(18):3449. pmid:31533269
  85. 85. Zhang W, Wu J, Yu L, Zhang C, Zhang H, Guo P, et al. Association between spinal cord injury and cognitive impairment, and the mediating role of inflammation. J Craniofac Surg. 2025;36(6):e799–803.
  86. 86. Geng C, Chen C. Association between elevated systemic inflammatory markers and the risk of cognitive decline progression: a longitudinal study. Neurol Sci. 2024;45(11):5253–9. pmid:38890170
  87. 87. Chen K, Wang L, Ning H, Pan H, Zhang W. Neutrophil-to-lymphocyte ratio; platelet-to-lymphocyte ratio; systemic immune-inflammatory index: inflammatory indicators of cognitive impairment in schizophrenia patients. Front Psychiatry. 2025;16:1552451. pmid:40291517
  88. 88. Liu Q, Li Z, Huang L, Zhou D, Fu J, Duan H, et al. Telomere and mitochondria mediated the association between dietary inflammatory index and mild cognitive impairment: A prospective cohort study. Immun Ageing. 2023;20(1):1. pmid:36604719
  89. 89. Liu Q, Zhou D, Duan H, Zhu Y, Du Y, Sun C, et al. Association of dietary inflammatory index and leukocyte telomere length with mild cognitive impairment in Chinese older adults. Nutr Neurosci. 2023;26(1):50–9. pmid:34957928
  90. 90. Ding T, Aimaiti M, Cui S, Shen J, Lu M, Wang L, et al. Meta-analysis of the association between dietary inflammatory index and cognitive health. Front Nutr. 2023;10:1104255. pmid:37081917
  91. 91. Mohammadi A, Mohammadi M, Almasi-Dooghaee M, Mirmosayyeb O. Neutrophil to lymphocyte ratio in Alzheimer’s disease: A systematic review and meta-analysis. PLoS One. 2024;19(6):e0305322. pmid:38917167
  92. 92. Algul FE, Kaplan Y. Increased systemic immune-inflammation index as a novel indicator of Alzheimer’s disease severity. J Geriatr Psychiatry Neurol. 2025;38(3):214–22. pmid:39271460