Abstract
Background
Machine learning (ML) shows promise in using clinical data to predict chronic diseases. However, its application to postmenopausal osteoporosis (PMOP) risk assessment using readily available clinical and biochemical parameters is underexplored.
Objective
This study aimed to develop and validate an interpretable ML-based model for assessing PMOP using clinical features and laboratory biomarkers, and to identify factors associated with PMOP using SHapley Additive exPlanations (SHAP).
Methods
A retrospective cross-sectional study included 1,717 postmenopausal women from two hospitals in Northwest China. PMOP was diagnosed with dual-energy X-ray absorptiometry (DXA T-score ≤−2.5). Data collected included demographics, clinical details, and various laboratory parameters, such as bone metabolism markers, 25-hydroxyvitamin D [25-(OH)D], electrolytes, and routine blood counts. Ten ML algorithms were employed for feature selection and model construction on a dataset split into training (n = 1201) and testing (n = 516) sets. Performance was evaluated using the Area Under the receiver operating characteristic curve (AUC), accuracy, sensitivity, specificity, and calibration.
Results
The Extra Trees (ET) model achieved the best test-set performance, with an AUC of 0.717 (95% CI: 0.682–0.752). SHAP analysis revealed that age was the most significant associated factor (SHAP value: 0.0648), followed by body mass index (BMI) (0.0243) and chloride ion levels (0.0209). Other top predictors included the use of antihypertensive drugs and years since menopause.
Citation: Guo Y, Jiang S, Zhu W, Tan L, Liu C, Jia Y, et al. (2026) Identification of key predictors of postmenopausal osteoporosis from routine clinical indicators using explainable machine learning. PLoS One 21(6): e0351334. https://doi.org/10.1371/journal.pone.0351334
Editor: Gaetano Paride Arcidiacono, University of Padova: Universita degli Studi di Padova, ITALY
Received: January 19, 2026; Accepted: May 26, 2026; Published: June 24, 2026
Copyright: © 2026 Guo et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: Due to data protection policies of the School of Medicine, Shaanxi Institute of International Trade & Commerce and its affiliated cooperating hospitals, the data supporting this study are not publicly available. The datasets used and/or analyzed during the current study are available upon reasonable request to the Ethics Committee of the School of Medicine, Shaanxi Institute of International Commerce & Trade. Please contact the Ethics Committee via its email address: 174457802@qq.com. Additional queries may be directed to the Corresponding Author or first author.
Funding: This study is supported by Research University Grant provided by Universiti Kebangsaan Malaysia (GUP-2024-026 to K-YC).
Competing interests: All authors have approved the submitted version and declare no competing interests. This does not alter our adherence to PLOS ONE policies on sharing data and materials.
Introduction
Osteoporosis is a systemic metabolic disorder characterized by an imbalance in bone homeostasis, reduced bone mass, and deterioration of bone microarchitecture, leading to increased bone fragility and susceptibility to fractures [1,2]. It exhibits a high global prevalence, ranking as the third most common chronic disease following cardiovascular diseases and diabetes [3]. Postmenopausal osteoporosis (PMOP), the most common type of primary osteoporosis, is triggered by the decline in ovarian function after menopause [4]. In China, the prevalence of osteoporosis is substantially higher in women than in men, particularly among individuals aged ≥50 years (19.2%), and reaches 51.6% in women over 65 years [5–7]. Most patients show no obvious symptoms in the early stages, but as the disease progresses, clinical manifestations, such as chronic low back pain, fatigue, and loss of height, may gradually appear [5].
With increasing clinical and research focus on osteoporosis-related diseases, a growing number of bone turnover markers relevant to the auxiliary diagnosis and treatment of osteoporosis are being explored. Bone turnover markers play a crucial role in early screening and therapeutic monitoring of osteoporosis, offering high sensitivity and specificity [8,9]. Beyond traditional bone turnover markers, electrolyte levels are recognized as significant factors influencing bone metabolism [10]. Electrolytes, such as calcium, phosphorus, magnesium, sodium, and potassium, play crucial roles in maintaining a balance of bone mineralization, bone remodeling, and cellular function [11]. Their dysregulation is closely associated with the onset and progression of osteoporosis [12].
Among these, serum calcium, as the primary mineral component of bone, directly influences the dynamic balance between bone formation and resorption by altering its concentration, serving as a key indicator of bone metabolic status [13]. For instance, calcium ions directly influence osteoblast and osteoclast activity via the calcium-sensing receptor signaling pathway [14,15]. Serum phosphorus works synergistically with calcium in the process of bone salt deposition. An imbalance in their ratio can lead to impaired bone mineralization. Serum magnesium indirectly affects bone density and strength by regulating parathyroid hormone (PTH) and vitamin D metabolism [16]. Chloride ions (Cl⁻) participate in osteoclast-mediated bone resorption through Anoctamin-1 (ANO1) Cl⁻ channels [17]. Furthermore, sodium and potassium ions also play regulatory roles in bone metabolism. A high-sodium diet can promote urinary calcium excretion, leading to bone calcium loss [18]. In contrast, adequate potassium intake helps buffer the acid load and reduce calcium loss, thereby exerting a protective effect on bone [19]. Therefore, electrolyte levels not only reflect the body’s metabolic homeostasis but may also serve as important indicators of osteoporosis risk and guide clinical interventions.
Vitamin D and PTH are critical regulators of calcium-phosphorus metabolism and bone mineralization. The most reliable biomarker of vitamin D status is serum 25-hydroxyvitamin D [25(OH)D], which maintains calcium and phosphorus balance by promoting their intestinal absorption, renal reabsorption, and bone deposition, thereby ensuring bone health and mechanical strength [20]. Insufficient vitamin D is closely associated with reduced bone mass and increased adiposity [21]. PTH, secreted by the parathyroid glands, regulates bone remodeling through its dual anabolic and catabolic effects depending on the secretion pattern and concentration. Clinically, monitoring vitamin D and PTH levels provides important guidance for diagnosing metabolic bone disorders and optimizing the management of osteoporosis [20,21].
With the rapid development of artificial intelligence (AI) and its applications in clinical research, technologies such as machine learning (ML) and deep learning (DL) can extract clinical information from large datasets to aid clinical decision-making [22]. Many ML techniques have been used to develop chronic disease prediction models, most of which show good predictive performance [23,24]. Although several ML models have been proposed for osteoporosis risk prediction, most lack interpretability, rely on specialized biomarkers, or have not been validated in community-dwelling postmenopausal populations. Therefore, this study aims to (1) develop an explainable ML model that uses only routine clinical indicators to assess osteoporosis risk in postmenopausal women, (2) identify key predictors using SHAP analysis and evaluate their value in identifying osteoporosis, and (3) narrow the gap between model performance and clinical interpretability. This study is expected to provide a reference for the adoption and integration of ML technology in bone health management.
Materials and methods
Study design and subjects
This retrospective study included postmenopausal patients hospitalized in the orthopedic wards of two tertiary hospitals in northwest China between 01/01/2024 and 30/06/2025. Clinical data were extracted from electronic medical records. The study protocol was reviewed and approved by the Ethics Committee of the School of Medicine, Shaanxi Institute of International Commerce & Trade (Approval No.: YYXY-HLX-2024-12-25) on December 25, 2024. The researchers accessed the fully anonymized data for analysis between 01/07/2025 and 31/08/2025. Thus, all data collection (i.e., data extraction) was performed after ethical approval, and patient consent was waived due to the retrospective and anonymized nature of the study.
Subjects were screened based on the following inclusion and exclusion criteria. Participants were eligible for inclusion if they were menopausal women aged 50 years or older, had complete body mass index (BMI), imaging, and laboratory data, had not received prior osteoporosis treatment, and had a diagnosis of primary osteoporosis. Individuals were excluded if they had other metabolic bone diseases (e.g., osteomalacia, Paget disease, or rickets), medical conditions associated with secondary osteoporosis (including hyperthyroidism, Cushing syndrome, hyperprolactinemia, hematological disorders, connective tissue diseases such as rheumatoid arthritis and systemic lupus erythematosus, bone tumors, or chronic kidney failure), were taking medications known to affect bone metabolism (e.g., hormone replacement or ablation therapy, glucocorticoids, thyroid supplements, anticonvulsants, warfarin, thiazide diuretics, or chemotherapy agents), had metal implants that could interfere with dual-energy X-ray absorptiometry (DXA) scans, or had incomplete clinical data.
Diagnosis of osteoporosis
The diagnosis of osteoporosis was based on the lumbar spine (L1-L4) BMD, which was determined by DXA, as defined by the World Health Organization (WHO) in 1994. Based on BMD values, T-scores for subjects were calculated and categorized as in Table 1.
Data collection
Cases that met the eligibility criteria between 01/01/2024 and 30/06/2025 were screened and included. All patient data were fully anonymized prior to researcher access, and the study protocol was reviewed and approved by the Institutional Review Board of the participating institutions. The ethics committee waived the requirement for informed consent due to the retrospective and fully anonymized nature of the research. The data retrieved from the electronic medical record management systems of the two hospitals included each patient’s age, height, weight, BMI, and past medical history (including medications used and surgical history). Laboratory data were also retrieved: electrolytes (Hitachi 7600, Hitachi High-Tech, Tokyo, Japan), bone metabolism markers (MAGLUMI-X6, Shenzhen New Industry Biomedical Engineering Co., Ltd., Shenzhen, China), 25(OH)D (Abbott A3600, Abbott Diagnostics, Illinois, USA), and routine blood counts (Murray BC-5309, Shenzhen Mindray Bio-Medical Electronics Co., Ltd, Shenzhen, China). Both hospitals used MEDIX-DR (Medilink/DMS Imaging, le Montougeux, France) to assess BMD at L1-L4. The patients’ BMD and T-scores were retrieved from the electronic medical record. If the record was unavailable, the hospital’s imaging department was contacted to identify the records through the patient’s name or medical record number.
Feature selection
In this study, a systematic and comprehensive feature screening was conducted to identify key factors associated with osteoporosis in postmenopausal women. Specifically, we combined 10 distinct machine learning algorithms to ensure robustness and comprehensive feature selection. These algorithms included: Logistic Regression (LR, class weight = balanced), SVM-RBF (SVM), Decision Tree (DT), Random Forest (RF), Extra Trees (ET), Gradient Boosting (GB), AdaBoost (ADA), XGBoost (XGB), K-Nearest Neighbors (KNN), and Naive Bayes (NB). During the feature selection process, each algorithm employed the Recursive Feature Elimination (RFE) method, which iteratively removes less informative variables based on assessment performance to optimize model performance. The study sample was randomly split into a training set (n = 1201) and a test set (n = 516) at a ratio of 7:3 [25]. During the training phase, we further applied 10-fold cross-validation to evaluate model performance and stability [26]. Although all 30 features were retained based on RFE performance, multicollinearity and overfitting remain potential concerns. To mitigate these issues, we applied 10-fold cross-validation during training and used SHAP analysis post hoc to evaluate feature contributions, thereby identifying redundant variables for future simplification.
Model verification
In this study, the best-performing model was evaluated using internal validation on a hold-out test set. The Youden index was employed to determine the optimal threshold, as it maximizes the combined benefit of sensitivity and specificity. During validation, a cumulative lift chart was used to assess the model’s performance and compare it with random selection. A confusion matrix was constructed to visually illustrate the discrepancies between assessment outcomes and actual observations. Model performance was evaluated using metrics including accuracy, sensitivity, specificity, and the area under the receiver operating characteristic curve (AUC-ROC). Furthermore, model calibration was examined by comparing assessment probabilities with actual observed outcomes.
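The threshold selection described above can be sketched as follows. This is a minimal illustration of the Youden index, J = sensitivity + specificity − 1, and not the authors' code; the function name `youden_threshold` and the toy labels and scores are assumptions made for the example.

```python
def youden_threshold(y_true, scores):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    A case is called positive when its score is >= threshold.
    Ties in J are resolved in favor of the lowest threshold.
    """
    positives = sum(y_true)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")
    best_threshold, best_j = None, -1.0
    # Every distinct score is a candidate cut-off.
    for threshold in sorted(set(scores)):
        tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
        tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < threshold)
        j = tp / positives + tn / negatives - 1.0
        if j > best_j:
            best_threshold, best_j = threshold, j
    return best_threshold, best_j


# Hypothetical predicted probabilities for eight patients (1 = osteoporosis).
y_true = [0, 0, 0, 1, 0, 1, 1, 1]
scores = [0.10, 0.20, 0.35, 0.40, 0.55, 0.60, 0.80, 0.90]
threshold, j = youden_threshold(y_true, scores)
print(threshold, j)
```

In this toy data the cut-off 0.40 gives sensitivity 1.00 and specificity 0.75, so J = 0.75. Because J weights sensitivity and specificity equally, a different cut-off may be preferable when missed cases and false alarms carry unequal clinical costs.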
Characteristic importance assessment
SHAP values were used to measure the importance and contribution of each input variable to the model output [27]. As an interpretation method based on game theory, SHAP extends the classical Shapley value to interpret outcomes from machine learning models. By reasonably allocating the “contribution share” of features to assessments, it achieves optimal attribution for model outputs. This method enables researchers to gain deeper insights into the relative influence of various factors on osteoporosis risk assessment, thereby enhancing the model’s interpretability and clinical application value.
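To make the "contribution share" idea concrete, the sketch below computes exact Shapley values for a three-feature toy value function by averaging each feature's marginal contribution over all feature orderings. It is an illustration only: the value function `toy_value` and its numbers are hypothetical, and practical SHAP implementations for tree ensembles rely on faster specialized algorithms rather than this brute-force enumeration.

```python
from itertools import permutations
from math import factorial


def shapley_values(value_fn, features):
    """Exact Shapley values: mean marginal contribution over all orderings."""
    phi = {f: 0.0 for f in features}
    for order in permutations(features):
        coalition = frozenset()
        for f in order:
            with_f = coalition | {f}
            phi[f] += value_fn(with_f) - value_fn(coalition)
            coalition = with_f
    n_orders = factorial(len(features))
    return {f: total / n_orders for f, total in phi.items()}


def toy_value(coalition):
    """Hypothetical change in predicted risk when only `coalition` is known."""
    v = 0.0
    if "age" in coalition:
        v += 0.20
    if "bmi" in coalition:
        v += 0.10
    if "chloride" in coalition:
        v += 0.04
    if "age" in coalition and "bmi" in coalition:
        v += 0.06  # interaction term, shared equally between age and BMI
    return v


phi = shapley_values(toy_value, ["age", "bmi", "chloride"])
print(phi)
```

The attributions come out as 0.23 for age, 0.13 for BMI, and 0.04 for chloride, and they sum to the full-model value of 0.40 (the efficiency property). Feature rankings such as those reported in this study are typically obtained by averaging the absolute per-patient attributions across the dataset.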
Statistical analysis
Statistical analysis was performed using SPSS 26.0 (IBM, Armonk, NY, USA) and Python 3.10.1 (Python Software Foundation). Categorical variables were expressed as frequencies or proportions and compared using the chi-square test or Fisher’s exact test. The Kolmogorov-Smirnov-Lilliefors (K-S-L) test was used to test the normality of continuous data. Non-normally distributed variables were compared using the Wilcoxon rank sum test and are presented as median, first quartile (Q1), and third quartile (Q3). A P value < 0.05 was considered statistically significant.
Results
Baseline data evaluation
This study enrolled postmenopausal female patients from the orthopedic wards of two tertiary Grade A hospitals in Northwest China. A total of 1717 cases were included based on the predetermined inclusion and exclusion criteria. Among them, 819 individuals (47.70%) were diagnosed with osteoporosis, while 898 (52.30%) were not. All clinical data were obtained from patients’ electronic medical records, totaling 30 clinical features. Detailed characteristics are presented in Table 2. The cohort was randomly divided into a training set (n = 1201) and a test set (n = 516) at a 7:3 ratio, using stratified sampling. There were no statistically significant differences in the distributions of variables between the two subsets, as shown in Table 3.
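A stratified 7:3 split of the kind described above can be sketched as follows. The software and random seed used by the authors are not stated, so the function `stratified_split`, the per-class ceiling rounding, and the seed are all assumptions; only the class counts (819 and 898) come from the text.

```python
import math
import random


def stratified_split(labels, test_fraction=0.3, seed=0):
    """Split indices so each class contributes about test_fraction to the test set."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Assumption: round the per-class test size up.
        n_test = math.ceil(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Class counts reported for the cohort: 819 with osteoporosis, 898 without.
labels = [1] * 819 + [0] * 898
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

With this rounding convention the sketch yields 1201 training and 516 test cases, matching the reported sizes, and keeps the osteoporosis proportion close to 47.7% in both subsets. Other rounding conventions can shift the split by one case.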
Machine learning-based recursive prediction of OP
With reference to the Recursive Feature Elimination (RFE) results from ten different algorithms (Fig 1), the assessment performance was optimal when all 30 clinical predictor variables were included. Therefore, all features were retained for model construction in this study. To mitigate the potential risks of overfitting and redundancy, we applied 10-fold cross-validation during training and subsequently used SHAP analysis to evaluate individual feature contributions, which helped identify the most influential predictors without prematurely removing variables that might collectively contribute to model performance. The initial performance of the ten models is presented in Table 4. In the preliminary evaluation stage, the AUC was designated as the primary metric for model assessment. Based on AUC values, three algorithms with superior performance were selected for further analysis: ET (AUC = 0.717), RF (AUC = 0.707), and XGB (AUC = 0.693), as illustrated in Fig 2.
Logistic Regression (LR, class_weight = balanced), SVM-RBF (SVM), Decision Tree (DT), Random Forest (RF), Extra Trees (ET), Gradient Boosting (GB), AdaBoost (ADA), XGBoost (XGB), KNN (K-Nearest Neighbors), Naive Bayes (NB).
To further assess model robustness, AUCs with 95% confidence intervals (CIs) were calculated using 2000 bootstrap replicates. The ET model achieved an AUC of 0.717 (95% CI: 0.682–0.752), outperforming logistic regression (LR), which yielded an AUC of 0.672 (95% CI: 0.637–0.707). DeLong’s test demonstrated a statistically significant difference between ET and LR (P = 0.023), although the absolute improvement in discrimination was modest (ΔAUC = 0.045).
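The bootstrap confidence interval described above can be illustrated with a short percentile-bootstrap sketch. The data are synthetic and the helper names `auc` and `bootstrap_auc_ci` are hypothetical; the source specifies only that 2000 replicates were used, so the percentile method and the handling of single-class resamples are assumptions.

```python
import random


def auc(y_true, scores):
    """AUC as P(random positive outranks random negative); ties count 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(y_true, scores, n_boot=2000, alpha=0.05, seed=0):
    """Return (point estimate, lower, upper) using the percentile bootstrap."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        ys = [y_true[i] for i in idx]
        if 0 < sum(ys) < n:  # AUC is undefined unless both classes are present
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return auc(y_true, scores), lower, upper


# Synthetic scores for 80 hypothetical patients, half of them positive.
rng = random.Random(42)
y_true = [i % 2 for i in range(80)]
scores = [rng.gauss(0.6 if y else 0.4, 0.2) for y in y_true]
point, lower, upper = bootstrap_auc_ci(y_true, scores)
print(round(point, 3), round(lower, 3), round(upper, 3))
```

The interval width shrinks as the test set grows, which is why a 516-case test set supports an interval of roughly ±0.035 around the reported AUC. Comparing two models on the same cases, as with DeLong's test, additionally requires accounting for the correlation between their AUCs, which this sketch does not do.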
Model development and selection
Based on the preliminary evaluation, further in-depth refinement was conducted on the three top-performing algorithms: ET, RF, and XGB. The hyperparameters of these algorithms were optimized using a randomized search method with 10-fold cross-validation. The optimized prediction models were then combined into a stacked ensemble model, which leverages the robustness of the ET, RF, and XGB algorithms synergistically. The ensemble was constructed using a two-stage approach: the first stage involved independently training the ET, RF, and XGB models on the designated training dataset, and their resulting predictions were used as inputs to the second-stage model to generate an integrated assessment. To evaluate the reliability of the designed machine learning framework, its performance was benchmarked using 10-fold cross-validation on the training set. Because the stacked ensemble did not outperform the standalone ET model in this cross-validation analysis, the ET model was selected as the final model for subsequent SHAP analysis and interpretation. The AUC of 0.717 (95% CI: 0.682–0.752) indicates only modest discriminative ability, suggesting that further improvements in feature engineering or model architecture are needed. During internal validation, the ET model demonstrated better predictive performance than the standalone RF and XGB models, as shown in Figs 3 and 4. The calibration curves presented in Fig 5 provide insight into model calibration.
(a) The decision curve of the best model. (b) The DCA of the model: Net benefit at 0.10 = 0.420, at 0.20 = 0.353, at 0.30 = 0.281, max net benefit = 0.471 at threshold 0.01.
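For readers unfamiliar with decision curve analysis, the net benefit values in the caption above follow the standard definition: net benefit = TP/N − (FP/N) × pt/(1 − pt), where pt is the threshold probability. The sketch below applies that formula to a hypothetical "classify everyone as positive" strategy using the cohort's reported class counts; it does not reproduce the ET model's predictions, which are not available here.

```python
def net_benefit(y_true, probs, threshold):
    """Net benefit of calling a case positive when its probability >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, probs) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - fp / n * threshold / (1 - threshold)


# Cohort counts from the text: 819 with osteoporosis, 898 without.
y_true = [1] * 819 + [0] * 898
treat_all = [1.0] * len(y_true)  # hypothetical strategy: everyone is flagged

for pt in (0.01, 0.10, 0.20, 0.30):
    print(pt, round(net_benefit(y_true, treat_all, pt), 3))
```

At a threshold of 0.01 the treat-all strategy gives a net benefit of about 0.472, close to the reported maximum of 0.471. One possible reading is that at very low thresholds the model flags nearly every patient, so its added value has to be judged at clinically plausible thresholds rather than at the maximum.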
Model performance and feature importance
Although a stacked ensemble combining ET, RF, and XGB was constructed as described in the Methods, it did not yield improved performance over the standalone ET model in our cross-validation analysis. Therefore, the ET model was selected as the final model for subsequent SHAP analysis and interpretation. Although the ET model was selected as the best-performing algorithm, its AUC of 0.717 reflects only modest predictive discrimination, indicating that further improvements, such as incorporating additional predictive features or optimizing model architecture, are needed before clinical deployment. To better understand the impact of individual features on the ET algorithm-based OP risk assessment model, the SHAP values for each feature were calculated. Based on the ET model, the top 20 features were selected based on their importance ranking, as measured by mean absolute SHAP values (Fig 6a). The top five features, in descending order of importance based on mean absolute SHAP values, were age, BMI, chloride ion, use of antihypertensive medications, and years since menopause. Among bone turnover markers, bone-specific alkaline phosphatase (BALP) contributed the most, though only 0.0105, ranking 12th. The remaining bone turnover markers contributed relatively little. Fig 6b presents violin plots for each feature, illustrating the correlation between feature values and SHAP values. A larger absolute SHAP value for a feature indicates a greater influence of that feature on the ET-based prediction model. Red dots represent higher values of the feature, while blue dots represent lower values.
(a) Mean absolute SHAP values of the top 20 features. Age: 0.0648; BMI: 0.0243; Cl⁻: 0.0209; Antihypertensives: 0.0164; Menopause year: 0.0125; Na+: 0.0118; Hemoglobin: 0.0116; Albumin: 0.0114; Ca2+: 0.0113; PTH: 0.0110; Antilipidemics: 0.0107; BALP: 0.0105; P1NP: 0.0101; RBC: 0.0100; N-MID: 0.0099; A/G: 0.0092; Antidiabetics: 0.0090; NEUT%: 0.0090; Hematocrit: 0.0088; GLB: 0.0085. (b) Distribution of the impact of each feature on the output of the ET model estimated using SHAP values.
Discussion
In this study, we developed and validated an explainable machine learning model to assess osteoporosis risk in postmenopausal women using routinely available clinical indicators. The ET algorithm demonstrated the best assessment performance with an AUC of 0.717, outperforming traditional methods such as logistic regression (ΔAUC = 0.045, P = 0.023). SHAP analysis identified age, BMI, and serum chloride as the most influential predictors, highlighting the multifactorial nature of postmenopausal OP and suggesting that electrolyte balance may influence bone health beyond traditional markers. These findings underscore the potential to integrate routine laboratory tests into osteoporosis risk assessment, thereby facilitating early identification of high-risk individuals in clinical practice. These findings are consistent with several recent studies. Yen et al. demonstrated that integrating imaging and clinical data using deep learning significantly improved fracture prediction accuracy (AUC = 0.88) [28], while Zitu et al. further confirmed that ML models exhibit superior generalizability in complex clinical settings compared to conventional statistical methods [29]. More specifically, our ET model is comparable to other clinical-feature-based ML models for OP. For instance, a study by Sun et al. [30] using only age and BMI achieved an AUC of 0.69 in a Korean cohort. The moderate performance of our model relative to Sun et al. may reflect the absence of specialized bone biomarkers in our routine clinical dataset. Conversely, our model outperformed simple logistic regression, consistent with previous reports that ensemble methods offer modest gains over traditional approaches.
Although our model demonstrated promising performance, with an AUC of 0.717, it did not reach the ideal threshold of 0.8, indicating room for improvement before clinical deployment. This moderate performance may be attributed to several factors, including data characteristics, model complexity, and evaluation metrics. First, the dataset was nearly balanced (47.7% OP vs. 52.3% non-OP), so class imbalance was not a major concern in this study. AUC is known to be robust to class imbalance when the class distributions are not extremely skewed. Thus, our evaluation metrics remain valid. Nevertheless, other factors, such as multicollinearity among the 30 clinical variables, may have affected model stability. Second, from a modeling perspective, while the ET algorithm, as an ensemble method, can handle high-dimensional features, the inclusion of 30 clinical variables, such as electrolyte levels and medication history, introduced potential multicollinearity issues. For instance, the notably large standard deviation of β-CTX may have adversely affected model stability. Finally, limitations inherent to the AUC metric must be considered. AUC only reflects the model’s ranking ability and does not account for probability calibration. For example, the calibration curve of the ET model in this study indicated discrepancies between assessment probabilities and actual outcomes (Fig 4a), suggesting suboptimal model fit. Moreover, AUC does not provide information on error distribution, potentially masking performance deficiencies in specific subgroups (e.g., advanced age or low-BMI populations) [31,32]. Therefore, future studies should incorporate more granular metrics (such as GAUC or F1-score) and external validation to comprehensively evaluate model performance.
This study further confirms the critical role of age in the development and progression of OP in postmenopausal women. SHAP analysis identified age as the most important associated factor (SHAP value: 0.0648), showing a significant positive correlation with OP risk. This finding is highly consistent with recent research on the role of cellular senescence in bone metabolism. With advancing age, senescent cells in the bone microenvironment secrete senescence-associated secretory phenotype factors (e.g., interleukins, metalloproteinases), creating a pro-inflammatory environment that activates osteoclasts and suppresses osteoblasts [33]. Senescent osteocytes also upregulate the RANKL/OPG ratio, promoting osteoclast differentiation [34]. Age-related mitochondrial dysfunction increases ROS production, inducing osteoblast apoptosis and promoting osteoclastogenesis via NF-κB activation.
Notably, we observed a negative correlation between BMI and OP risk (SHAP value: 0.0243), supporting the “obesity paradox”. Higher BMI may protect bone through mechanical loading (activating Wnt/β-catenin), aromatase-mediated estrogen production, and adipokine regulation (leptin, adiponectin) [35,36]. However, this protective effect is not linear, as extreme obesity may become detrimental due to chronic inflammation. The bidirectional relationship between BMI and OP observed in this study suggests that maintaining a moderate body weight may be beneficial for skeletal health in postmenopausal women.
In the current model, the significance of electrolyte indicators was identified through SHAP analysis, though their underlying mechanisms remain unclear. Notably, Cl⁻ ranked third among the most important features, after age and BMI, suggesting a potential role in bone metabolism. This may be attributed to the influence of electrolyte balance on the bone microenvironment, specifically by modulating pH or ion channels, which, in turn, affects osteoblast and osteoclast activity. Recent studies have revealed that the ANO1 Cl⁻ channel plays a key regulatory role in osteoclast differentiation and bone resorption. ANO1 facilitates Cl⁻ efflux, enhancing H+ secretion and bone matrix dissolution. ANO1 also interacts with RANK, activating RANKL-RANK signaling and accelerating bone resorption [37]. Sodium ions influence calcium homeostasis via sodium-calcium exchange: high sodium intake increases urinary calcium excretion (20–60 mg Ca per 1000 mg Na), leading to negative calcium balance, and may inhibit intestinal calcium absorption while stimulating PTH secretion, thereby promoting bone loss [38]. However, these mechanisms were not fully captured in the multivariate model, likely due to data limitations and the model’s insufficient capacity to fit complex nonlinear relationships.
An important insight from our study is the relatively limited contribution of traditional bone turnover markers to the multivariate ML model. This finding does not negate the biological relevance of these markers but rather highlights that their predictive value may be overshadowed by more stable clinical features such as age and BMI when assessed in a multifactorial framework. From a clinical perspective, this suggests that while bone turnover markers remain valuable for monitoring treatment response, routine clinical parameters may be more practical for initial osteoporosis risk stratification in primary care settings. In our study, markers such as procollagen type I N-terminal propeptide (P1NP) and the N-terminal mid-fragment of osteocalcin (N-MID) exhibited relatively limited contributions to the prediction model. This aligns with the conclusion drawn by Yoo et al. in a Korean multicenter study, which demonstrated that the predictive value of individual BTMs is modest [39]. From a biological perspective, although traditional bone turnover markers reflect the rate of bone remodeling, their serum levels are subject to complex regulation by factors such as circadian rhythm and feeding status. Studies have shown that bone turnover markers exhibit significant circadian fluctuations, with variations ranging from 10% to 20% [40]. More importantly, postmenopausal osteoporosis involves a coupling imbalance between osteoblasts and osteoclasts within bone remodeling units, driven by immune cell-bone cell interactions (e.g., T-cell-secreted RANKL and Wnt signaling inhibitors) that are not captured by single biomarkers [41].
The 25(OH)D level did not show significant differences between OP and non-OP subjects, which may reflect its dual role in bone remodeling. While adequate vitamin D promotes bone mineralization, excessively high levels may stimulate bone resorption [42]. Our Northwest Chinese cohort is characterized by high latitude, limited sunlight exposure, and a wheat-based diet low in vitamin D-rich foods, potentially leading to vitamin D deficiency. This deficiency can trigger compensatory PTH elevation, accelerating bone turnover and obscuring a linear 25(OH)D-BMD association. A threshold effect exists, whereby the adverse effects of PTH on BMD become significant only when 25(OH)D falls below 20 ng/mL [43]. Additionally, VDR gene polymorphisms may modulate this relationship.
Given that our study cohort comprised exclusively postmenopausal women of Chinese (mainland) descent, the heterogeneity in ML-based osteoporosis prediction models across Asian populations warrants discussion. In Southern Taiwan, Huang et al. developed ML models using 2,638 health examination participants and found that an artificial neural network achieved better performance than OSTA, with age, gender, and body weight as top predictors [44]. In contrast, our ET model identified age, BMI, and chloride as the most important features, suggesting that electrolyte markers may carry greater predictive value in mainland populations, possibly due to dietary or regional differences. Korean studies have also reported XGBoost models for osteoporosis risk classification in women, achieving an accuracy of 0.705 and an F1-score of 0.738, with age at menopause as the strongest predictor [45]. Another Korean study incorporating genetic risk scores and up to 122 features achieved an AUC exceeding 0.85 [46]. These cross-Asian comparisons highlight that model performance and feature importance vary substantially across populations, underscoring the need for region-specific model development and external validation before generalizing our findings to other Asian cohorts.
Limitations
Although this study provides a machine learning perspective on risk assessment for OP in postmenopausal women, several limitations should be cautiously considered. First, because of the retrospective cross-sectional design, causal inferences cannot be made; the identified predictors should be interpreted as factors associated with PMOP rather than causal risk factors. Additionally, the data were sourced from only two tertiary hospitals in Northwest China, which may introduce selection bias and regional limitations, thereby limiting the model’s generalizability. Second, although multiple clinical and laboratory indicators were included, several potentially important predictors, such as detailed dietary patterns, alcohol intake, physical activity levels, sunlight exposure, genetic factors, and emerging biomarkers, were not incorporated, potentially limiting the predictive ability of the models. Furthermore, despite the use of cross-validation, the performance of the machine learning model (AUC = 0.717) remains moderate, indicating the need for further optimization in feature engineering and algorithm selection. Future studies should adopt prospective multi-center designs to enlarge sample size and improve geographical representativeness, and integrate multi-omics data (e.g., genomic, epigenomic, metabolomic) to develop more robust assessment models. In addition, external validation using independent cohorts from different regions and healthcare settings is strongly recommended to further evaluate the model’s stability, calibration, and clinical applicability, thereby enhancing its generalizability and translational value.
Conclusion
This study developed and validated a machine learning-based assessment model for OP in postmenopausal women. The results demonstrated that the Extra Trees model exhibited the best predictive performance. SHAP analysis identified age, BMI, and electrolyte indicators, particularly Cl⁻, as the most influential clinical features associated with PMOP. Notably, the relatively limited contribution of traditional bone turnover markers in our multivariable model suggests that the pathophysiology of osteoporosis involves complex systemic interactions that may not be fully captured by individual biomarkers. This study presents an interpretable ML framework for assessing osteoporosis in postmenopausal women and highlights the value of integrating multidimensional clinical data. Future research should focus on prospective multi-center validation and the integration of emerging biomarkers to further enhance model generalizability and clinical utility.
References
- 1. Yang R, Ma Q, Zhang X, Zhao Q, Zeng S, Yan H, et al. A study on the prevalence of osteoporosis in people with different altitudes in Sichuan, China. Clin Interv Aging. 2024;19:1819–28. pmid:39525876
- 2. Shen Y, Huang X, Wu J, Lin X, Zhou X, Zhu Z. The global burden of osteoporosis, low bone mass, and its related fracture in 204 countries and territories, 1990-2019. Front Endocrinol (Lausanne). 2022;13:882241.
- 3. Chandran M, Brind’Amour K, Fujiwara S, Ha Y-C, Tang H, Hwang J-S, et al. Prevalence of osteoporosis and incidence of related fractures in developed economies in the Asia Pacific region: a systematic review. Osteoporos Int. 2023;34(6):1037–53. pmid:36735053
- 4. Zhao H, Yu F, Wu W. New perspectives on postmenopausal osteoporosis: mechanisms and potential therapeutic strategies of sirtuins and oxidative stress. Antioxidants (Basel). 2025;14(5):605. pmid:40427485
- 5. Wang J, Shu B, Tang D-Z, Li C-G, Xie X-W, Jiang L-J, et al. The prevalence of osteoporosis in China, a community based cohort study of osteoporosis. Front Public Health. 2023;11:1084005. pmid:36875399
- 6. Liu Y, Huang X, Tang K, Wu J, Zhou J, Bai H, et al. Prevalence of osteoporosis and associated factors among Chinese adults: a systematic review and modelling study. J Glob Health. 2025;15:04009. pmid:39820179
- 7. Özmen S, Kurt S, Timur HT, Yavuz O, Kula H, Demir AY, et al. Prevalence and risk factors of osteoporosis: a cross-sectional study in a tertiary center. Medicina (Kaunas). 2024;60(12):2109. pmid:39768987
- 8. Brown JP, Don-Wauchope A, Douville P, Albert C, Vasikaran SD. Current use of bone turnover markers in the management of osteoporosis. Clin Biochem. 2022;109–110:1–10. pmid:36096182
- 9. Brouwers P, Bouquegneau A, Cavalier E. Insight into the potential of bone turnover biomarkers: integration in the management of osteoporosis and chronic kidney disease-associated osteoporosis. Curr Opin Endocrinol Diabetes Obes. 2024;31(4):149–56. pmid:38804196
- 10. Farag MA, Abib B, Qin Z, Ze X, Ali SE. Dietary macrominerals: updated review of their role and orchestration in human nutrition throughout the life cycle with sex differences. Curr Res Food Sci. 2023;6:100450. pmid:36816001
- 11. Sun M, Zhang W, Sun X, Yu K, He X, Zheng L, et al. Magnesium promoting OVX rats’ rotator cuff tear repair with relieving stem cell senescence effect. Exp Cell Res. 2025;449(2):114593. pmid:40348155
- 12. Skalny AV, Aschner M, Silina EV, Stupin VA, Zaitsev ON, Sotnikova TI, et al. The role of trace elements and minerals in osteoporosis: a review of epidemiological and laboratory findings. Biomolecules. 2023;13(6):1006. pmid:37371586
- 13. Lee J, Vasikaran S. Current recommendations for laboratory testing and use of bone turnover markers in management of osteoporosis. Ann Lab Med. 2012;32(2):105–12. pmid:22389876
- 14. Santa Maria C, Cheng Z, Li A, Wang J, Shoback D, Tu C-L, et al. Interplay between CaSR and PTH1R signaling in skeletal development and osteoanabolism. Semin Cell Dev Biol. 2016;49:11–23. pmid:26688334
- 15. Liu K, Xu T, Fan J, Li Y, Guo X, Zhang H, et al. Calcium-sensing receptors promoted Homer1 expression and osteogenic differentiation in bone marrow mesenchymal stem cells. Open Life Sci. 2025;20(1):20221059. pmid:40059879
- 16. Lin X-B, Ye H, He L-J, Xu Z-B. Analysis of changes in serum high t-PINP/β-CTX ratio and risk of re-fracture after vertebral osteoporotic fracture surgery. Eur Rev Med Pharmacol Sci. 2023;27(22):10860–7. pmid:38039015
- 17. Sun W, Li Y, Li J, Tan Y, Yuan X, Meng H. Mechanical stimulation controls osteoclast function through the regulation of Ca2+-activated Cl- channel Anoctamin 1. Commun Biol. 2023;6(1):407.
- 18. Imash D, Gusmanov A, Chan M-Y. High salt intake and bone health in postmenopausal women: exposing the lack of studies - a systematic review and meta-analysis. Front Endocrinol (Lausanne). 2025;16:1694539. pmid:41347127
- 19. Abate V, Vergatti A, Altavilla N, Garofano F, Salcuni AS, Rendina D, et al. Potassium intake and bone health: a narrative review. Nutrients. 2024;16(17):3016. pmid:39275337
- 20. Lee S, Chung HJ, Jung S, Jang HN, Chang SH, Kim HJ. 24,25-dihydroxy vitamin D and vitamin D metabolite ratio as biomarkers of vitamin D in chronic kidney disease. Nutrients. 2023;15(3):578.
- 21. Khodabakhshi A, Mahmoudabadi M, Vahid F. The role of serum 25 (OH) vitamin D level in the correlation between lipid profile, body mass index (BMI), and blood pressure. Clin Nutr ESPEN. 2022;48:421–6. pmid:35331523
- 22. Sadr H, Nazari M, Khodaverdian Z, Farzan R, Yousefzadeh-Chabok S, Ashoobi MT, et al. Unveiling the potential of artificial intelligence in revolutionizing disease diagnosis and prediction: a comprehensive review of machine learning and deep learning approaches. Eur J Med Res. 2025;30(1):418. pmid:40414894
- 23. Delpino FM, Costa ÂK, Farias SR, Chiavegatto Filho ADP, Arcêncio RA, Nunes BP. Machine learning for predicting chronic diseases: a systematic review. Public Health. 2022;205:14–25. pmid:35219838
- 24. Smith LA, Oakden-Rayner L, Bird A, Zeng M, To M-S, Mukherjee S, et al. Machine learning and deep learning predictive models for long-term prognosis in patients with chronic obstructive pulmonary disease: a systematic review and meta-analysis. Lancet Digit Health. 2023;5(12):e872–81. pmid:38000872
- 25. Zhang H, Wang Y, Xie Y, Wang C, Ma Y, Jin X. Prediction models based on machine learning algorithms for COVID-19 severity risk. BMC Public Health. 2025;25(1):1748. pmid:40361078
- 26. Wilimitis D, Walsh CG. Practical considerations and applied examples of cross-validation for model development and evaluation in health care: tutorial. JMIR AI. 2023;2:e49023. pmid:38875530
- 27. Ponce-Bobadilla AV, Schmitt V, Maier CS, Mensing S, Stodtmann S. Practical guide to SHAP analysis: explaining supervised machine learning model predictions in drug development. Clin Transl Sci. 2024;17(11):e70056. pmid:39463176
- 28. Yen T-Y, Ho C-S, Chen Y-P, Pei Y-C. Diagnostic accuracy of deep learning for the prediction of osteoporosis using plain X-rays: a systematic review and meta-analysis. Diagnostics (Basel). 2024;14(2):207. pmid:38248083
- 29. Zitu MM, Zhang S, Owen DH, Chiang C, Li L. Generalizability of machine learning methods in detecting adverse drug events from clinical narratives in electronic medical records. Front Pharmacol. 2023;14:1218679. pmid:37502211
- 30. Oh SM, Song BM, Nam BH, Rhee Y, Moon SH, Kim DY, et al. Development and validation of osteoporosis risk-assessment model for Korean men. Yonsei Med J. 2016;57(1):187–96. pmid:26632400
- 31. Richardson E, Trevizani R, Greenbaum JA, Carter H, Nielsen M, Peters B. The receiver operating characteristic curve accurately assesses imbalanced datasets. Patterns (N Y). 2024;5(6):100994. pmid:39005487
- 32. Wu Y, Chao J, Bao M, Zhang N. Predictive value of machine learning on fracture risk in osteoporosis: a systematic review and meta-analysis. BMJ Open. 2023;13(12):e071430. pmid:38070927
- 33. He X, Hu W, Zhang Y, Chen M, Ding Y, Yang H, et al. Cellular senescence in skeletal disease: mechanisms and treatment. Cell Mol Biol Lett. 2023;28(1):88. pmid:37891477
- 34. Li K, Hu S, Chen H. Cellular senescence and other age-related mechanisms in skeletal diseases. Bone Res. 2025;13(1):68. pmid:40623977
- 35. Liu Y, Liu Y, Huang Y, Le S, Jiang H, Ruan B, et al. The effect of overweight or obesity on osteoporosis: a systematic review and meta-analysis. Clin Nutr. 2023;42(12):2457–67. pmid:37925778
- 36. Yu X, Zheng Y, Liu Y, Han P, Chen X, Zhang N, et al. Association of osteoporosis with sarcopenia and its components among community-dwelling older Chinese adults with different obesity levels: a cross-sectional study. Medicine (Baltimore). 2024;103(24):e38396. pmid:38875436
- 37. Partridge NC, Lacruz RS. Ca2+-activated chloride channel ANO1: a new regulator of osteoclast function. Cell Calcium. 2022;106:102633. pmid:35908317
- 38. Tiyasatkulkovit W, Aksornthong S, Adulyaritthikul P, Upanan P, Wongdee K, Aeimlapa R, et al. Excessive salt consumption causes systemic calcium mishandling and worsens microarchitecture and strength of long bones in rats. Sci Rep. 2021;11(1):1850. pmid:33473159
- 39. Yoo J-I, Park SY, Kim D-Y, Ha J, Rhee Y, Hong N, et al. Effectiveness and usefulness of bone turnover marker in osteoporosis patients: a multicenter study in Korea. J Bone Metab. 2023;30(4):311–7. pmid:38073264
- 40. Amer OE, Wani K, Ansari MGA, Alnaami AM, Aljohani N, Abdi S, et al. Associations of bone mineral density with RANKL and osteoprotegerin in Arab postmenopausal women: a cross-sectional study. Medicina (Kaunas). 2022;58(8):976. pmid:35893092
- 41. Lei S, Zhang X, Song L, Wen J, Zhang Z, Tian J, et al. Expert consensus on vitamin D in osteoporosis. Ann Jt. 2025;10:1. pmid:39981430
- 42. Chen X, Shen L, Gao C, Weng R, Fan Y, Xu S, et al. Vitamin D status and its associations with bone mineral density, bone turnover markers, and parathyroid hormone in Chinese postmenopausal women with osteopenia and osteoporosis. Front Nutr. 2024;10:1307896. pmid:38268673
- 43. Bhattoa HP, Vasikaran S, Trifonidi I, Kapoula G, Lombardi G, Jørgensen NR, et al. Update on the role of bone turnover markers in the diagnosis and management of osteoporosis: a consensus paper from The European Society for Clinical and Economic Aspects of Osteoporosis, Osteoarthritis and Musculoskeletal Diseases (ESCEO), International Osteoporosis Foundation (IOF), and International Federation of Clinical Chemistry and Laboratory Medicine (IFCC). Osteoporos Int. 2025;36(4):579–608. pmid:40152990
- 44. Huang W-C, Chen I-S, Yu H-C, Chen C-S, Wu F-Z, Hsu C-L, et al. A simple and user-friendly machine learning model to detect osteoporosis in health examination populations in Southern Taiwan. Bone Rep. 2025;24:101826. pmid:39896106
- 45. Je M, Hwang S, Lee S, Kim Y. Development and evaluation of a machine learning model for osteoporosis risk prediction in Korean women. BMC Womens Health. 2025;25(1):146. pmid:40155887
- 46. Wu X, Park S. A prediction model for osteoporosis risk using a machine-learning approach and its validation in a large cohort. J Korean Med Sci. 2023;38(21):e162. pmid:37270917