Explainable mortality prediction models incorporating social health determinants and physical frailty for heart failure patients

  • Zhenyue Gao ,

    Contributed equally to this work with: Zhenyue Gao, Xiaoli Liu

    Roles Conceptualization, Data curation, Methodology

    Affiliations Medical Innovation Research Department, The General Hospital of PLA, Beijing, China, Beijing Engineering Research Center of Industrial Spectrum Imaging, School of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing, China

  • Xiaoli Liu ,

    Contributed equally to this work with: Zhenyue Gao, Xiaoli Liu

    Roles Conceptualization, Data curation, Formal analysis

    Affiliations Medical Innovation Research Department, The General Hospital of PLA, Beijing, China, School of Biological Science and Medical Engineering, Beihang University, Beijing, China

  • Yu Kang,

    Roles Data curation, Formal analysis

    Affiliation Department of Cardiology, West China Hospital, Sichuan University, Chengdu, Sichuan Province, China

  • Pan Hu,

    Roles Investigation, Resources, Supervision

    Affiliations Department of Anesthesiology, The 920 Hospital of Joint Logistic Support Force of Chinese PLA, Kunming Yunnan, China, Department of Critical Care Medicine, The First Medical Center, The General Hospital of PLA, Beijing, China

  • Xiu Zhang,

    Roles Data curation, Investigation

    Affiliation Rehabilitation Medicine Center, West China Hospital, Sichuan University, Chengdu, Sichuan Province, China

  • Mengwei Li,

    Roles Data curation, Formal analysis

    Affiliation Medical Innovation Research Department, The General Hospital of PLA, Beijing, China

  • Yumeng Peng,

    Roles Conceptualization, Investigation, Validation, Visualization

    Affiliation Medical Innovation Research Department, The General Hospital of PLA, Beijing, China

  • Wei Yan,

    Roles Methodology, Software, Supervision

    Affiliation Department of Hyperbaric Oxygen Therapy, the First Medical Center, Chinese PLA General Hospital, Beijing, China

  • Muyang Yan,

    Roles Conceptualization, Formal analysis

    Affiliation Department of Hyperbaric Oxygen Therapy, the First Medical Center, Chinese PLA General Hospital, Beijing, China

  • Pengming Yu,

    Roles Investigation, Supervision

    Affiliation Rehabilitation Medicine Center, West China Hospital, Sichuan University, Chengdu, Sichuan Province, China

  • Zhengbo Zhang ,

    Roles Funding acquisition, Investigation, Resources, Writing – original draft, Writing – review & editing

    ‡ ZZ and QZ also contributed equally to this work.

    Affiliation Medical Innovation Research Department, The General Hospital of PLA, Beijing, China

  • Qing Zhang ,

    Roles Supervision, Validation

    ‡ ZZ and QZ also contributed equally to this work.

    Affiliation Department of Cardiology, West China Hospital, Sichuan University, Chengdu, Sichuan Province, China

  • Wendong Xiao

    Roles Methodology, Validation, Writing – original draft, Writing – review & editing

    wdxiao@ustb.edu.cn

    Affiliation Beijing Engineering Research Center of Industrial Spectrum Imaging, School of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing, China

Abstract

There is limited evidence on how social determinants of health (SDOH) and physical frailty (PF) influence mortality prediction in heart failure (HF), particularly for in-hospital, 90-day, and 1-year outcomes. This study aims to develop explainable machine learning (ML) models to assess the prognostic value of SDOH and PF at multiple time points. We analyzed data from adult patients admitted to the intensive care unit (ICU) for the first time with a diagnosis of HF. Key variables extracted from electronic health records included SDOH (e.g., primary language, insurance type), PF indicators (Braden mobility, nutrition, activity, and fall risk scores), vital signs, laboratory tests, and lung sounds (LS) from both ICU admission and discharge. We employed the eXtreme Gradient Boosting (XGBoost) algorithm to build models for short- and long-term mortality prediction, and used SHapley Additive exPlanations (SHAP) to interpret model outputs and quantify the importance of each feature. The observed mortality rates were 14.8% in-hospital (n = 12,856), 7.0% at 90 days (n = 10,990), and 13.5% at 1 year (n = 10,221). The prediction models achieved area under the receiver operating characteristic curve (AUROC) scores of 0.836 (95% CI: 0.831–0.844) for in-hospital, 0.790 (95% CI: 0.780–0.800) for 90-day, and 0.789 (95% CI: 0.780–0.799) for 1-year mortality. These models outperformed baseline ML algorithms and conventional clinical risk scores. Key predictors of HF outcomes included age, fall risk, primary language, blood urea nitrogen, comorbidities, urine output, insurance type, and LS findings. Incorporating PF at ICU admission and discharge, along with SDOH such as language proficiency and insurance status, could enhance the identification of high-risk HF patients and may inform targeted interventions.

Introduction

Heart failure (HF) is a growing global health burden, affecting an estimated 64 million adults worldwide and contributing substantially to hospitalizations and healthcare costs [1–3]. Up to 25–50% of hospitalized HF patients require admission to the intensive care unit (ICU), where they face high risks of in-hospital and post-discharge mortality [4,5]. Despite advances in treatment, HF remains the leading cause of death among cardiovascular diseases globally [6]. Accurate mortality prediction is essential for improving HF management, particularly in identifying high-risk patients and allocating healthcare resources effectively [7,8].

Several risk stratification tools for HF have been developed, including the Get With The Guidelines–Heart Failure (GWTG-HF), ADHERE, and OPTIMIZE-HF scores [9–11]. These models, grounded in traditional linear regression, offer interpretability but are limited in their ability to capture complex, nonlinear interactions among risk factors. In contrast, machine learning (ML) approaches have shown promise in enhancing predictive performance by modeling high-dimensional, nonlinear relationships. However, many ML models overlook key prognostic factors such as social determinants of health (SDOH), physical frailty (PF), and lung sounds (LS), which are increasingly recognized as influential in HF outcomes [12].

SDOH—including socioeconomic status, access to care, and language proficiency—significantly impact disease progression and outcomes in HF [13,14]. Language barriers, in particular, can hinder effective communication during care transitions and discharge, leading to higher readmission rates [15,16]. In the United States, where over 26 million individuals have limited English proficiency, the prognostic role of language remains underexplored in both short- and long-term HF outcomes [15,17]. Insurance type (e.g., Medicare, Medicaid) may also reflect socioeconomic vulnerability, yet its utility as a stratification factor in HF prognosis is not well established [18,19].

PF further complicates HF management. Frailty, characterized by reduced physiological reserve and increased vulnerability to stressors, affects up to 79% of HF patients and is associated with worse clinical outcomes [19,20]. While various tools assess frailty, the Braden Scale—routinely used by nurses—offers a practical measure encompassing mobility, nutrition, and fall risk [21,22]. Previous studies have identified components of the Braden Scale, including fall risk, as independent predictors of mortality in HF [22,23]. Nevertheless, research in this area is limited by small sample sizes, restricted variable sets, and inconsistent outcome definitions.

LS also hold potential prognostic value in HF. Abnormal auscultatory findings such as crackles are common during HF exacerbations and may precede clinical deterioration [24,25]. Despite their relevance, LS are rarely incorporated into existing predictive models.

To address these limitations, there is a critical need for advanced, explainable ML models that integrate SDOH, PF, and LS to enhance risk stratification across care timelines. In this study, we aimed to develop and validate interpretable ML models to predict in-hospital, 90-day, and 1-year mortality among ICU-admitted HF patients. We further examined the contribution of SDOH, PF, and LS features to model performance and explored their potential as prognostic indicators to inform early clinical decision-making.

Materials and methods

We performed a retrospective study using two open-access databases: the Medical Information Mart for Intensive Care Database v1.4 (MIMIC-III, CareVue) and part of MIMIC-IV v1.0. The data were collected at the Beth Israel Deaconess Medical Center in Boston from 2001 to 2008 and from 2008 to 2016, respectively [26,27]. An overview of the study flow is presented in Fig 1.

Fig 1. Study design and survival outcomes across key subgroups.

(a) The study flow. (b) An overview of the inclusion criteria for all study cohorts. (c)-(e) Kaplan-Meier (K-M) curves by age group, primary language, and insurance.

https://doi.org/10.1371/journal.pone.0327979.g001

Study cohorts

We included all first-time ICU admissions of HF patients over 16 years old, identified by International Classification of Diseases diagnostic codes [28]. We excluded patients whose outcomes within 1 year after discharge were unknown and those who stayed in the ICU for less than 24 hours. The inclusion criteria for the three study cohorts are displayed in Fig 1. Hospital survivors were used to analyze the out-of-hospital outcomes.

Candidate variables

Eight types of information were collected for model development: (1) basic information, including age, gender, weight, body mass index (BMI), and Charlson Comorbidity Index (CCI); (2) SDOH, including primary language, insurance, and marital status; (3) vital signs, such as heart rate, respiratory rate, systolic blood pressure (SBP), and Glasgow Coma Scale (GCS); (4) laboratory tests, such as glucose, creatinine, blood urea nitrogen (BUN), and total bilirubin; (5) urine output; (6) treatments received, such as mechanical ventilation and vasopressors; (7) PF assessments, including activity, fall risk, Braden nutrition, and Braden activity; and (8) LS of the right and left sides. Details of all candidate variables are provided in S1 File.

Feature construction

The data measured during the first and last days of ICU admission were utilized to predict 90-day and 1-year mortality outcomes, while only first-day data were used for in-hospital mortality prediction. To distinguish between data measured on the first and last days, features from the last day were suffixed with ‘(leave)’. Additionally, we engineered new features to capture clinically relevant changes during the ICU stay, such as the difference in weight between the first and last day, which could reflect fluid balance and nutritional status, and the ratio of pulse oxygen saturation (SpO2) to fraction of inspired oxygen (FiO2), an indicator of respiratory efficiency.
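The feature construction described above might be sketched as follows. This is only an illustration under stated assumptions: the variable names (`weight`, `spo2`, `fio2_percent`), the direction of the weight difference (last day minus first day), and the expression of FiO2 as a percentage are hypothetical choices, not details confirmed by the study.

```python
def build_features(first_day, last_day):
    """Combine first-day and last-day ICU summaries into one feature dict.

    Both inputs are plain dicts mapping hypothetical variable names to values.
    """
    features = dict(first_day)

    # Last-day measurements carry the '(leave)' suffix described in the text.
    for name, value in last_day.items():
        features[f"{name} (leave)"] = value

    # Weight change over the ICU stay (assumed here: last day minus first day).
    if first_day.get("weight") is not None and last_day.get("weight") is not None:
        features["weight_diff"] = last_day["weight"] - first_day["weight"]

    # SpO2/FiO2 ratio, with FiO2 converted from percent to a fraction.
    for suffix, day in (("", first_day), (" (leave)", last_day)):
        spo2, fio2 = day.get("spo2"), day.get("fio2_percent")
        if spo2 is not None and fio2:
            features[f"spo2_fio2_ratio{suffix}"] = spo2 / (fio2 / 100.0)
    return features


example = build_features(
    {"weight": 82.0, "spo2": 95.0, "fio2_percent": 50.0},
    {"weight": 79.5, "spo2": 97.0, "fio2_percent": 21.0},
)
print(example["weight_diff"], example["spo2_fio2_ratio"])
```

For in-hospital mortality prediction, only the first-day dictionary would be used, so the suffixed and change-based features would be absent.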

Representative statistical features were calculated for each variable type to capture essential summary information. For example, we computed the maximum, minimum, mean, difference, and ratio values where appropriate, as these statistics can reflect the severity and variability of a patient’s condition. To address missing data, the median value of each continuous variable was used for imputation, with the exception of FiO2, for which a value of 21% was imputed when missing, aligning with the standard atmospheric oxygen concentration. This approach ensures that the imputed values are both clinically plausible and minimize bias introduced by missing data. Furthermore, for variables with missing data in 30% or more of the patient population, we generated a missing value indicator to preserve information potentially embedded in the absence of data, as missingness itself may carry predictive value [29].
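A minimal sketch of this imputation rule is given below, assuming each patient is a dictionary with `None` for missing values. The column names and the `_missing` suffix are hypothetical, and the sketch assumes every column other than FiO2 has at least one observed value.

```python
from statistics import median

FIO2_ROOM_AIR = 21.0      # percent, the default stated in the text
MISSING_THRESHOLD = 0.30  # columns missing in >= 30% of patients get an indicator


def impute(rows, continuous_columns):
    """Return copies of the rows with continuous columns imputed.

    Missing values are replaced by the column median, except FiO2, which is
    set to 21%. Columns missing in >= 30% of rows also get a 0/1 indicator.
    """
    out = [dict(r) for r in rows]
    n = len(rows)
    for col in continuous_columns:
        observed = [r[col] for r in rows if r.get(col) is not None]
        missing_ratio = (n - len(observed)) / n
        fill = FIO2_ROOM_AIR if col == "fio2" else median(observed)
        add_flag = missing_ratio >= MISSING_THRESHOLD
        for src, dst in zip(rows, out):
            is_missing = src.get(col) is None
            if add_flag:
                dst[f"{col}_missing"] = int(is_missing)
            if is_missing:
                dst[col] = fill
    return out


rows = [
    {"bun": 20.0, "fio2": 40.0},
    {"bun": None, "fio2": None},
    {"bun": 30.0, "fio2": None},
    {"bun": 40.0, "fio2": 50.0},
]
cleaned = impute(rows, ["bun", "fio2"])
print(cleaned[1])
```

In a train/test setting, the medians would normally be computed on the training set and reused for the test set to avoid information leakage; the text does not state how this was handled, so the sketch leaves that choice to the caller.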

Overall, 79 features were constructed for in-hospital mortality prediction and 122 features for out-of-hospital mortality predictions. A comprehensive list of all variables studied, along with the specific features derived from them, is provided in S1 File. Detailed information on the missing data, including the missing ratios for each feature, can be found in S1 File. This thorough feature construction process ensures that the model captures a wide range of clinically relevant information while addressing potential issues related to missing data.

Data preprocessing

ML algorithms can be biased against vulnerable subpopulations. To reduce this risk with respect to primary language and insurance characteristics, we applied the reweighing algorithm, a data preprocessing technique, before modeling. It computes a weight for each patient to decrease imbalance and unfairness across these groups [30,31].
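To illustrate the idea, the sketch below uses the commonly described reweighing formulation, in which each patient receives the weight P(group) × P(outcome) / P(group, outcome). Whether the study used exactly this form, and how language and insurance were combined into groups, is an assumption here; the group labels and toy data are hypothetical.

```python
from collections import Counter


def reweighing_weights(groups, labels):
    """Per-sample weights that make group and outcome independent.

    weight(g, y) = P(g) * P(y) / P(g, y), estimated from counts.
    """
    n = len(labels)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    joint_counts = Counter(zip(groups, labels))
    return [
        group_counts[g] * label_counts[y] / (n * joint_counts[(g, y)])
        for g, y in zip(groups, labels)
    ]


# Toy cohort: 6 English speakers (1 death) and 4 other-language speakers (2 deaths).
groups = ["english"] * 6 + ["other"] * 4
labels = [0, 0, 0, 0, 0, 1, 0, 0, 1, 1]
weights = reweighing_weights(groups, labels)


def weighted_death_rate(group):
    num = sum(w for g, y, w in zip(groups, labels, weights) if g == group and y == 1)
    den = sum(w for g, w in zip(groups, weights) if g == group)
    return num / den


print(weighted_death_rate("english"), weighted_death_rate("other"))
```

After weighting, both toy groups have the same weighted mortality rate as the overall cohort, which is the imbalance the preprocessing step is meant to remove. The weights can then be passed to a learner that accepts per-sample weights.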

Model development

The eXtreme Gradient Boosting (XGBoost) algorithm was selected for developing the mortality prediction models for HF patients due to its superior performance in handling structured data and its ability to effectively manage missing values, interactions, and non-linear relationships. XGBoost has consistently demonstrated high predictive accuracy in various healthcare applications, making it a robust choice for this study [32,33]. To provide a comprehensive evaluation, two other ML algorithms, logistic regression (LR) and random forests (RF), were used as baseline models. LR was chosen due to its simplicity, interpretability, and widespread use in clinical settings, while RF was selected for its strong performance in capturing complex interactions and non-linearities. By comparing the performance of XGBoost against these baseline models, we aim to illustrate the advantages of using advanced ML techniques in predicting both short- and long-term mortality outcomes in HF patients. For model development, all patients were randomly split into an 80% training set, which was used for training the models and tuning hyperparameters via grid search, and a 20% testing set, which was reserved for evaluating model performance.
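The split-and-tune procedure can be sketched generically as below. This is not the study's code: the hyperparameter grid is hypothetical, and `toy_evaluate` merely stands in for fitting a model on the training set and scoring it, since the text does not state the grid or the scoring scheme used during the search.

```python
import random
from itertools import product


def split_80_20(patient_ids, seed=42):
    """Randomly split patient identifiers into 80% training and 20% testing."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)
    cut = len(ids) * 8 // 10
    return ids[:cut], ids[cut:]


def grid_search(param_grid, evaluate):
    """Try every combination in param_grid; return (best_params, best_score)."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


train_ids, test_ids = split_80_20(range(100))


def toy_evaluate(params):
    # Placeholder scorer: a real one would train on train_ids only and
    # return a validation metric such as AUROC.
    return -abs(params["max_depth"] - 4) - abs(params["learning_rate"] - 0.1)


best_params, best_score = grid_search(
    {"max_depth": [2, 4, 6], "learning_rate": [0.05, 0.1, 0.3]},
    toy_evaluate,
)
print(best_params)
```

The key design point is that the test identifiers never enter `evaluate`, so the reported test performance is not influenced by hyperparameter tuning.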

Model evaluation

The discrimination performance of our prediction models was assessed on the test set and compared against the baseline models and conventional clinical scoring systems, including the SOFA and GWTG-HF scores. Seven evaluation metrics were calculated with their 95% confidence intervals (95% CI): the area under the receiver operating characteristic curve (AUROC), sensitivity, specificity, accuracy, F1 score, precision, and the area under the precision-recall curve.
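As an illustration of how such metrics can be obtained, the sketch below computes AUROC from pairwise comparisons, several threshold-based metrics, and a percentile bootstrap confidence interval. The bootstrap is an assumption, since the text does not state how the 95% CIs were derived, and the 0.5 threshold and toy data are hypothetical.

```python
import random


def auroc(y_true, scores):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def threshold_metrics(y_true, scores, threshold=0.5):
    """Sensitivity, specificity, precision, accuracy, and F1 at one threshold."""
    pred = [int(s >= threshold) for s in scores]
    tp = sum(p == 1 and y == 1 for p, y in zip(pred, y_true))
    tn = sum(p == 0 and y == 0 for p, y in zip(pred, y_true))
    fp = sum(p == 1 and y == 0 for p, y in zip(pred, y_true))
    fn = sum(p == 0 and y == 1 for p, y in zip(pred, y_true))
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    prec = tp / (tp + fp) if tp + fp else 0.0
    f1 = 2 * prec * sens / (prec + sens) if prec + sens else 0.0
    return {"sensitivity": sens, "specificity": spec, "precision": prec,
            "accuracy": (tp + tn) / len(y_true), "f1": f1}


def bootstrap_ci(y_true, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for AUROC; resamples lacking a class are redrawn."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if len(set(ys)) < 2:
            continue
        stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    return stats[int(alpha / 2 * n_boot)], stats[int((1 - alpha / 2) * n_boot) - 1]


y = [0, 0, 1, 1]
s = [0.1, 0.4, 0.35, 0.8]
print(auroc(y, s), threshold_metrics(y, s), bootstrap_ci(y, s))
```

The pairwise AUROC is quadratic in cohort size and is meant only to make the definition concrete; a rank-based implementation would be preferable for cohorts of the size analyzed here.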

Model interpretation

The SHapley Additive exPlanations (SHAP) technique, which is based on a game theory framework, is a popular method for assessing the impact of a feature on model predictions and ranking its importance and relevance [34,35]. The Shapley value significantly enhances the interpretability of intricate ML models. We elected to utilize SHAP to elucidate the relative importance of diverse features that collectively influenced mortality prediction outcomes.
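The underlying Shapley value can be made concrete with a brute-force calculation on a toy model. The sketch below is not the SHAP library and not the tree-specific algorithm typically used with XGBoost; it enumerates all feature coalitions and replaces absent features with a baseline value, which is one simple choice of value function. The risk function, feature names, and baseline are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance by enumerating coalitions.

    Features outside a coalition take their baseline value.
    """
    names = list(x)
    n = len(names)

    def value(coalition):
        mixed = {k: (x[k] if k in coalition else baseline[k]) for k in names}
        return predict(mixed)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {name}) - value(s))
        phi[name] = total
    return phi


def toy_risk(f):
    # Hypothetical risk score with an age-by-fall-risk interaction.
    return 0.02 * f["age"] + 0.5 * f["high_fall_risk"] + 0.01 * f["age"] * f["high_fall_risk"]


patient = {"age": 80, "high_fall_risk": 1}
reference = {"age": 60, "high_fall_risk": 0}
phi = shapley_values(toy_risk, patient, reference)
print(phi)
```

The contributions sum to the difference between the patient's prediction and the baseline prediction, which is the additivity property that lets per-feature SHAP values be read as shares of an individual risk estimate. Exhaustive enumeration grows exponentially with the number of features, so it is practical only for toy examples like this one.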

Statistical analysis

Continuous variables are presented as medians with interquartile ranges (IQR). The t test or Wilcoxon rank-sum test was used, as appropriate, to compare survivors and non-survivors with HF. Categorical variables are reported as counts and percentages. Two-sided p-values of less than 0.05 were considered statistically significant.

Ethical statement

This study was exempt from institutional review board approval due to the retrospective design and lack of direct patient intervention. All data from patients were retrospectively collected from the electronic health care records systems (in the form of third-party public databases or hospital health care systems), which originated from daily clinical work.

All data were de-identified before the analysis. Third-party public databases were used in this study. The institutional review boards of the Massachusetts Institute of Technology (number 0403000206) and Beth Israel Deaconess Medical Center (number 2001-P-001699/14) approved the use of the database for research.

The requirement for individual patient consent was waived because the study did not impact clinical care, all protected health information was deidentified, and all available data in the databases were anonymous.

Results

Patient characteristics

The three study cohorts comprised 12,856 patients for in-hospital mortality (14.8% mortality), 10,990 discharged patients for 90-day mortality (7.0% mortality), and 10,221 discharged patients for 1-year mortality (13.5% mortality). The characteristics of these groups are presented in Table 1. Compared with survivors, patients with unfavorable outcomes were older and had lower BMI, a greater proportion of urgent admissions, higher CCI scores, elevated fall risk, a predominantly bedridden status, poor nutritional status, markedly limited mobility, muddy LS, a higher proportion of non-English primary languages, and more frequent Medicare insurance coverage. These patients also had longer ICU and hospital stays. Across all three outcomes, non-survivors consistently showed decreased weight, limited activity, increased fall risk, poorer nutrition, muddy LS, Medicare coverage, and a non-English primary language (see S1 File). However, a higher rate of mechanical ventilation was observed only in hospital non-survivors.

Table 1. The baseline characteristic of the total study cohorts divided by target outcomes.

https://doi.org/10.1371/journal.pone.0327979.t001

Model performance evaluation

The performance of the models was evaluated on the test sets using ROC curves, as illustrated in Fig 2. The prediction models showed satisfactory discrimination, with AUROC values of 0.836 (95% CI: 0.831–0.844) for in-hospital mortality, 0.790 (95% CI: 0.780–0.800) for 90-day mortality, and 0.789 (95% CI: 0.780–0.799) for 1-year mortality. The AUROC for short-term outcome prediction was higher than that for long-term prediction. Furthermore, our models were compared against two baseline ML models and conventional clinical scores for all outcomes of interest, using seven metrics in total. As shown, our prediction models consistently outperformed all other models and scores.

Fig 2. The comparison of the three final models against 2 baseline ML models and conventional clinical scores.

(a) in-hospital, (b) 90-day and (c) 1-year.

https://doi.org/10.1371/journal.pone.0327979.g002

Feature importance and interpretation

In Fig 3, we present the top 20 risk factors for in-hospital, 90-day, and 1-year mortality predictions, with all feature rankings provided in S1 File. The relative importance of features is indicated by their position on the y-axis, with the most influential features positioned higher. The x-axis represents the SHAP value, which reflects the impact of each feature on the model’s predictions. A positive SHAP value indicates that the feature contributes to an elevated risk of mortality. In the case of continuous features, a color gradient from red to blue is used to indicate a decrease in feature value, with red representing a higher value and blue representing a lower value. In the case of binary features, the color red indicates the presence of a condition (e.g., “yes”), while blue represents its absence (“no”).

Fig 3. The models’ interpretation of the top 20 important features based on the optimal models.

(a) in-hospital mortality. (b) 90-day mortality. (c) 1-year mortality. The higher the SHAP value a feature is given, the higher the risk of death for the patient. The red part in feature value represents a higher value. (d) Alluvial plot of the top 20 risk factors for outcome predictions of HF patients.

https://doi.org/10.1371/journal.pone.0327979.g003

As illustrated in Fig 3, several factors, including the GCS, urine output, BUN, age, respiratory rate, SBP, and weight, were identified as crucial for the early assessment of in-hospital mortality. It is noteworthy that Braden mobility and nutrition, activity, and right upper lobe LS emerged as novel and significant predictors, ranked 3rd, 11th, 12th, and 18th, respectively. These findings indicate that integrating functional and nutritional status with traditional vital signs may facilitate enhanced early risk stratification in clinical practice.

In Fig 3, the pre-ICU discharge status indicators, including BUN, temperature, urine output, SBP, respiratory rate, and weight, exhibited a more pronounced correlation with the 90-day outcomes. Age and fall risk (leave) were the most significant predictors, indicating that elderly patients with impaired mobility or a higher fall risk at discharge are at an elevated risk for adverse outcomes. The prominence of primary language and insurance status (SDOH) ranked 4th and 11th, respectively, underscoring the necessity of considering social factors in post-discharge care planning. This could facilitate the implementation of tailored interventions and follow-up strategies.

As shown in Fig 3, a combination of basic demographic information and ICU admission and discharge status predicted 1-year outcomes. Primary language, CCI score, age, BMI, pre-ICU length of stay (LOS), and insurance status all ranked among the top 13 predictors. Furthermore, creatinine, chloride, and platelet levels were identified as significant biomarkers, ranking 9th, 17th, and 20th, respectively. Fall risk assessed prior to discharge was the most important predictor, ranking 1st, which emphasizes the importance of comprehensive discharge assessments for predicting long-term outcomes.

In order to establish a connection between model explanation and clinical application, we presented the practical utility of SHAP values in S1 File through the use of two case studies, one survivor and one non-survivor, for each outcome prediction. The case studies demonstrate how SHAP values can be employed to identify pivotal risk factors for individual patients, thereby facilitating the formulation of personalized treatment plans. For example, a high SHAP value for BUN or fall risk could prompt closer monitoring or early intervention to mitigate the identified risks. By translating model explanations into actionable insights, clinicians can enhance decision-making and patient outcomes through more targeted and informed care strategies.

Contribution of SDOH, PF, and LS

The contributions of SDOH (insurance and primary language), PF (fall risk, activity, Braden nutrition, Braden mobility, Braden activity), and LS to model performance were evaluated separately by removing each group and re-training all outcome prediction models. Fig 4 and S1 File display the AUROC comparisons with the full models. All three groups of factors improved discrimination, and the improvement was most evident in 90-day and 1-year outcome prediction.

Fig 4. The reduction of discrimination when dropping the variables of SDOH, PF and LS.

(a) in-hospital mortality. (b) 90-day mortality. (c) 1-year mortality.

https://doi.org/10.1371/journal.pone.0327979.g004

Sensitivity analysis of different variables sets

We performed two types of sensitivity analysis to assess model stability. First, we built models on the top-ranked variables, starting with the top 5 and adding 5 at a time until all variables were included, and analyzed the change in discrimination for all outcome predictions. S1 File presents the variation of AUROC in the development and test sets for the three tasks. For short-term prediction, the AUROC exceeded 0.8 once the 40 most important features were included. For long-term prediction, the models achieved relatively acceptable performance with the 30 top-ranked features. Furthermore, the model with only 5 features outperformed both clinical scores in in-hospital mortality prediction. Second, we separately analyzed the importance of assessing HF patients on their first and last days in the ICU for long-term prognosis (see S1 File). Out-of-hospital outcomes were better predicted by patient status before ICU discharge than by status at admission, and combining the two allowed more precise prediction.

Discussion

This retrospective prognostic study developed and validated three ML models to predict in-hospital, 90-day, and 1-year mortality in critically ill patients with HF, based on data from the first and last ICU admission days. To enhance model performance and mitigate disparities, we incorporated key variables including SDOH such as primary language and insurance type; PF indicators including activity level, fall risk, Braden mobility, and nutrition scores; and LS. Our findings demonstrate that these features significantly improved the predictive accuracy of the models, particularly in long-term outcome assessments.

We observed that different variables contributed uniquely to short- versus long-term prognostication. Braden mobility, nutrition, and activity were more influential in predicting in-hospital outcomes, whereas fall risk emerged as a stronger predictor for 90-day and 1-year mortality. The inclusion of SDOH became increasingly important in out-of-hospital outcome prediction. Across all timepoints, our models consistently outperformed baseline ML models and widely used clinical scores such as SOFA and GWTG-HF, achieving AUROCs of 0.836 (in-hospital), 0.790 (90-day), and 0.789 (1-year).

To evaluate generalizability, we conducted an external validation using the eICU Collaborative Research Database, which includes ICU admissions from over 200 U.S. hospitals. Although follow-up data for 90-day and 1-year outcomes were not available, the model performed well in predicting in-hospital mortality, with an AUROC of 0.769 (see S1 File).

A novel contribution of our study is the inclusion of primary language as a predictor variable—an aspect rarely considered in previous HF prognostic models. Language proficiency ranked as the 13th, 4th, and 2nd most important feature for in-hospital, 90-day, and 1-year mortality, respectively (Fig 3). Kaplan-Meier curves revealed substantial survival differences between English-proficient and non-proficient patients (Fig 1), underscoring the need to incorporate language into risk stratification or to examine it as a separate analytic factor.

Similarly, we found that insurance type—particularly Medicare, which predominantly covers older adults and those with disabilities—was associated with long-term outcomes. Patients covered by Medicare showed lower survival rates, reflecting both age-related and socioeconomic vulnerabilities (Fig 1). While prior studies have examined the link between insurance and HF outcomes [41, 42], our study is among the first to evaluate insurance as a formal input to prediction models. Its inclusion ranked within the top 13 features (Fig 3), and model performance declined when the insurance variable was removed (see S1 File), reinforcing its prognostic value.

We further assessed PF by analyzing Braden scale components and activity status. Braden mobility, nutrition, and activity were significant predictors of in-hospital outcomes, reflecting the patient’s immediate functional capacity. In contrast, fall risk at ICU discharge played a more critical role in long-term outcome prediction, suggesting its potential utility for post-discharge care planning and rehabilitation.

Several conventional clinical variables, including age, BUN, and the CCI score, consistently ranked as top predictors across all models, reinforcing their clinical relevance. Notably, urine output on the last ICU day ranked 6th in importance for long-term outcomes, indicating its role as a potential marker of renal function and volume status. Conversely, right upper lobe LS emerged as a relevant predictor primarily for in-hospital mortality (ranked 18th), likely due to its association with acute pulmonary complications rather than long-term prognosis.

Our work contributes several key advances to the field of HF outcome prediction. We demonstrated that integrating SDOH—specifically primary language and insurance type—can enhance long-term risk assessment. We also expanded the consideration of frailty metrics, including fall risk, which proved to be a meaningful predictor of post-discharge mortality. By selecting variables readily available at ICU admission and discharge, we developed robust, explainable ML models that outperform existing tools and support early identification of high-risk HF patients both during hospitalization and after discharge.

Study limitations

Several limitations should be acknowledged in this study. First, the retrospective nature of the analysis and the use of data from a single institution may limit the generalizability of the findings to broader HF populations and diverse clinical settings. Second, the prediction models were developed using data only from the first and last ICU days, excluding the intermediate course of ICU treatment. This omission may have led to a loss of clinically relevant information, potentially affecting model performance. Third, in order to simplify feature extraction, we represented dynamic variables using summary statistics such as maximum, minimum, or mean values over specified time intervals. While this approach facilitated model development, it disregarded the temporal patterns and variability inherent in time-series data, which may contain important prognostic information. Incorporating full time-series dynamics will be a focus of future research.

Conclusion

In this prognostic study, we developed and validated three outcome prediction models for patients with HF, targeting in-hospital, 90-day, and 1-year mortality. The models integrated a comprehensive set of predictors, including demographic and clinical variables (age, CCI score, BMI), PF indicators (Braden mobility and nutrition, activity level, and fall risk), SDOH (primary language and insurance status), vital signs (GCS, systolic blood pressure, respiratory rate, and temperature), laboratory values (BUN), urine output, and LS. These variables collectively enabled more accurate and individualized risk stratification for both short- and long-term outcomes. The models demonstrated strong discriminatory power and interpretability, offering promising tools to support clinical decision-making and optimize discharge planning for patients with HF.

Supporting information

S1 File. Supplementary materials: cohort characteristics, detailed model performance, and feature analysis for mortality risk prediction.

https://doi.org/10.1371/journal.pone.0327979.s001

(DOCX)

References

  1. Virani SS, Alonso A, Benjamin EJ, Bittencourt MS, Callaway CW, Carson AP, et al. Heart disease and stroke statistics-2020 update: a report from the American Heart Association. Circulation. 2020;141(9):e139–596. pmid:31992061
  2. Ambrosy AP, Fonarow GC, Butler J, Chioncel O, Greene SJ, Vaduganathan M, et al. The global health and economic burden of hospitalizations for heart failure: lessons learned from hospitalized heart failure registries. J Am Coll Cardiol. 2014;63(12):1123–33. pmid:24491689
  3. Akintoye E, Briasoulis A, Egbe A, Dunlay SM, Kushwaha S, Levine D, et al. National trends in admission and in-hospital mortality of patients with heart failure in the United States (2001-2014). J Am Heart Assoc. 2017;6(12):e006955. pmid:29187385
  4. Jentzer JC, Reddy YN, Rosenbaum AN, Dunlay SM, Borlaug BA, Hollenberg SM. Outcomes and predictors of mortality among cardiac intensive care unit patients with heart failure. J Card Fail. 2022;28(7):1088–99. pmid:35381356
  5. Cook C, Cole G, Asaria P, Jabbour R, Francis DP. The annual global economic burden of heart failure. Int J Cardiol. 2014;171(3):368–76. pmid:24398230
  6. Tuegel C, Bansal N. Heart failure in patients with kidney disease. Heart. 2017;103(23):1848–53. pmid:28716974
  7. Adler ED, Voors AA, Klein L, Macheret F, Braun OO, Urey MA, et al. Improving risk prediction in heart failure using machine learning. Eur J Heart Fail. 2020;22(1):139–47. pmid:31721391
  8. Angraal S, Mortazavi BJ, Gupta A, Khera R, Ahmad T, Desai NR, et al. Machine learning prediction of mortality and hospitalization in heart failure with preserved ejection fraction. JACC Heart Fail. 2020;8(1):12–21. pmid:31606361
  9. Peterson PN, Rumsfeld JS, Liang L, Albert NM, Hernandez AF, Peterson ED, et al. A validated risk score for in-hospital mortality in patients with heart failure from the American Heart Association get with the guidelines program. Circ Cardiovasc Qual Outcomes. 2010;3(1):25–32. pmid:20123668
  10. White-Williams C, Rossi LP, Bittner VA, Driscoll A, Durant RW, Granger BB, et al. Addressing social determinants of health in the care of patients with heart failure: a scientific statement from the American Heart Association. Circulation. 2020;141(22):e841–63. pmid:32349541
  11. Abraham WT, Fonarow GC, Albert NM, Stough WG, Gheorghiade M, Greenberg BH, et al. Predictors of in-hospital mortality in patients hospitalized for heart failure: insights from the Organized Program to Initiate Lifesaving Treatment in Hospitalized Patients with Heart Failure (OPTIMIZE-HF). J Am Coll Cardiol. 2008;52(5):347–56. pmid:18652942
  12. Sterling MR, Ringel JB, Pinheiro LC, Safford MM, Levitan EB, Phillips E, et al. Social determinants of health and 90-day mortality after hospitalization for heart failure in the REGARDS study. J Am Heart Assoc. 2020;9(9):e014836. pmid:32316807
  13. Enard KR, Coleman AM, Yakubu RA, Butcher BC, Tao D, Hauptman PJ. Influence of social determinants of health on heart failure outcomes: a systematic review. J Am Heart Assoc. 2023;12(3):e026590. pmid:36695317
  14. 14. Averbuch T, Mohamed MO, Islam S, Defilippis EM, Breathett K, Alkhouli MA, et al. The association between socioeconomic status, sex, race / ethnicity and in-hospital mortality among patients hospitalized for heart failure. J Card Fail. 2022;28(5):697–709. pmid:34628014
  15. 15. Latif Z, Makuvire T, Feder SL, Garan AR, Pinzon PQ, Warraich HJ, et al. Challenges facing heart failure patients with limited english proficiency: a qualitative analysis leveraging interpreters’ perspectives. JACC Heart Fail. 2022;10(6):430–8.
  16. 16. Seman M, Karanatsios B, Simons K, Falls R, Tan N, Wong C, et al. The impact of cultural and linguistic diversity on hospital readmission in patients hospitalized with acute heart failure. Eur Heart J Qual Care Clin Outcomes. 2020;6(2):121–9. pmid:31332442
  17. 17. Dietrich S, Hernandez E. Language use in the United States: 2019. Am Community Surv Rep. 2022.
  18. 18. Wadhera RK, Joynt Maddox KE, Wang Y, Shen C, Yeh RW. 30-Day episode payments and heart failure outcomes among medicare beneficiaries. JACC Heart Fail. 2018;6(5):379–87.
  19. 19. Blecker S, Herrin J, Li L, Yu H, Grady JN, Horwitz LI. Trends in hospital readmission of medicare-covered patients with heart failure. J Am Coll Cardiol. 2019;73(9):1004–12.
  20. 20. Somech J, Joshi A, Mancini R, Chetrit J, Michel C, Sheppard RJ, et al. Impact of physical frailty on survival and quality of life in heart failure patients. J Am Coll Cardiol. 2022;79(9):389.
  21. 21. Kozier B. Fundamentals of nursing: concepts, process and practice. Pearson Education; 2008.
  22. 22. Bandle B, Ward K, Min S-J, Drake C, McIlvennan CK, Kao D, et al. Can braden score predict outcomes for hospitalized heart failure patients? J Am Geriatr Soc. 2017;65(6):1328–32. pmid:28221672
  23. 23. Manemann SM, Chamberlain AM, Boyd CM, Miller DM, Poe KL, Cheville A, et al. Fall risk and outcomes among patients hospitalized with cardiovascular disease in the community. Circ Cardiovasc Qual Outcomes. 2018;11(8):e004199. pmid:30354374
  24. 24. Wang Z, Baumann BM, Slutsky K, Gruber KN, Jean S. Respiratory sound energy and its distribution patterns following clinical improvement of congestive heart failure: a pilot study. BMC Emerg Med. 2010;10:1. pmid:20078862
  25. 25. Cottin V, Cordier J-F. Velcro crackles: the key for early diagnosis of idiopathic pulmonary fibrosis? Eur Respir J. 2012;40(3):519–21. pmid:22941541
  26. 26. Johnson AEW, Pollard TJ, Shen L, Lehman L-WH, Feng M, Ghassemi M, et al. MIMIC-III, a freely accessible critical care database. Sci Data. 2016;3:160035. pmid:27219127
  27. 27. Johnson AE, Stone DJ, Celi LA, Pollard TJ. The MIMIC Code Repository: enabling reproducibility in critical care research. J Am Med Inform Assoc. 2018;25(1):32–9. pmid:29036464
  28. 28. Xu Y, Lee S, Martin E, D’souza AG, Doktorchik CTA, Jiang J, et al. Enhancing ICD-code-based case definition for heart failure using electronic medical record data. J Card Fail. 2020;26(7):610–7. pmid:32304875
  29. 29. Liu X, DuMontier C, Hu P, Liu C, Yeung W, Mao Z, et al. Clinically interpretable machine learning models for early prediction of mortality in older patients with multiple organ dysfunction syndrome: an international multicenter retrospective study. J Gerontol A Biol Sci Med Sci. 2023;78(4):718–26. pmid:35657011
  30. 30. Kamiran F, Calders T. Data preprocessing techniques for classification without discrimination. Knowl Inf Syst. 2011;33(1):1–33.
  31. 31. Park Y, Hu J, Singh M, Sylla I, Dankwa-Mullan I, Koski E, et al. Comparison of methods to reduce bias from clinical prediction models of postpartum depression. JAMA Netw Open. 2021;4(4):e213909. pmid:33856478
  32. 32. Chen T, Guestrin C. XGBoost: a scalable tree boosting system. Proceedings of the 22nd acm sigkdd international conference on knowledge discovery and data mining. New York: Association for Computing Machinery; 2016. p. 785–94.
  33. 33. Ye C, Li J, Hao S, Liu M, Jin H, Zheng L, et al. Identification of elders at higher risk for fall with statewide electronic health records and a machine learning algorithm. Int J Med Inform. 2020;137:104105. pmid:32193089
  34. 34. Lundberg SM, Nair B, Vavilala MS, Horibe M, Eisses MJ, Adams T, et al. Explainable machine-learning predictions for the prevention of hypoxaemia during surgery. Nat Biomed Eng. 2018;2(10):749–60. pmid:31001455
  35. 35. Xue B, Li D, Lu C, King CR, Wildes T, Avidan MS, et al. Use of machine learning to develop and evaluate models using preoperative and intraoperative data to identify risks of postoperative complications. JAMA Netw Open. 2021;4(3):e212240. pmid:33783520