Abstract
Background and objective
Despite advances in intensive care, sepsis remains a leading cause of mortality in intensive care unit (ICU) patients, especially middle-aged and elderly individuals. Given the limitations of conventional scoring systems and the interpretability challenges of machine learning models, this study aims to develop and temporally validate a nomogram for predicting 28-day ICU mortality in middle-aged and elderly sepsis patients via the eICU database (2014–2015), providing a clinically practical prediction tool.
Methods
This retrospective study included 13,717 sepsis patients aged ≥45 years. The cohort was temporally divided into training (n = 6,397, 2014) and validation (n = 7,320, 2015) sets. Variable selection was performed via random forest importance ranking and LASSO regression. A nomogram was developed on the basis of multivariable logistic regression analysis.
Results
The 28-day ICU mortality rates were 9.08% and 9.49% in the training and validation cohorts, respectively. The final nomogram incorporated 11 independent predictors: red cell distribution width (RDW), SOFA score, lactate, pH, 24-hour urine output, platelet count, total protein, temperature, heart rate, GCS score, and white blood cell (WBC) count. The model showed good discrimination in both the training (AUC: 0.805) and validation (AUC: 0.756) cohorts. The calibration curves demonstrated good agreement between the predicted and observed probabilities.
Citation: She X, Zhao X, Yang H, Cui X (2025) Development and temporal validation of a nomogram for predicting ICU 28-day mortality in middle-aged and elderly sepsis patients: An eICU database study. PLoS One 20(7): e0328701. https://doi.org/10.1371/journal.pone.0328701
Editor: Robert Jeenchen Chen, Stanford University School of Medicine, UNITED STATES OF AMERICA
Received: March 29, 2025; Accepted: June 30, 2025; Published: July 21, 2025
Copyright: © 2025 She et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: The original source data are available at https://eicu-crd.mit.edu/. The specific dataset used for this study, including extracted variables and processed data for analysis, is available in the Dryad Digital Repository at https://doi.org/10.5061/dryad.hmgqnk9wb.
Funding: The author(s) received no specific funding for this work.
Competing interests: The authors have declared that no competing interests exist.
Abbreviations: ICU, Intensive Care Unit; SOFA, Sequential Organ Failure Assessment; SIRS, Systemic Inflammatory Response Syndrome; MODS, Multiple Organ Dysfunction Syndrome; APACHE, Acute Physiology and Chronic Health Evaluation; SAPS, Simplified Acute Physiology Score; ML, Machine Learning; GCS, Glasgow Coma Scale; RDW, Red Cell Distribution Width; WBC, White Blood Cell; PLT, Platelet; BMI, Body Mass Index; MAP, Mean Arterial Pressure; EPV, Events Per Variable; PPV, Positive Predictive Value; NPV, Negative Predictive Value; DCA, Decision Curve Analysis; AUC, Area Under the Curve.
Introduction
Sepsis is a common critical illness in the intensive care unit (ICU) and is characterized by systemic inflammatory response syndrome (SIRS) and multiple organ dysfunction syndrome (MODS) caused by infection [1]. Despite advances in diagnosis and organ support in intensive care units, sepsis continues to have high morbidity and mortality rates [2]. According to recent meta-analyses, the hospital mortality rate for sepsis patients is 26.7%, with an even higher rate of 41.9% among ICU patients [3]. In the ICU, mortality rates among sepsis patients are significantly higher than those in the general patient population [4]. Moreover, with an aging population, the number of elderly sepsis patients continues to increase, and these patients face increased mortality risk [5,6].
Early identification and accurate prediction of mortality risk in elderly sepsis patients are essential for improving clinical outcomes. Existing scoring systems provide valuable guidance in clinical practice, but they incorporate a limited number of predictive variables and therefore do not comprehensively capture the complex pathophysiological processes underlying sepsis. Our nomogram model aims to offer more individualized risk assessment for middle-aged and elderly sepsis patients by integrating multiple clinical and laboratory parameters. This approach has the potential to complement existing tools, particularly in identifying high-risk patients within this specific population.
In recent years, machine learning (ML) models have been widely applied in the prognostic prediction of sepsis patients. For example, Liu et al. developed a stacking ensemble ML model based on the MIMIC-IV database to predict in-hospital mortality risk in patients with sepsis-induced coagulopathy. Their model identified anion gap and age as the most crucial predictive features, demonstrating robust predictive performance (AUC = 0.795, 95% CI: 0.763–0.827) [7]. Additionally, Wang et al. (2022) reported that the LightGBM model outperformed other ML algorithms in predicting 30-day mortality among sepsis patients, achieving an AUC of 0.90 [8]. Machine learning models demonstrate significant advantages in medical prediction by processing complex nonlinear relationships and integrating numerous clinical variables. However, the ‘black box’ nature of these models (the opacity of the decision-making process and the algorithmic complexity) can make it difficult for clinicians to understand and trust their predictions [9]. Furthermore, the model’s generalizability may be limited by data heterogeneity and temporal variations, limiting its application across different healthcare settings.
With the accumulation of electronic health records and advancements in machine learning technologies, numerous studies have focused on developing models to predict mortality risk in sepsis patients, aiming to provide robust support for clinical decision-making. While Shen et al. employed interpretable machine learning methods with multicenter validation, their study did not address validation differences across different years [10]. Yang et al. developed a conformity prediction model (CPMORS) incorporating model interpretation and uncertainty estimation; however, their validation data were confined to a specific time period, making it difficult to reflect the model’s dynamic performance in clinical practice [11]. Although Zhang et al. developed an XGBoost model on the basis of multiple databases, their study also lacked stratified validation across different years [12]. The importance of temporal validation lies in the fact that diagnostic criteria, treatment protocols, and patient population characteristics may change over time in medical practice, potentially affecting the accuracy and reliability of prediction models. For example, in a gestational diabetes risk prediction study [13], the original model showed reasonable discrimination but suboptimal calibration during temporal validation, highlighting the importance of updating models to maintain their predictive performance in contemporary populations.
Middle-aged and elderly patients are more susceptible to sepsis and have poorer prognoses due to decreased physiological function, increased chronic disease burden, and impaired immune function. Therefore, developing and validating sepsis mortality prediction models specifically for this vulnerable population has significant clinical implications. However, current research on sepsis mortality prediction focuses primarily on model development for the general population, with relatively insufficient targeted studies for middle-aged and elderly patients as a high-risk group. For example, although Zhang et al. (2021) developed the Sepsis Mortality Risk Score (SMRS), their study population did not specifically distinguish middle-aged and elderly patients, and their dataset was limited to 2008–2012, failing to cover a broader time range to evaluate the model’s long-term performance in this age group [14]. Other outcome prediction studies in sepsis patients also lack a specific focus on middle-aged and elderly patients [15,16].
Among medical prediction models, nomograms, as intuitive and practical visualization tools, have been widely applied in prognostic assessment and risk prediction for various diseases [17,18]. By transforming complex predictive models into visual graphics, nomograms enable clinicians to quickly and conveniently calculate patients’ prognostic risk, demonstrating significant advantages in clinical practice. However, nomograms have not yet been developed for sepsis patients.
Given the limitations of existing predictive models, this study leverages the multicenter eICU database, which offers comprehensive and high-quality data particularly suitable for elderly sepsis patients. We aimed to develop a nomogram-based model using cross-year data (2014 for development, 2015 for validation) to predict 28-day ICU mortality in middle-aged and elderly sepsis patients. This novel approach, which incorporates temporal validation and nomogram visualization, addresses a significant gap in previous research. By comparing the validation results across different years, we can assess the model’s stability and generalizability under evolving healthcare environments and treatment guidelines. This comprehensive evaluation provides valuable insights for clinical decision-making in this high-risk population and can also guide future model optimization, facilitating early risk stratification and personalized intervention strategies.
Methods
Data source and ethics approval
We utilized the eICU Collaborative Research Database (eICU-CRD), which contains clinical information from 335 intensive care units across 208 hospitals in the United States from 2014 to 2015. We accessed the database after completing the CITI program training and obtaining PhysioNet certification (Record ID: 67403327). The database is publicly available and fully deidentified in compliance with HIPAA regulations (Privacert, Cambridge, MA; Certification No. 1031219-2).
Study population
The study enrolled patients who were diagnosed with sepsis upon admission to the intensive care unit (ICU). In accordance with the Sepsis-3 criteria, sepsis was defined as the presence of suspected or documented infection combined with an acute increase in the Sequential Organ Failure Assessment (SOFA) score of ≥2 points from baseline [19]. The identification of infections was performed via the International Classification of Diseases, Ninth Revision (ICD-9) codes, and the physiological parameters necessary for SOFA score calculation were extracted from the Acute Physiology and Chronic Health Evaluation (APACHE) IV dataset [20]. After excluding duplicate admissions, ICU stays less than 24 hours, patients younger than 45 years, and those with missing ICU outcome data, 13,717 patients were finally included and chronologically divided into training (2014, n = 6,397) and validation (2015, n = 7,320) cohorts.
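The eligibility filter and chronological split described above can be sketched in a few lines. This is an illustrative outline only, not the authors' extraction code: the `Stay` record and its field names are hypothetical, and keeping the first eligible stay per patient is an assumed reading of "excluding duplicate admissions".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stay:
    """One ICU stay; all field names are hypothetical, not eICU-CRD columns."""
    patient_id: str
    age: int
    icu_hours: float
    discharge_year: int
    outcome_known: bool


def is_eligible(stay: Stay) -> bool:
    # Exclusions from the text: age < 45 years, ICU stay < 24 h, missing outcome.
    return stay.age >= 45 and stay.icu_hours >= 24 and stay.outcome_known


def temporal_split(stays):
    """Return (training, validation) lists split by calendar year."""
    seen = set()
    training, validation = [], []
    for stay in stays:
        # Assumption: keep only the first eligible stay per patient.
        if stay.patient_id in seen or not is_eligible(stay):
            continue
        seen.add(stay.patient_id)
        if stay.discharge_year == 2014:
            training.append(stay)
        elif stay.discharge_year == 2015:
            validation.append(stay)
    return training, validation


demo = [
    Stay("a", 67, 50.0, 2014, True),
    Stay("a", 67, 30.0, 2015, True),   # duplicate admission, dropped
    Stay("b", 40, 72.0, 2014, True),   # younger than 45, dropped
    Stay("c", 81, 12.0, 2015, True),   # ICU stay under 24 h, dropped
    Stay("d", 59, 96.0, 2015, True),
]
train, valid = temporal_split(demo)
```

Splitting on calendar year rather than at random means the validation cohort is never seen during model development, which is what makes the validation temporal.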
Data collection
Clinical data were collected from various tables within the eICU-CRD during the first 24 hours of ICU admission. Demographic characteristics (age, gender, ethnicity, body mass index [BMI]) and hospital admission data were extracted from the patient and apachePatientResult tables. The ApacheApsVar table provides vital signs (heart rate, respiratory rate, temperature, mean arterial pressure [MAP], and oxygen saturation), treatment-related variables (mechanical ventilation, dialysis, and vasopressor use), and severity scores (Glasgow Coma Scale [GCS], Sequential Organ Failure Assessment [SOFA], Acute Physiology and Chronic Health Evaluation IV [APACHE IV], and Acute Physiology Score III). Laboratory data, including blood gas parameters, complete blood count, blood chemistry, liver function tests, and coagulation profiles, were obtained from the laboratory table. Medical history was identified from the diagnosis table, while the site of infection was extracted from the AdmissionDx table.
Outcome and sample size
Death within 28 days after ICU admission was considered the primary outcome event. In accordance with a previous study [21], the sample size was determined on the basis of the rule of at least 10 events per variable (EPV). A total of 13,717 patients were enrolled, with 581 deaths in the training cohort (n = 6,397) and 695 deaths in the validation cohort (n = 7,320). With 11 variables in the final model and 52.8 events per variable in the training cohort, our sample size substantially exceeded the minimum requirement determined by the EPV approach.
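The arithmetic behind these figures can be checked directly. The short sketch below uses only the counts reported in this section; the function name is illustrative.

```python
def events_per_variable(n_events: int, n_predictors: int) -> float:
    """Events per variable (EPV): outcome events divided by model predictors."""
    if n_predictors <= 0:
        raise ValueError("n_predictors must be positive")
    return n_events / n_predictors


# Counts reported in the text.
train_deaths, train_n = 581, 6397
valid_deaths, valid_n = 695, 7320
n_predictors = 11

train_epv = events_per_variable(train_deaths, n_predictors)
train_mortality_pct = 100 * train_deaths / train_n
valid_mortality_pct = 100 * valid_deaths / valid_n

print(f"EPV (training): {train_epv:.1f}")
print(f"28-day ICU mortality: {train_mortality_pct:.2f}% vs {valid_mortality_pct:.2f}%")
```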
Statistical analysis
The baseline demographic and clinical characteristics of all participants at admission are presented as the means (standard deviations) or medians (interquartile ranges) for continuous variables and as frequencies (percentages) for categorical variables, stratified by training and validation cohorts. Differences between groups were analyzed via the χ² test for categorical variables and one-way ANOVA or the Kruskal-Wallis test for normally and nonnormally distributed continuous variables, respectively.
From the univariate analysis, we identified variables significantly associated with 28-day ICU mortality (p < 0.05), removed those with high multicollinearity (VIF > 5), and further refined our selection through random forest importance ranking and LASSO regression with 10-fold cross-validation, ultimately identifying 11 key predictors for the final model (detailed in S1–S3 Tables and S1 Fig).
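As a brief illustration of the multicollinearity screen, the variance inflation factor of a predictor is VIF = 1 / (1 − R²), where R² comes from regressing that predictor on all the others, so the cutoff of 5 corresponds to R² = 0.8. The sketch below is not the authors' code. It only applies that formula and the cutoff to the VIF values listed in the S2 Table legend, and the function names are illustrative.

```python
def vif_from_r_squared(r_squared: float) -> float:
    """VIF of a predictor, given R^2 from regressing it on the other predictors."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("r_squared must be in [0, 1)")
    return 1.0 / (1.0 - r_squared)


def exceeds_cutoff(vif: float, cutoff: float = 5.0) -> bool:
    # The text removes variables with VIF > 5.
    return vif > cutoff


# VIF values of the eliminated variables, as listed in the S2 Table legend.
reported_vif = {
    "Bicarbonate": 17.0,
    "Base Excess": 18.2,
    "ALT": 9.9,
    "PT": 66.2,
    "APACHE IV score": 69.6,
    "Acute Physiology Score III": 73.0,
}
removed = [name for name, vif in reported_vif.items() if exceeds_cutoff(vif)]
```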
Owing to missing data for predictor variables, the final complete case analysis included 420 patients in the training cohort and 491 patients in the validation cohort. To assess potential selection bias, we compared baseline characteristics between patients with complete data (n = 911) and those with missing data (n = 12,806).
The final predictive model was constructed via multivariable logistic regression, incorporating the variables selected through both random forest importance ranking and LASSO regression. Model performance was assessed through the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). Calibration was evaluated via calibration curves, and clinical utility was assessed via decision curve analysis (DCA). The final model was visualized as a nomogram.
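For readers less familiar with these measures, the following minimal sketch computes them from predicted probabilities using only the Python standard library. It is not the authors' analysis code (which used EmpowerStats and R). The toy data and the 0.5 threshold are assumptions for illustration, and the pairwise AUC shown here is the simple quadratic-time form.

```python
import math


def auc(y_true, y_score):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # Ties between a positive and a negative score count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def _ratio(num, den):
    return num / den if den else math.nan


def classification_metrics(y_true, y_score, threshold):
    """Sensitivity, specificity, PPV and NPV at a given probability threshold."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, y_score):
        predicted_death = s >= threshold
        if predicted_death and y == 1:
            tp += 1
        elif predicted_death:
            fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
    }


# Toy data (not study data): 1 = died within 28 days, 0 = survived.
y = [0, 0, 1, 1]
p = [0.10, 0.40, 0.35, 0.80]
toy_auc = auc(y, p)
toy_metrics = classification_metrics(y, p, threshold=0.5)
```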
All the statistical analyses were two-tailed, with P < 0.05 considered statistically significant. Analyses were performed via EmpowerStats (www.empowerstats.com, X&Y Solutions, Inc., Boston, MA) and R software version 4.2.0 (http://www.r-project.org).
Results
The patient selection process is illustrated in Fig 1. From the eICU database (2014–2015), 23,136 patients with a diagnosis of sepsis on ICU admission were initially identified. After several exclusion steps, 13,717 patients were included in the study and divided into training and validation cohorts. After patients with missing predictor variables were excluded, 911 patients remained for the final analysis, comprising 420 patients in the training cohort (2014) and 491 patients in the validation cohort (2015). The 28-day ICU mortality rates were 9.08% and 9.49% in the training and validation cohorts, respectively.
Fig 1 abbreviation: ICU, intensive care unit.
To assess potential selection bias, we compared baseline characteristics between patients with complete data (n = 911) and those with missing data (n = 12,806). Patients with complete data demonstrated greater illness severity (SOFA score: 6.3 ± 3.4 vs 4.2 ± 2.8, P < 0.001), whereas other baseline characteristics were largely comparable between the groups (S4 Table).
The baseline characteristics of both cohorts are presented in S5 Table. The training and validation cohorts presented similar demographic and clinical characteristics, with comparable disease severity, as measured by the SOFA and APACHE IV scores. Only minor differences were observed in respiratory parameters, lactate levels, and certain comorbidities between the cohorts.
The comparisons between survivors and nonsurvivors are presented in Table 1. In both cohorts, nonsurvivors were characterized by advanced age and lower BMI, while no significant differences were observed in sex distribution or ethnicity. Nonsurvivors were more frequently admitted through nonemergency departments. With respect to vital signs, nonsurvivors presented significantly higher heart and respiratory rates, accompanied by lower temperature and mean arterial pressure. The laboratory parameters revealed that nonsurvivors had more severe metabolic disturbances, enhanced inflammatory responses, compromised coagulation profiles, and impaired organ function. Disease severity scores (GCS, SOFA, APACHE IV, and APS III) were all significantly worse in nonsurvivors. With respect to comorbidities, nonsurvivors had higher incidences of congestive heart failure, acute myocardial infarction, and pneumonia but lower rates of diabetes. Furthermore, nonsurvivors required more intensive therapeutic interventions, including mechanical ventilation and vasopressor support.
Our model demonstrated slightly better discriminatory power (AUC: 0.813, 95% CI: 0.750–0.877) than the previously published SMRS scoring system [14] (AUC: 0.789, 95% CI: 0.720–0.859) for predicting sepsis mortality in our study population (Fig 2).
Fig 2 abbreviation: SMRS, Sepsis Mortality Risk Score.
Our predictive model showed good discrimination in both the training cohort (AUC: 0.805) and the validation cohort (AUC: 0.756), with consistent sensitivity, specificity, and predictive values across cohorts (Table 2, Fig 3). The calibration curves confirmed good agreement between the predicted and observed mortality probabilities (Fig 4). Specifically, the development cohort showed good calibration for predicted probabilities less than 0.6, whereas the validation cohort exhibited better overall calibration performance. Decision curve analysis (DCA) revealed that the model provided greater net benefit than either the “treat all” or “treat none” strategies across threshold probabilities ranging from 0.1 to 0.8 in the development cohort and from 0.1 to 0.9 in the validation cohort (S2 Fig).
Fig 3. (A) Development cohort (AUC = 0.805). (B) Validation cohort (AUC = 0.756).
Fig 4. (A) Development cohort. (B) Validation cohort. The red diagonal line represents perfect calibration, the black curve represents the actual calibration, and the yellow shading indicates the 95% confidence interval.
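The net-benefit comparison that underlies decision curve analysis can be sketched as follows. At a threshold probability pt, the net benefit of a model is TP/n − (FP/n) × pt/(1 − pt). The “treat all” strategy classifies every patient as high risk, and “treat none” has a net benefit of zero. This is a minimal illustration with made-up data, not the authors' implementation, and the function names are illustrative.

```python
def net_benefit(y_true, y_prob, pt):
    """Net benefit of acting on patients whose predicted risk is >= pt."""
    if not 0.0 < pt < 1.0:
        raise ValueError("threshold probability must be in (0, 1)")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= pt and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= pt and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * pt / (1.0 - pt)


def net_benefit_treat_all(y_true, pt):
    """Net benefit when every patient is treated as high risk."""
    if not 0.0 < pt < 1.0:
        raise ValueError("threshold probability must be in (0, 1)")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * pt / (1.0 - pt)


# Toy data (not study data): 2 deaths among 10 patients.
y = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
p = [0.90, 0.20, 0.10, 0.05, 0.60, 0.30, 0.10, 0.10, 0.05, 0.40]

nb_model = net_benefit(y, p, pt=0.5)
nb_all = net_benefit_treat_all(y, pt=0.5)
nb_none = 0.0
```

A decision curve repeats this calculation over a grid of thresholds and plots the three strategies against each other.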
To facilitate clinical application, we developed a nomogram with 11 independent predictors (Fig 5): RDW, SOFA score, lactate, pH, 24-hour urine output, platelet count, total protein, temperature, heart rate, GCS score, and WBC count. For example, a patient with an RDW of 18% (35 points), a SOFA score of 11 (42 points), a lactate level of 8 mmol/L (28 points), a pH of 7.3 (22 points), a 24-h urine output of 1000 mL (58 points), a PLT (platelet) count of 150 × 10⁹/L (32 points), a total protein level of 5 g/dL (25 points), a temperature of 38.5°C (18 points), a heart rate of 120 bpm (31 points), a GCS score of 11 (20 points), and a WBC count of 20 × 10⁹/L (15 points) would total 326 points, corresponding to a predicted 28-day ICU mortality risk of 70%.
To use this nomogram, first locate the patient’s value for each variable (RDW, SOFA score, lactate, pH, urine output in the first 24 h, PLT count, total protein, temperature, heart rate, GCS score, and WBC count) on the corresponding axis. Draw a vertical line up to the “Points” axis to determine the points assigned for each variable. Sum all points to obtain the “Total Points”. Then draw a vertical line from the “Total Points” axis down through the “Linear Predictor” line to the “ICU 28-day mortality” line to determine the predicted probability of 28-day ICU mortality.
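The point-summing step can be reproduced with the values quoted in the worked example. The sketch below adds those points and shows the standard logistic link between a linear predictor and a probability. The mapping from total points to the linear predictor depends on the fitted coefficients, which are not given in the text, so it is deliberately left out. Variable and function names are illustrative.

```python
import math

# Points read off the nomogram for the worked example in the text.
example_points = {
    "RDW 18%": 35,
    "SOFA score 11": 42,
    "Lactate 8 mmol/L": 28,
    "pH 7.3": 22,
    "24-h urine output 1000 mL": 58,
    "Platelet count 150 x 10^9/L": 32,
    "Total protein 5 g/dL": 25,
    "Temperature 38.5 C": 18,
    "Heart rate 120 bpm": 31,
    "GCS score 11": 20,
    "WBC count 20 x 10^9/L": 15,
}
total_points = sum(example_points.values())


def logistic(linear_predictor: float) -> float:
    """Convert a logistic-regression linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def logit(probability: float) -> float:
    """Inverse of logistic(): the log-odds of a probability."""
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must be in (0, 1)")
    return math.log(probability / (1.0 - probability))


# Log-odds that correspond to the 70% risk quoted for the example patient.
lp_at_70_percent = logit(0.70)
print(total_points, round(lp_at_70_percent, 3))
```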
Discussion
In this study, we analyzed 13,717 middle-aged and elderly sepsis patients from the eICU database (2014–2015). To address the limitations of existing prediction tools, we first evaluated the performance of previously published scoring systems in our cohort. Our analysis revealed that the current SMRS model had a moderate predictive ability (AUC: 0.789) for 28-day ICU mortality, suggesting room for improvement in risk stratification for this specific population.
Our research introduces innovation to the field of sepsis prognostic assessment through the implementation of a nomogram as a visual tool. Compared with traditional prediction methods, nomograms intuitively present the weights of various risk factors, enabling clinicians to more clearly understand each factor’s influence on prognosis. We developed and validated a novel nomogram incorporating 11 clinical predictors that outperforms existing scoring systems, representing the first temporally validated risk stratification tool specifically designed for middle-aged and elderly sepsis patients and addressing a gap in existing assessment approaches regarding visual presentation and clinical practicality.
Recent studies have explored various machine learning approaches for sepsis mortality prediction. Zhang et al. [12] achieved remarkable performance when XGBoost (AUC: 0.94) was used to incorporate inflammatory biomarkers, whereas Bao et al. [16] reported excellent results when LightGBM was used (AUC: 0.99 training/0.96 testing). Our relatively lower AUC may reflect our focus on interpretability and clinical practicality rather than pure predictive performance. The use of readily available clinical predictors makes our model more accessible for routine clinical implementation, particularly in resource-limited settings.
Notably, our findings align with those of previous studies regarding key mortality predictors. The importance of APACHE scores, lactate levels, and organ dysfunction markers has been consistently demonstrated across studies. Yang et al. [11] identified similar risk factors in their CPMORS model (AUC: 0.858 internal/0.800 external), although they emphasized the value of uncertainty quantification through conformal prediction. Zhang et al. [12] further validated these findings via XGBoost (AUC: 0.94), identifying age, AST, invasive ventilation and BUN as the strongest predictors. Our nomogram approach offers comparable performance while maintaining transparency and ease of use.
Our study identified 11 key predictors of sepsis mortality, with RDW, the SOFA score, and the lactate level emerging as the most clinically significant. These three variables reflect critical aspects of sepsis pathophysiology, including the inflammatory status, organ dysfunction, and tissue hypoperfusion.
These findings align with those of previous studies, further substantiating the selection of variables employed in our model and emphasizing the importance of multisystem evaluation in the prognosis of sepsis. Yang et al. [11] demonstrated that RDW and SOFA scores were significant predictors in their machine learning model. Specifically, elevated RDWs reflect increased oxidative stress and inflammatory cytokine production under chronic inflammatory conditions. These factors interfere with the intracellular metabolism of iron and vitamin B12, ultimately impairing erythropoiesis and resulting in increased RDWs [22]. Compared with existing sepsis prediction tools, our model specifically incorporates the RDW as a key predictor, which is relatively uncommon in traditional scoring systems. Despite the confirmed prognostic value of RDW, it remains underutilized in routine clinical sepsis scoring tools. By integrating RDW into our nomogram, we have expanded the biological foundation of prediction and provided an assessment method that better captures the effects of inflammatory status and oxidative stress. Bao et al. [16] confirmed that GCS scores and lactate levels are crucial prognostic indicators. Lower GCS scores suggest neurological dysfunction [23] and potential cerebral hypoperfusion [24], whereas elevated lactate levels indicate tissue hypoxia at the cellular level and the activation of anaerobic metabolism [25]. Similarly, Zhang et al. [12] identified platelet count and total protein levels as independent factors associated with mortality risk. Thrombocytopenia is often indicative of disseminated intravascular coagulation (DIC) [26] and endothelial dysfunction [27], whereas decreased total protein levels suggest capillary leak syndrome [28] and impaired protein synthesis due to organ dysfunction. 
Collectively, these observations reinforce the validity of the variables included in our model and highlight the necessity for comprehensive, multisystem evaluation in predicting outcomes in sepsis patients.
Other predictors in our model, while showing relatively lower importance, still contribute significantly to mortality prediction. The 24-hour urine output, as noted by Yang et al. [11], reflects both renal function and tissue perfusion. Oliguria may indicate acute kidney injury, decreased cardiac output, or increased vascular permeability, which are characteristic of septic shock [29–31]. The temperature and WBC count, although traditional markers, remain valuable in assessing the systemic inflammatory response. Fever represents enhanced immune activation and increased metabolic demands [32], whereas abnormal white blood cell counts reflect the severity of the inflammatory response and immune status. Heart rate and pH provide important information about cardiovascular status and metabolic derangement, which Shen et al. [10] identified as critical components in their prediction model. Specifically, tachycardia reflects compensatory mechanisms to maintain cardiac output and the direct effects of inflammatory mediators on the myocardium [33,34]. Acidosis indicates severe cellular dysfunction and may further compromise cardiovascular function through reduced myocardial contractility [35] and impaired myocardial energy metabolism [36]. Although these parameters may individually appear less prominent, they collectively enhance the model’s ability to capture the complex pathophysiological alterations in sepsis.
Our nomogram provides clinicians with an intuitive tool to assess 28-day ICU mortality risk in middle-aged and elderly sepsis patients, potentially facilitating early identification of high-risk individuals and optimizing resource allocation in ICU settings.
Strengths and innovation
Our study has several key strengths: (1) a temporal validation design using separate years (2014 for development, 2015 for validation) to assess model stability; (2) integration of readily available clinical parameters, making it practical for routine use; (3) development of an intuitive visual nomogram that balances predictive accuracy with clinical interpretability, allowing clinicians to directly quantify the contribution of each risk factor without complex calculations, which enhances bedside application efficiency and makes the tool suitable for various healthcare settings, including resource-limited environments, as a complement to existing scoring systems; (4) comprehensive performance evaluation, including discrimination, calibration, and decision curve analysis; and (5) a specific focus on middle-aged and elderly sepsis patients, a demographic group with unique physiological challenges and increased mortality risks that may not be adequately captured by existing general tools. By targeting this population, our model is intended to assess age-related risk factors more accurately and to provide clinicians with a more tailored decision support tool.
Limitations and future directions
Despite the use of real-world clinical data, which enhances the authenticity of our findings, several limitations should be noted in this study. First, the temporal validation period (2014–2015) is relatively short due to database constraints, warranting further validation across longer time spans. Second, substantial amounts of data were missing, with only 911 of the 13,717 patients having complete predictor variables. To address this limitation and assess potential bias, we compared baseline characteristics between patients with and without complete data. Although patients with complete data had greater illness severity (SOFA score: 6.3 ± 3.4 vs 4.2 ± 2.8, P < 0.001), most other clinical characteristics remained comparable between the groups. This comparison suggests that our findings may be more applicable to critically ill sepsis patients, and caution should be exercised when these results are generalized to patients with milder disease severity. Third, external validation in different geographical populations is needed to confirm generalizability. Fourth, the retrospective nature of the study limits causal inference. Additionally, dynamic prediction incorporating temporal changes in clinical parameters could improve accuracy. Future studies should focus on prospective validation, integration with electronic health records, and evaluation of the tool’s impact on clinical outcomes and decision-making processes.
Conclusions
In conclusion, we developed and temporally validated a practical nomogram incorporating 11 readily available clinical parameters to predict 28-day ICU mortality in middle-aged and elderly sepsis patients. The model demonstrated good discriminative ability in both the training (AUC: 0.805) and validation (AUC: 0.756) cohorts, with consistent calibration and favorable decision curve analysis results. This nomogram provides clinicians with an intuitive tool for rapid risk stratification, potentially facilitating early identification of high-risk patients and supporting clinical decision-making. While promising, further external validation in different populations and evaluation of the tool’s clinical utility in real-world settings are warranted.
Supporting information
S1 Table. Univariate logistic regression analysis of 28-day ICU mortality in the training cohort.
Data are OR (95% CI) and P value. BMI: Body mass index; MAP: Mean arterial pressure; O2 Sat: Oxygen saturation; PaO2: Partial pressure of arterial oxygen; PaCO2: Partial pressure of arterial carbon dioxide; FiO2: Fraction of inspired oxygen; WBC: White blood cell; RDW: Red cell distribution width; MCHC: Mean corpuscular hemoglobin concentration; BUN: Blood urea nitrogen; ALT: Alanine aminotransferase; AST: Aspartate aminotransferase; PT: Prothrombin time; APTT: Activated partial thromboplastin time; INR: International normalized ratio; GCS: Glasgow coma scale; SOFA: Sequential organ failure assessment; APACHE: Acute physiology and chronic health evaluation; COPD: Chronic obstructive pulmonary disease; CHF: Congestive heart failure; AMI: Acute myocardial infarction; DM: Diabetes mellitus.
https://doi.org/10.1371/journal.pone.0328701.s001
(DOCX)
S2 Table. Multiple collinearity screening of variables in the training cohort.
Data are variance inflation factor (VIF) values. NA indicates the variable was removed from the model at that step. Variables eliminated (VIF > 5): Bicarbonate (VIF = 17.0), Base Excess (VIF = 18.2), ALT (VIF = 9.9), PT (VIF = 66.2), APACHE IV score (VIF = 69.6), and Acute Physiology Score III (VIF = 73.0). BMI: Body mass index; GCS: Glasgow coma scale; PaCO2: Partial pressure of arterial carbon dioxide; FiO2: Fraction of inspired oxygen; WBC: White blood cell; RDW: Red cell distribution width; BUN: Blood urea nitrogen; ALT: Alanine aminotransferase; AST: Aspartate aminotransferase; PT: Prothrombin time; APTT: Activated partial thromboplastin time; INR: International normalized ratio; SOFA: Sequential organ failure assessment; APACHE: Acute physiology and chronic health evaluation; CHF: Congestive heart failure; AMI: Acute myocardial infarction; DM: Diabetes mellitus.
https://doi.org/10.1371/journal.pone.0328701.s002
(DOCX)
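The stepwise screening described for S2 Table (drop a variable when its VIF exceeds 5, then recompute) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `vif` and `drop_collinear` are hypothetical, the data are synthetic, and the rule of removing the single largest VIF at each step is an assumption about how the elimination order was chosen.

```python
import numpy as np

def vif(X):
    """Return the variance inflation factor of each column of X."""
    corr = np.corrcoef(X, rowvar=False)
    # The j-th diagonal entry of the inverse correlation matrix
    # equals 1 / (1 - R_j^2), which is the VIF of variable j.
    return np.diag(np.linalg.inv(corr))

def drop_collinear(X, names, threshold=5.0):
    """Repeatedly drop the variable with the largest VIF above threshold."""
    X = np.asarray(X, dtype=float)
    names = list(names)
    removed = []  # (variable name, VIF at the step it was removed)
    while X.shape[1] > 1:
        values = vif(X)
        worst = int(np.argmax(values))
        if values[worst] <= threshold:
            break
        removed.append((names[worst], float(values[worst])))
        X = np.delete(X, worst, axis=1)
        names.pop(worst)
    return names, removed

# Synthetic demonstration (not study data): "c" is almost exactly a + b.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = rng.normal(size=500)
c = a + b + rng.normal(scale=0.05, size=500)
d = rng.normal(size=500)
kept, removed = drop_collinear(np.column_stack([a, b, c, d]), ["a", "b", "c", "d"])
print("kept:", kept)
print("removed:", removed)
```

Recomputing the VIFs after each removal matters because dropping one member of a collinear group usually lowers the VIFs of the others, so fewer variables are discarded than a single-pass cut-off would remove.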
S3 Table. Variable importance for 28-day ICU mortality prediction by random forest in the training cohort.
Random forest parameters: n_trees = 500, split_features = 6, total_features = 34, sampling = swor (without replacement), resample_size = 4043, splitting_rule = gini random, random_split_points = 10, class_imbalance_ratio = 10.01. The table shows all important variables ranked by total importance score in the random forest model. Total importance represents the overall predictive power of each variable. Positive impact indicates the variable’s contribution to predicting mortality when its value increases, while negative impact indicates its contribution to predicting survival. Analysis was performed using the training cohort (patients discharged in 2014, n = 6,397). SOFA: Sequential organ failure assessment; WBC: White blood cell; RDW: Red cell distribution width; PaCO2: Partial pressure of arterial carbon dioxide; GCS: Glasgow coma scale; FiO2: Fraction of inspired oxygen; BUN: Blood urea nitrogen; INR: International normalized ratio; AST: Aspartate aminotransferase; APTT: Activated partial thromboplastin time; BMI: Body mass index; AMI: Acute myocardial infarction; CHF: Congestive heart failure; DM: Diabetes mellitus.
https://doi.org/10.1371/journal.pone.0328701.s003
(DOCX)
S1 Fig. (A) Tenfold cross-validation plot showing the binomial deviance against log(lambda).
(B) Coefficient path plot showing variable selection with log(lambda) values.
https://doi.org/10.1371/journal.pone.0328701.s004
(PNG)
S4 Table. Comparison of baseline characteristics between patients with complete data and those with missing data.
Data are expressed as the mean±SD, median (interquartile range), or percentage. BMI: Body mass index; MAP: Mean arterial pressure; O2 Sat: Oxygen saturation; PaO2: Partial pressure of arterial oxygen; PaCO2: Partial pressure of arterial carbon dioxide; FiO2: Fraction of inspired oxygen; WBC: White blood cell; RDW: Red cell distribution width; MCHC: Mean corpuscular hemoglobin concentration; BUN: Blood urea nitrogen; ALT: Alanine aminotransferase; AST: Aspartate aminotransferase; PT: Prothrombin time; APTT: Activated partial thromboplastin time; INR: International normalized ratio; GCS: Glasgow coma scale; SOFA: Sequential organ failure assessment; APACHE: Acute physiology and chronic health evaluation; COPD: Chronic obstructive pulmonary disease; CHF: Congestive heart failure; AMI: Acute myocardial infarction; DM: Diabetes mellitus.
https://doi.org/10.1371/journal.pone.0328701.s005
(DOCX)
S5 Table. Baseline clinical and laboratory characteristics of the study population in the training and validation cohorts.
Data are expressed as the mean±SD, median (interquartile range), or percentage. BMI: Body mass index; MAP: Mean arterial pressure; O2 Sat: Oxygen saturation; PaO2: Partial pressure of arterial oxygen; PaCO2: Partial pressure of arterial carbon dioxide; FiO2: Fraction of inspired oxygen; WBC: White blood cell; RDW: Red cell distribution width; MCHC: Mean corpuscular hemoglobin concentration; BUN: Blood urea nitrogen; ALT: Alanine aminotransferase; AST: Aspartate aminotransferase; PT: Prothrombin time; APTT: Activated partial thromboplastin time; INR: International normalized ratio; GCS: Glasgow coma scale; SOFA: Sequential organ failure assessment; APACHE: Acute physiology and chronic health evaluation; COPD: Chronic obstructive pulmonary disease; CHF: Congestive heart failure; AMI: Acute myocardial infarction; DM: Diabetes mellitus.
https://doi.org/10.1371/journal.pone.0328701.s006
(DOCX)
S2 Fig. Decision curve analysis for predicting 28-day ICU mortality in middle-aged and elderly patients with sepsis.
(A) Development cohort. (B) Validation cohort. The grey line represents the net benefit of treating all patients, the black line represents treating no patients, and the red line represents the net benefit of the prediction model at different threshold probabilities. ICU, Intensive Care Unit; DCA, Decision Curve Analysis.
https://doi.org/10.1371/journal.pone.0328701.s007
(PNG)
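The three curves in S2 Fig compare net benefit at each threshold probability. As a minimal sketch of the standard decision curve analysis formula, net benefit = TP/n − (FP/n) × pt/(1 − pt), the following example computes the model and treat-all values at one threshold. The function names and the ten toy patients are hypothetical and are not study data; the treat-none strategy has a net benefit of 0 by definition.

```python
import numpy as np

def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    y_true = np.asarray(y_true)
    y_prob = np.asarray(y_prob)
    n = y_true.size
    treated = y_prob >= threshold
    tp = np.sum(treated & (y_true == 1))  # treated patients who died
    fp = np.sum(treated & (y_true == 0))  # treated patients who survived
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - fp / n * threshold / (1.0 - threshold)

def net_benefit_treat_all(y_true, threshold):
    """Net benefit of treating every patient regardless of predicted risk."""
    prevalence = np.mean(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)

# Toy data (hypothetical): 10 patients, 2 deaths.
y_true = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
y_prob = [0.8, 0.1, 0.3, 0.6, 0.05, 0.2, 0.1, 0.4, 0.1, 0.05]

nb_model = net_benefit(y_true, y_prob, threshold=0.25)
nb_all = net_benefit_treat_all(y_true, threshold=0.25)
print(f"model: {nb_model:.4f}, treat all: {nb_all:.4f}, treat none: 0.0000")
```

Evaluating these functions over a grid of thresholds and plotting the results against the threshold probability gives curves of the kind shown in the figure; a model is useful at a given threshold when its net benefit exceeds both reference strategies.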
Acknowledgments
We sincerely thank the eICU-CRD for providing valuable data that significantly contributed to our study.
References
- 1. Caraballo C, Jaimes F. Organ dysfunction in sepsis: an ominous trajectory from infection to death. Yale J Biol Med. 2019;92(4):629–40. pmid:31866778
- 2. Castellheim A, Brekke O-L, Espevik T, Harboe M, Mollnes TE. Innate immune responses to danger signals in systemic inflammatory response syndrome and sepsis. Scand J Immunol. 2009;69(6):479–91. pmid:19439008
- 3. Fleischmann-Struzek C, Mellhammar L, Rose N, Cassini A, Rudd KE, Schlattmann P, et al. Incidence and mortality of hospital- and ICU-treated sepsis: results from an updated and expanded systematic review and meta-analysis. Intensive Care Med. 2020;46(8):1552–62. pmid:32572531
- 4. Wang M, Jiang L, Zhu B, Li W, Du B, Kang Y, et al. The prevalence, risk factors, and outcomes of sepsis in critically Ill patients in China: a multicenter prospective cohort study. Front Med (Lausanne). 2020;7:593808. pmid:33392219
- 5. Bruno RR, Wernly B, Mamandipoor B, Rezar R, Binnebössel S, Baldia PH, et al. ICU-mortality in old and very old patients suffering from sepsis and septic shock. Front Med (Lausanne). 2021;8:697884. pmid:34307423
- 6. Martin-Loeches I, Guia MC, Vallecoccia MS, Suarez D, Ibarz M, Irazabal M, et al. Risk factors for mortality in elderly and very elderly critically ill patients with sepsis: a prospective, observational, multicenter cohort study. Ann Intensive Care. 2019;9(1):26. pmid:30715638
- 7. Liu X, Niu H, Peng J. Improving predictions: enhancing in-hospital mortality forecast for ICU patients with sepsis-induced coagulopathy using a stacking ensemble model. Medicine (Baltimore). 2024;103(14):e37634. pmid:38579092
- 8. Wang Z-Y, Lan Y-S, Xu Z, Gu Y-W, Li J. Comparison of mortality predictive models of sepsis patients based on machine learning. Chin Med Sci J. 2022;37(3):201–9. pmid:36321175
- 9. Gao J, Lu Y, Ashrafi N, Domingo I, Alaei K, Pishgar M. Prediction of sepsis mortality in ICU patients using machine learning methods. BMC Med Inform Decis Mak. 2024;24(1):228. pmid:39152423
- 10. Shen L, Wu J, Lan J, Chen C, Wang Y, Li Z. Interpretable machine learning-based prediction of 28-day mortality in ICU patients with sepsis: a multicenter retrospective study. Front Cell Infect Microbiol. 2025;14:1500326. pmid:39844844
- 11. Yang M, Chen H, Hu W, Mischi M, Shan C, Li J, et al. Development and validation of an interpretable conformal predictor to predict sepsis mortality risk: retrospective cohort study. J Med Internet Res. 2024;26:e50369. pmid:38498038
- 12. Zhang G, Shao F, Yuan W, Wu J, Qi X, Gao J, et al. Predicting sepsis in-hospital mortality with machine learning: a multi-center study using clinical and inflammatory biomarkers. Eur J Med Res. 2024;29(1):156. pmid:38448999
- 13. Cooray SD, De Silva K, Enticott JC, Dawadi S, Boyle JA, Soldatos G, et al. Temporal validation and updating of a prediction model for the diagnosis of gestational diabetes mellitus. J Clin Epidemiol. 2023;164:54–64. pmid:37659584
- 14. Zhang K, Zhang S, Cui W, Hong Y, Zhang G, Zhang Z. Development and validation of a sepsis mortality risk score for sepsis-3 patients in intensive care unit. Front Med (Lausanne). 2021;7:609769. pmid:33553206
- 15. Zeng Z, Yao S, Zheng J, Gong X. Development and validation of a novel blending machine learning model for hospital mortality prediction in ICU patients with Sepsis. BioData Min. 2021;14(1):40. pmid:34399809
- 16. Bao C, Deng F, Zhao S. Machine-learning models for prediction of sepsis patients mortality. Med Intensiva (Engl Ed). 2023;47(6):315–25. pmid:36344339
- 17. Xu X, Sun X, Ma L, Zhang H, Ji W, Xia X, et al. 18F-FDG PET/CT radiomics signature and clinical parameters predict progression-free survival in breast cancer patients: a preliminary study. Front Oncol. 2023;13:1149791. pmid:36969043
- 18. Hwangbo S, Kim Y, Lee C, Lee S, Oh B, Moon MK, et al. Machine learning models to predict the maximum severity of COVID-19 based on initial hospitalization record. Front Public Health. 2022;10:1007205. pmid:36518574
- 19. Singer M, Deutschman CS, Seymour CW, Shankar-Hari M, Annane D, Bauer M, et al. The Third International Consensus Definitions for sepsis and septic shock (Sepsis-3). JAMA. 2016;315(8):801–10. pmid:26903338
- 20. Zimmerman JE, Kramer AA, McNair DS, Malila FM. Acute Physiology and Chronic Health Evaluation (APACHE) IV: hospital mortality assessment for today’s critically ill patients. Crit Care Med. 2006;34(5):1297–310. pmid:16540951
- 21. Austin PC, Steyerberg EW. Events per variable (EPV) and the relative performance of different strategies for estimating the out-of-sample validity of logistic regression models. Stat Methods Med Res. 2017;26(2):796–808. pmid:25411322
- 22. Aslam H, Oza F, Ahmed K, Kopel J, Aloysius MM, Ali A, et al. The role of red cell distribution width as a prognostic marker in chronic liver disease: a literature review. Int J Mol Sci. 2023;24(4):3487. pmid:36834895
- 23. Chen L, Lu L, Fang Y, Ren J, Yang X, Gong Z, et al. Glasgow Coma Scale on admission as predictor of neurological sequelae at discharge and acute respiratory failure in patients with heatstroke. Postgrad Med J. 2023;99(1178):1237–45. pmid:37650372
- 24. Salem MS, Abosabaa MA, Abd El Ghafar MS, Ei-Gendy HME-DM, Alsherif SE-DI. Norepinephrine titration in patients with sepsis-induced encephalopathy: cerebral pulsatility index compared to mean arterial pressure guided protocol: randomized controlled trial. BMC Anesthesiol. 2025;25(1):5. pmid:39755598
- 25. Garcia-Alvarez M, Marik P, Bellomo R. Stress hyperlactataemia: present understanding and controversy. Lancet Diabetes Endocrinol. 2014;2(4):339–47. pmid:24703052
- 26. Levi M, Sivapalaratnam S. Disseminated intravascular coagulation: an update on pathogenesis and diagnosis. Expert Rev Hematol. 2018;11(8):663–72. pmid:29999440
- 27. Chen A-T, Wang C-Y, Zhu W-L, Chen W. Coagulation disorders and thrombosis in COVID-19 patients and a possible mechanism involving endothelial cells: a review. Aging Dis. 2022;13(1):144–56. pmid:35111367
- 28. Moreira DC, Ng CJ, Quinones R, Liang X, Chung DW, Di Paola J. Microangiopathic hemolytic anemia due to ADAMTS-13 loss in idiopathic systemic capillary leak syndrome. J Thromb Haemost. 2016;14(12):2353–5. pmid:27622772
- 29. Legrand M, Dupuis C, Simon C, Gayat E, Mateo J, Lukaszewicz A-C, et al. Association between systemic hemodynamics and septic acute kidney injury in critically ill patients: a retrospective observational study. Crit Care. 2013;17(6):R278. pmid:24289206
- 30. Fisher J, Douglas JJ, Linder A, Boyd JH, Walley KR, Russell JA. Elevated Plasma Angiopoietin-2 levels are associated with fluid overload, organ dysfunction, and mortality in human septic shock. Crit Care Med. 2016;44(11):2018–27. pmid:27441903
- 31. Maiden MJ, Otto S, Brealey JK, Finnis ME, Chapman MJ, Kuchel TR, et al. Structure and function of the kidney in septic shock. a prospective controlled experimental study. Am J Respir Crit Care Med. 2016;194(6):692–700. pmid:26967568
- 32. Wang A, Medzhitov R. Counting calories: the cost of inflammation. Cell. 2019;177(2):223–4. pmid:30951664
- 33. Sakellakis M, Reet J, Kladas M, Hoge G, Chalkias A, Radulovic M. Cancer-induced resting sinus tachycardia: an overlooked clinical diagnosis. Oncol Rev. 2024;18:1439415. pmid:39156014
- 34. Elzeneini M, Aranda JM Jr, Al-Ani M, Ahmed MM, Parker AM, Vilaro JR. Hemodynamic effects of ivabradine use in combination with intravenous inotropic therapy in advanced heart failure. Heart Fail Rev. 2021;26(2):355–61. pmid:32997214
- 35. Wu D, Kraut JA. Role of NHE1 in the cellular dysfunction of acute metabolic acidosis. Am J Nephrol. 2014;40(1):36–42. pmid:24994076
- 36. Zhu M-Y, Zhang D-L, Zhou C, Chai Z. Mild acidosis protects neurons during oxygen-glucose deprivation by reducing loss of mitochondrial respiration. ACS Chem Neurosci. 2019;10(5):2489–97. pmid:30835994