Abstract
Sepsis-associated liver injury (SALI) is an independent risk factor for death from sepsis. The aim of this study was to develop an interpretable machine learning model for early prediction of 28-day mortality in patients with SALI. Data were drawn from the Medical Information Mart for Intensive Care databases (MIMIC-IV v2.2 and MIMIC-III v1.4). The MIMIC-IV cohort was randomly split into a training set (70%) and an internal validation set (30%), and MIMIC-III (2001 to 2008) served as the external validation set. Features with more than 20% missing values were removed, and missing values in the remaining features were filled in by multiple interpolation. Lasso-CV, a lasso linear model fitted iteratively along a regularization path with the best model selected by cross-validation, was used to select important features for model development. Eight machine learning models were developed: Random Forest (RF), Logistic Regression, Decision Tree, Extreme Gradient Boost (XGBoost), K Nearest Neighbor, Support Vector Machine, Generalized Linear Models in which the best model is selected by cross-validation (CV_glmnet), and Linear Discriminant Analysis (LDA). Shapley additive explanations (SHAP) were used to improve the interpretability of the optimal model. In total, 1043 patients were included, of whom 710 were from MIMIC-IV and 333 from MIMIC-III. Twenty-four clinically relevant parameters were selected for model construction. For the prediction of 28-day mortality of SALI in the internal validation set, RF performed best, with an area under the curve (AUC) of 0.79 (95% CI: 0.73–0.86). RF also outperformed the traditional disease severity scores, including the Oxford Acute Severity of Illness Score (OASIS), Sequential Organ Failure Assessment (SOFA), Simplified Acute Physiology Score II (SAPS II), Logistic Organ Dysfunction Score (LODS), Systemic Inflammatory Response Syndrome (SIRS), and Acute Physiology Score III (APS III). SHAP analysis found that urine output, Charlson Comorbidity Index (CCI), minimal Glasgow Coma Scale (GCS_min), blood urea nitrogen (BUN) and age at admission (admission_age) were the five most important features affecting the RF model. RF therefore has good predictive ability for 28-day mortality in SALI, and urine output, CCI, GCS_min, BUN and admission_age within 24 h after intensive care unit (ICU) admission contribute significantly to model prediction.
Citation: Wen C, Zhang X, Li Y, Xiao W, Hu Q, Lei X, et al. (2024) An interpretable machine learning model for predicting 28-day mortality in patients with sepsis-associated liver injury. PLoS ONE 19(5): e0303469. https://doi.org/10.1371/journal.pone.0303469
Editor: Nattapol Aunsri, Mae Fah Luang University, THAILAND
Received: September 12, 2023; Accepted: April 25, 2024; Published: May 20, 2024
Copyright: © 2024 Wen et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All datasets supporting the conclusions of this study were obtained from the MIMIC-IV and MIMIC-III databases (web site: https://mimic.physionet.org/). The code can be obtained from GitHub (web site: https://github.com/MIT-LCP/mimic-code).
Funding: This study was supported by the Sichuan Science and Technology Program (2022YFS0626), Southwest Medical University (2022QN073), the Sichuan Science Technology Innovation Seedling Project (MZGC20230040), and Southwest Medical University and Xuyong County People's Hospital (2023XYXNYD16). Funding related to 2022YFS062, 2022QN073, MZGC20230040, and 2023XYXNYD16 was received by Chengli Wen, who wrote the original draft, completed data curation, and reviewed and edited the manuscript. Funding related to 2022YFS063 and 2021(451) was received by Xianying Lei, who completed the conceptualization with Tao Xu and Sicheng Liang; 2022NSFSC0576 was received by Sicheng Liang, who assisted Xianying Lei with the conceptualization; 2021SNXNYD05 was received by Muhan Lü, who supervised the progress and quality of the research and reviewed and edited the manuscript.
Competing interests: The authors declare no conflict of interest.
Introduction
Sepsis is a syndrome of multiple organ dysfunction caused by an abnormal immune response to infection [1]. As one of the common diseases in intensive care units (ICU), it is an important global health problem. The Global Burden of Disease Study, published in 2020, analyzed global, regional and national sepsis incidence and mortality rates from 1990 to 2017 and reported approximately 48.9 million cases of sepsis in 2017, with about 11 million sepsis-related deaths, accounting for 19.7% of all deaths worldwide [2]. Sepsis also carries a high risk of rehospitalization and a high cost of treatment [3,4]. In the United States, sepsis was the most expensive condition treated, amounting to $38.2 billion or 8.8% of aggregate costs for all hospital stays in 2017 [5].
The liver is a vital organ that regulates the balance of metabolism and immunity [6,7]. It is essential for regulating immune defense during sepsis through mechanisms that include lipopolysaccharide detoxification, bacterial clearance, acute-phase protein and cytokine release, and metabolic regulation of inflammation [8]. The production of large amounts of endotoxins and the release of inflammatory factors in sepsis lead to abnormal immune responses that impair the function of multiple organs, including the liver [9]. When there is an inappropriate immune response or excessive inflammation in the liver, the ability to clear pathogens is impaired and liver metabolism is disrupted. Sepsis-associated liver injury (SALI) can be caused by a variety of factors, including pathogens or shock, an exaggerated inflammatory response, persistent microcirculation failure, or even oxidative stress [10]. There are two main manifestations of SALI: ischemic hypoxic liver injury and sepsis-related cholestasis. There are no unified diagnostic criteria for SALI; the Surviving Sepsis Campaign (SSC) Guidelines recommended total bilirubin (TBIL) >2 mg/dL and international normalized ratio (INR) >1.5 as the diagnostic criteria [11]. Traditional scores for assessing disease severity include the Sequential Organ Failure Assessment (SOFA) [12], Oxford Acute Severity of Illness Score (OASIS) [13], Acute Physiology Score III (APS III) [14], Logistic Organ Dysfunction Score (LODS) [15], Simplified Acute Physiology Score II (SAPS II) [16], Systemic Inflammatory Response Syndrome (SIRS), and Glasgow Coma Scale (GCS).
Studies have reported that the incidence of SALI in the U.S. adult sepsis population is 34% and 46%, that SALI is an independent risk factor for death from sepsis, and that patients who develop SALI have an increased risk of death of nearly 54% [9,17]. The high mortality rate of SALI may be related to the lack of effective diagnostic tools and early warning systems. The aim of this study is to develop an interpretable machine learning model that can predict the 28-day mortality of SALI early, provide early warning for SALI, and prompt clinicians to intervene effectively to reduce 28-day mortality.
Methods
Data source
This is a retrospective cohort study based on data extracted from two open critical care databases from the same center: the Medical Information Mart for Intensive Care MIMIC-IV v2.2 (2008 to 2019) and MIMIC-III v1.4 (2001 to 2008), both collected at Beth Israel Deaconess Medical Center in Boston. We were granted access to the databases (Chengli Wen ID 11718300).
Research population
We included patients ≥18 years old with sepsis, defined as infection with a SOFA score ≥2 according to the Sepsis-3 diagnostic criteria [1], who had an ICU stay ≥24 h and at least one occurrence of SALI (defined as TBIL >2 mg/dL and INR >1.5 in sepsis [11]). We excluded patients aged <18 years, patients without liver injury, patients with an ICU stay <24 h, and all patients with other types of liver disease. Patients with human immunodeficiency virus (HIV) infection, pregnant women, and patients without biochemical and coagulation tests within 24 h of admission to the ICU were also excluded.
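The inclusion and exclusion criteria above can be read as a single filter over ICU stays. The following is a minimal sketch in Python, not the extraction code used in this study. The record type `IcuStay` and all of its field names are hypothetical, and summarizing "at least one occurrence of SALI" by the maximum TBIL and INR values is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass
class IcuStay:
    # All field names are hypothetical; they only mirror the criteria in the text.
    age: float
    suspected_infection: bool
    sofa: int
    icu_hours: float
    tbil_max_mg_dl: float
    inr_max: float
    other_liver_disease: bool = False
    hiv: bool = False
    pregnant: bool = False
    has_labs_first_24h: bool = True


def has_sali(stay: IcuStay) -> bool:
    """SALI as defined in the text: TBIL > 2 mg/dL and INR > 1.5 in sepsis."""
    return stay.tbil_max_mg_dl > 2.0 and stay.inr_max > 1.5


def is_eligible(stay: IcuStay) -> bool:
    """Apply the inclusion criteria first, then the exclusion criteria."""
    septic = stay.suspected_infection and stay.sofa >= 2  # Sepsis-3 definition
    included = (stay.age >= 18 and septic
                and stay.icu_hours >= 24 and has_sali(stay))
    excluded = (stay.other_liver_disease or stay.hiv or stay.pregnant
                or not stay.has_labs_first_24h)
    return included and not excluded


# Four made-up stays, one per typical screening outcome.
stays = [
    IcuStay(67, True, 6, 90, 3.1, 1.9),            # meets every criterion
    IcuStay(54, True, 4, 30, 1.2, 1.1),            # no liver injury
    IcuStay(71, True, 8, 12, 4.0, 2.2),            # ICU stay shorter than 24 h
    IcuStay(45, True, 5, 48, 2.8, 1.7, hiv=True),  # excluded: HIV infection
]
cohort = [s for s in stays if is_eligible(s)]
print(len(cohort))
```

Keeping the inclusion and exclusion tests in separate expressions makes it easier to count how many records are lost at each step, as a screening flow chart requires.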
Data collection
We used Structured Query Language (SQL, version 15.1) to extract data from the two databases. To develop an optimal, interpretable machine learning model for early prediction of 28-day mortality in patients with SALI, we extracted seven types of data and 79 candidate clinical features. We retrospectively collected the following data: (1) demographic characteristics, including age, sex, body weight, body height, and body mass index (BMI); (2) medical history, obtained according to the International Classification of Diseases (ICD)-9 and ICD-10, including hypertension, diabetes, congestive heart failure, myocardial infarction, peptic ulcer, cerebrovascular disease, chronic obstructive pulmonary disease, kidney disease, and Charlson Comorbidity Index (CCI) [18]; (3) vital signs, including heart rate, systolic blood pressure (SBP), diastolic blood pressure (DBP), mean arterial pressure (MAP), respiratory rate (RR), body temperature, and oxygen saturation (SpO2); (4) laboratory parameters, including white blood cell count, neutrophils, lymphocytes, platelets, hematocrit, red blood cell distribution width (RDW), hemoglobin, high-sensitivity C-reactive protein (hs-CRP), activated partial thromboplastin time (APTT), prothrombin time (PT), partial thromboplastin time (PTT), INR, fibrinogen, alanine aminotransferase (ALT), aspartate aminotransferase (AST), alkaline phosphatase (ALP), amylase, TBIL, lactate dehydrogenase (LDH), albumin, triglyceride, high-density lipoprotein, low-density lipoprotein, blood urea nitrogen (BUN), serum creatinine (Cr), creatine phosphokinase, creatine kinase MB, high sensitivity troponin T, N-terminal pro brain natriuretic peptide (NT-pro-BNP), lactate, pH, pO2, pCO2, PaO2/FiO2 ratio, base excess, anion gap, bicarbonate, serum calcium, serum chloride, serum sodium, serum potassium, and blood glucose; (5) traditional scores for assessing disease severity, including OASIS, SOFA, SIRS, SAPS II, LODS, GCS, and APS III; (6) urine volume on the first day of ICU admission; and (7) others, including duration of the current ICU stay, infection site, and doses of dopamine, adrenaline, noradrenaline, and dobutamine (all in µg/kg/min). The outcome was 28-day mortality. A detailed list of the included variables is shown in S1 Table.
Ethics statements
The databases were approved by the Massachusetts Institute of Technology and Beth Israel Deaconess Medical Center. This is a retrospective study that does not affect clinical treatment and care; therefore, ethical approval and informed consent of each patient included in the study were waived [19]. This study follows the Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD) statement [20], and the TRIPOD checklist is shown in S2 Table.
Data preprocessing
All data processing was done in R or Python. First, the cohorts from the two databases were divided into a death group and a survival group (patient outcome coded as 1 for death and 0 for survival), and the differences in each clinical feature between the two groups were compared. Second, we conducted a missing value analysis (S3 Table) and removed features with more than 20% missing values. Third, we used multiple interpolation to fill in features with less than 20% missing values. The data before and after interpolation overlapped well, and the distributions of the original and interpolated data are shown in S1 Fig. Finally, we screened the features of the interpolated data with the Lasso-CV method, which yielded an optimal regularization parameter of 0.113. SIRS, SOFA, OASIS, SAPS II, LODS, and APS III, which are composite scores that comprehensively assess disease severity, were excluded before screening. Twenty-four features were ultimately selected to develop the model (S4 Table).
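Two of the preprocessing steps above, the 20% missingness filter and lasso-based feature screening, can be illustrated with a small standard-library sketch. This is an illustration under stated assumptions rather than the authors' pipeline: the feature names and values are made up, the lasso is fitted by plain cyclic coordinate descent at a fixed penalty `lam`, and no cross-validation over penalties (the "CV" in Lasso-CV, which produced the value 0.113) is performed.

```python
def missing_rate(column):
    """Fraction of entries in a column that are missing (None)."""
    return sum(v is None for v in column) / len(column)


def drop_sparse_features(table, max_missing=0.20):
    """Keep only features whose missing rate does not exceed max_missing."""
    return {name: col for name, col in table.items()
            if missing_rate(col) <= max_missing}


def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimise (1/2n)*||y - Xw||^2 + lam*||w||_1 by cyclic coordinate descent.

    X is a list of rows with centred columns and y is centred, so no
    intercept is fitted.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of feature j with the residual that ignores feature j.
            rho = sum(
                X[i][j] * (y[i] - sum(X[i][k] * w[k] for k in range(p) if k != j))
                for i in range(n)
            ) / n
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            w[j] = soft_threshold(rho, lam) / z
    return w


# Hypothetical raw table: one feature with 20% missing, one with 60% missing.
table = {
    "bun": [25.0, 36.0, None, 40.0, 22.0],
    "hs_crp": [None, None, 5.0, None, 8.0],
}
kept = drop_sparse_features(table)

# Hypothetical centred design matrix: column 0 drives y, column 1 is unrelated.
X = [[-2.0, 1.0], [-1.0, -1.0], [0.0, 0.0], [1.0, -1.0], [2.0, 1.0]]
y = [-6.0, -3.0, 0.0, 3.0, 6.0]
w = lasso_coordinate_descent(X, y, lam=1.0)
selected = [j for j, coef in enumerate(w) if coef != 0.0]
print(list(kept), w, selected)
```

Features whose coefficient is shrunk exactly to zero are dropped, which is how a lasso fit doubles as a feature selector. In practice the penalty would be chosen by cross-validation instead of being fixed by hand.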
Model development and validation
The data extracted from MIMIC-IV were randomly divided into training and internal validation sets at a ratio of 7:3, and the data from MIMIC-III that did not overlap with MIMIC-IV were used as the external validation set. We trained the following eight models on the training set: Random Forest (RF), Logistic Regression, Decision Tree, Extreme Gradient Boost (XGBoost), K Nearest Neighbor Model (KNN), Support Vector Machine (SVM), Generalized Linear Models in which the best model is selected by cross-validation (CV_glmnet), and Linear Discriminant Analysis (LDA). Internal and external validation sets were used to test the performance of the models. We used area under the curve (AUC), accuracy, precision, recall, and specificity to evaluate performance, with AUC as the most important indicator. The optimal model was compared with the traditional clinical disease severity scores (SIRS, SOFA, OASIS, SAPS II, LODS, and APS III) to determine which better predicts the 28-day mortality risk of patients with SALI, so that clinicians can be alerted to intervene early. Finally, we tuned the hyperparameters of the optimal model to obtain its best performance.
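The 7:3 split and the five evaluation metrics named above can be sketched with the Python standard library alone. The snippet below is illustrative only: the labels and predicted scores are made up, the 0.5 classification threshold and the random seed are assumptions, and the helper names are hypothetical rather than taken from the study code.

```python
import random


def split_train_validation(rows, train_fraction=0.7, seed=0):
    """Shuffle the rows and split them into training and validation parts."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and specificity from the confusion matrix."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def auc(y_true, scores):
    """Probability that a random death is scored above a random survivor.

    Ties count as one half, which equals the area under the ROC curve.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# 710 placeholder record ids, matching the size of the MIMIC-IV cohort.
train, validation = split_train_validation(list(range(710)))

# Made-up outcomes (1 = death) and predicted probabilities for 8 patients.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]
metrics = classification_metrics(y_true, y_pred)
model_auc = auc(y_true, scores)
print(len(train), len(validation), metrics, model_auc)
```

AUC is computed from the ranking of the scores and does not depend on the threshold, whereas the other four metrics change whenever the threshold changes. This is one reason to treat AUC as the primary indicator when comparing models.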
Model explainability
The Shapley additive explanations (SHAP) method was used to improve the interpretability of the final model. SHAP is a machine learning interpretation method that can be used to interpret the importance of features in model prediction results [21]. It is based on the concept of Shapley value in cooperative game theory and uses an additive method to calculate the contribution of each feature to the model prediction results. The SHAP algorithm can provide an explanatory value for each feature, indicating the degree of influence of the feature on the model’s prediction results, and the results of the calculation can be used to explain not only the feature importance of individual predictions, but also the feature importance distribution of the entire dataset.
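To make the additive idea concrete, the sketch below computes exact Shapley values for a three-feature toy model by enumerating every feature subset. It is a simplified illustration and not the SHAP library: the risk function, its coefficients, and the baseline patient are all made up, and features outside a subset are fixed at a single baseline value, whereas SHAP averages over background data.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of each feature of x relative to a baseline."""
    n = len(x)

    def value(subset):
        # Features in `subset` take the patient's values; the rest stay at baseline.
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                total += weight * (value(members | {i}) - value(members))
        phi.append(total)
    return phi


def toy_risk(features):
    """Made-up risk score with one interaction term (not the study's model)."""
    urine_l, cci, bun = features
    interaction = 0.05 * cci if bun > 30 else 0.0
    return -0.2 * urine_l + 0.1 * cci + 0.01 * bun + interaction


patient = [2.0, 7.0, 36.0]    # urine output (L), CCI, BUN: hypothetical values
baseline = [1.0, 6.0, 22.0]   # hypothetical reference patient
phi = shapley_values(toy_risk, patient, baseline)
print(phi, sum(phi), toy_risk(patient) - toy_risk(baseline))
```

The contributions sum to the difference between the patient's prediction and the baseline prediction, which is the additivity property that allows SHAP values to be read as a per-feature breakdown of a single prediction.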
Statistical analysis
Values are expressed as medians (interquartile range) for continuous variables and counts (percentages) for categorical variables. The rank sum test was used for continuous variables and the chi-square test for categorical variables. After data preprocessing and feature selection, we developed eight popular machine learning models to predict 28-day mortality in patients with sepsis-related liver injury. The overall performance of each model was evaluated by its AUC, accuracy, precision, recall, and specificity. The best performing model was interpreted using Shapley values.
All calculations and analyses were performed using R 4.2.1 and Python 3.7. All statistical tests were 2-sided, and P values <0.05 were considered statistically significant.
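As a small illustration of the descriptive statistics and the categorical test described above, the sketch below computes a median with interquartile range and a Pearson chi-square test for a 2×2 table. The data are made up and the helper names are hypothetical. It also assumes no continuity correction and the default quartile method of Python's `statistics.quantiles`, neither of which is specified in the text, and it does not show the rank sum test.

```python
from math import erfc, sqrt
from statistics import median, quantiles


def median_iqr(values):
    """Median with the first and third quartiles."""
    q1, _, q3 = quantiles(values, n=4)
    return median(values), q1, q3


def chi_square_2x2(table):
    """Pearson chi-square statistic (no continuity correction) and P value."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, observed_row in enumerate(table):
        for j, observed in enumerate(observed_row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = erfc(sqrt(stat / 2))
    return stat, p_value


# Made-up continuous variable and made-up 2x2 table (rows: groups, columns: yes/no).
summary = median_iqr([1, 2, 3, 4, 5, 6, 7])
stat, p_value = chi_square_2x2([[10, 20], [20, 10]])
print(summary, stat, p_value, p_value < 0.05)
```

The closed form for the P value holds only for 2×2 tables, where the statistic has one degree of freedom; larger tables need the general chi-square distribution.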
Results
Baseline characteristics
MIMIC-IV contained 73,181 records; after screening based on the inclusion and exclusion criteria, 710 records were retained. Of these, 497 cases were used as the training set and 213 cases were used as the internal validation set. MIMIC-III (2001–2008) included 28,391 records, with 333 patients ultimately included as an external validation set. The flow chart of this study is shown in Fig 1. Table 1 shows the baseline characteristics of the entire cohort from MIMIC-IV, as well as the death and survival groups. Baseline characteristics of the cohort from MIMIC-III are shown in S5 Table. The cohort from MIMIC-IV included 404 males (56.9%) and 306 females (43.1%), with 424 survivors (59.7%) and 286 deaths (40.3%), and the median (interquartile range [IQR]) age was 68.4 (57.5,78.8) years. The age of patients in the death group (70.6[62,80.9]) was significantly higher than that in the survival group (66.1[54.2,77.2]). Compared to the survival group, the death group had a higher CCI (6.0[4.0,7.0] vs 7.0[5.0,9.0], P<0.001) and more patients with diabetes (97(33.9%) vs 128(30.2%), P<0.001) and myocardial infarction (67(15.8%) vs 70(24.5%), P = 0.022). In addition, among the laboratory test indices, RDW (15.7[14.2,17.7] vs 16.6[14.7,18.6], P<0.001), ALP (15.7[14.2,17.7] vs 99[62,178.8], P<0.001), LDH (338.0[236.0,521.0] vs 440.5[283.3,792.2], P<0.001), BUN (25.0[16.0,40.0] vs 36.0[22.0,57.0], P<0.001), Cr (1.3[0.9,1.9] vs 25.0[16.0,40.0], P = 0.001), and lactate (2.1[1.4,3.4] vs 2.3[1.5,4.0], P<0.001) were higher in the death group, while pO2 (134[86,254] vs 112.5[74.3,206.5], P = 0.006), albumin (3.0[2.5,3.4] vs 2.7[2.2,3.2], P<0.001), platelets (163.0[101.5,228.0] vs 139.0[74.0,211.0], P = 0.002), and hemoglobin (139.0[74.0,211.0] vs 10.0[8.5,12.2], P = 0.032) were lower. We also found that the survival group had a longer ICU stay (6.2[3.0,13.0] vs 4.5[2.4,7.7], P<0.001) and a higher 24 h urine output (1500[878,2270] vs 382[780,1650], P<0.001) than the death group. All of the disease severity scores, except for the SIRS score, were significantly higher in the death group than in the survival group.
A. Screening Process for MIMIC-IV. B. Screening Process for MIMIC-III.
Model development and validation
Twenty-four features were screened for model construction (S4 Table). The feature coefficients are plotted in Fig 2. A positive coefficient indicates a positive effect on 28-day mortality, while a negative coefficient indicates a negative effect.
Both internal and external validation sets were used to evaluate the models. In the internal validation cohort, the RF model had good predictive power for 28-day mortality in sepsis-related liver injury, with the highest AUC of 0.79 (95% CI: 0.73–0.86), compared with CV_glmnet (AUC 0.76 (95% CI: 0.70–0.83)), Support Vector Machine (AUC 0.78 (95% CI: 0.72–0.85)), Logistic Regression (AUC 0.78 (95% CI: 0.70–0.83)), LDA (AUC 0.77 (95% CI: 0.70–0.84)), K Nearest Neighbor Model (AUC 0.69 (95% CI: 0.61–0.76)), XGBoost (AUC 0.68 (95% CI: 0.61–0.76)), and Decision Tree (AUC 0.67 (95% CI: 0.59–0.75)). Receiver operating characteristic (ROC) curves were plotted to evaluate the performance of the models; the ROC curves for the internal and external validation sets are shown in Fig 3. The AUC, accuracy, precision, recall, and specificity of the eight models are compared in Table 2.
A. Internal validation set. B. External validation set.
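Each AUC above is reported with a 95% confidence interval, but the text does not state how the intervals were obtained. One common option is a percentile bootstrap, sketched below as an assumption rather than as the method used in this study (analytic methods such as DeLong's are an equally plausible choice). The labels, scores, seed, and number of resamples are illustrative.

```python
import random


def auc(y_true, scores):
    """Probability that a random positive is scored above a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(y_true, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        # Resample patients with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if 0 < sum(ys) < n:  # both classes are needed to compute an AUC
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Made-up outcomes (1 = death) and predicted probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
lower, upper = bootstrap_auc_ci(y_true, scores)
print(auc(y_true, scores), lower, upper)
```

With only a few hundred patients in a validation set, intervals of this kind are wide, so two models whose intervals overlap substantially cannot be ranked with confidence on AUC alone.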
We selected the top three models by AUC for Decision Curve Analysis (DCA), and RF remained the best performing model among them (Fig 4). The RF model also showed better predictive performance than the traditional disease severity scores: SIRS (AUC 0.53 (95% CI: 0.45–0.60)), SOFA (AUC 0.62 (95% CI: 0.55–0.70)), OASIS (AUC 0.62 (95% CI: 0.55–0.70)), SAPS II (AUC 0.61 (95% CI: 0.53–0.69)), LODS (AUC 0.65 (95% CI: 0.58–0.73)), and APS III (AUC 0.61 (95% CI: 0.54–0.69)). The ROC curves are shown in Fig 5, and Table 3 compares the performance of RF and the traditional disease severity scores in the internal validation set. RF was the optimal model for predicting 28-day mortality in patients with SALI. We also compared the predictive performance of the models in the external validation set; the results are shown in S6 and S7 Tables. Hyperparameter tuning improved the predictive performance of the model, and Table 4 displays the result.
A. Internal validation set. B. External validation set.
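Decision Curve Analysis compares models by their net benefit across a range of threshold probabilities. The sketch below shows the standard net benefit formula, TP/n − FP/n × pt/(1 − pt), on made-up data. The function names are hypothetical and the snippet is not the code used to produce Fig 4.

```python
def net_benefit(y_true, probs, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(t == 1 and p >= threshold for t, p in zip(y_true, probs))
    fp = sum(t == 0 and p >= threshold for t, p in zip(y_true, probs))
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: treat every patient regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Made-up outcomes (1 = death) and predicted probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
probs = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

model_nb = net_benefit(y_true, probs, threshold=0.5)
treat_all_nb = net_benefit_treat_all(y_true, threshold=0.5)
print(model_nb, treat_all_nb)
```

A decision curve plots these values over many thresholds. A model is clinically useful at a given threshold only if its net benefit exceeds both the treat-all strategy and the treat-none strategy, whose net benefit is zero.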
Model explainability
To improve the clinical utility of the model, we used the SHAP method to determine which features contribute to the model’s prediction of 28-day mortality in patients with sepsis (Fig 6). Fig 6A shows the distribution of the SHAP values of the top 20 clinical features: each point represents the SHAP value of one feature for one patient, and its position indicates the magnitude and direction of that feature’s contribution to the model output. If the value is positive, the feature positively influences the output; if the value is negative, the feature negatively influences the output. Red indicates high feature values and blue indicates low feature values. A darker color indicates a stronger influence of the feature on the target. In Fig 6A, low values of urine output and GCS_min were associated with positive SHAP values, indicating a positive influence on 28-day mortality, while CCI, BUN and admission_age displayed the opposite trend. The bar chart in Fig 6B ranks the features from high to low by their mean absolute SHAP values, indicating the contribution of each feature to the whole model. The larger the absolute SHAP value, the more important the feature and the greater its impact on the model output. Fig 6B shows that the five most important clinical features were urine output, CCI, GCS_min, BUN and admission_age.
A. SHAP values showing the influence of different features on the output of RF Model. B. Mean absolute SHAP values for each clinical feature.
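The ranking behind a mean absolute SHAP bar chart is a simple aggregation: take the absolute SHAP value of each feature for each patient, average over patients, and sort. The sketch below illustrates this with a made-up SHAP matrix for three features and three patients. The numbers are not taken from the study and the function name is hypothetical.

```python
def rank_by_mean_abs_shap(shap_rows, feature_names):
    """Rank features by the mean absolute SHAP value across patients."""
    n = len(shap_rows)
    mean_abs = {
        name: sum(abs(row[j]) for row in shap_rows) / n
        for j, name in enumerate(feature_names)
    }
    # Largest mean absolute contribution first.
    return sorted(mean_abs.items(), key=lambda item: item[1], reverse=True)


feature_names = ["urine_output", "cci", "bun"]
# One row per patient, one SHAP value per feature (made-up values).
shap_rows = [
    [-0.30, 0.10, 0.05],
    [0.20, -0.15, 0.05],
    [0.40, 0.05, -0.02],
]
ranking = rank_by_mean_abs_shap(shap_rows, feature_names)
print(ranking)
```

Because the absolute value is taken before averaging, a feature that pushes some patients toward death and others toward survival still ranks as important. The direction of the effect has to be read from the per-patient plot instead.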
Based on the SHAP summary plot, we further derived SHAP dependence plots for the five most influential features to explain the effect of clinical characteristics on the risk of 28-day death (Fig 7). The vertical axis of each SHAP dependence plot is the SHAP value of the clinical characteristic, while the horizontal axis is the range of values of the clinical characteristic; a SHAP value higher than zero indicates that the patient has an increased risk of 28-day death.
A. Urine output; B. CCI; C. GCS_min; D. BUN; E. Admission_age.
Discussion
We developed and validated a predictive model using a large dataset in order to build a valid, stable, and interpretable model to predict 28-day mortality in patients with SALI. Among the multiple models developed, RF was the most reliable and stable and had the best predictive performance. We compared RF with the traditional disease severity scores (SIRS, SOFA, OASIS, SAPS II, LODS, and APS III) and found that the RF model still performed best. External validation of the model confirmed the stability of RF. To date, no researchers have developed a predictive model for 28-day mortality in patients with SALI, and no studies have used multi-model screening for optimal model development. Some researchers have used nomograms to predict in-hospital mortality and 90-day mortality in patients with SALI, but they compared the developed models with only some of the traditional disease severity scores [22,23]. After developing the optimal model, we also tuned its hyperparameters to optimize its predictive performance [24,25]. In addition, we identified the five clinical features that contributed most to the model: urine output, CCI, GCS, BUN, and admission_age. These clinical features may therefore serve as early warning indicators.
Shapley values were used to address the opacity of the model [26]. Model opacity refers to the opacity of the intermediate process between the input of data and the output of results [27,28]. In Fig 7A, the Shapley value was 0 when the urine volume was about 1000 ml within 24 h after admission, and the Shapley value decreased as urine volume increased, indicating a negative effect on 28-day mortality in SALI. In Fig 7C, the GCS value at a Shapley value of 0 is approximately 10, and its effect on 28-day mortality in SALI is consistent with that of urine output. However, the effects of CCI, BUN and admission_age on 28-day mortality in SALI were opposite to the trend of the first two features. The Shapley values approached 0 when CCI, BUN and admission_age were about 6, 22 and 65, respectively.
It is well known that patients with severe sepsis have severely impaired microcirculation and reduced end-organ tissue perfusion, exacerbating organ damage. Urine output is one of the traditional indicators of tissue perfusion that can be used to assess microcirculation [29]. A study by Heffernan et al. on the relationship between urine output and mortality in critically ill patients showed that a urine output threshold of less than 0.5 mL/kg/hr moderately predicted mortality in ICU inpatients [30]. This serves as a reminder to clinicians that they need to focus not only on the total amount of urine output, but also on changes in urine output over time, to detect changes in the patient’s microcirculation in a timely manner. BUN is one of the indicators used to assess kidney function. Wen et al. showed that BUN greater than or equal to 21 mg/dL is one of the most important predictors of mortality risk in patients with sepsis, which is close to our result [31]. The CCI, developed in 1987, is considered the gold standard for assessing comorbidities in clinical studies [32] and is used as a tool to predict long-term mortality in patients [33]. Previous studies have also shown an increase in patient mortality with increasing CCI [34,35], consistent with our results. The GCS is a scale commonly used to assess a patient’s level of consciousness [36]. Lai Q et al. incorporated GCS into a model to assess in-hospital mortality in patients with sepsis. Our model and that of Lai Q et al. consistently show that GCS is an important clinical indicator in predicting the risk of death in patients with sepsis [37]. As for admission_age, as people age, their bodily functions gradually deteriorate and the functioning of their organs diminishes. This may explain why admission_age was one of the top five important predictors of 28-day mortality in patients with sepsis-related liver injury.
We found that the top 5 features with the greatest impact on predictive performance were not liver function-related. The ALBI grade is a new score for assessing liver function, developed by Dr. Philip J. Johnson, Professor of Translational Oncology at the University of Liverpool, UK [38]. However, because more than 40% of its values were missing (S4 Table), it was excluded when selecting the clinical features used to construct the model, and in Table 1 there was no statistically significant difference in ALBI between the SALI death group and survival group. Moreover, among the liver function-related measurements in Table 1, only ALP was significantly higher in the death group than in the survival group; the other measurements did not differ significantly between the two groups. SALI is a hepatic impairment caused by sepsis and is usually accompanied by other organ injuries, so indexes of liver function impairment alone are not sufficient to represent the overall severity of this group of patients. Together with the lack of significant differences in liver function-related indexes between the death and survival groups, this may explain the absence of indicators of liver injury among the five most important features affecting the predictive performance of the model.
There are some limitations in our study. First, our modeling used a single-center dataset and was a retrospective study. Second, the part of MIMIC-III that does not overlap with MIMIC-IV was used as the external validation cohort, so the validation was not chronologically prospective. Third, we focused only on the clinical indicators within 24 h after ICU admission and did not assess the impact of changes in the clinical features on outcomes during the ICU stay. Therefore, multicenter prospective studies are needed to validate our findings.
Conclusion
The RF machine learning model has good predictive ability for 28-day mortality in SALI. Urine output, CCI, GCS_min, BUN and admission_age within 24 h of ICU admission contribute significantly to model prediction.
Supporting information
S1 Fig. Distribution of the original and interpolated data.
https://doi.org/10.1371/journal.pone.0303469.s001
(DOCX)
S1 Table. All variables extracted from the MIMIC-IV and MIMIC-III databases.
https://doi.org/10.1371/journal.pone.0303469.s002
(CSV)
S3 Table. Missing number (%) for included variables in the dataset (MIMIC-IV).
https://doi.org/10.1371/journal.pone.0303469.s004
(CSV)
S4 Table. A list of the features that were finally used for prediction in this study.
https://doi.org/10.1371/journal.pone.0303469.s005
(CSV)
S5 Table. Baseline characteristics of the cohort from MIMIC-III.
https://doi.org/10.1371/journal.pone.0303469.s006
(CSV)
S6 Table. Comparison of the performance of 8 machine learning classification models in predicting 28-day mortality rate in the external validation set.
https://doi.org/10.1371/journal.pone.0303469.s007
(CSV)
S7 Table. Comparison of the performance of random forest and traditional disease severity scores in predicting 28-day mortality rate in the external validation set.
https://doi.org/10.1371/journal.pone.0303469.s008
(CSV)
References
- 1. Singer M, Deutschman CS, Seymour CW, Shankar-Hari M, Annane D, Bauer M, et al. The Third International Consensus Definitions for Sepsis and Septic Shock (Sepsis-3). JAMA. 2016;315(8):801–810. pmid:26903338
- 2. Rudd KE, Johnson SC, Agesa KM, Shackelford KA, Tsoi D, Kievlan DR, et al. Global, regional, and national sepsis incidence and mortality, 1990–2017: analysis for the Global Burden of Disease Study. Lancet. 2020;395(10219):200–211. pmid:31954465
- 3. Shankar-Hari M, Saha R, Wilson J, Prescott HC, Harrison D, Rowan K, et al. Rate and risk factors for rehospitalisation in sepsis survivors: systematic review and meta-analysis. Intensive Care Med. 2020;46(4):619–636. pmid:31974919
- 4. Agrawal D, Chen CB, Dravenstott RW, Strömblad CT, Schmid JA, Darer JD, et al. Predicting Patients at Risk for 3-Day Postdischarge Readmissions, ED Visits, and Deaths. Med Care. 2016;54(11):1017–1023. pmid:27213544
- 5. Liang L, Moore B, Soni A. National Inpatient Hospital Costs: The Most Expensive Conditions by Payer, 2017. In: Healthcare Cost and Utilization Project (HCUP) Statistical Briefs. Rockville (MD): Agency for Healthcare Research and Quality (US); July 14, 2020.
- 6. Kubes P, Jenne C. Immune Responses in the Liver. Annu Rev Immunol. 2018; 36:247–277. pmid:29328785
- 7. Solhi R, Lotfinia M, Gramignoli R, Najimi M, Vosough M. Metabolic hallmarks of liver regeneration. Trends Endocrinol Metab. 2021;32(9):731–745. pmid:34304970
- 8. Zhang X, Liu H, Hashimoto K, Yuan S, Zhang J. The gut-liver axis in sepsis: interaction mechanisms and therapeutic potential. Crit Care. 2022;26(1):213. Published 2022 Jul 13. pmid:35831877
- 9. Yan J, Li S, Li S. The role of the liver in sepsis. Int Rev Immunol. 2014;33(6):498–510. pmid:24611785
- 10. Lelubre C, Vincent JL. Mechanisms and treatment of organ failure in sepsis. Nat Rev Nephrol. 2018;14(7):417–427. pmid:29691495
- 11. Dellinger RP, Levy MM, Rhodes A, Annane D, Gerlach H, Opal SM, et al. Surviving sepsis campaign: international guidelines for management of severe sepsis and septic shock: 2012. Crit Care Med. 2013;41(2):580–637. pmid:23353941
- 12. Evans L, Rhodes A, Alhazzani W, Antonelli M, Coopersmith CM, French C, et al. Surviving sepsis campaign: international guidelines for management of sepsis and septic shock 2021. Intensive Care Med. 2021;47(11):1181–1247. pmid:34599691
- 13. Chen Q, Zhang L, Ge S, He W, Zeng M. Prognosis predictive value of the Oxford Acute Severity of Illness Score for sepsis: a retrospective cohort study. PeerJ. 2019;7: e7083. Published 2019 Jun 10. pmid:31218129
- 14. Korkmaz Toker M, Gülleroğlu A, Karabay AG, Biçer İG, Demiraran Y. SAPS III or APACHE IV: Which score to choose for acute trauma patients in intensive care unit? Yoğun bakımdaki akut travma hastalarında hangi skoru seçmeliyiz: SAPS III mü, APACHE IV mü? Ulus Travma Acil Cerrahi Derg. 2019;25(3):247–252. pmid:31135940
- 15. Lu Z, Zhang J, Hong J, Wu J, Liu Y, Xiao W, et al. Development of a Nomogram to Predict 28-Day Mortality of Patients with Sepsis-Induced Coagulopathy: An Analysis of the MIMIC-III Database. Front Med (Lausanne). 2021;8:661710. Published 2021 Apr 6. pmid:33889591
- 16. Le Gall JR, Lemeshow S, Saulnier F. A new Simplified Acute Physiology Score (SAPS II) based on a European/North American multicenter study [published correction appears in JAMA 1994 May 4;271(17):1321]. JAMA. 1993;270(24):2957–2963. pmid:8254858
- 17. Minemura M, Tajiri K, Shimizu Y. Liver involvement in systemic infection. World J Hepatol. 2014;6(9):632–642. pmid:25276279
- 18. Charlson ME, Carrozzino D, Guidi J, Patierno C. Charlson Comorbidity Index: A Critical Review of Clinimetric Properties. Psychother Psychosom. 2022;91(1):8–35. pmid:34991091
- 19. Johnson AE, Pollard TJ, Shen L, Lehman LW, Feng M, Ghassemi M, et al. MIMIC-III, a freely accessible critical care database. Sci Data. 2016;3:160035. Published 2016 May 24. pmid:27219127
- 20. Collins GS, Reitsma JB, Altman DG, Moons KG. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement. BMJ. 2015;350:g7594. Published 2015 Jan 7. pmid:25569120
- 21. Peng S, Huang J, Liu X, Deng J, Sun C, Tang J, et al. Interpretable machine learning for 28-day all-cause in-hospital mortality prediction in critically ill patients with heart failure combined with hypertension: A retrospective cohort study based on medical information mart for intensive care database-IV and eICU databases. Front Cardiovasc Med. 2022;9:994359. Published 2022 Oct 12. pmid:36312291
- 22. Liu Y, Sun R, Jiang H, Liang G, Huang Z, Qi L, et al. Development and validation of a predictive model for in-hospital mortality in patients with sepsis-associated liver injury. Ann Transl Med. 2022;10(18):997. pmid:36267798
- 23. Cui L, Bao J, Yu C, Zhang C, Huang R, Liu L, et al. Development of a nomogram for predicting 90-day mortality in patients with sepsis-associated liver injury. Sci Rep. 2023;13(1):3662. Published 2023 Mar 4. pmid:36871054
- 24. Dalal S, Onyema EM, Malik A. Hybrid XGBoost model with hyperparameter tuning for prediction of liver disease with better accuracy. World J Gastroenterol. 2022;28(46):6551–6563. pmid:36569269
- 25. Li W, Wang T, Ng WWY. Population-Based Hyperparameter Tuning with Multitask Collaboration [published online ahead of print, 2021 Dec 8]. IEEE Trans Neural Netw Learn Syst. 2021;PP:10.1109/TNNLS.2021.3130896. pmid:34878983
- 26. Hsu W, Elmore JG. Shining Light into the Black Box of Machine Learning. J Natl Cancer Inst. 2019;111(9):877–879. pmid:30629201
- 27. The Lancet Respiratory Medicine. Opening the black box of machine learning. Lancet Respir Med. 2018;6(11):801. pmid:30343029
- 28. Azodi CB, Tang J, Shiu SH. Opening the Black Box: Interpretable Machine Learning for Geneticists. Trends Genet. 2020;36(6):442–455. pmid:32396837
- 29. Hariri G, Joffre J, Leblanc G, Bonsey M, Lavillegrand JR, Urbina T, et al. Narrative review: clinical assessment of peripheral tissue perfusion in septic shock. Ann Intensive Care. 2019;9(1):37. Published 2019 Mar 13. pmid:30868286
- 30. Heffernan AJ, Judge S, Petrie SM, Godahewa R, Bergmeir C, Pilcher D, et al. Association Between Urine Output and Mortality in Critically Ill Patients: A Machine Learning Approach. Crit Care Med. 2022;50(3):e263–e271. pmid:34637423
- 31. Weng J, Hou R, Zhou X, Xu Z, Zhou Z, Wang P, et al. Development and validation of a score to predict mortality in ICU patients with sepsis: a multicenter retrospective study. J Transl Med. 2021;19(1):322. Published 2021 Jul 29. pmid:34325720
- 32. Charlson ME, Pompei P, Ales KL, MacKenzie CR. A new method of classifying prognostic comorbidity in longitudinal studies: development and validation. J Chronic Dis. 1987;40(5):373–383. pmid:3558716
- 33. Charlson ME, Carrozzino D, Guidi J, Patierno C. Charlson Comorbidity Index: A Critical Review of Clinimetric Properties. Psychother Psychosom. 2022;91(1):8–35. pmid:34991091
- 34. Torvik MA, Nymo SH, Nymo SH, Bjørnsen LP, Kvarenes HW, Ofstad EH. Patient characteristics in sepsis-related deaths: prevalence of advanced frailty, comorbidity, and age in a Norwegian hospital trust [published correction appears in Infection. 2023 May 12;]. Infection. 2023;51(4):1103–1115. pmid:36894755
- 35. Guirgis FW, Black LP, Henson M, Labilloy G, Smotherman C, Hopson C, et al. A hypolipoprotein sepsis phenotype indicates reduced lipoprotein antioxidant capacity, increased endothelial dysfunction and organ failure, and worse clinical outcomes. Crit Care. 2021;25(1):341. Published 2021 Sep 17. pmid:34535154
- 36. Mehta R; trainee GP, Chinthapalli K; consultant neurologist. Glasgow coma scale explained. BMJ. 2019;365:l1296. Published 2019 May 2. pmid:31048343
- 37. Lai Q, Xia Y, Yang W, Zhou Y. Development and Validation of a Rapid and Efficient Prognostic Scoring System for Sepsis Based on Oxygenation Index, Lactate and Glasgow Coma Scale. J Inflamm Res. 2023;16:2955–2966. Published 2023 Jul 18. pmid:37484996
- 38. Johnson PJ, Berhane S, Kagebayashi C, Satomura S, Teng M, Reeves HL, et al. Assessment of liver function in patients with hepatocellular carcinoma: a new evidence-based approach-the ALBI grade. J Clin Oncol. 2015;33(6):550–558. pmid:25512453