
Development and validation of a cardiac surgery-associated acute kidney injury prediction model using the MIMIC-IV database

  • Yang Xu,

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Resources, Validation, Visualization, Writing – original draft

    Affiliation Department of Emergency, Affiliated Hospital of Yangzhou University, Yangzhou University, Jiangsu, China

  • Chunxiao Song,

    Roles Data curation

    Affiliation Department of Emergency, Affiliated Hospital of Yangzhou University, Yangzhou University, Jiangsu, China

  • Wenping Wei,

    Roles Writing – review & editing

    Affiliation Department of Pediatrics, Affiliated Hospital of Yangzhou University, Yangzhou University, Jiangsu, China

  • Runfeng Miao

    Roles Funding acquisition

    090486@yzu.edu.cn

    Affiliation Department of Emergency, Affiliated Hospital of Yangzhou University, Yangzhou University, Jiangsu, China

Abstract

Objective

This study aimed to develop an innovative early prediction model for acute kidney injury (AKI) following cardiac surgery in intensive care unit (ICU) settings, leveraging preoperative and postoperative clinical variables, and to identify key risk factors associated with AKI.

Methods

Retrospective data from 1,304 cardiac surgery patients (1,028 AKI cases and 276 non-AKI controls) were extracted from the MIMIC-IV database. We analyzed three datasets: preoperative 48-hour averages, preoperative 48-hour maxima, and postoperative 24-hour maxima of critical physiological parameters. Using logistic regression, LASSO regression, and random forest (RF) algorithms, we constructed nine prediction models, evaluating their performance via AUROC, sensitivity, specificity, Youden’s index, decision curve analysis (DCA), and calibration curves.

Results

Our findings demonstrate that all models achieved AUROC values >0.7, with three models exceeding 0.75. Notably, the logistic regression model incorporating preoperative 48-hour maximum values and postoperative 24-hour maximum values exhibited the highest predictive accuracy (AUROC = 0.755, 95% CI: 0.7185–0.7912), outperforming other configurations. This model’s superiority lies in its integration of dynamic preoperative and postoperative variables, capturing both baseline risks and acute postoperative changes. By systematically comparing multiple machine learning approaches, our study highlights the utility of combining temporal physiological metrics to enhance AKI risk stratification. These results offer a robust, clinically applicable tool for early AKI prediction, enabling proactive interventions to improve outcomes in cardiac surgery patients.

Introduction

Acute kidney injury (AKI) is a common complication of cardiac surgery, with a reported incidence of cardiac surgery–associated acute kidney injury (CSA-AKI) as high as 20% to 30% [1]. Approximately 3% of CSA-AKI patients require temporary renal replacement therapy (RRT). The perioperative mortality rate for patients with severe AKI is three to eight times greater than that for patients without AKI, resulting in longer ICU admissions and hospital stays and increasing the cost of medical care during hospitalization [2]. Patients who develop AKI after cardiac surgery are four times more likely to die in the hospital [3]. Therefore, a clinical prediction model built from readily obtainable variables with reasonable predictive power is needed, as it could play an important role in clinicians’ decision-making about treatment options.

The pathophysiology of CSA-AKI is multifactorial, and the exact pathophysiological mechanisms are not fully understood. The major pathways that may be involved include insufficient renal perfusion, ischemia‒reperfusion injury (IRI), inflammatory cascade activation, oxidative stress, nephrotoxin exposure, and genetic polymorphisms, which may occur at any time in the perioperative period [4]. Each of these factors interacts before, during, and after surgery [5]. Therefore, we believe that the changes in vital signs of surgical patients before, during, and after surgery, as well as the results of various laboratory tests, can be used as potential predictors of CSA-AKI.

Generally, before elective surgery, a number of routine preoperative laboratory tests are completed in most patients in order to assess the patient’s suitability for surgery. Based on the available literature, these tests can be categorized as random, indicated, routine, or screening tests. Routine laboratory tests include the complete blood count (CBC), basic metabolic panel (BMP), and coagulation tests, such as the partial thromboplastin time (PTT), prothrombin time (PT), international normalized ratio (INR), and blood grouping and antibody screening (T&S) [6]; these tests are used for the detection of unsuspected diseases [4]. Considering the easy accessibility of these laboratory indicators, in this retrospective study we screened routine laboratory indicators from a large public database.

A review of articles based on the MIMIC database shows that previous studies lacked clear criteria for extracting laboratory result data: Fangqi Hu et al. extracted the first recorded value [7], while Wei Jiang et al. extracted the average value, considering that many variables were measured more than once [8]. In our study, in addition to these two extraction methods, we also extracted the maximum value. This retrospective study used the MIMIC-IV database and focused on patients admitted to the ICU after major cardiac and vascular surgery. Multiple predictive models were developed using logistic regression, LASSO regression, and random forest based on the results of several routine laboratory tests in critically ill patients. We then compared the predictive performance of each model using the area under the ROC curve (AUROC), decision curve analysis (DCA), and calibration curves to validate the value of these models for predicting AKI in ICU patients after cardiac surgery and to identify the best model.

Materials and methods

Data acquisition

The data for this study were obtained from the Medical Information Mart for Intensive Care database, version 2.2 (MIMIC-IV v2.2). MIMIC-IV is a publicly accessible critical care data repository from a single medical centre, built through a collaboration between Beth Israel Deaconess Medical Center (BIDMC, Boston, MA, USA) and the Massachusetts Institute of Technology Laboratory for Computational Physiology (MIT, Cambridge, MA, USA). It is divided into “modules” to reflect the source of the data [9].

The database includes data on more than 257,000 distinct patients treated at Beth Israel Deaconess Medical Center (BIDMC) in Boston, Massachusetts, between 2008 and 2019, accounting for roughly 524,000 hospital admissions. It also contains the comprehensive records of 73,181 patients hospitalized in various intensive care units [10]. The dataset includes demographic indicators, vital sign readings, laboratory results, fluid balance assessments, and patient survival data, in addition to International Classification of Diseases (ICD-9 and ICD-10) codes, which provide a standardized framework for systematic classification. In our study, the researcher responsible for extracting the data had credentialed access to the database [10].

Study cohort selection

We screened patients in the MIMIC-IV database who had undergone coronary artery bypass grafting (CABG), valve replacement or repair, or combined CABG and valve surgery using International Classification of Diseases, Ninth Revision (ICD-9, code 361%) and Tenth Revision (ICD-10, code 021%) codes. For patients who underwent multiple major cardiac vascular surgeries, we included only information on the first surgery, and for patients who had multiple ICU admissions after major cardiac vascular surgery, we considered only information on the first ICU admission.

Secondary eligibility assessment of screened cardiac surgery patients was then performed by applying the following exclusion criteria: patients under the age of 18 years, patients who were not admitted to the ICU after cardiac surgery, patients who had an ICU stay of less than 24 hours, patients with missing information on laboratory bioindicators, and patients for whom postoperative AKI status could not be ascertained. A total of 1,304 patients who met the inclusion criteria were ultimately included, including 1,028 AKI patients and 276 non-AKI patients.

Variable selection

We extracted information using Structured Query Language (SQL) in PostgreSQL (version 13.7.2) and Navicat Premium (version 16). The extracted latent variables can be categorized into five main groups: 1) demographic data, including age, weight, height, BMI, sex, Sequential Organ Failure Assessment (SOFA) score, and Glasgow Coma Scale (GCS); 2) vital signs, including heart rate, respiration, temperature, systolic blood pressure, diastolic blood pressure, mean arterial blood pressure, and oxygen saturation; 3) comorbidities, including obesity, diabetes, hypertension, atrial fibrillation, myocardial infarction, chronic lung disease, chronic renal failure, liver disease, shock, hyperlipidaemia, and vascular disease; 4) laboratory indices: in the 48 hours before surgery, white blood cell count, red blood cell count, mean corpuscular haemoglobin, mean corpuscular haemoglobin concentration, platelet count, haematocrit, international normalized ratio (INR), prothrombin time, activated partial thromboplastin time, alanine aminotransferase, aspartate aminotransferase, alkaline phosphatase, total bilirubin, urea, and creatinine; in the 24 hours after surgery, blood chloride, blood potassium, blood sodium, blood glucose, white blood cell count, lymphocyte count, monocyte count, neutrophil count, erythrocyte count, mean corpuscular haemoglobin, mean corpuscular haemoglobin concentration, platelet count, haematocrit, mean corpuscular volume, international normalized ratio, prothrombin time, activated partial thromboplastin time, partial pressure of oxygen, partial pressure of carbon dioxide, oxygenation index, pH, lactate, urea, creatinine, and blood calcium; 5) outcome indicators: the occurrence of AKI, length of ICU stay, and in-hospital mortality.
The primary outcome variable of this study was whether AKI occurred after cardiac surgery, and the secondary outcome variables were the length of ICU stay and in-hospital mortality.

It should be noted that in this study we extracted three different sets of data from the MIMIC-IV database: 1) the first set comprised the first recorded values of each laboratory variable in the 48 hours before and 24 hours after surgery (group Q1); 2) considering that many variables were measured more than once, the second set comprised the mean values of the laboratory variables over the same windows (group Q2); and 3) the third set comprised the maximum values of the laboratory variables over the same windows (group Q3).
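The three extraction strategies above amount to different aggregations over a patient’s repeated measurements. A minimal pure-Python sketch, with hypothetical patient IDs and values (the actual study extracted these with SQL against MIMIC-IV):

```python
from statistics import mean

# Hypothetical repeated lab measurements per patient within the extraction
# window; timestamps are assumed sorted, and all names/values are illustrative.
labs = {
    "pt_001": {"creatinine": [1.0, 1.3, 1.1], "hemoglobin": [13.2, 12.8]},
    "pt_002": {"creatinine": [0.9, 0.8], "hemoglobin": [14.1, 13.9, 14.3]},
}

def aggregate(labs, how):
    """Collapse repeated measurements into one value per variable:
    'first' (group Q1), 'mean' (group Q2), or 'max' (group Q3)."""
    agg = {"first": lambda v: v[0], "mean": mean, "max": max}[how]
    return {pid: {var: agg(vals) for var, vals in row.items()}
            for pid, row in labs.items()}

q1 = aggregate(labs, "first")   # first recorded value
q2 = aggregate(labs, "mean")    # average of the repeats
q3 = aggregate(labs, "max")     # maximum of the repeats
```

The same window of raw measurements thus yields three parallel datasets, differing only in the aggregation applied.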

Data analysis

First, we used R (version 4.3.2) to randomly assign the research cohort to a training group (70%) and a validation group (30%). We then built prediction models on the training cohort and validated them in the validation cohort. Because missing data are very common in the MIMIC database, a potential variable was excluded from the analysis if its missing-data rate exceeded 20%. For latent variables with missing-data rates of 20% or less, we used SPSS 27.0 to fill in missing values by multiple imputation.
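The split and missingness screen can be sketched as follows. This is illustrative only (the study used R and SPSS, not Python); the 70/30 proportions and the 20% cutoff match the text, and the toy variable below is deliberately made too sparse so the screen drops it:

```python
import random

random.seed(42)

# Hypothetical cohort: every third record is missing creatinine (34% missing).
cohort = [{"id": i, "creatinine": (None if i % 3 == 0 else 1.0 + 0.01 * i)}
          for i in range(100)]

def missing_rate(records, var):
    """Fraction of records with a missing value for `var`."""
    return sum(r[var] is None for r in records) / len(records)

# Drop variables whose missing-data rate exceeds 20%; here creatinine is
# 34% missing, so it is excluded rather than imputed.
kept_vars = [v for v in ("creatinine",) if missing_rate(cohort, v) <= 0.20]

# 70/30 random split into training and validation cohorts.
shuffled = cohort[:]
random.shuffle(shuffled)
cut = int(0.7 * len(shuffled))
train, valid = shuffled[:cut], shuffled[cut:]
```

Variables that pass the 20% screen would then go to multiple imputation rather than being dropped.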

Comparisons between groups were made using the Mann-Whitney U test, Fisher’s exact test, or χ2 test, as appropriate.

Data were combined and then screened and analyzed using Stata 18.0. Non-normally distributed continuous variables were expressed as medians (interquartile ranges), normally distributed continuous variables as means ± standard deviations, and categorical variables as percentages.

First, because of the large number of extracted variables, we used SPSS 27.0 to perform univariate analysis (P < 0.05 considered statistically significant) to screen out potential variables with significant differences, in order to avoid overfitting and improve the fit of the model. Non-normally distributed continuous variables were tested using the Mann-Whitney U test (non-parametric U test) and categorical variables using the χ2 test.
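For readers interested in the mechanics of this univariate screen, here is a small pure-Python sketch of a two-sided Mann-Whitney U test using the normal approximation (the study used SPSS; this version assumes no tied values, and the creatinine-like data are invented):

```python
import math

def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test via the normal approximation.
    Assumes no tied values; a sketch, not a replacement for SPSS."""
    n1, n2 = len(x), len(y)
    pooled = sorted(x + y)
    rank = {v: i + 1 for i, v in enumerate(pooled)}  # 1-based ranks
    r1 = sum(rank[v] for v in x)                     # rank sum of group x
    u1 = r1 - n1 * (n1 + 1) / 2                      # U statistic
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mu) / sigma
    p = math.erfc(abs(z) / math.sqrt(2))             # two-sided p-value
    return u1, p

# Illustrative values: hypothetical creatinine in AKI vs non-AKI patients.
u_stat, p_val = mann_whitney_u([1.4, 1.6, 1.5, 1.8], [0.9, 1.0, 1.1, 1.2])
```

Variables with p < 0.05 in such a screen would be carried forward as candidate predictors.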

Then, we used logistic regression analysis (SPSS 27.0), LASSO regression analysis (R version 4.3.2), and random forest (R version 4.3.2) to further analyze the screened latent variables, derive the risk factors and coefficients, and build risk prediction models based on the results.

Finally, to evaluate the discrimination of all models using consistency statistics, we calculated the AUC value, DCA decision curve, accuracy, sensitivity, and Youden index of each prediction model, as well as the integrated discrimination improvement (IDI), and selected the best prediction model by comparing these statistics.

We plotted the calibration curves of the predictive models using the validation group and assessed the calibration ability of the predictive models by internal validation. The nomograms were plotted according to the weights of the variables in the best predictive model for clinical use.

Results

The number of patients in the MIMIC-IV database who underwent elective coronary artery bypass grafting (CABG), valve surgery, or both totaled 15,459, of whom 12,532 were admitted to the ICU. After applying the screening criteria of this study, a total of 1,304 patients were included (Fig 1).

Fig 1. Flowchart of cohort screening. ICU, intensive care unit; MIMIC-IV, Medical Information Mart for Intensive Care IV; CABG, coronary artery bypass grafting; AKI, acute kidney injury.

https://doi.org/10.1371/journal.pone.0325151.g001

Patient baseline characteristics

The detailed characteristics of all included patients are listed in Table 1. The training cohort consisted of 916 patients, and the validation cohort consisted of 388 patients. The comparison of the median and quartiles of preoperative vital signs, preoperative and postoperative laboratory indices of patients in the training cohort and the validation cohort, as well as the percentage of patients with multiple complications among the total number of patients, revealed little variability. No statistically significant difference was noted between the training and validation cohorts (S1 Table).

Table 1. Characterization of patients in the training and validation cohort at baseline.

https://doi.org/10.1371/journal.pone.0325151.t001

Table 1 shows that 719 (78.5%) patients met the KDIGO criteria for AKI in the training cohort, while 309 (79.6%) patients met the criteria for AKI in the validation cohort. The median ages of these patients were 69.8 years (IQR: 61.9, 77.1) and 69.7 years (IQR: 61.4, 77.4), respectively. The in-hospital mortality rates for all included patients were 0.8% (7/916) and 1.0% (4/388), and the numbers of days in the ICU were 2.0 days (IQR: 1.2, 3.1) and 2.0 days (IQR: 1.2, 3.2), respectively.

Construction of models for predicting the risk of AKI occurrence

First, univariate analysis was carried out for laboratory indicators with less than 20% missing values to screen for significant independent risk factors (p < 0.05) that could serve as potential predictor variables. Then, we performed the following three regression analyses on the screened potential predictor variables to construct prediction models.

  1. Multivariate logistic regression analysis

Multivariate logistic regression analysis was performed on the potential predictors screened from each of the three datasets by univariate analysis. The potential predictor variables for Q1 were age, weight, systolic blood pressure, mean arterial pressure, diabetes mellitus, atrial fibrillation, preoperative hemoglobin, preoperative creatinine, and postoperative neutrophil count; for Q2, age, weight, mean arterial pressure, atrial fibrillation, diabetes mellitus, preoperative hemoglobin, and postoperative neutrophil count; and for Q3, age, weight, mean arterial pressure, atrial fibrillation, preoperative hemoglobin, preoperative creatinine, preoperative glucose, postoperative neutrophil count, and postoperative international normalized ratio. Atrial fibrillation and diabetes mellitus were categorical variables; the remaining variables were continuous (Table 1). The VIFs of the latent variables ultimately included in the three models were all less than 2, indicating that collinearity among the variables was weak (S2 Table).

  2. LASSO regression analysis

To construct a LASSO regression model for predicting CSA-AKI, all potential predictors were included in the LASSO regression analysis. We selected the optimal lambda using the minimum criterion (lambda.min) and the one-standard-error criterion (lambda.1se) from ten-fold cross-validation to minimize model bias. As shown in Fig 2, six independent risk factors were screened in Q1: age, weight, mean arterial pressure, atrial fibrillation, preoperative hemoglobin, and postoperative partial pressure of oxygen; six in Q2: age, weight, mean arterial pressure, atrial fibrillation, preoperative hemoglobin, and postoperative monocyte count; and ten in Q3: age, weight, mean arterial pressure, diabetes mellitus, atrial fibrillation, preoperative hemoglobin, preoperative creatinine, preoperative glucose, postoperative monocyte count, and postoperative international normalized ratio. Similarly, we analyzed collinearity among the variables included in the LASSO models; the variance inflation factor (VIF) values for all variables were less than 5 (S3 Table).

Fig 2. A, Coefficient profile plot for Q1; B, Coefficient profile plot for Q2; C, Coefficient profile plot for Q3; D, Lasso regression variable trajectories for Q1; E, Lasso regression variable trajectories for Q2; F, Lasso regression variable trajectories for Q3.

Predictor selection by the least absolute shrinkage and selection operator (LASSO) regression method.

https://doi.org/10.1371/journal.pone.0325151.g002

  3. Random forest (RF)

RF is one of the most widely used machine learning (ML) methods, and Junlong Hu et al. concluded that the RF model has the strongest discriminative ability for AKI in critically ill children [11]. Random forests are nonparametric methods that can accommodate different types of responses, such as categorical or quantitative outcomes and survival times; moreover, they are well suited to analyzing complex data and can highlight the relevance of each predictor through so-called variable importance measures [12]. In this study, we used all significantly different latent variables for random forest analysis and constructed optimal random forest models for Q1, Q2, and Q3 based on the correlation of predictors, as shown in S5 Table. The VIFs for each variable included in the three RF models were less than 5 (S4 Table).

We summarized the results of the above regression analyses and ultimately created nine candidate CSA-AKI predictive models based on the three datasets. In addition, we included the SOFA score [13], a predictor variable in the Rapid Acute Kidney Injury Score, to predict the incidence of AKI in patients admitted to the ICU after cardiac surgery (S5 Table).

Evaluating the validity of the predictive models

In this study, a total of nine prediction models were constructed, and their ROC curves were plotted. In general, a larger area under the ROC curve (AUROC) indicates greater accuracy and better diagnostic performance of a prediction model. As shown in Table 2, the AUCs of all nine models were greater than 0.7. The AUCs (95% CIs) of the three models (logistic, LASSO, and RF) in Q1 were 0.748 (0.7109–0.7853), 0.737 (0.6998–0.7747), and 0.724 (0.6863–0.7626), respectively; in Q2, 0.753 (0.7156–0.7897), 0.748 (0.7105–0.7847), and 0.741 (0.7044–0.7781), respectively; and in Q3, 0.752 (0.7152–0.7883), 0.755 (0.7185–0.7912), and 0.738 (0.7018–0.7752), respectively. We plotted the corresponding ROC curves (Fig 3). Fig 3A compares the ROC curves of the prediction models derived from the logistic regression algorithm across the three groups (Q1, Q2, Q3); the abscissa denotes specificity and the ordinate sensitivity, and a curve positioned closer to the upper-left corner signifies stronger discriminatory capacity for the target outcome. As shown in the legend, the AUC values for devmodelQ1, devmodelQ2, and devmodelQ3 were 0.748, 0.753, and 0.752, respectively. The logistic models in the Q2 and Q3 groups exhibited higher AUCs than Q1, indicating superior diagnostic efficacy in these subgroups and suggesting that the logistic model discriminates between positive and negative outcomes more accurately in the Q2 and Q3 scenarios, offering a more robust foundation for clinical decision-making. Fig 3B illustrates the ROC curve analysis for the random forest (RF) model across Q1, Q2, and Q3.
The AUCs for devmodelQ1, devmodelQ2, and devmodelQ3 were 0.724, 0.741, and 0.738, respectively. While all three groups demonstrated AUC values exceeding 0.7, indicating baseline diagnostic utility, the RF model in Q1 yielded the lowest AUC, whereas the Q2 model showed comparatively superior performance. Clinically, these results emphasize the importance of selecting RF models tailored to specific scenarios; the Q2 model’s balanced sensitivity and specificity profile may optimize predictive accuracy in corresponding clinical contexts. Fig 3C displays the ROC curves for the LASSO model across the three groups. The AUCs for devmodelQ1, devmodelQ2, and devmodelQ3 were 0.737, 0.748, and 0.755, the latter being the highest among all models analyzed. The Q3 LASSO model, with an AUC of 0.755, demonstrated improved discrimination of diseased versus non-diseased populations compared with Q1. From a clinical research standpoint, this performance positions it as a top candidate for supporting clinical decisions, particularly in scenarios analogous to the Q3 subgroup. Notably, two of the top three models ranked by AUC were logistic models, indicating that in certain scenarios models built with traditional statistical methods can perform at least as well as newer machine learning models; when selecting an analytic approach for predictive modeling, machine learning cannot yet completely replace traditional statistical analysis.
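As an aside on interpretation, the AUROC equals the probability that a randomly chosen case is assigned a higher predicted risk than a randomly chosen control, which makes it easy to compute directly. A sketch with made-up risk scores (not the study’s predictions):

```python
def auroc(scores_pos, scores_neg):
    """AUROC as the probability that a randomly chosen positive (AKI case)
    receives a higher predicted risk than a randomly chosen negative
    (non-AKI control); ties count one half."""
    wins = sum((p > n) + 0.5 * (p == n)
               for p in scores_pos for n in scores_neg)
    return wins / (len(scores_pos) * len(scores_neg))

# Illustrative predicted risks only.
aki_scores  = [0.9, 0.8, 0.7, 0.55]
ctrl_scores = [0.6, 0.4, 0.3, 0.2]
auc = auroc(aki_scores, ctrl_scores)
```

An AUC of 0.755, as for the Q3 LASSO model, means the model ranks a random case above a random control about 75.5% of the time.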

Table 2. Predictive performance of the nine CSA-AKI prediction models in the three cohort groups.

https://doi.org/10.1371/journal.pone.0325151.t002

Fig 3. A, ROC curves for Logistic models constructed on the basis of three cohort groups; B, ROC curves for Lasso models constructed on the basis of three cohort groups; C, ROC curves for RF models constructed on the basis of three cohort groups.

RF, random forest.

https://doi.org/10.1371/journal.pone.0325151.g003

Decision curve analysis (DCA) is a simple method for evaluating clinical predictive models, diagnostic tests, and molecular markers that integrates patient or decision-maker preferences while meeting the practical needs of clinical decision-making. In the DCA curves, the grey diagonal line indicates intervention for all patients, the horizontal grey line indicates intervention for no patients, and the red, black, and green curves indicate the clinical benefit of the logistic, LASSO, and RF models, respectively (Fig 4). Combining the DCA curves of the nine prediction models (Fig 4), postoperative AKI risk assessment using the prediction models in this study yields some net benefit across most threshold probabilities.
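The quantity DCA plots at each threshold probability pt is the net benefit, NB = TP/N − (FP/N) · pt/(1 − pt). A small sketch with invented labels and risks (not study data):

```python
def net_benefit(y_true, y_prob, pt):
    """Decision-curve net benefit at threshold probability pt:
    NB = TP/N - FP/N * pt / (1 - pt). Patients whose predicted risk
    is >= pt are treated as test-positive."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= pt and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= pt and y == 0)
    return tp / n - fp / n * pt / (1 - pt)

# Illustrative outcomes and predicted risks.
y = [1, 1, 1, 0, 0, 0]
p = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
nb_model = net_benefit(y, p, 0.5)          # model-guided strategy
nb_all   = net_benefit(y, [1.0] * 6, 0.5)  # "treat all" reference line
```

A model is clinically useful at a given threshold when its curve lies above both the treat-all line and the treat-none line (net benefit 0).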

Fig 4. A, The results of the decision curve analysis for group Q1; B, The results of the decision curve analysis for group Q2; C, The results of the decision curve analysis for group Q3.

PredmodelA, Logistic model of Q1; predmodelB, LASSO model of Q1; predmodelC, RF model of Q1; predmodelD, Logistic model of Q2; predmodelE, LASSO model of Q2; predmodelF, RF model of Q2; predmodelG, Logistic model of Q3; predmodelH, LASSO model of Q3; predmodelI, RF model of Q3.

https://doi.org/10.1371/journal.pone.0325151.g004

High specificity indicates that a model accurately identifies actual negative samples, while high sensitivity indicates that it accurately identifies actual positive samples; a model with high sensitivity can minimize missed diagnoses in clinical applications. Youden’s index is a statistical metric used to assess the performance of a diagnostic test, with a larger value indicating better performance and greater fidelity of the classification model. The specificity, sensitivity, and Youden’s index of the nine predictive models constructed from the three sets of data are shown in Table 2. A comparison of the three sets of data revealed that the RF model of Q3 had the highest specificity (0.792), followed by the logistic model of Q1 (0.777); the logistic model of Q3 had the highest sensitivity (0.764), followed by the RF model of Q2 (0.734). The LASSO model of Q3 had the highest Youden index (0.403).
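These three metrics follow directly from the confusion matrix at a chosen cutoff, with Youden’s index J = sensitivity + specificity − 1. A sketch with invented labels and predictions:

```python
def confusion_metrics(y_true, y_pred):
    """Sensitivity, specificity, and Youden's index (J = Se + Sp - 1)
    for binary classification at a fixed cutoff."""
    tp = sum(1 for y, yh in zip(y_true, y_pred) if y == 1 and yh == 1)
    fn = sum(1 for y, yh in zip(y_true, y_pred) if y == 1 and yh == 0)
    tn = sum(1 for y, yh in zip(y_true, y_pred) if y == 0 and yh == 0)
    fp = sum(1 for y, yh in zip(y_true, y_pred) if y == 0 and yh == 1)
    se = tp / (tp + fn)   # sensitivity: true positives found
    sp = tn / (tn + fp)   # specificity: true negatives found
    return se, sp, se + sp - 1

# Illustrative data only.
se, sp, youden = confusion_metrics([1, 1, 1, 1, 0, 0, 0, 0],
                                   [1, 1, 1, 0, 0, 0, 1, 0])
```

Scanning all candidate cutoffs and taking the one that maximizes J is the usual way the optimal operating point on a ROC curve is chosen.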

Validation of predictive models

The top three models ranked by AUC were the LASSO model based on Q3, the logistic model based on Q2, and the logistic model based on Q3, while the model with the lowest AUC was the RF model based on Q1. The principle of the NRI and IDI is to reclassify a model’s results according to a new set of criteria, relative to a “gold standard”, and to examine how much the reclassified metrics improve. Further analysis of the IDI values showed that the Q3 LASSO model improved predictive ability by 3.12% (95% CI: 1.78% to 4.46%) compared with the Q1 random forest model; the Q2 logistic model by 3.49% (95% CI: 1.88% to 4.49%); and the Q3 logistic model by 2.88% (95% CI: 1.60% to 4.16%).
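The IDI contrast used here can be computed as the gain in mean predicted risk among events minus the gain among non-events when moving from a reference model to a new one. A sketch with hypothetical probabilities (not the study’s fitted values):

```python
from statistics import mean

def idi(y, p_old, p_new):
    """Integrated discrimination improvement: the rise in mean predicted
    risk among events minus the rise among non-events when replacing the
    reference model (p_old) with the new model (p_new)."""
    ev = [i for i, yi in enumerate(y) if yi == 1]
    ne = [i for i, yi in enumerate(y) if yi == 0]
    d_ev = mean(p_new[i] - p_old[i] for i in ev)  # events: want increase
    d_ne = mean(p_new[i] - p_old[i] for i in ne)  # non-events: want decrease
    return d_ev - d_ne

# Illustrative outcomes and predicted probabilities.
y     = [1, 1, 0, 0]
p_old = [0.70, 0.60, 0.40, 0.30]
p_new = [0.80, 0.65, 0.35, 0.30]
improvement = idi(y, p_old, p_new)
```

A positive IDI, like the 2.88–3.49% gains reported above, means the new model pushes predicted risks in the right direction for both groups on average.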

In the calibration curve, the vertical coordinate represents the probability of actual occurrence in the study cohort, and the horizontal coordinate represents the probability estimated by the model; a well-calibrated model keeps the calibration curve as close as possible to the ideal line, indicating that the predicted probability matches the probability of actual occurrence.

We plotted the calibration curves for the LASSO model of Q3, the logistic model of Q2, and the logistic model of Q3 based on the validation set, as shown in Fig 5. Panels A, B, and C display the calibration curves of the different models. The vertical axis represents the actual probability of occurrence, and the horizontal axis the model-predicted probability. Each panel contains three curves: the grey “Ideal” line represents the ideal state in which the predicted probability aligns perfectly with the actual probability; the black “Logistic calibration” curve reflects the calibration status of the logistic regression model; and the dashed “Nonparametric” curve illustrates the relationship between predictions and actual outcomes from a non-parametric perspective. Statistics such as Dxy, C (ROC), and R2 listed in the figure quantify the models’ performance. As shown in the figure, compared with the Q2 logistic model, the probabilities estimated by the LASSO and logistic models built on Q3 conform more closely and are more consistent with the actual values. Although the sensitivity and Youden’s index of the Q3 logistic model are slightly lower than those of the Q3 LASSO model, its calibration curve is closer to the ideal line, and the Q3 logistic model is simpler and easier to use clinically.
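A calibration curve of this kind is built by binning patients on predicted risk and comparing the mean predicted probability with the observed event rate in each bin. A minimal sketch with invented data (the study plotted its curves in R; this pure-Python version is only illustrative):

```python
def calibration_points(y_true, y_prob, n_bins=5):
    """Group patients into equal-width bins of predicted risk and return
    (mean predicted probability, observed event rate) per non-empty bin;
    these pairs are the points a calibration curve plots against the
    ideal 45-degree line."""
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        idx = min(int(p * n_bins), n_bins - 1)  # clamp p == 1.0 into last bin
        bins[idx].append((y, p))
    pts = []
    for b in bins:
        if b:
            pred = sum(p for _, p in b) / len(b)  # mean predicted risk
            obs = sum(y for y, _ in b) / len(b)   # observed event rate
            pts.append((pred, obs))
    return pts

# Illustrative outcomes and predicted risks.
pts = calibration_points([0, 0, 1, 1, 1, 1],
                         [0.1, 0.15, 0.55, 0.6, 0.9, 0.95], n_bins=5)
```

Points lying near the diagonal indicate that predicted risks match observed frequencies, which is what the “closer to the ideal line” comparison above formalizes.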

Fig 5. A, Calibration curves for Q2’s logistic model; B, Calibration curves for Q3’s logistic model; C, Calibration curves for Q3’s LASSO model.

Calibration plots illustrate the relationship between the predicted AKI risk according to the models and the actual occurrence of AKI in the validation data. Curves along the 45° line indicate good calibration: the closer the predicted probability is to the actual outcome, the better the dashed line fits the solid line, indicating a better predictive model.

https://doi.org/10.1371/journal.pone.0325151.g005

Summarizing the above consistency test indices, the logistic model based on the Q3 data not only has a high AUC value (0.752) among the nine prediction models but also ranks in the top three in sensitivity and Youden’s index. Combined with Fig 5, the logistic model based on the Q3 data is simpler and easier to operate than the other prediction models and conforms more closely and consistently to the actual values, and, as shown in Fig 4, it also offers some net clinical benefit in actual clinical use. Therefore, we finally selected the logistic model based on the Q3 data as the prediction model.

Nomogram establishment

We plotted a nomogram based on the regression coefficients of each variable in the Q3 logistic prediction model, as shown in Fig 6. The score for each variable is read from its specific value, and the scores are summed to obtain a total score, which the probability axis of the nomogram converts into the probability of the patient developing CSA-AKI. To illustrate its application, we randomly selected a patient and plotted their clinical parameters on the nomogram to compute the total score and the corresponding CSA-AKI risk. The patient’s preoperative data included a glucose level of 174, creatinine of 1.2, and hemoglobin of 14.3 within 48 hours before surgery, along with baseline characteristics: a history of atrial fibrillation, mean arterial pressure (MBP) of 68.64, weight of 69.85, age of 70.28, and a SOFA score of 5. Postoperatively, their INR was 1.5 and their monocyte count 0.64 within 24 hours. Summing these factor scores yielded a total of 0.178, translating to an approximately 82.6% risk of developing CSA-AKI.
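Under the hood, the nomogram’s probability axis simply applies the inverse-logit to the logistic model’s linear predictor. A sketch with placeholder intercept and coefficients (these are not the fitted Q3 model’s values):

```python
import math

def logistic_risk(intercept, coefs, values):
    """Convert a logistic model's linear predictor into a predicted
    probability, mirroring what the nomogram's probability axis does:
    p = 1 / (1 + exp(-(b0 + sum(b_i * x_i)))). Coefficients here are
    placeholders, not the study's fitted Q3 model."""
    lp = intercept + sum(b * x for b, x in zip(coefs, values))
    return 1.0 / (1.0 + math.exp(-lp))

# Hypothetical coefficients for age, glucose, and atrial fibrillation (0/1),
# applied to one illustrative patient's covariate values.
risk = logistic_risk(-4.0,
                     [0.03, 0.02, 1.1],
                     [70.0, 174.0, 1.0])
```

The nomogram performs the same computation graphically: each axis converts a covariate value into points proportional to its coefficient, and the total-points axis maps the sum back through the inverse-logit.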

Fig 6. Nomogram to predict the incidence of postoperative AKI in patients undergoing cardiac surgery.

AKI, Acute Kidney Injury; MBP, mean arterial pressure; SOFA, Sequential Organ Failure Assessment.

https://doi.org/10.1371/journal.pone.0325151.g006

Discussion

Although the pathophysiological mechanism of AKI after cardiac surgery has not yet been fully clarified, many clinical variables and biomarkers have been shown to have early predictive value for CSA-AKI, and other authors have applied them to construct prediction models.

Reported potential predictors include sex, age, height, weight, BMI, body temperature, heart rate, respiratory rate, systolic blood pressure, diastolic blood pressure, oxygen saturation, obesity, renal disease without RRT, hypovolemia, hepatorenal syndrome, diabetes mellitus, cerebrovascular disease, malignancy, prior chronic kidney disease, sepsis, shock, multiorgan failure, Charlson comorbidity index (≥2 points), LVEF ≤30%, NYHA class (>2), previous cardiac surgery [14], valve surgery and coronary artery bypass grafting, cardiopulmonary bypass use, cardiopulmonary bypass duration, mechanical ventilation [15], dialysis, red blood cell transfusion [16], HDL cholesterol concentration [17], proteinuria [18], preoperative glomerular filtration rate [19], urinary liver-type fatty acid binding protein (L-FABP), urinary neutrophil gelatinase-associated lipocalin, serum L-FABP, urinary interleukin-18, urinary kidney injury molecule-1 [20], neutrophil-lymphocyte ratio [21], white blood cell count, hemoglobin, hematocrit, platelet count, blood creatinine, blood urea nitrogen, bicarbonate, and serum potassium [22].

Based on the time at which the prediction model is assessed, the identified models for AKI risk assessment after cardiac surgery can be broadly categorized into preoperative, intraoperative, and postoperative models. AKI itself can develop in the preoperative, intraoperative, or postoperative phase, or in more than one of these phases simultaneously [16,23–25]. In our study, a prediction model for cardiac surgery-associated AKI was developed and validated using routine laboratory indices from 48 hours before to 24 hours after surgery, together with patients' clinical data and scores. The aim was to provide an early risk assessment for patients undergoing cardiac surgery, without additional invasive procedures and without increasing surgical risk or hospitalization costs, using available preoperative and postoperative laboratory indicators, clinical scores, and baseline data, so as to enable timely intervention, shorten the length of stay, and reduce total hospitalization costs. However, patients undergoing cardiac surgery usually have more than one set of routine laboratory results in the 48 hours before surgery, and surgical patients admitted to the ICU often have multiple laboratory panels repeated in the 24 hours after surgery to assess changes in their condition. Our search of the MIMIC-IV database confirmed that each enrolled patient usually had more than one laboratory record meeting the screening time criteria (48 hours preoperatively and up to 24 hours postoperatively). A review of the literature showed that the first preoperative, intraoperative, and postoperative laboratory values are generally used to establish CSA-AKI prediction models [7,26]. Other studies using the MIMIC database have built predictive models from the average values of laboratory indicators [8]. Compared with the published literature, we added groupings based on the maximum values of laboratory indicators in the 48 hours before and 24 hours after surgery, and we used three statistical methods to establish prediction models for comparative analysis, which reduces the chance of spurious results and improves the accuracy and credibility of the findings.

In this study, we visually demonstrate the diagnostic performance of logistic regression, random forest, and LASSO models across different cohorts through ROC curves. As a gold-standard metric in ROC analysis, the area under the curve (AUC) quantifies model accuracy by summarizing the trade-off between sensitivity and specificity. The findings validate several core principles in medical ROC analysis: 1) higher AUC values—such as those observed in the Q3 LASSO model and Q2/Q3 logistic regression models—correspond to superior diagnostic utility; 2) model performance can vary significantly across subgroups, necessitating tailored model selection; and 3) traditional statistical methods (e.g., logistic regression) can rival modern machine learning approaches in specific clinical scenarios. These insights not only provide guidance for researchers to select the most appropriate predictive models but also highlight the importance of integrating ROC results with the clinical and biological characteristics of the studied populations.

As depicted in Fig 5, Panel A illustrates the calibration curve of the Q2 logistic model. With an AUC of 0.692, this model exhibits moderate discriminatory capacity for the target event (AKI). Panel B shows the Q3 logistic model, characterized by an AUC of 0.722 and a Dxy of 0.443, metrics indicative of enhanced discrimination between positive and negative samples. Panel C presents the Q3 LASSO model, with an Eavg of 0.026, suggesting minimal overall prediction bias. On comparative analysis of the calibration curves of these three models, the curve of the Q3 logistic model lies closest to the ideal line, signifying the highest consistency between predicted and observed probabilities. Consequently, the Q3 logistic model not only offers superior predictive reliability but also provides a more robust foundation for risk assessment, enabling effective clinical risk stratification for target events such as AKI.
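The calibration analysis behind plots like those in Fig 5 can be sketched as follows, assuming synthetic predicted risks in place of the validation cohort. Note that `calibration_curve` is a binned approximation; the Eavg reported in the text is the analogous mean absolute calibration error (computed in packages such as `rms` over individual patients rather than bins).

```python
import numpy as np
from sklearn.calibration import calibration_curve

# Synthetic predicted risks and outcomes, standing in for the validation cohort.
rng = np.random.default_rng(1)
p_pred = rng.uniform(0, 1, 1000)
y_obs = rng.binomial(1, p_pred)          # outcomes drawn from the predicted risks

# Fraction of observed events vs. mean predicted risk in each probability bin;
# a well-calibrated model tracks the 45-degree line.
frac_pos, mean_pred = calibration_curve(y_obs, p_pred, n_bins=10)

# An Eavg-style summary: mean absolute gap between observed and predicted risk
e_avg = float(np.abs(frac_pos - mean_pred).mean())
print(f"mean |observed - predicted| across bins: {e_avg:.3f}")
```

Plotting `mean_pred` against `frac_pos` alongside the identity line reproduces the dashed-versus-solid comparison described for the calibration plots.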

Multivariate logistic regression (LR) models are well suited to data with a clear linear relationship and a high demand for interpretability. For instance, when predicting the relationship between disease occurrence and known risk factors, logistic regression can clearly explain the effect of each factor on the probability of disease [27,28]. LASSO regression is appropriate for high-dimensional data with feature redundancy, particularly when feature selection is needed to improve generalization; in gene expression analysis, for example, LASSO can identify key disease-associated genes by shrinking the coefficients of irrelevant genes to zero [29]. Random forest (RF) models generally perform better on complex nonlinear problems, especially with large datasets where accuracy is the priority; in fields such as image and speech recognition, RF can handle highly complex feature relationships with high accuracy [26,30]. Machine learning methods such as RF can also rank the importance of the variables in a model. Furthermore, Po-Yu Tseng's study showed that machine learning models outperform traditional regression models in predicting CSA-AKI with small sample sizes [31]. However, a comparison of the models constructed in this study with three different methods showed that logistic regression, a traditional regression analysis method, outperformed machine learning methods such as RF. In Penghua Hu's study, the final predictive model was likewise constructed by multivariable logistic regression analysis. We therefore believe that the traditional logistic regression model remains superior for cases involving many candidate variables and samples.
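The three-way comparison described above can be sketched with scikit-learn on synthetic data mirroring the cohort's size and class balance (1,304 patients, roughly 79% AKI). This is an illustrative sketch only: the study's actual fitting pipeline is not shown here, and the LASSO step is approximated by L1-penalized logistic regression with cross-validated penalty strength.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression, LogisticRegressionCV
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the perioperative dataset (the real cohort is MIMIC-IV).
X, y = make_classification(n_samples=1304, n_features=20, n_informative=8,
                           weights=[0.21, 0.79], random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                          stratify=y, random_state=42)

models = {
    "logistic": LogisticRegression(max_iter=1000),
    # L1 penalty gives LASSO-style coefficient shrinkage / feature selection
    "lasso": LogisticRegressionCV(penalty="l1", solver="liblinear",
                                  Cs=10, cv=5, max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=300, random_state=42),
}
for name, m in models.items():
    m.fit(X_tr, y_tr)
    auc = roc_auc_score(y_te, m.predict_proba(X_te)[:, 1])
    print(f"{name}: AUC = {auc:.3f}")
```

On real perioperative data, the ranking of these AUCs is exactly the comparison that led the authors to retain the logistic model; RF additionally exposes `feature_importances_` for the variable-importance ranking mentioned above.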

The final model included patient age, weight, mean arterial pressure, atrial fibrillation, preoperative hemoglobin, preoperative creatinine, preoperative glucose, postoperative monocyte count, postoperative INR, and the SOFA score. The nomogram also demonstrated a high degree of predictive ability in the validation cohort. The model is simple to apply, shows good agreement and consistency with the observed values, and offers net clinical benefit, with a high AUC value (0.752), sensitivity (0.764), and Youden's index (0.383). Clinical implementation of the nomogram involves systematically entering the validated preoperative and postoperative variables into its framework: baseline demographics (age, weight, atrial fibrillation history), hemodynamic parameters (mean arterial pressure), preoperative laboratory maxima (hemoglobin, creatinine, glucose), and postoperative laboratory maxima (monocyte count, INR). The nomogram generates a risk score that provides real-time, point-of-care decision support, allowing clinicians to stratify patients into distinct risk categories and implement immediate, risk-tailored actions such as intensified renal monitoring, preemptive nephroprotective measures, or early nephrology consultation. By objectively categorizing patients into high- and low-risk groups for CSA-AKI, the nomogram complements traditional subjective evaluation with a quantitative, evidence-based assessment, reducing reliance on anecdotal reasoning, improving risk stratification, and facilitating timely, personalized interventions to improve patient outcomes.

A comparison of the nine risk prediction models we developed shows that preoperative hemoglobin is a significant predictor of CSA-AKI. Kulier et al. suggested that a low preoperative hemoglobin level is an independent predictor of poor noncardiac outcomes, including AKI [32]. A study of 123 consecutive patients who underwent pulmonary endarterectomy (PEA) likewise demonstrated that a low preoperative hemoglobin concentration was an independent risk factor for AKI after PEA. Congya Zhang and Guyan Wang discussed the underlying mechanism: hemoglobin concentration determines arterial oxygen content, which is central to tissue oxygen supply, so a lower hemoglobin concentration reduces oxygen delivery to the kidneys, predisposing them to renal medullary injury and ultimately exacerbating postoperative AKI [33]. In another study of 920 patients who underwent cardiac surgery with cardiopulmonary bypass, the incidence of AKI increased significantly when hematocrit fell to extremely low levels (<25%) [34].

Our study also has limitations. First, we used only the creatinine criterion of KDIGO to diagnose postoperative AKI and did not include urine output, which may have underestimated the incidence of postoperative AKI. Second, the study drew only on the MIMIC-IV database, which reflects a U.S. population, so the applicability of the prediction model to other populations remains to be validated. Third, owing to limitations of the database itself, some potentially relevant variables with missing records could not be included in the statistical analysis.

Conclusion

In this study, we developed nine early prediction models for CSA-AKI using logistic regression, LASSO regression, and random forest. Comparative analysis revealed that traditional logistic regression retained notable advantages over machine learning methods such as random forest, indicating that modern techniques cannot fully replace traditional regression. Models based on maximum laboratory values outperformed those built on the other datasets. Integrating statistical and clinical considerations, we identified the logistic regression model incorporating preoperative and postoperative maxima as optimal. This model includes: 1) baseline characteristics (age, weight, mean arterial pressure, SOFA score, atrial fibrillation history); 2) preoperative 48-hour maxima of hemoglobin, creatinine, and glucose; and 3) postoperative 24-hour maxima of monocyte count and INR. Its key strength lies in synthesizing perioperative dynamic variables, capturing both baseline risks and acute intraoperative/postoperative physiological changes. This framework offers a robust tool for early CSA-AKI risk assessment.

Supporting information

S1 Table. Variability between modelling and validation groups.

https://doi.org/10.1371/journal.pone.0325151.s001

(PNG)

S2 Table. VIF values for logistic prediction models.

https://doi.org/10.1371/journal.pone.0325151.s002

(PNG)

S3 Table. VIF values for lasso prediction models.

https://doi.org/10.1371/journal.pone.0325151.s003

(PNG)

S4 Table. VIF values for random forest prediction models.

https://doi.org/10.1371/journal.pone.0325151.s004

(PNG)

S5 Table. Predictors included in the models.

https://doi.org/10.1371/journal.pone.0325151.s005

(PNG)

References

  1. Cheruku SR, Raphael J, Neyra JA, Fox AA. Acute Kidney Injury after Cardiac Surgery: Prediction, Prevention, and Management. Anesthesiology. 2023;139(6):880–98. pmid:37812758
  2. Yu Y, Li C, Zhu S, Jin L, Hu Y, Ling X, et al. Diagnosis, pathophysiology and preventive strategies for cardiac surgery-associated acute kidney injury: a narrative review. Eur J Med Res. 2023;28(1):45. pmid:36694233
  3. Abedini A, Zhu YO, Chatterjee S, Halasz G, Devalaraja-Narashimha K, Shrestha R, et al. Urinary Single-Cell Profiling Captures the Cellular Diversity of the Kidney. J Am Soc Nephrol. 2021;32(3):614–27. pmid:33531352
  4. Kannaujia AK, Gupta A, Verma S, Srivastava U, Haldar R, Jasuja S. Importance of routine laboratory investigations before elective surgery. Discoveries. 2020;8(3).
  5. Harky A, Joshi M, Gupta S, Teoh WY, Gatta F, Snosi M. Acute Kidney Injury Associated with Cardiac Surgery: a Comprehensive Literature Review. Braz J Cardiovasc Surg. 2020;35(2):211–24. pmid:32369303
  6. Adams AJ, Cahill PJ, Flynn JM, Sankar WN. Utility of Perioperative Laboratory Tests in Pediatric Patients Undergoing Spinal Fusion for Scoliosis. Spine Deform. 2019;7(6):875–82. pmid:31731997
  7. Hu F, Zhu J, Zhang S, Wang C, Zhang L, Zhou H, et al. A predictive model for the risk of sepsis within 30 days of admission in patients with traumatic brain injury in the intensive care unit: a retrospective analysis based on MIMIC-IV database. Eur J Med Res. 2023;28(1):290. pmid:37596695
  8. Jiang W, Zhang C, Yu J, Shao J, Zheng R. Development and validation of a nomogram for predicting in-hospital mortality of elderly patients with persistent sepsis-associated acute kidney injury in intensive care units: a retrospective cohort study using the MIMIC-IV database. BMJ Open. 2023;13(3):e069824. pmid:36972970
  9. Gupta M, Gallamoza B, Cutrona N, Dhakal P, Poulain R, Beheshti R. An Extensive Data Processing Pipeline for MIMIC-IV. Proc Mach Learn Res. 2022;193:311–25.
  10. Johnson AEW, Bulgarelli L, Shen L, Gayles A, Shammout A, Horng S, et al. MIMIC-IV, a freely accessible electronic health record dataset. Sci Data. 2023;10(1):1. pmid:36596836
  11. Hu J, Xu J, Li M, Jiang Z, Mao J, Feng L, et al. Identification and validation of an explainable prediction model of acute kidney injury with prognostic implications in critically ill children: a prospective multicenter cohort study. EClinicalMedicine. 2024;68:102409. pmid:38273888
  12. Hu J, Szymczak S. A review on longitudinal data analysis with random forest. Brief Bioinform. 2023;24(2):bbad002. pmid:36653905
  13. Ferrari F, Puci MV, Ferraro OE, Romero-González G, Husain-Syed F, Rizo-Topete L, et al. Development and validation of quick Acute Kidney Injury-score (q-AKI) to predict acute kidney injury at admission to a multidisciplinary intensive care unit. PLoS One. 2019;14(6):e0217424. pmid:31220087
  14. Marco PS, Nakazone MA, Maia LN, Machado MN. Cardiac Surgery-associated Acute Kidney Injury in Patients with Preserved Baseline Renal Function. Braz J Cardiovasc Surg. 2022;37(5):613–21. pmid:36346770
  15. Wang M, Yan P, Zhang NY, Deng YH, Luo XQ, Wang XF. Prediction of mortality risk after ischemic acute kidney injury with a novel prognostic model: A multivariable prediction model development and validation study. Front Med. 2022;9.
  16. Jiang W, Teng J, Xu J, Shen B, Wang Y, Fang Y, et al. Dynamic Predictive Scores for Cardiac Surgery-Associated Acute Kidney Injury. J Am Heart Assoc. 2016;5(8):e003754. pmid:27491837
  17. Zhou Y, Yang H-Y, Zhang H-L, Zhu X-J. High-density lipoprotein cholesterol concentration and acute kidney injury after noncardiac surgery. BMC Nephrol. 2020;21(1):149. pmid:32334566
  18. Nautiyal A, Sethi SK, Sharma R, Raina R, Tibrewal A, Akole R, et al. Perioperative albuminuria and clinical model to predict acute kidney injury in paediatric cardiac surgery. Pediatr Nephrol. 2022;37(4):881–90. pmid:34545446
  19. Reazaul Karim HM, Yunus M, Dey S. A retrospective comparison of preoperative estimated glomerular filtration rate as a predictor of postoperative cardiac surgery associated acute kidney injury. Ann Card Anaesth. 2020;23(1):53–8. pmid:31929248
  20. Yuan S-M. Acute Kidney Injury after Cardiac Surgery: Risk Factors and Novel Biomarkers. Braz J Cardiovasc Surg. 2019;34(3):352–60. pmid:31310475
  21. Wheatley J, Liu Z, Loth J, Plummer MP, Penny-Dimri JC, Segal R, et al. The prognostic value of elevated neutrophil-lymphocyte ratio for cardiac surgery-associated acute kidney injury: A systematic review and meta-analysis. Acta Anaesthesiol Scand. 2023;67(2):131–41. pmid:36367845
  22. Pan P, Liu Y, Xie F, Duan Z, Li L, Gu H, et al. Significance of platelets in the early warning of new-onset AKI in the ICU by using supervise learning: a retrospective analysis. Ren Fail. 2023;45(1):2194433. pmid:37013397
  23. Mehta RH, Grab JD, O'Brien SM, Bridges CR, Gammie JS, Haan CK, et al. Bedside tool for predicting the risk of postoperative dialysis in patients undergoing cardiac surgery. Circulation. 2006;114(21):2208–16. pmid:17088458
  24. Palomba H, de Castro I, Neto ALC, Lage S, Yu L. Acute kidney injury prediction following elective cardiac surgery: AKICS Score. Kidney Int. 2007;72(5):624–31. pmid:17622275
  25. Thakar CV, Arrigain S, Worley S, Yared J-P, Paganini EP. A clinical score to predict acute renal failure after cardiac surgery. J Am Soc Nephrol. 2005;16(1):162–8. pmid:15563569
  26. Demirjian S, Bashour CA, Shaw A, Schold JD, Simon J, Anthony D, et al. Predictive Accuracy of a Perioperative Laboratory Test-Based Prediction Model for Moderate to Severe Acute Kidney Injury After Cardiac Surgery. JAMA. 2022;327(10):956–64. pmid:35258532
  27. Zabor EC, Reddy CA, Tendulkar RD, Patil S. Logistic Regression in Clinical Studies. Int J Radiat Oncol Biol Phys. 2022;112(2):271–7. pmid:34416341
  28. Schober P, Vetter TR. Logistic Regression in Medical Research. Anesth Analg. 2021;132(2):365–6.
  29. Lee JH, Shi Z, Gao Z. On LASSO for predictive regression. J Econom. 2022;229(2):322–49.
  30. Yu Y, He Z, Ouyang J, Tan Y, Chen Y, Gu Y, et al. Magnetic resonance imaging radiomics predicts preoperative axillary lymph node metastasis to support surgical decisions and is associated with tumor microenvironment in invasive breast cancer: A machine learning, multicenter study. eBioMedicine. 2021;69.
  31. Tseng PY, Chen YT, Wang CH, Chiu KM, Peng YS, Hsu SP, et al. Prediction of the development of acute kidney injury following cardiac surgery by machine learning. Crit Care. 2020;24(1).
  32. Kulier A, Levin J, Moser R, Rumpold-Seitlinger G, Tudor IC, Snyder-Ramos SA, et al. Impact of Preoperative Anemia on Outcome in Patients Undergoing Coronary Artery Bypass Graft Surgery. Circulation. 2007;116(5):471–9.
  33. Zhang C, Wang G, Zhou H, Lei G, Yang L, Fang Z, et al. Preoperative platelet count, preoperative hemoglobin concentration and deep hypothermic circulatory arrest duration are risk factors for acute kidney injury after pulmonary endarterectomy: a retrospective cohort study. J Cardiothorac Surg. 2019;14(1):220. pmid:31888760
  34. Haase M, Bellomo R, Story D, Letis A, Klemz K, Matalanis G, et al. Effect of mean arterial pressure, haemoglobin and blood transfusion during cardiopulmonary bypass on post-operative acute kidney injury. Nephrol Dial Transplant. 2012;27(1):153–60. pmid:21677302