Predicting severe renal dysfunction in alcohol-associated cirrhosis: Comparative performance of liver function scores and machine learning models

  • Julian Müller-Kühnle,

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Methodology, Project administration, Resources, Software, Validation, Visualization, Writing – original draft

    julian.mueller-kuehnle@rbk.de

    Affiliations Department of General Internal Medicine and Nephrology, Robert Bosch Hospital, Stuttgart, Germany, Robert Bosch Society for Medical Research, Stuttgart, Germany, Paracelsus Medical University, Salzburg, Austria

  • Moritz Schanz,

    Roles Writing – review & editing

    Affiliation Department of General Internal Medicine and Nephrology, Robert Bosch Hospital, Stuttgart, Germany

  • Severin Schricker,

    Roles Methodology, Writing – review & editing

    Affiliation Department of General Internal Medicine and Nephrology, Robert Bosch Hospital, Stuttgart, Germany

  • Christian Benignus,

    Roles Formal analysis, Writing – review & editing

    Affiliations Paracelsus Medical University, Salzburg, Austria, Department of General, Visceral, Thoracic, and Pediatric Surgery, Ludwigsburg Hospital, Ludwigsburg, Germany

  • Julia Todoroff,

    Roles Data curation, Formal analysis, Writing – review & editing

    Affiliation Department of Hematology, Oncology, Stem Cell Transplantation, and Palliative Medicine, Klinikum Stuttgart, Stuttgart, Germany

  • Jörg Latus,

    Roles Supervision, Writing – review & editing

    Affiliation Department of General Internal Medicine and Nephrology, Robert Bosch Hospital, Stuttgart, Germany

  • Wolfram Zoller,

    Roles Supervision, Writing – review & editing

    Affiliation Department of General Internal Medicine, Gastroenterology, Gastrointestinal Oncology, Hepatology, Infectious Diseases, and Pulmonology, Klinikum Stuttgart, Stuttgart, Germany

  • Dominik Marschner

    Roles Data curation, Formal analysis, Writing – review & editing

    Affiliation Department of Hematology, Oncology, Stem Cell Transplantation, and Palliative Medicine, Klinikum Stuttgart, Stuttgart, Germany

Abstract

Background

Renal dysfunction is a frequent and clinically relevant complication of cirrhosis, yet chronic kidney disease (CKD) often remains underrecognized, particularly in non-acute settings. Early identification of at-risk patients is essential to guide timely interventions. Although MELD, Child-Pugh Score (CPS), APRI, and FIB-4 are widely used to assess hepatic disease severity, their predictive value for advanced renal dysfunction is uncertain.

Methods

In this retrospective cohort study (2014–2021, Klinikum Stuttgart), we evaluated the ability of MELD, CPS, APRI, and FIB-4 to predict severe renal dysfunction (chronic kidney disease [CKD] stage ≥ 3, according to Kidney Disease: Improving Global Outcomes [KDIGO] classification) in patients with alcoholic cirrhosis. Logistic and linear regression analyses were performed. In addition, machine learning (ML) models were trained to identify non-renal predictors of CKD stage ≥ 3.

Results

Among 131 patients (mean age 62.8 ± 11.3 years; 71% male), 33% met criteria for KDIGO stage ≥ 3. MELD was significantly associated with advanced CKD (OR = 1.379, p < 0.001), with prevalence increasing from 17% (MELD ≤ 9) to 80% (MELD ≥ 20). CPS showed an inverse association (p = 0.002), while APRI and FIB-4 were not predictive. The optimized Random Forest model, refined through ROSE oversampling and feature selection, achieved an AUC of 0.757, with 76% accuracy, 82% sensitivity (KDIGO < 3), and 63% specificity (KDIGO ≥ 3).

Conclusion

MELD was the most reliable conventional score for identifying advanced renal dysfunction in alcoholic cirrhosis. ML-based models incorporating routinely available clinical parameters further improved predictive performance and may support risk stratification in this high-risk population.

Introduction

Chronic kidney disease (CKD) is a frequent and underrecognized complication of liver cirrhosis, contributing to increased morbidity, hospitalizations, and mortality [1–4]. While renal dysfunction in acute settings – such as hepatorenal syndrome (HRS) – has been extensively studied, early detection of chronic impairment remains challenging, particularly in compensated or non-acute disease stages [5–7]. Alcohol-associated cirrhosis, which continues to represent a major global burden, is especially prone to renal complications owing to its frequent coexistence with malnutrition, systemic inflammation, and circulatory dysfunction [8–11].

Given the prognostic relevance of renal impairment, reliable risk stratification tools are needed. Widely used hepatic scores, such as the Model for End-Stage Liver Disease (MELD) and Child-Pugh Score (CPS), are applied to assess liver disease severity and prioritize transplant allocation [12–15]. MELD incorporates serum creatinine and has been linked to renal dysfunction and mortality [16,17]. In contrast, CPS and fibrosis-based indices such as the AST-to-Platelet Ratio Index (APRI) and Fibrosis-4 Score (FIB-4) – developed for staging fibrosis or predicting survival – do not include renal parameters and may be less suitable for identifying patients at risk of CKD [18–21].

Moreover, serum creatinine is an imperfect marker in cirrhosis due to confounding from muscle wasting and altered hepatic metabolism, particularly in sarcopenic or malnourished individuals [22–24]. These limitations underline the need for novel, data-driven approaches that incorporate broader clinical and biochemical features.

In this study, we systematically evaluated the performance of four conventional liver scores – MELD, CPS, APRI, and FIB-4 – for predicting advanced chronic kidney disease (CKD; stage ≥ 3 according to Kidney Disease: Improving Global Outcomes [KDIGO] classification) in patients with alcoholic cirrhosis. We also applied supervised machine learning (ML) methods to identify supplementary, non-renal predictors of renal dysfunction. While the focus is on alcohol-associated disease, the modeling approach may be transferable to other etiologies with appropriate recalibration. The overarching goal was to improve renal risk stratification using both established clinical tools and modern data analytics. By directly addressing the limited representation of renal dysfunction in widely used hepatic scores, this study aims to close an important gap in current risk stratification approaches. Specifically, we not only compare conventional scores with CKD outcomes but also evaluate novel, non-renal predictors through ML to provide a broader framework for early risk assessment.

Methods

Study design and population

This retrospective cohort study was conducted at Klinikum Stuttgart, a tertiary care center in Germany, between 2014 and 2021. The primary objective was to assess the predictive performance of four commonly used liver scores – MELD, CPS, APRI, and FIB-4 – in identifying advanced renal dysfunction, defined as chronic kidney disease (CKD) stage ≥ 3 according to KDIGO guidelines. In addition, supervised ML models were applied to identify non-traditional predictors beyond hepatic function.

Eligible participants were adults (≥18 years) with clinically and radiologically confirmed alcoholic cirrhosis and a documented history of chronic alcohol use. Endoscopic and ultrasonographic examinations were performed within one week before or after hospital admission to ensure consistent staging. Laboratory values, including serum creatinine and estimated glomerular filtration rate (eGFR), were obtained at admission.

Exclusion criteria were acute kidney injury (AKI) according to KDIGO guidelines, unstable renal function prior to baseline, and cirrhosis of non-alcoholic etiology (e.g., viral hepatitis, autoimmune, or cholestatic liver disease). Common comorbidities such as diabetes mellitus and arterial hypertension were not excluded to preserve external validity. Patients with substantial missing data were excluded unless the extent of missingness was low and could be addressed by imputation (see below).

Evaluation of renal function and hepatic disease severity

Renal function was assessed using eGFR, calculated with the Chronic Kidney Disease Epidemiology Collaboration (CKD-EPI) formula. CKD stage ≥ 3 was defined as eGFR < 60 mL/min/1.73 m² in accordance with KDIGO staging.
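As a minimal sketch of the staging rule described above, the following Python functions map an eGFR value to the KDIGO GFR categories (G1 to G5) and flag stage ≥ 3 using the < 60 mL/min/1.73 m² threshold. The function names are hypothetical. The sketch takes eGFR as given: it does not implement the CKD-EPI equation and does not check chronicity or markers of kidney damage.

```python
def kdigo_gfr_category(egfr: float) -> str:
    """Map an eGFR value (mL/min/1.73 m^2) to a KDIGO GFR category."""
    if egfr < 0:
        raise ValueError("eGFR must be non-negative")
    if egfr >= 90:
        return "G1"
    if egfr >= 60:
        return "G2"
    if egfr >= 45:
        return "G3a"
    if egfr >= 30:
        return "G3b"
    if egfr >= 15:
        return "G4"
    return "G5"


def is_ckd_stage_3_or_higher(egfr: float) -> bool:
    """Outcome definition used in this study: eGFR below 60 mL/min/1.73 m^2."""
    return egfr < 60


# Group means reported in the Results (36.6 and 98.4 mL/min), used for illustration only.
print(kdigo_gfr_category(36.6), is_ckd_stage_3_or_higher(36.6))
print(kdigo_gfr_category(98.4), is_ckd_stage_3_or_higher(98.4))
```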

Liver disease severity was evaluated using four established scoring systems: MELD, CPS, APRI, and FIB-4. These scores were selected because they are widely applied in routine clinical practice and represent complementary aspects of hepatic disease severity and fibrosis, while enabling assessment of their potential utility for predicting renal dysfunction.
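For orientation, the two fibrosis indices can be computed from routine laboratory values. The sketch below uses the commonly published definitions of APRI and FIB-4; the text does not state which upper limit of normal (ULN) for AST was applied, so the default of 40 U/L is an assumption, and the function names are hypothetical. MELD and CPS are omitted because the text does not specify the variants used.

```python
import math


def apri(ast_u_l: float, platelets_10e9_l: float, ast_uln_u_l: float = 40.0) -> float:
    """AST-to-Platelet Ratio Index: (AST / ULN) * 100 / platelets (10^9/L).

    The ULN default of 40 U/L is an assumption; laboratories differ.
    """
    return (ast_u_l / ast_uln_u_l) * 100.0 / platelets_10e9_l


def fib4(age_years: float, ast_u_l: float, alt_u_l: float, platelets_10e9_l: float) -> float:
    """Fibrosis-4 index: (age * AST) / (platelets (10^9/L) * sqrt(ALT))."""
    return (age_years * ast_u_l) / (platelets_10e9_l * math.sqrt(alt_u_l))


# Hypothetical patient values, not taken from the study data.
print(apri(ast_u_l=80, platelets_10e9_l=100))
print(fib4(age_years=60, ast_u_l=80, alt_u_l=64, platelets_10e9_l=100))
```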

Statistical analysis

Patients were stratified into two groups: KDIGO stage < 3 and KDIGO stage ≥ 3. Continuous variables were tested for normality and compared using analysis of variance (ANOVA) or Kruskal–Wallis tests, as appropriate. Categorical variables were analyzed with chi-square or Fisher’s exact tests. Correlations between liver scores and renal markers were assessed using Spearman’s rank correlation.
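Spearman's rank correlation is the Pearson correlation of the ranks, with tied values sharing their average rank. The following standard-library sketch illustrates the computation; it is not the SPSS procedure used in the study, and the example values are hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical MELD scores and serum creatinine values (mg/dL).
meld = [8, 11, 14, 19, 23]
creatinine = [0.7, 0.9, 0.8, 1.6, 2.4]
print(round(spearman(meld, creatinine), 3))
```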

Binary logistic regression was used to examine associations between liver scores and the likelihood of KDIGO stage ≥ 3. Linear regression models were applied to evaluate the relationship between liver scores and eGFR as a continuous variable. Statistical significance was defined as p ≤ 0.05.

Given the exploratory nature of this study, no adjustment for multiple comparisons was applied. In line with recommendations by Bender and Lange [39] and Streiner [40], p-values are interpreted descriptively and are not used for confirmatory inference.

Machine learning-based prediction of severe renal dysfunction

To identify non-renal predictors of KDIGO stage ≥ 3, several ML models were developed. Variables directly linked to renal function (e.g., creatinine, urea, eGFR) were excluded to avoid circular reasoning. Candidate features were selected based on clinical relevance, data availability, and prior literature. The final feature set included age, sex, body mass index (BMI), international normalized ratio (INR), alanine aminotransferase (ALT), platelet count, bilirubin, presence and grade of hepatic encephalopathy, ascites volume, and spleen diameter.

An initial Random Forest (RF) model was trained using 1,000 trees. To improve performance, a refined model with 50,000 trees was built, incorporating hyperparameter tuning (mtry = 3), class weighting, and oversampling with the Random OverSampling Examples (ROSE) method [25]. ROSE was chosen over the Synthetic Minority Oversampling Technique (SMOTE) because it natively handles mixed-type data without requiring adaptations such as SMOTE for Nominal and Continuous (SMOTE-NC) variables [26].
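The idea behind ROSE is a smoothed bootstrap: each synthetic observation is drawn from a kernel centred on a randomly chosen real observation of a class that is itself selected with equal probability. The sketch below illustrates this for continuous features only, using a Gaussian kernel and a rule-of-thumb bandwidth; both choices and all names are assumptions made for illustration. It is not the R implementation used in the study and does not handle categorical variables.

```python
import random
from statistics import stdev


def rose_oversample(rows, labels, n_out, seed=0):
    """Draw n_out synthetic rows with roughly balanced classes (continuous features only)."""
    rng = random.Random(seed)
    classes = sorted(set(labels))
    by_class = {c: [r for r, l in zip(rows, labels) if l == c] for c in classes}
    d = len(rows[0])

    # Per-class, per-feature bandwidth (assumed rule of thumb; needs >= 2 rows per class).
    bandwidth = {}
    for c, members in by_class.items():
        factor = (4 / ((d + 2) * len(members))) ** (1 / (d + 4))
        bandwidth[c] = [factor * stdev(column) for column in zip(*members)]

    new_rows, new_labels = [], []
    for _ in range(n_out):
        c = rng.choice(classes)           # each class is equally likely
        centre = rng.choice(by_class[c])  # pick a real observation of that class
        new_rows.append([rng.gauss(x, h) for x, h in zip(centre, bandwidth[c])])
        new_labels.append(c)
    return new_rows, new_labels


# Hypothetical training data: [age, BMI]; label 1 marks the minority class.
rows = [[60.0, 25.0], [55.0, 24.0], [70.0, 27.0], [62.0, 26.0],
        [58.0, 23.5], [66.0, 28.0], [75.0, 22.0], [80.0, 21.0], [78.0, 23.0]]
labels = [0, 0, 0, 0, 0, 0, 1, 1, 1]
synthetic_rows, synthetic_labels = rose_oversample(rows, labels, n_out=200, seed=1)
print(sum(synthetic_labels), "of", len(synthetic_labels), "synthetic rows are minority class")
```

Oversampling of this kind should be applied to the training set only, as described in the Results, so that the test set keeps the original class distribution.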

Model performance was evaluated using accuracy, sensitivity, specificity, balanced accuracy, area under the receiver operating characteristic curve (AUC), and Cohen’s kappa. Feature importance was quantified using mean decrease in Gini impurity.
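The threshold-based metrics listed above all follow from a 2 × 2 confusion matrix. The sketch below computes them with KDIGO < 3 treated as the positive class, matching how sensitivity and specificity are reported in the Results. The counts are hypothetical: the text does not give the confusion matrix, and these values were chosen only because they are consistent with the percentages reported for the final model.

```python
def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Accuracy, sensitivity, specificity, balanced accuracy and Cohen's kappa."""
    n = tp + fn + tn + fp
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n ** 2
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "kappa": kappa,
    }


# Hypothetical test set of 25 patients (positive class = KDIGO < 3).
for name, value in binary_metrics(tp=14, fn=3, tn=5, fp=3).items():
    print(f"{name}: {value:.3f}")
```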

Comparative XGBoost models [27] were trained using 500 and 5,000 trees with early stopping. These models, however, exhibited severe overfitting and poor test-set performance (AUC < 0.5) and were therefore excluded from final evaluation.

Missing data and imputation

Minor missingness was present for selected variables (e.g., ALT, bilirubin, spleen diameter). For ML models, missing values were imputed within KDIGO strata using the median (continuous variables) or mode (categorical variables). Conventional statistical analyses were performed on complete-case data.
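The imputation step can be sketched as follows, assuming records are held as dictionaries with `None` marking a missing value. The field names and the helper function are hypothetical, and the example data are invented.

```python
from statistics import median, mode


def impute_within_strata(records, stratum_key, continuous, categorical):
    """Return copies of records with None replaced by the stratum median or mode."""
    variables = list(continuous) + list(categorical)

    # Collect the observed (non-missing) values per stratum and variable.
    observed = {}
    for rec in records:
        per_var = observed.setdefault(rec[stratum_key], {})
        for var in variables:
            if rec[var] is not None:
                per_var.setdefault(var, []).append(rec[var])

    filled = []
    for rec in records:
        new_rec = dict(rec)  # leave the input untouched
        per_var = observed[rec[stratum_key]]
        for var in variables:
            values = per_var.get(var)
            # A variable with no observed value in the stratum stays missing.
            if new_rec[var] is None and values:
                new_rec[var] = median(values) if var in continuous else mode(values)
        filled.append(new_rec)
    return filled


records = [
    {"kdigo_ge3": False, "alt": 30.0, "sex": "m"},
    {"kdigo_ge3": False, "alt": None, "sex": "m"},
    {"kdigo_ge3": False, "alt": 50.0, "sex": None},
    {"kdigo_ge3": True, "alt": 20.0, "sex": "f"},
    {"kdigo_ge3": True, "alt": None, "sex": "f"},
]
for row in impute_within_strata(records, "kdigo_ge3", ["alt"], ["sex"]):
    print(row)
```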

Ethics statement

This study was conducted in accordance with institutional and data protection regulations. Ethical review was waived by the Ethics Committee of the University of Tübingen (Reference 607/2021BO2) in accordance with § 27 Bundesdatenschutzgesetz (BDSG) and Articles 5, 6, 9, and 89 of the General Data Protection Regulation (GDPR). Informed consent was not required due to the retrospective and anonymized study design. To ensure compliance, data extraction and analysis were performed by different individuals.

Software and tools

All classical statistical analyses were performed using the Statistical Package for the Social Sciences (SPSS) version 29.0 (IBM Corp., Armonk, NY, USA). ML modeling and receiver operating characteristic (ROC) analysis were conducted in R version 4.4.2 (R Foundation for Statistical Computing, Vienna, Austria).

Results

Study population

A total of 131 patients with alcoholic liver cirrhosis were included. The mean age was 62.8 ± 11.3 years, and 71% were male. The average body mass index (BMI) was 25.7 ± 2.6 kg/m². Ascites was present in 71% of patients, and hepatic encephalopathy in 23%. The mean portal vein velocity was 17.7 ± 6.1 cm/s, and the mean spleen diameter was 130.3 ± 25.8 mm.

Renal parameters revealed a mean serum creatinine of 1.2 ± 0.8 mg/dL, an estimated glomerular filtration rate (eGFR) of 78.6 ± 33.7 mL/min, and a mean urea level of 71.6 ± 36.5 mg/dL. According to Kidney Disease: Improving Global Outcomes (KDIGO) criteria, 43 patients (33%) met the definition of CKD stage ≥ 3.

Baseline demographics, laboratory values, and prognostic scores are summarized in Table 1. Detailed endoscopic findings, including variceal grading and intervention requirements, are provided in Supplementary Table S1.

Table 1. Clinical, laboratory, and endoscopic characteristics of the study cohort, stratified by KDIGO stage < 3 and ≥ 3.

https://doi.org/10.1371/journal.pone.0332840.t001

Comparison of KDIGO < 3 and KDIGO ≥ 3 groups

Patients with KDIGO stage ≥ 3 were significantly older than those with KDIGO < 3 (67.6 ± 10.5 vs. 60.5 ± 11.3 years; p = 0.001, φ = −0.65). They also showed higher serum creatinine (2.0 ± 0.9 vs. 0.8 ± 0.2 mg/dL; p = 0.001, φ = −2.3), lower eGFR (36.6 ± 13.6 vs. 98.4 ± 19.8 mL/min; p = 0.001, φ = 3.4), and elevated urea levels (98.2 ± 28.2 vs. 46.7 ± 26.4 mg/dL; p = 0.002, φ = −1.25).

There was a trend toward lower hemoglobin (9.8 ± 2.4 vs. 10.6 ± 2.6 g/dL; p = 0.09) and lower AST (62.9 ± 45.6 vs. 93.9 ± 88.5 U/L; p = 0.06) in the KDIGO ≥ 3 group. No significant differences were observed for platelet count, INR, ALT, or total bilirubin.

While the overall prevalence of varices did not differ significantly, Paquet grade 3 varices were exclusively observed in the KDIGO ≥ 3 group (5%; p = 0.04, φ = 0.18; Supplementary Table S1). Endoscopic interventions were also more common in this group (44% vs. 27%; p = 0.05, φ = 0.17).

MELD scores were significantly higher in patients with KDIGO ≥ 3 (14.5 ± 5.0 vs. 11.0 ± 3.5; p = 0.001, φ = −0.85), whereas CPS, APRI, and FIB-4 did not differ significantly between groups (Table 1).

Associations between liver function scores and renal dysfunction

A stepwise increase in the prevalence of KDIGO stage ≥ 3 was observed with higher MELD strata: 17% (MELD ≤ 9), 36% (MELD 10–19), and 80% (MELD ≥ 20) (p = 0.002; Table 2). Corresponding changes in renal parameters included increasing serum creatinine (0.9 ± 0.5 to 2.7 ± 1.4 mg/dL) and decreasing eGFR (88.3 ± 26.2 to 35.5 ± 35.3 mL/min) (both p < 0.001). Urea levels followed a similar pattern (p < 0.001).

Table 2. Renal function parameters across MELD and FIB-4 strata.

https://doi.org/10.1371/journal.pone.0332840.t002

No significant differences in renal markers were observed across CPS, APRI, or FIB-4 strata (all p > 0.05), although a trend toward lower eGFR with higher CPS values was noted (p = 0.05; Table 3).

Table 3. Renal function markers and KDIGO stage distribution across Child-Pugh score (CPS) and APRI strata.

https://doi.org/10.1371/journal.pone.0332840.t003

Correlation analyses

Among all liver scores, MELD showed the strongest associations with renal parameters: a positive correlation with serum creatinine (r = 0.333; p < 0.001) and a negative correlation with eGFR (r = −0.294; p = 0.001). The correlation between MELD and urea did not reach statistical significance (r = 0.282; p = 0.078). No significant correlations were found for APRI, CPS, or FIB-4 with any renal marker (Table 4).

Table 4. Correlation between liver function scores and renal function parameters (Spearman’s rank correlation).

https://doi.org/10.1371/journal.pone.0332840.t004

Regression analyses

Logistic regression for KDIGO stage ≥ 3.

MELD was significantly associated with KDIGO stage ≥ 3 (B = 0.322, SE = 0.092, Wald = 12.286, p < 0.001; OR = 1.379 [95% CI: 1.152–1.651]). CPS showed a significant inverse association (B = −0.690, SE = 0.225, Wald = 9.376, p = 0.002; OR = 0.501 [95% CI: 0.322–0.780]).

No significant associations were found for APRI (p = 0.318) or FIB-4 (p = 0.364) (Table 5).
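The odds ratios, confidence intervals, and Wald statistics above follow directly from the regression coefficients: OR = exp(B), the 95% CI is exp(B ± 1.96 × SE), and Wald = (B / SE)². The short sketch below recomputes them from the rounded coefficients reported in the text; small deviations from the published values (for example, a Wald statistic of 12.25 rather than 12.286 for MELD) are expected because the inputs are rounded. The function names are hypothetical.

```python
import math


def odds_ratio_with_ci(b: float, se: float, z: float = 1.96):
    """Return (OR, lower, upper) for a logistic regression coefficient."""
    return math.exp(b), math.exp(b - z * se), math.exp(b + z * se)


def wald_statistic(b: float, se: float) -> float:
    """Wald chi-square statistic for a single coefficient."""
    return (b / se) ** 2


# Coefficients as reported in the text (rounded to three decimals).
for score, b, se in [("MELD", 0.322, 0.092), ("CPS", -0.690, 0.225)]:
    odds_ratio, lower, upper = odds_ratio_with_ci(b, se)
    print(f"{score}: OR = {odds_ratio:.3f} "
          f"[95% CI: {lower:.3f}-{upper:.3f}], Wald = {wald_statistic(b, se):.2f}")
```

Read this way, each additional MELD point multiplies the odds of KDIGO stage ≥ 3 by about 1.38 in this model.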

Table 5. Multivariable logistic regression for the prediction of severe renal dysfunction (KDIGO stage ≥ 3).

https://doi.org/10.1371/journal.pone.0332840.t005

Linear regression for eGFR.

MELD was negatively associated with eGFR (B = −3.570, SE = 0.848, β = −0.491, t = −4.208, p < 0.001).

CPS was positively associated with eGFR (B = 5.597, SE = 2.049, β = 0.324, t = 2.732, p = 0.008).

APRI and FIB-4 were not significantly associated with eGFR (both p > 0.85) (Table 6).
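As a brief worked check, the t statistics equal B / SE, and an unstandardized coefficient can be read as the expected change in eGFR per one-point change in the score with the other scores held constant. The sketch below uses the rounded coefficients from the text; the five-point MELD difference is a hypothetical example, and the function names are illustrative only.

```python
def t_statistic(b: float, se: float) -> float:
    """t statistic of a linear regression coefficient."""
    return b / se


def expected_egfr_difference(b: float, score_difference: float) -> float:
    """Expected eGFR difference (mL/min) for a given difference in the score."""
    return b * score_difference


print(round(t_statistic(-3.570, 0.848), 2))  # MELD, reported t = -4.208
print(round(t_statistic(5.597, 2.049), 2))   # CPS, reported t = 2.732
# Hypothetical comparison: two patients whose MELD scores differ by 5 points.
print(round(expected_egfr_difference(-3.570, 5), 2))
```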

Table 6. Multivariable linear regression predicting estimated glomerular filtration rate (eGFR) from liver function scores.

https://doi.org/10.1371/journal.pone.0332840.t006

ROC analyses

ROC analysis showed that MELD had the highest discriminatory ability for KDIGO stage ≥ 3 (AUC = 0.718), followed by FIB-4 (AUC = 0.489), APRI (AUC = 0.453), and CPS (AUC = 0.357) (Fig 1).

Fig 1. Receiver operating characteristic (ROC) curves for MELD, Child-Pugh score (CPS), APRI, and FIB-4 in predicting severe renal dysfunction (KDIGO stage ≥ 3).

The figure shows ROC curves of the Model for End-stage Liver Disease (MELD; AUC = 0.72), Child-Pugh score (CPS; AUC = 0.36), aspartate aminotransferase-to-platelet ratio index (APRI; AUC = 0.45), and fibrosis-4 index (FIB-4; AUC = 0.49) for predicting severe renal dysfunction, defined as Kidney Disease: Improving Global Outcomes (KDIGO) stage ≥ 3. The diagonal dashed line represents the reference line of no discrimination.

https://doi.org/10.1371/journal.pone.0332840.g001

The optimal MELD cutoff was 13.5 (sensitivity 56%, specificity 77%, Youden index 0.325). FIB-4 yielded a cutoff of 5.62 (sensitivity 68%, specificity 57%). APRI achieved high sensitivity (92%) but moderate specificity (80%) at a cutoff of 0.79. CPS showed the lowest performance with a Youden index of 0.040 (cutoff 13.5; sensitivity 4%, specificity 100%).
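The Youden index is J = sensitivity + specificity − 1, and the optimal cutoff is the threshold that maximizes J. The sketch below shows both steps under the assumption that higher scores indicate the outcome. It scans observed score values as candidate thresholds, whereas statistical packages often report midpoints between adjacent values (such as 13.5). The example scores and labels are hypothetical.

```python
def youden_index(sensitivity: float, specificity: float) -> float:
    """Youden's J statistic."""
    return sensitivity + specificity - 1


def best_cutoff(scores, has_outcome):
    """Return (cutoff, J) maximizing J when 'score >= cutoff' predicts the outcome."""
    n_pos = sum(1 for flag in has_outcome if flag)
    n_neg = len(has_outcome) - n_pos
    best = None
    for cutoff in sorted(set(scores)):
        true_pos = sum(1 for s, flag in zip(scores, has_outcome) if flag and s >= cutoff)
        true_neg = sum(1 for s, flag in zip(scores, has_outcome) if not flag and s < cutoff)
        j = youden_index(true_pos / n_pos, true_neg / n_neg)
        if best is None or j > best[1]:
            best = (cutoff, j)
    return best


# Rounded values from the text: MELD (56%, 77%) and CPS (4%, 100%).
print(round(youden_index(0.56, 0.77), 2))
print(round(youden_index(0.04, 1.00), 2))

# Hypothetical MELD scores with KDIGO >= 3 status.
print(best_cutoff([8, 9, 12, 14, 20, 22], [False, False, False, True, True, True]))
```

With the rounded percentages, J for MELD evaluates to 0.33, in line with the reported 0.325.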

Performance of machine learning models for CKD classification

The initial Random Forest model [28], trained with 1,000 trees, showed limited performance with an out-of-bag (OOB) error of 30%. Accuracy was 64%, and the AUC was 0.588. Specificity for KDIGO stage ≥ 3 was low (25%).

Model optimization with 50,000 trees, class weighting, and tuning of the mtry parameter improved performance: accuracy increased to 76%, specificity to 50%, and AUC to 0.802. Agreement between predicted and observed classes was moderate (Cohen’s kappa, κ = 0.41).

Applying the ROSE method [25] to balance the class distribution in the training set further improved specificity for KDIGO stage ≥ 3 to 75%, at the cost of lower test accuracy (64%) and a lower AUC (0.728).

In contrast, XGBoost models [29] trained with 500 and 5,000 trees demonstrated severe overfitting, with AUC values of 1.00 in training but only 0.490 and 0.438 in testing. Early stopping did not prevent overfitting, and specificity for KDIGO stage ≥ 3 remained low (11%), limiting their clinical utility.

The final RF model, trained with ROSE oversampling and refined feature selection, achieved 76% accuracy, 82% sensitivity for KDIGO < 3, 63% specificity for KDIGO ≥ 3, an AUC of 0.757, balanced accuracy of 72%, and moderate agreement (κ = 0.45) (Fig 2).

Fig 2. Performance of the final optimized Random Forest (RF) model for predicting severe renal dysfunction (KDIGO stage ≥ 3) in alcoholic liver cirrhosis.

The ROC curve of the final RF model shows an AUC of 0.76 for predicting severe renal dysfunction, defined as KDIGO stage ≥ 3. The diagonal grey line represents the reference line of no discrimination.

https://doi.org/10.1371/journal.pone.0332840.g002

The most influential predictors in the final model were ALT, platelet count, age, INR, and BMI (Fig 3).

Fig 3. Top predictive variables for KDIGO stage ≥ 3 identified by the final Random Forest model.

Variable importance plots of the final Random Forest (RF) model showing the most predictive variables for severe renal dysfunction, defined as KDIGO stage ≥ 3. Importance is displayed as mean decrease in accuracy (left panel) and mean decrease in Gini index (right panel). Variables include age, glutamate pyruvate transaminase (GPT; alanine aminotransferase, ALT), hemoglobin (Hb), body mass index (BMI), spleen diameter, sex, bilirubin, platelet count, international normalized ratio (INR), ascites volume, and hepatic encephalopathy.

https://doi.org/10.1371/journal.pone.0332840.g003

Discussion

This study systematically evaluated the predictive utility of four widely used liver function scores – MELD, CPS, APRI, and FIB-4 – for identifying advanced renal dysfunction (KDIGO stage ≥ 3) in patients with alcoholic cirrhosis. In addition, supervised ML methods were applied to identify non-renal predictors that may enhance risk stratification. Our analysis yielded several novel observations: (i) an inverse association between CPS and CKD, (ii) minimal predictive value of fibrosis-based scores (APRI, FIB-4), and (iii) overfitting of XGBoost models compared with the more robust Random Forest approach. Together, these findings highlight the limitations of conventional scoring systems and the potential of ML-based models to capture broader determinants of renal dysfunction.

Among conventional scores, MELD was the strongest predictor of CKD stage ≥ 3, with significant associations in both logistic and linear regression models and an AUC of 0.718. The prevalence of CKD increased from 17% in patients with MELD ≤ 9 to 80% in those with MELD ≥ 20, underscoring its prognostic relevance in advanced disease. These results align with prior studies linking MELD to renal dysfunction and mortality, likely due to its inclusion of serum creatinine as a core variable [18,30–34]. However, MELD demonstrated only moderate sensitivity (56%) at the optimal cutoff, suggesting a role primarily in identifying patients who may benefit from nephrology consultation rather than in early CKD screening.

The reliance of MELD on serum creatinine introduces notable limitations, particularly in cirrhosis where sarcopenia and malnutrition can lead to falsely low creatinine and overestimated GFR [35,36]. Gender-related muscle mass differences may further affect accuracy [37,38]. Future approaches may benefit from incorporating alternative renal markers such as cystatin C or inflammatory parameters (e.g., IL-6, CRP) [39,40], and from using eGFR-based rather than creatinine-based thresholds.

Fibrosis-based scores (APRI, FIB-4) showed no significant association with renal parameters. Their low AUCs (0.453 and 0.489, respectively) and absence of meaningful correlations support the notion that CKD in cirrhosis is driven less by hepatic fibrosis and more by systemic and hemodynamic alterations [41]. Although FIB-4 has been associated with renal outcomes in metabolic or non-cirrhotic populations [19,42,43], these discrepancies likely reflect distinct pathophysiologic drivers in alcohol-associated cirrhosis, including toxic-metabolic, hemodynamic, and inflammatory factors.

The inverse association between CPS and CKD stage ≥ 3 was unexpected. CPS relies on subjective or indirect clinical variables (e.g., ascites, encephalopathy) and does not include renal markers. Patients with higher CPS may have received more intensive inpatient therapy (e.g., fluid resuscitation), potentially improving renal markers at the time of measurement. In addition, sarcopenia-related overestimation of GFR may have biased this association [35,36]. These factors illustrate the limited utility of CPS for renal risk stratification, despite its widespread use in hepatic staging [44].

ML-based models identified non-traditional predictors of renal dysfunction, including ALT (GPT), platelet count, INR, BMI, and age. The optimized Random Forest model, refined by ROSE oversampling and feature selection, achieved an AUC of 0.757 with balanced accuracy of 72%. Its performance exceeded that of conventional scores and illustrates the capacity of ML to capture multifactorial predictors of CKD. Notably, none of the top predictors were renal-specific. ALT may indicate hepatocellular injury or catabolic activity contributing to hepatorenal dysfunction [1,45], while platelet count and INR are established markers of portal hypertension and coagulopathy [46,47]. Age and BMI represent general CKD risk factors [48,49]. These findings support the integration of broader clinical markers into renal risk models for cirrhosis [6,50]. Inflammatory markers (e.g., CRP, IL-6) and renal hormones (e.g., aldosterone, renin) were not available in this retrospective dataset but may provide additional discriminatory value. Future studies should assess whether such biomarkers can further refine prediction models [39,40].

XGBoost models, despite theoretical advantages, suffered from overfitting and poor test performance (AUCs < 0.50). This highlights the importance of algorithm selection, particularly in smaller datasets, and supports the relative robustness of Random Forest in this context.

From a clinical perspective, one in three patients had CKD stage ≥ 3, emphasizing the need for systematic renal risk assessment in alcoholic cirrhosis. These patients were significantly older, had higher serum creatinine and urea levels, and more often required endoscopic therapy, consistent with prior studies linking renal dysfunction and portal hypertensive complications [6,51–53].

While our Random Forest model performed well internally, its generalizability remains to be tested. External validation in independent cohorts is essential, and transferability to other cirrhosis etiologies (e.g., viral hepatitis, NAFLD) will likely require recalibration. Importantly, the model excluded serum creatinine and GFR to avoid circularity, providing an unbiased stratification framework. Future studies should prospectively validate such models and explore integration into electronic health records or clinical decision support systems.

In summary, MELD remains the most robust conventional score for renal risk stratification in alcoholic cirrhosis. However, ML-based approaches incorporating systemic and hepatic variables offer improved diagnostic value and may enable earlier identification of at-risk individuals. Their clinical implementation will require external validation and prospective evaluation.

Limitations

This study has several limitations. First, its retrospective single-center design may limit generalizability, particularly to non-alcoholic etiologies of cirrhosis and more diverse patient populations. Second, renal function was assessed using serum creatinine, which can underestimate impairment in patients with sarcopenia, malnutrition, or in women due to reduced muscle mass. Third, potentially relevant biomarkers – such as inflammatory markers (CRP, IL-6) and renal hormones (aldosterone, renin) – were not available in this retrospective dataset. Their absence likely restricted the scope of the models, and future prospective studies should explicitly incorporate these variables to improve predictive accuracy. Fourth, although the Random Forest model demonstrated promising internal performance, external validation in independent cohorts is required to confirm its robustness and applicability. Finally, long-term renal and survival outcomes were not assessed, which precludes prognostic interpretation beyond cross-sectional associations.

Conclusion

Among conventional liver function scores, MELD showed the highest accuracy for identifying severe renal dysfunction (KDIGO stage ≥ 3) in patients with alcoholic cirrhosis. In contrast, CPS was inversely associated with renal dysfunction, and fibrosis-based scores (APRI, FIB-4) had no meaningful predictive value in this setting. A Random Forest–based machine learning model improved diagnostic performance by incorporating additional clinical features – such as ALT (GPT), platelet count, INR, BMI, and age – that are not represented in standard hepatic scores. These results support the notion that renal dysfunction in cirrhosis is influenced by systemic factors beyond liver disease severity.

Although the model performed well internally, external validation in independent and etiologically diverse cohorts is required before clinical application. Future studies should assess whether integrating novel biomarkers, including inflammatory markers (CRP, IL-6) and renal hormones (aldosterone, renin), can further enhance prediction and refine clinical decision-making.

Supporting information

S1 Table. Endoscopic findings and intervention requirements in the study cohort, stratified by KDIGO stage < 3 and ≥ 3.

https://doi.org/10.1371/journal.pone.0332840.s001

(DOCX)

Acknowledgments

The authors thank all clinical and scientific collaborators for their valuable contributions and the constructive interdisciplinary support during the course of this project.

References

  1. 1. Adebayo D, Wong F. Pathophysiology of Hepatorenal Syndrome - Acute Kidney Injury. Clin Gastroenterol Hepatol. 2023;21(10S):S1–10. pmid:37625861
  2. 2. GBD 2017 Cirrhosis Collaborators. The global, regional, and national burden of cirrhosis by cause in 195 countries and territories, 1990-2017: a systematic analysis for the Global Burden of Disease Study 2017. Lancet Gastroenterol Hepatol. 2020;5(3):245–66. pmid:31981519
  3. 3. Lozano R, Naghavi M, Foreman K, Lim S, Shibuya K, Aboyans V, et al. Global and regional mortality from 235 causes of death for 20 age groups in 1990 and 2010: a systematic analysis for the Global Burden of Disease Study 2010. Lancet. 2012;380(9859):2095–128. pmid:23245604
  4. 4. Williams R, Aspinall R, Bellis M, Camps-Walsh G, Cramp M, Dhawan A, et al. Addressing liver disease in the UK: a blueprint for attaining excellence in health care and reducing premature mortality from lifestyle issues of excess consumption of alcohol, obesity, and viral hepatitis. Lancet. 2014;384(9958):1953–97. pmid:25433429
  5. 5. D’Amico G, Garcia-Tsao G, Pagliaro L. Natural history and prognostic indicators of survival in cirrhosis: a systematic review of 118 studies. J Hepatol. 2006;44(1):217–31. pmid:16298014
  6. 6. Moreau R, Jalan R, Gines P, Pavesi M, Angeli P, Cordoba J, et al. Acute-on-chronic liver failure is a distinct syndrome that develops in patients with acute decompensation of cirrhosis. Gastroenterology. 2013;144(7):1426–37, 1437.e1-9. pmid:23474284
  7. 7. Wong F, Nadim MK, Kellum JA, Salerno F, Bellomo R, Gerbes A, et al. Working Party proposal for a revised classification system of renal dysfunction in patients with cirrhosis. Gut. 2011;60(5):702–9. pmid:21325171
  8. 8. Piano S, Tonon M, Angeli P. Management of ascites and hepatorenal syndrome. Hepatol Int. 2018;12(Suppl 1):122–34. pmid:28836115
  9. 9. Wong F. Management of refractory ascites. Clin Mol Hepatol. 2023;29(1):16–32.
  10. 10. Garcia-Tsao G. Current management of the complications of cirrhosis and portal hypertension: variceal hemorrhage, ascites, and spontaneous bacterial peritonitis. Gastroenterology. 2001;120(3):726–48. pmid:11179247
  11. 11. Ginès P, Cárdenas A. The Management of Ascites and Hyponatremia in Cirrhosis. Semin Liver Dis. 2008;28(1):43–58.
  12. 12. Schmoyer CJ, Kumar D, Gupta G, Sterling RK. Diagnostic accuracy of noninvasive tests to detect advanced hepatic fibrosis in patients with hepatitis C and end-stage renal disease. Clin Gastroenterol Hepatol. 2020;18(10):2332–9.e1.
  13. 13. Torres L, Schuch A, Longo L, Valentini BB, Galvão GS, Luchese E, et al. New FIB-4 and NFS cutoffs to guide sequential non-invasive assessment of liver fibrosis by magnetic resonance elastography in NAFLD. Ann Hepatol. 2023;28(1):100774. pmid:36280013
  14. 14. Tsoris A, Marlar CA. Use of the Child Pugh score in liver disease. StatPearls. Treasure Island (FL): StatPearls Publishing. 2023. https://www.ncbi.nlm.nih.gov/books/NBK542308/
  15. 15. Oikonomou T, Chrysavgis L, Kiapidou S, Adamantou M, Parastatidou D, Papatheodoridis GV, et al. Aspartate aminotransferase-to-platelet ratio index can predict the outcome in patients with stable decompensated cirrhosis. Ann Gastroenterol. 2023;36(4):442–8. pmid:37395998
  16. 16. Emenena I, Emenena B, Kweki AG, Aiwuyo HO, Osarenkhoe JO, Iloeje UN, et al. Model for End Stage Liver Disease (MELD) Score: A Tool for Prognosis and Prediction of Mortality in Patients With Decompensated Liver Cirrhosis. Cureus. 2023;15(5):e39267. pmid:37342753
  17. 17. Biegus J, Zymliński R, Sokolski M, Siwołowski P, Gajewski P, Nawrocka-Millward S, et al. Impaired hepato-renal function defined by the MELD XI score as prognosticator in acute heart failure. Eur J Heart Fail. 2016;18(12):1518–21.
  18. 18. Kamath PS, Wiesner RH, Malinchoc M, Kremers W, Therneau TM, Kosberg CL, et al. A model to predict survival in patients with end-stage liver disease. Hepatology. 2001;33(2):464–70. https://doi.org/10.1053/jhep.2001.22172
  19. 19. Schleicher EM, Gairing SJ, Galle PR, Weinmann-Menke J, Schattenberg JM, Kostev K, et al. A higher FIB-4 index is associated with an increased incidence of renal failure in the general population. Hepatol Commun. 2022;6(12):3505–14. pmid:36194174
  20. 20. Kotoku K, Michishita R, Matsuda T, Kawakami S, Morito N, Uehara Y, et al. The Association between Decreased Kidney Function and FIB-4 Index Value, as Indirect Liver Fibrosis Indicator, in Middle-Aged and Older Subjects. Int J Environ Res Public Health. 2021;18(13):6980.
  21. 21. Peng Y, Qi X, Guo X. Child-Pugh Versus MELD Score for the Assessment of Prognosis in Liver Cirrhosis: A Systematic Review and Meta-Analysis of Observational Studies. Medicine (Baltimore). 2016;95(8):e2877. pmid:26937922
  22. 22. Sinclair M, Gow PJ, Grossmann M, Angus PW. Review article: sarcopenia in cirrhosis--aetiology, implications and potential therapeutic interventions. Aliment Pharmacol Ther. 2016;43(7):765–77. pmid:26847265
  23. 23. Pirlich M, Schütz T, Spachos T, Ertl S, Weiss ML, Lochs H, et al. Bioelectrical impedance analysis is a useful bedside technique to assess malnutrition in cirrhotic patients with and without ascites. Hepatology. 2000;32(6):1208–15. pmid:11093726
  24. 24. Dietrich CG, Götze O, Geier A. Molecular changes in hepatic metabolism and transport in cirrhosis and their functional importance. World J Gastroenterol. 2016;22(1):72–88. pmid:26755861
  25. 25. Lunardon N, Menardi G, Torelli N. ROSE: A package for binary imbalanced learning. R J. 2014;6(1):79–89.
  26. 26. Wongvorachan T, He S, Bulut O. A Comparison of Undersampling, Oversampling, and SMOTE Methods for Dealing with Imbalanced Classification in Educational Data Mining. Information. 2023;14(1):54.
  27. 27. Wiens M, Verone‐Boyle A, Henscheid N, Podichetty JT, Burton J. A Tutorial and Use Case Example of the eXtreme Gradient Boosting (XGBoost) Artificial Intelligence Algorithm for Drug Development Applications. Clinical Translational Sci. 2025;18(3).
  28. 28. Breiman L. Random Forests. Machine Learning. 2001;45(1):5–32.
  29. 29. Chen T, Guestrin C. XGBoost: A scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; 2016 Aug 13-17. San Francisco, CA, USA. New York, NY: Association for Computing Machinery; 2016. p. 785–94.
  30. 30. Bittermann T, Hubbard RA, Lewis JD, Goldberg DS. The proportion of Model for End-stage Liver Disease Sodium score attributable to creatinine independently predicts post-transplant survival and renal complications. Clin Transplant. 2020;34(3):e13817. pmid:32027405
  31. 31. Lins PRG, Narciso RC, Ferraz LR, Pereira VG, Ferraz-Neto B-H, De Almeida MD, et al. Modelling kidney outcomes based on MELD eras - impact of MELD score in renal endpoints after liver transplantation. BMC Nephrol. 2022;23(1).
  32. 32. Sibulesky L, Leca N, Blosser C, Rahnemai-Azar AA, Bhattacharya R, Reyes J. Is MELD score failing patients with liver disease and hepatorenal syndrome? World J Hepatol. 2016;8(27):1155–6. pmid:27721921
  33. 33. Sharma P, Schaubel DE, Guidinger MK, Goodrich NP, Ojo AO, Merion RM. Impact of MELD-based allocation on end-stage renal disease after liver transplantation. Am J Transplant. 2011;11(11):2372–8. pmid:21883908
  34. 34. Biggins SW, Kim WR, Terrault NA, Saab S, Balan V, Schiano T, et al. Evidence-Based Incorporation of Serum Sodium Concentration Into MELD. Gastroenterology. 2006;130(6):1652–60.
  35. 35. Kim S, Jung H-W, Kim C-H, Kim K, Chin HJ, Lee H. A New Equation to Estimate Muscle Mass from Creatinine and Cystatin C. PLoS One. 2016;11(2):e0148495. pmid:26849842
  36. 36. De Rosa S, Greco M, Rauseo M, Annetta MG. The Good, the Bad, and the Serum Creatinine: Exploring the Effect of Muscle Mass and Nutrition. Blood Purif. 2023;52(9–10):775–85.
  37. 37. Janssen I, Heymsfield SB, Wang ZM, Ross R. Skeletal muscle mass and distribution in 468 men and women aged 18-88 yr. J Appl Physiol (1985). 2000;89(1):81–8. pmid:10904038
  38. 38. Allen AM, Heimbach JK, Larson JJ, Mara KC, Kim WR, Kamath PS, et al. Reduced Access to Liver Transplantation in Women: Role of Height, MELD Exception Scores, and Renal Function Underestimation. Transplantation. 2018;102(10):1710–6. pmid:29620614
  39. 39. Nakatani S, Maeda K, Akagi J, Ichigi M, Murakami M, Harada Y, et al. Coefficient of Determination between Estimated and Measured Renal Function in Japanese Patients with Sarcopenia May Be Improved by Adjusting for Muscle Mass and Sex: A Prospective Study. Biol Pharm Bull. 2019;42(8):1350–7. pmid:31167988
  40. 40. Ivey-Miranda J, Stewart B, Gomez N, Griffin M, Rao V, Testani J. Sarcopenia strongly affects serum levels of cystatin C in patients with heart failure. J Card Fail. 2019;25(8 Suppl):S20–1.
  41. 41. Rajakumar A, Appuswamy E, Kaliamoorthy I, Rela M. Renal Dysfunction in Cirrhosis: Critical Care Management. Indian J Crit Care Med. 2021;25(2):207–14. pmid:33707901
  42. 42. Das DS, Anupam A, Saharia GK. Association between liver fibrosis scores and short-term clinical outcomes in hospitalized chronic kidney disease patients: a prospective observational study. Front Med (Lausanne). 2024;11:1387472. pmid:39228803
  43. 43. Kuma A, Mafune K, Uchino B, Ochiai Y, Miyamoto T, Kato A. Potential link between high FIB-4 score and chronic kidney disease in metabolically healthy men. Sci Rep. 2022;12(1):16638. pmid:36198747
  44. 44. Bhatti HW, Tahir U, Chaudhary NA, Bhatti S, Hafeez M, Rizvi ZA. Factors associated with renal dysfunction in hepatitis C-related cirrhosis and its correlation with Child-Pugh score. BMJ Open Gastroenterol. 2019;6(1):e000286. pmid:31275583
  45. 45. Angeli P, Garcia-Tsao G, Nadim MK, Parikh CR. News in pathophysiology, definition and classification of hepatorenal syndrome: A step beyond the International Club of Ascites (ICA) consensus document. J Hepatol. 2019;71(4):811–22. pmid:31302175
  46. 46. Baaten CCFMJ, Schröer JR, Floege J, Marx N, Jankowski J, Berger M, et al. Platelet Abnormalities in CKD and Their Implications for Antiplatelet Therapy. Clin J Am Soc Nephrol. 2022;17(1):155–70. pmid:34750169
  47. 47. Crăciun R, Grapă C, Mocan T, Tefas C, Nenu I, Buliarcă A, et al. The Bleeding Edge: Managing Coagulation and Bleeding Risk in Patients with Cirrhosis Undergoing Interventional Procedures. Diagnostics (Basel). 2024;14(22):2602. pmid:39594268
  48. 48. O’Seaghdha CM, Lyass A, Massaro JM, Meigs JB, Coresh J, D’Agostino RB Sr, et al. A risk score for chronic kidney disease in the general population. Am J Med. 2012;125(3):270–7. pmid:22340925
  49. 49. Ejerblad E, Fored CM, Lindblad P, Fryzek J, McLaughlin JK, Nyrén O. Obesity and risk for chronic renal failure. J Am Soc Nephrol. 2006;17(6):1695–702. pmid:16641153
  50. 50. Tandon P, James MT, Abraldes JG, Karvellas CJ, Ye F, Pannu N. Relevance of New Definitions to Incidence and Prognosis of Acute Kidney Injury in Hospitalized Patients with Cirrhosis: A Retrospective Population-Based Cohort Study. PLoS One. 2016;11(8):e0160394. pmid:27504876
  51. 51. Li H, Li M, Liu C, He P, Dong A, Dong S, et al. Causal effects of systemic inflammatory regulators on chronic kidney diseases and renal function: a bidirectional Mendelian randomization study. Front Immunol. 2023;14:1229636. pmid:37711613
  52. 52. Wojtacha P, Bogdańska-Chomczyk E, Majewski MK, Obremski K, Majewski MS, Kozłowska A. Renal Inflammation, Oxidative Stress, and Metabolic Abnormalities During the Initial Stages of Hypertension in Spontaneously Hypertensive Rats. Cells. 2024;13(21):1771.
  53. 53. Tang W, Wei Q. The metabolic pathway regulation in kidney injury and repair. Front Physiol. 2024;14.