Predictive value of individual Sequential Organ Failure Assessment sub-scores for mortality in the cardiac intensive care unit

Purpose To determine the impact of Sequential Organ Failure Assessment (SOFA) organ sub-scores for hospital mortality risk stratification in a contemporary cardiac intensive care unit (CICU) population. Materials and methods Adult CICU admissions between January 1, 2007 and December 31, 2015 were reviewed. The SOFA score and organ sub-scores were calculated on CICU day 1; patients with missing SOFA sub-score data were excluded. Discrimination for hospital mortality was assessed using area under the receiver-operator characteristic curve (AUROC) values, followed by multivariable logistic regression. Results We included 1214 patients with complete SOFA sub-score data. The mean age was 67 ± 16 years (38% female); all-cause hospital mortality was 26%. Day 1 SOFA score predicted hospital mortality with an AUROC of 0.72. Each SOFA organ sub-score predicted hospital mortality (all p <0.01), with AUROC values of 0.53 to 0.67. On multivariable analysis, only the cardiovascular, central nervous system, renal and respiratory SOFA sub-scores were associated with hospital mortality (all p <0.01). A simplified SOFA score containing the cardiovascular, central nervous system and renal sub-scores had an AUROC of 0.72. Conclusions In CICU patients with complete SOFA sub-score data, risk stratification for hospital mortality is determined primarily by the cardiovascular, central nervous system, renal and respiratory SOFA sub-scores.

Introduction
Risk prediction scores have guided care in the cardiac intensive care unit (CICU) since Killip, et al. reported their classification of patients with acute myocardial infarction. [1] The CICU population has evolved to include patients with acute and chronic multi-organ dysfunction and superimposed cardiac pathology, similar to other intensive care unit (ICU) populations. [2][3][4][5][6][7] Risk stratification models allow prediction of adverse outcomes in this increasingly complex CICU patient population in order to facilitate care planning and therapeutic intervention. [3,4,8] The use of disease-specific risk prediction scores in the CICU is limited by the presence of undifferentiated clinical syndromes in patients with multiple acute and chronic cardiovascular disease processes, making general ICU severity of illness scoring models potentially advantageous. [2,6,8-11] The Sequential Organ Failure Assessment (SOFA) score is an illness severity score developed in patients with sepsis, including a 4-point assessment of dysfunction in each of 6 organ systems (central nervous system, cardiovascular, respiratory, renal, liver and coagulation). [11][12][13] The SOFA score contains fewer variables and is simpler to calculate compared to other ICU risk prediction models, yet it can accurately predict short-term mortality in CICU populations. [13][14][15] We previously reported very good discrimination for hospital mortality using the SOFA score on the first CICU day in our CICU population, although calibration was suboptimal. [15] The cardiovascular and renal SOFA organ sub-scores had the highest discrimination for short-term mortality in our prior study. However, data to calculate the respiratory and liver SOFA sub-scores were available in fewer than one-third of patients; as is customary in such models, missing data were imputed as normal. [15]
The absence of available data for calculating ICU severity of illness scoring models influences model performance by underestimating illness severity and mortality risk, raising questions about the accuracy of the SOFA score in patients with missing data. [16] The purpose of this study was to determine the relative contribution of each individual SOFA organ sub-score for prediction of mortality in CICU patients without any missing SOFA sub-score data, in order to facilitate potential future modification of the SOFA score to better fit the CICU population. Additionally, we sought to further explore the importance of missing data for mortality risk prediction using the SOFA score in CICU patients, as highlighted in our prior work.

Materials and methods
This study was approved by the Mayo Clinic Institutional Review Board under an exception from informed consent as posing minimal risk to patients. This is a subset analysis of a historical cohort analysis utilizing an institutional database of patients admitted to the CICU at the Mayo Clinic Hospital, St. Mary's Campus, as previously described. [15] The Mayo Clinic CICU is a closed 16-bed unit serving critically-ill cardiac medical patients, not including postoperative cardiac surgery patients or patients receiving extracorporeal membrane oxygenation support. Consultation by a Critical Care Medicine physician is provided for assistance in management of patients with respiratory failure. Unique adult patients ≥18 years old admitted to the CICU between January 1, 2007 and December 31, 2015 were identified and data from the first CICU admission were used. [17] Patients admitted to the CICU prior to January 1, 2007, patients still hospitalized on December 31, 2015 and patients who did not provide Minnesota Research Authorization under Minnesota state law were excluded from the initial study population. We excluded patients in whom any of the individual SOFA organ sub-scores could not be calculated due to the presence of missing data points; for SOFA organ sub-scores (such as cardiovascular and renal) based on multiple data points, patients missing either of the required data points were excluded.
As described previously, demographic and laboratory data and use of invasive ventilation and catecholamine infusions during the first 24 hours of CICU admission were collected. [15] The SOFA score (with individual SOFA organ sub-scores), Acute Physiology and Chronic Health Evaluation (APACHE)-III score and Oxford Acute Severity of Illness Score (OASIS) were generated using data in the electronic medical record system from the first 24 hours of CICU admission; for the APACHE-III score and OASIS, missing variables were imputed as normal (score of 0) as the default. [18][19][20][21] Total SOFA scores were automatically calculated on each day a patient remained in the CICU, and the mean and maximum of all SOFA scores up to the first week in the CICU were recorded. The Charlson Comorbidity Index (CCI) was calculated electronically. [22] Sepsis was identified using a previously-validated electronic algorithm. [23] Relevant cardiovascular hospital discharge diagnoses were determined using ICD-9 diagnostic codes.
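The two ways of handling missing sub-score data mentioned above (complete-case exclusion for the SOFA score in this study, versus imputation as normal) can be illustrated with a minimal sketch. The dictionary keys and the function names `is_complete_case` and `total_sofa` are hypothetical and are not taken from the study database; the sketch assumes only that each organ sub-score ranges from 0 to 4 and that the total is their sum.

```python
from typing import Dict, Optional

# Hypothetical labels for the 6 SOFA organ systems.
ORGANS = ("cardiovascular", "cns", "respiratory", "renal", "liver", "coagulation")


def is_complete_case(sub_scores: Dict[str, Optional[int]]) -> bool:
    """True when every organ sub-score is available (not None)."""
    return all(sub_scores.get(organ) is not None for organ in ORGANS)


def total_sofa(sub_scores: Dict[str, Optional[int]],
               impute_missing_as_normal: bool = False) -> Optional[int]:
    """Sum the 6 organ sub-scores.

    With impute_missing_as_normal=False, a patient with any missing
    sub-score gets no total (complete-case approach, i.e. exclusion).
    With impute_missing_as_normal=True, missing sub-scores count as 0.
    """
    for organ in ORGANS:
        value = sub_scores.get(organ)
        if value is not None and not 0 <= value <= 4:
            raise ValueError(f"{organ} sub-score must be between 0 and 4")
    if not is_complete_case(sub_scores) and not impute_missing_as_normal:
        return None
    return sum((sub_scores.get(organ) or 0) for organ in ORGANS)
```

A patient with an unmeasured bilirubin (liver sub-score `None`) would return `None` under the complete-case approach and would be scored as if the liver sub-score were 0 under imputation as normal.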
The primary study endpoint was all-cause hospital mortality; secondary endpoints included all-cause CICU mortality and 30-day mortality. Mortality data were extracted from Mayo Clinic electronic databases, the state of Minnesota electronic death certificates and the Rochester Epidemiology Project database, as previously described. [24] Categorical variables are reported as number (%), and the chi-squared test was used to compare groups. Continuous variables are reported as mean (± standard deviation, SD), and Student's t-test was used to compare groups. Univariate analysis was performed using continuous variables as predictors of mortality, and the area under the receiver operating characteristic curve (AUROC) values were determined. AUROC confidence intervals (CI) were calculated via 2000 bootstrap samples, and AUROC values were compared between scores using the DeLong test. A logistic regression model was created for each score to determine calibration for hospital mortality using the Hosmer-Lemeshow statistic. Multivariate analysis was performed using logistic regression including each individual SOFA sub-score as a continuous variable. Two-tailed P values <0.05 were considered statistically significant. Statistical analyses were performed using JMP version 13.0 Pro (SAS Institute, Cary, NC) and R version 3.4.2 (https://www.r-project.org/).
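As an illustration of the discrimination analysis described above, the following sketch computes an AUROC as the probability that a randomly chosen non-survivor has a higher score than a randomly chosen survivor, with a bootstrap confidence interval. It is not the authors' code (the analyses were run in JMP and R): the function names are hypothetical, and the percentile bootstrap is an assumption, since the text states only that 2000 bootstrap samples were used.

```python
import random
from typing import List, Sequence, Tuple


def auroc(scores: Sequence[float], died: Sequence[int]) -> float:
    """AUROC as the fraction of (non-survivor, survivor) pairs in which the
    non-survivor has the higher score; tied pairs count as one half."""
    pos = [s for s, d in zip(scores, died) if d]
    neg = [s for s, d in zip(scores, died) if not d]
    if not pos or not neg:
        raise ValueError("need at least one death and one survivor")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(scores: Sequence[float], died: Sequence[int],
                 n_boot: int = 2000, alpha: float = 0.05,
                 seed: int = 0) -> Tuple[float, float]:
    """Percentile bootstrap CI for the AUROC (assumed method)."""
    rng = random.Random(seed)
    n = len(scores)
    estimates: List[float] = []
    while len(estimates) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        s = [scores[i] for i in idx]
        d = [died[i] for i in idx]
        # Skip resamples that contain only deaths or only survivors.
        if 0 < sum(d) < n:
            estimates.append(auroc(s, d))
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```

The pairwise loop is quadratic in cohort size, which is acceptable for a sketch; a rank-based formulation would be preferable for large cohorts.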
Results
In the final study population, the mean age was 66.7±15.0 years and 459 (37.8%) patients were female (Table 1). The final study population differed significantly from excluded patients with missing SOFA sub-score data, with higher illness severity and different cardiovascular discharge diagnoses (Table 1). The mean SOFA score in the final study population was 8.1±3.6 compared to 2.9±3.6 in the excluded patients with missing SOFA sub-score data (p <0.001), and the SOFA score distribution was shifted towards higher SOFA scores in the final study population. Short-term mortality increased progressively with rising Day 1 SOFA score in the final study population (Fig 1). The Day 1 SOFA score was a univariate predictor of hospital mortality in the final study population (OR 1.28, 95% CI 1.23-1.34, AUROC 0.72, p <0.001; Table 2). As shown in Table 2, the discriminative capacity of the APACHE-III score for hospital mortality (AUROC 0.79; p <0.001 by DeLong test) was higher than the Day 1 SOFA score in the final study population; the OASIS score performed similarly to the Day 1 SOFA score (AUROC 0.73; p >0.05 by DeLong test). In the 1,098 (90.4%) patients without missing data for calculating OASIS (i.e. in whom imputation of missing data was not necessary), the AUROC was 0.77 (p = 0.01 by DeLong test compared with SOFA). Calibration of the APACHE-III score using the Hosmer-Lemeshow statistic (Table 2) was good (p = 0.157), while calibration of the SOFA score was poor (p = 0.037); calibration of OASIS was borderline (p = 0.055). The mean and maximum SOFA score during the first 2 CICU days outperformed the Day 1 SOFA for prediction of hospital mortality (p <0.05 by DeLong test). The mean SOFA score during the first week in the CICU had the highest AUROC value of any of the scores tested (0.82) and had good calibration (p = 0.253), but the AUROC was not significantly different from that of APACHE-III (p = 0.07 by DeLong test).
Among the 989 (81.5%) patients remaining in the CICU for >1 day, the 131 patients (13.2%) with a rising Day 2 SOFA had increased hospital mortality (36.6% vs. 19.8%, OR 2.34, 95% CI 1.58-3.47, p <0.001). The distribution of individual SOFA sub-scores in the final study population is shown in Fig 2. Each of the individual organ sub-scores was a univariate predictor of mortality (Table 2; all p <0.01). The cardiovascular and renal sub-scores had the highest AUROC values (0.67) for hospital mortality and the coagulation sub-score had the lowest AUROC value (0.53). Limited SOFA scores were calculated by omitting each of the SOFA organ sub-scores individually and discrimination for hospital mortality was assessed. Removal of the coagulation or liver sub-scores had minimal impact on the AUROC values for hospital mortality, while removal of the other sub-scores had a greater impact (Table 2). The lowest AUROC value occurred when the renal sub-score was removed from the Day 1 SOFA score. In an exploratory analysis examining these modified SOFA scores among patients excluded from the study due to missing data, AUROC values for all tested scores were similar or higher compared to AUROC values seen in patients included in the final study population (S1 Table).
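The odds ratio reported for a rising Day 2 SOFA score can be checked with a short worked calculation. The sketch below rests on assumptions: the 2×2 counts are back-calculated from the reported percentages (36.6% of 131 is about 48 deaths; 19.8% of the remaining 858 is about 170 deaths), and a Wald interval on the log odds ratio is used because the text does not state how the interval was derived. The function name is hypothetical.

```python
import math
from typing import Tuple


def odds_ratio_wald_ci(exposed_deaths: int, exposed_survivors: int,
                       unexposed_deaths: int, unexposed_survivors: int,
                       z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio with a Wald 95% CI computed on the log-odds scale."""
    odds_ratio = ((exposed_deaths * unexposed_survivors)
                  / (exposed_survivors * unexposed_deaths))
    # Standard error of log(OR) from the four cell counts.
    se = math.sqrt(1 / exposed_deaths + 1 / exposed_survivors
                   + 1 / unexposed_deaths + 1 / unexposed_survivors)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Counts back-calculated from the reported percentages (an assumption):
# rising Day 2 SOFA: 48 deaths, 83 survivors; otherwise: 170 deaths, 688 survivors.
or_, lo, hi = odds_ratio_wald_ci(48, 83, 170, 688)
print(f"OR {or_:.2f}, 95% CI {lo:.2f}-{hi:.2f}")
```

Under these assumed counts the calculation gives an odds ratio of about 2.34 with an interval of roughly 1.58 to 3.47.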
On multivariate analysis including all 6 SOFA organ sub-scores as predictors of hospital mortality, only the cardiovascular, central nervous system, respiratory and renal sub-scores were significant predictors of hospital mortality (Table 3). The AUROC value of 0.74 for hospital mortality in the regression model was essentially unchanged when the coagulation and/or liver sub-scores were removed. When the coagulation sub-score was removed from the regression model, the liver sub-score became marginally significant as a predictor of hospital mortality (p = 0.048).
A simplified SOFA score including only the 4 SOFA organ sub-scores (cardiovascular, central nervous system, renal and respiratory) that were significantly predictive of hospital mortality on multivariate analysis had similar discriminative capacity compared to the Day 1 SOFA score for hospital mortality in the final study population (AUROC 0.73; Table 2); the AUROC value of this simplified SOFA score for the initial population was 0.83. Inclusion of only the cardiovascular, central nervous system and renal sub-scores again performed similarly in the final population (AUROC 0.72; Table 2); the AUROC value for the initial population was 0.81. Calibration of both of these simplified SOFA scores was good (p >0.05).
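The limited and simplified SOFA scores described above are sums over subsets of the 6 organ sub-scores. A minimal sketch follows, assuming unweighted sums as the text implies; the dictionary keys, constants and function names are hypothetical labels chosen for illustration.

```python
from typing import Dict, Iterable

# Hypothetical labels for the 6 SOFA organ systems.
ORGANS = ("cardiovascular", "cns", "respiratory", "renal", "liver", "coagulation")

# Sub-score subsets named in the text for the two simplified scores.
FOUR_ORGAN = ("cardiovascular", "cns", "renal", "respiratory")
THREE_ORGAN = ("cardiovascular", "cns", "renal")


def simplified_sofa(sub_scores: Dict[str, int], keep: Iterable[str]) -> int:
    """Sum only the listed organ sub-scores."""
    return sum(sub_scores[organ] for organ in keep)


def limited_sofa(sub_scores: Dict[str, int], omit: str) -> int:
    """Total SOFA score with a single organ sub-score left out."""
    if omit not in ORGANS:
        raise ValueError(f"unknown organ system: {omit}")
    return sum(sub_scores[organ] for organ in ORGANS if organ != omit)
```

For a patient with sub-scores of 3 (cardiovascular), 1 (central nervous system), 2 (respiratory), 2 (renal), 1 (liver) and 0 (coagulation), the full score is 9, the 4-organ score is 8, the 3-organ score is 6, and the limited score omitting the renal sub-score is 7.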

Discussion
This is the first study to explore the predictive value of individual SOFA organ sub-scores for short-term mortality in a contemporary CICU population with complete data availability. These CICU patients with available data to calculate all 6 SOFA organ sub-scores constituted a cohort of severely ill patients with hospital mortality exceeding 25%. Among these high-risk patients, the Day 1 SOFA score had good discrimination for hospital mortality, although discrimination was lower than previously reported, and calibration was poor. [15] Removal of the cardiovascular and renal SOFA sub-scores had the greatest effect on discrimination of hospital mortality. Only the cardiovascular, central nervous system, respiratory and renal sub-scores were independently predictive of hospital mortality on multivariate analysis. Removing the coagulation and liver SOFA sub-scores did not substantially impact discrimination. Simplified SOFA scores including the cardiovascular, central nervous system and renal sub-scores (with or without the respiratory sub-score) had similar discrimination for hospital mortality as the original SOFA score in this selected cohort. These findings expand on our prior study demonstrating that the Day 1 SOFA score had very good discrimination (AUROC value of 0.83) for hospital mortality in unselected CICU patients. [15] The availability of complete data for calculating the SOFA score significantly impacted its discrimination for hospital mortality, with paradoxically lower discrimination in patients with available data for all 6 SOFA organ sub-scores compared to the initial population or patients excluded due to missing data. [15] Patients in this study with complete SOFA sub-score data had higher illness severity, leading to lower discrimination for mortality by the SOFA score and other risk scores. Because ICU risk scores differentiate high-risk from low-risk patients, model performance would be expected to decrease in a sicker population.
A prior study by Afessa, et al. demonstrated that individual variables used to calculate the Acute Physiology Score component of the APACHE score were more likely to be missing in less-sick patients, and patients without any missing data had the highest short-term mortality. [16] We report a novel, U-shaped association between the number of SOFA sub-scores with available data and short-term mortality, with higher mortality in the small number of patients missing data for 3 or more SOFA sub-scores as well as among patients who had available data for the respiratory and liver SOFA sub-scores. This association between laboratory testing patterns and mortality mirrors the prior study by Afessa, et al. reporting that patients with a measured serum bilirubin or albumin level (fewer than 20% of all patients) had higher observed mortality. [16] In the study by Afessa, et al. the number of missing variables was associated with increased mortality after correcting for the APACHE-III score using multivariate analysis, yet the observed-to-expected mortality appeared to be lower in patients with complete data. [16] Prior studies comparing the SOFA score with the APACHE score in general CICU populations have demonstrated similar discrimination for mortality. [14,15] Our prior study showed very good discrimination and poor calibration by both the SOFA and APACHE-III scores, but the APACHE-III score performed better in the high-risk subgroup included in the present study. [15] Paradoxically, while the discrimination as measured by the AUROC value of the SOFA score was lower in the population without missing data, the AUROC value of OASIS was higher in patients without missing data. This divergent effect of missing data on the performance of SOFA and OASIS is novel, whereas missing data has been previously shown to decrease discrimination by the APACHE-III score. [16]
Notably, in our prior study, the AUROC for most SOFA sub-scores likewise decreased when missing data were imputed as normal. [15] Therefore, we hypothesize that the lower discrimination by the Day 1 SOFA score in this cohort compared to our prior study cohort may be due to higher observed mortality in this cohort, especially among patients with low SOFA scores. [15] Unlike the SOFA score, the APACHE-III score retained very good discrimination for hospital mortality in this study population compared to our prior study (AUROC 0.79 vs. 0.82). [15] Superior risk prediction by the APACHE-III score in the selected subgroup represented in this study may reflect the greater number of variables in the APACHE-III model, potentially allowing refinement of risk prediction beyond the simpler SOFA score; improved prediction by models containing greater numbers of variables has previously been demonstrated. [10,19] The major advantages of the SOFA score include its simplicity and ease of calculation, which allows daily SOFA score calculations to trend illness severity over time. [10,15] However, these advantages are only relevant insofar as risk prediction remains robust; we hypothesize that the SOFA score may be more useful for distinguishing high-risk from low-risk patients, rather than further stratifying the high-risk patients.
Not all of the individual SOFA organ sub-scores are equally relevant for mortality risk prediction in CICU patients. [15] Because the original intent of the SOFA score was to prognosticate in patients with sepsis, the organ failure variables included in the SOFA score are reflective of those commonly seen in sepsis and may be less relevant in CICU patients. [10][11][12][13] Notably, there was a 45% prevalence of sepsis in this cohort of CICU patients with complete SOFA organ sub-score data. The coagulation and liver SOFA sub-scores were not independently associated with hospital mortality and therefore contributed little to risk prediction in this population, which is not surprising given the low prevalence of significant thrombocytopenia and hyperbilirubinemia. In this cohort, the respiratory (36%), cardiovascular (26%) and renal (15%) sub-scores contributed the most to the total SOFA score, while the coagulation and liver sub-scores together contributed only 11% to the total SOFA score.
The impact of individual SOFA organ sub-scores on overall mortality prediction has not been previously explored in CICU patients, apart from our prior study. [15] Knox, et al. demonstrated that the central nervous system SOFA sub-score (i.e. GCS) dominated the predictive value of the SOFA score in a mixed ICU population. [25] Toma, et al. used computer modeling to determine that the central nervous system sub-score was the most important predictor of mortality in general ICU patients, followed by the cardiovascular and renal sub-scores. [26] In our prior study, the cardiovascular and renal sub-scores had the highest AUROC and OR values for hospital mortality, followed by the central nervous system sub-score; when missing data were imputed as normal, the respiratory sub-score had the highest AUROC value, followed by the cardiovascular sub-score. [15] This retrospective single-center cohort study has a number of limitations, including the possibility of unmeasured confounders and potential bias due to local practice patterns. Our patient population may be distinct from other centers, as reflected by a lower hospital death rate and rate of acute coronary syndromes in the initial population than most prior CICU studies. [2,5,6,14] This study included the minority of highly-selected patients who had available data for calculating each of the SOFA sub-scores, with evidence of selection bias whereby these sicker patients were more likely to have serum bilirubin and/or arterial blood gases measured. Because patients with complete SOFA sub-score data available differed substantially from other CICU patients, conclusions about the performance of the SOFA score and relative predictive value of individual SOFA sub-scores in this selected subgroup may not be broadly applicable. Our analysis was limited by data missingness, which was not random but instead associated with illness severity. 
While we could have performed multiple imputation to account for this missing data, imputation of missing data as normal is the recommended and accepted methodology for dealing with missing data in prognostic scoring systems. This approach provides a more parsimonious estimate of model performance, but can be associated with inaccuracy of the prognostic models at the extremes of illness severity. [10,16] Due to the potential non-randomness of missing data, a complete-case subgroup analysis, as performed in this study, has important limitations when compared to multiple imputation of missing data within the entire population. In addition, we did not have individual SOFA sub-scores available for subsequent CICU days to assess prediction at later time points; notably, the use of mean or maximum SOFA scores during the first week did not outperform the APACHE-III score.
In conclusion, CICU patients with the availability of complete data to calculate all 6 SOFA organ sub-scores are a high-risk population, reflecting a bias toward more laboratory testing in sicker patients. Discrimination of the SOFA and OASIS scores for hospital mortality was lower in this cohort than reported in the initial population; the APACHE-III score had the best performance of the scores we examined. The renal and cardiovascular SOFA sub-scores contribute the most to mortality prediction in these CICU patients, and the liver and coagulation SOFA sub-scores contribute little to mortality prediction. These findings emphasize the impact of data availability on performance of the SOFA score, and highlight the potential to refine the SOFA score for mortality risk prediction in CICU patients by replacing the current coagulation and liver SOFA sub-scores with variables more predictive of mortality in CICU populations. We suggest that future prospective studies using the SOFA score as a marker of illness severity use standardized methods for ensuring complete data availability to calculate each of the organ sub-scores, to avoid the limitations of missing data described herein. The suboptimal performance of the SOFA score in this study emphasizes the limitations of using the SOFA score in CICU populations and the need to develop better risk prediction models for CICU patients. Future studies are needed to determine the real-world performance of the SOFA score in CICU patients. This work may ultimately facilitate the future development of a CICU-specific SOFA-derived score or novel CICU-specific risk score that will be more widely applicable to CICU populations.