
A 12-hospital prospective evaluation of a clinical decision support prognostic algorithm based on logistic regression as a form of machine learning to facilitate decision making for patients with suspected COVID-19

  • Monica I. Lupei ,

    Contributed equally to this work with: Monica I. Lupei, Danni Li, Nicholas E. Ingraham

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Supervision, Visualization, Writing – original draft, Writing – review & editing

    Current address: Department of Anesthesiology, University of Minnesota Medical School, Minneapolis, Minnesota, United States of America

    Affiliation Division of Critical Care, Department of Anesthesiology, University of Minnesota Medical School, Minneapolis, Minnesota, United States of America

  • Danni Li ,

    Contributed equally to this work with: Monica I. Lupei, Danni Li, Nicholas E. Ingraham

    Roles Conceptualization, Data curation, Formal analysis, Methodology, Visualization, Writing – original draft, Writing – review & editing

    Affiliation Department of Laboratory Medicine and Pathology, University of Minnesota Medical School, Minneapolis, Minnesota, United States of America

  • Nicholas E. Ingraham ,

    Contributed equally to this work with: Monica I. Lupei, Danni Li, Nicholas E. Ingraham

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Resources, Writing – original draft, Writing – review & editing

    Affiliation Division of Pulmonary and Critical Care, Department of Medicine, University of Minnesota Medical School, Minneapolis, Minnesota, United States of America

  • Karyn D. Baum,

    Roles Conceptualization, Data curation, Methodology, Resources, Writing – review & editing

    Affiliation Division of General Internal Medicine, Department of Medicine, Section of Hospital Medicine, University of Minnesota Medical School, Minneapolis, Minnesota, United States of America

  • Bradley Benson,

    Roles Conceptualization, Writing – review & editing

    Affiliation Division of General Internal Medicine, Department of Medicine, Section of Hospital Medicine, University of Minnesota Medical School, Minneapolis, Minnesota, United States of America

  • Michael Puskarich,

    Roles Conceptualization, Writing – original draft, Writing – review & editing

    Affiliation Department of Emergency Medicine, University of Minnesota Medical School, Minneapolis, Minnesota, United States of America

  • David Milbrandt,

    Roles Conceptualization, Methodology, Writing – review & editing

    Affiliation Department of Emergency Medicine, University of Minnesota Medical School, Minneapolis, Minnesota, United States of America

  • Genevieve B. Melton,

    Roles Conceptualization, Investigation, Methodology, Project administration, Supervision, Writing – review & editing

    Affiliations Department of Surgery, University of Minnesota Medical School, Minneapolis, Minnesota, United States of America, Institute for Health Informatics, University of Minnesota, Minneapolis, Minnesota, United States of America

  • Daren Scheppmann,

    Roles Conceptualization, Data curation, Investigation

    Affiliation Institute for Health Informatics, University of Minnesota, Minneapolis, Minnesota, United States of America

  • Michael G. Usher ,

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Resources, Software, Visualization, Writing – original draft, Writing – review & editing

    ‡ MGU and CJT also contributed equally to this work

    Affiliation Division of General Internal Medicine, Department of Medicine, Section of Hospital Medicine, University of Minnesota Medical School, Minneapolis, Minnesota, United States of America

  • Christopher J. Tignanelli

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Software, Supervision, Validation, Writing – original draft, Writing – review & editing

    ‡ MGU and CJT also contributed equally to this work

    Affiliations Department of Surgery, University of Minnesota Medical School, Minneapolis, Minnesota, United States of America, Institute for Health Informatics, University of Minnesota, Minneapolis, Minnesota, United States of America, Division of Critical Care and Acute Care Surgery, Department of Surgery, University of Minnesota Medical School, Minneapolis, Minnesota, United States of America



To prospectively evaluate a logistic regression-based machine learning (ML) prognostic algorithm implemented in real-time as a clinical decision support (CDS) system for symptomatic persons under investigation (PUI) for Coronavirus disease 2019 (COVID-19) in the emergency department (ED).


We developed a model in a 12-hospital system using training and validation, followed by a real-time assessment. LASSO-guided feature selection included demographics, comorbidities, home medications, and vital signs. We constructed a logistic regression-based ML algorithm to predict “severe” COVID-19, defined as requiring intensive care unit (ICU) admission, invasive mechanical ventilation, or in- or out-of-hospital death. Training data included 1,469 adult patients who tested positive for Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) within 14 days of acute care. We performed: 1) temporal validation in 414 SARS-CoV-2-positive patients, 2) validation in a PUI set of 13,271 patients with a symptomatic SARS-CoV-2 test during an acute care visit, and 3) real-time validation in 2,174 ED patients with a PUI test or positive SARS-CoV-2 result. Subgroup analysis was conducted across race and gender to ensure equitable performance.


The algorithm performed well on pre-implementation validations for predicting COVID-19 severity: 1) the temporal validation had an area under the receiver operating characteristic (AUROC) of 0.87 (95%-CI: 0.83, 0.91); 2) validation in the PUI population had an AUROC of 0.82 (95%-CI: 0.81, 0.83). The ED CDS system performed well in real-time with an AUROC of 0.85 (95%-CI, 0.83, 0.87). Zero patients in the lowest quintile developed “severe” COVID-19. Patients in the highest quintile developed “severe” COVID-19 in 33.2% of cases. The models performed without significant differences between genders and among race/ethnicities (all p-values > 0.05).


A logistic regression model-based ML-enabled CDS can be developed, validated, and implemented with high performance across multiple hospitals while being equitable and maintaining performance in real-time validation.


The dynamic of Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) infection raised concerns regarding resource availability throughout medical systems, including intensive care unit (ICU) healthcare providers, personal protective equipment, total hospital and ICU beds, and mechanical ventilators. On March 11th, 2020, the World Health Organization declared Coronavirus disease 2019 (COVID-19) a pandemic. The COVID-19 pandemic has caused over 249 million confirmed infections and over 5 million confirmed deaths as of November 9th, 2021 [1]. One of the initial large observational studies, published from China, revealed that approximately 15% of confirmed cases required hospitalization, 5% needed ICU admission, and 2.3% died [2]. A multihospital United States (U.S.) cohort study identified that the 30-day mean risk-standardized event rate of hospital mortality and hospice referral among patients with COVID-19 varied from 9% to 16%, with better outcomes occurring in communities with lower disease prevalence [3]. A large cross-sectional study found racial and ethnic disparities in rates of COVID-19 hospital and ICU admission and in-hospital mortality in the U.S. [4].

Since the beginning of the pandemic, global efforts by the scientific community to understand SARS-CoV-2 and COVID-19 from bench to bedside have been remarkable [5]. Stratifying disease severity is an essential aspect of patient care; during a pandemic, however, its role becomes paramount and expands to improving patient safety while also optimizing hospital resource utilization. Several studies have developed emergency department (ED) evaluation systems with variable goals and methods [6–12]. These models evaluated the feasibility of isolating COVID-19 patients in the ED, the epidemiology and clinical characteristics of COVID-19, the advantage of distinguishing life-threatening emergencies, and the likelihood of COVID-19 diagnosis [6–12].

Most predictive models for COVID-19 severity were developed in patients with a positive polymerase chain reaction (PCR) test, not in patients with suspected COVID-19. A systematic evaluation of COVID-19 predictive models aimed at identifying clinical deterioration found that the majority of published studies included patients with confirmed infection [13], making them less useful in the clinic or emergency department when the diagnosis remains uncertain. The majority of predictive models for patients with suspected COVID-19 infection aimed to diagnose COVID-19, and very few predicted severity [14]. One systematic review of prognostic models emphasized their high risk of bias and did not yet recommend their use in clinical practice [15]. Because such systematic reviews are marked by limitations, a group of researchers from the United Kingdom (UK) developed a continuously updated ("living") COVID-19 review [16]. Another group of researchers proposed an open platform for such reviews, to be continuously updated using artificial intelligence and numerous experts [17]. QCOVID is a published living risk prediction algorithm that performed well for predicting time to death in patients with confirmed or suspected COVID-19 [18].

We hypothesize that a logistic regression-based machine learning (ML) tool for patients with suspected or confirmed COVID-19 can accurately and equitably predict the development of “severe” COVID-19. The objective of this study was to conduct a 12-site prospective observational study evaluating the real-time performance of an ML-enabled COVID-19 prognostic tool delivered as clinical decision support (CDS) to ED providers to facilitate shared decision-making with patients regarding ED discharge.


Study design and setting

This is a retrospective and prospective multihospital observational study that developed, implemented, and evaluated a prognostic model in patients with a PCR-confirmed COVID-19 diagnosis or suspected COVID-19 (person under investigation [PUI]) in a 12-hospital system. This study was reviewed by the University of Minnesota Institutional Review Board and determined to be non-human-subjects research (STUDY00011742).

Selection of participants

Patients were included if they were PCR-confirmed COVID-19 positive or symptomatic PUI with a patient status of emergency, observation, or inpatient at a participating center. We only included patients who did not opt out of research on admission. Patients were excluded if they did not have at least one recorded ED vital sign (heart rate, respiratory rate, temperature, oxygen saturation, or systolic blood pressure) or if comorbidity data were missing. A complete set of vital signs was deemed necessary because the model was intended to be implemented and used in patients receiving a complete evaluation, which would include at least one complete set of vital signs.

Feature selection and model development

A team of subject matter experts with expertise in treating patients with COVID-19 and research experience in COVID-19 identified features hypothesized to be associated with development of “severe” disease (S1 Table). To reduce the likelihood of over-fitting, a Least Absolute Shrinkage and Selection Operator (LASSO)-logit model was used to facilitate feature selection from this list, with the tuning parameter determined by the Bayesian information criterion (BIC), as previously done by our group [19, 20]. LASSO is a penalized regression method that facilitates factor selection by excluding factors with a minor contribution to the model [21]. S1 Table lists the features selected for the final model following LASSO selection.
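The LASSO-logit step with BIC tuning can be sketched as follows. This is an illustrative Python/scikit-learn version run on synthetic data; the study performed its analyses in Stata, and the grid of penalty values and the `bic_l1_logit` helper are assumptions, not the authors' code.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for the candidate feature matrix (demographics,
# comorbidities, home medications, vital signs) and the severity label.
X, y = make_classification(n_samples=1469, n_features=40, n_informative=8,
                           random_state=0)

def bic_l1_logit(X, y, C):
    """Fit an L1-penalized (LASSO) logit and return (BIC, fitted model)."""
    m = LogisticRegression(penalty="l1", solver="liblinear", C=C).fit(X, y)
    p = np.clip(m.predict_proba(X)[:, 1], 1e-12, 1 - 1e-12)
    log_lik = np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))
    k = np.count_nonzero(m.coef_) + 1          # nonzero betas + intercept
    return k * np.log(len(y)) - 2 * log_lik, m

# Choose the penalty that minimizes BIC, mirroring the LASSO-logit tuning;
# the candidate C grid here is an arbitrary illustrative choice.
bic, model = min((bic_l1_logit(X, y, C) for C in (0.01, 0.05, 0.1, 0.5, 1.0)),
                 key=lambda t: t[0])
selected = np.flatnonzero(model.coef_)         # indices of retained features
```

Features whose coefficients are shrunk exactly to zero drop out of the model, which is how LASSO "excludes factors with a minor contribution."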

Final features selected by LASSO included age (years), male gender [3, 22], race or ethnicity, non-English speaking [23, 24], overweight or obese (body mass index [BMI] > 25) [19, 25, 26], home medications [27] (defined as whether a patient was prescribed a medication within 3 months of the index acute care visit), and chronic comorbidities [3, 28] extracted from ICD-10 codes (S2 Table) collected in the 5 years prior to the index visit. Finally, we included the following vital signs: maximum heart rate (HR), respiratory rate (RR), and temperature within the first 24 hours, and minimum peripheral arterial oxygen saturation (SpO2) and systolic blood pressure (SBP) within the first 24 hours. We included in the final list of features for LASSO only the variables available on presentation to the ED.

Model construction

The purpose of model generation was to develop a prognostic model that could identify patients who would develop a severe case of COVID-19. Given the ease of interpretation and the importance of providing clinicians and patients with the basis for model predictions, a multivariable logistic regression model was trained using the features selected by LASSO. This model was developed using only data from the training dataset. A risk score was then calculated in the validation cohorts based on the sum of the beta coefficients. The AUROC was calculated for all validation cohorts to evaluate discrimination in the validation datasets.
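The steps above, a multivariable logit fit on the training data, a risk score from the beta coefficients, and an AUROC check on held-out data, can be sketched as follows. This is an illustrative Python version on synthetic data; only the cohort sizes are borrowed from the paper, and everything else (features, split, seed) is an assumption.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the LASSO-selected features and severity label;
# 1,469 training + 158 test patients mirrors the paper's 90:10 split.
X, y = make_classification(n_samples=1627, n_features=12, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=158 / 1627,
                                            random_state=0)

# Plain multivariable logistic regression on the selected features.
model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

# Risk score: the linear predictor (intercept + sum of beta_i * x_i);
# the predicted probability is its logistic transform.
risk_score = model.decision_function(X_val)
prob = 1 / (1 + np.exp(-risk_score))

# Discrimination in the held-out set.
auroc = roc_auc_score(y_val, prob)
```

Because the logistic function is monotone, ranking patients by the beta-coefficient sum and ranking by predicted probability give the same AUROC.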


Our primary outcome was “severe” COVID-19 infection, defined as intensive care unit (ICU) admission, need for invasive mechanical ventilation (ventilator use), or in-hospital or out-of-hospital mortality (defined using the state death certificate database) [2, 29, 30]. The secondary outcomes were the individual components, and combinations, of the dependent variables mentioned above.

Training and test datasets

The training data set included 1,469 patients who were PCR-positive for SARS-CoV-2 within 14 days of an acute care, hospital-based visit, including emergency department, observation, and inpatient encounters, between March 4th and August 21st, 2020. The test set included 158 patients (a random 90:10 split of the training set).

Validation datasets

We included three validation sets:

  1. A temporal validation COVID-19 PCR-positive dataset comprised of 414 patients who tested positive for SARS-CoV-2 between August 22nd and October 11th, 2020. The purpose of this validation was to simulate real-time performance had the system gone “live” between August 22nd and October 11th, 2020.
  2. A PUI data set comprised of 13,271 patients who had a SARS-CoV-2 test with a “symptomatic” designation ordered and a result pending during the first 24 hours of an acute care, hospital-based visit, irrespective of the results, between May 4th and October 11th, 2020. The symptomatic designation, for patients with fever, cough, dyspnea, sore throat, muscle aches, vomiting, or diarrhea, was based on clinical judgment and prioritized testing for a faster turnaround time beginning May 4th, 2020.
  3. A real-time data set included 2,174 patients with an ED visit and symptomatic test or a positive SARS-CoV-2 PCR test following implementation of the prognostic model in Emergency Departments (EDs) from November 23rd, 2020 to January 21st, 2021.


Patient characteristics were compared across data sets using ANOVA for continuous variables and chi-square tests for categorical variables. Odds ratios (OR) and 95% confidence intervals were also reported. Sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), likelihood ratios, false negative and false positive rates, and the area under the receiver operating characteristic curve (AUROC) were summarized for model performance. Statistical significance was defined with alpha set to 0.05; all tests were two-tailed. Statistical analyses were performed using Stata MP, version 16 (StataCorp, College Station, TX).
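The 2x2 performance summary at a chosen cut point can be illustrated as follows. This is a generic sketch; the `threshold_metrics` helper and the toy data are hypothetical, not taken from the study.

```python
import numpy as np

def threshold_metrics(y_true, scores, cutoff):
    """Summarize classification performance at a single score cutoff."""
    y_true = np.asarray(y_true)
    pred = np.asarray(scores) >= cutoff          # flagged as high risk
    tp = np.sum(pred & (y_true == 1))
    fp = np.sum(pred & (y_true == 0))
    fn = np.sum(~pred & (y_true == 1))
    tn = np.sum(~pred & (y_true == 0))
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "lr_pos": sens / (1 - spec),             # positive likelihood ratio
        "lr_neg": (1 - sens) / spec,             # negative likelihood ratio
        "fnr": 1 - sens,                         # false negative rate
        "fpr": 1 - spec,                         # false positive rate
    }

# Toy example: 8 patients, 4 with the severe outcome.
y = [1, 1, 1, 1, 0, 0, 0, 0]
s = [0.9, 0.8, 0.3, 0.7, 0.2, 0.6, 0.4, 0.05]
m = threshold_metrics(y, s, cutoff=0.5)
```

Sweeping `cutoff` over a grid and tabulating these metrics reproduces the kind of multi-threshold 2x2 tables the study's leadership reviewed.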

The real-time model was evaluated across gender and racial/ethnic groups to compare performance across different groups and ensure the model performed equitably.

Model implementation

Implementation into the Electronic Health Record (EHR) occurred for ED patients on November 23rd, 2020. The logistic prognostic model was exported as a Predictive Model Markup Language (PMML) file. An EHR reporting workbench was developed to feed inputs into the model. All inputs were mapped using corresponding ICD-10 codes (S2 Table), pharmaceutical subclasses, RxNorm codes [31], and EHR documentation flowsheets (for vitals). The output was delivered as a clinical decision support system to ED providers. For visualization purposes, the COVID-19 severity risk score was multiplied by 100, and cut points were defined that identified patients as Low Risk (low probability of the primary outcome) or High Risk (high probability of the primary outcome). The visualization (S1 Fig) was highlighted on the patient sidebar, available to all ED providers and nurses, as well as physicians and staff involved in triage, patient flow, and capacity management.
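The display logic (score multiplied by 100 plus risk flags) can be sketched roughly as follows. The `display_risk` helper, the specific cut points (borrowed from the quintile boundaries reported in the Results), and the "Intermediate" label are illustrative assumptions, not the production EHR implementation.

```python
def display_risk(probability, low_cut=0.0104, high_cut=0.168):
    """Map a model probability to the CDS display: score x 100 plus a risk
    flag. Cut points are illustrative, taken from the reported lowest- and
    highest-quintile score boundaries, not the deployed thresholds."""
    score = round(probability * 100, 1)
    if probability <= low_cut:
        flag = "Low Risk"        # low probability of the primary outcome
    elif probability >= high_cut:
        flag = "High Risk"       # high probability of the primary outcome
    else:
        flag = "Intermediate"    # hypothetical middle band
    return score, flag
```

In deployment, the returned score and flag would be rendered on the patient sidebar for ED providers and triage staff.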


Descriptive results

A total of 2,041 patients were included in the final model training (1,469), testing (158), and temporal validation (414) (Fig 1). Table 1 lists patient characteristics in each cohort. Overall, significant differences in all demographic, home medication, comorbidity, and 24-hour vital sign variables existed across training and validation cohorts, except for loop diuretic use, inflammatory bowel disease, and rheumatoid arthritis. Compared to COVID-19 PCR-positive patients in the training set, patients in the temporal validation set and PUI set were slightly younger (median age of 52.2 and 49.1 years vs. 53.6 years) and had lower rates of ICU admission (18.1% and 10.8% vs. 23.4%), ventilator use (3.4% and 5.3% vs. 11.1%), and mortality (1.7% and 3.5% vs. 8.5%). Compared to the training set, the real-time data set was older (median age of 56.9 years) and had lower rates of ICU admission, ventilator use, and mortality (9.4%, 3.5%, and 6.8%, respectively).

Fig 1. Study diagram detailing the selection of patients for model generation.

Table 1. Characteristics of the patients included in a training set, temporal validation set, PUI set, and real-time validation set.

Table 2 describes the odds ratios used in logistic regression model generation. “Other” race and inflammatory bowel disease are the two variables with the highest odds ratios that reached statistical significance; warfarin is the variable with the lowest odds ratio that reached statistical significance. The model included factors that increase the odds of COVID-19 severity, such as age; male gender; Asian or Hispanic race; obesity; use of a calcium channel blocker, rivaroxaban, oral steroids, clopidogrel, aspirin, or a loop diuretic; hypertension; type 2 diabetes mellitus; venous thromboembolism; pacemaker/automatic implantable cardioverter-defibrillator; pulmonary hypertension; chronic kidney disease; inflammatory bowel disease; and maximum temperature, heart rate, and respiratory rate in 24 hours. It also included factors that decrease the odds, such as use of hydrochlorothiazide, an angiotensin-converting enzyme inhibitor, an angiotensin II receptor blocker, or warfarin; rheumatoid arthritis; and minimum peripheral oxygen saturation and systolic blood pressure in 24 hours.

In the validation cohorts, the risk score was used to identify a clinically useful threshold to predict the institutional metric. Multiple thresholds were defined, and 2x2 contingency tables including sensitivity, specificity, PPV, and NPV were created for each threshold. System leadership reviewed the various thresholds and, based on clinical resources, defined an appropriate threshold. The multidisciplinary team reviewed model performance, including sensitivity, specificity, PPV, NPV, and likelihood ratios across multiple thresholds, to facilitate rapid implementation. Cut-off points flagging high- and low-risk patients were chosen in collaboration with system leadership following engagement with front-line providers. The goal for the low-risk cut-off was high sensitivity at the expense of specificity, to reduce potential errors associated with inappropriate discharge home. The goal for the high-risk cut-off was higher specificity, to balance the need for close monitoring with resource scarcity, including ICU and step-down capacity.

Pre-implementation validation: Temporal validation and in PUI

The model produced an AUROC of 0.87 (95% CI: 0.83, 0.91) for predicting the primary outcome (ICU admission, ventilator use, or death) using the temporal validation cohort (S2 Fig). None of the patients with the lowest 20% of scores (0–0.0104) had ICU admission, ventilator use, or death, compared to 62%, 15.9%, and 7.3%, respectively, for patients with the highest 20% of scores (0.168–1.0) (S3 Table). At a cut point of >0.1, the model had a sensitivity of 73.7% and specificity of 79.9% in predicting the composite outcomes (S4 Table).

This model was further tested in the PUI cohort, which included 13,271 patients who had a SARS-CoV-2 test with a “symptomatic” designation ordered in the ED. Of note, the cumulative COVID-19 positive rate in the PUI data set was 26.8% (3,561 of 13,271 patients). A total of 68% of patients were discharged before the test resulted in our medical system. The model produced an AUROC of 0.82 (95% CI: 0.81, 0.83) for predicting the composite outcomes in the PUI cohort (S3 Fig). Among patients with the lowest 20% of scores (0.00062–0.0074), only 1.0% had ICU admission, 0.3% ventilator use, and 0.2% died, compared to 31.6%, 13.8%, and 11.9%, respectively, for patients with the highest 20% of scores (0.168–1.0) (S5 Table). At the cut point of >0.1, the model had a sensitivity of 52.2% and specificity of 88.1% in predicting the composite outcomes (S5 Table).

Real-time validation

Critically, we implemented this model to predict the composite outcome of COVID-19 severity and assessed its real-time performance. The COVID-19 positive rate in the real-time validation set was 61.2% (1,331 of 2,174 patients). This real-time cohort had a median age of 56.9 years (IQR: 35.4–72.4), an ICU admission rate of 9.4%, a ventilation rate of 3.5%, and a mortality rate of 6.8%. The model had an AUROC of 0.85 (95% CI: 0.83, 0.87) for predicting the primary outcome in the real-time data set (Fig 2). The rates of ICU admission, ventilator use, and death in patients with the lowest 20% of scores (0.001–0.009) were zero, significantly lower than the corresponding rates (32.7%, 15.5%, and 22.0%, respectively) for patients with the highest 20% of scores (0.20–0.99) (Table 3). At the cut point of >0.1, the model had a sensitivity of 78% and a specificity of 71% in the real-time data set (Table 4). To evaluate the predicted probabilities in the real-world setting, we constructed a calibration plot for the real-time validation set (S4 Fig).
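A calibration check of this kind, comparing mean predicted risk with the observed outcome rate within risk bins, can be sketched as follows. This is an illustrative Python example with simulated, perfectly calibrated risks; it is not the paper's actual S4 Fig computation, and the choice of five quantile bins simply mirrors the quintile-based reporting elsewhere in the paper.

```python
import numpy as np
from sklearn.calibration import calibration_curve

# Simulate a real-time-sized cohort (n = 2,174) whose outcomes are drawn
# from the predicted risks, i.e., a perfectly calibrated model.
rng = np.random.default_rng(0)
prob = rng.uniform(0.0, 1.0, 2174)   # stand-in predicted probabilities
y = rng.binomial(1, prob)            # outcomes consistent with those risks

# Observed event fraction vs. mean predicted risk in five quantile bins;
# points near the diagonal indicate good calibration.
frac_pos, mean_pred = calibration_curve(y, prob, n_bins=5,
                                        strategy="quantile")
```

Plotting `frac_pos` against `mean_pred` (with the identity line) yields the familiar calibration plot; systematic deviation in the top bin would correspond to the over-prediction at the high-risk end discussed in the limitations.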

Table 3. Distribution of outcomes by score ranges in quintile for the real-time validation data set (n = 2,174).

Table 4. Clinical performance of the logistical model for predicting COVID-19 severity* in the real-time validation data set (n = 2174).

Model performance on individual and combined outcomes and across minorities

The AUROCs of all cohorts for predicting the individual and combined outcomes are listed in Table 5. Performance remained strong for predicting secondary outcomes in combinations of ICU admission, need for mechanical ventilation, and mortality.

Table 5. AUROC of all cohorts for predicting the individual and composite outcomes.

Furthermore, the model performed equitably across gender and racial/ethnic minorities (Table 6). The AUROC for Black and Asian patients was 0.94 and 0.94 (95% CI: 0.90, 0.99), respectively, compared to 0.82 (95% CI: 0.78, 0.86) for White patients (p > 0.05) (Table 6). There was no statistical difference in model performance between female and male patients (p > 0.05).
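One simple way to compare subgroup discrimination is to bootstrap a confidence interval for the AUROC within each subgroup and check for overlap. The sketch below, with a hypothetical `bootstrap_auroc_ci` helper on toy data, is a generic illustration and not necessarily the statistical test used in the paper.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def bootstrap_auroc_ci(y, scores, n_boot=500, seed=0):
    """Bootstrap a 95% CI for the AUROC within one subgroup. Overlapping
    CIs across subgroups are consistent with equitable performance."""
    rng = np.random.default_rng(seed)
    y, scores = np.asarray(y), np.asarray(scores)
    aucs = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y), len(y))   # resample with replacement
        if y[idx].min() == y[idx].max():        # need both classes for AUROC
            continue
        aucs.append(roc_auc_score(y[idx], scores[idx]))
    return np.percentile(aucs, [2.5, 97.5])

# Toy subgroup: 200 patients, ~20% event rate, moderately informative scores.
rng = np.random.default_rng(1)
y = rng.binomial(1, 0.2, 200)
scores = y * 0.3 + rng.uniform(0, 1, 200)       # signal plus noise
lo, hi = bootstrap_auroc_ci(y, scores)
```

Running the same helper per gender or race/ethnicity subgroup and comparing the resulting intervals gives a rough equity check analogous to Table 6.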

Table 6. Sensitivity analysis across gender/racial/ethnic minorities.


We developed and implemented an ML-enabled model to predict increased risk of COVID-19 severity, supporting ED physicians' clinical decision-making across our 12-site medical system. Despite significant variability in the included factors, our model performed well in a large PUI study population. This approach is beneficial for clinical decision-making in the ED, where COVID-19 PCR test results are often not yet available. Importantly, we evaluated our model in real time in PUI patients seeking acute care in the ED after the score became available in the EHR, and model performance remained strong. The differences in ICU admission, ventilator use, and mortality rates between the training set and the temporal, PUI, and real-time validation sets can be explained by the temporal improvement in COVID-19 patient outcomes noted in other studies [4, 32, 33]. COVID-19 ICU admission and patient survival improved in our study over time, as in other reports, perhaps because of better understanding of the disease and improvement of treatment as the pandemic progressed [4, 33]. In our study, at a cutoff of 0.1 for COVID-19 severity, our model had a sensitivity of 73.7% and specificity of 79.9% in the prospective validation set; 52.2% and 88.1%, respectively, in the PUI set; and 78% and 71%, respectively, in the real-time validation set. These results show good discrimination for patients with scores associated with increased rates of the primary outcome. Furthermore, the performance of the models was robust to the secular improvements in outcomes throughout validation.

Our model's purpose was to estimate the risk of severe disease, using ML as CDS, in patients presenting to the ED with confirmed or suspected COVID-19. Furthermore, our goal was to use this model as CDS to facilitate shared decision-making between ED providers and patients regarding ED discharge and home saturation monitoring. The included variables (demographics, comorbidities, home medications, vital signs) are readily available in the ED. Laboratory values, which are not always obtainable in the ED, were not included in the final model; this approach appeared feasible in a recently published ML model [34]. The variables associated with a significantly higher risk of COVID-19 severity in our model were male gender, older age, “Other” race, increased temperature, increased respiratory rate, decreased oxygen saturation, and inflammatory bowel disease. Comparable to our model, vital signs, age, BMI, and comorbidities were the most important predictors in other investigations and reviews [35, 36]. Oxygen saturation and patient age were strong risk factors for deterioration and mortality in COVID-19 in a systematic evaluation of predictive models [13]. The use of warfarin appeared to be protective for our study's composite outcome, similar to another report [37]. Hypercoagulability and the need for anticoagulation are well recognized in COVID-19 and likely result from an increased immune response [38, 39]. We included variables that were not significant on univariate analysis as well as variables that were protective. These variables made our model valuable in real life, where many covariates and confounding factors exist, and improved model calibration.

It is imperative that ML models be evaluated for equity across gender, race, and ethnicity. We included gender, race, and ethnicity in our model given the association of minority populations and male gender with worse COVID-19 outcomes [23, 24, 40–42]. While others chose to create different prognostic models for males and females, we decided to include all patients in one model [18]. Male gender was a significant predictor in our study, and the AUROC in male patients showed good performance without statistical difference from the AUROC in female patients. While including race has led to over- and undertreatment of minority populations [43, 44] due to sampling bias, others argue that creating a “race-unaware” model also carries risk in specific situations [45]. One such situation is when race/ethnicity is associated with increased risk of the outcome, as with “Other” race in our study, which showed increased risk of COVID-19 severity. A model built without race or ethnicity is trained to reflect the majority population and will inherently underappreciate risk across minority populations [45]. Our model performed equitably across racial/ethnic minorities and did not risk widening the disparate outcomes observed throughout the pandemic [24, 46, 47]. By increasing treatment and resource allocation to non-White patients, we hypothesize that this will increase equitable treatment allocation and attenuate disparate care.

Unlike most prognostic models predicting COVID-19 diagnosis [35, 48–50], our study aimed to implement and assess a predictive model in patients with suspected COVID-19 disease, or PUI. It is worth noting that 68% of our patients were discharged before the test resulted in our medical system. During this period of uncertainty, many ED physicians are required to make clinical and triage decisions. Previous predictive models for patients with suspected COVID-19 infection have used imaging, demographics, signs and symptoms, and vital signs to predict the likelihood of COVID-19 diagnosis, but they have not sought to predict the severity of the disease [12, 14, 51]. The Epic Deterioration Index (EDI) is a proprietary deterioration index developed in 3 US hospitals between 2012 and 2016; although it is not specific to COVID-19, it has been introduced in over 100 US hospitals to predict COVID-19 deterioration [30].

Multiple prognostic models for COVID-19 have been previously developed [14, 18, 30, 52, 53]. However, previous models suffer from multiple limitations. For example, many prior prognostic models included very limited training datasets [54, 55]. The largest study to date, published in Great Britain, used the 4C Mortality Score to stratify the severity of COVID-19 [56]. In contrast to our model, the 4C score includes some laboratory values (urea level and C-reactive protein) not always available in the ED, and used data from COVID-19 positive patients admitted to the hospital (AUROC 0.79; 95% CI 0.78–0.79). A systematic external validation of 22 prognostic models in a cohort of 411 patients with COVID-19 found that the NEWS2 score, which predicted ICU admission or death within 14 days of symptom onset, achieved the highest AUROC (0.78; 95% CI 0.73–0.83) [13]. The EDI was recently tested on 392 hospitalized COVID-19 patients in a single center, with an AUROC of 0.79 (95% CI 0.74–0.84) [30]. Our model's performance for predicting COVID-19 severity in the prospective validation, PUI, and real-time data sets is more robust than in the above-mentioned external validations of prognostic models. Data from the national Registry of suspected COVID-19 in Emergency care (RECOVER network), comprising 116 hospitals from 25 states in the US, produced a 13-variable score that can predict the probability of infection in patients presenting to the ED with suspected COVID-19 [57]. The large RECOVER registry used patient data such as age, temperature, oxygen saturation, symptoms, and ethnicity that are readily available in the ED; however, the score was developed with retrospective data and was not tested in real time [57].

Strengths and limitations

Our study has several strengths. First, the model was validated in patients with a COVID-19 diagnosis and in patients with suspected COVID-19. Second, the logistic regression-based ML used data readily available in the ED. Third, we included variables that were non-significant or protective in univariate analysis, making the logistic regression-based ML more suitable for real life, where many confounders exist. Fourth, the model was tested in real time in patients with suspected COVID-19 who presented to the acute care setting, as CDS for ED providers and patients. Finally, our model was tested for gender and race/ethnicity differences and performed equitably, avoiding disparities.

These findings must be viewed within the context of the following limitations. First, this study was conducted within a single healthcare system. Despite a large catchment area that includes surrounding states, the results are specific to the regional patient population in which the models were derived until they have been validated in other populations with different demographic and socioeconomic backgrounds. Second, our model over-predicted disease severity, making it more valuable for patient safety than for resource utilization. Third, the accuracy of the patient comorbidities and medications available in the ED relies on the history in the EHR, which is not consistently updated during the acute care visit. Fourth, as seen in the calibration plot, the model does suffer from miscalibration at the high-risk end; this is likely due to imbalance in the dataset, which contained relatively few "bad outcomes". Future studies will seek to increase the sample size and to include external institutions, which will aid in further optimizing the model and in addressing its generalizability. Lastly, this study sought to develop, validate, and implement a prediction model to support clinical decision-making. Importantly, the model was never intended to replace clinical judgment; rather, it was intended to complement and better inform providers and patients, specifically when there is a large degree of clinical uncertainty. The effect on clinical decisions and the long-term effect on patient safety remain to be determined and were beyond the scope of this analysis.
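The high-risk-end miscalibration described above is the kind of pattern a reliability (calibration) curve makes visible: when positive outcomes are rare, predicted probabilities in the top bins can sit well above the observed event fractions. A brief synthetic sketch (not the authors' code or data; the over-prediction factor is invented for illustration):

```python
# Hypothetical sketch: a reliability/calibration curve showing
# over-prediction at the high-risk end on a mostly low-risk
# synthetic cohort. Not the study's actual calibration analysis.
import numpy as np
from sklearn.calibration import calibration_curve

rng = np.random.default_rng(1)

# Mostly low-risk synthetic patients, with a model that
# systematically over-predicts risk (factor of 1.5, capped at 1).
n = 5000
true_p = rng.beta(2, 5, size=n)
y = rng.binomial(1, true_p)
pred_p = np.clip(true_p * 1.5, 0, 1)

# frac_pos: observed event fraction per bin; mean_pred: mean
# predicted probability per bin. Well-calibrated => roughly equal.
frac_pos, mean_pred = calibration_curve(y, pred_p, n_bins=5)

for fp, mp in zip(frac_pos, mean_pred):
    print(f"predicted {mp:.2f} -> observed {fp:.2f}")
```

In this sketch the top bins show mean predicted risk exceeding the observed event fraction, mirroring the over-prediction the calibration plot revealed; enlarging the sample of "bad outcomes", as the authors propose, is one standard remedy.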

Conclusions

COVID-19 has burdened healthcare systems on multiple fronts, and finding ways to alleviate that stress is crucial. CDS through ML-enabled predictive modeling may improve patient care, reduce undue variation in decision-making, and optimize resource utilization, especially during a pandemic. We present the successful development and implementation, across 12 hospitals, of a COVID-19 prediction model that performs well across gender, race, and ethnicity for three different outcomes. The primary severity-of-illness outcome performed well in the PUI population despite the model having been developed in a COVID-19-positive population. Studies of the effect on patient outcomes and resource use are needed to further assess the benefits of the model presented here.

Supporting information

S1 Table. Factors of interest and factors selected for the final model.


S2 Table. Categories of comorbidities and ICD-10 codes.


S3 Table. Distribution of outcomes by score ranges in quintiles for the temporal validation data set (n = 414).


S4 Table. Clinical performance of the logistic model for predicting COVID-19 disease severity* in the temporal validation data set (n = 414).


S5 Table. Distribution of outcomes by score ranges in quintiles for the PUI data set (n = 13,271).


S6 Table. Clinical performance of the logistic model for predicting COVID-19 severity* in the PUI data set (n = 13,271).


S1 Fig. Implementation of the model for predicting COVID-19 severity in ED.


S2 Fig. ROC curve for prospective validation (n = 414).


S3 Fig. ROC curve for PUI validation (n = 13,271).


S4 Fig. Real-time validation calibration plot.



Acknowledgments

The authors wish to thank Sean Switzer, DO, Eric Murray, and Iva Ninkovic for valuable technical support, and Gyorgy Simon, PhD, for regression analysis and model building advice.


References

1.
2. Wu Z, McGoogan JM. Characteristics of and Important Lessons From the Coronavirus Disease 2019 (COVID-19) Outbreak in China: Summary of a Report of 72 314 Cases From the Chinese Center for Disease Control and Prevention. JAMA. 2020. Epub 2020/02/24. pmid:32091533.
3. Asch DA, Sheils NE, Islam MN, Chen Y, Werner RM, Buresh J, et al. Variation in US Hospital Mortality Rates for Patients Admitted With COVID-19 During the First 6 Months of the Pandemic. JAMA Intern Med. 2020. Epub 2020/12/22. pmid:33351068.
4. Acosta AM, Garg S, Pham H, et al. Racial and Ethnic Disparities in Rates of COVID-19-Associated Hospitalization, Intensive Care Unit Admission, and In-Hospital Death in the United States From March 2020 to February 2021. JAMA Netw Open. 2021;4(10):e2130479. pmid:34673962
5. Ingraham NE, Tignanelli CJ. Fact Versus Science Fiction: Fighting Coronavirus Disease 2019 Requires the Wisdom to Know the Difference. Crit Care Explor. 2020;2(4):e0108. Epub 2020/04/29. pmid:32426750.
6. Wallace DW, Burleson SL, Heimann MA, Crosby JC, Swanson J, Gibson CB, et al. An adapted emergency department triage algorithm for the COVID-19 pandemic. J Am Coll Emerg Physicians Open. 2020. Epub 2020/08/25. pmid:32838392.
7. Levy Y, Frenkel Nir Y, Ironi A, Englard H, Regev-Yochay G, Rahav G, et al. Emergency Department Triage in the Era of COVID-19: The Sheba Medical Center Experience. Isr Med Assoc J. 2020;8(22):404–9. pmid:33236578.
8. Chung HS, Lee DE, Kim JK, Yeo IH, Kim C, Park J, et al. Revised Triage and Surveillance Protocols for Temporary Emergency Department Closures in Tertiary Hospitals as a Response to COVID-19 Crisis in Daegu Metropolitan City. J Korean Med Sci. 2020;35(19):e189. Epub 2020/05/19. pmid:32419401.
9. O'Reilly GM, Mitchell RD, Noonan MP, Hiller R, Mitra B, Brichko L, et al. Informing emergency care for COVID-19 patients: The COVID-19 Emergency Department Quality Improvement Project protocol. Emerg Med Australas. 2020;32(3):511–4. Epub 2020/04/08. pmid:32255567.
10. Jaffe E, Sonkin R, Alpert EA, Magid A, Knobler HY. Flattening the COVID-19 Curve: The Unique Role of Emergency Medical Services in Containing a Global Pandemic. Isr Med Assoc J. 2020;8(22):410–6. pmid:33236579.
11. Penverne Y, Leclere B, Labady J, Berthier F, Jenvrin J, Javaudin F, et al. Impact of two-level filtering on emergency medical communication center triage during the COVID-19 pandemic: an uncontrolled before-after study. Scand J Trauma Resusc Emerg Med. 2020;28(1):80. Epub 2020/08/18. pmid:32799911.
12. Hüfner A, Kiefl D, Baacke M, Zöllner R, Loza Mencía E, Schellein O, et al. [Risk stratification through implementation and evaluation of a COVID-19 score: A retrospective diagnostic study]. Med Klin Intensivmed Notfmed. 2020. Epub 2020/11/06. pmid:33156352.
13. Gupta RK, Marks M, Samuels THA, Luintel A, Rampling T, Chowdhury H, et al. Systematic evaluation and external validation of 22 prognostic models among hospitalised adults with COVID-19: An observational cohort study. Eur Respir J. 2020. Epub 2020/09/25. pmid:32978307.
14. Wynants L, Van Calster B, Collins GS, Riley RD, Heinze G, Schuit E, et al. Prediction models for diagnosis and prognosis of covid-19 infection: systematic review and critical appraisal. BMJ. 2020;369:m1328. Epub 2020/04/07. pmid:32265220.
15. Shamsoddin E. Can medical practitioners rely on prediction models for COVID-19? A systematic review. Evid Based Dent. 2020;21(3):84–6. Epub 2020/09/27. pmid:32978532.
16. COVID-19 PRECISE.
17. Rada G, Verdugo-Paiva F, Ávila C, Morel-Marambio M, Bravo-Jeria R, Pesce F, et al. Evidence synthesis relevant to COVID-19: a protocol for multiple systematic reviews and overviews of systematic reviews. Medwave. 2020;20(3):e7868. Epub 2020/04/01. pmid:32255438.
18. Clift AK, Coupland CAC, Keogh RH, Diaz-Ordaz K, Williamson E, Harrison EM, et al. Living risk prediction algorithm (QCOVID) for risk of hospital admission and mortality from coronavirus 19 in adults: national derivation and validation cohort study. BMJ. 2020;371:m3731. Epub 2020/10/22. pmid:33082154
19. Bramante CT, Ingraham NE, Murray TA, Marmor S, Hovertsen S, Gronski J, et al. Metformin and risk of mortality in patients hospitalised with COVID-19: a retrospective cohort analysis. Lancet Healthy Longev. 2021;2(1):e34–e41. Epub 2021/02/02. pmid:33521772.
20. Bramante C, Tignanelli CJ, Dutta N, Jones E, Tamariz L, Clark JM, et al. Non-alcoholic fatty liver disease (NAFLD) and risk of hospitalization for Covid-19. medRxiv: the preprint server for health sciences. 2020:2020.09.01.20185850. pmid:32909011.
21. Tibshirani R. Regression Shrinkage and Selection via the Lasso. Journal of the Royal Statistical Society Series B (Methodological). 1996;58(1):267–88.
22. Petrilli CM, Jones SA, Yang J, Rajagopalan H, O'Donnell L, Chernyak Y, et al. Factors associated with hospital admission and critical illness among 5279 people with coronavirus disease 2019 in New York City: prospective cohort study. BMJ. 2020;369:m1966. Epub 2020/05/24. pmid:32444366
23. Lusczek ER, Ingraham NE, Karam BS, Proper J, Siegel L, Helgeson ES, et al. Characterizing COVID-19 clinical phenotypes and associated comorbidities and complication profiles. PLoS One. 2021;16(3):e0248956. Epub 2021/04/01. pmid:33788884.
24. Ingraham NE, Purcell LN, Karam BS, Dudley RA, Usher MG, Warlick CA, et al. Racial and Ethnic Disparities in Hospital Admissions from COVID-19: Determining the Impact of Neighborhood Deprivation and Primary Language. J Gen Intern Med. 2021. Epub 2021/05/19. pmid:34003427.
25. Dutta N, Ingraham NE, Usher MG, Fox C, Tignanelli CJ, Bramante CT. We Should Do More to Offer Evidence-Based Treatment for an Important Modifiable Risk Factor for COVID-19: Obesity. J Prim Care Community Health. 2021;12:2150132721996283. Epub 2021/03/03. pmid:33648370.
26. Bramante CT, Buse J, Tamaritz L, Palacio A, Cohen K, Vojta D, et al. Outpatient metformin use is associated with reduced severity of COVID-19 disease in adults with overweight or obesity. J Med Virol. 2021;93(7):4273–9. Epub 2021/02/14. pmid:33580540.
27. Elliott J, Bodinier B, Whitaker M, Delpierre C, Vermeulen R, Tzoulaki I, et al. COVID-19 mortality in the UK Biobank cohort: revisiting and evaluating risk factors. Eur J Epidemiol. 2021. Epub 2021/02/15. pmid:33587202.
28. Guan WJ, Liang WH, Zhao Y, Liang HR, Chen ZS, Li YM, et al. Comorbidity and its impact on 1590 patients with COVID-19 in China: a nationwide analysis. Eur Respir J. 2020;55(5). Epub 2020/05/14. pmid:32217650.
29.
30. Singh K, Valley TS, Tang S, Li BY, Kamran F, Sjoding MW, et al. Evaluating a Widely Implemented Proprietary Deterioration Index Model among Hospitalized Patients with COVID-19. Ann Am Thorac Soc. 2021;18(7):1129–37. pmid:33357088.
31. Nelson SJ, Zeng K, Kilbourne J, Powell T, Moore R. Normalized names for clinical drugs: RxNorm at 6 years. J Am Med Inform Assoc. 2011;18(4):441–8. Epub 2011/04/26. pmid:21515544.
32. Auld SC, Caridi-Scheible M, Blum JM, Robichaux C, Kraft C, Jacob JT, et al. ICU and Ventilator Mortality Among Critically Ill Adults With Coronavirus Disease 2019. Crit Care Med. 2020;48(9):e799–e804. pmid:32452888.
33. Kristinsson B, Kristinsdottir LB, Blondal AT, Thormar KM, Kristjansson M, Karason S, et al. Nationwide Incidence and Outcomes of Patients With Coronavirus Disease 2019 Requiring Intensive Care in Iceland. Crit Care Med. 2020;48(11):e1102–e5. Epub 2020/08/17. pmid:32796182.
34. Mahdavi M, Choubdar H, Zabeh E, Rieder M, Safavi-Naeini S, Jobbagy Z, et al. A machine learning based exploration of COVID-19 mortality risk. PLoS One. 2021;16(7):e0252384. Epub 20210702. pmid:34214101.
35. Hao B, Sotudian S, Wang T, Xu T, Hu Y, Gaitanidis A, et al. Early prediction of level-of-care requirements in patients with COVID-19. Elife. 2020;9. Epub 2020/10/13. pmid:33044170.
36. Ingraham NE, Barakat AG, Reilkoff R, Bezdicek T, Schacker T, Chipman JG, et al. Understanding the renin-angiotensin-aldosterone-SARS-CoV axis: a comprehensive review. Eur Respir J. 2020;56(1). Epub 2020/04/29. pmid:32341103.
37. Harrison RF, Forte K, Buscher MG, Chess A, Patel A, Moylan T, et al. The Association of Preinfection Daily Oral Anticoagulation Use and All-Cause in Hospital Mortality From Novel Coronavirus 2019 at 21 Days: A Retrospective Cohort Study. Crit Care Explor. 2021;3(1):e0324. Epub 2021/01/22. pmid:33521644.
38. Obi AT, Barnes GD, Napolitano LM, Henke PK, Wakefield TW. Venous thrombosis epidemiology, pathophysiology, and anticoagulant therapies and trials in severe acute respiratory syndrome coronavirus 2 infection. J Vasc Surg Venous Lymphat Disord. 2021;9(1):23–35. Epub 2020/09/12. pmid:32916371.
39. Ingraham NE, Lotfi-Emran S, Thielen BK, Techar K, Morris RS, Holtan SG, et al. Immunomodulation in COVID-19. Lancet Respir Med. 2020;8(6):544–6. Epub 2020/05/08. pmid:32380023.
40. Rentsch CT, Kidwai-Khan F, Tate JP, Park LS, King JT, Skanderson M, et al. Covid-19 by Race and Ethnicity: A National Cohort Study of 6 Million United States Veterans. medRxiv. 2020. Epub 2020/05/18. pmid:32511524.
41. Ciceri F, Castagna A, Rovere-Querini P, De Cobelli F, Ruggeri A, Galli L, et al. Early predictors of clinical outcomes of COVID-19 outbreak in Milan, Italy. Clin Immunol. 2020;217:108509. Epub 20200612. pmid:32535188.
42. Grasselli G, Greco M, Zanella A, Albano G, Antonelli M, Bellani G, et al. Risk Factors Associated With Mortality Among Patients With COVID-19 in Intensive Care Units in Lombardy, Italy. JAMA Intern Med. 2020;180(10):1345–55. pmid:32667669.
43. D'Agostino RB Sr, Grundy S, Sullivan LM, Wilson P, Group ftCRP. Validation of the Framingham Coronary Heart Disease Prediction Scores: Results of a Multiple Ethnic Groups Investigation. JAMA. 2001;286(2):180–7. pmid:11448281
44. Tillin T, Hughes AD, Whincup P, Mayet J, Sattar N, McKeigue PM, et al. Ethnicity and prediction of cardiovascular disease: performance of QRISK2 and Framingham scores in a UK tri-ethnic prospective cohort study (SABRE—Southall And Brent REvisited). Heart. 2014;100(1):60–7.
45. Paulus JK, Kent DM. Predictably unequal: understanding and addressing concerns that algorithmic clinical prediction may increase health disparities. npj Digital Medicine. 2020;3(1):99. pmid:32821854
46. Mackey K, Ayers CK, Kondo KK, Saha S, Advani SM, Young S, et al. Racial and Ethnic Disparities in COVID-19-Related Infections, Hospitalizations, and Deaths: A Systematic Review. Ann Intern Med. 2021;174(3):362–73. Epub 2020/12/01. pmid:33253040.
47. Ogedegbe G, Ravenell J, Adhikari S, Butler M, Cook T, Francois F, et al. Assessment of Racial/Ethnic Disparities in Hospitalization and Mortality in Patients With COVID-19 in New York City. JAMA Netw Open. 2020;3(12):e2026881. Epub 2020/12/05. pmid:33275153.
48. Haimovich AD, Ravindra NG, Stoytchev S, Young HP, Wilson FP, van Dijk D, et al. Development and Validation of the Quick COVID-19 Severity Index: A Prognostic Tool for Early Clinical Decompensation. Ann Emerg Med. 2020;76(4):442–53. Epub 2020/10/06. pmid:33012378.
49. Castro VM, McCoy TH, Perlis RH. Laboratory Findings Associated With Severe Illness and Mortality Among Hospitalized Individuals With Coronavirus Disease 2019 in Eastern Massachusetts. JAMA Netw Open. 2020;3(10):e2023934. Epub 2020/10/31. pmid:33125498.
50. Ioannou GN, Locke E, Green P, Berry K, O'Hare AM, Shah JA, et al. Risk Factors for Hospitalization, Mechanical Ventilation, or Death Among 10 131 US Veterans With SARS-CoV-2 Infection. JAMA Netw Open. 2020;3(9):e2022310. Epub 2020/09/01. pmid:32965502.
51. Sun Y, Koh V, Marimuthu K, Ng OT, Young B, Vasoo S, et al. Epidemiological and Clinical Predictors of COVID-19. Clin Infect Dis. 2020;71(15):786–92. pmid:32211755.
52. Paranjape N, Staples LL, Stradwick CY, Ray HG, Saldanha IJ. Development and validation of a predictive model for critical illness in adult patients requiring hospitalization for COVID-19. PLoS One. 2021;16(3):e0248891. Epub 20210319. pmid:33740030.
53. Wang AZ, Ehrman R, Bucca A, Croft A, Glober N, Holt D, et al. Can we predict which COVID-19 patients will need transfer to intensive care within 24 hours of floor admission? Acad Emerg Med. 2021;28(5):511–8. Epub 20210404. pmid:33675164.
54. Hu H, Yao N, Qiu Y. Comparing Rapid Scoring Systems in Mortality Prediction of Critically Ill Patients With Novel Coronavirus Disease. Acad Emerg Med. 2020;27(6):461–8. Epub 2020/04/21. pmid:32311790.
55. Ji D, Zhang D, Xu J, Chen Z, Yang T, Zhao P, et al. Prediction for Progression Risk in Patients With COVID-19 Pneumonia: The CALL Score. Clin Infect Dis. 2020;71(6):1393–9. Epub 2020/04/10. pmid:32271369.
56. Knight SR, Ho A, Pius R, Buchan I, Carson G, Drake TM, et al. Risk stratification of patients admitted to hospital with covid-19 using the ISARIC WHO Clinical Characterisation Protocol: development and validation of the 4C Mortality Score. BMJ. 2020;370:m3339. Epub 2020/09/09. pmid:32907855.
57. Kline JA, Camargo CA, Courtney DM, Kabrhel C, Nordenholz KE, Aufderheide T, et al. Clinical prediction rule for SARS-CoV-2 infection from 116 U.S. emergency departments 2-22-2021. PLoS One. 2021;16(3):e0248438. Epub 20210310. pmid:33690722.