COVID-IRS: A novel predictive score for risk of invasive mechanical ventilation in patients with COVID-19

Background: Coronavirus disease 2019 (COVID-19) is a systemic disease that can rapidly progress into acute respiratory failure and death. Timely identification of these patients is crucial for a proper administration of health-care resources. Objective: To develop a predictive score that estimates the risk of invasive mechanical ventilation (IMV) among patients with COVID-19. Study design: Retrospective cohort study of 401 COVID-19 patients diagnosed from March 12 to August 10, 2020. The score development cohort comprised 211 patients (52.62% of total sample) whereas the validation cohort included 190 patients (47.38% of total sample). We divided participants according to the need for IMV and looked for potential predictive variables. Results: We developed two predictive scores, one based on interleukin-6 (IL-6) and the other on the neutrophil/lymphocyte ratio (NLR), using the following variables: respiratory rate, SpO2/FiO2 ratio and lactic dehydrogenase (LDH). The area under the curve (AUC) in the development cohort was 0.877 (0.823–0.931) using the NLR-based score and 0.891 (0.843–0.939) using the IL-6-based score. When compared with other similar scores developed for the prediction of adverse outcomes in COVID-19, the COVID-IRS scores proved to be superior in the prediction of IMV. Conclusion: The COVID-IRS scores accurately predict the need for mechanical ventilation in COVID-19 patients using readily available variables taken upon admission. More studies testing the applicability of COVID-IRS in other centers and populations, as well as its performance as a triage tool for COVID-19 patients, are needed.



Background
SARS-CoV-2 is a viral pathogen that causes coronavirus disease 2019 (COVID-19). The clinical spectrum of COVID-19 varies widely. Up to 80% of patients present with a mild flu-like illness, but 20% develop a form of viral pneumonia with acute respiratory distress syndrome (ARDS). In turn, 15% require support with invasive mechanical ventilation (IMV) [4][5][6]. Among hospitalized COVID-19 patients, 5-33% will require admission to an intensive care unit (ICU) and 75% to 100% of them will require IMV [1]. Mortality rates vary from center to center, but in general they remain high in the group of critically ill patients who develop respiratory failure and require admission to the ICU for IMV [5].
Since the original outbreak in Wuhan, China, in December 2019, SARS-CoV-2 has rapidly spread around the world, reaching unprecedented pandemic proportions and overwhelming healthcare systems worldwide [2]. Mexico's public health system represents one of those cases, being the country with the third highest COVID-19 mortality rate [3,4]. One of the main challenges of the COVID-19 pandemic has been performing a proper triage that allows reasonable and cost-effective allocation of health-care resources [5][6][7]. Identifying patients who are likely to evolve into severe disease is a challenging task that surpasses good clinical judgement. Thus, there is an urgent need to develop tools capable of predicting the course of the disease. These could help clinicians select patients who are at risk and therefore warrant early life-saving interventions [4].

Objectives
To develop a new severity score for the prediction of IMV in COVID-19.

Study design
We retrospectively collected information from all COVID-19 patients aged 18 years or older admitted to the American British Cowdray Medical Center, a private teaching hospital in Mexico City, between March 12 and August 10, 2020. The diagnosis of COVID-19 was suspected based on clinical manifestations and confirmed by a positive PCR for SARS-CoV-2, carried out according to the guidelines published by the Centers for Disease Control [8], or, in case of a negative PCR, by a chest CT scan with findings characteristic of COVID-19. The primary outcome was the need for IMV.
Exclusion criteria included having a "Do Not Resuscitate" order or having incomplete data in the electronic medical record. The ethics committee waived the requirement for informed consent. All the analyzed data were fully anonymized from the moment they were captured and remained so for the entire duration of the study. The protocol (ID: ABC-20-50) was approved by our local scientific and ethics committees (Comité de Ética en Investigación, American British Cowdray Medical Center) and conducted according to the principles of the Declaration of Helsinki.

Development and validation cohort selection
We divided the cohort into two groups of roughly equal size using a random number generation algorithm. The larger group was used as the development cohort, while the smaller group was used as the validation cohort. We compared both cohorts using the chi-square test for categorical variables and the Mann-Whitney U test for continuous variables, in order to find significant differences in their baseline characteristics and outcomes.
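The random allocation described above can be sketched as follows. This is only an illustration: the text does not specify the algorithm or seed that was used, so the function name `split_cohort`, the seed, and the sequential patient identifiers are assumptions. The cohort sizes (211 and 190 out of 401) are the ones reported in the Results.

```python
import random

def split_cohort(patient_ids, n_development, seed=0):
    """Randomly assign patients to a development and a validation cohort."""
    ids = list(patient_ids)
    rng = random.Random(seed)  # fixed seed (assumed) so the split is reproducible
    rng.shuffle(ids)
    return ids[:n_development], ids[n_development:]

# 401 hypothetical patient identifiers, split 211 / 190 as reported in the Results
development, validation = split_cohort(range(1, 402), n_development=211, seed=2020)

print(len(development), len(validation))        # 211 190
print(round(100 * len(development) / 401, 2))   # 52.62
print(round(100 * len(validation) / 401, 2))    # 47.38
```

Shuffling once and cutting the list guarantees that the two cohorts are disjoint and together cover every patient.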

Potential predictive variables
We categorized patients' characteristics at hospital admission into the following groups of variables: demographic and anthropometric characteristics, clinical features, medical history, laboratory results, and clinical outcomes. Demographic and anthropometric characteristics included age, gender, body mass index, and ethnicity. Clinical features included vital signs, presence of symptoms characteristic of COVID-19 (dyspnea, fever, cough, etc.), and date of symptom onset. Medical history included currently diagnosed comorbidities (diabetes, hypertension, cancer, etc.), smoking status, alcohol consumption, and current medical treatments. Laboratory results included complete blood count (CBC), coagulation tests, blood chemistry panel, liver function tests, lipid profile, and inflammatory markers (interleukin-6 (IL-6), ultrasensitive C-reactive protein (CRP), D-dimer, fibrinogen and procalcitonin), as well as 25-hydroxyvitamin D3. Clinical outcomes included in-hospital death, length of stay and the need for invasive mechanical ventilation (IMV).

Predictive variable selection
Using the development cohort, we performed univariate logistic regressions for IMV using all the variables mentioned above. We selected all variables that had a p value <0.1 and conducted a backwards stepwise multivariate logistic regression to find the variables that were independently associated with the requirement of IMV. After selecting the optimal variables for the model, and in order to ensure the model's applicability in most settings, we checked the availability of the laboratory variables in general settings. This was done via a telephone survey of 7 different general hospitals in Mexico City and its surroundings. Variables that were unavailable in more than 50% of the screened hospitals were deemed not readily available. We tested for similar variables using the Spearman correlation test in order to identify suitable surrogates. Thus, we developed two predictive models, one constructed with optimal variables and the other with accessible surrogate variables.
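The screening steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the analysis that was run in STATA: the function names, the p values, the hospital counts, and the paired measurements are all hypothetical and are not study data. The Spearman coefficient is computed as the Pearson correlation of average ranks.

```python
def screen_univariate(p_values, threshold=0.1):
    """Keep variables whose univariate p value is below the threshold."""
    return [name for name, p in p_values.items() if p < threshold]

def readily_available(n_hospitals_with_test, n_hospitals_surveyed):
    """A variable unavailable in more than 50% of hospitals is not readily available."""
    unavailable = n_hospitals_surveyed - n_hospitals_with_test
    return unavailable / n_hospitals_surveyed <= 0.5

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical inputs, for illustration only
candidates = screen_univariate({"respiratory_rate": 0.001, "ldh": 0.02, "age": 0.4})
marker_a = [1, 2, 3, 4, 5]   # made-up values of an "optimal" marker
marker_b = [2, 1, 4, 3, 5]   # made-up values of a candidate surrogate

print(candidates)                                 # ['respiratory_rate', 'ldh']
print(readily_available(2, 7))                    # False
print(round(spearman_rho(marker_a, marker_b), 3)) # 0.8
```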

Construction of the score and assessment of accuracy
After identifying the predictive variables, we fitted locally weighted scatterplot smoothing (LOWESS) curves to the numerical variables in order to determine adequate intervals and cut-off points for both models. Subsequently, in order to assign a scoring value to the selected variables, we estimated their coefficient of variation using univariate logistic regressions and assigned the rounded-up coefficient as the numeric value for the score in the corresponding strata. We constructed receiver operating characteristic (ROC) curves in order to evaluate the performance of our scores. Goodness of fit was evaluated by means of the Hosmer-Lemeshow test and predictive performance was ascertained by the concordance index (C-index). We evaluated internal calibration with 2000 bootstrap samples. The score underwent external validation by comparing the ROC curves of the development and validation cohorts. Finally, we compared the ROC curves of our score with the calculated ROC curves of other scores that predict ventilatory deterioration or other adverse outcomes in COVID-19 patients (ABC-GOALScl, COVID-GRAM, NEWS-2, CURB-65, and CALL prediction model) [9][10][11][12][13] in both the development and validation cohorts. We compared the ROC curves of the aforementioned scores using only the data from those patients in whom all the scores were calculated appropriately. We performed all statistical analyses using STATA version 14 (StataCorp, College Station, Texas, USA) and GraphPad Prism 6.0 (GraphPad Software, San Diego, CA, USA).
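Two of the steps above lend themselves to a short sketch: turning a coefficient into score points by rounding up, and summarizing discrimination with the area under the ROC curve. The code below assumes that the coefficient in question is a logistic regression coefficient, computes the AUC as the probability that a patient who required IMV scores higher than one who did not (equivalent to the C-index for a binary outcome), and applies a percentile bootstrap to the AUC. The bootstrap here is an illustration of resampling with 2000 samples and is not necessarily how the calibration bootstrap was carried out. All function names and data are hypothetical.

```python
import math
import random

def points_from_coefficient(beta):
    """Round a coefficient up to the next integer to obtain score points."""
    return math.ceil(beta)

def auc(scores, outcomes):
    """AUC: fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(scores, outcomes, n_boot=2000, seed=0, alpha=0.05):
    """Percentile bootstrap interval for the AUC (outcomes must be coded 0/1)."""
    rng = random.Random(seed)
    n = len(scores)
    estimates = []
    while len(estimates) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [outcomes[i] for i in idx]
        if 0 < sum(ys) < n:  # a resample needs both outcome classes
            estimates.append(auc([scores[i] for i in idx], ys))
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Hypothetical score values and IMV outcomes (1 = required IMV), not study data
scores = [0, 1, 2, 3, 4, 5, 6, 7]
outcomes = [0, 0, 0, 1, 0, 1, 1, 1]

print(points_from_coefficient(1.2))  # 2
print(auc(scores, outcomes))         # 0.9375
print(bootstrap_auc_ci(scores, outcomes))
```

The pairwise definition of the AUC is quadratic in the sample size, which is acceptable for cohorts of a few hundred patients.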

Results
The score development cohort comprised 211 patients (52.62% of total sample) whereas the validation cohort included 190 patients (47.38% of total sample). We divided participants according to the need for IMV. Baseline population characteristics are depicted in Table 1. The comparison between the development and validation cohorts is shown in S1 Table (S1 Table. Comparison between the development and validation cohorts).

Predictive variables selection and score construction
S2 Table (S2 Table. Univariate logistic regressions for variable selection) depicts the univariate logistic regressions for all individual variables. Based on the backwards stepwise multivariate logistic regression (S3 Table. Multivariate logistic regression), we selected the following predictive variables for the development of the score: respiratory rate, SpO2/FiO2 ratio, LDH and IL-6. Since IL-6 was deemed not readily available in most settings, we decided to use the neutrophil/lymphocyte ratio (NLR) as a suitable surrogate, due to its easy availability and good performance in both the correlation test (Spearman's rho = 0.485, p<0.001) and the multivariate logistic model (coefficient 0.049, p = 0.004, R-squared = 0.3428) (Fig 1; supplementary table: Spearman's correlation results and R-squared of multivariate logistic regression models for surrogate variables).
We named our score COVID-IRS (Intubation Risk Score). We constructed two different versions of the score: COVID-IRS-IL6 using the optimal model and COVID-IRS-NLR using the accessible variables. We further stratified the aforementioned scores into low, moderate, high, and very high-risk categories. The scores and their respective interpretations are shown in Fig 2. Although there was a tendency towards a higher median number of days between hospital admission and the requirement of IMV in lower-risk groups (e.g., 5 days in low-risk patients vs. 1 day in high-risk patients), these differences did not prove to be statistically significant (COVID-IRS-NLR, p = 0.371; COVID-IRS-IL6, p = 0.275) (S1 Fig. Median days from patient admission until IMV requirement by risk group).

Discussion
In this study, we developed two novel prognostic scores for the prediction of IMV requirement in COVID-19 patients, using variables registered upon hospital admission. ROC analysis of data derived from both the development and the validation cohorts revealed an excellent performance of the NLR-based as well as of the IL-6-based scores. Importantly, according to our analysis, the NLR proved to be an outstanding surrogate of IL-6. When compared with other similar scores developed for the prediction of adverse outcomes in COVID-19, the COVID-IRS scores proved to be superior in the prediction of IMV.

PLOS ONE
We believe that the biomarkers used in the COVID-IRS scores (respiratory rate, SaO2/FiO2 ratio, LDH, and either IL-6 or NLR) accurately represent relevant aspects of the clinical phenomena seen in severe COVID-19. Both the respiratory rate and the SaO2/FiO2 ratio evaluate ventilatory function, whose deterioration is the main component associated with COVID-19 mortality [9,10]. The SaO2/FiO2 ratio was used as a surrogate for the PaO2/FiO2 ratio due to its availability and because it maintains a close linear relationship with O2-CO2 exchange and blood oxygenation [11]. LDH is involved in the anaerobic metabolism of glucose and is thus upregulated when oxygen supplies are limited [12]. LDH levels are increased in patients with COVID-19 pneumonia and have been associated with adverse outcomes and consistently included in COVID-19 severity scores [12]. Finally, IL-6 and the NLR reflect the severity of the ongoing inflammatory process and immune dysregulation [13][14][15][16]. IL-6 is a pleiotropic cytokine mainly secreted by activated macrophages in response to any aggressor. It promotes the production of acute phase reactants and the proliferation of myeloid cells, as well as neutrophil survival in lung tissue [17,18]. On the other hand, neutrophils, as effectors of the innate immune system, may reflect the severity of pneumonia and have been used as markers of poor prognosis in different inflammatory states, such as sepsis [17].
Lymphocytes, another important cell of the immune system, are recruited to damaged tissues and in the context of COVID-19 tend to migrate to lung and blood vessels, which partially accounts for the low peripheral lymphocyte count seen in these patients [19,20]. Thus, a high NLR is a reflection of the severity of the ongoing inflammatory process [21][22][23].
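The two derived variables discussed above are simple ratios, which a short sketch makes concrete. The function names and input values below are hypothetical and are not study data. The sketch assumes that neutrophil and lymphocyte counts share the same unit, that oxygen saturation is expressed in percent, and that FiO2 is expressed as a fraction (0.21 on room air).

```python
def neutrophil_lymphocyte_ratio(neutrophils, lymphocytes):
    """NLR: absolute neutrophil count divided by absolute lymphocyte count."""
    return neutrophils / lymphocytes

def saturation_fio2_ratio(saturation_percent, fio2_fraction):
    """Oxygen saturation (%) divided by the fraction of inspired oxygen."""
    return saturation_percent / fio2_fraction

# Hypothetical admission values, counts in the same unit (e.g. 10^3 cells/uL)
print(neutrophil_lymphocyte_ratio(6.0, 1.5))      # 4.0
print(saturation_fio2_ratio(92, 0.5))             # 184.0
print(round(saturation_fio2_ratio(95, 0.21), 1))  # 452.4 (room air)
```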
Both IL-6 and NLR have been used as prognostic markers in both influenza and community-acquired pneumonia [24]. It therefore seemed logical to try to use them as predictive biomarkers in patients with SARS-CoV-2 pneumonia [24,25]. Since the beginning of the pandemic, leukocytosis, lymphopenia and high levels of IL-6 have been consistently associated with poor prognosis in patients with COVID-19 [25]. The correlation between NLR and IL-6 has been previously described in other clinical contexts [11,18]. Our study is perhaps the first one to evaluate the equivalency between the NLR and the serum levels of IL-6 in the context of COVID-19 severity. Even though both measurements seem to accurately reflect severity, IL-6 measurements require specialized equipment and are readily available in only a few centers, while the NLR only requires a CBC, which is inexpensive and widely available [19,20].
Different prognostic scores for COVID-19 have been developed using different variables, including the presence of comorbidities, age, absolute lymphocyte count, LDH, oxygen saturation, respiratory rate, and bilateral opacities on CT scan, in order to identify patients at risk of adverse outcomes [26][27][28][29][30]. Some predictive scores have applications similar to those of the COVID-IRS score. The COVID-GRAM score was created to calculate the probability of developing critical COVID-19 using data from 1590 Chinese patients. Its AUC was 0.88 in both the development and the validation cohorts [27]. Another score, ABC-GOALS, was developed to predict ICU admission and is based on data from 329 patients admitted to a COVID-19 reference center in Mexico City. The ABC-GOALS score has 3 versions: a clinical-only model (ABC-GOALSc), a clinical and laboratory model (ABC-GOALScl), and a clinical, laboratory and x-ray model (ABC-GOALSclx). We only compared our data with the ABC-GOALScl score, because our dataset lacked sufficiently precise CT scan interpretation data. The AUC of the ABC-GOALScl score was 0.86 and 0.87 in its development and validation cohorts, respectively. More recently, the PREDICO score was developed for the prediction of severe respiratory failure, using the data of 1265 patients from eleven Italian hospitals. Its AUC was 0.89 and 0.85 in its development and validation cohorts, respectively. All the aforementioned scores have several variables in common with the COVID-IRS score, such as LDH, lymphocyte count (NLR in the COVID-GRAM score), respiratory rate and SaO2/FiO2 ratio [29].
The COVID-GRAM and ABC-GOALScl scores were not developed specifically to identify patients who would require IMV, and both achieved lower AUCs when tested directly in our population, in both the development and validation cohorts (COVID-GRAM: 0.787 and 0.773; ABC-GOALScl: 0.765 and 0.739). As mentioned earlier, both COVID-IRS scores were superior to the COVID-GRAM and ABC-GOALScl scores at predicting the need for IMV. Additionally, the Brescia-COVID Respiratory Severity Scale (BCRSS), a stepwise approach to managing patients with confirmed or presumed COVID-19 pneumonia [31], is a meaningful tool based on clinical features and chest x-ray changes for determining the escalation of ventilatory support. It is meant to be dynamic, with patients frequently reassessed and re-scored after interventions, and has been widely used in that center for evaluating patients from the emergency department and throughout hospitalization. We were not able to estimate the BCRSS's performance at predicting IMV risk in our cohort, or to compare it with our scores, due to lack of information in our records. Finally, all variables needed to calculate the COVID-GRAM, ABC-GOALScl, PREDICO and COVID-IRS-NLR scores can be easily obtained in the outpatient setting, and these scores could complement each other. Of note, the SOFA score had a similar AUC when compared with the COVID-IRS scores for predicting IMV. Due to the retrospective nature of our data, we did not distinguish patients who needed IMV on arrival or on the first day of admission from those who were intubated later during their hospital stay. Given that the SOFA score includes a variable for IMV, this most likely results in an overestimation of its capacity to predict the need for IMV in our population.
It is important to emphasize that some high-risk patients may not present with signs of respiratory distress upon admission, but can rapidly progress to ARDS, and thus need frequent monitoring [9,29,30]. In order to avoid overwhelming health care systems worldwide, the identification of these patients is a priority. The timely identification of these cases could help to reduce mortality and allow a reasonable and cost-effective allocation of human resources and infrastructure [5,31]. One of the possible benefits of our score comes from its utility in identifying which patients require this closer surveillance and which can safely have their evaluations spaced out. We identified four risk categories according to the probability of requiring IMV: low, moderate, high and very high risk. Low-risk patients have a low probability of requiring IMV and could benefit from a strategy that offers early discharge from the hospital and subsequent ambulatory visits. Patients with moderate-risk scores could remain in a hospital ward for surveillance. Finally, patients in the high-risk and very high-risk categories have an IMV probability of over 31.8%, and should therefore be kept in a ward that has enough personnel to provide frequent re-evaluations and prompt response times for emergency endotracheal intubation (such as intermediate care units). Further studies are needed in order to validate this application of the COVID-IRS.
The main limitations of our study are its retrospective nature and the fact that some of the patients received different medical treatments prior to hospitalization (such as glucocorticoids), which could act as confounders. Our results may not be representative of the general real-life situation prevailing in most COVID-19 centers; our mortality rate is rather low, which can be attributed to the availability of ICU facilities. Finally, the frequency of comorbidities and old age in our cohort is lower than that reported in other centers and could thus prove to be a factor that hampers the score's application in other settings.

S1 Fig. Median days from patient admission until IMV requirement by risk group.
Here we show the median time in days from patient admission until the patients required the initiation of IMV. There was a tendency towards a higher median number of days between patient admission to the hospital and the requirement of IMV in lower-risk groups. These differences did not prove to be statistically significant (COVID-IRS-NLR, p = 0.371; COVID-IRS-IL6, p = 0.275). (TIF)

S2 Fig. Predicted and observed percentages of patients who required IMV at each point of both COVID-IRS scores in the development and validation cohorts.
Here we show the correlation between observed and predicted percentages of patients who required IMV. Both predicted and measured risks showed a strong correlation. (TIF)