Clinical outcome, risk assessment, and seasonal variation in hospitalized COVID-19 patients—Results from the CORONA Germany study

Background After one year of the pandemic and hints of seasonal patterns, temporal variations of in-hospital mortality in COVID-19 are widely unknown. Additionally, heterogeneous data regarding clinical indicators predicting disease severity has been published. However, there is a need for a risk stratification model integrating the effects on disease severity and mortality to support clinical decision-making. Methods We conducted a multicenter, observational, prospective, epidemiological cohort study at 45 hospitals in Germany. Until 1 January 2021, all hospitalized SARS CoV-2 positive patients were included. A comprehensive data set was collected in a cohort of seven hospitals. The primary objective was disease severity and prediction of mild, severe, and fatal cases. Ancillary analyses included a temporal analysis of all hospitalized COVID-19 patients for the entire year 2020. Findings A total of 4704 COVID-19 patients were hospitalized with a mortality rate of 19% (890/4704). Rates of mortality, need for ventilation, pneumonia, and respiratory insufficiency showed temporal variations, whereas age had a strong influence on the course of mortality. In cohort conducting analyses, prognostic factors for fatal/severe disease were: age (odds ratio (OR) 1.704, CI:[1.221–2.377]), respiratory rate (OR 1.688, CI:[1.222–2.333]), lactate dehydrogenase (LDH) (OR 1.312, CI:[1.015–1.695]), C-reactive protein (CRP) (OR 2.132, CI:[1.533–2.965]), and creatinine values (OR 2.573, CI:[1.593–4.154]. Conclusions Age, respiratory rate, LDH, CRP, and creatinine at baseline are associated with all cause death, and need for ventilation/ICU treatment in a nationwide series of COVID 19 hospitalized patients. Especially age plays an important prognostic role. In-hospital mortality showed temporal variation during the year 2020, influenced by age. Trial registration number NCT04659187.


Introduction
The outbreak of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) causing novel coronavirus disease 2019 (COVID-19) infected millions of people in a worldwide pandemic situation after it was first discovered in December 2019 [1,2].
In Germany, the first case was observed on 27 January 2020. Infection numbers increased with a peak at the end of March (first wave or phase 1), low numbers during summer (phase 2) and an increase in October with a maximum in December 2020 (second wave or phase 3). The first SARS-CoV-2 variant of concern was detected at the end of December 2020 in Germany [3]. A governmental lockdown was implemented at 23 March until 20 April 2020 with closing of schools, restaurants, most shops, etc. Several restrictions remained until 11 Mai 2020. SARS-CoV-2 testing was recommended for hospitals and retirement homes on 15 October and new restrictions were implemented on 2 November with a new lockdown on 16 December 2020. Despite high numbers of hospitalized patients with a maximum of 5.643 intensive care patients with COVID-19 in December, hospitals were able to accept all patients and provide the same level of care throughout the pandemic [4].
After approximately one year of pandemic, it is possible to evaluate first observations regarding temporal changes of SARS-CoV-2 infection rates, which were reported previously [5]. Dennis et al. documented a decrease in mortality in a large cohort of intensive care patients from 1 March 2020 until 27 June 2020, assumed to be caused by introduction of effective treatments, improved physician understanding of the disease process, and a falling critical care burden [6]. However, temporal variations of in-hospital mortality in COVID-19 over a longer period of time are widely unreported.
While most SARS-CoV-2 infected patients have non-severe symptoms, 14% develop dyspnea and hypoxemia, which can lead to severe or fatal courses [7]. The often-observed silent hypoxemia and rapid worsening of symptoms, with patients becoming seriously ill in a short period of time, requires the need to detect patients at risk fast and easily. Previous studies described older age, male sex, cardiovascular disease, hypertension, and diabetes as possible risk factors [8][9][10]. Nevertheless, this crisis has led to the publication of multiple, heterogeneous results regarding the impact of clinical indicators on disease severity in hospitalized COVID-19 patients with different singular or combined endpoints. Structured, integrated approaches for data analysis are sparse.
The aim of this multicenter cohort study was to assess the impact of possible baseline risk factors on disease progression and on mortality with an integrated approach for data analysis. Additionally, we present temporal data for the entire year 2020 regarding in-hospital mortality, morbidity, and need for mechanical ventilation.

Study design
The "CORONA Germany"-Clinical Outcome and Risk in hospitalized COVID-19 patientsstudy (ClinicalTrials.gov, NCT04659187) is a multicenter, observational, prospective, epidemiological cohort study. It was conducted in 45 hospitals across Germany that were all part of the same hospital network (see S1 Appendix for list of hospitals). The trial was without funding and investigator-initiated; the steering committee was responsible for design, execution, and conduct of the study. The statistical analyses and interpretation of the data was approved by all members of the steering committee. An endpoint committee, provided by the networks research institute, reviewed all study endpoints. All data were matched to and validated by the networks' quality management data base. The authors attest to the accuracy of the data and of all analyses. The ethics committee of the General Medical Council (Aerztekammer) for the city of Hamburg and the ethics committee of the General Medical Council (Aerztekammer) for the city of Munich approved the study and determined that this work was exempt from human subjects' research regulations since all information was collected on a fully anonymized basis. Therefore, the ethics committees waived the need for participant consent.

Study participants and patients
Enrollment of all consecutive hospitalized patients testing positive for SARS CoV-2 (PCR, polymerase chain reaction test) was the key inclusion criterion. Patients with negative SARS CoV-2 testing and outpatients were excluded from this study.
The total sample size was open, reflecting the development of the pandemic situation, missing historic data and the explorative design of the study.
This dataset included all patients coded with international statistical classification of diseases and related health (ICD) for coronavirus SARS CoV-2 infection and need for hospitalization. Data about age, gender, outcome, necessity for mechanical ventilation (invasive and noninvasive ventilation), and all ICD-coded main (co-)diagnoses were given for the total study cohort. Patients were enrolled from the start of pandemic in February 2020 until 1 January 2021, while only patients with a complete hospital stay (until discharge or death) were analyzed.
Additionally, the predefined cohort from Hamburg and Gauting, including seven of the participating centers, provided a second data set consisting of more detailed information collected from 1 March until 15 September 2020. These seven centers are specialized hospitals regarding intensive care, cardiovascular care, and research. They all provided access to electronic patient charts of the same hospital information technology network. Therefore, development of a comprehensive second data set was feasible. Data entry was performed in a structured way, anonymously. The data fields were pre-formatted and periodic data checks ensured a high data quality. This dataset included n = 45 variables regarding baseline symptoms, baseline vital parameters, baseline laboratory findings, prior medication, and preexisting comorbidities. Each study participant was followed up for the whole length of the hospital stay.

Endpoints
The aim of this study was to assess the impact of possible risk factors on disease progression and on mortality. Therefore, the primary endpoint was assessment of disease severity defined by three stages: 1. non-survivors (fatal disease), 2. requirement of mechanical ventilation and/or ICU admission and afterwards successful hospital discharge (severe disease), and 3. recovery after a hospital stay solely on a normal, non-ICU ward (mild disease).
For simplification, these will be referred to as fatal, severe, and mild disease in the following manuscript. The impact of risk factors on COVID-19 disease in hospitalized patients was assessed in two steps. First, effects on fatal and severe disease were examined. Second, predictors on overall survival for patients with severe but non-fatal disease (ventilation and/or ICU) were analyzed.
Only baseline data i.e. variables regarding baseline symptoms, baseline vital parameters, baseline laboratory findings, prior medication, and preexisting comorbidities were investigated as possible risk factors. Potential predictive variables were: age, body mass index, fever at admission (defined cut-off 38�5˚C), presence of dyspnea at admission, presence of cough at admission, baseline vital parameters (including respiratory rate, oxygen saturation by pulse oximetry (SpO2), systolic and diastolic blood pressure), laboratory findings at admission (leukocyte count, lymphocyte count, d-dimer level, creatinine, C-reactive protein, procalcitonin, activated partial thromboplastin time, potassium level, initial lactate dehydrogenase), coexisting comorbidities (presence of cardiomyopathy, hypertension, diabetes mellitus, vascular disease, dyslipidemia, pulmonary disease, chronic kidney disease, active tumor disease), prior medication (ACE-inhibitor, angiotensin receptor blocker (ARB), statins, antiplatelet therapy, oral anticoagulation, diuretics, proton pump inhibitors, chemo-/immunosuppression and/or targeted therapy).
After statistical analyses, the detected risk factors were depicted in a nomogram (risk score), which is a graphical tool for manually obtaining predictions from the fitted model. Point scores can be obtained for each predictor and the user should add the points manually before reading predicted values on the final axis of the nomogram. With this tool it is possible to calculate the individual risk score for each patient.
Ancillary endpoints were temporal trends for mortality, need for mechanical ventilation (invasive and non-invasive), and morbidity (respiratory insufficiency and pneumonia, as defined by current guidelines [11]) over the observational period. Additional analyses were rates, baseline characteristics, and outcomes of patients receiving palliative care.
Palliative care was defined according to the World Health Organization (WHO) as an approach that improves the quality of life of patients and their families facing the problems associated with life-threatening illness, through the prevention and relief of suffering by means of early identification and impeccable assessment and treatment of pain and other problems, physical, psychosocial, and spiritual [12].

Statistical analysis
Descriptives. Demographic data, symptoms, vital data, laboratory data, cardiac marker, blood gas analysis, comorbidities, and medication, documented at baseline, were summarized descriptively. Continuous data are shown as means +/-standard deviations and medians [25th and 75th percentiles]. Categorical data are presented as proportions and frequencies (data presented in Tables 1-3).

Model development.
The primary endpoint was analyzed in an ordinal fashion, recording the worst of the following three states: death for non-survivors, severe cases as the need for ventilation or ICU admission for survivors, or mild state as discharge from a non-ICU ward. The missing data pattern was analyzed beforehand. Variables with more than 1/3 missing values and systematic data gaps were not taken into account for model building. Further data gaps were filled by multiple imputation using chained equations (MICE algorithm). First, an ordinal logistic regression (continuation ratio) model was applied to associate baseline data with the occurrence of disease severity. Nonlinearities in continuous variables were considered by restricted cubic splines transformations. Three knots at the 10 th , 50 th and 90 th percentile were set. The regression model was fitted without deletion of further predictors using a penalized estimation approach.
As a second step, two models were derived 1) for the prediction of disease severity and nonsurviving and 2) for the prediction of non-surviving conditional on at least severe cases, where no mild cases were taken into account. Each models was approximated from the full model to become simplified and clinically useable. The linear predictor of the logistic regression model was used as the outcome in a linear regression model. A step-down method on a least squares model was applied to simplify the model. While the full OLS model shows an R2 of 1, an approximation could theoretically be applied to any level. The final models were selected to exhibit a high R2 level and a high goodness of fit (AIC, BIC).
Validation of the prediction models. The logistic regression models were validated using B = 200 bootstrap samples. The training sample was used to fit the models and estimate the apparent model fit. The out-of-sample data were used to calculate the test performance of the model. The mean difference between training and test performance (overfitting in the final models) was examined using explained variation R2 and Sommers' D rank-correlation. Validation results are shown in the S2 Appendix.
Model effects. The predictors of each model are shown with odds ratios and 95% confidence intervals and were summarized in forest plots. Continuous predictors are presented as interquartile-range odds ratios (upper quartile: lower quartile) and categorical predictors as simple odds ratios (current category: reference category). Partial effects in the prediction model for the cause of a disease severity and mortality are shown on a log-odds scale. Nomograms were derived from the prediction models to calculate the probability for the patients' states. Event prediction in the nomogram was derived from a total point score that summarizes individual points (0-100) assigned from the value of each predictor.
Further analyses based on the entire dataset. Weekly rates of mortality, need for ventilation, pneumonia, respiratory insufficiency, and mean age are presented using smoothed local regression curves across time. In cases with missing medical history data only age was related to mortality. A one dimensional regression model was used to relate weekly changes in mean age to mortality rates. All analyses except the descriptive analyses of the whole cohort (Tables 1 and 2) and weekly rates of mortality, need for ventilation, pneumonia, respiratory insufficiency, and mean age ( Table 3, Figs 1-5) were based on the Hamburg/Gauting cohort. P-values are two-sided. The significance level is 5%. All analyses were performed with R (R Core Team 2020), (see S3 Appendix for statistical literature).

Temporal variations
With regard to the entire study cohort of 4704 patients, Fig 3 shows the temporal variations of hospital admissions during the year 2020 with a first peak in April, low numbers during summer, and a second peak in November and December 2020.
Mortality varied during the year: rates of 22.4% in April 2020 decrease to 3% in June 2020 with an increase at the end of the year to up to 23.1% (Table 3, Fig 4). The rates of pneumonia, respiratory insufficiency, and the need for mechanical ventilation showed similar trends ( Fig  5). The temporal variations of mortality rates and age were associated (p = 0.029).

Hamburg/Gauting cohort
In the following, we report the data of the comprehensive data set from the Hamburg/Gauting cohort, consisting of 490 patients from seven centers.
Baseline characteristics and baseline symptoms of this cohort are shown in Table 4.

Primary endpoint
Of the included 490 patients, 63% were classified as mild disease, 17% as severe (ICU/ventilation, non-fatal), and 20% as fatal (death), respectively (Fig 2). For the analysis as potential risk

PLOS ONE
Clinical outcome and risk in hospitalized COVID-19 patients    . A nomogram (risk score) derived from the regression model was designed to predict the probability of a severe disease from baseline data (Fig 6A). When comparing common effects in both models the role of age largely increased, whilst other effects (respiratory rate, LDH, CRP, and creatinine) were negligibly smaller (Fig 6C).

Discussion
Major findings of the study showed that more than one third of all hospitalized COVID-19 patients suffered from a fatal or severe disease requiring mechanical ventilation and or admission to intensive care unit. The risk for a severe and/or fatal disease can be calculated by a simple risk score, since age, respiratory rate, lactate dehydrogenase, C-reactive protein, and creatinine at baseline are associated with this outcome. Additionally, mortality rates of hospitalized COVID-19 patients showed temporal variations with lower rates during summer. Age had a big effect on in-hospital mortality in severe diseased patients and also had a strong influence on the temporal changes of mortality rates. SARS-CoV-2 variants of concern did not play a role in this study, as in Germany the first detected case was at the end of December 2020 [3].

Risk factors and risk score
The identified risk factors for all cause death and need for ventilation or ICU treatment in our study are in line with the results of Galloway et al. who also identified respiratory rate, higher CRP, and renal impairment as relevant risk factors among others; however, their study approach was slightly different [9]. The overall strongest predictor in our study was age, especially in ICU-patients. The importance of age in COVID-19 severity was reported previously and confirmed by national health monitoring in Germany, as 85% of all fatal cases in Germany occurred in patients �70 years of age [3].
Interestingly, the examined coexisting comorbidities did not show a strong effect in our study, which should not mistakenly lead to the impression that our findings are controversial to other studies that clearly showed comorbidities to be a predictor for mortality [14]. Our analysis only demonstrates them to be weaker than other factors. Only the pre-existing use of oral anticoagulation was a strong predictor for severe outcome and mortality, which might point to atrial fibrillation as a relevant risk factor.
To further optimize the detection of high-risk COVID-19 patients, we developed a simple risk score (nomogram), which can be used in daily routine. Baseline characteristics such as age, laboratory values, and respiratory rate give a specific point value for each patient adding up to the individual patient's risk. This risk score is based on a comprehensive and highly  Fig 6A (above) shows the interquartile-range odds ratios for continuous predictors (upper quartile: lower quartile) and simple odds ratios for categorical predictors (current category: reference category) with regard to a severe or fatal disease. Fig 6A (below) shows the nomogram (risk score): calculating the probability for a severe or fatal disease. For each predictor, points (0-100) are assigned. The total points are associated to event prediction. E.g. In case of an 80 years old (= 40 points) patient with CRP 150 (= 30 points), respiratory rate 24 (= 30 points), LDH 400 (= 10 points), and creatinine 2 (= 73 points), the total point score is 183 and therefore the risk for ventilation, ICU, or death is elevated with a probability of approximately 0.7. Fig 6B (above) shows the interquartile-range odds ratios for continuous predictors (upper quartile: lower quartile) and simple odds ratios for categorical predictors (current category: reference category) with regard to a fatal disease conditional of patients with at least severe diseases (mild courses are not taken into account). Fig 6B (below) shows the nomogram (risk score): calculating the probability for a fatal course of patients with severe diseases. Fig 6C This figure shows comparing effects of the above mentioned two models (model 1 for prediction of the combined endpoint versus model 2 for prediction of death in mechanical ventilated/ICU patients) by baseline parameters. The interquartile-range odds ratios for continuous predictors (upper quartile: lower quartile) and simple odds ratios for categorical predictors (current category: reference category) with regard to disease severity (Model 1) and mortality (Model 2) are compared. validated, prospective dataset with an integrated approach of data analysis and is therefore of high quality. Hence, we overcome some limitations of other published risk scores with retrospective analysis or smaller sample sizes [15]. In contrast to the approach of Nakakubo et al. who included diagnostic imaging to the risk score [16], we focused on simple baseline characteristics and laboratory values, accessible for general practitioners or clinicians in emergency departments. Especially in the current pandemic situation with further rising infection rates worldwide, it is necessary to detect vulnerable patients fast and easily.

Temporal variations
The COVID-19 disease causes increased morbidity and mortality around the globe. The "CORONA Germany" study was carried out by one of the largest hospital networks in Germany with clinics in nearly all states of the country (as previously shown in Fig 1). During the observational period, we enrolled a large number of COVID-19 patients in 45 hospitals with a mortality rate as high as 19%. Whereas in-hospital mortality in Britain was described as 37% [17], analysis for Germany showed rates ranging from 17% to 22.2% [3,8].
Since the beginning of the pandemic in Germany, SARS-CoV-2 infection rates and numbers of hospitalized COVID-19 patients emerged in different phases, as shown in our study with a first peak in April 2020 and comparatively low hospital admission rates during summer. The onset of a second wave was observed in September 2020 with still rising numbers of COVID-19 hospitalized patients in December 2020 [3]. Interestingly, mortality rates of hospitalized patients in this study also showed temporal variations similar to the seasonal peaks of admission numbers. Whereas in-hospital mortality in April and December 2020 was 22.4% and 23.1%, it dropped to 3% in June, even though all patients had a clinical indication for hospitalization. Furthermore, rates of pneumonia, respiratory insufficiency, and the need for mechanical ventilation showed similar trends, which may indicate to a higher proportion of less severe courses of the disease during summer season. Dennis et al. previously described a decrease of in-hospital mortality in a big cohort of intensive care patients from March until June 2020, which was assumed to be due to introduction of effective treatments and a learning curve regarding clinical management of the novel disease [6]. However, this would not explain the increase of in-hospital mortality from August to December observed in our study.
Regression analyses in our study demonstrate the observed temporal variation in mortality to be strongly influenced by age. In addition to this strong association, other factors might also play a role and we assume seasonality of COVID-19 to be noteworthy. Global SARS-Cov-2 infection rates were presumed to underlie seasonality [5]. Among other factors, this was explained by the fact that UV light is most strongly associated with lower COVID-19 growth [18]. Additionally, potential co-infectious diseases also consist of seasonal variations like e.g. pneumococcal infection [19]. Because the prevalence of co-infection was reported to be up to 50% among non-survivors in COVID-19 [20], this may be an additional factor explaining the seasonal changes of in-hospital mortality over the year. Of course, due to the novelty of SARS--CoV-2, it is too early to prove seasonality; nevertheless, it may be an additional cause for the temporal pattern of in-hospital mortality beside the strong association to age.

Limitations
As we present all patients being hospitalized and tested positive for SARS-CoV2, the study population is heterogeneous and no further inclusion criteria were applied. Patients were at different states of COVID-19 disease by the time of admission and this could bias our results. However, some of our findings have been reported in comparable studies. Additionally, the effect of specific treatments has not been analyzed in this study and therefore it is unclear whether specific treatment options had an influence on the patients' outcome. However, as treatment options for COVID-19 are still the subject of current research, their impact is still mostly unknown. Despite high quality data collection and endpoint review committee, all data are observational and taken of medical patient charts during hospitalization. This has led to missing values in the study cohort. Therefore, their role might be underestimated in predicting the study outcome.

Conclusions
In conclusion, risk factors such as age, respiratory rate, lactate dehydrogenase, C-reactive protein, and creatinine values at baseline are associated with all cause death, need for ventilation, and/or ICU treatment in a nationwide series of COVID 19 hospitalized patients. Especially age plays an important prognostic role for mortality in critically ill patients. A simple risk score was developed to assist general practitioners or clinicians in emergency departments to detect vulnerable patients fast and easily.
Additionally, we observed temporal variations of in-hospital mortality and morbidity during the year 2020. In-hospital mortality of COVID-19 raised again by the end of the year, after a prior dip during the summer months. Again, the temporal course of mortality was influenced by age, highlighting the importance of this consistent risk factor.