Population risk factors for severe disease and mortality in COVID-19: A global systematic review and meta-analysis

Aim COVID-19 clinical presentation is heterogeneous, ranging from asymptomatic to severe cases. While there are a number of early publications relating to risk factors for COVID-19 infection, low sample size and heterogeneity in study design impacted consolidation of early findings. There is a pressing need to identify the factors which predispose patients to severe cases of COVID-19. For rapid and widespread risk stratification, these factors should be easily obtainable, inexpensive, and avoid invasive clinical procedures. The aim of our study is to fill this knowledge gap by systematically mapping all the available evidence on the association of various clinical, demographic, and lifestyle variables with the risk of specific adverse outcomes in patients with COVID-19. Methods The systematic review was conducted using standardized methodology, searching two electronic databases (PubMed and SCOPUS) for relevant literature published between 1st January 2020 and 9th July 2020. Included studies reported characteristics of patients with COVID-19 while reporting outcomes relating to disease severity. In the case of sufficient comparable data, meta-analyses were conducted to estimate risk of each variable. Results Seventy-six studies were identified, with a total of 17,860,001 patients across 14 countries. The studies were highly heterogeneous in terms of the sample under study, outcomes, and risk measures reported. A large number of risk factors were presented for COVID-19. Commonly reported variables for adverse outcome from COVID-19 comprised patient characteristics, including age >75 (OR: 2.65, 95% CI: 1.81–3.90), male sex (OR: 2.05, 95% CI: 1.39–3.04) and severe obesity (OR: 2.57, 95% CI: 1.31–5.05). Active cancer (OR: 1.46, 95% CI: 1.04–2.04) was associated with increased risk of severe outcome. A number of common symptoms and vital measures (respiratory rate and SpO2) also suggested elevated risk profiles. 
Conclusions Based on the findings of this study, a range of easily assessed parameters are valuable to predict elevated risk of severe illness and mortality as a result of COVID-19, including patient characteristics and detailed comorbidities, alongside the novel inclusion of real-time symptoms and vital measurements.

Introduction SARS-CoV-2, first reported to the WHO on 31 December 2019, has since spread exponentially, with cases now officially reported in 215 countries and territories [1]. Following infection, individuals may develop COVID-19, an influenza-like illness that primarily targets the respiratory system. The clinical pathophysiology of COVID-19 is still the subject of ongoing research. It is clear, however, that clinical presentation is heterogeneous, ranging from asymptomatic to severe disease. Common clinical features include major symptoms such as fever, cough, and dyspnoea [2], and minor symptoms such as altered sense of smell and taste [3,4], gastrointestinal symptoms [5], and cutaneous manifestations [6]. Evidence suggests most patients move through two phases: (a) viral replication over several days with relatively mild symptoms; (b) an adaptive immune response stage, which may cause sudden clinical deterioration [...] old) patients with laboratory-confirmed SARS-CoV-2 were selected. The minimum sample size for inclusion was 100 patients. Narrative reviews, case reports, papers only reporting laboratory or imaging data, and papers not reporting original data were not included. Studies including homogeneous populations with exclusion criteria (e.g. female patients pregnant at the time the study was conducted) were also excluded.

Information sources and search strategy
A systematic review using PubMed and SCOPUS was conducted. Additionally, a thorough hand search of the literature and a review of the references of included papers were carried out to minimize the likelihood that the search terms missed relevant papers. The following search terms were included: ncov* OR coronavirus OR "SARS-CoV-2" OR "covid-19" OR covid, AND ventilator OR ICU OR "intensive care" OR mortality OR prognosis OR ARDS OR severity OR prognosis OR hospitalis* OR hospitaliz* OR "respiratory failure" OR intubation OR ventilation OR admission* OR admitted OR "critical care" OR "critical cases", AND clinical OR symptom* OR characteristic* OR comorbidit* OR co morbidit* OR risk OR predict* and "PUBYEAR > 2019". Comprehensive search terms can be found in supplementary material (S1 Table).
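The Boolean structure of the search can be sketched as three OR-groups joined by AND. The Python below is illustrative only: the names `or_group`, `build_query`, `DISEASE`, `OUTCOME`, and `PREDICTOR` are hypothetical, truncation wildcards are written as `*`, and the exact database-specific syntax is the one given in S1 Table, not this sketch.

```python
# Term lists copied from the search strategy described in the text.
# Multi-word phrases are stored already quoted.
DISEASE = ["ncov*", "coronavirus", '"SARS-CoV-2"', '"covid-19"', "covid"]
OUTCOME = [
    "ventilator", "ICU", '"intensive care"', "mortality", "prognosis", "ARDS",
    "severity", "prognosis", "hospitalis*", "hospitaliz*",
    '"respiratory failure"', "intubation", "ventilation", "admission*",
    "admitted", '"critical care"', '"critical cases"',
]
PREDICTOR = [
    "clinical", "symptom*", "characteristic*", "comorbidit*",
    "co morbidit*", "risk", "predict*",
]


def or_group(terms):
    """Join terms with OR inside parentheses, dropping repeated terms."""
    unique = list(dict.fromkeys(terms))  # keeps first occurrence, preserves order
    return "(" + " OR ".join(unique) + ")"


def build_query(*groups):
    """Combine several OR-groups with AND."""
    return " AND ".join(or_group(group) for group in groups)


QUERY = build_query(DISEASE, OUTCOME, PREDICTOR)
# The publication-year restriction quoted in the text, appended as-is.
QUERY_WITH_YEAR = QUERY + " AND PUBYEAR > 2019"

if __name__ == "__main__":
    print(QUERY_WITH_YEAR)
```

Removing the repeated term ("prognosis" appears twice in the listed string) does not change what a Boolean OR matches. It only makes the query easier to audit.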

Study selection
Two authors (A.B.R. and A.B.) independently reviewed titles and abstracts to ascertain that all included articles were in line with the inclusion criteria (Fig 1). Studies with missing, unclear, duplicated, or incomplete data were excluded from the review. Observational studies including original data on at least 100 adult patients with laboratory-confirmed SARS-CoV-2, whether hospitalised or in outpatient settings, were included in the meta-analysis.

Data collection process and data items
The following information was extracted from each selected article: author, publication year, article title, location of study, SARS-CoV-2 case identification, study type (e.g. primary research, review), peer-review status, quality assessment, and total sample size. Extracted data included sample demographics (age, sex, ethnicity), obesity/BMI status, smoking status, blood type, any existing comorbidities, symptoms, basic clinical variables (e.g. heart rate, respiration rate, and oxygen saturation), and clinical outcomes, classed as severe (severe case definition, admission to ICU, invasive mechanical ventilation (IMV), or death) versus a non-severe comparator event (e.g. no ICU admission, survival/recovery). Data extraction was carried out using software specifically developed for systematic reviews (Covidence, Veritas Health Innovation, Melbourne, Australia).

Assessment of methodological quality and risk of bias
An adapted version of the Newcastle-Ottawa Scale [21] was used during full-text screening to assess the methodological quality of each article. Two authors (A.B.R. and A.B.) reviewed the quality of included studies, with conflicts resolved by consensus. Studies were judged on three criteria: selection of participants; comparability of groups; and ascertainment of the exposure and outcome of interest. [...] pooled extracted values. Where possible, a meta-analysis was carried out to assess the strength of association between reported risk factors and two outcomes: severe disease and mortality. Severe outcome was defined as the clinical definition of severe, ICU admission, or IMV, while excluding hospitalisation. If a study reported multiple outcomes, then the clinical definition of severe [22,23] was taken to avoid duplication of data. Reported multivariate odds ratios (ORs) were pooled in a meta-analysis, with the estimated effect size calculated using a random-effects model.
To accommodate heterogeneity across the studies, we estimated risk weighting for each reported variable across two endpoints: severe COVID-19 (comprising severe case definition, ICU admission, and IMV) and mortality from COVID-19. If at least two studies reported ORs (multivariate or univariate) for the same clinical variable, pooled weighted estimates were calculated on the basis of sample size and standard error. If only a single study reported the finding, the point estimate from that study was listed. Data were analysed using the R statistical software [24]. The meta-analysis and plots were created using the R package meta [25].
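For readers unfamiliar with random-effects pooling of odds ratios, the following Python sketch shows one common approach. It is not the authors' code: the analysis was run in R with the meta package, and the text does not state which between-study variance estimator was used. The DerSimonian-Laird estimator, the inverse-variance weighting, and the function names `log_or_and_se` and `pool_random_effects` are all assumptions made for illustration.

```python
import math

# Standard normal quantile for a two-sided 95% interval.
Z_95 = 1.959963984540054


def log_or_and_se(or_value, ci_low, ci_high):
    """Convert a reported OR with its 95% CI into a log-OR and standard error."""
    log_or = math.log(or_value)
    # On the log scale the CI is symmetric: width = 2 * z * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    return log_or, se


def pool_random_effects(studies):
    """Pool (OR, ci_low, ci_high) tuples with DerSimonian-Laird random effects.

    Returns (pooled_or, ci_low, ci_high, tau2), where tau2 is the estimated
    between-study variance on the log-OR scale.
    """
    if len(studies) < 2:
        raise ValueError("at least two studies are needed to pool estimates")

    effects = [log_or_and_se(*study) for study in studies]
    y = [log_or for log_or, _ in effects]
    v = [se ** 2 for _, se in effects]

    # Fixed-effect (inverse-variance) step, needed to compute Cochran's Q.
    w = [1.0 / vi for vi in v]
    sum_w = sum(w)
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum_w
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))

    # DerSimonian-Laird moment estimator, truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (len(y) - 1)) / c)

    # Random-effects step: add tau2 to every within-study variance.
    w_re = [1.0 / (vi + tau2) for vi in v]
    sum_w_re = sum(w_re)
    y_re = sum(wi * yi for wi, yi in zip(w_re, y)) / sum_w_re
    se_re = math.sqrt(1.0 / sum_w_re)

    return (
        math.exp(y_re),
        math.exp(y_re - Z_95 * se_re),
        math.exp(y_re + Z_95 * se_re),
        tau2,
    )


if __name__ == "__main__":
    # Hypothetical inputs for illustration only, not values from the review.
    example = [(2.0, 1.2, 3.3), (1.5, 0.9, 2.5), (3.1, 1.4, 6.9)]
    print(pool_random_effects(example))
```

When the studies disagree by more than their sampling error would explain, `tau2` becomes positive. This widens the pooled interval and evens out the weights, which is the behaviour wanted when studies are as heterogeneous as those described here.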

Results
The comprehensive search of databases and cross-referencing hand search identified 2122 articles meeting the search criteria, following removal of duplicates. During screening of title and abstract, 1991 articles were excluded. Consequently, 131 articles were selected for full-text review. Of these, 76 articles were deemed to meet the inclusion/exclusion criteria. Articles were excluded for the following primary reasons: repeated data (n = 19); wrong design/outcome of interest (n = 18); insufficient sample size (n = 7); homogeneous population (n = 6); and editorial or commentary (n = 5) (full reasoning is noted in Fig 1). A summary of all included studies' characteristics and quality assessment is given in Table 1. Inter-rater reliability of article inclusion was substantial (κ = 0.74).
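The κ statistic reported above is presumably Cohen's kappa, since two reviewers screened the articles, although the text does not say so explicitly. As a minimal sketch under that assumption, the function below computes kappa from a 2x2 table of include/exclude decisions. The function name and its arguments are hypothetical, and the underlying counts for this review are not reported.

```python
def cohens_kappa(both_include, only_a, only_b, both_exclude):
    """Cohen's kappa for two raters making include/exclude decisions.

    both_include: articles both raters included
    only_a:       articles only rater A included
    only_b:       articles only rater B included
    both_exclude: articles both raters excluded
    """
    n = both_include + only_a + only_b + both_exclude
    if n == 0:
        raise ValueError("no rated articles")

    # Proportion of articles on which the raters agreed.
    p_observed = (both_include + both_exclude) / n

    # Agreement expected by chance, from each rater's marginal inclusion rate.
    a_include = (both_include + only_a) / n
    b_include = (both_include + only_b) / n
    p_expected = a_include * b_include + (1 - a_include) * (1 - b_include)

    if p_expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_observed - p_expected) / (1 - p_expected)
```

Kappa corrects raw agreement for chance. This matters in screening, where most articles are excluded and two reviewers would agree often even if they decided at random.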
Reported outcomes across studies varied and were categorised into six grouped endpoints: severe, hospitalisation, ICU admission, IMV, composite endpoint (considered as ICU, IMV, or mortality), and mortality.
Due to the heterogeneity of studies and insufficient comparable data, it was not possible to conduct meta-regression on all reported variables, including symptoms and vital sign measurements. As such, pooled weighted estimates were extracted where possible ([...] 15.55), and chills (OR: 6.32). Fever showed low estimated risk for both severity and mortality (OR: 1.06 and OR: 0.69, respectively). There was insufficient comparable data to estimate risk for cough as an independent factor; however, pooling univariate analyses also found low estimated risk for both severity and mortality (OR: 1.01 and OR: 1.08, respectively). There was limited evidence on loss of smell as a risk factor for severe outcomes [93]. Respiratory rate ≥24 breaths/min was reported as a risk in five studies [28,49,53,70,77]. However, it was not possible to combine data and provide estimates for risk due to heterogeneous outcomes and risk measures reported, with a wide range in the effect estimates [56]. Univariate analysis also showed increased risk of severe outcome with SpO2 on admission to hospital <90% (OR: 3.83, 95% CI: 1.05-14.01) [99] and <93% (OR: 13.12, 95% CI: 7.11-24.24) [28].

Quality assessment
Methodological structure and reporting of studies varied in quality. Quality scores were evaluated using an adapted version of the NOS [21], with an average quality score of 8.4 (SD = 1.7), ranging between 4 and 10 (scale out of 10) (Table 1). All studies reported data collection from health records. Subject inclusion in reported literature was widely reported as hospital admission with positive RT-PCR (reverse transcription polymerase chain reaction) test and, therefore, most studies show bias towards inclusion of hospitalised, thus more severe, patients. Few studies reported handling of missing data and bias reporting in findings.

Publication bias
Given the high volume of published literature, we did not include publications in grey literature such as medRxiv and bioRxiv. As inclusion was limited to studies published in English, language bias is likely. Due to high heterogeneity and spread of data, we estimated risk of bias based on the most commonly reported variable: male sex (Fig 4). The funnel plot showed a somewhat asymmetrical distribution, which may be explained by the small number of studies and the correspondingly high probability that deviations in funnel shape occur by chance. Given the high heterogeneity (Table 1) and spread of study quality scores, study heterogeneity may also be a significant contributing factor.
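The funnel plot here was assessed visually. One common way to quantify funnel asymmetry, which this review does not report and which is shown only for illustration, is Egger's regression: each study's standardised effect (log-OR divided by its standard error) is regressed on its precision (one over the standard error), and an intercept far from zero suggests small-study effects. The function name `egger_regression` and its return format are hypothetical.

```python
import math


def egger_regression(log_effects, standard_errors):
    """Egger's regression for funnel-plot asymmetry.

    Fits standardised effect = intercept + slope * precision by ordinary
    least squares and returns a dict with the intercept, slope, the standard
    error of the intercept, its t statistic, and the degrees of freedom.
    """
    n = len(log_effects)
    if n != len(standard_errors):
        raise ValueError("inputs must have the same length")
    if n < 3:
        raise ValueError("at least three studies are needed")

    x = [1.0 / se for se in standard_errors]                      # precision
    z = [y / se for y, se in zip(log_effects, standard_errors)]   # standardised effect

    x_mean = sum(x) / n
    z_mean = sum(z) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all studies have the same standard error")

    slope = sum((xi - x_mean) * (zi - z_mean) for xi, zi in zip(x, z)) / sxx
    intercept = z_mean - slope * x_mean

    # Residual variance with n - 2 degrees of freedom.
    residuals = [zi - (intercept + slope * xi) for xi, zi in zip(x, z)]
    s2 = sum(r ** 2 for r in residuals) / (n - 2)
    se_intercept = math.sqrt(s2 * (1.0 / n + x_mean ** 2 / sxx))

    # t is left as None if the fit is exact and no residual variance remains.
    t_stat = intercept / se_intercept if se_intercept > 0 else None
    return {
        "intercept": intercept,
        "slope": slope,
        "se_intercept": se_intercept,
        "t": t_stat,
        "df": n - 2,
    }
```

The t statistic would be compared against a t distribution with `df` degrees of freedom. With few studies the test has low power, so a non-significant result does not rule out asymmetry, which matches the caution expressed above.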

Discussion
The findings of this systematic review and meta-analysis add to the growing body of evidence supporting the hypothesis that many patient characteristics, comorbidities, symptoms, and vital signs parameters relate to increased risk of a severe outcome or death due to COVID-19.
Presented results align well with recent systematic reviews investigating risk factors in COVID-19, highlighting that age, sex, obesity, and multiple comorbidities increase the risk of adverse outcomes [38,66,103-105]. This study, however, goes further than previously available literature by mapping a wider variety of risk variables, including symptoms and vital signs. Prior literature has made it clear that certain individuals are at higher risk than others. Hence, there has been a concerted effort to profile these high-risk individuals, which has resulted in the development of a variety of diagnostic and prognostic models for COVID-19, with many reporting moderate to excellent discrimination [41,90]. Early models, however, should be interpreted with caution because of the high risk of bias due to overfitting, lack of external validation, low representativeness of targeted populations, and subjective/proxy outcomes in criteria for hospitalisation and treatment [105,106]. Their performance estimates may be misleading and, potentially, even harmful [105]. Efforts for future development of risk profiling should follow standardised approaches such as the TRIPOD (Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis) reporting guideline [107]. The identified risk factors align with current understanding of clinical pathophysiology for severe COVID-19. There are several theories as to why age is a significant risk factor for severe COVID-19. These include the role of comorbidities, as well as decreased efficiency of the immune system related to normal ageing [108]. Male sex as a risk factor for severe disease is thought to result from a combination of the effect of health behaviours, sex hormone-mediated immune responses, and differential expression of ACE2 between sexes [109].
Obesity is a risk factor for development of comorbidities such as hypertension, cardiovascular disease, and diabetes. However, there may be further involvement of obesity through metabolic consequences, which include increased circulating cytokine levels [110].
One study included in this review stands out due to its scale, investigating the primary care records of over 17 million UK citizens [89]. Using a database of overwhelmingly unexposed individuals, the study can be differentiated from ours in that the risk associated with each variable confounds propensity for infection with the relative likelihood of death once infected. The resulting net risk weighting makes it unclear which of these two discrete probabilities is being affected by each variable. The limitation of this approach is best seen with smoking status, where the combined approach outputs a protective weighting, potentially due to the reported reduced infection risk conferred by active smoking, in contrast with our analysis, which suggests increased prognostic risk (0.91 vs 1.21) [111]. Moreover, as the increased mortality risk of comorbidities was public knowledge before the first wave in the UK, it could be assumed that this demographic behaved more cautiously, resulting in the risk weightings being underestimated in the combined approach. Weightings for hypertension (HR: 0.88, 95% CI: 0.84-0.92 vs OR: 1.09, 95% CI: 0.86-1.37) and non-haematological cancer (using OpenSAFELY's highest risk group, diagnosed <1 year ago (HR: 1.68, 95% CI: 1.46-1.94), vs our any-timeframe group (OR: 2.15, 95% CI: 1.41-3.28)) seem to conform to this expectation. Both approaches, however, are uniquely useful in their application and, nevertheless, are largely in alignment in their outputs. Combining the discrete risks presents the foundation for the development of a risk model which can aid with the strategic planning required for health systems and the allocation of their resources. Our approach presents the foundation for a prognostic model which could support healthcare triage and be used on an individual level for comprehension of personal risk should one get infected.

Limitations
While our study presents pooled findings across 14 geographies and may be considered broadly representative of the pandemic, a number of limitations should be highlighted. The primary limitation is the high heterogeneity of the included studies. Notation of patients' highest level of care may be complex to interpret because such an endpoint is dependent on local policy and resources, which have been evolving in strategy and capacity since the onset of the pandemic. Thus, a recommendation of our study is the development of standardised protocols for reporting of COVID-19 case series and retrospective analyses. The non-severe or comparator group is often poorly defined, which is likely to result in sample selection bias towards more severe cases. Recent evidence from nationwide blanket testing suggests that 86.1% of individuals who tested positive for COVID-19 had none of the three main indicative symptoms of the illness: cough, fever, or a loss of taste or smell [112]. In the majority of papers presented within this analysis, the individuals were already admitted to hospital, hence there is a strong selection bias towards those more severely affected and, as such, our results may underestimate the degree of risk. To facilitate rapid and widespread implementation of risk stratification, this investigation focused on risk factors that were easily obtainable. As such, we did not consider haematological risk factors within our review. These factors are known to be significant and may be valuable to include as part of risk stratification upon admission to hospital [113].
Confounding factors are highly likely in reported literature and, therefore, multivariate analysis is essential to determine causal risk factors. One such example of this is ethnicity. In our analysis of results, we chose to exclude estimates for risk relating to ethnicity and race due to the complex association of socio-economic factors and comorbidities which may be entangled with ethnicity. In early reports from the UK, there was significant disparity in outcomes for BAME (Black, Asian, and Minority Ethnic) communities [114]. However, in more recent analysis, it was found that the great majority of the increased risk of infection and death from COVID-19 among people from ethnic minorities can be explained by factors such as occupation, postcode, living situation, and pre-existing health conditions [115].
A further limitation of our study is the method used to pool risk estimates. We aimed to maximise the data collected by pooling all available estimates for risk of an associated variable. This method is flawed in that these outcomes are not directly comparable in a rigorous meta-analysis. Thus, caution is advised in interpretation of absolute risk for each variable of interest.

Implications for future practice
A key finding of the global analysis is the difficulty in combining data reported in the literature. Healthcare systems and researchers are, at present, not providing standardised recording and reporting of health data and outcomes. This heterogeneity in reporting limits the efficacy and impact of broad meta-analysis, as highlighted by the spread of data (Fig 4). The use of standard case report forms, such as those outlined by the WHO, may support this endeavour [116]. At a global level, if such data, anonymised and aggregated at patient level, is made more widely available, this could support the development of robust data-driven risk prediction models [117,118].
At regional and provider levels, evidence-based risk stratification could help plan resources and identify trends that predict areas with increased demand. Hospital admission of severe COVID-19 cases can be expected up to two weeks following onset of symptoms [19,119]. Hence, if risk stratification can be carried out in real-time and incorporate dynamic factors, including symptoms and vital signs, resources such as increased ICU capacity can be allocated strategically. Furthermore, through implementation of remote patient monitoring, patients can remain at home on a 'virtual ward' while under clinical observation. Early signs of clinical deterioration can be managed and, as a result, reduce hospital burden [120].
At the patient level, based on the findings of this study, it is recommended that individuals undergo comprehensive screening for risk factors including patient characteristics, detailed comorbidities, and reporting of real-time symptoms and vital sign measurements as part of a COVID-19 risk assessment. While some of the variables identified in this review are well-known risk factors within the clinical or research domain, it is essential that this information is disseminated to the general public in an easily consumable format with supporting evidence and information. The pandemic has brought about significant social and economic disruption. Due to the lack of a prior evidence base, current guidelines for individual risk management are blunt, broad generalisations. These may be sensitive to the majority of at-risk individuals, but simultaneously have low specificity, erroneously profiling large sections of the population. Thus, the concern is that many may lose confidence in these measures, including those correctly labelled 'at-risk'. Providing individual patients with a comprehensive and individualised risk profile may empower individuals and increase engagement with public health messaging. This may facilitate efforts by national governments to encourage behaviour modification at a population level, in a manner which reduces the spread of the virus, thereby limiting socioeconomic impact.

Conclusion
The findings of this paper highlight the range of factors associated with adverse outcomes in COVID-19, across severe disease, ICU admission, IMV, and death. The determination of critical risk factors may support risk stratification of individuals at multiple levels, from government policy, to clinical profiling at hospital admission, to individual behaviour change. This would enable both a more streamlined allocation of resources and provision of support to individuals who require them most. Future studies aimed at developing and validating robust prognostic models should look to follow a standardised approach to allow for comparability and sharing of knowledge. In this respect, a continuation of open data sharing is essential to facilitate improvement of these models.