Comorbidity patterns associated with severe COVID-19 outcomes: A cohort study based on the UK Biobank

  • Jian Zhang ,

    Contributed equally to this work with: Jian Zhang, Can Hou

    Roles Formal analysis, Investigation, Writing – original draft

    Affiliations Mental Health Center and West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, China, Med-X Center for Informatics, Sichuan University, Chengdu, China

  • Can Hou ,

    Contributed equally to this work with: Jian Zhang, Can Hou

    Roles Methodology, Writing – original draft, Writing – review & editing

    Affiliations Med-X Center for Informatics, Sichuan University, Chengdu, China, West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, China

  • Wenwen Chen,

    Roles Investigation, Visualization

    Affiliations Med-X Center for Informatics, Sichuan University, Chengdu, China, West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, China

  • Yao Hu,

    Roles Conceptualization, Resources, Software

    Affiliations Med-X Center for Informatics, Sichuan University, Chengdu, China, West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, China

  • Shishi Xu,

    Roles Visualization, Writing – review & editing

    Affiliations West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, China, Division of Endocrinology and Metabolism, West China Hospital, Sichuan University, Chengdu, China

  • Haowen Liu,

    Roles Software, Visualization

    Affiliations Med-X Center for Informatics, Sichuan University, Chengdu, China, West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, China

  • Yao Yang,

    Roles Software, Visualization

    Affiliations Med-X Center for Informatics, Sichuan University, Chengdu, China, West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, China

  • Unnur A. Valdimarsdóttir,

    Roles Conceptualization, Writing – review & editing

    Affiliations Center of Public Health Sciences, Faculty of Medicine, University of Iceland, Reykjavík, Iceland, Institute of Environmental Medicine, Karolinska Institute, Stockholm, Sweden, Department of Epidemiology, Harvard T H Chan School of Public Health, Boston, Massachusetts, United States of America

  • Fang Fang,

    Roles Conceptualization, Writing – review & editing

    Affiliation Institute of Environmental Medicine, Karolinska Institute, Stockholm, Sweden

  • Huan Song

    Roles Conceptualization, Project administration

    songhuan@wchscu.cn

    Affiliations Med-X Center for Informatics, Sichuan University, Chengdu, China, West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, China, Center of Public Health Sciences, Faculty of Medicine, University of Iceland, Reykjavík, Iceland

Abstract

Background

Pre-existing comorbidities are linked to increased risk of severe COVID-19, but comprehensive assessments of comorbidity patterns remain limited.

Methods

We used network analysis to identify pre-existing comorbidity modules (i.e., groups of diseases more densely interconnected with each other than with other diseases in the comorbidity network) in a cohort of 420,920 individuals from the UK Biobank who were in England. We defined cases requiring hospitalization or who died of COVID-19 as “severe COVID-19”. Logistic regression was used to examine associations between comorbidity modules and severe COVID-19, and a module-based comorbidity index was developed to predict severe COVID-19, compared with existing indices.

Results

Comorbidity network analysis identified 190 disease pairs with confirmed comorbidity associations, which were further divided into seven comorbidity modules. Among the 30,914 individuals diagnosed with COVID-19, 3,970 were identified as severe cases (median age of 73.6 years, 58.77% being male). Six of seven identified modules showed statistically significant associations with severe COVID-19, especially modules related to circulatory and respiratory diseases (odds ratio = 1.67 [95% confidence interval 1.54–1.81]) and age-related eye diseases (1.39 [1.27–1.52]). Associations did not differ by sex, age or vaccination status but were generally stronger during the first wave of COVID-19 pandemic (i.e., 31st January-1st October, 2020). Our newly developed module-based comorbidity index showed better performance in predicting severe COVID-19 (AUC = 0.779) compared to the existing Charlson Comorbidity Index (0.714) and the 16-comorbidity index (0.714).

Conclusions

Our study demonstrated that pre-existing comorbidity modules, particularly modules related to circulatory and respiratory diseases and age-related eye diseases, were associated with severe COVID-19. Moreover, the module-based comorbidity index provides better prediction of severe COVID-19 than existing prediction indices.

Introduction

As of November 2024, the global coronavirus disease 2019 (COVID-19) pandemic has affected 10% of the world’s population, resulting in approximately 137 million hospitalized cases and nearly seven million deaths [1]. Most people experience mild symptoms and recover within a few weeks, whilst others suffer severe or even life-threatening conditions that require intensive medical interventions and might lead to serious and prolonged complications [2]. Therefore, to reduce the burden on healthcare systems and diminish the impact of such a pandemic on society, it is important to identify individuals at high risk of progressing to severe COVID-19 (i.e., hospitalization or death due to COVID-19) and to offer a basis for implementing early risk prediction.

As the pandemic continues to evolve, considerable research effort has been devoted to characterizing risk factors potentially associated with severe COVID-19. Besides demographic and lifestyle factors such as advanced age, high body mass index (BMI) [3], and current smoking, a significant proportion of severe COVID-19 has been attributed to pre-existing comorbidities [4]. Indeed, previous studies have found that comorbidities such as hypertension [5], diabetes mellitus [6], cardiovascular disease [7], chronic obstructive pulmonary disease [8], and chronic kidney disease [9] are associated with increased risk of hospitalization or death due to COVID-19. However, findings on other major comorbidities such as psychiatric disorders and autoimmune diseases are inconsistent or still lacking [10,11]. Instead of focusing on single diseases, recent efforts have focused on disease networks, partitioning co-occurring diseases into distinct modules with potentially shared underlying biological processes or a shared genetic basis [12,13]. Representing concurrent diseases through networks may provide a more comprehensive understanding of the links between pre-existing comorbidities and severe COVID-19, with possible explanations of the underlying biological mechanisms.

In the present study, leveraging extensive information on healthcare records and COVID-19 status in the UK Biobank, we aimed to identify pre-existing comorbidity modules in a large sample of study participants and elucidate their impact on the occurrence of severe COVID-19. Further, based on the importance of each comorbidity in the corresponding module and the impact of the module on the risk of severe COVID-19, we developed a module-based comorbidity index for predicting severe COVID-19. Together, the findings of the present study have the potential to advance our understanding of mechanisms underlying disease progression of COVID-19 and provide a basis for developing medical management for individuals affected by various COVID-19 variants or other contagious diseases in future pandemics of a similar kind.

Methods

Study design

The present study was based on the UK Biobank, a large community-based cohort that enrolled over 500,000 individuals aged 40−69 years across the UK between 2006 and 2010 [14]. This cohort study is described in detail elsewhere [15]. The UK Biobank has released primary care and COVID-19 test data for ~92% of participants in England to support COVID-19 research. Primary care data come from two major GP systems, and COVID-19 test results (RT-PCR) are sourced from Public Health England’s surveillance system.

Among the 502,507 participants of UK Biobank, we included 445,757 individuals after exclusion of participants who withdrew their informed consent (n = 101, S1 Fig) and those who were registered outside of England (n = 56,649), as complete primary care and COVID-19 test data are not available for participants in Scotland and Wales. We further excluded 24,837 individuals who had died or were lost to follow-up before 31st January 2020 (i.e., the date of the first reported COVID-19 case in the UK), leaving 420,920 participants in the analytical cohort for the identification of comorbidity modules. Among these, 30,914 individuals were identified as COVID-19 cases, according to COVID-19 diagnostic tests, diagnoses in the primary care (codes listed in S1 Table) or inpatient hospital data (International Classification of Diseases 10th edition [ICD-10]: U07·1 or U07·2), and underlying cause of death in the mortality data (ICD-10: U07·1 or U07·2). Follow-up for severe COVID-19 among all cases ran from their date of diagnosis until hospitalization or death within 30 days post-diagnosis or the end of the study (November 30, 2021), whichever occurred first. The incidence rate of COVID-19 was comparable between the analytical cohort and the entire UK as reported by the Office for National Statistics [16] (see S2 Table), suggesting that the majority of COVID-19 cases were captured in this analysis.
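The participant flow described above can be checked with simple arithmetic. The short sketch below only restates the counts given in the text; the variable names are illustrative, and the last quantity is a crude cumulative proportion of identified cases, not the incidence rate reported in S2 Table.

```python
# Participant counts as stated in the Study design text.
total_enrolled = 502_507
withdrew_consent = 101
registered_outside_england = 56_649
died_or_lost_before_index = 24_837  # before 31st January 2020

# Step 1: restrict to consenting participants registered in England.
eligible_in_england = total_enrolled - withdrew_consent - registered_outside_england

# Step 2: drop those who died or were lost to follow-up before the index date.
analytical_cohort = eligible_in_england - died_or_lost_before_index

# Crude cumulative proportion of identified COVID-19 cases (per 1,000 participants).
covid_cases = 30_914
cases_per_1000 = 1000 * covid_cases / analytical_cohort

print(eligible_in_england, analytical_cohort, round(cases_per_1000, 1))
```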

The UK Biobank study has received full ethical approval from the NHS National Research Ethics Service (16/NW/0274), and all participants provided written informed consent before data collection. The present study was approved by the biomedical research ethics committee of West China Hospital (2020.661).

Ascertainment of pre-existing comorbidities.

We used the term ‘comorbidity’ broadly to encompass a spectrum of medical conditions, including both chronic and acute illnesses, that were diagnosed before COVID-19 [17,18]. Since we focused on diseases that are common in the general population with substantial impact on healthcare systems, we restricted our analyses to a total of 20 communicable and 95 non-communicable diseases listed in the Global Burden of Disease (GBD) Study 2019 [19]. All diagnoses (i.e., main and secondary) in the primary care and inpatient hospital data before 31st January 2020 were used for disease ascertainment, according to corresponding ICD-10 codes (see S3 Table). Specifically, as primary care diagnoses are coded using SNOMED-CT and Read V3 codes, these were converted to ICD-10 codes, as used for inpatient hospital data, using the latest release of cross-maps provided by NHS Digital [20,21].

Ascertainment of severe COVID-19.

Severe COVID-19 cases were defined as (i) those admitted to hospital or who died with this diagnosis, or (ii) those admitted to hospital with or without a COVID-19 diagnosis, or who died, within 30 days of a positive COVID-19 diagnostic test or a COVID-19 diagnosis recorded in the primary care data. For subsequent hospital admissions or deaths lacking a COVID-19 diagnosis, inclusion was restricted to those potentially related to COVID-19, excluding underlying causes of death or primary diagnosis codes related to pregnancy, perinatal conditions, symptoms, or signs, as well as external causes of morbidity and mortality (refer to S4 Table).
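As an illustration of the 30-day rule, the following minimal sketch classifies a case from three dates. It is a simplification and not the authors' code: the function name `is_severe` is hypothetical, events on the day of diagnosis are assumed to count, and the exclusion of admissions or deaths unrelated to COVID-19 (S4 Table) is not modeled.

```python
from datetime import date, timedelta
from typing import Optional

# Assumed window: an event counts if it occurs within 30 days after diagnosis.
WINDOW = timedelta(days=30)


def is_severe(diagnosis_date: date,
              admission_date: Optional[date] = None,
              death_date: Optional[date] = None) -> bool:
    """Return True if a hospital admission or death falls within the window.

    Simplified sketch: cause-of-admission and cause-of-death exclusions
    described in the text are not applied here.
    """
    for event in (admission_date, death_date):
        if event is not None and diagnosis_date <= event <= diagnosis_date + WINDOW:
            return True
    return False


# Hypothetical example: admitted 19 days after a positive test.
print(is_severe(date(2020, 4, 1), admission_date=date(2020, 4, 20)))
```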

Covariates.

Sociodemographic and lifestyle factors, including date of birth, sex, annual household income, and smoking and drinking status, were collected using touchscreen questionnaires at recruitment. The Townsend deprivation index (TDI), a widely used measure of population-level deprivation [22], was assigned to each participant based on the postal codes provided at recruitment. Additionally, we calculated body mass index (BMI) using their height and weight measurements. We extracted COVID-19 vaccination information for each participant from primary care data, using a list of predefined SNOMED-CT and Read V3 codes (see S5 Table).

Statistical analysis

Comorbidity network analysis for identification of comorbidity modules.

The comorbidity network analysis was conducted to investigate the diversity of pre-existing disease patterns prior to the diagnosis of COVID-19. We followed previously described steps of comorbidity network analysis to identify comorbidity modules (i.e., groups of diseases more densely interconnected with each other than with other diseases in the comorbidity network) [18]. A detailed description of the analysis steps is available in S1 File. Specifically, disease pairs with sufficient prevalence (i.e., co-occurrence in at least 1% of the study population) and comorbidity strength (measured by Pearson’s correlation and relative risk of a disease pair in the same individual) were pre-selected and subsequently verified using logistic regression, controlling for potential confounders. The selected disease pairs were used to construct a comorbidity network, and comorbidity modules within the network (i.e., clusters of highly interconnected comorbidities) were identified using the Louvain community detection algorithm [23].
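To make the two comorbidity strength measures concrete, the sketch below computes the relative risk and Pearson's correlation (the phi coefficient for two binary variables) for a single disease pair from counts. These are the definitions commonly used in comorbidity network studies and are assumed here; the exact formulas and thresholds applied in this study are given in S1 File. The function name and the example counts are hypothetical.

```python
import math


def comorbidity_strength(n_total: int, n_a: int, n_b: int, n_both: int):
    """Return (relative_risk, phi) for a disease pair A and B.

    n_total: individuals in the population
    n_a, n_b: individuals with disease A and with disease B
    n_both: individuals with both diseases
    """
    # Relative risk: observed co-occurrence over that expected under independence.
    expected_both = n_a * n_b / n_total
    relative_risk = n_both / expected_both

    # Pearson's correlation between the two binary disease indicators.
    phi = (n_both * n_total - n_a * n_b) / math.sqrt(
        n_a * n_b * (n_total - n_a) * (n_total - n_b)
    )
    return relative_risk, phi


# Hypothetical counts: 1,000 people, 100 with A, 200 with B, 40 with both.
rr, phi = comorbidity_strength(1000, 100, 200, 40)
print(round(rr, 2), round(phi, 3))
```

A pair that co-occurs exactly as often as expected under independence has a relative risk of 1 and a correlation of 0, which is why both measures can serve as screening criteria before the regression-based verification.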

Association analyses.

Logistic regression was used to investigate the associations between pre-existing comorbidity modules and risk of severe COVID-19. For each comorbidity module identified, we first calculated the odds ratio (OR) of severe COVID-19 in relation to being diagnosed with any disease in the module, adjusting for age, sex, annual household income, TDI, BMI, smoking status, drinking status, and disease status of other modules (i.e., being diagnosed with any disease in the corresponding module). Additionally, we estimated ORs for being diagnosed with different numbers of comorbidities within a module (i.e., 1, 2, and 3+). Finally, we calculated the OR of severe COVID-19 in relation to being diagnosed with each individual disease of the comorbidity module.
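For readers less familiar with odds ratios, the sketch below computes a crude (unadjusted) OR with a Wald 95% confidence interval from a 2×2 table. This is a swapped-in simplification for illustration only: the study estimated adjusted ORs with multivariable logistic regression, and the function name and counts here are hypothetical.

```python
import math


def crude_odds_ratio(exposed_severe: int, exposed_mild: int,
                     unexposed_severe: int, unexposed_mild: int,
                     z: float = 1.96):
    """Return (odds_ratio, ci_lower, ci_upper) from a 2x2 table.

    'Exposed' means diagnosed with any disease in a given module.
    The interval is a Wald interval computed on the log-odds scale.
    """
    odds_ratio = (exposed_severe * unexposed_mild) / (exposed_mild * unexposed_severe)
    se_log_or = math.sqrt(
        1 / exposed_severe + 1 / exposed_mild
        + 1 / unexposed_severe + 1 / unexposed_mild
    )
    log_or = math.log(odds_ratio)
    return (odds_ratio,
            math.exp(log_or - z * se_log_or),
            math.exp(log_or + z * se_log_or))


# Hypothetical counts, not taken from the study.
or_, lower, upper = crude_odds_ratio(30, 70, 20, 80)
print(round(or_, 2), round(lower, 2), round(upper, 2))
```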

We conducted sub-analyses for males and females separately. To assess the effect of age, we also conducted sub-analyses separately for individuals aged <66 years and those aged ≥66 years (i.e., split at the median age of 66 years). To investigate the effect of vaccination on the association between comorbidity modules and severe COVID-19, we conducted separate analyses for COVID-19 cases with and without vaccination at the time of diagnosis. To further investigate the influence of vaccination status on the observed associations, we stratified vaccinated COVID-19 cases by the time from the last vaccination dose to the COVID-19 diagnosis (i.e., <6 months or ≥6 months). Finally, we repeated the analyses separately for COVID-19 cases diagnosed during the first wave (before 1st October, 2020) and the second wave (after 1st October, 2020) of the COVID-19 pandemic in the UK, to investigate the influence of different viral strains on the results [24,25]. Differences in ORs were assessed by introducing an interaction term to the logistic regression.

Module-based comorbidity index.

Based on the identified comorbidity modules and their association with severe COVID-19, we developed a module-based comorbidity index to predict the risk of severe COVID-19. A detailed description of the analysis steps is available in Supplementary Methods. The index included 51 diseases, with weight of each disease calculated as the product of the disease’s importance in the corresponding module and the OR of the association between the corresponding module and severe COVID-19. The importance of each disease in a module was estimated using a previously proposed importance ranking method for complex networks [26]. We also developed six simplified module-based comorbidity indices by including the top 15, 20, 25, 30, 35, or 40 diseases with the highest weight, among the 51 diseases, to understand if it is possible to reduce the information needed for calculating the comorbidity score.
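The weighting rule can be sketched as follows. The module ORs are those reported in this article; the importance values are placeholders chosen for illustration (the real values come from the network importance ranking in [26] and are listed in S8 Table), and summing the weights of diagnosed diseases to obtain an individual's score is an assumption, since the aggregation step is described in the Supplementary Methods rather than here.

```python
# Module-level ORs for severe COVID-19, as reported in the Results.
MODULE_OR = {
    "circulatory_respiratory": 1.67,
    "age_related_eye": 1.39,
}

# disease -> (module, importance). Importance values are HYPOTHETICAL placeholders.
DISEASE_INFO = {
    "asthma": ("circulatory_respiratory", 2.0),
    "copd": ("circulatory_respiratory", 1.0),
    "cataract": ("age_related_eye", 3.0),
}


def disease_weights(disease_info, module_or):
    """Weight = disease importance within its module x OR of that module."""
    return {
        disease: importance * module_or[module]
        for disease, (module, importance) in disease_info.items()
    }


def comorbidity_score(diagnoses, weights):
    """Assumed aggregation: sum the weights of the diagnosed diseases."""
    return sum(weights.get(disease, 0.0) for disease in set(diagnoses))


weights = disease_weights(DISEASE_INFO, MODULE_OR)
print(round(comorbidity_score(["asthma", "cataract"], weights), 2))
```

Diseases outside the index contribute nothing to the score in this sketch, which mirrors the idea of the simplified indices that keep only the highest-weighted diseases.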

To evaluate the performance of the module-based comorbidity index in predicting severe COVID-19, we compared its performance with two existing indices, the Charlson Comorbidity Index (CCI) [27] and a 16-comorbidity based index [28]. We used both Support Vector Machine (SVM) and Extreme Gradient Boosting (XGBoost) models that incorporated age and sex as additional covariates in this comparison. We randomly divided the dataset into training and test sets in a 9:1 ratio for index development and evaluation, respectively. The area under the receiver operating characteristic curve (AUC) was chosen as the primary evaluation metric, and differences in AUC between indices were compared using the DeLong test [29]. Additionally, we considered the Akaike information criterion (AIC) [30] and Bayesian information criterion (BIC) [31] as secondary evaluation metrics. The detailed workflow is depicted in S2 Fig.
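The primary metric can be illustrated with a small, dependency-free sketch. The AUC equals the probability that a randomly chosen severe case receives a higher predicted score than a randomly chosen non-severe case, with ties counted as one half. The pairwise implementation below is for illustration only (it is quadratic in sample size), the function name and toy data are hypothetical, and the DeLong comparison used in the study is not reproduced.

```python
def auc_from_scores(labels, scores):
    """AUC as the fraction of (severe, non-severe) pairs ranked correctly.

    labels: 1 for severe COVID-19, 0 otherwise
    scores: predicted risk, higher meaning more likely severe
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")

    correctly_ranked = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correctly_ranked += 1.0
            elif p == n:
                correctly_ranked += 0.5  # ties count as half
    return correctly_ranked / (len(positives) * len(negatives))


# Toy example: one of the four severe/non-severe pairs is ranked incorrectly.
print(auc_from_scores([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```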

All analyses were carried out using R (version 3·6·2, R Foundation for Statistical Computing, Vienna, Austria) and Python 3·8 (Python Software Foundation, Delaware, USA), with a two-sided P-value <0·05 considered statistically significant.

Results

We included a total of 420,920 participants as our study population for comorbidity network analysis. The median age at recruitment was 69·30 years, and 54·31% of the participants were females. Among the 115 comorbidities studied, the most prevalent non-communicable diseases included bacterial skin diseases, low back pain, and anxiety disorders, while the most prevalent communicable diseases included upper respiratory infections and lower respiratory infections. A total of 30,914 individuals with COVID-19 were identified in the study population, of which 3,970 were classified as severe COVID-19 (S1 Fig and S6 Table). Compared to mild COVID-19 cases, severe COVID-19 cases were generally older (median age 73·6 vs 64·6, P < .001) and more likely to be male (58·77% vs. 45·18%, P < .001), obese (41·61% vs. 27·75% for BMI ≥ 29·9, P < .001), and current smokers (41·86% vs. 34·22%, P < .001) but less likely to have a higher household income (10·88% vs. 23·49% for household income ≥52,000£, P < .001) and be current drinkers (84·89% vs. 91·50%, P < .001) (Table 1).

Table 1. Baseline characteristics of the study participants with COVID-19.

https://doi.org/10.1371/journal.pone.0329701.t001

Comorbidity modules

The comorbidity network analysis identified 190 disease pairs with confirmed comorbidity associations, which were further divided into seven comorbidity modules (Fig 1). According to the predominant diseases in the module, they were named as age-related eye diseases module, cardiometabolic diseases module, infectious and neuropsychiatric diseases module, circulatory and respiratory diseases module, gastrointestinal diseases module, digestive diseases module, and mental and skin disorders module, respectively.

Fig 1. Comorbidity networks for pre-pandemic diseases. Each node represents a disease, while the size and color of the node indicate the prevalence and category of the corresponding disease, respectively. The width of the link represents the strength of the comorbidity association, measured by odds ratio from logistic regression. The network is partitioned into seven modules using Louvain algorithm, and nodes belonging to the same module are grouped together and separated from other nodes using dashed lines.

https://doi.org/10.1371/journal.pone.0329701.g001

Main analyses

As shown in Fig 2A, six of the seven modules, all except the infectious and neuropsychiatric diseases module, demonstrated statistically significant associations with severe COVID-19. The strongest association was found for the circulatory and respiratory diseases module (OR: 1·67, 95% CI: 1·54–1·81), followed by the age-related eye diseases module (OR: 1·39, 95% CI: 1·27–1·52), digestive diseases module (OR: 1·35, 95% CI: 1·25–1·45), gastrointestinal diseases module (OR: 1·30, 95% CI: 1·20–1·41), cardiometabolic diseases module (OR: 1·27, 95% CI: 1·17–1·40), and mental and skin disorders module (OR: 1·27, 95% CI: 1·17–1·37). The magnitude of the association for each individual disease was generally similar to that of the comorbidity module it belonged to, with the highest ORs observed for decubitus ulcer, atrial fibrillation and flutter, and diabetes mellitus (Fig 3 and S7 Table). Notably, 14 diseases were found to be associated with a reduced risk of severe COVID-19, mainly in the infectious and neuropsychiatric diseases module (Fig 3). Further analyses by numbers of diagnosed comorbidities in each module revealed that out of the seven modules, only two exhibited a distinct dose-response relationship between increasing number of comorbidities in the same module and increasing risk of severe COVID-19 (Fig 2B).

Fig 2. Association between pre-existing comorbidity modules and severe COVID-19.

A, association between individual comorbidity modules and risk of severe COVID-19; B, association between numbers of diagnosed diseases within a module and risk of severe COVID-19. OR and the corresponding 95% CI were derived from logistic regression, adjusted for age, sex, Townsend deprivation index, annual household income, BMI, smoking status, drinking status, and disease status of other modules.

https://doi.org/10.1371/journal.pone.0329701.g002

Fig 3. The association between individual diseases in each comorbidity module and risk of severe COVID-19.

OR and the corresponding 95% CI were derived from the logistic regression, adjusted for age, sex, Townsend deprivation index, household income, BMI, smoking, and alcohol drinking status. The red color indicates positive association (i.e., OR>1) and the green color indicates negative association (i.e., OR<1). The intensity of the color represents the magnitude of the corresponding association. Detailed results are shown in S7 Table.

https://doi.org/10.1371/journal.pone.0329701.g003

The sub-analyses revealed that the results were largely similar for most comorbidity modules between males and females, although females had stronger associations for modules related to gastrointestinal diseases (OR for females vs. males: 1·43 vs. 1·21, P-value for interaction = .048), and mental and skin disorders (1·39 vs. 1·18, P-value = .049; S3 Fig). Similarly, age-stratified analyses indicated generally consistent associations across age groups, except for the circulatory and respiratory diseases module, which showed a stronger association in the older group (OR for ≥66 vs. <66 years: 1·90 vs. 1·43; P-value for interaction = .006; S4 Fig). Although a lower proportion of severe COVID-19 cases was noted among COVID-19 cases that were vaccinated at the time of diagnosis, the observed association was slightly stronger among vaccinated patients than among unvaccinated patients for most comorbidity modules, apart from the circulatory and respiratory diseases module (OR for vaccinated vs. unvaccinated: 1·31 vs. 1·83, P-value = .002) and the cardiometabolic diseases module (OR for vaccinated vs. unvaccinated: 1·11 vs. 1·38, P-value = .191) (S5 Fig). No clear difference in ORs was identified when stratifying by the time from the last vaccine dose to the COVID-19 diagnosis (S6 Fig). Finally, the stratification analyses by the wave of the COVID-19 pandemic revealed stronger associations for almost all the comorbidity modules during the first wave, compared to the second wave. The largest difference was noted for the infectious and neuropsychiatric diseases module (OR for wave 1 vs. wave 2: 1·94 vs. 1·11, P-value < .001), followed by the digestive diseases module (1·74 vs. 1·27, P-value = .002) (S7 Fig).

Module-based comorbidity index

The weights of individual diseases in different comorbidity modules used to calculate the module-based comorbidity index are presented in S8 Table. Cataract in the age-related eye diseases module was assigned the highest weight (5·00), followed by diphtheria in the gastrointestinal diseases module (3·39) and four diseases in the circulatory and respiratory diseases module (i.e., asthma, atrial fibrillation and flutter, ischemic heart disease, and chronic obstructive pulmonary disease, with weights of 3·14, 2·50, 2·48 and 2·25, respectively). The full module-based comorbidity index, which included all 51 diseases, achieved a statistically significantly higher AUC of 0·779 on the XGBoost model, compared to the CCI (0·714) and 16-comorbidity index (0·714), as confirmed by the DeLong test (P < .001) (Fig 4). Furthermore, the module-based comorbidity index also demonstrated favorable performance on the secondary evaluation metrics, as evidenced by its lower AIC and BIC values (S9 Table). Although the predictive performance of the module-based comorbidity index decreased as fewer diseases were included, the simplified index, which included only the top 15 diseases, still achieved satisfactory results (S9 Table). Using the SVM model, we further confirmed the superior predictive performance of the module-based comorbidity index compared to the CCI and 16-comorbidity index (S9 Table).

Fig 4. Receiver-operating characteristic curves for predicting severe COVID-19 using XGBoost model.

AUC, the area under the receiver operating characteristic curve; CCI, Charlson Comorbidity Index. The 16-comorbidity based index, which includes 16 diseases, was reported in previous literature. The full module-based comorbidity index included all 51 diseases, and the simplified module-based comorbidity index included the top 15 diseases.

https://doi.org/10.1371/journal.pone.0329701.g004

Discussion

In this cohort study based on the UK Biobank, we identified six distinct comorbidity modules that were statistically significantly associated with severe COVID-19, with the largest effect sizes observed for the module characterized by circulatory and respiratory diseases, followed by the age-related eye diseases module. We further noted that females with comorbidities in the modules related to gastrointestinal diseases and mental and skin disorders were more susceptible to severe COVID-19. We also found stronger associations between the majority of the comorbidity modules and risk of severe COVID-19 during the first wave than during the second wave of the pandemic, with limited attenuation by vaccination status. Finally, the new module-based comorbidity index outperformed existing indices in predicting severe COVID-19. Overall, our study provides valuable insights into the relationship between pre-existing comorbidity patterns and severe COVID-19 and has important implications for identifying high-risk individuals and delivering personalized medical care during a pandemic like COVID-19.

While no existing studies have examined the relationship between comorbidity modules and risk of severe COVID-19, our finding that six comorbidity modules are associated with the risk of severe COVID-19 aligns with the findings of previous studies investigating individual diseases or groups of diseases, including chronic obstructive pulmonary disease [32], stroke [33], diabetes [34], depression [35], inflammatory bowel disease [36], and cirrhosis [37]. Notably, the circulatory and respiratory diseases module exhibited the strongest association with severe COVID-19, corroborating several prior studies that identified respiratory diseases [38] as well as hypertension and cardiovascular diseases [39] as potent risk factors for severe COVID-19. This finding could be attributed to the high expression of angiotensin-converting enzyme 2 (ACE2) in the lung, heart, and blood vessels [40], with these receptors serving as important binding sites for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) [41]. In addition, individuals with respiratory diseases may have a heightened risk due to their anatomically small airways and impaired pulmonary function [42]. We found that the age-related eye diseases module was also strongly related to severe COVID-19. This finding is supported by two recent studies in which individuals with exudative age-related macular degeneration and individuals on a waiting list for cataract surgery were shown to have a higher risk of developing severe COVID-19 [43,44]. A potential explanation for this finding is that eye diseases may be an indicator of biological aging and chronic inflammation, which are also well-established risk factors for severe COVID-19. We also observed strong associations of the digestive diseases module and the gastrointestinal diseases module with severe COVID-19, consistent with previous studies [43,45,46]. These associations might be linked to medications, including those with anticholinergic effects or those affecting the gastrointestinal system [47]. Conversely, our study found that the infectious and neuropsychiatric diseases module, although common, was not statistically significantly associated with risk of severe COVID-19. This result may be attributable to heightened self-protection awareness [48] and to dysregulated immune responses that may contribute to certain components of the anti–COVID-19 immune response [49] among individuals with these conditions.

Stratification analysis by sex revealed stronger associations for gastrointestinal diseases and mental and skin disorders in females, compared to males. This may be due to the fact that females typically possess stronger innate and adaptive immune responses than males, particularly in the presence of specific diseases (i.e., inflammatory bowel diseases and depression) [50,51]. In the case of COVID-19, this heightened immune response may also elevate the risk of cytokine imbalances, thereby increasing the likelihood of severe COVID-19 in females. Age-stratified analyses showed generally consistent associations across age groups, suggesting these links are not solely attributable to age, although modest differences in effect size were noted for individual diseases, particularly those within the circulatory and respiratory module. In the stratification analysis based on vaccination status, we observed a marginal impact of vaccination. We speculate this might be attributed to the prioritization of early vaccine recipients based on risk factors associated with severe outcomes [52]. Our findings suggest that the presence of comorbidities is equally significant for COVID-19 patients, irrespective of their vaccination status. In the stratification analysis by waves of the pandemic, we observed stronger associations between different comorbidity modules and risk of severe COVID-19 during wave 1. While this disparity might be largely attributed to delayed diagnosis (due to lack of testing for instance) and limited treatment for COVID-19 during wave 1 [25], other factors such as changes in viral strains and improved awareness and preparedness during wave 2 [24,53] could also have contributed.

The proposed module-based comorbidity index achieved superior risk prediction for severe COVID-19 when compared to existing indices like the CCI and the 16-comorbidity index, across all predictive criteria. One of the major advantages of this index is its comprehensive coverage of 51 diseases, which were selected from 115 common diseases through a data-driven approach. This enables a wide range of diseases that may affect patient outcomes to be included in one index for accurate risk prediction. In comparison, the CCI and the 16-comorbidity index have a limited definition of comorbidities and exclude many common conditions [54]. Additionally, unlike existing indices that either treat diseases equally or only consider individual diseases, our index considers both the magnitude of the association between a comorbidity module and severe COVID-19 and the importance of each disease within its module. This rests on the assumption that comorbidities of the same module typically impact health outcomes in a modular fashion, rather than as individual diseases. Consequently, these two advantages make the module-based comorbidity index a more comprehensive and accurate comorbidity index for predicting the risk of severe COVID-19, and potentially severe outcomes of future pandemics.

The present study has several key strengths. First, the comorbidity network analysis approach allowed us to comprehensively assess the association between comorbidity patterns and severe COVID-19. Second, by leveraging enriched data on healthcare records and COVID-19 test results, with full coverage for UK Biobank participants in England, we were able to identify the majority of COVID-19 cases in the study cohort and minimize information bias. Lastly, the detailed sociodemographic and lifestyle factors collected in the UK Biobank enabled us to account for several important confounders in the analysis.

However, our study also has some limitations. First, dynamic changes over the course of the COVID-19 pandemic, such as the evolution of the virus, the use of vaccine boosters, and novel medical interventions, may lead to uncertainty about the implications of our findings. However, a population-based study in the UK found that, after the first vaccine booster, older people, those with high multimorbidity, and those with certain underlying health conditions remained at the highest risk of COVID-19-related hospitalization and death [55]. Therefore, the importance of multimorbidity to COVID-19 severity may persist. Additionally, beyond the current COVID-19 challenges, our study offers a novel approach to capturing the joint influence of comorbidities on a health outcome (i.e., the developed module-based comorbidity index), which could contribute to quicker or more comprehensive risk assessment in future epidemics and help refine preventive and therapeutic approaches by pinpointing high-risk groups.

Other limitations of the current study should also be noted. External validation using datasets outside of the UK Biobank is needed to validate the module-based comorbidity index. Additionally, our definition of severe COVID-19 is open to debate and may have limited the generalizability of our findings to clinical settings (e.g., identifying high-risk COVID-19 patients requiring mechanical ventilation). Furthermore, the comorbidity network analysis method employed in our study has inherent limitations, and it may not fully eliminate the influence of other confounding variables on the comorbidity associations. Despite our efforts to control for environmental confounders, such as sociodemographic and lifestyle risk factors (i.e., BMI, drinking status, smoking status, household income, physical activity, and Townsend deprivation index), the influence of other comorbidities in the network was not explicitly controlled for in the model and may have contributed to the observed complex comorbidity associations. Moreover, the confounding factors were collected at baseline and may not reflect participants' status at the time of COVID-19 diagnosis. Although we adjusted for multiple confounders in the analysis, residual confounding due to unmeasured factors such as comorbidity severity and medication use cannot be ruled out. Lastly, it is important to note that UK Biobank participants are not representative of the entire UK or European population.

Conclusions

Based on a community-based cohort study, we identified six disease clusters that were associated with severe COVID-19, with the circulatory and respiratory diseases module showing the strongest association. Additionally, we observed that the associations of the gastrointestinal diseases module and the mental and skin disorders module with severe COVID-19 were more pronounced in females than in males. Finally, we developed a module-based comorbidity index that outperformed existing indices in predicting severe COVID-19. Our findings provide valuable insight into the relationship between pre-existing comorbidity patterns and severe COVID-19 and have important implications for identifying high-risk individuals and for personalized medical care.

Supporting information

S1 File. Comorbidity network and prediction methods.

https://doi.org/10.1371/journal.pone.0329701.s001

(PDF)

S1 Table. COVID-19 identification in primary care.

https://doi.org/10.1371/journal.pone.0329701.s002

(PDF)

S2 Table. The theoretical and observed infection rate in the study population.

https://doi.org/10.1371/journal.pone.0329701.s003

(PDF)

S3 Table. List of 115 GBD codes used in the current study.

https://doi.org/10.1371/journal.pone.0329701.s004

(PDF)

S4 Table. Diagnostic codes for the identification of severe COVID-19.

https://doi.org/10.1371/journal.pone.0329701.s005

(PDF)

S5 Table. COVID-19 vaccination identification in primary care.

https://doi.org/10.1371/journal.pone.0329701.s006

(PDF)

S6 Table. The ICD-10 codes for the identification of severe COVID-19 case.

https://doi.org/10.1371/journal.pone.0329701.s007

(PDF)

S7 Table. The association between each individual disease and the risk of severe COVID-19.

Adjusted for age, sex, annual household income, TDI, BMI, smoking status, drinking status, and the disease status of other modules.

https://doi.org/10.1371/journal.pone.0329701.s008

(PDF)

S8 Table. The node importance of diseases in seven comorbidity modules.

https://doi.org/10.1371/journal.pone.0329701.s009

(PDF)

S9 Table. Comparison of comorbidity indices in predicting risk of severe COVID-19.

AIC, Akaike Information Criterion; AUC, the area under the receiver operating characteristic curve; BIC, Bayesian Information Criterion; CCI, Charlson Comorbidity Index; SVM, Support Vector Machine; XGBoost, EXtreme Gradient Boosting. The 16-comorbidity-based index, reported in previous literature, includes 16 diseases. The full module-based comorbidity index includes all 51 diseases.

https://doi.org/10.1371/journal.pone.0329701.s010

(PDF)

S1 Fig. Flowchart of the study population selection.

The first COVID-19 case was confirmed on 31 January 2020 in the UK.

https://doi.org/10.1371/journal.pone.0329701.s011

(TIF)

S2 Fig. Flow chart of constructing the predictive model.

AIC, Akaike Information Criterion; AUC, the area under the receiver operating characteristic curve; BIC, Bayesian Information Criterion; CCI, Charlson Comorbidity Index. The 16 comorbidities reported in previous literature are coronary artery disease, heart failure, atrial fibrillation, T1DM, T2DM, hypertension, asthma, COPD, cancer, dementia, depression, anxiety, psychosis, bipolar disorder, cognitive impairment, and stroke.

https://doi.org/10.1371/journal.pone.0329701.s012

(TIF)

S3 Fig. Sex-specific associations between comorbidity modules and severe COVID-19.

ORs (95%CI) were derived from fully adjusted logistic models (adjusted for age, Townsend deprivation index, annual household income, BMI, smoking status, drinking status, and disease status of other modules). The interaction between comorbidity modules and sex was evaluated by including the interaction term in the logistic models. * P < .05.

https://doi.org/10.1371/journal.pone.0329701.s013

(TIF)

S4 Fig. Associations between comorbidity modules and severe COVID-19, stratified by median age.

ORs (95%CI) were derived from fully adjusted logistic models (adjusted for sex, Townsend deprivation index, annual household income, BMI, smoking status, drinking status, and disease status of other modules). The interaction between comorbidity modules and age was evaluated by including the interaction term in the logistic models. * P < .05.

https://doi.org/10.1371/journal.pone.0329701.s014

(TIF)

S5 Fig. Association between comorbidity modules and severe COVID-19, stratified by vaccination status at diagnosis.

ORs (95%CI) were derived from fully adjusted logistic models (adjusted for age, sex, Townsend deprivation index, annual household income, BMI, smoking status, drinking status, and disease status of other modules). The interaction between comorbidity modules and vaccination status was evaluated by including the interaction term in the logistic models. * P < .05.

https://doi.org/10.1371/journal.pone.0329701.s015

(TIF)

S6 Fig. Association between comorbidity modules and severe COVID-19, stratified by time of last vaccination dose at diagnosis.

ORs (95%CI) were derived from fully adjusted logistic models (adjusted for age, sex, Townsend deprivation index, annual household income, BMI, smoking status, drinking status, and disease status of other modules).

https://doi.org/10.1371/journal.pone.0329701.s016

(TIF)

S7 Fig. Association between comorbidity modules and severe COVID-19 stratified by different waves of the pandemic.

ORs (95%CI) were derived from fully adjusted logistic models (adjusted for age, sex, Townsend deprivation index, annual household income, BMI, smoking status, drinking status, and disease status of other modules). The interaction between comorbidity modules and different waves was evaluated by including the interaction term in the logistic models. * P < .05.

https://doi.org/10.1371/journal.pone.0329701.s017

(TIF)

Acknowledgments

This work uses data provided by patients and collected by the NHS as part of their care and support. This research used data assets made available by National Safe Haven as part of the Data and Connectivity National Core Study, led by Health Data Research UK in partnership with the Office for National Statistics. We thank all the sponsors and team members involved in West China Biomedical Big Data Center and Med-X Center for Informatics, Sichuan University.

References

  1. WHO. Coronavirus disease (COVID-19) situation reports. 2022. [cited 2022 Dec]. Available from: https://www.who.int/emergencies/diseases/novel-coronavirus-2019/situation-reports
  2. Ballow M, Haga CL. Why do some people develop serious COVID-19 disease after infection, while others only exhibit mild symptoms? J Allergy Clin Immunol Pract. 2021;9(4):1442–8.
  3. Chen U-I, Xu H, Krause TM, Greenberg R, Dong X, Jiang X. Factors associated with COVID-19 death in the United States: cohort study. JMIR Public Health Surveill. 2022;8(5):e29343. pmid:35377319
  4. Laires PA, Dias S, Gama A, Moniz M, Pedro AR, Soares P, et al. The association between chronic disease and serious COVID-19 outcomes and its influence on risk perception: survey study and database analysis. JMIR Public Health Surveill. 2021;7(1):e22794. pmid:33433397
  5. Rodilla E, Saura A, Jiménez I, Mendizábal A, Pineda-Cantero A, Lorenzo-Hernández E, et al. Association of hypertension with all-cause mortality among hospitalized patients with COVID-19. J Clin Med. 2020;9(10):3136. pmid:32998337
  6. Cao H, Baranova A, Wei X, Wang C, Zhang F. Bidirectional causal associations between type 2 diabetes and COVID-19. J Med Virol. 2023;95(1):e28100. pmid:36029131
  7. CAPACITY-COVID Collaborative Consortium and LEOSS Study Group. Clinical presentation, disease course, and outcome of COVID-19 in hospitalized patients with and without pre-existing cardiac disease: a cohort study across 18 countries. Eur Heart J. 2022;43(11):1104–20. pmid:34734634
  8. Liu H, Chen S, Liu M, Nie H, Lu H. Comorbid chronic diseases are strongly correlated with disease severity among COVID-19 patients: a systematic review and meta-analysis. Aging Dis. 2020;11(3):668–78. pmid:32489711
  9. Li J, Huang DQ, Zou B, Yang H, Hui WZ, Rui F, et al. Epidemiology of COVID-19: a systematic review and meta-analysis of clinical characteristics, risk factors, and outcomes. J Med Virol. 2021;93(3):1449–58. pmid:32790106
  10. Fitzgerald KC, Mecoli CA, Douglas M, Harris S, Aravidis B, Albayda J, et al. Risk factors for infection and health impacts of the coronavirus disease 2019 (COVID-19) pandemic in people with autoimmune diseases. Clin Infect Dis. 2022;74(3):427–36. pmid:33956972
  11. Nemani K, Li C, Olfson M, Blessing EM, Razavian N, Chen J, et al. Association of psychiatric disorders with mortality among patients with COVID-19. JAMA Psychiatry. 2021;78(4):380–6. pmid:33502436
  12. Barabási A-L, Gulbahce N, Loscalzo J. Network medicine: a network-based approach to human disease. Nat Rev Genet. 2011;12(1):56–68. pmid:21164525
  13. Hidalgo CA, Blumm N, Barabási A-L, Christakis NA. A dynamic network approach for the study of human phenotypes. PLoS Comput Biol. 2009;5(4):e1000353. pmid:19360091
  14. Sudlow C, Gallacher J, Allen N, Beral V, Burton P, Danesh J, et al. UK biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med. 2015;12(3):e1001779. pmid:25826379
  15. Bycroft C, Freeman C, Petkova D, Band G, Elliott LT, Sharp K, et al. The UK Biobank resource with deep phenotyping and genomic data. Nature. 2018;562(7726):203–9. pmid:30305743
  16. Burns EM, Rigby E, Mamidanna R, Bottle A, Aylin P, Ziprin P, et al. Systematic review of discharge coding accuracy. J Public Health (Oxf). 2012;34(1):138–48. pmid:21795302
  17. Hou C, Yang H, Qu Y, Chen W, Zeng Y, Hu Y, et al. Health consequences of early-onset compared with late-onset type 2 diabetes mellitus. Precis Clin Med. 2022;5(2):pbac015. pmid:35774110
  18. Hou C, Zeng Y, Chen W, Han X, Yang H, Ying Z, et al. Medical conditions associated with coffee consumption: Disease-trajectory and comorbidity network analyses of a prospective cohort study in UK Biobank. Am J Clin Nutr. 2022;116(3):730–40. pmid:35849013
  19. GBD 2019 Risk Factors Collaborators. Global burden of 87 risk factors in 204 countries and territories, 1990-2019: a systematic analysis for the Global Burden of Disease Study 2019. Lancet. 2020;396(10258):1223–49. pmid:33069327
  20. UK Biobank. Data providers and dates of data. [cited 2021 Apr 14]. Available from: https://biobank.ndph.ox.ac.uk/showcase/exinfo.cgi?src=Data_providers_and_dates
  21. White T, Westgate K, Hollidge S, Venables M, Olivier P, Wareham N, et al. Estimating energy expenditure from wrist and thigh accelerometry in free-living adults: a doubly labelled water study. Int J Obes (Lond). 2019;43(11):2333–42. pmid:30940917
  22. Townsend PP, Beattie A. Health and deprivation: inequality and the North. 1988. pp. 236.
  23. De Meo PF, Fiumara G, Provetti A. Generalized louvain method for community detection in large networks. In: 11th international conference on intelligent systems design and applications. IEEE; 2011. pp. 88–93.
  24. Fokas AS, Kastis GA. SARS-CoV-2: the second wave in Europe. J Med Internet Res. 2021;23(5):e22431. pmid:33939621
  25. Verduri A, Short R, Carter B, Braude P, Vilches-Moraga A, Quinn TJ, et al. Comparison between first and second wave of COVID-19 outbreak in older people: the COPE multicentre European observational cohort study. Eur J Public Health. 2022;32(5):807–12. pmid:35997587
  26. Liu J, Xiong Q, Shi W, Shi X, Wang K. Evaluating the importance of nodes in complex networks. Phys A-Stat Mech Appl. 2016;452:209–19.
  27. Charlson ME, Pompei P, Ales KL, MacKenzie CR. A new method of classifying prognostic comorbidity in longitudinal studies: development and validation. J Chronic Dis. 1987;40(5):373–83. pmid:3558716
  28. Wong KC-Y, Xiang Y, Yin L, So H-C. Uncovering clinical risk factors and predicting severe COVID-19 cases using UK Biobank Data: machine learning approach. JMIR Public Health Surveill. 2021;7(9):e29544. pmid:34591027
  29. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988;44(3):837–45. pmid:3203132
  30. Akaike H. A new look at the statistical model identification. IEEE Transac Autom Control. 1974;19(6):716–23.
  31. Andel J, Perez MG, Negrao AI. Estimating the dimension of a linear-model. Kybernetika. 1981;17:514–25.
  32. Aveyard P, Gao M, Lindson N, Hartmann-Boyce J, Watkinson P, Young D, et al. Association between pre-existing respiratory disease and its treatment, and severe COVID-19: a population cohort study. Lancet Respir Med. 2021;9(8):909–23. pmid:33812494
  33. Shakil SS, Emmons-Bell S, Rutan C, Walchok J, Navi B, Sharma R, et al. Stroke among patients hospitalized with COVID-19: results from the American Heart Association COVID-19 cardiovascular disease registry. Stroke. 2022;53(3):800–7. pmid:34702063
  34. Elliott J, Bodinier B, Whitaker M, Delpierre C, Vermeulen R, Tzoulaki I, et al. COVID-19 mortality in the UK Biobank cohort: revisiting and evaluating risk factors. Eur J Epidemiol. 2021;36(3):299–309. pmid:33587202
  35. Ceban F, Nogo D, Carvalho IP, Lee Y, Nasri F, Xiong J, et al. Association between mood disorders and risk of COVID-19 infection, hospitalization, and death: a systematic review and meta-analysis. JAMA Psychiatry. 2021;78(10):1079–91. pmid:34319365
  36. MacKenna B, Kennedy NA, Mehrkar A, Rowan A, Galloway J, Matthewman J, et al. Risk of severe COVID-19 outcomes associated with immune-mediated inflammatory diseases and immune-modifying therapies: a nationwide cohort study in the OpenSAFELY platform. Lancet Rheumatol. 2022;4(7):e490–506. pmid:35698725
  37. Efe C, Dhanasekaran R, Lammert C, Ebik B, Higuera-de la Tijera F, Aloman C, et al. Outcome of COVID-19 in patients with autoimmune hepatitis: an international multicenter study. Hepatology. 2021;73(6):2099–109. pmid:33713486
  38. Zhou Y, Yang Q, Chi J, Dong B, Lv W, Shen L, et al. Comorbidities and the risk of severe or fatal outcomes associated with coronavirus disease 2019: a systematic review and meta-analysis. Int J Infect Dis. 2020;99:47–56. pmid:32721533
  39. Cheng S, Zhao Y, Wang F, Chen Y, Kaminga AC, Xu H. Comorbidities’ potential impacts on severe and non-severe patients with COVID-19: a systematic review and meta-analysis. Medicine (Baltimore). 2021;100(12):e24971. pmid:33761654
  40. Delorey TM, Ziegler CGK, Heimberg G, Normand R, Yang Y, Segerstolpe Å, et al. COVID-19 tissue atlases reveal SARS-CoV-2 pathology and cellular targets. Nature. 2021;595(7865):107–13. pmid:33915569
  41. Rodrigues R, Costa de Oliveira S. The impact of angiotensin-converting enzyme 2 (ACE2) expression levels in patients with comorbidities on COVID-19 severity: a comprehensive review. Microorganisms. 2021;9(8):1692. pmid:34442770
  42. He Y, Xie M, Zhao J, Liu X. Clinical characteristics and outcomes of patients with severe COVID-19 and chronic obstructive pulmonary disease (COPD). Med Sci Monit. 2020;26:e927212. pmid:32883943
  43. Stuart M, Mooney C, Hrabovsky M, Silvestri G, Stewart S. Surgical planning during a pandemic: Identifying patients at high risk of severe disease or death due to COVID-19 in a cohort of patients on a cataract surgery waiting list. Ulster Med J. 2022;91(1):19–25. pmid:35169334
  44. Yang JM, Moon SY, Lee JY, Agalliu D, Yon DK, Lee SW. COVID-19 morbidity and severity in patients with age-related macular degeneration: a Korean Nationwide Cohort Study. Am J Ophthalmol. 2022;239:159–69. pmid:34102151
  45. Fang X, Li S, Yu H, Wang P, Zhang Y, Chen Z, et al. Epidemiological, comorbidity factors with severity and prognosis of COVID-19: a systematic review and meta-analysis. Aging (Albany NY). 2020;12(13):12493–503. pmid:32658868
  46. Jager MJ, Seddon JM. Eye diseases direct interest to complement pathway and macrophages as regulators of inflammation in COVID-19. Asia Pac J Ophthalmol (Phila). 2020;10(1):114–20. pmid:33290288
  47. McKeigue PM, McAllister DA, Caldwell D, Gribben C, Bishop J, McGurnaghan S, et al. Relation of severe COVID-19 in Scotland to transmission-related factors and risk conditions eligible for shielding support: REACT-SCOT case-control study. BMC Med. 2021;19(1):149. pmid:34158021
  48. Kuroda N. Epilepsy and COVID-19: updated evidence and narrative review. Epilepsy Behav. 2021;116:107785. pmid:33515934
  49. Patrick MT, Zhang H, Wasikowski R, Prens EP, Weidinger S, Gudjonsson JE, et al. Associations between COVID-19 and skin conditions identified through epidemiology and genomic studies. J Allergy Clin Immunol. 2021;147(3):857–869.e7. pmid:33485957
  50. Takahashi T, Ellingson MK, Wong P, Israelow B, Lucas C, Klein J, et al. Sex differences in immune responses that underlie COVID-19 disease outcomes. Nature. 2020;588(7837):315–20. pmid:32846427
  51. Takahashi T, Iwasaki A. Sex differences in immune responses. Science. 2021;371(6527):347–8. pmid:33479140
  52. McIntyre PB, Aggarwal R, Jani I, Jawad J, Kochhar S, MacDonald N, et al. COVID-19 vaccine strategies must focus on severe disease and global equity. Lancet. 2022;399(10322):406–10. pmid:34922639
  53. COVID-19 National Preparedness Collaborators. Pandemic preparedness and COVID-19: an exploratory analysis of infection and fatality rates, and contextual factors associated with preparedness in 177 countries, from Jan 1, 2020, to Sept 30, 2021. Lancet. 2022;399(10334):1489–512. pmid:35120592
  54. Renson A, Bjurlin MA. The Charlson index is insufficient to control for comorbidities in a national trauma registry. J Surg Res. 2019;236:319–25. pmid:30694772
  55. Agrawal U, Bedston S, McCowan C, Oke J, Patterson L, Robertson C, et al. Severe COVID-19 outcomes after full vaccination of primary schedule and initial boosters: pooled analysis of national prospective cohort studies of 30 million individuals in England, Northern Ireland, Scotland, and Wales. Lancet. 2022;400(10360):1305–20. pmid:36244382