Abstract
Background
Pre-existing comorbidities are linked to increased risk of severe COVID-19, but comprehensive assessments of comorbidity patterns remain limited.
Methods
We used network analysis to identify pre-existing comorbidity modules (i.e., groups of diseases more densely interconnected with each other than with other diseases in the comorbidity network) in a cohort of 420,920 individuals from the UK Biobank who were in England. We defined cases requiring hospitalization or who died of COVID-19 as “severe COVID-19”. Logistic regression was used to examine associations between comorbidity modules and severe COVID-19, and a module-based comorbidity index was developed to predict severe COVID-19, compared with existing indices.
Results
Comorbidity network analysis identified 190 disease pairs with confirmed comorbidity associations, which were further divided into seven comorbidity modules. Among the 30,914 individuals diagnosed with COVID-19, 3,970 were identified as severe cases (median age of 73.6 years, 58.77% being male). Six of the seven identified modules showed statistically significant associations with severe COVID-19, especially modules related to circulatory and respiratory diseases (odds ratio = 1.67 [95% confidence interval 1.54–1.81]) and age-related eye diseases (1.39 [1.27–1.52]). Associations did not differ by sex, age or vaccination status but were generally stronger during the first wave of the COVID-19 pandemic (i.e., 31st January to 1st October, 2020). Our newly developed module-based comorbidity index showed better performance in predicting severe COVID-19 (AUC = 0.779) compared to the existing Charlson Comorbidity Index (0.714) and the 16-comorbidity index (0.714).
Conclusions
Our study demonstrated that pre-existing comorbidity modules, particularly modules related to circulatory and respiratory diseases and age-related eye diseases, were associated with severe COVID-19. Moreover, the module-based comorbidity index provides better prediction of severe COVID-19 than existing prediction indices.
Citation: Zhang J, Hou C, Chen W, Hu Y, Xu S, Liu H, et al. (2025) Comorbidity patterns associated with severe COVID-19 outcomes: A cohort study based on the UK Biobank. PLoS One 20(8): e0329701. https://doi.org/10.1371/journal.pone.0329701
Editor: Sreeram V. Ramagopalan, University of Oxford, UNITED KINGDOM OF GREAT BRITAIN AND NORTHERN IRELAND
Received: May 13, 2025; Accepted: July 20, 2025; Published: August 22, 2025
Copyright: © 2025 Zhang et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: Data cannot be shared publicly by the authors because of information governance restrictions on health data. However, the data are available from the UK Biobank following a project approval process. Researchers wishing to access the data can apply directly via https://www.ukbiobank.ac.uk/enable-your-research/apply-for-access. The application involves registering on the access management system, submitting a research study protocol, and paying a fee directly to the UK Biobank. The authors of this study did not receive any special privileges in accessing the data. UK Biobank is an open access research resource and accepts applications without restriction. This research has been conducted using the UK Biobank Resource under Application 54803.
Funding: This work was supported by the 1·3·5 Project for Disciplines of Excellence, West China Hospital, Sichuan University (grant no. ZYYC21005 to HS, https://www.wchscu.cn/scientific/office.html), the Natural Science Foundation of Sichuan Province (grant no. 2024NSFSC1568 to CH, https://kjt.sc.gov.cn/kjt/ywpt/newschildsecondywpt.shtml), and the EU Horizon 2020 Research and Innovation Action Grant (grant no. 847776 to UAV and FF, https://cordis.europa.eu/project/id/847776/results). These funding sources had no role in study design, data collection and analyses, or manuscript preparation and submission.
Competing interests: The authors have declared that no competing interests exist.
Abbreviations: ACE2, angiotensin-converting enzyme 2; AIC, Akaike information criterion; AUC, the area under the receiver operating characteristic curve; BIC, Bayesian information criterion; BMI, body mass index; CI, confidence interval; COPD, chronic obstructive pulmonary disease; COVID-19, coronavirus disease 2019; DIL, node importance ranking method; GBD, Global Burden of Disease; ICD-10, International Classification of Diseases 10th edition; NHS, National Health Service; OR, odds ratio; RR, relative risk; SARS-CoV-2, severe acute respiratory syndrome coronavirus 2; SVM, support vector machine; TDI, Townsend deprivation index; XGBoost, EXtreme Gradient Boosting
Introduction
As of November 2024, the global coronavirus disease 2019 (COVID-19) pandemic has affected 10% of the world’s population, resulting in approximately 137 million hospitalized cases and nearly seven million deaths [1]. Most people experience mild symptoms and recover within a few weeks, whilst others suffer severe or even life-threatening conditions which require intensive medical interventions and might lead to serious and prolonged complications [2]. Therefore, to reduce the burden on healthcare systems and diminish the impact of such pandemics on society, it is important to identify individuals at high risk of progressing to severe COVID-19 (i.e., hospitalization or death due to COVID-19) and offer a basis for implementing early risk prediction.
As the pandemic continues to evolve, a considerable research effort has been devoted to characterizing risk factors potentially associated with severe COVID-19. Besides demographic and lifestyle factors such as advanced age, high body mass index (BMI) [3], and current smoking, a significant proportion of severe COVID-19 has been attributed to pre-existing comorbidities [4]. Indeed, previous studies have found that comorbidities, such as hypertension [5], diabetes mellitus [6], cardiovascular disease [7], chronic obstructive pulmonary disease [8], and chronic kidney disease [9], are associated with increased risk of hospitalization or death due to COVID-19. However, findings on other major comorbidities such as psychiatric disorders and autoimmune diseases are inconsistent or still lacking [10,11]. Instead of focusing on single diseases, recent efforts have focused on disease networks, partitioning co-occurring diseases into distinct modules with potentially shared underlying biological processes or a shared genetic basis [12,13]. Representing concurrent diseases through networks may provide a more comprehensive understanding of the links between pre-existing comorbidities and severe COVID-19, with possible explanations of the underlying biological mechanisms.
In the present study, leveraging extensive information on healthcare records and COVID-19 status in the UK Biobank, we aimed to identify pre-existing comorbidity modules in a large sample of study participants and elucidate their impact on the occurrence of severe COVID-19. Further, based on the importance of each comorbidity in the corresponding module and the impact of the module on the risk of severe COVID-19, we developed a module-based comorbidity index for predicting severe COVID-19. Together, findings of the present study have potential to advance our understanding of mechanisms underlying disease progression of COVID-19 and provide a basis for developing medical management for individuals affected by various COVID-19 variants or other contagious diseases in future pandemics of similar kind.
Methods
Study design
The present study was based on the UK Biobank, a large community-based cohort that enrolled over 500,000 individuals aged 40−69 years across the UK between 2006 and 2010 [14]. This cohort study is described in detail elsewhere [15]. The UK Biobank has released primary care and COVID-19 test data for ~92% of participants in England to support COVID-19 research. Primary care data come from two major GP systems, and COVID-19 test results (RT-PCR) are sourced from Public Health England’s surveillance system.
Among the 502,507 participants of UK Biobank, we included 445,757 individuals after exclusion of participants who withdrew their informed consent (n = 101, S1 Fig) and those who were registered outside of England (n = 56,649) as there are no complete primary care and COVID-19 test data for participants in Scotland and Wales. We further excluded 24,837 individuals who had died or were lost to follow-up before 31st January 2020 (i.e., the date of the first reported COVID-19 case in the UK), leaving 420,920 participants in the analytical cohort for the identification of comorbidity modules. Among these, 30,914 individuals were identified as COVID-19 cases, according to COVID-19 diagnostic tests, diagnoses in primary care (codes listed in S1 Table) or inpatient hospital data (International Classification of Diseases 10th edition [ICD-10]: U07.1 or U07.2), and underlying cause of death in the mortality data (ICD-10: U07.1 or U07.2). Follow-up for severe COVID-19 among all cases started from their date of diagnosis and continued until hospitalization or death within 30 days post-diagnosis or the end of the study (November 30, 2021), whichever occurred first. The incidence rate of COVID-19 was comparable between the analytical cohort and the entire UK as reported by the Office for National Statistics [16] (see S2 Table), suggesting that the majority of COVID-19 cases were captured in this analysis.
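The cohort derivation above is simple subtraction, and it can be checked in a few lines. The following sketch uses only counts reported in this article (the number of severe cases is taken from the Results); the variable names are illustrative.

```python
# Counts reported in the text; variable names are illustrative only.
enrolled = 502_507
withdrew_consent = 101
registered_outside_england = 56_649
died_or_lost_before_start = 24_837

eligible = enrolled - withdrew_consent - registered_outside_england
analytical_cohort = eligible - died_or_lost_before_start

covid_cases = 30_914
severe_cases = 3_970  # reported in the Results
severe_share = severe_cases / covid_cases

print(eligible)                      # participants retained after the first exclusions
print(analytical_cohort)             # participants used for module identification
print(round(100 * severe_share, 2))  # percentage of COVID-19 cases classified as severe
```

The two differences reproduce the 445,757 and 420,920 participants stated above, and roughly 13% of identified COVID-19 cases met the definition of severe COVID-19.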
The UK Biobank study has received full ethical approval from the NHS National Research Ethics Service (16/NW/0274), and all participants provided written informed consent before data collection. The present study was approved by the biomedical research ethics committee of West China Hospital (2020.661).
Ascertainment of pre-existing comorbidities.
We used the term ‘comorbidity’ broadly to encompass a spectrum of medical conditions, including both chronic and acute illnesses, that were diagnosed before COVID-19 [17,18]. Since we focused on diseases that are common in the general population with substantial impact on healthcare systems, we restricted our analyses to a total of 20 communicable and 95 non-communicable diseases listed in the Global Burden of Disease (GBD) Study 2019 [19]. All diagnoses (i.e., main and secondary) in the primary care and inpatient hospital data before 31st January 2020 were used for disease ascertainment, according to corresponding ICD-10 codes (see S3 Table). Specifically, as primary care diagnoses are coded using SNOMED-CT and Read V3 codes, these were converted to ICD-10 codes, as used for inpatient hospital data, using the latest release of cross-maps provided by NHS Digital [20,21].
Ascertainment of severe COVID-19.
Severe COVID-19 cases were defined as (i) those admitted to hospital or who died with this diagnosis, or (ii) those admitted to hospital with or without a COVID-19 diagnosis, or who died, within 30 days of a positive COVID-19 diagnostic test or a COVID-19 diagnosis recorded in the primary care data. For subsequent hospital admissions or deaths lacking a COVID-19 diagnosis, inclusion was restricted to those potentially related to COVID-19, excluding underlying causes of death or primary diagnosis codes related to pregnancy, perinatal conditions, symptoms, or signs, as well as external causes of morbidity and mortality (refer to S4 Table).
Covariates.
Sociodemographic and lifestyle factors, including date of birth, sex, annual household income, and smoking and drinking status, were collected using touchscreen questionnaires at recruitment. The Townsend deprivation index (TDI), a widely used measure of population-level deprivation [22], was assigned to each participant based on the postal codes provided at recruitment. Additionally, we calculated body mass index (BMI) using their height and weight measurements. We extracted COVID-19 vaccination information for each participant from primary care data, using a list of predefined SNOMED-CT and Read V3 codes (see S5 Table).
Statistical analysis
Comorbidity network analysis for identification of comorbidity modules.
The comorbidity network analysis was constructed to investigate the diversity of pre-existing disease patterns prior to the diagnosis of COVID-19. We followed previously described steps of comorbidity network analysis to identify comorbidity modules (i.e., groups of diseases more densely interconnected with each other than with other diseases in the comorbidity network) [18]. A detailed description of the analysis steps is available in S1 File. Specifically, disease pairs with sufficient prevalence (i.e., co-occurrence in at least 1% of the study population) and comorbidity strength (measured by Pearson’s correlation and relative risk of a disease pair in the same individual) were pre-selected and subsequently verified using logistic regression, controlling for potential confounders. The selected disease pairs were used to construct a comorbidity network, and comorbidity modules within the network (i.e., clusters of highly interconnected comorbidities) were identified using the Louvain community detection algorithm [23].
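As an illustration of the pre-selection step, the sketch below computes the relative risk and Pearson's correlation (the phi coefficient for two binary variables) for each disease pair from per-individual diagnosis sets. The formulas are the ones commonly used in comorbidity network studies and are an assumption here; the exact definitions and thresholds used in this study are given in S1 File. The function name, data layout, and toy records are hypothetical.

```python
from itertools import combinations
from math import sqrt

def pair_metrics(records, min_cooccurrence=0.01):
    """Return {(disease_a, disease_b): (relative_risk, phi)} for pairs that
    co-occur in at least `min_cooccurrence` of individuals.

    records: list of sets, each holding one individual's diagnosed diseases.
    """
    n = len(records)
    prevalence = {}
    for diseases in records:
        for d in diseases:
            prevalence[d] = prevalence.get(d, 0) + 1

    results = {}
    for a, b in combinations(sorted(prevalence), 2):
        c_ab = sum(1 for diseases in records if a in diseases and b in diseases)
        if c_ab / n < min_cooccurrence:
            continue  # pair too rare to be considered
        p_a, p_b = prevalence[a], prevalence[b]
        # Relative risk: observed co-occurrence over that expected under independence.
        rr = c_ab * n / (p_a * p_b)
        # Phi coefficient: Pearson's correlation between two binary indicators.
        denom = sqrt(p_a * p_b * (n - p_a) * (n - p_b))
        phi = (c_ab * n - p_a * p_b) / denom if denom else 0.0
        results[(a, b)] = (rr, phi)
    return results

# Toy data (hypothetical): 10 individuals.
records = ([{"asthma", "copd"}] * 3 + [{"asthma"}] + [{"copd"}]
           + [{"cataract"}] * 2 + [set()] * 3)
metrics = pair_metrics(records)
print(metrics)
```

Pairs passing such a filter would then be verified with adjusted logistic regression and used as edges of the network, on which a Louvain implementation (for example, the `louvain_communities` function in NetworkX) could identify the modules.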
Association analyses.
Logistic regression was used to investigate the associations between pre-existing comorbidity modules and risk of severe COVID-19. For each comorbidity module identified, we first calculated the odds ratio (OR) of severe COVID-19 in relation to being diagnosed with any disease in the module, adjusting for age, sex, annual household income, TDI, BMI, smoking status, drinking status, and disease status of other modules (i.e., being diagnosed with any disease in the corresponding module). Additionally, we estimated ORs for being diagnosed with different numbers of comorbidities within a module (i.e., 1, 2, and 3+). Finally, we calculated the OR of severe COVID-19 in relation to being diagnosed for each individual disease of the comorbidity module.
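For readers less familiar with how ORs and confidence intervals relate to logistic regression output, the following sketch shows the standard conversion between a log-odds coefficient with its standard error and an OR with a Wald 95% CI. It is not the study's code; the function names are hypothetical, and the back-calculation from a published CI is only approximate because reported values are rounded.

```python
from math import exp, log

def odds_ratio_ci(beta, se, z=1.96):
    """Convert a logistic regression coefficient and its standard error
    into an odds ratio with a Wald confidence interval."""
    return exp(beta), exp(beta - z * se), exp(beta + z * se)

def coef_from_ci(odds_ratio, lower, upper, z=1.96):
    """Approximate the coefficient and standard error from a reported OR and CI."""
    beta = log(odds_ratio)
    se = (log(upper) - log(lower)) / (2 * z)
    return beta, se

# Round trip using the OR reported in this article for the circulatory and
# respiratory diseases module: 1.67 (95% CI 1.54-1.81).
beta, se = coef_from_ci(1.67, 1.54, 1.81)
or_, lo, hi = odds_ratio_ci(beta, se)
print(round(beta, 3), round(se, 3))
print(round(or_, 2), round(lo, 2), round(hi, 2))
```

The interval is symmetric on the log-odds scale, which is why the reported limits are not equidistant from the OR itself.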
We conducted sub-analyses for males and females separately. To assess the effect of age, we also conducted sub-analyses separately for individuals aged <66 years and those aged ≥66 years (i.e., the median age). To investigate the effect of vaccination on the association between comorbidity modules and severe COVID-19, we conducted separate analyses for COVID-19 cases with and without vaccination at the time of diagnosis. To further investigate the influence of vaccination status on the observed associations, we stratified COVID-19 cases with vaccination based on the time from the last vaccination dose to the COVID-19 diagnosis (i.e., <6 months or ≥6 months). Finally, we repeated the analyses for COVID-19 cases diagnosed during the first wave (before 1st October, 2020) and second wave (after 1st October, 2020) of the COVID-19 pandemic in the UK separately, to investigate the influence of different viral strains on the results [24,25]. Differences in ORs were assessed by introducing an interaction term to the logistic regression.
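The study tested differences between strata with an interaction term in the regression model. When only stratum-specific ORs and CIs are available, a commonly used approximation is a z-test on the difference of the two log ORs. The sketch below implements that approximation, which is a different technique from the interaction model, and the input values are purely illustrative rather than results from this study.

```python
from math import erfc, log, sqrt

def compare_ors(or1, ci1, or2, ci2, z_crit=1.96):
    """Approximate two-sided test for a difference between two independent ORs.

    ci1 and ci2 are (lower, upper) 95% limits; standard errors are recovered
    from the CI widths on the log scale.
    """
    se1 = (log(ci1[1]) - log(ci1[0])) / (2 * z_crit)
    se2 = (log(ci2[1]) - log(ci2[0])) / (2 * z_crit)
    z = (log(or1) - log(or2)) / sqrt(se1 ** 2 + se2 ** 2)
    p = erfc(abs(z) / sqrt(2))  # two-sided p-value under a standard normal
    return z, p

# Illustrative (hypothetical) stratum-specific estimates.
z, p = compare_ors(2.0, (1.6, 2.5), 1.0, (0.8, 1.25))
print(round(z, 2), p < 0.001)
```

This shortcut assumes the two strata are independent samples; an interaction term fitted in a single model additionally shares the covariate adjustment across strata.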
Module-based comorbidity index.
Based on the identified comorbidity modules and their association with severe COVID-19, we developed a module-based comorbidity index to predict the risk of severe COVID-19. A detailed description of the analysis steps is available in Supplementary Methods. The index included 51 diseases, with weight of each disease calculated as the product of the disease’s importance in the corresponding module and the OR of the association between the corresponding module and severe COVID-19. The importance of each disease in a module was estimated using a previously proposed importance ranking method for complex networks [26]. We also developed six simplified module-based comorbidity indices by including the top 15, 20, 25, 30, 35, or 40 diseases with the highest weight, among the 51 diseases, to understand if it is possible to reduce the information needed for calculating the comorbidity score.
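A minimal sketch of how such an index could be computed is given below, assuming the score for an individual is the sum of the weights of their diagnosed diseases (the aggregation rule is an assumption; the full procedure is in the Supplementary Methods). The module ORs are those reported in this article, whereas the importance values, module labels, and function names are hypothetical.

```python
def disease_weights(importance, module_of, module_or):
    """Weight of each disease = its importance within its module
    multiplied by the module's OR for severe COVID-19."""
    return {d: importance[d] * module_or[module_of[d]] for d in importance}

def top_k(weights, k):
    """Simplified index: keep only the k diseases with the highest weight."""
    kept = sorted(weights, key=weights.get, reverse=True)[:k]
    return {d: weights[d] for d in kept}

def comorbidity_score(diagnoses, weights):
    """Sum the weights of the diseases an individual has; unlisted diseases add 0."""
    return sum(weights.get(d, 0.0) for d in diagnoses)

# Module ORs as reported in the article; importance values are hypothetical.
module_or = {"circulatory_respiratory": 1.67, "eye": 1.39}
module_of = {"asthma": "circulatory_respiratory", "cataract": "eye"}
importance = {"asthma": 2.0, "cataract": 3.0}

weights = disease_weights(importance, module_of, module_or)
score = comorbidity_score({"asthma", "cataract", "unlisted condition"}, weights)
simplified = top_k(weights, 1)
print(weights, round(score, 2), simplified)
```

Restricting the dictionary with `top_k` mirrors the idea of the simplified indices: fewer diseases need to be ascertained, at the cost of discarding lower-weight information.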
To evaluate the performance of the module-based comorbidity index in predicting severe COVID-19, we compared its performance with two existing indices, the Charlson Comorbidity Index (CCI) [27] and a 16-comorbidity based index [28]. We used both Support Vector Machine (SVM) and Extreme Gradient Boosting (XGBoost) models that incorporated age and sex as additional covariates in this comparison. We randomly divided the dataset into training and test sets in a 9:1 ratio for index development and evaluation, respectively. The area under the receiver operating characteristic curve (AUC) was chosen as the primary evaluation metric, and differences in AUC between indices were compared using the DeLong test [29]. Additionally, we considered the Akaike information criterion (AIC) [30] and Bayesian information criterion (BIC) [31] as secondary evaluation metrics. The detailed workflow is depicted in S2 Fig.
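To make the evaluation metrics concrete, the sketch below computes the AUC as the probability that a randomly chosen severe case receives a higher predicted score than a randomly chosen non-severe case (the Mann-Whitney formulation), together with the textbook AIC and BIC formulas. It is not the study's pipeline: the DeLong test and the model fitting are omitted, and the function names and toy data are hypothetical.

```python
from math import log

def auc(labels, scores):
    """AUC via pairwise comparison of positives and negatives; ties count 0.5.
    Quadratic in sample size, so suitable only for small illustrations."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln n - 2 ln L (lower is better)."""
    return n_params * log(n_obs) - 2 * log_likelihood

# Toy example (hypothetical): 1 = severe COVID-19, 0 = non-severe.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auc(labels, scores))
print(aic(-100.0, 3), round(bic(-100.0, 3, 1000), 2))
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect ranking, which gives context for the difference between values such as 0.714 and 0.779.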
All analyses were carried out using R (version 3·6·2, R Foundation for Statistical Computing, Vienna, Austria) and Python 3·8 (Python Software Foundation, Delaware, USA), with a two-sided P-value < 0·05 considered statistically significant.
Results
We included a total of 420,920 participants as our study population for comorbidity network analysis. The median age at recruitment was 69·30 years, and 54·31% of the participants were females. Among the 115 comorbidities studied, the most prevalent non-communicable diseases included bacterial skin diseases, low back pain, and anxiety disorders, while the most prevalent communicable diseases included upper respiratory infections and lower respiratory infections. A total of 30,914 individuals with COVID-19 were identified in the study population, of whom 3,970 were classified as having severe COVID-19 (S1 Fig and S6 Table). Compared to mild COVID-19 cases, severe COVID-19 cases were generally older (median age 73·6 vs. 64·6, P < .001) and more likely to be male (58·77% vs. 45·18%, P < .001), obese (41·61% vs. 27·75% for BMI ≥ 29·9, P < .001), and current smokers (41·86% vs. 34·22%, P < .001) but less likely to have a higher household income (10·88% vs. 23·49% for household income ≥£52,000, P < .001) and be current drinkers (84·89% vs. 91·50%, P < .001) (Table 1).
Comorbidity modules
The comorbidity network analysis identified 190 disease pairs with confirmed comorbidity associations, which were further divided into seven comorbidity modules (Fig 1). According to the predominant diseases in the module, they were named as age-related eye diseases module, cardiometabolic diseases module, infectious and neuropsychiatric diseases module, circulatory and respiratory diseases module, gastrointestinal diseases module, digestive diseases module, and mental and skin disorders module, respectively.
Main analyses
As shown in Fig 2A, six of the seven modules (all except the infectious and neuropsychiatric diseases module) demonstrated statistically significant associations with severe COVID-19. The strongest association was found for the circulatory and respiratory diseases module (OR: 1·67, 95% CI: 1·54–1·81), followed by the age-related eye diseases module (OR: 1·39, 95% CI: 1·27–1·52), digestive diseases module (OR: 1·35, 95% CI: 1·25–1·45), gastrointestinal diseases module (OR: 1·30, 95% CI: 1·20–1·41), cardiometabolic diseases module (OR: 1·27, 95% CI: 1·17–1·40), and mental and skin disorders module (OR: 1·27, 95% CI: 1·17–1·37). The magnitude of the association for each individual disease was generally similar to that of the comorbidity module it belonged to, with the highest ORs observed for decubitus ulcer, atrial fibrillation and flutter, and diabetes mellitus (Fig 3 and S7 Table). Notably, 14 diseases were found to be associated with a reduced risk of severe COVID-19, mainly in the infectious and neuropsychiatric diseases module (Fig 3). Further analyses by numbers of diagnosed comorbidities in each module revealed that, of the seven modules, only two exhibited a distinct dose-response relationship between increasing number of comorbidities in the same module and increasing risk of severe COVID-19 (Fig 2B).
A, association between individual comorbidity modules and risk of severe COVID-19; B, association between numbers of diagnosed diseases within a module and risk of severe COVID-19. OR and the corresponding 95% CI were derived from logistic regression, adjusted for age, sex, Townsend deprivation index, annual household income, BMI, smoking status, drinking status, and disease status of other modules.
OR and the corresponding 95% CI were derived from the logistic regression, adjusted for age, sex, Townsend deprivation index, household income, BMI, smoking, and alcohol drinking status. The red color indicates positive association (i.e., OR>1) and the green color indicates negative association (i.e., OR<1). The intensity of the color represents the magnitude of the corresponding association. Detailed results are shown in S7 Table.
The sub-analyses revealed that the results were largely similar for most comorbidity modules between males and females, although females had stronger associations for modules related to gastrointestinal diseases (OR for females vs. males: 1·43 vs. 1·21, P-value for interaction = .048), and mental and skin disorders (1·39 vs. 1·18, P-value = .049; S3 Fig). Similarly, age-stratified analyses indicated generally consistent associations across age groups, except for the circulatory and respiratory diseases module, which showed a stronger association in the older group (OR for ≥66 vs. <66 years: 1·90 vs. 1·43; P for interaction = .006; S4 Fig). Although a lower proportion of severe COVID-19 cases was noted among COVID-19 cases that were vaccinated at the time of diagnosis, the observed association was slightly stronger among vaccinated patients than among unvaccinated patients for most comorbidity modules, apart from the circulatory and respiratory diseases module (OR for vaccinated vs. unvaccinated: 1·31 vs. 1·83, P-value = .002) and the cardiometabolic diseases module (OR for vaccinated vs. unvaccinated: 1·11 vs. 1·38, P-value = .191) (S5 Fig). No clear difference in ORs was identified when stratifying by the time from the last vaccine dose to the COVID-19 diagnosis (S6 Fig). Finally, the stratification analyses by the wave of the COVID-19 pandemic revealed stronger associations for almost all the comorbidity modules during the first wave, compared to the second wave. The largest difference was noted for the infectious and neuropsychiatric diseases module (OR for wave 1 vs. wave 2: 1·94 vs. 1·11, P-value < .001), followed by the digestive diseases module (1·27 vs. 1·74, P-value = .002) (S7 Fig).
Module-based comorbidity index
The weights of individual diseases in different comorbidity modules used to calculate the module-based comorbidity index are presented in S8 Table. Cataract in the age-related eye diseases module was assigned the highest weight (5·00), followed by diphtheria in the gastrointestinal diseases module (3·39) and four diseases in the circulatory and respiratory diseases module (i.e., asthma, atrial fibrillation and flutter, ischemic heart disease, and chronic obstructive pulmonary disease, with a weight of 3·14, 2·50, 2·48 and 2·25, respectively). The full module-based comorbidity index, which included all 51 diseases, achieved a statistically significantly higher AUC of 0·779 on the XGBoost model, compared to the CCI (0·714) and 16-comorbidity index (0·714), as confirmed by the DeLong test (P < .001) (Fig 4). Furthermore, the module-based comorbidity index also demonstrated favorable performance on the secondary evaluation metrics, as evidenced by its lower AIC and BIC values (S9 Table). Although the predictive performance of the module-based comorbidity index decreased with the number of included diseases, the simplified index, which included the top 15 diseases, still achieved satisfactory results (S9 Table). Using the SVM model, we further confirmed the superior predictive performance of the module-based comorbidity index compared to the CCI and 16-comorbidity index (S9 Table).
AUC, the area under the receiver operating characteristic curve; CCI, Charlson Comorbidity Index. The 16-comorbidity based index, which includes 16 diseases, was reported in previous literature. The full module-based comorbidity index included all 51 diseases, and the simplified module-based comorbidity index included the top 15 diseases.
Discussion
In this cohort study based on UK Biobank, we identified six distinct comorbidity modules that were statistically significantly associated with severe COVID-19, with the highest effect size observed for modules characterized by circulatory and respiratory diseases followed by age-related eye diseases. We further noted that females with comorbidity modules related to gastrointestinal diseases and mental and skin disorders were more susceptible to severe COVID-19. We also found stronger associations between the majority of the comorbidity modules and risk of severe COVID-19 during the first wave than during the second wave of the pandemic, with limited modification by vaccination status. Finally, the new module-based comorbidity index outperformed existing prediction indices in predicting severe COVID-19. Overall, our study provides valuable insights into the relationship between pre-existing comorbidity patterns and severe COVID-19 and has important implications for identifying high-risk individuals and delivering personalized medical care during a pandemic like COVID-19.
While no existing studies have examined the relationship between comorbidity modules and risk of severe COVID-19, our finding that six comorbidity modules are associated with the risk of severe COVID-19 aligns with the findings of previous studies investigating individual diseases or groups of diseases, including chronic obstructive pulmonary disease [32], stroke [33], diabetes [34], depression [35], inflammatory bowel disease [36], and cirrhosis [37]. Notably, the circulatory and respiratory diseases module exhibited the strongest association with severe COVID-19, corroborating several prior studies that identified respiratory diseases [38] as well as hypertension and cardiovascular diseases [39] as potent risk factors for severe COVID-19. This finding could be attributed to the high expression of angiotensin-converting enzyme 2 (ACE2) in the lung, heart, and blood vessels [40], with ACE2 receptors serving as important binding sites for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) [41]. In addition, individuals with respiratory diseases may have a heightened risk due to their anatomically small airways and impaired pulmonary function [42]. We found that the age-related eye diseases module was also strongly related to severe COVID-19. This finding is supported by two recent studies where individuals with exudative age-related macular degeneration and individuals on a waiting list for cataract surgery were shown to have a higher risk of developing severe COVID-19 [43,44]. A potential explanation for this finding is that eye diseases may be an indicator of biological aging and chronic inflammation, which are also well-established risk factors for severe COVID-19. We also observed strong associations of the digestive diseases module and the gastrointestinal diseases module with severe COVID-19 [43,45,46]. These associations might be linked to medications, including those with anticholinergic effects or those affecting the gastrointestinal system [47].
Conversely, our study found that the infectious and neuropsychiatric diseases module, although common, was not statistically significantly associated with risk of severe COVID-19. This result may be attributable to heightened self-protection awareness among individuals with these conditions [48], as well as to dysregulated immune responses that may specifically contribute to certain components of the anti–COVID-19 immune response [49].
Stratification analysis by sex revealed stronger associations for gastrointestinal diseases and mental and skin disorders in females, compared to males. This may be due to the fact that females typically possess stronger innate and adaptive immune responses than males, particularly in the presence of specific diseases (i.e., inflammatory bowel diseases and depression) [50,51]. In the case of COVID-19, this heightened immune response may also elevate the risk of cytokine imbalances, thereby increasing the likelihood of severe COVID-19 in females. Age-stratified analyses showed generally consistent associations across age groups, suggesting these links are not solely attributable to age, although modest differences in effect size were noted for individual diseases, particularly those within the circulatory and respiratory module. In the stratification analysis based on vaccination status, we observed a marginal impact of vaccination. We speculate this might be attributed to the prioritization of early vaccine recipients based on risk factors associated with severe outcomes [52]. Our findings suggest that the presence of comorbidities is equally significant for COVID-19 patients, irrespective of their vaccination status. In the stratification analysis by waves of the pandemic, we observed stronger associations between different comorbidity modules and risk of severe COVID-19 during wave 1. While this disparity might be largely attributed to delayed diagnosis (due to lack of testing for instance) and limited treatment for COVID-19 during wave 1 [25], other factors such as changes in viral strains and improved awareness and preparedness during wave 2 [24,53] could also have contributed.
The proposed module-based comorbidity index achieved superior risk prediction for severe COVID-19 when compared to existing indices, namely the CCI and the 16-comorbidity index, across all predictive criteria. One of the major advantages of this index is its comprehensive coverage of 51 diseases, which were selected from 115 common diseases through a data-driven approach. This enables a wide range of diseases that may affect patient outcomes to be included in one index for accurate risk prediction. In comparison, the CCI and the 16-comorbidity index have a limited definition of comorbidities and exclude many common conditions [54]. Additionally, unlike existing indices that either treat diseases equally or only consider individual diseases, our index considers both the magnitude of association between a comorbidity module and severe COVID-19 and the importance of each disease within its module. This rests on the assumption that comorbidities within the same module typically impact health outcomes in a modular fashion, rather than as individual diseases. Consequently, these two advantages make the module-based comorbidity index a more comprehensive and accurate comorbidity index for predicting the risk of severe COVID-19, and potentially of severe outcomes in future pandemics.
The present study has several key strengths. Firstly, utilizing a comorbidity network analysis approach in the study allowed us to comprehensively assess the association between comorbidity patterns and severe COVID-19. Secondly, by leveraging enriched data on healthcare records and COVID-19 test results, with full coverage for UK Biobank participants in England, we were able to identify the majority of COVID-19 cases in the study cohort and minimize information bias. Lastly, detailed sociodemographic and lifestyle factors collected in the UK Biobank enabled us to consider several important confounders in the analysis.
However, our study also has some limitations. First, the dynamic changes over the COVID-19 pandemic, such as the evolution of the virus, the use of vaccine boosters, and some novel medical interventions, may lead to uncertainties about the implications of our findings. However, a population-based study in the UK has found that, after the first vaccine booster, older people, those with high multimorbidity, and those with certain underlying health conditions remain at the highest risk of COVID-19-related hospitalization and death [55]. Therefore, the importance of multimorbidity to COVID-19 severity may persist. Additionally, looking beyond the current COVID-19 challenges, our study offers a novel approach (i.e., the module-based comorbidity index) to capture the joint influence of comorbidities on a health outcome, which could contribute to quicker or more comprehensive risk assessment in future epidemics by pinpointing high-risk groups and thereby refining preventive and therapeutic approaches.
Other limitations should also be noted. External validation using datasets outside of the UK Biobank is needed to validate the module-based comorbidity index. Additionally, our definition of severe COVID-19 is open to debate and may limit the generalizability of our findings to clinical settings (e.g., identifying high-risk COVID-19 patients requiring mechanical ventilation). Furthermore, the comorbidity network analysis method employed in our study has inherent limitations and may not fully eliminate the influence of confounding variables on the comorbidity associations. Despite our efforts to control for environmental confounders, such as sociodemographic and lifestyle risk factors (i.e., BMI, drinking status, smoking status, household income, physical activity, and Townsend deprivation index), the influence of other comorbidities in the network was not explicitly controlled for in the model and may have contributed to the observed complex comorbidity associations. Moreover, the confounding factors were collected at baseline and may not reflect participants' status at the time of COVID-19 diagnosis. Although we adjusted for multiple confounders in the analysis, residual confounding due to unmeasured factors such as comorbidity severity and medication use cannot be ruled out. Lastly, it is important to note that UK Biobank participants are not representative of the entire UK population or of the European population.
Conclusions
In this community-based cohort study, we identified six comorbidity modules that were associated with severe COVID-19, with the circulatory and respiratory diseases module showing the strongest association. Additionally, we observed that the associations of the gastrointestinal diseases module and the mental and skin disorders module with severe COVID-19 were more pronounced in females than in males. Finally, we developed a module-based comorbidity index that outperformed existing indices in predicting severe COVID-19. Our findings provide valuable insight into the relationship between pre-existing comorbidity patterns and severe COVID-19 and have important implications for identifying high-risk individuals and for personalized medical care.
Supporting information
S1 File. Comorbidity network and prediction methods.
https://doi.org/10.1371/journal.pone.0329701.s001
(PDF)
S1 Table. COVID-19 identification in primary care.
https://doi.org/10.1371/journal.pone.0329701.s002
(PDF)
S2 Table. The theoretical and observed infection rate in the study population.
https://doi.org/10.1371/journal.pone.0329701.s003
(PDF)
S3 Table. List of 115 GBD codes used in the current study.
https://doi.org/10.1371/journal.pone.0329701.s004
(PDF)
S4 Table. Diagnostic codes for the identification of severe COVID-19.
https://doi.org/10.1371/journal.pone.0329701.s005
(PDF)
S5 Table. COVID-19 vaccination identification in primary care.
https://doi.org/10.1371/journal.pone.0329701.s006
(PDF)
S6 Table. The ICD-10 codes for the identification of severe COVID-19 cases.
https://doi.org/10.1371/journal.pone.0329701.s007
(PDF)
S7 Table. The association between each individual disease and the risk of severe COVID-19.
Adjusted for age, sex, annual household income, TDI, BMI, smoking status, drinking status, and the disease status of other modules.
https://doi.org/10.1371/journal.pone.0329701.s008
(PDF)
S8 Table. The node importance of diseases in seven comorbidity modules.
https://doi.org/10.1371/journal.pone.0329701.s009
(PDF)
S9 Table. Comparison of comorbidity indices in predicting risk of severe COVID-19.
AIC, Akaike Information Criterion; AUC, the area under the receiver operating characteristic curve; BIC, Bayesian Information Criterion; CCI, Charlson Comorbidity Index; SVM, Support Vector Machine; XGBoost, eXtreme Gradient Boosting. The 16-comorbidity-based index was reported in previous literature and includes 16 diseases. The full module-based comorbidity index included all 51 diseases.
https://doi.org/10.1371/journal.pone.0329701.s010
(PDF)
S1 Fig. Flowchart of the study population selection.
The first COVID-19 case was confirmed on 31 January 2020 in the UK.
https://doi.org/10.1371/journal.pone.0329701.s011
(TIF)
S2 Fig. Flow chart of constructing the predictive model.
AIC, Akaike Information Criterion; AUC, the area under the receiver operating characteristic curve; BIC, Bayesian Information Criterion; CCI, Charlson Comorbidity Index. The 16 comorbidities reported in previous literature are coronary artery disease, heart failure, atrial fibrillation, T1DM, T2DM, hypertension, asthma, COPD, cancer, dementia, depression, anxiety, psychosis, bipolar disorder, cognitive impairment, and stroke.
https://doi.org/10.1371/journal.pone.0329701.s012
(TIF)
S3 Fig. Sex-specific associations between comorbidity modules and severe COVID-19.
ORs (95%CI) were derived from fully adjusted logistic models (adjusted for age, Townsend deprivation index, annual household income, BMI, smoking status, drinking status, and disease status of other modules). The interaction between comorbidity modules and sex was evaluated by including the interaction term in the logistic models. * P < .05.
https://doi.org/10.1371/journal.pone.0329701.s013
(TIF)
S4 Fig. Associations between comorbidity modules and severe COVID-19, stratified by median age.
ORs (95%CI) were derived from fully adjusted logistic models (adjusted for sex, Townsend deprivation index, annual household income, BMI, smoking status, drinking status, and disease status of other modules). The interaction between comorbidity modules and age was evaluated by including the interaction term in the logistic models. * P < .05.
https://doi.org/10.1371/journal.pone.0329701.s014
(TIF)
S5 Fig. Association between comorbidity modules and severe COVID-19, stratified by vaccination status at diagnosis.
ORs (95%CI) were derived from fully adjusted logistic models (adjusted for age, sex, Townsend deprivation index, annual household income, BMI, smoking status, drinking status, and disease status of other modules). The interaction between comorbidity modules and vaccination status was evaluated by including the interaction term in the logistic models. * P < .05.
https://doi.org/10.1371/journal.pone.0329701.s015
(TIF)
S6 Fig. Association between comorbidity modules and severe COVID-19, stratified by time of last vaccination dose at diagnosis.
ORs (95%CI) were derived from fully adjusted logistic models (adjusted for age, sex, Townsend deprivation index, annual household income, BMI, smoking status, drinking status, and disease status of other modules).
https://doi.org/10.1371/journal.pone.0329701.s016
(TIF)
S7 Fig. Association between comorbidity modules and severe COVID-19 stratified by different waves of the pandemic.
ORs (95%CI) were derived from fully adjusted logistic models (adjusted for age, sex, Townsend deprivation index, annual household income, BMI, smoking status, drinking status, and disease status of other modules). The interaction between comorbidity modules and different waves was evaluated by including the interaction term in the logistic models. * P < .05.
https://doi.org/10.1371/journal.pone.0329701.s017
(TIF)
Acknowledgments
This work uses data provided by patients and collected by the NHS as part of their care and support. This research used data assets made available by National Safe Haven as part of the Data and Connectivity National Core Study, led by Health Data Research UK in partnership with the Office for National Statistics. We thank all the sponsors and team members involved in West China Biomedical Big Data Center and Med-X Center for Informatics, Sichuan University.
References
- 1. WHO. Coronavirus disease (COVID-19) situation reports. 2022. [cited 2022 Dec]. Available from: https://www.who.int/emergencies/diseases/novel-coronavirus-2019/situation-reports
- 2. Ballow M, Haga CL. Why do some people develop serious COVID-19 disease after infection, while others only exhibit mild symptoms? J Allergy Clin Immunol Pract. 2021;9(4):1442–8.
- 3. Chen U-I, Xu H, Krause TM, Greenberg R, Dong X, Jiang X. Factors associated with COVID-19 death in the United States: cohort study. JMIR Public Health Surveill. 2022;8(5):e29343. pmid:35377319
- 4. Laires PA, Dias S, Gama A, Moniz M, Pedro AR, Soares P, et al. The association between chronic disease and serious COVID-19 outcomes and its influence on risk perception: survey study and database analysis. JMIR Public Health Surveill. 2021;7(1):e22794. pmid:33433397
- 5. Rodilla E, Saura A, Jiménez I, Mendizábal A, Pineda-Cantero A, Lorenzo-Hernández E, et al. Association of hypertension with all-cause mortality among hospitalized patients with COVID-19. J Clin Med. 2020;9(10):3136. pmid:32998337
- 6. Cao H, Baranova A, Wei X, Wang C, Zhang F. Bidirectional causal associations between type 2 diabetes and COVID-19. J Med Virol. 2023;95(1):e28100. pmid:36029131
- 7. CAPACITY-COVID Collaborative Consortium and LEOSS Study Group. Clinical presentation, disease course, and outcome of COVID-19 in hospitalized patients with and without pre-existing cardiac disease: a cohort study across 18 countries. Eur Heart J. 2022;43(11):1104–20. pmid:34734634
- 8. Liu H, Chen S, Liu M, Nie H, Lu H. Comorbid chronic diseases are strongly correlated with disease severity among COVID-19 patients: a systematic review and meta-analysis. Aging Dis. 2020;11(3):668–78. pmid:32489711
- 9. Li J, Huang DQ, Zou B, Yang H, Hui WZ, Rui F, et al. Epidemiology of COVID-19: a systematic review and meta-analysis of clinical characteristics, risk factors, and outcomes. J Med Virol. 2021;93(3):1449–58. pmid:32790106
- 10. Fitzgerald KC, Mecoli CA, Douglas M, Harris S, Aravidis B, Albayda J, et al. Risk factors for infection and health impacts of the coronavirus disease 2019 (COVID-19) pandemic in people with autoimmune diseases. Clin Infect Dis. 2022;74(3):427–36. pmid:33956972
- 11. Nemani K, Li C, Olfson M, Blessing EM, Razavian N, Chen J, et al. Association of psychiatric disorders with mortality among patients with COVID-19. JAMA Psychiatry. 2021;78(4):380–6. pmid:33502436
- 12. Barabási A-L, Gulbahce N, Loscalzo J. Network medicine: a network-based approach to human disease. Nat Rev Genet. 2011;12(1):56–68. pmid:21164525
- 13. Hidalgo CA, Blumm N, Barabási A-L, Christakis NA. A dynamic network approach for the study of human phenotypes. PLoS Comput Biol. 2009;5(4):e1000353. pmid:19360091
- 14. Sudlow C, Gallacher J, Allen N, Beral V, Burton P, Danesh J, et al. UK biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med. 2015;12(3):e1001779. pmid:25826379
- 15. Bycroft C, Freeman C, Petkova D, Band G, Elliott LT, Sharp K, et al. The UK Biobank resource with deep phenotyping and genomic data. Nature. 2018;562(7726):203–9. pmid:30305743
- 16. Burns EM, Rigby E, Mamidanna R, Bottle A, Aylin P, Ziprin P, et al. Systematic review of discharge coding accuracy. J Public Health (Oxf). 2012;34(1):138–48. pmid:21795302
- 17. Hou C, Yang H, Qu Y, Chen W, Zeng Y, Hu Y, et al. Health consequences of early-onset compared with late-onset type 2 diabetes mellitus. Precis Clin Med. 2022;5(2):pbac015. pmid:35774110
- 18. Hou C, Zeng Y, Chen W, Han X, Yang H, Ying Z, et al. Medical conditions associated with coffee consumption: Disease-trajectory and comorbidity network analyses of a prospective cohort study in UK Biobank. Am J Clin Nutr. 2022;116(3):730–40. pmid:35849013
- 19. GBD 2019 Risk Factors Collaborators. Global burden of 87 risk factors in 204 countries and territories, 1990-2019: a systematic analysis for the Global Burden of Disease Study 2019. Lancet. 2020;396(10258):1223–49. pmid:33069327
- 20. UK Biobank. Data providers and dates of data. [cited 2021 Apr 14]. Available from: https://biobank.ndph.ox.ac.uk/showcase/exinfo.cgi?src=Data_providers_and_dates
- 21. White T, Westgate K, Hollidge S, Venables M, Olivier P, Wareham N, et al. Estimating energy expenditure from wrist and thigh accelerometry in free-living adults: a doubly labelled water study. Int J Obes (Lond). 2019;43(11):2333–42. pmid:30940917
- 22. Townsend PP, Beattie A. Health and deprivation: inequality and the North. 1988. pp. 236.
- 23. De Meo PF, Fiumara G, Provetti A. Generalized Louvain method for community detection in large networks. In: 11th international conference on intelligent systems design and applications. IEEE; 2011. pp. 88–93.
- 24. Fokas AS, Kastis GA. SARS-CoV-2: the second wave in Europe. J Med Internet Res. 2021;23(5):e22431. pmid:33939621
- 25. Verduri A, Short R, Carter B, Braude P, Vilches-Moraga A, Quinn TJ, et al. Comparison between first and second wave of COVID-19 outbreak in older people: the COPE multicentre European observational cohort study. Eur J Public Health. 2022;32(5):807–12. pmid:35997587
- 26. Liu J, Xiong Q, Shi W, Shi X, Wang K. Evaluating the importance of nodes in complex networks. Phys A-Stat Mech Appl. 2016;452:209–19.
- 27. Charlson ME, Pompei P, Ales KL, MacKenzie CR. A new method of classifying prognostic comorbidity in longitudinal studies: development and validation. J Chronic Dis. 1987;40(5):373–83. pmid:3558716
- 28. Wong KC-Y, Xiang Y, Yin L, So H-C. Uncovering clinical risk factors and predicting severe COVID-19 cases using UK Biobank Data: machine learning approach. JMIR Public Health Surveill. 2021;7(9):e29544. pmid:34591027
- 29. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988;44(3):837–45. pmid:3203132
- 30. Akaike H. A new look at the statistical model identification. IEEE Transac Autom Control. 1974;19(6):716–23.
- 31. Andel J, Perez MG, Negrao AI. Estimating the dimension of a linear-model. Kybernetika. 1981;17:514–25.
- 32. Aveyard P, Gao M, Lindson N, Hartmann-Boyce J, Watkinson P, Young D, et al. Association between pre-existing respiratory disease and its treatment, and severe COVID-19: a population cohort study. Lancet Respir Med. 2021;9(8):909–23. pmid:33812494
- 33. Shakil SS, Emmons-Bell S, Rutan C, Walchok J, Navi B, Sharma R, et al. Stroke among patients hospitalized with COVID-19: results from the American Heart Association COVID-19 cardiovascular disease registry. Stroke. 2022;53(3):800–7. pmid:34702063
- 34. Elliott J, Bodinier B, Whitaker M, Delpierre C, Vermeulen R, Tzoulaki I, et al. COVID-19 mortality in the UK Biobank cohort: revisiting and evaluating risk factors. Eur J Epidemiol. 2021;36(3):299–309. pmid:33587202
- 35. Ceban F, Nogo D, Carvalho IP, Lee Y, Nasri F, Xiong J, et al. Association between mood disorders and risk of COVID-19 infection, hospitalization, and death: a systematic review and meta-analysis. JAMA Psychiatry. 2021;78(10):1079–91. pmid:34319365
- 36. MacKenna B, Kennedy NA, Mehrkar A, Rowan A, Galloway J, Matthewman J, et al. Risk of severe COVID-19 outcomes associated with immune-mediated inflammatory diseases and immune-modifying therapies: a nationwide cohort study in the OpenSAFELY platform. Lancet Rheumatol. 2022;4(7):e490–506. pmid:35698725
- 37. Efe C, Dhanasekaran R, Lammert C, Ebik B, Higuera-de la Tijera F, Aloman C, et al. Outcome of COVID-19 in patients with autoimmune hepatitis: an international multicenter study. Hepatology. 2021;73(6):2099–109. pmid:33713486
- 38. Zhou Y, Yang Q, Chi J, Dong B, Lv W, Shen L, et al. Comorbidities and the risk of severe or fatal outcomes associated with coronavirus disease 2019: a systematic review and meta-analysis. Int J Infect Dis. 2020;99:47–56. pmid:32721533
- 39. Cheng S, Zhao Y, Wang F, Chen Y, Kaminga AC, Xu H. Comorbidities’ potential impacts on severe and non-severe patients with COVID-19: a systematic review and meta-analysis. Medicine (Baltimore). 2021;100(12):e24971. pmid:33761654
- 40. Delorey TM, Ziegler CGK, Heimberg G, Normand R, Yang Y, Segerstolpe Å, et al. COVID-19 tissue atlases reveal SARS-CoV-2 pathology and cellular targets. Nature. 2021;595(7865):107–13. pmid:33915569
- 41. Rodrigues R, Costa de Oliveira S. The impact of angiotensin-converting enzyme 2 (ACE2) expression levels in patients with comorbidities on COVID-19 severity: a comprehensive review. Microorganisms. 2021;9(8):1692. pmid:34442770
- 42. He Y, Xie M, Zhao J, Liu X. Clinical characteristics and outcomes of patients with severe COVID-19 and chronic obstructive pulmonary disease (COPD). Med Sci Monit. 2020;26:e927212. pmid:32883943
- 43. Stuart M, Mooney C, Hrabovsky M, Silvestri G, Stewart S. Surgical planning during a pandemic: Identifying patients at high risk of severe disease or death due to COVID-19 in a cohort of patients on a cataract surgery waiting list. Ulster Med J. 2022;91(1):19–25. pmid:35169334
- 44. Yang JM, Moon SY, Lee JY, Agalliu D, Yon DK, Lee SW. COVID-19 morbidity and severity in patients with age-related macular degeneration: a Korean Nationwide Cohort Study. Am J Ophthalmol. 2022;239:159–69. pmid:34102151
- 45. Fang X, Li S, Yu H, Wang P, Zhang Y, Chen Z, et al. Epidemiological, comorbidity factors with severity and prognosis of COVID-19: a systematic review and meta-analysis. Aging (Albany NY). 2020;12(13):12493–503. pmid:32658868
- 46. Jager MJ, Seddon JM. Eye diseases direct interest to complement pathway and macrophages as regulators of inflammation in COVID-19. Asia Pac J Ophthalmol (Phila). 2020;10(1):114–20. pmid:33290288
- 47. McKeigue PM, McAllister DA, Caldwell D, Gribben C, Bishop J, McGurnaghan S, et al. Relation of severe COVID-19 in Scotland to transmission-related factors and risk conditions eligible for shielding support: REACT-SCOT case-control study. BMC Med. 2021;19(1):149. pmid:34158021
- 48. Kuroda N. Epilepsy and COVID-19: updated evidence and narrative review. Epilepsy Behav. 2021;116:107785. pmid:33515934
- 49. Patrick MT, Zhang H, Wasikowski R, Prens EP, Weidinger S, Gudjonsson JE, et al. Associations between COVID-19 and skin conditions identified through epidemiology and genomic studies. J Allergy Clin Immunol. 2021;147(3):857–869.e7. pmid:33485957
- 50. Takahashi T, Ellingson MK, Wong P, Israelow B, Lucas C, Klein J, et al. Sex differences in immune responses that underlie COVID-19 disease outcomes. Nature. 2020;588(7837):315–20. pmid:32846427
- 51. Takahashi T, Iwasaki A. Sex differences in immune responses. Science. 2021;371(6527):347–8. pmid:33479140
- 52. McIntyre PB, Aggarwal R, Jani I, Jawad J, Kochhar S, MacDonald N, et al. COVID-19 vaccine strategies must focus on severe disease and global equity. Lancet. 2022;399(10322):406–10. pmid:34922639
- 53. COVID-19 National Preparedness Collaborators. Pandemic preparedness and COVID-19: an exploratory analysis of infection and fatality rates, and contextual factors associated with preparedness in 177 countries, from Jan 1, 2020, to Sept 30, 2021. Lancet. 2022;399(10334):1489–512. pmid:35120592
- 54. Renson A, Bjurlin MA. The Charlson index is insufficient to control for comorbidities in a national trauma registry. J Surg Res. 2019;236:319–25. pmid:30694772
- 55. Agrawal U, Bedston S, McCowan C, Oke J, Patterson L, Robertson C, et al. Severe COVID-19 outcomes after full vaccination of primary schedule and initial boosters: pooled analysis of national prospective cohort studies of 30 million individuals in England, Northern Ireland, Scotland, and Wales. Lancet. 2022;400(10360):1305–20. pmid:36244382