Clustering of physical health multimorbidity in people with severe mental illness: An accumulated prevalence analysis of United Kingdom primary care data

Background
People with severe mental illness (SMI) have higher rates of a range of physical health conditions, yet little is known regarding the clustering of physical health conditions in this population. We aimed to investigate the prevalence and clustering of chronic physical health conditions in people with SMI, compared to people without SMI.

Methods and findings
We performed a cohort-nested accumulated prevalence study, using primary care data from the Clinical Practice Research Datalink (CPRD), which holds details of 39 million patients in the United Kingdom. We identified 68,783 adults with a primary care diagnosis of SMI (schizophrenia, bipolar disorder, or other psychoses) from 2000 to 2018, matched up to 1:4 to 274,684 patients without an SMI diagnosis, on age, sex, primary care practice, and year of registration at the practice. Patients had a median of 28.85 (IQR: 19.10 to 41.37) years of primary care observations. Compared with comparators, patients with SMI had a higher prevalence of smoking (46.08% versus 27.65%), obesity (38.09% versus 24.91%), alcohol misuse (13.47% versus 3.66%), and drug misuse (12.84% versus 2.08%). We defined 24 physical health conditions derived from the Elixhauser and Charlson comorbidity indices and used logistic regression to investigate individual conditions and multimorbidity. We controlled for age, sex, region, and ethnicity and then additionally for health risk factors: smoking status, alcohol misuse, drug misuse, and body mass index (BMI). We defined multimorbidity clusters using multiple correspondence analysis (MCA) and K-means cluster analysis and described them based on the observed/expected ratio.
Patients with SMI had higher odds of 19 of 24 conditions and a higher prevalence of multimorbidity (odds ratio [OR]: 1.84; 95% confidence interval [CI]: 1.80 to 1.88, p < 0.001) compared to those without SMI, particularly in younger age groups (males aged 30 to 39: OR: 2.49; 95% CI: 2.27 to 2.73; p < 0.001; females aged 18 to 29: OR: 2.69; 95% CI: 2.36 to 3.07; p < 0.001). Adjusting for health risk factors reduced the OR of all conditions. We identified 7 multimorbidity clusters in those with SMI and 7 in those without SMI. A total of 4 clusters were common to those with and without SMI, while 1, heart disease, appeared as one cluster in those with SMI and 3 distinct clusters in comparators, and 2 small clusters were unique to the SMI cohort. Limitations to this study include missing data, which may have led to residual confounding, and an inability to investigate the temporal associations between SMI and physical health conditions.

Conclusions
In this study, we observed that physical health conditions cluster similarly in people with and without SMI, although patients with SMI had a higher burden of multimorbidity, particularly in younger age groups. While interventions aimed at the general population may also be appropriate for those with SMI, there is a need for interventions aimed at better management of younger-age multimorbidity, preventative measures focusing on diseases of younger age, and reduction of health risk factors.

Author summary
Why was this study done?
• People with severe mental illness (SMI) such as schizophrenia, bipolar disorder, and other psychoses have more physical illnesses, such as diabetes or heart disease, than people without SMI.
• They are also more likely to have risk factors for poor health, such as smoking, obesity, or substance misuse.
• While we know that these patients have poorer physical health, we do not know whether the patterns of disease are the same as the general population.
What did the researchers do and find?
• We used electronic medical records to investigate how common 24 physical illnesses are in people with SMI and used a mathematical model to compare them to people who did not have SMI.
• We then investigated how common it is to have multiple physical illnesses (multimorbidity) and which diseases are commonly found together in people with and without SMI.
• We found that people with SMI had more physical health conditions than people without SMI, particularly in younger age groups.
• We also found that physical conditions cluster similarly in people with and without SMI.

Introduction
People with severe mental illness (SMI) are known to be at increased risk of a range of physical health conditions, at a younger age [1][2][3], and suffer worse outcomes related to these conditions [4]. Comorbidity has been well studied in people with SMI, and previous studies have found that people with SMI have a higher number of physical health conditions than the general population [5]. The challenges of the increased complexity of managing multiple physical health conditions [6][7][8] may disproportionately affect those with SMI, further increasing inequality in health outcomes [9,10] and increasing both secondary mental health and acute service use [11,12]. The concept of multimorbidity represents a shift from a single disease-centric approach to a more patient-centred view. Moving beyond disease pairs or counts of disease, and studying the way in which diseases and risk factors cluster within individuals, is crucial for improving patient outcomes through better diagnosis, treatment, and healthcare service provision [13,14]. There is currently no common approach to the number or type of conditions studied, nor to the methods used to describe multimorbidity [15,16]. The Academy of Medical Sciences has proposed a definition of multimorbidity that includes long-term physical health conditions, infectious diseases of long duration, and mental health conditions [17], while the National Institute for Health and Care Excellence (NICE) in England also includes risk factors for disease such as substance misuse [18].
While mental health diagnoses have been recognised as an important component of multimorbidity in the general population [6][7][8][15,19,20], there is a lack of evidence regarding the clustering of physical diseases in individuals with SMI or how profiles of physical health multimorbidity in this population compare to those without SMI.
Given the increased disease burden, poorer health outcomes, and higher mortality rate in people with SMI, it is important to characterise the disease profiles occurring in this population. We aimed to investigate the prevalence and clustering of chronic physical health conditions in people with SMI in a large national sample, compared to a matched comparator group without SMI, and investigate the impact of health risk factors in this population.
representative of the UK population [21,22]. At the time of this study, these databases contained deidentified electronic medical records for over 39 million patients. Ethical approval for this study was obtained from the Independent Scientific Advisory Committee of CPRD (protocol no. 18_288).
We included patients with a first diagnosis of SMI between 1 January 2000 and 31 December 2018 via medical codes for schizophrenia, bipolar disorder, or other nonaffective psychotic illnesses (S1 Code Lists [23]). Patients entered the cohort at the latest of registration with the primary care practice, age 18, or 1 January 2000, and exited the cohort at the earliest of end of registration, age 100, death, or 31 December 2018. We excluded patients under the age of 18 at SMI diagnosis and those who had less than 1 year of active follow-up. Patients with SMI were matched to patients without SMI at a ratio of 1:1 to 1:4. Patients were matched strictly by sex, 5-year age band, primary care practice, and year of primary care practice registration and were required to be active in the database at the time of SMI diagnosis. Matching was performed by CPRD prior to receipt of the dataset.
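The entry and exit rules above amount to taking the latest of several start events and the earliest of several end events. The following is a minimal sketch of that logic, not the study's own code. The function names, the use of a full date of birth (CPRD typically supplies year of birth only), and the 365-day reading of "1 year of active follow-up" are all assumptions made for illustration.

```python
from datetime import date
from typing import Optional, Tuple

STUDY_START = date(2000, 1, 1)
STUDY_END = date(2018, 12, 31)


def add_years(d: date, years: int) -> date:
    """Same calendar day `years` later; 29 February falls back to 28 February."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def follow_up_window(
    birth: date,
    registered: date,
    registration_end: Optional[date] = None,
    death: Optional[date] = None,
) -> Tuple[date, date]:
    """Entry is the latest start event; exit is the earliest end event."""
    entry = max(registered, add_years(birth, 18), STUDY_START)
    exit_candidates = [add_years(birth, 100), STUDY_END]
    if registration_end is not None:
        exit_candidates.append(registration_end)
    if death is not None:
        exit_candidates.append(death)
    return entry, min(exit_candidates)


def has_minimum_follow_up(entry: date, exit_: date, min_days: int = 365) -> bool:
    """Assumed reading of 'at least 1 year of active follow-up'."""
    return (exit_ - entry).days >= min_days
```

For a hypothetical patient born on 15 June 1985, registered in 1999, who died in 2010, entry is the 18th birthday and exit is the date of death.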

Study design
No prospective analysis plan for this study was documented; however, we identified the study aims, designed the study, and planned the analyses and sensitivity analyses a priori. Following peer review, we performed an additional sensitivity analysis to investigate the impact of multiple imputation of ethnicity and changed the matching strategy from strict 1:4 matching to allow cases to be matched to comparators at a ratio of 1:1 up to 1:4. This study is reported as per the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) guideline (S1 Checklist).

Outcomes
The primary outcomes were presence of physical health multimorbidity, defined as 2 or more of the studied conditions, and accumulated prevalence of 24 ever-diagnosed chronic physical health conditions in people with SMI compared to people without SMI. Diagnoses were as recorded in primary care. [...] disorders, valvular disease, deficiency anaemia, blood loss anaemia, coagulopathy, and fluid or electrolyte disorders.
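The two outcome definitions can be expressed directly in code. This is an illustrative sketch with hypothetical function names and made-up patient identifiers: each patient is represented by the set of studied conditions ever recorded for them.

```python
from typing import Dict, Set


def is_multimorbid(conditions: Set[str], threshold: int = 2) -> bool:
    """Multimorbidity as defined in the text: 2 or more of the studied conditions."""
    return len(conditions) >= threshold


def accumulated_prevalence(records: Dict[str, Set[str]], condition: str) -> float:
    """Percentage of patients with the condition ever recorded."""
    if not records:
        raise ValueError("no patients supplied")
    n_with = sum(1 for ever_recorded in records.values() if condition in ever_recorded)
    return 100.0 * n_with / len(records)
```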

Health risk factors for physical health conditions
We conceptualised alcohol misuse, drug misuse, smoking, and obesity as health risk factors for the development of physical health conditions. We defined alcohol and drug misuse using the code lists for the Elixhauser comorbidity index [24]. We categorised body mass index (BMI) as the heaviest ever recorded of obese (BMI ≥ 30), overweight (BMI 25 to 29.9), healthy weight (BMI 18.5 to 24.9), or underweight (BMI < 18.5), derived from specific medical code lists for obesity, recorded BMI, and BMI calculated from weight and height recording. We categorised smoking status as nonsmoker, ex-smoker, or current smoker using medical code lists, taking the most recent category and recording any nonsmokers with a historical code for smoking as ex-smokers.
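The BMI and smoking rules described above could be sketched as follows. The function names and the record layout (ISO date string paired with a status) are hypothetical, and the handling of values that fall between the published bands (for example, a BMI of 24.95) is an assumption, since the text gives bands to one decimal place.

```python
from typing import Iterable, List, Optional, Tuple

# Ordered from lightest to heaviest so that max() picks the heaviest category.
BMI_ORDER = ["underweight", "healthy weight", "overweight", "obese"]


def bmi_from_weight_height(weight_kg: float, height_m: float) -> float:
    """BMI is weight in kilograms divided by height in metres squared."""
    return weight_kg / height_m ** 2


def bmi_category(bmi: float) -> str:
    """Map a BMI value to the categories in the text (band edges are assumed)."""
    if bmi < 18.5:
        return "underweight"
    if bmi < 25:
        return "healthy weight"
    if bmi < 30:
        return "overweight"
    return "obese"


def heaviest_ever(categories: Iterable[str]) -> Optional[str]:
    """Return the heaviest category ever recorded, or None if nothing is recorded."""
    recorded = list(categories)
    if not recorded:
        return None
    return max(recorded, key=BMI_ORDER.index)


def smoking_status(history: List[Tuple[str, str]]) -> Optional[str]:
    """Most recent status; a nonsmoker with any earlier smoking code becomes an ex-smoker."""
    if not history:
        return None
    ordered = sorted(history)  # ISO date strings sort chronologically
    latest = ordered[-1][1]
    ever_smoked = any(status != "nonsmoker" for _, status in ordered)
    if latest == "nonsmoker" and ever_smoked:
        return "ex-smoker"
    return latest
```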

Covariates
We defined age as age at the end of follow-up based on year of birth. We considered age as a continuous variable and in 10-year age groups where results were stratified by age. Sex and ethnicity were as recorded in patient medical records and ethnicity was grouped as "Asian," "Black," "Mixed," "White," or "Other," in line with UK 2011 Census Ethnic Group categories (https://www.ons.gov.uk/census/2011census/2011censusdata/2011censususerguide/variablesandclassifications). Where multiple ethnicities existed for an individual, we selected the most frequent, and where frequencies were equal, the most recent. Region was defined as the 9 English regions as listed by the Office for National Statistics and Scotland, Wales, and Northern Ireland and was based on primary care practice postcode.
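The rule for resolving multiple ethnicity records (most frequent, then most recent) can be illustrated with a short sketch. The function name and the input layout of (ISO date string, ethnic group) pairs are assumptions. If two tied groups share the same latest date, this sketch falls back to alphabetical order, a case the text does not address.

```python
from collections import Counter
from typing import List, Optional, Tuple


def select_ethnicity(records: List[Tuple[str, str]]) -> Optional[str]:
    """Pick the most frequently recorded group; break ties by most recent record."""
    if not records:
        return None
    counts = Counter(group for _, group in records)
    top_count = max(counts.values())
    tied = {group for group, n in counts.items() if n == top_count}
    # Among the tied groups, take the group attached to the latest record date.
    _, latest_group = max((day, group) for day, group in records if group in tied)
    return latest_group
```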

Missing data
As general practitioners are less likely to record values that are within the normal range [25,26], we coded patients with missing smoking or BMI data as nonsmoker or normal range BMI, respectively. We coded ethnicity, as recorded in primary care, as white ethnicity where this variable was missing [25]. This approach is in line with previous research using primary care data, which suggests that more than 93% of individuals without ethnicity recorded are from a white ethnic group [27]. We performed sensitivity analyses to assess the effect of coding missing ethnicity as missing rather than white and of using multiple imputation to estimate missing ethnicity.
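The single-value replacement described above, together with the "missing as its own category" sensitivity analysis, might be sketched as below. The field names and the `ethnicity_default` parameter are hypothetical, and multiple imputation is not shown.

```python
from typing import Any, Dict


def fill_missing(patient: Dict[str, Any], ethnicity_default: str = "White") -> Dict[str, Any]:
    """Replace missing risk factor and ethnicity values with the defaults in the text.

    Pass ethnicity_default="Missing" to mimic the sensitivity analysis in which
    missing ethnicity is kept as a separate category.
    """
    defaults = {
        "smoking": "nonsmoker",
        "bmi_category": "healthy weight",
        "ethnicity": ethnicity_default,
    }
    filled = dict(patient)  # leave the caller's record untouched
    for field, default in defaults.items():
        if filled.get(field) is None:
            filled[field] = default
    return filled
```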

Analysis
We determined the prevalence of individual physical health conditions and pairs of conditions (e.g., hypertension and diabetes), stratified by SMI diagnosis, age, and sex. We used logistic regression to investigate the relative prevalence of each physical health condition, first controlling for age, sex, ethnicity, and region and then for these variables plus health risk factors: smoking status, BMI category, alcohol misuse, and drug misuse. We considered a 2-sided p-value of less than 0.05 to represent statistical significance, although due to the large number of observations, we examined effect size and confidence intervals (CIs) to interpret clinical significance.
We then undertook cluster analysis using the subset of patients with multimorbidity, stratified by presence or absence of an SMI diagnosis. We performed multiple correspondence analysis (MCA) to investigate the relationship between physical health conditions and to transform the discrete physical health conditions into continuous variables prior to cluster analysis. We then used the MCA dimensions in K-means cluster analysis to identify clusters of physical health conditions and assign individual patients to clusters. We determined the optimum number of clusters by visual inspection of both the silhouette and Calinski-Harabasz results. We described clusters using the variables with an observed/expected ratio of more than 1.2 or by variables for which more than 70% of patients with that variable were contained within the cluster. We then reran the MCA and cluster analysis with health risk factors included.
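As a point of reference for the odds ratios reported later, the sketch below computes a crude (unadjusted) odds ratio and Wald 95% CI from a 2×2 table. This is equivalent to a logistic regression with SMI as the only predictor. It is not the adjusted model used in the study, which also included covariates, and the function name is hypothetical.

```python
import math
from typing import Tuple


def crude_odds_ratio(
    exposed_with: int,
    exposed_without: int,
    unexposed_with: int,
    unexposed_without: int,
    z: float = 1.959964,  # normal quantile for a 2-sided 95% interval
) -> Tuple[float, float, float]:
    """Return (odds ratio, lower CI bound, upper CI bound) from a 2x2 table."""
    cells = (exposed_with, exposed_without, unexposed_with, unexposed_without)
    if min(cells) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (exposed_with * unexposed_without) / (exposed_without * unexposed_with)
    # Standard error of log(OR) by the Woolf (Wald) method.
    se_log_or = math.sqrt(sum(1.0 / cell for cell in cells))
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

Applied to the multimorbidity counts reported in the Results (23,382 of 68,783 patients with SMI and 70,003 of 274,684 comparators), this gives a crude OR of roughly 1.5. The adjusted estimates in the text differ because they control for age, sex, ethnicity, and region.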
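The two descriptive criteria can be made concrete with a short sketch. It assumes the observed/expected ratio is the prevalence of a condition within a cluster divided by its prevalence in the whole multimorbid cohort, a common definition that the text does not spell out, and it refers to the 70% criterion as "exclusivity". The MCA and K-means steps are not reproduced here; cluster labels are taken as given, and the function name is hypothetical.

```python
from typing import Dict, List, Set


def describe_cluster(
    patients: List[Set[str]],
    labels: List[int],
    cluster: int,
    oe_threshold: float = 1.2,
    exclusivity_threshold: float = 0.70,
) -> Dict[str, Dict[str, float]]:
    """Return the conditions that characterise `cluster` under either criterion."""
    if len(patients) != len(labels):
        raise ValueError("patients and labels must have the same length")
    in_cluster = [p for p, label in zip(patients, labels) if label == cluster]
    if not in_cluster:
        raise ValueError("cluster has no members")

    described: Dict[str, Dict[str, float]] = {}
    for condition in sorted(set().union(*patients)):
        total = sum(condition in p for p in patients)
        inside = sum(condition in p for p in in_cluster)
        observed = inside / len(in_cluster)  # prevalence within the cluster
        expected = total / len(patients)     # prevalence in the whole cohort
        oe_ratio = observed / expected
        exclusivity = inside / total         # share of all cases found in this cluster
        if oe_ratio > oe_threshold or exclusivity > exclusivity_threshold:
            described[condition] = {"oe_ratio": oe_ratio, "exclusivity": exclusivity}
    return described
```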

Results
We identified 70,855 patients with a diagnosis of SMI. Of these, 273 patients were excluded as they did not meet the age criteria, 27 due to less than 1 year's follow-up, 172 because SMI diagnosis was not within the study period, 1,571 because diagnosis was prior to age 18, and 29 because of missing practice details. Of the remaining 68,783 patients with SMI, 15,028 had a diagnosis of schizophrenia, 24,420 a diagnosis of bipolar disorder, and 29,335 a diagnosis of other psychoses. These patients were matched to 274,684 patients without SMI. A higher proportion of patients with SMI died during follow-up than comparators, and death occurred at a younger mean age (Table 1). A greater proportion of patients in the comparator group had missing information for ethnicity (43.3% versus 35.8%), smoking (7.0% versus 2.4%), and BMI (18.9% versus 9.8%) than in the SMI cohort.
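The cohort flow figures above can be checked with simple arithmetic. The sketch below uses only numbers stated in this paragraph; the variable names are illustrative.

```python
identified = 70_855
exclusions = {
    "did not meet age criteria": 273,
    "less than 1 year of follow-up": 27,
    "diagnosis outside study period": 172,
    "diagnosis before age 18": 1_571,
    "missing practice details": 29,
}
remaining = identified - sum(exclusions.values())

by_diagnosis = {
    "schizophrenia": 15_028,
    "bipolar disorder": 24_420,
    "other psychoses": 29_335,
}
comparators = 274_684

# Average number of matched comparators per patient with SMI.
comparators_per_case = comparators / remaining

print(remaining, sum(by_diagnosis.values()), round(comparators_per_case, 2))
```

The exclusions sum to 2,072, leaving 68,783 patients, which equals the sum of the three diagnostic groups. The average of just under 4 comparators per case is consistent with matching at ratios from 1:1 up to 1:4.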

Prevalence of chronic physical health conditions and multimorbidity
There was a higher prevalence of at least 1 physical health condition and multimorbidity in the SMI cohort (Table 1). When controlling for age, sex, ethnicity, and region, those with SMI were at higher risk of multimorbidity (adjusted odds ratio [aOR]: 1.84; 95% CI: 1.80 to 1.88, p < 0.001). In both cohorts, multimorbidity was more common in females and in older age groups (Fig 1). The greatest difference in prevalence of multimorbidity between those with and without SMI was in patients aged 18 to 29 in females (aOR: 2.69; 95% CI: 2.36 to 3.07; p < 0.001) and 30 to 39 in males (aOR: 2.49; 95% CI: 2.27 to 2.73; p < 0.001; Fig 1). The difference narrowed with increasing age. In those aged 80 and over, the prevalence of multimorbidity was similar in patients with and without SMI (Table 2, Fig 1). In patients aged 80 and over, patients with schizophrenia appeared at lower risk of multimorbidity compared to those without SMI (males: aOR: 0.39, 95% CI: 0.30 to 0.51, p < 0.001; females: aOR: 0.71, 95% CI: 0.60 to 0.84, p < 0.001; Table 2).
Additionally, controlling for smoking status, BMI category, alcohol misuse, and drug misuse reduced the OR for multimorbidity in patients with SMI, although it was still elevated compared to comparators (aOR: 1.40; 95% CI: 1.37 to 1.43, p < 0.001). When controlling for these additional factors, the greatest difference in multimorbidity between those with and without SMI was in those aged 18 to 29 (Table 2).
The most common physical health conditions were hypertension, asthma, and diabetes in both SMI and comparator cohorts and when stratified by SMI diagnosis (Table 3). The most common multimorbidity pairs were the same in both cohorts: hypertension and diabetes (SMI: 7.41%; no SMI: 6.09%) followed by hypertension and renal disease (SMI: 4.71%; no SMI: 4.95%; Fig 2).
When adjusting for age, sex, ethnicity, and region, patients with SMI had greater odds of recorded diagnoses of 19 out of 24 diseases (Fig 3, Table 3). ORs were particularly high for neurological disease (aOR: 2.92; 95% CI: 2.82 to 3.03, p < 0.001), paralysis or paresis (aOR: 1.96; 95% CI: 1.78 to 2.17, p < 0.001), and liver disease (aOR: 1.95; 95% CI: 1.85 to 2.06, p < 0.001). When stratified by SMI diagnosis, patients with schizophrenia had lower odds of recorded cardiac arrhythmia, cancer, valvular disease, rheumatoid and collagen disease, and hypertension than the comparator population, while patients with bipolar disorder had particularly high rates of hypothyroidism and fluid and electrolyte disorders (Table 3).
Patients with SMI had more health risk factors than the comparator cohort (Table 1). Obesity was particularly prevalent in those with a diagnosis of bipolar disorder (42.68%), while smoking was most prevalent in those with schizophrenia (53.75%), and alcohol and drug misuse was most prevalent in those with a diagnosis of other psychoses (14.11% and 14.68%, respectively). After adjustment for these risk factors, the ORs for all diseases in the SMI cohort reduced, in particular for liver disease, HIV, COPD, diabetes, and neurological disease (Fig 3, Table 3).

Clustering of physical health conditions and multimorbidity profiles
In MCA of patients with physical health multimorbidity (SMI cohort: 23,382 (33.99%), comparators: 70,003 (25.48%)), 16 dimensions were required to explain 70% of the variance of physical health conditions in both the SMI cohort and comparator cohort. The first 2 dimensions in MCA had similar disease profiles (S1 Fig). We identified 7 profiles of physical health multimorbidity in both those with and without SMI. The largest patient group in both populations (56.06% of the SMI population and 47.99% of the comparator population) consisted of patients with varied multimorbidity (Table 4, Fig  4). In those with SMI, a second cluster consisted of patients with a high prevalence of heart disease (7.36%), while 3 distinct heart disease clusters were found in those without SMI: one of predominantly valvular disease (5.78%), one of pulmonary circulatory disease (3.55%), and one of myocardial infarction and peripheral vascular disease (12.03%). All of these clusters were characterised by older age, and varied multimorbidity, heart disease, and valvular disease had low prevalence of health risk factors. We identified a cluster of respiratory disease in both cohorts (26.72% in those with SMI and 28.45% in comparators) associated with younger age and a higher prevalence of smoking or substance misuse and 2 small clusters of patients with blood loss anaemia and coagulopathy (Table 4).
Finally, we identified 2 small clusters unique to the SMI population and 2 unique to comparators, comprising 7.61% (n = 1,779) of the multimorbid patients in total. One cluster consisted of patients with peptic ulcer disease and the other paresis or paralysis.
When we included health risk factors in the cluster analysis, 6 of the 7 clusters were common to both those with and without SMI. The largest cluster we identified was a "general multimorbidity" cluster, accounting for 56.70% of the SMI and 70.30% of the comparator cohorts. We also identified a large cluster in the SMI cohort defined by respiratory disease, a high prevalence of health risk factors, male sex, neurological disease, and liver disease (17.02%). In contrast, a similar cluster identified in the comparator cohort accounted for only 4.87% of patients.

Sensitivity analyses
In sensitivity analysis, recoding missing ethnicity to "missing" did not alter the interpretation of disease prevalence (S1 Table), nor did using multiple imputation (S2 Table).

Discussion
Our study investigated physical health conditions and multimorbidity in a large cohort of patients with SMI and matched comparators. Clustering of multimorbid health conditions was not dramatically different between those with and without SMI, despite higher prevalence of many physical health conditions in the SMI cohort. Patients with a diagnosis of SMI had a higher prevalence of multimorbidity, particularly in younger age groups.

Patterns of physical health conditions and multimorbidity
To the best of our knowledge, our analysis is the first cluster analysis of multimorbidity in a large, representative cohort of patients with and without SMI and suggests that patients with SMI develop similar profiles of multimorbidity to the general population. Similarities in physical health profiles of those with and without SMI were also apparent in individual and disease pair ranking, MCA, and cluster analysis. Two previous studies have  found similarity in ranking the most frequently diagnosed conditions and pairs of conditions between those with and without SMI [1,28], and a hospital-based study of self-reported physical health conditions in 1,060 psychiatric patients and 837 members of the general population found similar profiles of multimorbidity between the 2 cohorts using latent class analysis [29]. Despite the similarities in clusters of diseases, those with SMI have a higher prevalence of physical health conditions, more risk factors for poor physical health and develop multimorbidity at a younger age. Health risk factors likely explain some of the higher risk of physical health conditions and multimorbidity in people with SMI. We found that including smoking status, BMI category, and alcohol and drug misuse in cluster analysis resulted in a higher proportion of patients in the SMI cohort being in a "health risk" cluster and that adjusting for these factors decreased the ORs of physical health conditions between SMI and comparator cohorts, particularly for liver disease, HIV, COPD, diabetes, and neurological disease.
In line with other studies [5], we found a higher prevalence of multimorbidity in women in both SMI and comparator cohorts. Higher prevalence of multimorbidity with increasing age is well described in the general population [7,20], but we found the largest differences between those with and without SMI in the younger age groups. This suggests that people with SMI develop multimorbidity earlier than the general population. A higher prevalence of multimorbidity in younger patients was also found in a study of multimorbidity in those with psychosis in lower- and middle-income countries [30]. At older ages, the similar prevalence of multimorbidity in those with and without SMI could be due to survivorship bias in the SMI cohort or due to the high background prevalence of multimorbidity at that age.

Underascertainment of physical health conditions in patients with schizophrenia
We identified lower prevalence of a range of physical health conditions in patients with schizophrenia and also lower rates of multimorbidity in older age in this population. This is surprising given the observed high prevalence of smoking, obesity, and alcohol and drug misuse, and known side effects of antipsychotic medication [9]. Lower prevalence of cardiovascular disease [31][32][33] and cancer [31] have been reported in other studies using routine primary care data, and our study corroborates this finding using a matched comparator population and controlling for both demographic and health risk factors. Underreporting is likely not due to lack of contact between primary care physicians and patients with SMI, as in the UK, annual health checks in primary care have been recommended and incentivised in this patient group since 2004. This underreporting could reflect poor access to care, underdiagnosis, or diagnostic overshadowing in the schizophrenia population. There is evidence that those with schizophrenia are more likely to have physical health conditions recorded at the time of death [34,35], suggesting late and missed diagnoses in this population, with diagnoses at the time of death less likely to be subsequently recorded in primary care records.

Strengths and limitations
To our knowledge, this study is the largest investigation of multimorbidity and clustering of physical health conditions in patients with SMI. A key strength of our study was the ability to adjust for smoking, BMI category, alcohol misuse, and drug misuse as risk factors for physical health conditions.
The large sample size of this study, and representativeness of data from CPRD [21,22], suggests that the results of this study are generalisable to the UK population with SMI. However, the population without SMI is likely not representative as they are matched to the SMI population and therefore share the population characteristics in terms of age, sex, and area of residence with the SMI population. The similarities between the populations may have diminished differences in disease prevalence between the 2 cohorts.
As with all studies using electronic health records, a limitation of this study is potential biases in recording variables. Surveillance bias may have resulted in higher disease detection in people with SMI, a population who may have more regular contact with the healthcare system, affording more opportunities for physical health conditions to be recorded. Furthermore, while the apparent underrecording of a range of physical health conditions in those with schizophrenia is a clinically important finding, it limits the interpretation of disease prevalence and multimorbidity clusters in this population.
There may be residual confounding due to missing information. For physical health conditions and risk factors, the absence of coding for a condition was assumed to mean absence of disease or risk factor. However, particularly for risk factors, some missingness may be due to lack of measurement or recording. Missing values for smoking status, ethnicity, and BMI were replaced in line with other primary care studies [25,26] and sensitivity analyses performed for ethnicity. For BMI, we were only able to include broad categories as some patients had BMI category recorded rather than a BMI value. BMI itself is an indirect measure of obesity, and its accuracy varies with age, gender, and ethnicity. This may have introduced biases into the analysis [36]. Alcohol misuse was based on medical code lists and did not account for the level of alcohol consumption, nor include patients that had consumption recorded without an accompanying alcohol misuse code. While we were unable to control for deprivation, patients were matched on primary care practice and therefore from a broadly comparable geographic area.
This study focused on physical health conditions ever diagnosed, which limits the study of temporality of diagnoses of SMI and physical health conditions. However, with both SMI and chronic physical health conditions, a prodromal stage or period of undiagnosed disease may occur, and, therefore, diagnosis dates may not give a clear indication of temporal association. Furthermore, previous studies have found higher prevalence of physical health problems [37][38][39] and health risk factors for physical health conditions such as smoking [40] and alcohol and drug misuse [41], prior to SMI diagnosis.

Implications
The absence of large novel clusters of disease in those with SMI suggests that the same drivers of physical health conditions are at play in both those with and without SMI, and, therefore, research and service provision for patients with SMI should focus on the same disease clusters as in the general population. However, while much of the focus of multimorbidity in the general population has been on old age, our study found that the largest difference in multimorbidity was at younger ages. This highlights an unmet need for interventions aimed at a younger cohort of multimorbid patients and demonstrates the importance of physical health checks in this population. We found a higher prevalence of obesity, smoking, and drug and alcohol misuse in this population, and adjusting for these factors reduced the ORs of many diseases. This suggests that a focus on risk factor reduction would also reduce the incidence of physical health conditions in those with SMI. Interventions to modify these risk factors, for example, via smoking or alcohol cessation support [42,43], have been shown to be effective in people with SMI and need to be more widely available.
Further work is warranted to investigate the temporality of SMI and physical health condition diagnoses, and trajectories of multimorbidity in this population. The low prevalence of some physical health conditions in the schizophrenia cohort also requires further investigation, to elucidate the reasons for this finding. Finally, the relevance of the identified clusters to outcomes such as hospitalisation and mortality, both in patients with and without SMI, is an area for future research.

Conclusions
We found that physical health conditions cluster in people with SMI in a similar manner to people without SMI. However, there is a higher prevalence of physical health conditions, physical health multimorbidity, and risk factors for poor physical health in those with SMI, and those with SMI may develop multimorbidity at a younger age. Therefore, while interventions aimed at the general population should also be applicable to those with SMI, there is a need for a greater focus on diseases of younger age, younger-age multimorbidity, and reduction of risk factors for poor physical health.