A longitudinal analysis of PM2.5 exposure and multimorbidity clusters and accumulation among adults aged 45-85 in China

While previous studies have emphasised the role of individual factors in understanding multimorbidity disparities, few have investigated contextual factors such as air pollution (AP). We first use cross-sectional latent class analysis (LCA) to assess the associations between PM2.5 exposure and multimorbidity disease clusters, and then estimate the associations between PM2.5 exposure and the development of multimorbidity longitudinally using growth curve modelling (GCM) among adults aged 45–85 in China. The results of LCA modelling suggest four latent classes representing three multimorbidity patterns (respiratory, musculoskeletal, cardio-metabolic) and one healthy pattern. The analysis shows that a 1 μg/m3 increase in cumulative exposure to PM2.5 is associated with a higher likelihood of belonging to respiratory, musculoskeletal or cardio-metabolic clusters: 2.4% (95% CI: 1.02, 1.03), 1.5% (95% CI: 1.01, 1.02) and 3.3% (95% CI: 1.03, 1.04), respectively. The GCM models show that there is a u-shaped association between PM2.5 exposure and multimorbidity, indicating that both lower and higher PM2.5 exposure is associated with increased multimorbidity levels. Higher multimorbidity in areas of low AP is explained by clustering of musculoskeletal diseases, whereas higher AP is associated with cardio-metabolic disease clusters. The study shows how multimorbidity clusters vary contextually and that PM2.5 exposure is more detrimental to health among older adults.


Introduction
The number of older adults living with multimorbidity, defined as the coexistence of two or more chronic diseases or conditions, is rising globally [1]. Research based on survey data shows a high prevalence of multimorbidity among older adults in low and middle income countries (LMICs), such as 63% in India [2], 65% in Brazil [3], 69% in South Africa [4], and 46% in China [5]. The increasing prevalence of multimorbidity is associated with worse functional ability, reduced healthy life expectancy, increased mortality, and a higher rate of hospitalisations [6][7][8], leading to a heavy burden on medical and health systems and inequalities in PLOS GLOBAL PUBLIC HEALTH PLOS Global Public Health | https://doi.org/10.1371/journal.pgph.0000520 June 29, 2022 1 / 20 a1111111111 a1111111111 a1111111111 a1111111111 a1111111111 health outcomes [9]. This is a particular challenge in rapidly ageing societies such as China, which has a high prevalence of multimorbidity among older adults [5]. The proportion of the population with multimorbidity in China, as measured by population-based panel data, was 62% for people aged 50 years and 69% for those aged over 75 years [9]. In terms of the determinants of multimorbidity, current studies highlight a range of individual factors, including demographic (e.g., age, sex and race) and socio-economic (e.g., education and income) characteristics [9][10][11][12]. However, research on possible contextual and environmental determinants of multimorbidity is less common, and in particular the role of air pollution (AP) remains poorly understood [13].
Recent research on elderly health shows that older people are more susceptible to environmental factors than younger adults [14], with higher risks of living with chronic diseases due to exposure to environmental pollution [15,16]. Furthermore, there is abundant evidence on the association between AP and individual chronic diseases, for example, cardiorespiratory disease [17], chronic obstructive pulmonary disease (COPD) [18], diabetes [19], heart disease [20], hypertension [21], and kidney diseases [22]. Although chronic diseases cluster due to shared biological or environmental risk factors [13], there is limited understanding of how AP might operate to promote accumulation of multiple chronic diseases.
Similar to many LMICs, AP is an important public health risk in China: in 2017, 81% of the Chinese population lived in regions which exceed the World Health Organisation Interim Target 1 (35 μg/m 3 ) [23]. In particular, ambient AP was estimated to be responsible for over 850,000 deaths in China in 2017 [23]. Although ambient AP has decreased markedly in the last two decades, the older population of China has spent a large proportion of the life course experiencing historically high levels of AP exposure [24,25]. China therefore suffers from a double burden of multimorbid ageing and AP, and understanding the association between them may be beneficial for development of strategies to prevent or manage chronic diseases in later life.
Evidence shows that multimorbidity prevalence is likely higher in some social groups because chronic diseases often cluster due to common risk factors, such as socioeconomic deprivation and environmental risks [13]. These risk factors for diseases clustering make it difficult to isolate the effects of AP from other factors of socioeconomic deprivation [26]. Therefore, motivations for this study are not only to understand the relationship between historic AP exposure and changes in multimorbidity, but also to explore individual-level characteristics that are associated with multimorbidity inequalities.
In this study, we analyse the associations between cumulative, historic exposure to AP as predictive of cross-sectional multimorbidity disease clusters and multimorbidity accumulation. We use large, prospective and nationally representative survey data, the China Health and Retirement Longitudinal Study (CHARLS), linked with historical satellite data on PM 2.5 exposure over 15 years. Using this novel dataset, we address a research gap for longitudinal studies of multimorbidity and provide assessment of the associations between AP and the development of multimorbidity.

Study population
Data used in this study are from three waves of the China Health and Retirement Longitudinal Study (CHARLS 2011(CHARLS , 2013(CHARLS , 2015, which is a nationally representative longitudinal survey of the middle-aged and elderly population of China, consisting of persons aged 45 years or older, as well as their spouses when possible. CHARLS used computer-assisted in-person interviews to obtain samples through four-stage stratified sampling, with an overall response rate of 80.5% at the baseline. From June 2011 and March 2012, CHARLS conducted the baseline survey (wave 1) that included assessments of the social, economic, and health circumstances of 17,705 respondents from 28 provinces, 150 cities/counties/districts, 450 communities, and 10,257 households [27]. Following wave 1, two follow-up surveys were conducted in 2013 and 2015.  https://doi.org/10.1371/journal.pgph.0000520.g001 respondents died and 1,658 attrited, and 1,017 respondents, interviewed in 2011 but missing in 2013, returned. It is noted that these refreshment samples did not participate in the first wave in 2011. Finally, the three waves of CHARLS include 24,956 respondents (57,409 observations).
We restrict our analysis to the 19,098 respondents (45,788 observations) from 125 cities who were aged 45 to 85 years old at any wave of the study. The listwise deletion process is also shown in Fig 1.

Main outcome: Multimorbidity
The 'Health status and function' module in the CHARLS questionnaire includes 14 selfreported doctor-diagnosed chronic diseases for each respondent, asking "Have you been diagnosed with the following conditions by a doctor": hypertension; dyslipidaemia; diabetes or high blood sugar; cancer or malignant tumour; chronic lung disease; liver disease; heart problems; stroke; kidney disease; stomach or other digestive diseases; emotional, nervous or psychiatric problems; memory-related disease; arthritis or rheumatism [12]. In line with previous CHARLS studies, we use a disease count approach where we summed 14 binary disease indicators (range 0-14) to capture multimorbidity [12,28].
We exclude respondents who are missing any components of these 14 indicators of chronic disease. Using this method, there are 7,469 observations with missingness on multimorbidity. Note that CHARLS 2015 contributes to over half of these (4,654 observations), partly due to survey design, because around 2,800 respondents were interviewed in the Life History Survey (in 2014) but not in previous waves (CHARLS 2011 and. In these cases, the respondents were not asked if they ever had a condition and therefore their chronic disease records are missing in 2015. We explain how we deal with missingness in the statistical analysis section.

Air pollution: PM 2.5
Many studies use ground air pollutants concentration (e.g., PM 2.5 , PM 10 , NO 2 ) from monitoring stations to measure the exposure to AP [29,30]. In China, however, most AP monitoring stations were established by the Ministry of Ecology and Environment only after 2013, limiting the ability to study long-term exposure.
Compared with ground monitoring data, satellite data with broad spatial coverage, a longterm data record, and high spatial resolutions could support the assessment of historical AP levels in developing regions. A detailed description of the ensemble machine learning model to generate our long-term PM 2.5 exposure estimates is published elsewhere [31], and summarised briefly here. Given the large modelling domain, China was first divided into seven subregions using a geographically weighted regression approach to allow the machine learning algorithms to capture different spatiotemporal patterns of PM 2.5 in each subregion caused by different terrain, weather conditions, and emission source profiles. A random forest, an extreme gradient boosting (XGBoost), and a generalized additive model (GAM) were then trained in each subregion. Their predictions of daily PM 2.5 concentration levels at 10 km 2 spatial resolution were combined using weights determined by prediction accuracy. Compared with previous models, this ensemble model provided more accurate out-of-range predictions at the daily and monthly levels. Based on the administrative regions in China, 150 cities of CHARLS are clustered to aggregate PM 2.5 concentration from satellite data at the city level. To match with CHARLS, we selected monthly PM 2.5 concentration as the temporal scale (see Table G in S1 File).
As used in previous studies, the measure of cumulative exposure is operationalised using the average mean of pollutant concentrations during the exposure window [32,33]. Chinese ambient air quality standards use two cut-offs to indicate the hazardous level of exposures, 35 μg/m 3 (Level 1) and 75 μg/m 3 (Level 2) [34]. In this study, we exploit our longer-term PM 2.5 data and calculate a more fine-grained measure: the average concentration of monthly exposure from March 2000 until the survey date, which provides a measure of historical exposure. Additionally, we initially categorise PM 2.5 exposure into six groups using 10 μg/m 3 intervals: 0-35, 36-45, 46-55, 56-65, 66-75, 76+ μg/m 3 . Due to small numbers exposed to PM 2.5 over 76 + μg/m 3 , we later collapse the last two categories resulting in five groups.

Covariates: Demographic, socioeconomic status (SES), health behaviour and regional factors
In this study, the covariates include four components: demographic, socioeconomic status (SES), behavioural, and contextual factors. Demographic variables include age, age squared, gender, and marital status (single vs. partnered). Individual SES consists of education (no schooling, primary, middle or more education), occupation (agricultural, non-agricultural, and managerial), and HuKou (rural, rural-urban, urban). CHARLS life history survey records the longest occupation during the respondents' occupational history. Agricultural jobs include farming, fishing, managing forest products or fruit trees, raising livestock, and selling these products in the market. Non-agricultural jobs include civil servants, office clerks or non-agricultural self-employment (e.g., running a restaurant or supermarket). Respondents who are in a supervisory position in their offices are considered "managers". Respondents who have never worked (e.g., "housewife", or disabled people) are sparse in the CHARLS (only 137 respondents) and are marked as missing.
HuKou is a special national household registration system in China that has two categories: rural and urban. People usually remain in the same HuKou as their parents, and once HuKou is registered, it is difficult to change even if people move. HuKou is related to occupational status, education, and health care access [35,36]. Due to the urbanisation and internal migration, people who originate in a rural HuKou increasingly live in the urban areas. Thus, considering both HuKou and current residence, there are three types: rural (rural HuKou living in rural areas), rural-urban (rural HuKou living in urban areas), and urban (urban HuKou living in urban areas). HuKou is an important feature in this study as it is strongly related to personal SES, not solely housing address [35].
Smoking status is controlled for as an important health behaviour (never smoking, former smoker, and current smoker). To account for the urbanisation and industrialisation of cities, this study includes annual regional Gross Domestic Product (GDP) at the city level (logged).

Analysis strategies
To analyse multimorbidity disease clusters, we use latent class analysis (LCA) on a cross-sectional sample of the baseline wave of CHARLS as LCA can pick up clusters of diseases shared with the common risk factors. The LCA models can identify multimorbidity patterns by assigning individuals to a set of discrete, mutually exclusive groups-latent classes-based on their responses to the 14 chronic disease indicators in the CHARLS. A sequence of 14 LCA models was estimated starting with a one-class model and increasing the number of classes in a stepwise approach. Following examples in previous studies [37, 38], the LCA model selection was based on examinations of several fit indices, including AIC, BIC and likelihood estimates. We then regressed the resulting latent class memberships on cumulative PM 2.5 exposure in 2011 using multinomial regression, adjusting for the covariates discussed above.
We use growth curve modelling (GCM) to examine the relationship between PM 2.5 exposure in the period of 2000-2015 and multimorbidity accumulation between 2011-2015. An important advantage of GCM is the ability to model the trajectories of individuals over time and distinguish within-individual from between-individual heterogeneities in estimating multimorbidity accumulation/changes shaped by other variables [39]. In this study, we use three waves of CHARLS across four years (2011-2015) of data collection. As multimorbidity is a count variable, we assume a Poisson distribution. As there is likely a non-linear relationship between PM 2.5 exposure and elderly health, we add a quadratic term of PM 2.5 exposure, as well as alternative models (detailed below) using categorisations of PM 2.5 . To examine heterogeneity in the associations of PM 2.5 exposure among different groups, we further explore the interactions between PM 2.5 exposure and age, SES (education, occupational status, HuKouresidence), and smoking status.
We conduct a number of robustness checks. First, we run the same set of LCA and GCMs models but use a categorical measure of PM 2.5 exposure to allow for a more flexible estimation of the association between AP and multimorbidity (shown in Tables B and C in S1 File). Second, we run the GCMs first on the entire sample 45-85 years, then we subset the data into ages 45-64 and 65-85 years to compare middle-aged and oldest individuals (Tables D and E in S1 File). Third, given that 7,867 observations are deleted due to missingness, we apply multiple imputation (MI) using chained equations to complete our analysis samples under the missingat-random assumption. We then use multilevel random-intercept Poisson regression to compare the results from the MI and complete datasets (Table F in S1 File).

Descriptive analysis
At baseline, the average age of respondents is approximately 59 years old. 49% are men. 39% of the population attained primary education and 34% had middle or higher education. 78% of respondents are registered with rural HuKou but 19% respondents with rural HuKou are living in urban areas; 72% worked for agricultural jobs, and 88% are married. Average multimorbidity is 1.5 at baseline, increasing to 2.08 by 2015. From 2011 to 2015, average PM 2.5 rises only slightly from 51.27 to 52.90 μg/m 3 (Table 1). In addition, Table 1 shows that descriptive statistics for both the baseline and entire period's analytical samples are very similar to the entire sample, suggesting that sample selection, including attrition, may not substantially affect results.

Latent class analysis
First, we use LCA to explore the association between PM 2.5 exposure and multimorbidity patterns at baseline (CHARLS 2011). Based on the comparisons of AIC, BIC, and likelihood estimates, the four-class model was chosen as the final model in this study (Table A and Table 2 shows the distribution of the sample across the four classes. Based on the probability distribution of chronic diseases across the classes, they are labelled: respiratory (Class 1), musculoskeletal (Class 2), cardio-metabolic (Class 3) and relatively healthy (Class 4). The label takes its name from the main diseases (items) that characterise it. For example, we labelled Class 1 as respiratory because the probability of lung diseases is 0.79, which means 79% of respondents in Class 1 suffered from lung diseases. Similarly, labels of class 2 and 3 are originated from high probabilities of arthritis/rheumatism (0.81) and hypertension (0.71).

Multinomial regression
Second, we present the results from the cross-sectional multinomial models analysing the associations between PM 2.5 and multimorbidity patterns at baseline, controlling for a set of covariates. The findings show that higher exposure to AP is associated with a higher prevalence of the other three classes of chronic diseases, compared to those who are "relatively healthy" (Table 3). Specifically, a 1 μg/m 3 increase in cumulative exposure to PM 2.5 , is associated with a higher likelihood of belonging to respiratory, musculoskeletal and cardio-metabolic cluster: 2.4% (95% CI: 1.02, 1.03), 1.5% (95% CI: 1.01, 1.02) and 3.3% (95% CI: 1.03, 1.04), respectively.
In terms of the other covariates, the regression model shows that women have a lower likelihood of belonging to any of three multimorbidity classes compared with the healthy class. Older age is associated with a higher likelihood of belonging to respiratory and cardio-metabolic clusters but with a lower likelihood in musculoskeletal cluster. The relationship between education and multimorbidity patterns is complex. Compared with respondents without schooling, those with primary education have a lower likelihood of belonging to the musculoskeletal cluster, and those with middle or higher education have a higher likelihood of belonging to cardio-metabolic cluster.
Respondents with urban HuKou have a higher likelihood of belonging to any of the three disease clusters, especially the cardio-metabolic cluster. Working in non-agricultural positions is associated with a higher likelihood of being in those three classes. Being single is associated

PLOS GLOBAL PUBLIC HEALTH
with a higher likelihood of being in the respiratory cluster but is not associated with membership in the musculoskeletal and cardio-metabolic clusters. Smoking status is associated with a higher likelihood of respiratory and cardio-metabolic clusters but not associated with the musculoskeletal cluster. Compared with non-smokers, former smokers have a higher likelihood of belonging to the cardio-metabolic cluster, whereas current smokers have a lower likelihood. Higher GDP is associated with a higher likelihood of belonging to any of the three disease clusters.

Heterogeneity in patterns
We also analyse the heterogeneity in multimorbidity cluster membership by age, gender and HuKou. Fig 2 shows the association between multimorbidity patterns and 11-year exposure to PM 2.5 , comparing middle-aged (45-65 years old) with older adults (66-85 years old). Generally, higher exposure to PM 2.5 is associated with a higher probability of belonging to respiratory and, especially, cardiometabolic clusters. Unexpectedly, lower levels of PM 2.5 are associated with higher likelihood of belonging to the musculoskeletal cluster. We can see the negative associations of PM 2.5 exposure are more substantial among the group aged 66-85, because there is a lower likelihood of belonging to the relatively healthy class but a higher likelihood in the respiratory, musculoskeletal, and cardio-metabolic clusters when PM 2.5 exposure levels are elevating.

Growth curve models
To examine the associations between cumulative PM 2.5 exposure and multimorbidity accumulation, we conduct a set of GCMs. First, we examine the linear and non-linear relationships between PM 2.5 exposure and multimorbidity by adding linear and quadratic terms of PM 2.5 exposure in GCMs ( Table 4). The significant coefficients of both PM 2.5 exposure and its quadratic term suggest a u-shaped association between PM 2.5 exposure and multimorbidity (Table 4). This means that exposure to PM 2.5 is positively associated with the likelihood of

PLOS GLOBAL PUBLIC HEALTH
In Model 4 (the full model) women have a higher prevalence of multimorbidity than men, and there is a curvilinear increase in multimorbidity over age. Unexpectedly, people with higher education and urban HuKou have a higher prevalence of multimorbidity. Occupation and partnership status is not significantly associated with the risk of multimorbidity. Compared with respondents who never smoked, former smokers have a higher prevalence of multimorbidity, but current smokers do not. There is not a significant association between GDP and multimorbidity accumulation.

Heterogeneity in multimorbidity accumulation
To explore the trajectory of multimorbidity associated with PM 2.5 exposure across age, based on Model 4 in Table 4, we interact age with PM 2.5 exposure and PM 2.5 squared; then, we plot the association between PM 2.5 exposure and multimorbidity scores in Fig 3. Generally, respondents exposed to higher PM 2.5 exposure have higher risk of multimorbidity. Over the age of 60, the respondents in the highest AP exposure categories (e.g., PM 2.5 over 80+) have steeper multimorbidity trajectories than those in lower exposure categories. However, it is only at certain older ages (75 years, for example) that this is statistically significant. There is an inverted u-shaped relationship between age and multimorbidity, indicating a higher risk in multimorbidity with ageing among adults aged under 75 but a decline in multimorbidity with ageing among those between 75-85 years old. The oldest old (aged over 75) have a lower prevalence of multimorbidity.
We conduct a set of analyses to understand heterogeneities in the association between PM 2.5 and multimorbidity among different HuKou-residence groups. Generally, we can see that there is a u-shaped relationship between PM 2.5 exposure and multimorbidity  We conduct a number of robustness checks. We use categorical PM 2.5 exposure to check the non-linear relationship between PM 2.5 exposure and multimorbidity. First, we find that compared with the relatively healthy class, higher exposure to PM 2.5 is associated with a higher prevalence of the other three classes of chronic diseases (Table B in S1 File). Second, higher exposure to PM 2.5 is associated with a higher incidence rate ratio of multimorbidity in the longitudinal analyses (Table C in S1 File). In addition, comparing findings from complete data

PLOS GLOBAL PUBLIC HEALTH
and MI data shows that results are consistent (Table F in S1 File). These sensitivity analyses indicate the robustness of our results.

Discussion
By linking the CHARLS, a nationally representative dataset, with historical PM 2.5 records derived from remote sensing technology, we investigate the associations between long-term exposure to PM 2.5 and patterns and accumulation of multimorbidity. Incorporating 15-year PM 2.5 exposure histories enables us to capture the associations between long-term exposure and chronic disease accumulation. To the best of our knowledge, this is the first study to establish a link between PM 2.5 exposure and multimorbidity patterns, and to estimate associations between cumulative exposure and the accumulation of multimorbidity longitudinally. Findings from the LCA for multimorbidity patterns suggest that higher exposure to PM 2.5 is associated with a higher risk of cardio-metabolic and respiratory multimorbidity (dominated by lung disease), whereas lower PM 2.5 exposure is associated with a higher likelihood of musculoskeletal multimorbidity. Our longitudinal GCM findings show that both lower and higher historical AP exposure is associated with faster multimorbidity accumulation. This u-shaped association may be explained by the different multimorbidity clusters at opposite ends of AP exposure spectrum, as shown in the LCA models. These estimates suggested that for many middle-income countries such as China, more efforts to reduce PM 2.5 concentrations would be associated with a substantial reduction in burden of multiple diseases. First, our LCA analyses show that the four latent classes are differentially associated with PM 2.5 exposure, which are partly in accordance with previous studies of AP and single diseases [40,41]. For example, higher exposure to PM 2.5 is associated with an increased likelihood of developing respiratory diseases and particularly, cardio-metabolic diseases (a cluster dominated by hypertension). Previous studies about associations between AP and hypertension are inconsistent, some of which find significant associations between them but others do not [32,[42][43][44]. These studies suggest that hypertension may be related to AP or caused and exacerbated by other cardiometabolic disorders (e.g., cardiovascular diseases) attributable to PM 2.5 .
One unexpected finding, partly inconsistent with previous research [45], is that higher PM 2.5 exposure is associated with a reduced likelihood of developing musculoskeletal diseases such as arthritis, or to put it another way, that those suffering with musculoskeletal multimorbidity are more likely to live in areas of low air pollution. A possible explanation is that the rural-urban difference in environmental exposures (besides PM 2.5 exposure)-like noise, green space, food environments-which makes urban residents more likely to be ill from metabolic conditions than achy joints and cartilages [32,[46][47][48]. Second, rural and urban residents may have had different occupational exposures throughout their lifetime. Our analysis shows that there are more urban residents working at non-agricultural jobs than rural residents (65% vs. 15%). It is likely that urban residents have led a more sedentary lifestyle (working in whitecollar occupations) compared with rural residents who have been employed in farming, forest, hunting and fishing that are physically demanding [49]. This might predispose them to develop musculoskeletal conditions as there is more wear and tear on the body [50, 51]. Third, although AP exposure is higher in urban areas than in rural areas in China, there are potential offsetting benefits of urban residence, such as better access to health care [52]. Urban residents may be more likely to have better healthcare and get diagnosed for conditions that are not immediately obvious (e.g., hypertension); hence, those disorders might be undiagnosed and potentially underestimated in rural residents. Fourth, in China, rural areas consume more solid fuels (e.g., coal and wood) as the major source of energy [53], which leads to more severe indoor air pollution that is associated with increased odds of arthritis [54]. Furthermore, compared with respiratory or cardio-metabolic diseases, musculoskeletal diseases (especially arthritis) might be more influenced by health care than AP [55]. This should be explored in future research by using finer measures of rural-urban residence and migration, and by explicitly investigating these potential mechanisms for disparities.
Our results of GCMs suggest that cumulative PM 2.5 exposure is associated with higher multimorbidity scores at both lower and higher levels of PM 2.5 (e.g., a u-shaped association).
Although there are few studies related to the u-shape association of environmental pollution with multimorbidity, the u-shape links between AP and health risks (e.g., hospital admissions and mortality) are well established [56]. Most AP in Chinese cities comes from industrial production and vehicle traffic, which is increasing in conjunction with economic development [57]. Due to a high proportion of industrial sectors and increasing traffic intensity in China, massive fossil fuels, especially coal, are consumed for economic development, and AP has been more and more severe [57]. This may explain why higher AP is associated with higher risk of chronic health diseases. Apart from the above contextual characteristics (urbanisation and economic development), the elevated multimorbidity at lower PM 2.5 exposure might be attributed to higher levels of musculoskeletal multimorbidity that are related to the differences in the rural-urban context (as explained above). However, due to the strong side effects of AP, higher exposure to PM 2.5 could lead to increased risk of multimorbidity once PM 2.5 levels exceed the threshold (approximately 53 μg/m 3 in our findings). The annual average PM 2.5 exposure in China in 2015 was 55.2 μg/m 3 [58], indicating that current PM 2.5 exposure is harmful to human health among the majority of Chinese adults. These findings suggest that corresponding policies regarding AP should be implemented based on the strategies of sustainable development and disease prevention.
Third, the associations of multimorbidity accumulation show unexpected links with SES (the higher SES, including higher education and urban HuKou, is associated with a higher risk of multimorbidity). These unexpected results regarding SES can be understood from two perspectives. First of all, respondents with higher SES are more likely to be urban dwellers, who are exposed to higher AP as well as engaging in less physical exercise and more harmful health behaviours [59,60]. Second, as mentioned above, this rural-urban differences in social fabric, including contextual and compositional factors associated with rural-urban residence, are perhaps fully captured by the current covariates. For example, urban residents tend to have higher educational levels and other social advantages (e.g., income, wealth, health awareness, social support etc.), as well as better access to health care. In addition, the measure of multimorbidity in this study is based on self-rated diagnosis, so respondents with higher SES might report a higher prevalence of chronic diseases [59]. The rural-urban environmental context may also contribute to these findings. Urban residents experience more environmental stressors (including noise and fast food), which lead to higher risks of multimorbidity [61,62].
In the LCA results, we find that women have a lower likelihood of belonging to multimorbid classes, which is inconsistent with our findings in the longitudinal analyses. The reason might be that the multimorbidity classes in the LCA are each dominated by one disease. In this study, the respiratory cluster is dominated by lung diseases, and cardio-metabolic and musculoskeletal clusters are dominated by hypertension and arthritis. These findings are in line with previous studies that show that the prevalence of pulmonary diseases, hypertension and arthritis, is higher in men than women in China [63][64][65]. However, when considering multiple chronic diseases, women might have higher risks in multimorbidity because Chinese women have less access to medical resources than men [12].
In additional analyses (Figs 3 and 4), we further explore the associations between PM 2.5 exposure and multimorbidity accumulation by age and HuKou-residence. First, these associations by age show that PM 2.5 is associated with a higher number of morbidities among respondents aged 45-75, whereas for respondents aged over 75, PM 2.5 is associated with lower risk of multimorbidity. This might be due to mortality selection. Older people with multiple morbidities might die younger or they are less exposed to AP. Second, we see that when PM 2.5 exposure is lower than 80 μg/m 3 , respondents with urban HuKou have a higher risk of multimorbidity accumulation than those with rural residence. As previously discussed, urban residents have more advantageous socio-economic conditions and might report a higher prevalence of multimorbidity. However, when PM 2.5 exposure exceeds 80 μg/m 3 , there is no significant difference in the associations between PM 2.5 exposure and multimorbidity among groups with different HuKou-residence. This inconsistent finding might be due to two reasons. Firstly, the insignificant difference might be due to small sample sizes. There are only 5% of respondents living in cities where PM 2.5 levels are over 80 μg/m 3 . Secondly, due to more opportunities in major cities (e.g., more high-paying jobs, better education, and more access to health care), many people with rural HuKou decide to live and work in urban areas (30% people with rural HuKou live in urban areas). This might explain why there is no rural-urban difference in areas with high exposure (over 80 μg/m 3 ).
Our study has several advantages over previous studies. First, we link the CHARLS with historical PM 2.5 records over a 15-year period, which enables us to measure long-term exposure to AP for each respondent. Second, this is a first study to analyse associations between AP and multimorbidity clusters. Nevertheless, there are several limitations in this study. First, we were unable to obtain detailed addresses of respondents from CHARLS, and thus we use city-level exposure to predict individual multimorbidity. In the Chinese context, a city might cover a large area (e.g., Beijing city) and consist of inner-city areas (more urban, more polluted) and suburb areas (more rural, less polluted). This means that we cannot accurately compare the variations within the same city. Future research should link AP at a smaller geographic scale. Second, multimorbidity is measured by self-reported doctor's diagnosis which might underestimate chronic diseases due to lower levels of diagnosis in some groups [66]. This might be related to individual characteristics (e.g., gender, age) making it less likely to seek treatment or related to reduced access to healthcare in some places. Third, our findings only indicate the associations of PM 2.5 with multimorbidity, and thus do not have a causal interpretation. We cannot rule out that our findings might be the result of other air pollutants (e.g., indoor air pollution, NO 2 , O 3 , etc), or contextual factors that are highly correlated with PM 2.5 exposure, for example, lack of green space or noise pollution. Fourth, our latent class analysis is based on 14 chronic diseases available in the CHARLS and does not therefore cover all chronic diseases. From the perspective of the association between multimorbidity and exposure to air pollution, the breadth of chronic diseases not included in the CHARLS survey makes it difficult to predict the direction of bias for each cluster. Thus, further analyses should collect a broader range of diseases to reduce the bias in disease clusters. Fifth, 13% of participants (7,469 observations) did not have complete disease-reporting data. In longitudinal analyses, we used multiple imputation to complete the dataset; however, for the LCA analyses, this was not an option. It is possible that underreporting of certain types of diseases might create a bias in how disease clusters are associated with exposure to air pollution, although it is difficult to predict that bias. Future research with more complete disease data could help to solve this issue. Finally, we measure multimorbidity accumulation over a relatively short period of 4 years (2011-2015).

Conclusion
This study provides evidence showing that higher cumulative exposure to PM 2.5 is associated with increased risks of all types of multimorbidity patterns, but especially cardio-metabolic multimorbidity, and higher multimorbidity accumulation over 15 years. Notably, areas with low AP exposure still have higher rates of multimorbidity, associated with musculoskeletal disorders. Thus, our study highlights how multimorbidity clusters vary contextually and reveals that PM 2.5 exposure is more detrimental to health among older adults. However, further research is needed to unpick the nexus of contextual and compositional factors associated with the development of chronic diseases in rural and urban settings and to detect their causal mechanisms.