Importance of applying Mixed Generalized Additive Model (MGAM) as a method for assessing the environmental health impacts: Ambient temperature and Acute Myocardial Infarction (AMI), among elderly in Shanghai, China

The association between acute myocardial infarction (AMI) morbidity and ambient temperature has usually been examined with the generalized linear model (GLM) or the generalized additive model (GAM). However, effect sizes estimated by these two methods may be biased because of autocorrelation in time series data and arbitrary selection of the degrees of freedom of natural cubic splines. The present study analyzed how climatic factors affected AMI morbidity among older adults in Shanghai using a mixed generalized additive model (MGAM) that addresses these shortcomings. An autoregressive random effect was included when modeling the relationship between AMI and temperature, PM10, day of the week and time. The degrees of freedom for time were chosen based on the seasonal pattern of temperature. The performance of MGAM was compared with that of GAM in terms of the autocorrelation function (ACF), the partial autocorrelation function (PACF) and goodness of fit. One-year predictions of AMI counts in 2011 were made using MGAM with the moving average. For the data from 2007 to 2011, MGAM adjusted for the autocorrelation of the AMI time series and captured the seasonal pattern when the degrees of freedom for time were set at 5. The MGAM results fitted the data well in terms of both internal validity (R2 = 0.86) and external validity (correlation coefficient = 0.85). The risk of AMI was relatively high at low temperatures (risk ratio = 0.988 (95% CI 0.984, 0.993) below 12°C), decreased more rapidly as temperature rose from 12°C to 26°C (risk ratio = 0.975 (95% CI 0.971, 0.979)), and increased again above 26°C, although not significantly (risk ratio = 0.999 (95% CI 0.986, 1.012)). MGAM is more appropriate than GAM when the response variable is autocorrelated and the predictors vary seasonally. The risk of AMI was comparatively higher when the temperature was below 12°C in Shanghai, a typical location with a subtropical climate.

Introduction
The generalized linear model (GLM) and the generalized additive model (GAM) are the two statistical methods most commonly used to analyze the relationship between environmental factors and epidemiological outcomes [1][2][3][4]. However, under their existing model-fitting frameworks, neither GLM nor GAM may fit the time series data of environmental epidemiological studies appropriately.
Acute myocardial infarction (AMI) is a life-threatening condition that affects more than 7 million individuals worldwide annually and causes over one-third of deaths in developed countries [5][6][7]. The global burden of cardiovascular diseases, including myocardial infarction, is expected to rise sharply in developing countries because of their large populations and aging societies [8][9][10].
Among the risk factors for AMI, ambient temperature has attracted considerable public interest [11,12] in the era of climate change. However, the association between ambient temperature and AMI remains unclear and inconsistent [13][14][15]. The inconsistency may be attributed to differing data sources, inconsistent AMI ascertainment, and the use of different statistical methodologies [11,12].
As the response variable in a time-dependent model, the AMI count is a time series with autocorrelated patterns, which violates the independence assumption of GLM and GAM. In addition, when the degrees of freedom (df) of the natural spline are specified in GAM, df has often been set arbitrarily at 4 or 7 per year in previous studies [3,16,17]. As reported in our previous studies, this arbitrary rule heightens the risk of over-fitting [18,19]. Because the association between ambient temperature and AMI is weak [20,21], the estimation bias introduced by such arbitrary rules in GAM and GLM cannot be ignored. The mixed generalized additive model (MGAM), with an autoregressive term in the random effect, offers a better alternative for data analysis in environmental epidemiological studies. In addition, we have developed a robust strategy to determine the degrees of freedom of natural splines in MGAM [19][22][23][24].
Additionally, most recent studies on the association between temperature and the risk of AMI were conducted in high-latitude areas, since cold weather triggers cardiovascular diseases of different types [25]. Very few studies have been conducted in metropolitan areas with a subtropical climate, such as the Municipality of Shanghai [15]. To demonstrate the performance of MGAM on time series data in environmental epidemiology, MGAM and GAM are compared here in modeling the association between ambient temperature and AMI morbidity in Shanghai, China.

Ethical statement
As aggregated data with no personal information were involved, ethical review was exempted by the Institutional Review Board of the Public Health School, Fudan University.

Data
Shanghai is situated in central-eastern China and has a north subtropical monsoon climate. It is a densely populated metropolis with a total resident population of 19.2 million according to the 2010 national census [26].
The daily numbers of emergency department (ED) attendances for AMI from 2007 to 2011 were obtained from the official Medicare Database in Shanghai. The researchers could access only the aggregated daily numbers of AMI cases by sex and age, not any information identifying individual patients. Under the Tenth Revision of the International Classification of Diseases (ICD-10), AMI was defined as I21 and I22.913. Since the population size of Shanghai was stable from 2007 to 2011, the daily number of AMI cases, rather than the incidence rate of AMI, was used as the response variable in GAM and MGAM in the present study. Only AMI patients aged 65 years or above were included, since we considered them vulnerable to the impact of ambient temperature. The official Medicare Database records all ED attendances among members of Shanghai's social health insurance, which covers usual residents with Shanghai household registration and persons with a paid employment contract of more than six months.
Meteorological indices, including daily average ambient temperature and relative humidity, were retrieved from the Shanghai Meteorological Bureau. Daily concentrations of particulate matter 10 micrometers or less in diameter (PM10), sulphur dioxide (SO2) and nitrogen dioxide (NO2) were obtained from the Shanghai Environmental Monitoring Center.
The population sizes of Shanghai residents aged 65 years or above from January 2007 to January 2012 were collected from the Shanghai Research Center on Aging.

Statistical methods
The mean, standard deviation, minimum and maximum were used to describe the AMI count, ambient temperature, relative humidity and air pollutants for the whole study period and for each season. Pearson correlation was used to assess the correlations among ambient temperature, relative humidity and air pollutants. Spearman correlation was used to assess the correlation between ambient temperature and the AMI count.
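As an illustration of the rank-based correlation mentioned above, the following minimal Python sketch computes a Spearman coefficient as the Pearson correlation of average ranks. The function names and the toy values are assumptions made for this example; they are not the study data, and the study itself used R.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical daily temperatures (degrees C) and AMI counts, for illustration only.
temperature = [4.0, 9.5, 15.0, 21.0, 28.0, 31.0]
ami_counts = [30, 27, 27, 20, 18, 19]
rho = spearman(temperature, ami_counts)
print(round(rho, 3))
```

A negative value of `rho` in such a calculation corresponds to counts that tend to fall as temperature rises.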
GAM and MGAM [27] were used to analyze the statistical association of ambient temperature, relative humidity, air pollutants and day of the week with AMI morbidity. The detailed methodology has been described elsewhere [22,23] and is also given in the supplement.
The most important step was using GAM to estimate the degrees of freedom of the natural spline function for time, NS(t, df t ). The seasonal pattern of the temperature effect was used to determine a suitable df for controlling the unmeasured factors captured by NS(t, df t ), and thus to achieve unbiased estimation of the temperature effect. After the df for time was selected, the Akaike information criterion was applied to determine the degrees of freedom of the other factors, such as df temp and df pm10 . The coefficients and weights for temperature were estimated by maximum partial likelihood using Newton's method. Variable selection was based on statistical significance (p value) and professional rationale. The model with ambient temperature and PM10 as independent variables was selected based on the Akaike information criterion using GAM. After the spline was fitted, risk ratios and confidence intervals were approximated for three temperature zones defined by the change points of the spline.
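The text does not give the exact formula used to approximate the zone-specific risk ratios. One common approach, shown here purely as an assumption, treats the spline within a zone as roughly linear on the log scale and converts the slope and its standard error into a risk ratio with a Wald-type 95% confidence interval. The slope and standard error below are hypothetical values chosen for illustration, not estimates reported by the study.

```python
import math


def risk_ratio_with_ci(beta, se, z=1.96):
    """Convert a log-scale slope and its standard error into (RR, lower, upper).

    beta: change in log expected count per unit increase in the exposure.
    se:   standard error of beta.
    z:    normal quantile; 1.96 gives an approximate 95% interval.
    """
    rr = math.exp(beta)
    lower = math.exp(beta - z * se)
    upper = math.exp(beta + z * se)
    return rr, lower, upper


# Hypothetical slope and standard error for one temperature zone (assumed values).
beta = -0.0121
se = 0.0023
rr, lower, upper = risk_ratio_with_ci(beta, se)
print(f"RR = {rr:.3f} (95% CI {lower:.3f}, {upper:.3f})")
```

A risk ratio below 1 under this parameterization means that the expected count falls as the exposure rises within that zone.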
The four seasons were classified as spring (March to May), summer (June to August), autumn (September to November) and winter (December to February) as usual [19].
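The season definition above can be written as a simple lookup from calendar month to season. The sketch below is illustrative only; the names `SEASON_BY_MONTH` and `season_of` are assumptions introduced for this example.

```python
import datetime

# Month-to-season mapping following the classification stated in the text.
SEASON_BY_MONTH = {
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
    12: "winter", 1: "winter", 2: "winter",
}


def season_of(day):
    """Return the season label for a datetime.date."""
    return SEASON_BY_MONTH[day.month]


print(season_of(datetime.date(2007, 1, 15)))
```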
All statistical analyses were conducted using R software (version 3.6.2) and p<0.05 was considered as statistically significant.
Over the study period, the daily mean temperature averaged about 17.3°C overall, with 17.0°C in spring, 27.7°C in summer, 19.8°C in autumn and 6.0°C in winter. The mean PM10 concentration was about 80.5 μg/m3 overall, with 88.6 μg/m3 in spring, 64.6 μg/m3 in summer, 75.3 μg/m3 in autumn and 90.5 μg/m3 in winter (Table 2). Relative humidity, SO2 and NO2 are also described in Table 2. Ambient temperature, relative humidity and air pollutants were significantly correlated with one another (Table 3). Daily numbers of AMI cases were negatively correlated with temperature (Spearman correlation = −0.4, p<0.05, Fig 1).

Model fitting
For both female and male AMI counts, df t = 5 and df temp = 5 were determined using the methods described in the previous section. These degrees of freedom for the natural splines were selected because the resulting MGAM showed the seasonal pattern of temperature for both women and men (Fig 2).
In the ACF and PACF plots of the residuals, the autocorrelation and partial autocorrelation coefficients of GAM exceeded 0.10, the threshold used as the criterion for no correlation, at some nonzero lags (Fig 3). In contrast, with the autocorrelation order p = 2, the autocorrelation and partial autocorrelation coefficients of MGAM did not exceed 0.10 at any nonzero lag (Fig 3).
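The residual check described above can be sketched as follows: compute the sample autocorrelation function of a residual series and list the nonzero lags whose absolute value exceeds 0.10. This is a minimal illustration with assumed function names and simulated residuals (an AR(2) process with hypothetical coefficients 0.5 and 0.2), not the study's residuals or its R code.

```python
import random


def sample_acf(series, max_lag):
    """Sample autocorrelation at lags 0..max_lag (biased estimator)."""
    n = len(series)
    mean = sum(series) / n
    centered = [v - mean for v in series]
    denom = sum(c * c for c in centered)
    acf = []
    for lag in range(max_lag + 1):
        num = sum(centered[t] * centered[t - lag] for t in range(lag, n))
        acf.append(num / denom)
    return acf


def lags_exceeding(acf, threshold=0.10):
    """Return the nonzero lags whose |autocorrelation| is above the threshold."""
    return [lag for lag, r in enumerate(acf) if lag > 0 and abs(r) > threshold]


# Simulated residuals with AR(2) dependence; coefficients are assumed values.
random.seed(0)
residuals = [0.0, 0.0]
for _ in range(500):
    residuals.append(0.5 * residuals[-1] + 0.2 * residuals[-2] + random.gauss(0, 1))

flagged = lags_exceeding(sample_acf(residuals, max_lag=10))
print("lags above 0.10:", flagged)
```

An empty list from `lags_exceeding` would correspond to the situation reported for MGAM, where no nonzero lag exceeded the threshold.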
If the degrees of freedom of the natural spline for time are arbitrarily set at 4 per year, this amounts to 20 for the 5-year period from 2007 to 2011. With this larger df, GAM handled the autocorrelation better than with df = 5 (Fig 4 compared with Fig 3). However, no seasonal pattern of temperature was found, which indicated over-fitting (Fig 5).
The degree of freedom in the spline function was 5 for both time and temperature, and the order of the autocorrelation was 2. The models derived by MGAM were well fitted for the female AMI data (R 2 = 0.860) and male AMI data (R 2 = 0.856) (Figs 6 and 7).

Association between ambient temperature and AMI counts
Since both sexes showed similar patterns, only the association between ambient temperature and AMI counts in women is presented here. Within the temperature range observed in this study, the relationship between the risk of AMI and ambient temperature took the form of a mirror-image J-shaped curve. The risk of AMI was relatively high at low temperatures (risk ratio = 0.988 (95% CI 0.984, 0.993) below 12°C), decreased more rapidly as temperature rose from 12°C to 26°C (risk ratio = 0.975 (95% CI 0.971, 0.979)), and increased again above 26°C, although not significantly (risk ratio = 0.999 (95% CI 0.986, 1.012)) (Fig 8A). For the GAM with df set at 4 per year, equal to df = 20 for 5 years, no association could be found (Fig 8B).

PLOS ONE
Applying Mixed Generalized Additive Model (MGAM) as a method for assessing the environmental health impacts

Discussion
The MGAM used in the present study successfully incorporated the autocorrelation of daily AMI cases on consecutive days. We also proposed a robust strategy for selecting the degrees of freedom of the time series predictor, a choice that remains controversial in the widely used GLM and GAM [1][2][3][4]. The time effect (NS(t, df t )) in our model was used to control for unmeasured factors. On the other hand, a specific lag effect chosen subjectively or by time series forecasting methods [12][13][14] carries a risk of over-fitting (Fig 5), which could erase the association between exposure and disease (Fig 8). In the present study, a weighted average of the daily mean temperature over the past 7 days was used to model the lagged effect of ambient temperature on AMI. Our model was more appropriate because it takes into account not only the effect of average temperature but also the impulse impact of particular temperatures [19,22,23].
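To make the lag construction concrete, the sketch below forms a weighted average of the daily mean temperature over the preceding 7 days. Two points are assumptions made for illustration: the window is taken as lags 1 to 7 (excluding the current day), and the weights are hypothetical fixed numbers, whereas in the study the weights were estimated together with the model coefficients.

```python
def weighted_lag_average(temps, weights):
    """Weighted average of the previous len(weights) days for each day.

    weights[0] applies to yesterday, weights[1] to two days ago, and so on.
    Days without a full history are returned as None.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    k = len(weights)
    result = []
    for day in range(len(temps)):
        if day < k:
            result.append(None)  # not enough past days yet
            continue
        window = [temps[day - lag] for lag in range(1, k + 1)]
        result.append(sum(w * t for w, t in zip(weights, window)) / total)
    return result


# Hypothetical temperatures (degrees C) and weights that decay with lag.
daily_temps = [5.0, 6.5, 4.0, 3.5, 7.0, 9.0, 8.5, 6.0, 5.5, 4.5]
lag_weights = [0.30, 0.22, 0.16, 0.12, 0.09, 0.06, 0.05]
exposure = weighted_lag_average(daily_temps, lag_weights)
print(exposure[7:])
```

Letting the weights differ by lag is what allows a single unusually cold or hot day to count for more than it would in a plain 7-day mean.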
Our study indicates that, in general, lower temperature is a risk factor for acute myocardial infarction among the elderly in Shanghai. Both older men and older women showed a mirror-image J-shaped association between ambient temperature and the incidence of AMI. When the temperature stayed below 12°C, the incidence of AMI was relatively high, and as the temperature rose above 12°C the risk decreased. However, the trend reversed when the temperature exceeded 26°C, with AMI incidence appearing to increase as the temperature rose. To our knowledge, this is the first study on this topic in a warm-climate city using an appropriate statistical method, MGAM. The results of this study are consistent with previous findings from Hong Kong and Taiwan, where temperature below a threshold of 24°C was significantly associated with AMI hospitalization but no significant heat effects were found [15]. Low temperature exposure was also found to be a risk factor for AMI in the Worcester metropolitan area, MA, US [14] and in Belgium [28]. A 10-year longitudinal study also found that rates of myocardial infarction events decreased with increasing atmospheric temperature [29]. However, low temperature exposure was not found to be a major trigger of myocardial infarction in cold areas such as Sweden [30] and Minnesota, US [31]. A U-shaped association between temperature and the risk of AMI was found in Korea [13] and in the Hunter Region of New South Wales, Australia [32]. These studies suggest that the association between ambient temperature and AMI may follow different patterns in different regions. Residents of different regions may have adapted to the weather of their habitat but remain sensitive to the stress caused by extreme temperature. In an analysis of registry data from 21 countries, rates of coronary events increased during comparatively cold periods, especially in warm areas [33].
These findings indicate that both the normal temperature range and the extreme weather of a specific area should be considered when analyzing the impact of temperature on human health. In this era of climate change and extreme weather, residents and policy makers should prepare accordingly, especially for vulnerable populations.

Several possible mechanisms may explain why low temperature exposure increases the risk of AMI. Cold stress is known to cause vasoconstriction and a rise in blood pressure [34,35], which is one of the most prominent risk factors for myocardial infarction, especially in patients with essential hypertension [36]. Additionally, increased sympathetic nervous activity and an increased load of sodium presented to the kidney for excretion in winter may also increase the risk of AMI [36]. Low temperature exposure may also affect the haemostatic system by increasing platelet counts and platelet sensitivity in whole blood, and hence plasma viscosity [34,35].
Humidity and air pollutants such as SO2, NO2 and especially PM10 have been found to be risk factors for diseases such as AMI. However, these factors were not significant in our MGAM. This might be explained by the correlation between these factors and ambient temperature in our study.
Some limitations of the present study should be considered. The first is the ecological fallacy. Although the natural cubic spline (NS(t)) controlled for the potential effect of some unmeasured factors, it is impossible to rule out bias completely in an ecological study design. Second, PM2.5 has commonly been used as a confounder in previous environmental epidemiology studies, but this variable was not available in the current study. Third, there was potential disease misclassification of AMI, although the chance is low. Fourth, we could not conduct analyses stratified by AMI subtype, i.e. ST-elevation MI (STEMI) and non-ST-elevation MI (NSTEMI). Fifth, we used the AMI cases recorded as emergency visits in the official Medicare Database, which misses some mild AMI cases and some severe AMI cases that caused sudden death.

Conclusion
MGAM is more appropriate than GAM for time series studies of environmental impacts on health, such as the effect of ambient temperature on acute myocardial infarction morbidity. Temperature below 12°C is a risk factor for AMI morbidity in a north subtropical monsoon climate. Our findings add to the knowledge about the association between changes in temperature and the incidence of AMI.