Air Pollution and Acute Myocardial Infarction Hospital Admission in Alberta, Canada: A Three-Step Procedure Case-Crossover Study

Adverse associations between air pollution and myocardial infarction (MI) are widely reported in the medical literature. However, inconsistency and sensitivity of the findings remain major concerns. An exploratory investigation was undertaken to examine associations between air pollutants and risk of acute MI (AMI) hospitalization in Alberta, Canada. A time-stratified case-crossover design was used to assess the transient effect of five air pollutants (carbon monoxide (CO), nitrogen dioxide (NO2), nitric oxide (NO), ozone (O3) and particulate matter with an aerodynamic diameter ≤2.5 μm (PM2.5)) on the risk of AMI hospitalization over the period 1999–2009. Subgroups were predefined to see whether any susceptible group of individuals existed. A three-step procedure, including univariate analysis, multivariate analysis, and bootstrap model averaging, was used. The multivariate analysis was used in an effort to address adjustment uncertainty, whereas the bootstrap technique was used as a way to account for regression model uncertainty. There were 25,894 AMI hospital admissions during the 11-year period. Estimating health effects that are properly adjusted for all possible confounding factors and accounting for model uncertainty are important for making interpretations of air pollution–health effect associations. The most robust findings included: (1) only 1-day lag NO2 concentrations (6-, 12- or 24-hour average), but not those of CO, NO, O3 or PM2.5, were associated with an elevated risk of AMI hospitalization; (2) the evidence suggested an elevated risk of hospitalization for NSTEMI (non-ST segment elevation myocardial infarction), but not for STEMI (ST segment elevation myocardial infarction); and (3) susceptible subgroups included elders (age ≥65) and elders with hypertension. As this was only an exploratory study, there is a need to replicate these findings with other methodologies and datasets.


Introduction
Adverse associations between air pollution and myocardial infarction (MI) are widely reported in the medical literature, and a recent systematic review [31] concluded that all pollutants except ozone were significantly associated with an increase in short-term MI risk. While some of these associations are to some extent apparent, the mechanisms underlying them are not completely understood [20]. Associations of health effects (particularly short-term) with air pollutants are often relatively small [32], and these types of studies often have unfavorable signal-to-noise ratios and substantial correlations among both the exposures and the potential confounders [33]. The estimated health effects can be confounded by study design; lack of sufficient adjustment for covariates; and flexibility in data collection, in defining and quantifying exposure, and in analysis and reporting [34–36], all of which can lead to variable results.
To highlight this, we summarized case-crossover studies cited in PubMed before September 23, 2014 that reported associations between ambient particulate matter (PM) and MI (Table A in S1 File). A number of observations can be made from Table A in S1 File: (i) most studies used univariate or simple models adjusted/matched with selected meteorological factors (typically relative humidity and temperature), and only a few studies were adjusted with a second air pollutant; (ii) the findings did not always agree with each other across studies; and (iii) negative/protective effects were not reported. Of eleven studies in Table A in S1 File investigating associations between fine particulate matter (PM2.5) and MI, three studies found no associations [1,12,21], one of which included a very large sample of 452,343 MI cases [1]; and eight studies reported associations [3–5,7,15,18,19,23].
We undertook an exploratory study examining associations between air pollutants and acute myocardial infarction (AMI) hospital admission in Alberta, Canada. Of the various ways described above in which these investigations can be confounded, we attempted to address issues of (i) adjustment uncertainty, or lack of adjustment for co-pollutants commonly present in the atmosphere and for other meteorological variables (e.g., wind speed) [33,37], and (ii) regression model uncertainty [38]. For this we developed a three-step procedure: nonparametric univariate (simple model) testing, multivariate logistic regression analysis fully adjusted for co-pollutant and meteorological variables, and bootstrap model averaging. The multivariate analysis was used in an effort to address adjustment uncertainty, whereas the bootstrap technique was used as a mechanism to account for uncertainty in our regression model. We applied this procedure to data in Alberta spanning the period April 1, 1999 to March 31, 2010, searching a large number of candidates for potential associations between air pollutants and AMI hospital admission.

Health Administrative Data
Using Alberta Health administrative databases, a province-provider-registry system in Alberta, we obtained all de-identified historical patient records with a primary diagnosis code of AMI: International Classification of Diseases, version 10 (ICD-10), codes I21-I22, or version 9 (ICD-9), code 410. The resulting cohort of 25,894 patients was defined as patients who had their first AMI admission event between April 1, 1999 and March 31, 2010; were aged 20 or over and resident in Alberta during the AMI event period; lived within 15 km of the closest effective air pollution monitoring station in Alberta; and lived within 50 km of the closest effective meteorological monitoring station in Alberta.
Sex and age (at the start date of an AMI hospitalization event) were used to define four sub-cohorts (Male and Female, and Agecat1 and Agecat2, corresponding to patients aged <65 and patients aged ≥65, respectively). Patients in the main cohort or in one of the four sub-cohorts were further divided into subgroups defined by AMI type or comorbidities, including: all patients in the cohort or sub-cohort, patients with NSTEMI, patients with STEMI, patients with hypertension, patients with diabetes, patients with dysrhythmia, and patients with a prehistory of heart disease. Table 1 lists the sample size for each of these groups.

Air Pollutant and Meteorological Data
Air pollution data for Alberta were obtained from the Environment Canada National Air Pollution Surveillance (NAPS) database [39] for the January 1999 to December 2010 period and linked to patients in the cohort. The NAPS database contains quality-assured data compiled by Environment Canada for air monitoring stations across Canada. Station locations in Alberta are shown in Fig A in S1 File (left panel). Hourly records of five criteria air pollutants from a total of 65 monitoring stations in Alberta were available during the study period: carbon monoxide (CO, with 14 stations), nitric oxide (NO, with 51 stations), nitrogen dioxide (NO2, with 51 stations), ozone (O3, with 41 stations), and particulate matter with an aerodynamic diameter ≤2.5 μm (PM2.5, with 40 stations). We were unable to use sulfur dioxide (SO2) as a covariate because of limited availability of pollutant records. SO2 is recognized as an important criteria air pollutant in urban areas, and an earlier study reported associations with acute myocardial infarction hospital admissions in London [40]. However, our initial sample size before linkage with SO2 data was 25,895 hospitalizations (Table 1), and after linkage with available SO2 data from the NAPS database [39] we had only 18,011 hospitalizations (69.55% of the initial sample). This was too small to permit analysis of subgroups. Our main interest was in exploring the usefulness of the three-step procedure to address adjustment uncertainty and regression model uncertainty. From hourly concentration data for each of the other air pollutants we calculated and used five concentration variables to represent each air pollutant for each day: daily average (i.e., 24-hour average), 6-hour average for the hours 07:00 to 10:00 and 17:00 to 20:00, 12-hour average for the hours 07:00 to 19:00, daily 1-hour maximum and daily 1-hour minimum.
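The five daily concentration variables can be illustrated with a minimal Python sketch. It is not the authors' code (the analyses were run in SAS). It assumes 24 complete hourly readings per day and that index h holds the reading for the hour beginning at h:00, so that "07:00 to 10:00" covers hours 7, 8 and 9; the function and key names are hypothetical.

```python
from statistics import mean


def daily_concentration_variables(hourly):
    """Return the five daily summaries for one pollutant at one station.

    `hourly` is a list of 24 readings; index h is assumed to hold the
    reading for the hour beginning at h:00.
    """
    if len(hourly) != 24:
        raise ValueError("expected 24 hourly readings")
    # Morning and evening windows: 07:00-10:00 and 17:00-20:00 (six hours).
    peak_hours = list(range(7, 10)) + list(range(17, 20))
    # Daytime window: 07:00-19:00 (twelve hours).
    daytime_hours = range(7, 19)
    return {
        "avg_24h": mean(hourly),
        "avg_6h": mean(hourly[h] for h in peak_hours),
        "avg_12h": mean(hourly[h] for h in daytime_hours),
        "max_1h": max(hourly),
        "min_1h": min(hourly),
    }
```

Applied to each of the five pollutants, this yields the 25 NAPS variables used in the analysis.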
Daily meteorological data were obtained from the United States National Climatic Data Center (NCDC) [41]. NCDC provides historical daily meteorological records for air temperature (daily average, minimum and maximum temperature, in °C), daily average dew point temperature (°C), and daily average wind speed (in meters per hour). Historical records from 209 meteorological monitoring stations in Alberta were available for the study period. Locations of these stations in Alberta are shown in Fig A in S1 File (right panel). We used five variables to represent meteorological data for each day: minimum and maximum temperature, apparent temperature [42], average dew point temperature and average wind speed.
We adopted the following procedure to link patients in the cohort to air pollution data: (1) the latitude and longitude of both the patients' postal codes and the NAPS stations were used to calculate a matrix of distances from each postal code to each station; (2) for each postal code, a list of stations (up to 20) within 15 km was found and, if not empty, ordered from closest to farthest; (3) each patient was linked to a list of stations via postal code; (4) working through the list of stations from first to last, available NAPS records dated in the same month as the exposure date (defined as the 0th to 5th day before the onset day) were found for each patient; and (5) patients without NAPS records were eliminated. The linkage procedure was conducted separately for each pollutant because each had a different set of monitoring stations. Patients without records for any one of the five pollutants were eliminated. A similar strategy was used to link patients to meteorological data. After linkage to air pollution and meteorological data, some patients still had a small fraction of missing records for some air pollutant and meteorological variables. About 3.01% of patients had at least one missing record in the 25 NAPS variables that we used, and about 0.01% of patients had at least one missing record in the 5 NCDC meteorological variables that we used. A linear interpolation method was used for imputation of these missing records.
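The station-matching part of the linkage procedure can be sketched as follows. This is an illustrative Python sketch, not the authors' implementation: the text does not state how distances were computed, so the haversine great-circle formula is an assumption, and all function names and the `has_records` callback are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumed constant


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km (assumed distance measure)."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def candidate_stations(postal_latlon, stations, max_km=15.0, max_n=20):
    """Steps 1-2: station ids within max_km of a postal code, nearest first.

    `stations` maps a station id to its (latitude, longitude).
    """
    lat, lon = postal_latlon
    distances = [
        (haversine_km(lat, lon, s_lat, s_lon), station_id)
        for station_id, (s_lat, s_lon) in stations.items()
    ]
    nearby = sorted(pair for pair in distances if pair[0] <= max_km)
    return [station_id for _, station_id in nearby[:max_n]]


def first_station_with_records(station_ids, has_records):
    """Step 4: first station in the ordered list that has usable records.

    Returns None when no station qualifies (step 5: patient eliminated).
    """
    for station_id in station_ids:
        if has_records(station_id):
            return station_id
    return None
```

Because each pollutant has its own set of monitoring stations, such a lookup would be repeated per pollutant, with `stations` restricted to the stations that measure it.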

Ethics Statement
Ethical approval for the study was granted by the University of Alberta's Health Research Ethics Board-Health Panel (IREB Pro00010852). Patient records/information was anonymized and de-identified with a unique scrambled ID and released to us in this form by the ministry of health prior to analysis.

Statistical Analysis
We used a case-crossover design [43] with the kth day (k ranging from 0 to 5) before onset of an AMI hospitalization event as the case exposure period for a patient in the cohort. Reference periods were selected using a time-stratified reference-selection design [44,45]: the whole study period was stratified into calendar months, and all days in the same year and month that fell on the same weekday as the hazard exposure day were selected as controls. This strategy is reported as a preferred approach for minimizing confounding by time trend as well as overlap bias [46].
The following three-step procedure was used to search for associations. In Step 1 we initially searched a total of 5,250 candidate variables in a univariate analysis, defined by 5 cohorts, 7 subgroups, 5 pollutants, 5 types of daily pollutant concentrations and 6 different lag times (0- to 5-day lag), for potentially significant associations with AMI hospitalization. Searching was done one variable at a time using the nonparametric Wilcoxon test and a p-value ≤0.1 as the reference point to identify candidate variables for Step 2.
When estimating health effects from air pollution, it is important to properly adjust for all confounding factors, including exposure to co-pollutants [33,37]. In Step 2 a multivariate conditional logistic regression model was built for each candidate variable of interest. The model was fully adjusted for 20 pollution variables from the other pollutants and the 5 meteorological variables. A stepwise selection procedure was adopted to eliminate redundant variables; the critical levels for variable entry and for variable stay were both set at 0.25. Coefficients and odds ratios (ORs) were estimated for an interquartile difference (i.e., the difference between the 25th and the 75th percentile concentration) in the candidate variable of interest. A candidate variable was selected into Step 3 only if it had a p-value ≤0.05.
Model uncertainty is an important issue in the interpretation of air pollutant-acute health effect associations [38,47]. An important air pollutant-acute health event association should be replicable across multiple datasets. The bootstrap technique is a computer-intensive resampling method that provides a direct computational way of testing this assumption and assessing model uncertainty by repeated sampling from a set of data [48]. We used the bootstrap technique in Step 3 as a data perturbation method to create a set of 1,000 'similar data environments' and then performed the multivariate analysis described in Step 2 on each of them, i.e., 1,000 times. Medians from the 1,000 multivariate logistic regression models were used to represent central tendency for statistical parameters of interest. We reported the frequency (number of times) that a candidate variable of interest from Step 2 had a p-value ≤0.05 across the 1,000 multivariate logistic regression model replications as a simple measure of how robust/reliable an association is. Like Bayesian model averaging [35], bootstrap model averaging incorporates model uncertainty that results from searching through a set of candidate models, and results are obtained more easily than through Bayesian analysis [48]. In our case, model averaging was only performed on those variables identified as being potentially important after full adjustment for possible confounders. Further comparative analysis was employed to confirm some of the suggested robust associations found in the three-step procedure. All related analyses were conducted in SAS (release 9.3; SAS Institute, Cary, NC).
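The time-stratified reference selection described above can be sketched in a few lines of Python. This is an illustration rather than the authors' SAS code, and the function name is hypothetical. Given an onset date and a lag k, it returns the case exposure day and every other day in the same calendar month that falls on the same weekday.

```python
from datetime import date, timedelta


def case_and_reference_days(onset, lag):
    """Return (case_day, reference_days) for one AMI event.

    case_day is the lag-th day before onset; reference days are all other
    days in the same year and month that share its weekday.
    """
    case_day = onset - timedelta(days=lag)
    day = date(case_day.year, case_day.month, 1)
    references = []
    # Walk through the calendar month that contains the case day.
    while day.month == case_day.month:
        if day.weekday() == case_day.weekday() and day != case_day:
            references.append(day)
        day += timedelta(days=1)
    return case_day, references
```

For example, an onset on 16 March 2005 with a 1-day lag gives a case day of 15 March 2005 (a Tuesday) and reference days on 1, 8, 22 and 29 March 2005, so each case day has three or four controls depending on the month.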
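The size of the Step 1 search space follows directly from the five defining factors. The short Python sketch below enumerates it; the labels are illustrative names for the cohorts, subgroups and concentration measures described in the text, not identifiers from the authors' code.

```python
from itertools import product

# Illustrative labels for the factors that define a candidate variable.
COHORTS = ["All", "Male", "Female", "Agecat1", "Agecat2"]
SUBGROUPS = ["All", "NSTEMI", "STEMI", "Hypertension", "Diabetes",
             "Dysrhythmia", "HeartDiseaseHistory"]
POLLUTANTS = ["CO", "NO", "NO2", "O3", "PM2.5"]
MEASURES = ["avg_24h", "avg_6h", "avg_12h", "max_1h", "min_1h"]
LAGS = range(6)  # 0- to 5-day lag

# One tuple per candidate variable tested in the univariate search.
candidates = list(product(COHORTS, SUBGROUPS, POLLUTANTS, MEASURES, LAGS))
print(len(candidates))  # 5 * 7 * 5 * 5 * 6 = 5250
```

With this many tests at a screening threshold of 0.1, a substantial number of candidates would pass Step 1 by chance alone, which is part of the motivation for the stricter Steps 2 and 3.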
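As a worked illustration of scaling an estimate to the interquartile difference, the Python sketch below converts a regression coefficient (log odds per unit concentration) and its standard error into an OR and confidence interval per IQR increase. The Wald-type interval and the function names are assumptions made for illustration; the text does not state how the reported intervals were computed.

```python
from math import exp


def odds_ratio_per_iqr(beta, se, iqr, z=1.96):
    """OR and approximate 95% CI for an IQR increase in concentration.

    beta: log odds ratio per unit concentration
    se:   standard error of beta
    iqr:  75th minus 25th percentile concentration
    The Wald interval (beta +/- z * se) is an assumed construction.
    """
    point = exp(beta * iqr)
    lower = exp((beta - z * se) * iqr)
    upper = exp((beta + z * se) * iqr)
    return point, lower, upper


def percent_change(odds_ratio):
    """Express an OR as a percentage change in odds."""
    return (odds_ratio - 1.0) * 100.0
```

Under this convention an OR of 1.092 per IQR increase corresponds to a 9.2% elevation in risk, which is how effects are expressed in the Results.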
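The bootstrap frequency can be sketched generically in Python. This is a minimal illustration under stated assumptions, not the authors' SAS implementation: the resampling unit is taken to be the patient (with replacement), the Step 2 model fit is abstracted as a caller-supplied `fit` function returning an estimate and a p-value, and all names are hypothetical.

```python
import random
from statistics import median


def bootstrap_frequency(units, fit, n_boot=1000, alpha=0.05, seed=0):
    """Refit a model on bootstrap resamples and summarise the results.

    units: list of resampling units (assumed here to be patients)
    fit:   function taking a resampled list and returning
           (estimate, p_value) for the variable of interest
    Returns the median estimate, the median p-value, and the number of
    replications in which p_value <= alpha.
    """
    rng = random.Random(seed)
    estimates, p_values = [], []
    for _ in range(n_boot):
        # Draw len(units) units with replacement.
        resample = rng.choices(units, k=len(units))
        estimate, p_value = fit(resample)
        estimates.append(estimate)
        p_values.append(p_value)
    return {
        "median_estimate": median(estimates),
        "median_p_value": median(p_values),
        "frequency": sum(p <= alpha for p in p_values),
    }
```

A frequency close to `n_boot` indicates that the association survives perturbation of the data, whereas a low frequency indicates that significance in the single Step 2 fit depended on the particular sample.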

Results
There were a total of 25,894 hospital admission records for AMI (average of 6.45 hospital admission events per day). Fig 1 displays a summary of monthly average frequency of AMI hospitalizations and monthly average levels of the air pollutants at Alberta NAPS air monitoring stations over the study period. Seasonal trends are apparent for several of the air pollutants. Much higher (lower) NO and NO2 levels occur during winter (summer), which is opposite to the pattern for O3, which has lower (higher) levels during winter (summer). The highest monthly PM2.5 levels occur during the summer period (mid-June to mid-September). As indicated in Fig 1, the monthly frequency of AMI hospitalizations averaged over the study period did not vary substantially with monthly levels of NO2, NO and O3 averaged over the study period. While this figure suggests that monthly AMI hospitalizations are insensitive to monthly levels of NO2, NO and O3 averaged over the study period, the concentrations used to represent each of these pollutants in the analysis were related to various daily levels (i.e., 24-hour, 6-hour and 12-hour averages, and daily 1-hour maximum and 1-hour minimum levels).
After Step 1 univariate analysis, 192 of the 5,250 candidate variables had p-values ≤0.1 (Table B in S1 File). After Step 2 multivariate analysis with full adjustment, only 37 of these 192 variables had a p-value ≤0.05 (Table 2), reflecting that many of the variables exhibited only weak associations with AMI prior to full adjustment. Results from Step 3 bootstrap model averaging are shown in Table 3. The evidence for an effect, based on median p-value, weakened after bootstrapping for most of the variables (26 of 37) in Table 3. More importantly, 9 of 37 variables no longer had a p-value ≤0.05 after bootstrapping, illustrating the importance of controlling for model uncertainty. The frequency with which a variable had a p-value ≤0.05 in the 1,000 model replications is reported for each of the 37 variables in the last column of Table 3. The lowest frequency was 87; although that variable was statistically significant in Step 2, such a frequency is not at all suggestive of a robust finding.
We further identified those variables with positive associations and a bootstrap frequency over 700 as the most robust findings of the study (summarized in Fig 2). From these positive robust associations we observed: (1) only 1-day lag NO2 concentrations (6-, 12- or 24-hour average), but not those of CO, NO, O3 or PM2.5, were associated with an elevated risk of AMI hospitalization; (2) the evidence suggested that increased NO2 concentrations elevated the risk of hospitalization for NSTEMI, but not for STEMI; and (3) subgroups susceptible to increased NO2 concentrations included elders (age ≥65) and elders with hypertension. The most robust association (bootstrap frequency 935) was for the 6-hour average NO2 concentration 1 day before admission: an IQR increase of 34.2 μg/m³ was associated with a 9.2% elevation in risk of AMI hospitalization (95% CI 3.9% to 14.8%).
Effects of the five measures of NO2 (with 1-day lag) in the four subgroups defined by age categories (Agecat1 and Agecat2) and AMI type (STEMI and NSTEMI) are compared in Fig 3 for OR results estimated from bootstrapping (Step 3). NO2 concentration increases were mainly associated with NSTEMI rather than STEMI, and elders (age ≥65) were suggested to be more susceptible to increased NO2 concentration. Effects of the five measures of NO2 (with 1-day lag) in the four subgroups defined by age and hypertension status, with (HTN) or without (NO-HTN), are also compared in Fig 4 for OR results estimated from bootstrapping. Fig 4 further supports an association between NO2 concentration increases and AMI hospitalization among elders (age ≥65) with hypertension.
The largest air pollutant dataset used in our analysis was for NO2 (from 51 stations). Because of this, NO2 can be considered as a proxy air pollutant to assess spatial variation in exposure to the ambient air pollutant mixture. Our most robust air pollutant associations with AMI were for NO2, and this may be because NO2 is the most representative air pollutant for exposure assessment. To better understand whether our findings were sensitive to the size of the cohort, we undertook further analysis using two smaller distance categories, identifying hospitalized patients living within 5 km and within 10 km of the closest effective air monitoring station, and then compared the results to our original analysis of hospitalized patients living within 15 km of the closest effective air monitoring station. Specifically, using the same data we linked patients living within 5 km (and 10 km) of the closest effective air monitoring station to air pollution data and then performed our three-step procedure for each distance category cohort.
The 5-km distance category had the smallest cohort size (13,071 hospitalization records), while the 10-km distance category included another 9,127 records (total 22,198) (see Table C in S1 File), compared to the original 25,894 hospitalization records for all patients within 15 km of the closest effective air monitoring station (Table 1). Bootstrap model averaging results for cohorts in the 5-km and 10-km distance categories (see Table D in S1 File) indicated that sample size influenced our findings. The smallest cohort, within 5 km (13,071 hospitalization records), suggested positive associations between O3 and an elevated risk of AMI hospitalization with bootstrap frequencies over 700, whereas these associations were lost for the larger samples within 10 km and 15 km (22,198 and 25,894 hospitalization records, respectively). In addition, the only variable suggesting positive associations with bootstrap frequencies over 700 for the within-10-km sample (Table D in S1 File) was NO2, similar to what was found in the original analysis (Table 3). Whereas there was a large increment in sample size going from patients living within 5 km to patients living within 10 km of an effective air monitoring station (70% increase), there was only a small increment in sample size going from patients living within 10 km to our original cohort (<17% increase). Thus a similar finding of positive associations with NO2 with bootstrap frequencies over 700 for patients living within 10 km compared to our original analysis was expected.

Discussion
The most robust results in our study, after controlling for adjustment uncertainty and model uncertainty, suggest that only NO2, but not the other air pollutants investigated (CO, NO, O3 or PM2.5), was associated with elevated risk of NSTEMI hospitalization. These findings are consistent with a recent observational study of a very large sample of 452,343 MI cases [1] and several other studies [11,17–19,22]. The suggested finding that NSTEMI, not STEMI, is associated with increasing NO2 concentrations is contrary to the findings of others [3], who reported that ambient fine particulate matter (PM2.5) triggers STEMI, but not NSTEMI.
Our findings also suggest that elders (age ≥65) with hypertension are more susceptible. Despite numerous studies indicating elders to be more susceptible to increased air pollution [3,6,11,13,19,20,22,29], only one study [3] indicated that people with hypertension could be more susceptible to increased particle concentrations. To the best of our knowledge, no finding like ours, of elders with hypertension being at increased risk from NO2 pollution, has been reported in the literature. Because studies of air pollution effects on MI in people with hypertension are unusual, it is hard to compare our results with previous research. It is also unclear what mechanisms of action may be behind this suggested effect. Finally, our study did not see evidence that people aged <65 years or those with pre-existing diabetes, dysrhythmia or a prehistory of heart disease were susceptible to increased pollution.
We caution readers about preliminary findings suggesting associations of NO2 with elevated risk of NSTEMI hospitalization and AMI hospitalizations of elders (age ≥65) with hypertension. This was only an exploratory study and the emphasis was on application of a methodology to address adjustment and model uncertainty. There is clearly a need to replicate these preliminary findings using other approaches and/or different datasets in order to corroborate the suggested associations with NO2.
The general lack of a robust air pollution effect on risk of AMI (especially STEMI) in our analysis is not unexpected. Although a recent systematic review reported that most air pollutants were associated with increased short-term risk for MI [31], a previous review [49] indicated that less than half of the identified studies found clear evidence of raised MI risk from air pollutant exposure. Also, the fully adjusted associations in our analysis may be very different from those estimated with limited adjustment. For example, eight studies reported in Table A in S1 File associated increasing PM2.5 concentrations with increased risk for MI [3–5,7,15,18,19,23], whereas in our analysis, which included full adjustment and model averaging, the effects of PM2.5 were negative (Table 3). We fully agree with recommendations of others [33,37] about the importance of focusing on estimating health effects that are properly adjusted for all confounding factors. We also highlight the need to consider controlling for model uncertainty, as we found that 9 of 37 fully adjusted associations lost statistical significance after bootstrapping in our analysis.
Our exploratory study had a number of strengths that address important limitations in the investigation of subtle air pollution-acute health effect associations. We used large air pollutant and meteorological datasets consisting of multiple locations and 11 years of AMI hospitalization records collected throughout Alberta. We only considered patients living within 15 km of the closest effective air pollution monitoring station in Alberta. Most importantly, confounding from lack of adjustment and model uncertainty were examined by using fully adjusted models and bootstrap model averaging, and only the most robust findings were considered important.
Our study also had limitations. This was an ecological study with the exposure variables (air pollutants and meteorological variables) measured at central locations, and thus they do not represent actual exposures for AMI patients. Although each of the multivariate models was adjusted for possible air pollutant and meteorological confounders, because of data limitations we did not consider other potentially important time-varying factors such as SO2, special events (e.g., alcohol consumption, physical activity) or special drug usage just prior to the onset of AMI. Because SO2 may be an important criteria air pollutant in urban areas, further study of potential associations between air pollutants, including SO2, and AMI hospital admission is suggested.

Conclusions
Estimating health effects that are properly adjusted for all possible confounding factors and accounting for model uncertainty are important for making interpretations of air pollution–health effect associations. The most robust statistical associations in our analysis were suggested for increasing 6-, 12- or 24-hour average concentrations of NO2 with 1-day lag and hospitalization for NSTEMI in Alberta. In addition, elderly people with hypertension were suggested to be at increased risk. As this was only an exploratory study, there is a need to replicate these findings with other methodologies and datasets.