Genetic Variants in the Bone Morphogenic Protein Gene Family Modify the Association between Residential Exposure to Traffic and Peripheral Arterial Disease

There is a growing literature indicating that genetic variants modify many of the associations between environmental exposures and clinical outcomes, potentially by increasing susceptibility to these exposures. However, genome-scale investigations of these interactions have rarely been performed, particularly for air pollution exposures. We performed race-stratified genome-wide gene-environment interaction studies in European-American (EA, N = 1623) and African-American (AA, N = 554) cohorts to investigate the joint influence of common single nucleotide polymorphisms (SNPs) and residential exposure to traffic (“traffic exposure”), a recognized vascular disease risk factor, on peripheral arterial disease (PAD). Traffic exposure was estimated via the distance from the primary residence to the nearest major roadway, defined as the nearest limited-access highway or major arterial. The rs755249-traffic exposure interaction was associated with PAD at a genome-wide significant level (P = 2.29x10^-8) in European-Americans. Rs755249 is located in the 3’ untranslated region of BMP8A, a member of the bone morphogenic protein (BMP) gene family. Further investigation revealed several variants in BMP genes associated with PAD via an interaction with traffic exposure in both the EA and AA cohorts; these included interactions with non-synonymous variants in BMP2, which is regulated by air pollution exposure. The BMP family of genes is linked to vascular growth and calcification and is a novel gene family for the study of PAD pathophysiology. Further investigation of BMP8A using the Genotype Tissue Expression Database revealed that multiple variants with nominally significant (P < 0.05) interaction P-values in our EA cohort were significant BMP8A eQTLs in tissue types highly relevant for PAD, such as rs755249 (tibial nerve, eQTL P = 3.6x10^-6) and rs1180341 (tibial artery, eQTL P = 5.3x10^-6).
Together these results reveal a novel gene, and possibly gene family, associated with PAD via an interaction with traffic air pollution exposure. These results also highlight the potential for interaction studies, particularly at the genome scale, to reveal novel biology linking environmental exposures to clinical outcomes.


Introduction
Given the more than 255 million registered highway vehicles in the United States [1] (http://www.bts.gov/publications/national_transportation_statistics/html/table_01_11.html), traffic-related air pollution is a ubiquitous environmental exposure. Air pollution in general, and traffic-related air pollution in particular, is associated with adverse cardiovascular disease outcomes, including peripheral arterial disease (PAD) [2]. PAD is characterized by occlusive atherosclerosis in the peripheral arteries, principally the lower extremities, and affects approximately 4.6% of the population [3]. Complications from PAD include limb ischemia, infection, gangrene, and peripheral limb amputation, and PAD is a predictor of both all-cause and cardiovascular mortality [4]. Often estimated via the distance between a primary residence and the nearest higher-use roadway, residential exposure to traffic-related air pollution ("traffic exposure") is associated with increased circulating angiogenic cells [5], PAD [6,7], deep vein thrombosis [8], incident coronary heart disease [9], and mortality [10,11].
PAD has a strong genetic component [12,13], and studies show that gene-environment interactions play a role in cardiovascular disease risk [14]. These gene-environment interactions can arise from a number of biological models. Ottman outlined five models that together encompass the possible biological underpinnings of gene-environment interactions, along with observed examples for each model [15]. All of these models could manifest as a traditional statistical multiplicative interaction. Perhaps the most relevant model for air pollution is her "model B", in which a genotype exacerbates the effect of a risk factor on a clinical outcome, e.g. genotypes exacerbating the effect of traffic air pollution on PAD. This interaction model has previously been observed for interactions between air pollution and GSTM1 variants across a number of clinical outcomes [16,17]. However, despite a clear biological basis for gene-environment interactions and several observed gene-air pollution interactions, few to no genome-wide interaction studies have been done.
To date the study of gene-environment interactions and PAD has been limited. In a 2008 study, a genetic variant in a gene cluster linked to smoking behavior was also linked to PAD and lung cancer [18]. However, there have been no genome-scale efforts to estimate the joint effect of genetic variants and air pollution exposure on PAD, or even vascular disease in general. In this study, we examined the joint impact of traffic exposure and genetic variants on PAD risk at a genome-wide scale within the CATHeterization GENetics (CATHGEN) biorepository [19]. Our aim was to advance the understanding of PAD pathogenesis by using a genome-wide interaction study (GWIS) to analyze single nucleotide polymorphism (SNP)-traffic exposure interactions and thereby identify novel genes associated with PAD pathogenesis.

Study design
The CATHGEN cohort is a large sample and data biorepository of consenting patients receiving services at the Duke University Cardiac Catheterization Laboratory. A complete description is provided elsewhere [19]. Briefly, collection of the samples began in 2001 and was finished in 2011, with 9,334 unique patients enrolled over the 10-year period. In addition to the Health and Physical examination, demographic characteristics and peripheral blood were collected for subsequent analyses. The Duke University Institutional Review Board approved the collection and all subsequent analyses of the CATHGEN cohort. Clinical data were obtained from the Health and Physical examination performed by a clinician prior to the catheterization procedure and supplemented by information from the medical record. The binary PAD variable indicated the presence or absence of PAD history, and it was collected during the Health and Physical examination. Clinical covariates separated by race for the CATHGEN cohort are presented in Table 1. Clinical covariates separated by PAD status are presented in S1 Table.

Residential Exposure to Traffic Assessment
The National Center for Geospatial Medicine, previously at Duke University and now at Rice University, performed assignment of residential geocodes using the patient addresses. Geocoded primary residential address information was obtained for a total of 8,071 CATHGEN participants, 7,158 of whom resided in North Carolina. We restricted all analyses to CATHGEN participants residing in North Carolina to enhance the homogeneity of the sample, for example by excluding individuals who may have traveled long distances for specialized treatment at Duke University. This restriction matches previous approaches taken with the CATHGEN cohort [20]. After geocoding the participants, we used the ArcGIS [21] software package to import both the patient locations and the locations of all primary and secondary roadways in North Carolina (Fig 1) and to calculate the perpendicular distance between each primary residence and the nearest primary (A1) or secondary (A2) roadway, as defined by the North Carolina Department of Transportation [22]. Primary roadways were defined as limited-access highways with interchanges, while secondary roadways were inter- and intra-city arterials that had multiple lanes and potentially at-grade intersections. This definition is consistent with the definition used for the Master Address File/Topologically Integrated Geographic Encoding and Referencing Feature Class Code employed by the U.S. Census Bureau [23]. This distance-to-nearest-roadway measure of traffic exposure strongly correlates with exposure to particulate matter generated by traffic [24] and is associated with health outcomes [7,8,25]. Full details of the geocoding, the restriction to individuals residing in North Carolina, and the calculation of traffic exposure via the distance between the primary residence and nearest major roadway have been previously described [26].
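The distance calculation described above was carried out in ArcGIS. As a rough illustration of the underlying geometry only, the following Python sketch computes the shortest planar distance from a residence to a set of straight roadway segments. It assumes coordinates are already projected to a planar system in meters; the function names and the toy coordinates are hypothetical and are not taken from the study.

```python
import math

def point_to_segment_distance(px, py, ax, ay, bx, by):
    """Shortest planar distance from point P to segment AB (same units as inputs)."""
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: A and B coincide.
        return math.hypot(px - ax, py - ay)
    # Projection of P onto the line through A and B, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    cx, cy = ax + t * dx, ay + t * dy
    return math.hypot(px - cx, py - cy)

def distance_to_nearest_roadway(residence, road_segments):
    """Minimum distance from a residence (x, y) to any segment (ax, ay, bx, by)."""
    px, py = residence
    return min(point_to_segment_distance(px, py, *seg) for seg in road_segments)

# Hypothetical toy data: two roadway segments and one residence, in meters.
roads = [(0.0, 0.0, 1000.0, 0.0), (0.0, 500.0, 0.0, 1500.0)]
nearest = distance_to_nearest_roadway((300.0, 400.0), roads)
```

Real roadway layers are polylines in geographic coordinates, so a production workflow would project the data first and split each polyline into segments; the sketch skips both steps.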

Genotyping
Genotyping was performed on 3,512 CATHGEN participants using the Illumina HumanOmni1-Quad_v1-0_C array. The selection of patients for genotyping was done irrespective of their geocoded location and yielded a total of 2,177 individuals (1623 European-Americans (EA) and 554 African-Americans (AA)) residing in North Carolina and possessing genome-wide genotype data (Table 1C). Quality control was performed prior to all analyses and matched previous quality control for race-stratified genome-wide association studies performed in CATHGEN [27]. The quality control included removal of related individuals, low-quality genotypes, SNPs with a call frequency < 98%, individuals with a call rate < 98%, and individuals whose genotypic gender did not match the recorded self-reported gender. Genome-wide interaction study analyses were restricted to those SNPs with a minor allele frequency (MAF) greater than 0.05. A total of 905,956 variants passed QC in at least one of the two race-stratified cohorts and were thus available for analysis.
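The SNP-level thresholds above (call frequency of at least 98% and MAF greater than 0.05) can be sketched as a simple filter. The Python below is an illustrative sketch, not the authors' pipeline: it assumes genotypes are coded as 0, 1, or 2 copies of one allele with None for a missing call, and the function names are hypothetical.

```python
def call_rate(genotypes):
    """Fraction of samples with a non-missing genotype call."""
    called = [g for g in genotypes if g is not None]
    return len(called) / len(genotypes)

def minor_allele_frequency(genotypes):
    """MAF from 0/1/2 allele counts, ignoring missing calls."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    freq = sum(called) / (2 * len(called))
    return min(freq, 1.0 - freq)

def snp_passes_qc(genotypes, min_call_rate=0.98, min_maf=0.05):
    """Keep SNPs with call rate >= 98% and MAF strictly greater than 0.05."""
    return (call_rate(genotypes) >= min_call_rate
            and minor_allele_frequency(genotypes) > min_maf)

# Hypothetical toy SNPs for 100 samples each.
snp_common = [0] * 80 + [1] * 18 + [2] * 2      # MAF = 22 / 200 = 0.11
snp_missing = [0] * 97 + [None] * 3             # call rate = 0.97
snp_rare = [0] * 96 + [1] * 4                   # MAF = 4 / 200 = 0.02
```

The other quality control steps listed above (relatedness, per-individual call rate, gender mismatch) operate on individuals rather than SNPs and are not covered by this sketch.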

Statistical methods
All statistical analyses were conducted using the R statistical package [28]. The statistical analysis consisted of three stages (Fig 2). The first stage was a race-stratified analysis of the European-American (EA) and African-American (AA) cohorts. Case-control logistic regression was used to calculate the odds ratio for the SNP-traffic exposure interaction term, and a Score test [29] was used to calculate the significance of this odds ratio. An additive genetic model was used for all analyses, with a multiplicative interaction for the SNP-traffic exposure term. For the traffic exposure measure, the distance between the primary residence and nearest roadway was scaled to the inter-quartile range as done for previous analyses [26], and for both the AA and EA cohorts the model was adjusted for age, sex, and principal components calculated using Eigenstrat [30] to remove racial substructure. Based on previous genome-wide association studies done with EA and AA within CATHGEN [27] we used principal components for the EA cohort and two principal components for the AA cohort. A clinical covariate-adjusted model that added body mass index and binary indicators for hypertension, smoking, diabetes, and dyslipidemia was also used. However, the addition of these covariates did not substantially affect the interaction odds ratio as compared to the previous model, and thus results from the more parsimonious age, sex, and racial principal components model were considered the primary results. Results from the clinical covariate-adjusted model for the interactions with P < 1x10^-4 are presented in S2 Table.
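To make the model structure above concrete, the following Python sketch scales distance by the inter-quartile range and builds one row of a design matrix containing the additive SNP term, the exposure term, and their product. It is only an illustration under stated assumptions: the quartile method ("inclusive"), the column order, and the function names are choices made here, and the actual analyses were run in R.

```python
import statistics

def interquartile_range(values):
    """IQR using inclusive quartiles (an assumed choice of quantile method)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1

def design_row(dosage, distance_m, distance_iqr, age, sex, pcs):
    """Covariate row for one participant in a SNP-by-exposure logistic model.

    dosage: minor-allele count (0, 1, 2) under an additive genetic model.
    The fourth column is the multiplicative SNP x exposure interaction term.
    """
    exposure = distance_m / distance_iqr
    return [1.0, dosage, exposure, dosage * exposure, age, sex, *pcs]

# Hypothetical distances (meters) and one hypothetical participant.
distances = [0.0, 250.0, 500.0, 750.0, 1000.0]
iqr_m = interquartile_range(distances)
row = design_row(dosage=2, distance_m=1000.0, distance_iqr=iqr_m,
                 age=60, sex=1, pcs=[0.01, -0.02])
```

With this scaling, the exponentiated coefficient on the interaction column is the multiplicative change in the per-IQR exposure odds ratio for each additional copy of the minor allele.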
A robust Score test was used to calculate the significance of the gene-environment interaction term. The quantile-quantile plot revealed inflation in the AA GWIS; thus, further adjustment of these results was performed using the genomic control F-test (GCF), whereby we generated new P-values based on the chi-squared statistic from the Score test. GCF is the method recommended for large-scale analyses where a small P-value is required [32]. We used 103,196 randomly selected SNPs to calculate the median of the Score test statistic, which provided a better adjustment than the mean of the Score statistic. The median of the Score statistics was then used to calculate the inflation in the AA GWIS (λm = 1.22). The Score test statistic was then adjusted by λm, and an F-test (F(1,100)) was used to calculate the new P-value for each interaction, which is presented in the tables. To account for multiple testing of 905,956 interactions (one for each SNP) we used the conventional genome-wide significance threshold of P < 5x10^-8. We defined suggestive interactions as those with P < 1x10^-5, matching the threshold used in a previous GWIS [33]; a nominal P-value of P < 0.05 was used for replication of results between the race-stratified GWIS and for examinations of candidate genes uncovered by each GWIS.
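The median-based inflation factor can be illustrated with a short Python sketch. This shows the standard genomic control calculation (median test statistic divided by the median of a 1-df chi-squared distribution) followed by a plain chi-squared P-value; it is a simplification and does not reproduce the F(1,100) step of the GCF procedure used in the study. The function names and toy statistics are hypothetical.

```python
import math
import statistics

# Median of a chi-squared distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364

def genomic_inflation(chi2_stats):
    """Median-based inflation factor (lambda) from 1-df test statistics."""
    return statistics.median(chi2_stats) / CHI2_1DF_MEDIAN

def chi2_1df_pvalue(stat):
    """Upper-tail P-value of a 1-df chi-squared statistic."""
    return math.erfc(math.sqrt(stat / 2.0))

def gc_adjusted_pvalue(stat, lam):
    """P-value after dividing the statistic by lambda.

    Lambda below 1 is not applied (a common convention, assumed here).
    """
    return chi2_1df_pvalue(stat / max(lam, 1.0))

# Hypothetical null statistics whose median implies lambda of about 1.22.
lam = genomic_inflation([0.2, 0.555, 1.5])
raw_p = chi2_1df_pvalue(30.0)
adj_p = gc_adjusted_pvalue(30.0, lam)
```

Dividing by an inflation factor above 1 always shrinks the statistic, so adjusted P-values are larger than the unadjusted ones.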
The last two stages of the analysis used METAL [31] to conduct a fixed-effect inverse-variance weighted meta-analysis of the most significant race-stratified GWIS results and a meta-analysis of all available interactions. For the meta-analyses the MAF cutoff was relaxed so that variants with a MAF above the 0.05 cutoff in at least one of the two cohorts would still be analyzed. Thus the second stage of the analysis was an examination of the suggestive race-stratified GWIS results after meta-analysis. The third stage of analysis was an examination of the results from the meta-analysis of all interactions to identify consistent results, in terms of the same direction of effect, across the race-stratified GWIS. After performing these three stages, the results were checked for their statistical stability and biological significance. We considered statistical stability by examining plots of the fitted residuals and estimated probabilities for each logistic regression analysis. Potential outliers were identified and removed, and the analyses were rerun. An order of magnitude or greater change in the odds ratio or P-value after removal of outliers was considered evidence of statistical instability, and statistically unstable results were not considered further. The biological significance of each SNP involved in an interaction was investigated by examining its annotation to known genes and the potential regulatory function of the sequence surrounding the SNP, e.g. alteration of CpG sites important for regulation via methylation or location in open chromatin regions. Information from the NCBI dbSNP database [34] and information on DNaseI hypersensitivity sites from the ENCODE project [35] were used to annotate variants with their biological significance. Data on DNaseI hypersensitivity sites from the Duke University contributions to the ENCODE project are summarized at http://dnase.genome.duke.edu/ [36]. We used the Genotype Tissue Expression database Release V6 (GTEx) to determine whether any variants found were known expression quantitative trait loci (eQTL) [37-39]. From the GTEx resource we report all single-tissue cis-eQTLs with a q-value < 0.05 [37,40].
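The fixed-effect inverse-variance weighted estimate used in the meta-analysis stages can be written out directly. The Python sketch below is a generic illustration of that estimator and of the direction-of-effect check, not a reproduction of METAL itself; the function names and the input values are hypothetical.

```python
import math

def fixed_effect_meta(betas, ses):
    """Inverse-variance weighted fixed-effect meta-analysis.

    betas: per-cohort log odds ratios; ses: their standard errors.
    Returns (pooled beta, pooled SE, z, two-sided P-value).
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal P-value
    return beta, se, z, p

def consistent_direction(betas):
    """True when all cohort effects share the same sign."""
    return all(b > 0 for b in betas) or all(b < 0 for b in betas)

# Hypothetical interaction log odds ratios for two cohorts.
beta, se, z, p = fixed_effect_meta([0.5, 0.5], [0.1, 0.1])
same_sign = consistent_direction([0.5, 0.3])
```

Because each weight is the reciprocal of a squared standard error, the larger cohort typically dominates the pooled estimate, which is worth keeping in mind when the two cohorts differ in size as much as the EA and AA cohorts do here.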

Results
For the 6,066 individuals residing in North Carolina for whom clinical data were also available, we calculated traffic exposure according to the procedure defined in the Methods section (Table 1B). In these individuals an interquartile range (IQR) decrease in the distance to major roadways (641 meters) was significantly associated with a decrease in PAD prevalence in a race- and sex-adjusted logistic regression model (OR = 0.88, CI: 0.77-1.00, P = 0.044), which is consistent with results from previous studies [6,8]. For subjects on whom we also performed genome-wide genotyping via the Illumina HumanOmni1-Quad_v1-0_C array, we investigated SNP-by-traffic exposure interactions associated with PAD in the EA (N = 1623) and AA (N = 554) cohorts (Table 1C). The QQ-plots for the EA and corrected AA GWIS did not reveal significant genomic inflation (Fig 3). For the EA GWIS, rs755249, located in the 3'-untranslated region (UTR) of BMP8A, achieved genome-wide significance (P = 2.3x10^-8, Table 2A). As the AA MAF did not meet the > 5% cutoff (AA MAF = 0.04, EA MAF = 0.24), the rs755249-traffic exposure interaction was not considered stable in the race-stratified AA GWIS. In addition to the single genome-wide significant interaction there were 24 additional suggestive interactions in the EA GWIS (S3 Table), of which the top 10 ranked are given in Table 2A. Fourteen of the 25 variants with interaction P < 1x10^-5 were within 1 Mb of BMP8A; these included six intronic and two missense SNPs in MACF1. The strong linkage disequilibrium (LD) across BMP8A and MACF1 in EA individuals limited our ability to identify independent signals within the region using only the EA cohort (Fig 4). The BMP8A region had the typical "candlestick" pattern often observed for significant variants in genome-wide association studies. All interactions with P < 1x10^-4 in the primary model are given in S3 Table. To place the interactions in the context of the genetic main effect and environmental effect, the odds ratios, standard errors, and P-values for all three terms (interaction, genetic main effect, and traffic exposure) from the primary model are given in S4 Table for both the EA and AA GWIS.
None of the AA results achieved genome-wide significance after correcting for genomic inflation and statistical instability (Table 2B). The most significant, stable SNP-traffic exposure interaction was with rs634138 (P = 7.67x10^-7), located in an intergenic region of chromosome 2. The nine suggestive AA interactions are presented in Table 2B after removing four interactions that were statistically unstable. Of the remaining nine suggestive interactions, six were with variants located in introns, and three were with intergenic variants. Examination of the intergenic SNPs, rs634138 (P = 7.67x10^-7), rs2989314 (P = 2.90x10^-6), and rs7787478 (P = 7.06x10^-6), revealed that only rs2989314 was located near genes (pseudo-genes KRT18P2 and RPS3AP1). Examination of DNaseI hypersensitivity sites from multiple tissues via http://dnase.genome.duke.edu/ [35] did not indicate that any of the three variants were in a putative regulatory region as defined by DNaseI hypersensitivity sites. Of the suggestive EA interactions, six had a consistent direction of association in the AA GWIS, but none of them replicated (P < 0.05) in the AA GWIS (Table 2, S5 Table). Of the nine suggestive AA GWIS interactions, two were consistent in the EA, with the intergenic variant rs2989314 narrowly missing the P-value cutoff for replication (EA GWIS P = 0.08).
In stage two of the analysis, to more formally examine the consistency of the suggestive variants from the race-stratified GWIS analyses, we performed a meta-analysis of the EA and AA GWIS interactions with P < 1x10^-5. As many variants differed in allele frequency between the EA and AA cohorts, we removed the MAF cutoff for this analysis. Table 3A gives the ten most significant interactions from the meta-analysis of the suggestive race-stratified GWIS results, with the full list given in S6 Table.

Table 3. Meta-analysis results for suggestive (P < 1x10^-5) race-stratified interactions (a) and all interactions (b). For the meta-analysis, variants with a MAF < 0.05 in one of the two cohorts were allowed. For Table 3a only the 10 most significant results are presented, with the full results appearing in S6 Table. The column Consistent gives whether the effect was consistent after aligning the results so that both cohorts had the same effect allele (Effect Allele). The race-stratified odds ratios (OR (EA) and OR (AA)) are given relative to the minor allele for each race. doi:10.1371/journal.pone.0152670.t003
In the meta-analysis of all interactions, no meta-analysis result reached genome-wide significance; however, there were five suggestive interactions. Examination of the suggestive meta-analysis results independent of their association in either race-stratified GWIS revealed an additional BMP8A SNP (rs710913, meta-analysis P = 6.70x10^-6, 3'-UTR BMP8A, Table 3B). Of the remaining four meta-analysis associations three were intronic and rs12024301 was in an intron of RGL1 (Table 3B).
In addition to examining genes belonging to the BMP family, we examined additional interactions in MACF1 and PABPC4, two genes near BMP8A with strong interaction signals; however, no additional interaction in these genes achieved P < 0.05 in either race-stratified GWIS.

Discussion
In this first genome-wide gene-environment interaction study for PAD we have uncovered several suggestive interactions and one genome-wide significant interaction, indicating that a spectrum of genetic variants modify the association between PAD and residential exposure to traffic. The genome-wide significant interaction was found in the EA GWIS with a variant located in BMP8A, and this region was highly represented among the suggestive interactions. Genes in the BMP family are regulators of muscle mass [45], are involved in endothelial signaling pathways [46,47], and affect vascular smooth muscle cell progression [48]. They promote vascular, aortic, and smooth muscle cell calcification [49-52], and are associated with atherosclerosis [50,53] and angiogenesis [54,55]. To date BMP8A has been primarily implicated in spermatogenesis and development of the epididymis [43,44]. Methylation is proposed to be an important regulator of BMP8A [56,57]. Genetic variants in BMP8A were significant eQTLs in a variety of tissues, perhaps most prominently tibial nerve, where five variants in BMP8A were associated with BMP8A expression (Table 4). Two of the six BMP8A variants examined in the GTEx database, rs3738676 and rs755249, were most prominently eQTLs in tibial artery tissue, a tissue highly relevant for PAD. The other four variants examined were additionally often strongly associated as cis-eQTLs in non-reproductive tissue types (S8 Table). Taken together, we conclude that while BMP8A expression is significantly regulated by non-coding variants in reproductive tissue, BMP8A expression is also significantly regulated in a variety of other tissues that point to novel functions of this gene. Additionally, variants annotated to BMP8A may broadly regulate cis-genes in a variety of tissue types. Further research on this genetic locus is needed to fully elucidate the role of BMP8A and the potentially regulatory variants.
In our analysis, the five most significant BMP8A SNPs in the EA cohort altered a CpG site (Table 5), with rs755249 interrupting a CpG dimer (CGCG -> CGCA) and rs710913 removing a CpG site (CGGA -> TGGA). Given the evidence that traffic-related air pollution alters DNA methylation status [58-60], the important role that epigenetics and methylation play in the vascular endothelium [61], and the link between DNA methylation and vascular diseases [62-64], it is reasonable to speculate that the causal pathway linking traffic exposure, BMP8A, and PAD runs through DNA methylation events.
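The CpG changes quoted above can be checked mechanically by counting CG dinucleotides in the reference and alternate sequence contexts. The Python sketch below does this for the two four-base contexts given in the text; the function names are hypothetical, and only the forward strand as written is examined.

```python
def count_cpg(sequence):
    """Number of CG dinucleotides in a DNA sequence (forward strand only)."""
    seq = sequence.upper()
    return sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == "CG")

def cpg_change(ref_context, alt_context):
    """Change in CpG count when the reference context becomes the alternate."""
    return count_cpg(alt_context) - count_cpg(ref_context)

# Sequence contexts as quoted in the text for the two BMP8A variants.
rs755249_change = cpg_change("CGCG", "CGCA")   # CpG dimer interrupted
rs710913_change = cpg_change("CGGA", "TGGA")   # single CpG removed
```

Both variants lose exactly one CG dinucleotide under this count, which matches the description of an interrupted dimer and a removed site.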
In previous studies, BMP2 gene expression increased in vascular endothelial cells after exposure to black carbon, a by-product of incomplete combustion in internal combustion engines [42]. BMP2 is associated with calcification in vascular cells [48] and may mediate the effects of estrogen-related receptor γ on vascular calcification [52]. Thus, our study adds to the growing body of evidence linking BMP2, air pollution, and vascular dysregulation. The other BMP genes have a variety of functions related to bone and vascular growth; however, none of the other BMP family genes showed a level of association near that of BMP2 or BMP8A, and only BMP2 is linked to air pollutants specifically associated with both traffic-related air pollution and vascular phenotypes. BMP2 was also the only BMP gene to have coding variants associated with PAD via an interaction with traffic exposure; both a synonymous (rs1049007, P = 0.01) and a missense (rs235768, R → S, P = 7.27x10^-3) variant showed nominal association in the EA cohort. While neither of these variants was an eQTL for BMP2 in GTEx, we believe that further research is warranted given the possible dependence on air pollution exposure.
In addition to BMP8A variants, we also observed interactions with SNPs in the nearby genes MACF1 and PABPC4. These two genes in combination with BMP8A bridge an extended locus with significant LD in the EA cohort (Fig 4). An examination of the LocusZoom [41] plots reveals that the P-values in the EA cohort correlate strongly with the LD with our most significant variant, rs755249. This would fit a single-locus hypothesis in which associations in this genetic region are due to LD with a single causal locus that is associated with PAD via an interaction with traffic exposure, and rs755249 is a marker variant for that locus. This hypothesis is supported by the complete lack of interactions in MACF1 or PABPC4 with even a nominal level of significance, i.e. P < 0.05, in the AA GWIS, where the LD was much lower, while we observed an additional interaction with P < 0.05 in the AA GWIS in BMP8A (Table 3B).
For the AA GWIS there were relatively few suggestive interactions, indicative of the lower power due to the smaller sample size. Among the suggestive interactions was rs2161716, located in an intron of WWOX. A previous genome-wide interaction study of gene-smoking interactions associated with coronary artery calcification replicated an interaction with a variant in WWOX [33]. It was hypothesized that this could be due to the involvement of WWOX and other replicated interactions in inflammatory processes mediated by the NF-κB pathway. As the NF-κB pathway mediates bone remodeling [65], this adds to our evidence that variants more typically associated with bone regulation may be involved in vascular disease pathogenesis via interactions with environmental exposures. Of the remaining suggestive AA interactions none have been previously associated with vascular disease. UTRN is a target for microRNA-206, which has been associated with skeletal muscle development [66], and microRNAs are known to be associated with air pollution exposure [67]. HABP2 is a hyaluronan binding protein associated with acute lung injury [68], and inflammation from lung injury is hypothesized to be a causal mechanism linking air pollution exposures and vascular disease [2]. Additionally, HABP2 is sometimes called Factor VII-activating protein and is involved in the activation of Factor VII and fibrinolysis [68]. Several hemostatic factors have been associated with PAD [69], implying a potential role for HABP2 in PAD pathogenesis, potentially mediated or modified by air pollution exposure as indicated by our results.
There are some limitations of this study related to sample size, exposure bias, statistical instability, and generalizability. The primary limitation is sample size. Relative to estimated sample sizes for interaction studies, both of these GWIS are small; this likely limited our ability to find more than one genome-wide significant variant. To combat this, we used a Score test rather than the traditional Wald test to determine associations. The Score test is the most powerful asymptotic test under many conditions, and thus increased our power to detect significant results [70].
Although distance to roadways is a well-utilized and recognized measure in the air pollution epidemiology literature [25,26,71] and is associated with PAD in our cohort and others [6], it remains a composite measure of all components of traffic-related pollution: particulate, gaseous, and even noise pollution. Thus, in this study, it is impossible to identify the specific causal components of traffic exposure. Nevertheless, we believe that this limitation is mitigated by the robust nature of this measure. Additionally, distance is measured in the same manner and with the same error for all individuals no matter their residential location, minimizing potential bias due to differential exposure assessment methods and measurement accuracies. We were able to maximize our sample size relative to other methods that can suffer a loss of sample size due to several factors, such as air pollution monitors not being active or measurements being incompatible due to differing measurement methods. Future studies might incorporate specific components of traffic-generated air pollution to determine the causal traffic exposure components.
The statistical stability of results is a concern for any study, particularly large unbiased studies such as this one. We addressed this issue by evaluating significant associations for statistical stability and removing unstable results. If our observed P-value or odds ratio changed by an order of magnitude or more after the removal of an apparent outlier, that result was not considered stable. We also removed results with standard errors greater than 10, as a standard error of that magnitude combined with a small P-value meant the estimated odds ratio was beyond the realm of what might be considered reasonable. At 6.3% (Table 1) the prevalence of PAD in our GWIS cohort is higher than the estimated 4.6% prevalence for the general population [3]. This slight enrichment of PAD cases is likely secondary to sampling bias from a group presenting for assessment of coronary vascular disease, to which PAD is related. This unique sampling nature of CATHGEN can limit generalizability. To address this, we evaluated the consistency of the associations across ethnicities and have highlighted several trans-ethnicity consistent interactions. Associations consistent across ethnicities may be more generalizable than ethnicity-specific associations. Nevertheless, it is important that our observations be replicated in other general population-based studies.
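The stability screen described above (an order-of-magnitude change in the odds ratio or P-value after outlier removal, or a standard error greater than 10) can be expressed as a small rule. The Python sketch below is one possible reading of that rule, not the authors' code: measuring the change on a log10 scale in either direction is an assumption, and the function name and example values are hypothetical.

```python
import math

def is_statistically_stable(or_before, p_before, or_after, p_after,
                            standard_error=None, max_se=10.0):
    """Apply the two stability rules to one interaction result.

    A result is flagged unstable when the odds ratio or P-value shifts by
    a factor of 10 or more after outlier removal, or when the standard
    error exceeds max_se. All odds ratios and P-values must be positive.
    """
    if standard_error is not None and standard_error > max_se:
        return False
    or_shift = abs(math.log10(or_after / or_before))
    p_shift = abs(math.log10(p_after / p_before))
    return or_shift < 1.0 and p_shift < 1.0

# Hypothetical results before and after removing an apparent outlier.
stable = is_statistically_stable(1.5, 1e-6, 1.4, 3e-6)
unstable_p = is_statistically_stable(1.5, 1e-6, 1.2, 2e-4)
unstable_se = is_statistically_stable(1.5, 1e-6, 1.5, 1e-6, standard_error=12.0)
```

Using a symmetric log-scale shift treats a tenfold strengthening of the signal after outlier removal as just as suspicious as a tenfold weakening.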

Conclusion
We observed that a decrease in the proximity from a primary residence to major roadways is associated with an increased prevalence of PAD, and this association is modified by genetic variants. The rs755249-traffic exposure interaction was associated with PAD at a genome-wide significant level in the EA cohort. Rs755249 had a very low minor allele frequency in the AA cohort (MAF = 0.04); thus, this interaction could not be replicated in the AA GWIS. In addition to rs755249, we observed multiple additional associations in BMP8A, including rs710913, the most significant SNP in the BMP gene family in our meta-analysis. Both rs755249 and rs710913 are potentially eQTLs for BMP8A and other nearby genes in tissues related to the vasculature and peripheral limbs (Table 4). An examination of other genes in the BMP gene family revealed important BMP2 interactions with traffic exposure. The potential for novel functions for BMP8A and for multiple genes in the BMP gene family to be associated with PAD via interactions with traffic exposure highlights the need for more experimental models and cohort analyses to confirm and expand upon these findings.

Fig 1. CATHGEN participant locations. The location of the North Carolina CATHGEN participants who were selected for these analyses. The locations have been randomized to a small degree to protect the identity of the individuals while maintaining the spatial structure and dependency in the data. doi:10.1371/journal.pone.0152670.g001

Fig 2. Flowchart displaying the analysis plan undertaken. In the first step, separate SNP-traffic exposure GWIS of the EA and AA cohorts were performed. Following this, suggestive (P < 1x10^-5) results were examined for replication via a meta-analysis of both cohorts using METAL [31]. Finally, step 3 involved an examination of suggestive results from a meta-analysis of all the interactions. doi:10.1371/journal.pone.0152670.g002

Fig 3. Manhattan and QQ plots for both the European-American (top) and African-American (bottom) cohorts. The red line in the Manhattan plots represents the Bonferroni significance level for 1,000,000 tests (P < 5x10^-8), while the blue line represents a suggestive P (P < 1x10^-5). doi:10.1371/journal.pone.0152670.g003

Fig 4. LocusZoom [41] plots of the BMP8A region. LocusZoom reveals strong LD between rs755249 in BMP8A and multiple SNPs in MACF1 among the European-Americans (top). This region has a much lower level of LD in the African-American cohort (bottom). doi:10.1371/journal.pone.0152670.g004

Figures demonstrating the interactions for SNPs rs755249 and rs710913 are included in S2 Fig and S3 Fig. Using the results from the AA GWIS we were able to narrow down the large LD block in the EA, which spanned the nearby genes MACF1 and PABPC4 (S1 Fig), and localize our interaction signal to BMP8A, as it was the only gene with a consistent interaction with P < 0.05 in both cohorts.

Table 1. Relevant clinical covariates for the CATHGEN clinical cohort are summarized below for the entire cohort, the air pollution study cohort, and the GWIS cohort. These clinical covariates are also stratified by race, European-Americans (EA) and African-Americans (AA). P-values were assessed via ANOVA for the continuous covariates of Age and BMI and via a Chi-squared test for the binary covariates Sex, Smoking, Diabetes, Hypertension, Dyslipidemia, and PAD. doi:10.1371/journal.pone.0152670.t001

Table 2. Associations with P < 1x10^-5 for the EA and AA GWIS. SNPs were restricted to a minor allele frequency (MAF) of > 0.05. Variants with a MAF < 0.05 in one of the ethnicities are listed with "< 0.05" in the MAF column. Blank cells for the odds ratio, standard error, and P indicate that the model did not converge or the MAF was less than 0.05. Four interactions were removed from Table 2b due to being statistically unstable. DHS = DNaseI hypersensitivity sites. Rs9409787 was in a site designated as a DNaseI hypersensitivity site in retinal endothelial cells. For the EA cohort only the top 10 results (25 total) are shown. The complete list of 25 variants involved in suggestive EA interactions appears in S5 Table. * = genome-wide significant. doi:10.1371/journal.pone.0152670.t002

Table 3a. Meta-analysis of suggestive EA and AA interactions (top 10 of 34)

Table 5. Sequence surrounding the five most significant BMP8A variants in the EA GWIS.