Adult onset asthma and interaction between genes and active tobacco smoking: The GABRIEL consortium

Background: Genome-wide association studies have identified novel genetic associations for asthma, but without taking into account the role of active tobacco smoking. This study aimed to identify novel genes that interact with ever active tobacco smoking in adult onset asthma.
Methods: We performed a genome-wide interaction analysis in six studies participating in the GABRIEL consortium, following two meta-analysis approaches based on 1) the overall interaction effect and 2) the genetic effect in subjects with and without smoking exposure. We performed a discovery meta-analysis including 4,057 subjects of European descent and replicated our findings in an independent cohort (LifeLines Cohort Study), including 12,475 subjects.
Results: First approach: 50 SNPs were selected based on an overall interaction effect at p < 10⁻⁴. The most pronounced interaction effect was observed for rs9969775 on chromosome 9 (discovery meta-analysis: ORint = 0.50, p = 7.63 × 10⁻⁵; replication: ORint = 0.65, p = 0.02). Second approach: 35 SNPs were selected based on the overall genetic effect in exposed subjects (p < 10⁻⁴). The most pronounced genetic effect was observed for rs5011804 on chromosome 12 (discovery meta-analysis: ORint = 1.50, p = 1.21 × 10⁻⁴; replication: ORint = 1.40, p = 0.03).
Conclusions: Using two genome-wide interaction approaches, we identified novel polymorphisms in non-annotated intergenic regions on chromosomes 9 and 12 that showed suggestive evidence for interaction with active tobacco smoking in the onset of adult asthma.


Introduction
Exposure to environmental tobacco smoke increases the risk of developing asthma in childhood [1]. However, the role of active tobacco smoking in the onset of adult asthma remains inconclusive. Current and former smokers have lower lung function [2][3][4] and increased bronchial hyperresponsiveness [5], and active smoking increases asthma severity [6]. The evidence for new onset asthma after active tobacco smoking is less clear. Active tobacco smoking has been associated with the onset of adult asthma [7,8], but not in all studies [6,9,10]. It has been hypothesized that tobacco smoking modulates the immune system by increasing IgE levels, thereby contributing to asthma onset [11].
Asthma is a complex disease that is thought to be caused by an interaction of environmental exposures and genetic susceptibility. Active tobacco smoking may increase the risk for asthma in a susceptible population only. Two candidate gene studies have suggested an interaction between active tobacco smoking and genetic variants in the occurrence of asthma in adults, i.e. variants in the genes thymic stromal lymphopoietin (TSLP) [12] and filaggrin (FLG) [13]. Similarly, a study showed an interaction between active tobacco smoking and genes involved in lung function decline [14]. These studies were based on hypothesis-driven gene selection. One genome-wide association study on adult onset asthma, with a hypothesis-free design, revealed that polymorphisms in the HLA-DQ gene increase the risk for adult onset asthma [15], an effect that was independent of tobacco smoke exposure.
Insight into the interaction between active tobacco smoking and genetic susceptibility is crucial for advancing knowledge of the etiology of adult onset asthma and for the development of effective strategies for asthma prevention. We therefore performed a genome-wide interaction (GWI) analysis using data from studies participating in the GABRIEL consortium [15]. We replicated our top hits in a large population study in the northern part of the Netherlands: the LifeLines Cohort Study [16]. We set out to identify new genetic variants that interact with active tobacco smoking with respect to asthma onset at adult age.

Methods
Subjects
Data from six individual studies selected on presence of adult onset asthma data were included in the discovery meta-analysis on the interaction between single nucleotide polymorphisms (SNPs) and ever active tobacco smoking (Fig 1, S1 and S2 Checklists). All cases and controls were of European descent and two studies had a family structure. The study was approved by the local Medical Ethical Review Committees and all subjects gave written informed consent (Description of studies and ethical approval in the supporting information (S1 File)). Adult onset asthma was defined as asthma diagnosed by a doctor when the subject was 16 years of age or older, as defined within the GABRIEL consortium [15]. Controls were all free of asthma, including childhood onset asthma. Active tobacco smoking was defined as 'ever active tobacco smoking'. Details on the outcome and exposure definition for the individual studies can be found in the S1 File.

Genotyping and quality control
Genotyping was performed using the Illumina Human610 quad array (www.illumina.com) at CEA-Centre National de Génotypage, Evry, France. Details on the genotyping method have been described previously [15]. We restricted our meta-analyses to SNPs fulfilling the following quality control criteria in each study: genotype missing rate <3% in cases and controls, minor allele frequency >5% in controls and consistency with Hardy-Weinberg equilibrium in controls (p-value > 10⁻⁴). Samples with >95% genotyping success rate were included in the analyses. We excluded putative non-European samples, identified using EIGENSTRAT2.0 software.

Statistical analyses
All individual studies were analysed using a logistic regression model with adult onset asthma as outcome. For each individual study a genome wide analysis on adult onset asthma was performed using logistic regression analysis including the SNP, ever active tobacco smoking, as well as the interaction between the SNP and ever active tobacco smoking to assess whether the effect of smoking on adult asthma differed between subjects with different genotypes. Also a stratified analysis was performed to analyse the genetic effect in exposed and non-exposed subjects. In all models an additive genetic model was used. Gender, age and informative principal components for within-Europe diversity were included as covariates. For the studies containing family data, a cluster variable indicating the family relations was included.
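As an illustration of the per-study model described above, the following is a minimal sketch of a logistic regression with a SNP-by-smoking interaction term, fitted by Newton-Raphson in plain Python. It is not the authors' pipeline (the analyses were run in PLINK and R); the function names, the toy counts and the omission of the family cluster variable are assumptions made for illustration. The toy data use only genotypes 0 and 1, so that the fitted interaction odds ratio can be checked by hand against the ratio of the stratum-specific odds ratios.

```python
import math

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def score_and_information(rows, y, beta):
    """Gradient of the log-likelihood and the Fisher information matrix."""
    k = len(beta)
    score = [0.0] * k
    info = [[0.0] * k for _ in range(k)]
    for x, yi in zip(rows, y):
        p = 1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, x))))
        for i in range(k):
            score[i] += x[i] * (yi - p)
            for j in range(k):
                info[i][j] += p * (1.0 - p) * x[i] * x[j]
    return score, info

def fit_logistic(rows, y, n_iter=25):
    """Return (coefficients, standard errors) of a logistic regression."""
    beta = [0.0] * len(rows[0])
    for _ in range(n_iter):
        score, info = score_and_information(rows, y, beta)
        beta = [b + s for b, s in zip(beta, solve(info, score))]
    _, info = score_and_information(rows, y, beta)
    k = len(beta)
    # Diagonal of the inverse information matrix gives the variances.
    se = [math.sqrt(solve(info, [float(i == j) for j in range(k)])[i])
          for i in range(k)]
    return beta, se

def design_row(genotype, smoking, covariates=()):
    """Intercept, additive genotype (minor allele count), smoking, interaction."""
    return [1.0, float(genotype), float(smoking),
            float(genotype * smoking), *covariates]

# Hypothetical toy counts: (genotype, ever smoking) -> (cases, controls).
cells = {(0, 0): (100, 400), (1, 0): (50, 200),
         (0, 1): (100, 400), (1, 1): (75, 200)}
rows, y = [], []
for (g, s), (cases, controls) in cells.items():
    rows += [design_row(g, s)] * (cases + controls)
    y += [1] * cases + [0] * controls

beta, se = fit_logistic(rows, y)
or_int = math.exp(beta[3])  # interaction odds ratio, about 1.5 for these counts
print(f"OR_int = {or_int:.2f}, SE(log OR_int) = {se[3]:.3f}")
```

In a real analysis the covariates named above (gender, age, principal components) would be passed through `covariates`, and family data would need a method that accounts for clustering, which this sketch does not attempt.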
We meta-analysed the results of the individual studies (discovery meta-analysis) and used two selection procedures to identify SNPs that interact with ever active tobacco smoking in adult onset asthma. To assess heterogeneity, Cochran's Q statistic was calculated for each SNP and a random effect model was fitted.
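The meta-analysis step can be sketched as follows: an inverse-variance fixed effect estimate, Cochran's Q, and a random effect estimate. The text does not say which random effect estimator was used, so the DerSimonian-Laird estimator below is an assumption, as are the function names and the per-study numbers, which are invented for illustration.

```python
import math

def fixed_effect_meta(betas, ses):
    """Inverse-variance fixed effect pooling of per-study log odds ratios."""
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (b - beta) ** 2 for w, b in zip(weights, betas))
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(beta / se) / math.sqrt(2.0))
    return {"beta": beta, "se": se, "or": math.exp(beta),
            "p": p, "q": q, "df": len(betas) - 1}

def random_effect_meta(betas, ses):
    """DerSimonian-Laird random effect pooling (assumed estimator)."""
    fixed = fixed_effect_meta(betas, ses)
    weights = [1.0 / se ** 2 for se in ses]
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    tau2 = max(0.0, (fixed["q"] - fixed["df"]) / c)  # between-study variance
    star = [1.0 / (se ** 2 + tau2) for se in ses]
    beta = sum(w * b for w, b in zip(star, betas)) / sum(star)
    return {"beta": beta, "se": math.sqrt(1.0 / sum(star)),
            "or": math.exp(beta), "tau2": tau2}

# Hypothetical per-study interaction estimates (log OR and standard error).
study_betas = [math.log(0.45), math.log(0.60), math.log(0.50)]
study_ses = [0.30, 0.35, 0.40]
fixed = fixed_effect_meta(study_betas, study_ses)
random_ = random_effect_meta(study_betas, study_ses)
print(f"fixed OR = {fixed['or']:.2f}, Q = {fixed['q']:.2f} on {fixed['df']} df")
print(f"random OR = {random_['or']:.2f}, tau^2 = {random_['tau2']:.3f}")
```

Under the null hypothesis of homogeneity, Q is compared with a chi-square distribution with `df` degrees of freedom; that p-value is left out here to keep the sketch within the standard library.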
Firstly, we followed the classical GWI study approach that is based on selection of the most significant interaction effect, i.e. the overall difference between the genetic effect in smokers and non-smokers with the lowest p-value. With this approach, smaller genetic effects occurring only after exposure to active tobacco smoking can be missed. For that reason we also followed a second approach where we selected genetic markers that are significantly associated with adult onset asthma in exposed subjects, but not in non-exposed subjects.
In the first approach we meta-analysed the study-specific interaction effects and selected SNPs with a fixed effect meta-analysis interaction p-value < 10⁻⁴. In the second approach we meta-analysed the genetic main effect in exposed and non-exposed subjects separately and then selected SNPs with a genetic effect p-value < 10⁻⁴ only in exposed subjects, based on the fixed effect model. SNPs with the same effect in exposed and non-exposed subjects were omitted by requiring a nominal interaction effect, i.e. SNPs with an interaction p-value > 10⁻² were excluded.
Only SNPs present in at least two studies were included in the discovery meta-analysis, yielding a total of 525,150 SNPs. Genome-wide significance was set to a p-value < 9.5 × 10⁻⁸ based on Bonferroni correction. All SNPs selected from the discovery meta-analysis were tested for replication in an independent population, the LifeLines Cohort Study [16] (description of the study in S1 File).
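The two selection rules can be written as simple filters over meta-analysis results. This is a sketch of one reading of the rules, not the authors' code: the record layout and function names are assumptions, and "only in exposed subjects" is interpreted as also requiring that the non-exposed p-value does not pass the same threshold. The rs5011804 entry uses the discovery p-values reported in the Results; the other records are hypothetical.

```python
def select_approach_1(snps, p_int_max=1e-4):
    """Approach 1: smallest interaction p-values."""
    return [s["rsid"] for s in snps if s["p_int"] < p_int_max]

def select_approach_2(snps, p_main_max=1e-4, p_int_max=1e-2):
    """Approach 2: genetic effect in exposed subjects only, plus a
    nominal interaction effect to drop SNPs with equal effects."""
    return [s["rsid"] for s in snps
            if s["p_exposed"] < p_main_max
            and s["p_non_exposed"] >= p_main_max
            and s["p_int"] < p_int_max]

snps = [
    # Discovery p-values for rs5011804 as reported in the Results.
    {"rsid": "rs5011804", "p_int": 1.21e-4,
     "p_exposed": 1.56e-6, "p_non_exposed": 0.31},
    # Hypothetical records for illustration only.
    {"rsid": "hypothetical_1", "p_int": 5e-5,
     "p_exposed": 2e-3, "p_non_exposed": 0.5},
    {"rsid": "hypothetical_2", "p_int": 0.4,
     "p_exposed": 3e-5, "p_non_exposed": 8e-5},
    {"rsid": "hypothetical_3", "p_int": 2e-5,
     "p_exposed": 4e-5, "p_non_exposed": 0.6},
]

first = select_approach_1(snps)
second = select_approach_2(snps)
both = [rsid for rsid in first if rsid in second]
print(first, second, both)
```

With these inputs rs5011804 is picked up by the second filter but not the first, because its interaction p-value (1.21 × 10⁻⁴) lies just above the 10⁻⁴ cut-off.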
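The Bonferroni thresholds can be reproduced with a one-line calculation. The family-wise alpha of 0.05 is an assumption (it is not stated explicitly), but it reproduces both the genome-wide threshold above and the replication thresholds reported in the Results for 29 and 27 SNPs.

```python
def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold; alpha = 0.05 is an assumed default."""
    return alpha / n_tests

genome_wide = bonferroni_threshold(525_150)   # about 9.5e-8
replication_first = bonferroni_threshold(29)  # about 0.0017
replication_second = bonferroni_threshold(27) # about 0.0019
print(f"{genome_wide:.2e} {replication_first:.4f} {replication_second:.4f}")
```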
To investigate whether the association between genetic background, tobacco smoking and adult onset asthma was robust across different smoking habits, we assessed the genetic effects of the identified SNPs on adult onset asthma in the LifeLines Cohort Study for the following comparisons: exposed versus non-exposed to ever active tobacco smoking; exposed versus non-exposed to current active tobacco smoking; exposed versus non-exposed to active smoking in the past; and exposed versus non-exposed to current passive smoking (details on the exposure definitions in S1 File). The analyses were conducted using PLINK 1.07 [17] and R [18]. For annotation and inspection of linkage disequilibrium (LD) patterns, WGAviewer [19] was used.

Results
The discovery genome-wide interaction meta-analysis consisted of 1,324 cases and 2,733 controls derived from six studies (Table 1). Overall, active tobacco smoking was not associated with adult onset asthma (Fig 2).
Firstly, we identified 50 SNPs in the discovery meta-analysis with an interaction p-value < 10⁻⁴. None of the SNPs reached genome-wide significance. The results for two SNPs showed heterogeneity across studies (p-value Q-statistic <0.05); these SNPs were omitted from further analysis. In the replication study, 29 of the 48 SNPs were included since 19 SNPs were not successfully imputed in the LifeLines Cohort Study or did not pass quality control (S1 Table). In total, 16 SNPs showed the same direction of the interaction effect in the discovery and replication analysis. None of the associations reached statistical significance in the replication study after Bonferroni correction for multiple testing for 29 SNPs (p-value<0.0017) (Table 2). One SNP reached nominal significance: rs9969775 on chromosome 9. For this SNP the interaction estimate in the discovery meta-analysis was ORint = 0.50, p-value = 7.63 × 10⁻⁵ and in the replication study ORint = 0.65, p-value = 0.02 (Table 2). Fig 3 shows the forest plots with the results for the discovery studies. In the smoking stratified analysis, non-exposed subjects carrying an A allele tended to have an increased asthma risk (discovery meta-analysis OR = 1.57, p-value = 1.88 × 10⁻³; replication study OR = 1.20, p-value = 0.19), which was not observed in exposed subjects.
Secondly, we identified 35 SNPs in the discovery meta-analysis with a genetic effect of p-value < 10⁻⁴ and an interaction p-value < 10⁻². Findings did not reach genome-wide significance. None of the SNPs showed heterogeneity across studies (p-value Q-statistic <0.05). In the replication study, 27 of the 35 SNPs were included, since 8 SNPs were not successfully imputed in the LifeLines Cohort Study or did not pass quality control (S1 Table). For 15 SNPs, the direction of the effect in the exposed subjects was the same in the discovery and replication analysis. None of the associations reached statistical significance in the replication study after Bonferroni correction for multiple testing for 27 SNPs (p-value<0.0019) (Table 3). One SNP reached nominal significance in the replication: rs5011804 on chromosome 12 (ORint = 1.40, p-value = 0.03). The interaction estimate for this SNP was ORint = 1.50, p-value = 1.21 × 10⁻⁴ in the discovery meta-analysis (Table 3). Fig 4 shows the forest plots with results for the individual studies. In subjects who ever smoked, carriers of the minor allele C had an increased risk for asthma (discovery meta-analysis OR = 1.42, p-value = 1.56 × 10⁻⁶; replication study OR = 1.21, p-value = 0.05), while in non-exposed subjects, carriers of the C allele had no increased asthma risk (discovery meta-analysis OR = 0.92, p-value = 0.31; replication study OR = 0.86, p-value = 0.24).
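The stratified and interaction estimates for rs5011804 can be related with a short calculation: on the odds ratio scale, the interaction effect is approximately the genetic odds ratio in exposed subjects divided by that in non-exposed subjects. This is a rough consistency check on the rounded values reported above, not a re-analysis; the function name is an assumption, and exact agreement is not expected because the interaction and stratum-specific effects were meta-analysed separately.

```python
def interaction_or(or_exposed, or_non_exposed):
    """Approximate interaction OR as the ratio of stratum-specific ORs."""
    return or_exposed / or_non_exposed

# Rounded odds ratios for rs5011804 as reported in the text.
replication = interaction_or(1.21, 0.86)  # reported ORint: 1.40
discovery = interaction_or(1.42, 0.92)    # reported ORint: 1.50
print(f"replication ratio = {replication:.2f}, discovery ratio = {discovery:.2f}")
```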
Four SNPs were identified by both approaches (Table 4), but the results for these SNPs could not be replicated in the LifeLines Cohort Study. S2 Table shows the annotation of all SNPs identified in at least one of the approaches.
The analyses of the robustness of the results showed that the identified SNPs interacted with active tobacco smoking and not with passive smoking (Table 5), effects being particularly apparent among ex-smokers.

Discussion
This study is the first hypothesis-free genome-wide study specifically aiming to identify SNPs that interact with active tobacco smoking with respect to asthma onset at adult age. The results are based on data from GABRIEL, a large consortium on adult onset asthma. We found suggestive evidence for an interaction between active tobacco smoking and rs9969775 on chromosome 9 and rs5011804 on chromosome 12. Both SNPs are intergenic markers that do not annotate to genes, nor do SNPs in LD with these markers. The SNPs found have not been identified previously in general GWA studies on asthma.
Fig 3. Forest plots for the meta-analysis and replication study on the genetic effect of SNP rs9969775 on chromosome 9 in subjects exposed and non-exposed to ever active tobacco smoking (identified in first approach). The bottom forest plot presents the interaction meta-analysis and replication study for this SNP. ORs are calculated using a fixed effect model.
Fig 4. Forest plots for the meta-analysis and replication study on the genetic effect of SNP rs5011804 on chromosome 12 in subjects exposed and non-exposed to ever active tobacco smoking (identified in second approach). The bottom forest plot presents the interaction meta-analysis and replication study for this SNP. ORs are calculated using a fixed effect model.
Although the identified markers do not annotate to a protein coding region, they may have a regulatory function. rs9969775 is a tri-allelic polymorphism, but in our datasets only two alleles were present (effect allele: A, reference allele: C). rs9969775 is located between the FLJ41200 gene (distance ~129 KB, also known as LINC01235) and RP11-284P20.1 (distance3 66 KB). Both FLJ41200 and RP11-284P20.1 are long intergenic non-protein coding RNA genes. With the development of whole genome and transcriptome sequencing technologies, long noncoding RNAs have received increased attention. Multiple studies indicate that they can regulate gene expression in many ways, including chromatin modification, transcription and post-transcriptional processing [20]. A search for rs9969775 in the ENCODE database (using the WashU Epigenome Browser, http://epigenomegateway.wustl.edu/) showed that this SNP is located at a CpG site with a high methylation score in lung tissue. Further analysis using HaploReg indicated that this SNP is located in a region of active chromatin in the lung, as indicated by a DNase I hypersensitivity site, in an enhancer region (HaploReg version 4.1: http://archive.broadinstitute.org/mammals/haploreg/haploreg.php).
The second identified SNP, rs5011804, is located between the KRAS gene (distance ~38 KB) and the RPL39P27 gene (distance ~120 KB). The KRAS gene encodes a protein that is a member of the small GTPase superfamily. Small GTPases regulate a wide variety of processes in the cell, including growth, cellular differentiation, cell movement and lipid vesicle transport. RPL39P27 is a ribosomal protein pseudogene. Pseudogenes are fragments of genes that were functional but have been silenced by one or more mutations [21]. Pseudogenes were long assumed to be non-functional, but recent studies suggest that they may play a role in gene expression, gene regulation, and the generation of genetic diversity [22].
Finally, to gain more insight into the possible regulatory roles of rs9969775 and rs5011804 in gene expression, data from the Genotype-Tissue Expression project (http://www.gtexportal.org/home/) were used. The results showed that the SNPs were not associated with expression of any gene in any tissue. In summary, our identified SNPs are located in regions with potential regulatory function, and future research is needed to unravel their role in adult asthma further.
Of interest, the two SNPs that were previously reported to be associated with adult onset asthma [15] (rs17843604 and rs9273349 on chromosome 6) showed nominally significant associations with asthma in both smokers and non-smokers but no interaction with active tobacco smoking in our meta-analysis (S3 Table).
The GWI study design is specifically suited to identify novel SNPs that interact with an environmental exposure in an unbiased way. Genes identified to interact with active tobacco smoking are crucial for further insight into the etiology of adult onset asthma and the development of effective strategies for asthma prevention. A strength of our study is that we followed two different approaches to detect SNPs that show a differential effect in subjects exposed and non-exposed to smoking. The classical GWI study approach is to select SNPs with the largest interaction effect. Since we also aimed to identify subpopulations that are genetically susceptible to active tobacco smoking, we followed a second approach in which we selected SNPs that affected the risk of asthma only in exposed subjects and not in non-exposed subjects. In our analyses, four SNPs were identified with both approaches.
Since adult onset asthma is not common, only a subset of asthmatics is exposed, and the expected effect size is small, a large sample size is needed to obtain a genome-wide significant finding. In this study we combined data from multiple studies to achieve this. We additionally harmonized the exposure and outcome definitions in the different studies as much as possible to improve the chance of finding significant interaction effects. However, small differences in these definitions between studies could create random error, which compromises study power and thus makes it harder to detect a significant interaction [23].
A limitation of our study is that active tobacco smoking is related to exposure to environmental smoke at different periods in life, which makes it difficult to disentangle the effects of these exposures. Therefore, we assessed the genetic effects of the identified SNPs on adult onset asthma in different strata of smoking habits in the LifeLines Cohort Study. Results showed that genetic effects of the identified SNPs were particularly apparent among ex-smokers.
Two studies included in the meta-analysis contained cross-sectional and retrospectively collected data. In these studies, asthma onset before the start of smoking could not be ruled out. Inclusion of these subjects would lead to a dilution of the actual interaction between genetics and ever smoking on adult onset asthma. Since data from the LifeLines Cohort Study showed that only eight (3.6%) subjects out of 225 ever smoking adult onset asthmatics started smoking after the start of adult onset asthma (data not shown), it is unlikely that this issue biased our results.
A general problem in GWI studies is their limited power, often due to a small number of subjects with overlapping exposures and genotypes [24,25]. The power to detect an interaction can be increased by assessing the association between exposure and genotype in a case-only design or a two-step design [24,25]. A case-only design assumes that exposure and genotype are independent. We chose not to use this design given the known strong genetic component of smoking addiction; even relatively modest violations of this assumption can substantially bias the interaction parameters [26], leading to false positive or false negative findings [27]. In a two-step design the interaction is tested among a selection of SNPs. The method we used to detect interactions between exposure and genotype did not assume independence of exposure and genotype, nor did we select SNPs a priori. To limit the possibility of missing interaction effects, we first selected the most promising SNPs using an arbitrary threshold for interaction (p < 10⁻⁴) and included them in a replication study. A similar approach has been used successfully in a GWI study on interaction between genetic markers and waist-hip ratio on total serum cholesterol [28].
In summary, we performed two approaches for GWI analyses and identified SNPs on chromosomes 9 and 12, both intergenic variants with potential regulatory functions. These are novel SNPs, previously unidentified by regular genome-wide association and candidate gene studies, that showed suggestive evidence for interaction with active tobacco smoking in adult onset asthma. We propose that future studies replicate our findings.