Strong Association between Two Polymorphisms on 15q25.1 and Lung Cancer Risk: A Meta-Analysis

Background The association between polymorphisms on 15q25.1 and lung cancer has been widely evaluated; however, the studies have yielded contradictory results. We sought to investigate this inconsistency by performing a comprehensive meta-analysis of two polymorphisms (CHRNA3 gene: rs1051730 and AGPHD1 gene: rs8034191) on 15q25.1. Methods Data were extracted from 15 and 14 studies on polymorphisms rs1051730 and rs8034191 involving 12301/14000 and 14075/12873 lung cancer cases/controls, respectively. The random-effects model was applied, addressing heterogeneity and publication bias. Results The two polymorphisms followed Hardy-Weinberg equilibrium for all studies (P>0.05). For rs1051730-G/A, carriers of the A allele had a 36% increased risk for lung cancer (95% confidence interval [CI]: 1.27–1.46; P<0.0005), without heterogeneity (P = 0.258) or publication bias (PEgger = 0.462). For rs8034191-T/C, the allelic contrast indicated that the C allele conferred a 23% increased risk for lung cancer (95% CI: 1.08–1.4; P = 0.002), with significant heterogeneity (P<0.0005) but without publication bias (PEgger = 0.682). Subgroup analyses suggested that the between-study heterogeneity was derived from ethnicity, study design, matched information, and lung cancer subtypes. For example, the association of polymorphisms rs1051730 and rs8034191 with lung cancer was heterogeneous between Caucasians (OR = 1.32 and 1.22; 95% CI: 1.25–1.44 and 1.05–1.42; P<0.0005 and 0.008, respectively) and East Asians (OR = 1.51 and 1.03; 95% CI: 0.76–3 and 0.47–2.27; P = 0.237 and 0.934, respectively) under the allelic model, and this association was relatively strengthened under the dominant model. There was no observable publication bias for either polymorphism. Conclusions Our findings demonstrated that the CHRNA3 gene rs1051730-A allele and the AGPHD1 gene rs8034191-C allele might be risk-conferring factors for the development of lung cancer in Caucasians, but not in East Asians.


Introduction
Lung cancer is the most common malignancy and the leading cause of cancer mortality, with an estimated 1.3 million new cases diagnosed annually worldwide [1,2]. The well-known risk factors for lung cancer include cigarette smoking and exposure to ionizing radiation (e.g., radon, medical imaging). Accumulating evidence has suggested that genetic factors may contribute to the variation in susceptibility to lung cancer. It is widely accepted that lung cancer is a complex multifactorial disease, attributed to the interaction of genetic factors with environmental factors [3,4]. Despite intensive efforts devoted to investigating the genetic factors for lung cancer, the driving genes and genetic variants that determine the development of lung cancer remain unclear.
The chromosome 15q25.1 region has been identified as a hotspot for lung cancer susceptibility by recent genome-wide association (GWA) studies [5,6,7,8]. Results of genetic association studies for nicotine dependence, smoking behavior, and smoking-related diseases have converged to implicate the chromosome 15q25.1 region. The relationship between polymorphisms rs1051730 in the CHRNA3 gene and rs8034191 in the AGPHD1 gene and lung cancer risk or related phenotypes has been widely investigated. As stated by McClellan and King, many if not most of the genetic polymorphisms that are reported to be associated with common disorders in GWA studies are in fact spurious associations caused by subtle differences in ancestry between the populations being studied (known as "cryptic population stratification") [9]. Moreover, individual studies with insufficient sample sizes lack the statistical power to detect common variants with tiny effects on lung carcinogenesis, so their results are not reproducible. To derive a more precise estimate and investigate the inconsistency, we evaluated the effect of the two polymorphisms rs1051730 and rs8034191 on the risk of lung cancer, addressing heterogeneity and publication bias.

Methods
We performed this analysis in accordance with the guidelines of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [10] (see flowchart S1 and checklist S1).

Search Strategy for Identification of Studies
We searched the PubMed and EMBASE databases for articles published before January 2012, using the Boolean combinations of subject terms (CHRNA3 OR AGPHD1 OR LOC123688) AND (lung cancer OR carcinoma OR neoplasm) AND (gene OR polymorphism OR allele OR genotype OR variant OR mutation). Articles were restricted to English-language and human studies. The full text of the retrieved articles was scrutinized to decide whether information on the topic of interest was included. Reference lists of these retrieved articles and systematic reviews were also checked for citations of articles not initially identified. For articles involving more than one geographic or ethnic heterogeneous group, each group was treated separately. When genotype frequency was not reported, we contacted the authors to obtain the relevant information.

Inclusion/Exclusion Criteria
Articles were included in this meta-analysis if they 1) examined the hypothesis that the CHRNA3 gene rs1051730 polymorphism and/or the AGPHD1 gene rs8034191 polymorphism was associated with lung cancer risk; 2) followed a nested case-control, case-control, or cross-sectional study design; and 3) provided sufficient information on genotype/allele counts in cases and controls to estimate the odds ratio (OR) and the corresponding 95% confidence interval (95% CI). When multiple articles involved the same population, the most complete and recent results were extracted.
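As an illustration of the third criterion, the following minimal sketch shows how an unadjusted OR and a Wald-type 95% CI could be derived from genotype counts under the allelic and dominant models. It is not the authors' code: the function names and the genotype counts are hypothetical, and the Wald interval on the log scale is an assumption about how the CI is formed.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald 95% CI for a 2x2 table.

    a, b: cases with / without the exposure (e.g. A vs. G alleles)
    c, d: controls with / without the exposure
    All four cells must be non-zero.
    """
    or_ = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_) - z * se_log)
    upper = math.exp(math.log(or_) + z * se_log)
    return or_, lower, upper

def allelic_counts(hom_ref, het, hom_alt):
    """(variant alleles, reference alleles) from genotype counts."""
    return 2 * hom_alt + het, 2 * hom_ref + het

def dominant_counts(hom_ref, het, hom_alt):
    """(carriers of the variant allele, non-carriers)."""
    return hom_alt + het, hom_ref

# Hypothetical genotype counts (GG, GA, AA); not taken from any included study.
cases = (400, 450, 150)
controls = (500, 400, 100)

allelic = odds_ratio_ci(*allelic_counts(*cases), *allelic_counts(*controls))
dominant = odds_ratio_ci(*dominant_counts(*cases), *dominant_counts(*controls))
print("allelic  OR = %.2f (95%% CI %.2f-%.2f)" % allelic)
print("dominant OR = %.2f (95%% CI %.2f-%.2f)" % dominant)
```

A study that reports only allele frequencies, without genotype or allele counts, would not supply the four cells needed here, which is why such articles could not be included.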

Extracted Information
The following information was extracted independently and entered into separate databases by two authors (MG and WN) from each qualified study: first author's last name, publication date, population ethnicity, study design, baseline characteristics of the study population including age, ethnicity, sex, smoking status, and the genotype counts in cases and controls. Any encountered discrepancy was adjudicated by a discussion until a consensus was reached.

Quality Score Assessment
The study quality was assessed by using a quality assessment score developed for genetic association studies by Thakkinstian et al. [11]. Total scores ranged from 0 (worst) to 12 (best). The criteria for quality assessment of the genetic association between the two studied polymorphisms and lung cancer are described in Table S1.

Statistical Analysis
Data management and statistical analyses were conducted using STATA software (StataCorp, Texas, USA, version 11.0 for Windows). Deviation from Hardy-Weinberg equilibrium was tested by the χ² test or Fisher's exact test in control groups. Irrespective of between-study heterogeneity, a random-effects model using the DerSimonian and Laird method was implemented to bring the individual effect-size estimates together, and the estimate of heterogeneity was taken from the Mantel-Haenszel model [12]. Unadjusted OR and 95% CI were used to compare allelic and dominant contrasts between cases and controls.
Between-study heterogeneity was assessed by the inconsistency index (I² statistic, ranging from 0 to 100%), which describes the percentage of the observed between-study variability that is due to heterogeneity rather than chance, with higher values suggesting the existence of heterogeneity [13,14]. In the case of between-study heterogeneity, we examined the study characteristics that could stratify the studies into subgroups with homogeneous effects. To estimate the extent to which one or more covariates explained the heterogeneity, we employed meta-regression as an extension of random-effects meta-analysis.
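For readers who want to see the pooling step spelled out, the sketch below implements a DerSimonian and Laird random-effects estimate on the log-OR scale. It is an illustration under stated assumptions rather than the STATA routine used in the analysis: it takes the heterogeneity statistic from inverse-variance weights (not from the Mantel-Haenszel model mentioned above), and the function name and example inputs are hypothetical.

```python
import math

def dersimonian_laird(log_ors, ses):
    """Pool log odds ratios with a DerSimonian-Laird random-effects model.

    Returns (pooled_log_or, pooled_se, tau2, q).
    Assumption: Q is computed from inverse-variance weights.
    """
    k = len(log_ors)
    w = [1.0 / se ** 2 for se in ses]            # fixed-effect weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_ors)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_ors))  # Cochran's Q
    tau2 = 0.0
    if k > 1:
        c = sw - sum(wi ** 2 for wi in w) / sw
        tau2 = max(0.0, (q - (k - 1)) / c)       # between-study variance
    w_re = [1.0 / (se ** 2 + tau2) for se in ses]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, log_ors)) / sum(w_re)
    pooled_se = math.sqrt(1.0 / sum(w_re))
    return pooled, pooled_se, tau2, q

# Hypothetical per-study ORs and standard errors of log(OR).
log_ors = [math.log(1.30), math.log(1.45), math.log(1.25)]
ses = [0.05, 0.10, 0.08]

pooled, pooled_se, tau2, q = dersimonian_laird(log_ors, ses)
print("pooled OR = %.2f (95%% CI %.2f-%.2f), tau^2 = %.4f" % (
    math.exp(pooled),
    math.exp(pooled - 1.96 * pooled_se),
    math.exp(pooled + 1.96 * pooled_se),
    tau2,
))
```

When the estimated between-study variance is zero, the random-effects weights reduce to the fixed-effect weights, so applying the random-effects model irrespective of heterogeneity costs nothing in the homogeneous case.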
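The relationship between Cochran's Q and the I² statistic can be made concrete with a short sketch. The function names and the two-study example are hypothetical, and Q is computed here from inverse-variance weights, which is an assumption rather than a description of the software used.

```python
def cochran_q(log_ors, ses):
    """Cochran's Q: weighted squared deviations from the fixed-effect mean."""
    w = [1.0 / se ** 2 for se in ses]
    pooled = sum(wi * yi for wi, yi in zip(w, log_ors)) / sum(w)
    return sum(wi * (yi - pooled) ** 2 for wi, yi in zip(w, log_ors))

def i_squared(q, k):
    """I^2 (in percent) from Q and the number of studies k."""
    df = k - 1
    if df <= 0 or q <= df:
        return 0.0          # no variability beyond what chance would produce
    return (q - df) / q * 100.0

# Two hypothetical studies with clearly discordant log odds ratios.
q = cochran_q([0.0, 1.0], [0.1, 0.1])
print("Q = %.1f, I^2 = %.1f%%" % (q, i_squared(q, 2)))
```

Because I² is truncated at zero whenever Q does not exceed its degrees of freedom, a value of 0% indicates only that no excess variability was detected, not that the studies are identical.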
Cumulative meta-analysis was conducted to identify the influence of the first published study on the subsequent publications, and the evolution of the combined estimates over time according to the ascending date of publication. To identify potentially influential studies, we performed influential analysis (also known as sensitivity analysis) by removing an individual study each time to check whether any of these estimates biased the overall estimate.
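Both procedures amount to re-running the pooled estimate on different subsets of studies. The sketch below illustrates this with a compact DerSimonian and Laird pooling helper; the study labels, years, and effect sizes are hypothetical, and the helper is a simplified stand-in for the software actually used.

```python
import math

def pool_random_effects(studies):
    """DerSimonian-Laird pooled log OR for a list of (label, year, log_or, se)."""
    y = [s[2] for s in studies]
    v = [s[3] ** 2 for s in studies]
    w = [1.0 / vi for vi in v]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sw
    tau2 = 0.0
    if len(studies) > 1:                      # tau^2 is undefined for one study
        q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
        c = sw - sum(wi ** 2 for wi in w) / sw
        tau2 = max(0.0, (q - (len(studies) - 1)) / c)
    w_re = [1.0 / (vi + tau2) for vi in v]
    return sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)

def leave_one_out(studies):
    """Pooled OR after omitting each study in turn (needs at least two studies)."""
    return [(s[0], math.exp(pool_random_effects(studies[:i] + studies[i + 1:])))
            for i, s in enumerate(studies)]

def cumulative(studies):
    """Pooled OR after adding studies one at a time in order of publication year."""
    ordered = sorted(studies, key=lambda s: s[1])
    return [(ordered[i][0], math.exp(pool_random_effects(ordered[:i + 1])))
            for i in range(len(ordered))]

# Hypothetical studies: (label, publication year, log OR, SE of log OR).
studies = [
    ("Study A", 2008, math.log(1.30), 0.05),
    ("Study B", 2009, math.log(1.45), 0.10),
    ("Study C", 2011, math.log(1.25), 0.08),
    ("Study D", 2010, math.log(1.60), 0.20),
]

for label, or_ in leave_one_out(studies):
    print("omitting %s: pooled OR = %.2f" % (label, or_))
for label, or_ in cumulative(studies):
    print("up to %s: pooled OR = %.2f" % (label, or_))
```

A study would be flagged as influential if omitting it moved the pooled OR materially or changed its significance; a cumulative estimate that drifts toward the null after the first study would hint at an inflated initial report.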
The funnel plot and Egger's test were applied to assess publication bias [15]. Egger's test can detect funnel plot asymmetry by determining whether the intercept deviates significantly from zero in a regression of the standardized effect estimates against their precision. The trim and fill method was also used to estimate the number and outcomes of potentially missing studies resulting from publication bias. A probability <0.05 was considered significant except for the I² and Egger's statistics, for which the significance level was defined as <0.1.
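The regression underlying Egger's test can be sketched in a few lines. This is an illustrative ordinary least-squares implementation with hypothetical inputs, not the routine used in the analysis; it returns the intercept and its t statistic, and the P value would then be read from a t distribution with n − 2 degrees of freedom.

```python
import math

def egger_test(effects, ses):
    """Egger's regression of standardized effect (effect/SE) on precision (1/SE).

    Returns (intercept, intercept_se, t_statistic, degrees_of_freedom).
    Standard errors must not all be equal, or the slope is undefined.
    """
    n = len(effects)
    if n < 3:
        raise ValueError("Egger's regression needs at least three studies")
    x = [1.0 / se for se in ses]                    # precision
    y = [e / se for e, se in zip(effects, ses)]     # standardized effect
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    s2 = rss / (n - 2)                              # residual variance
    intercept_se = math.sqrt(s2 * (1.0 / n + x_bar ** 2 / sxx))
    t_stat = intercept / intercept_se if intercept_se > 0 else float("nan")
    return intercept, intercept_se, t_stat, n - 2

# Hypothetical log odds ratios and their standard errors.
effects = [0.26, 0.37, 0.22, 0.47, 0.30]
ses = [0.05, 0.10, 0.08, 0.20, 0.12]

intercept, intercept_se, t_stat, df = egger_test(effects, ses)
print("intercept = %.3f (SE %.3f), t = %.2f on %d df" % (
    intercept, intercept_se, t_stat, df))
```

An intercept close to zero means that small, imprecise studies do not report systematically larger effects than large ones, which is the pattern a symmetrical funnel plot shows visually.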

Search of Studies
Based on our search strategy, the primary screening produced 40 potentially relevant articles, of which 12 met the inclusion criteria in an attempt to evaluate the association of CHRNA3 gene rs1051730 and/or AGPHD1 gene rs8034191 polymorphisms with lung cancer risk [5,16,17,18,19,20,21,22,23,24,25,26]. A flow diagram schematized the process of selecting and excluding articles with specific reasons (Figure 1). The 12 qualified articles were published between 2008 and 2011 involving 16 studies with 9 in Caucasians, 4 in East Asians, 2 in African-Americans, and 1 in mixed (Caucasian, African-American and Hispanic) populations. The quality score of studies ranged from 7 to 10 (mean: 8.5) out of a maximal score of 12. In detail, there were 11 (15) and 10 (14) articles (studies) for rs1051730 and rs8034191 polymorphisms involving 12301/14000 and 14075/12873 lung cancer cases/controls, respectively.

Study Characteristics
The baseline characteristics of all qualified studies are summarized in Table 1. Genotype distributions of the two polymorphisms were in Hardy-Weinberg equilibrium for all studies (P>0.05). Ten of 16 qualified studies were matched on age or sex or smoking status between cases and controls [16,18,19,21,22,24,25,26]. Five studies were hospital-based [16,19,23,26], and the rest were population-based. Three studies involved non-small-cell lung cancer as an end point, and one study involved squamous cell lung carcinoma as an end point. The frequencies of the CHRNA3 gene rs1051730-A allele differed widely between Caucasians and East Asians, with African-Americans in between. For example, in control groups, the rs1051730-A allele frequency ranged from 29.45% to 37.14% in Caucasians, from 1.39% to 3.3% in East Asians, and from 16.15% to 19.85% in African-Americans. The observation was similar for the AGPHD1 gene rs8034191-C allele, with frequencies ranging from 23.04% to 39.47% in Caucasian controls, from 1.82% to 3.72% in East Asian controls, and from 16.15% to 31.44% in African-American controls.
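The Hardy-Weinberg check reported above can be illustrated with a one-degree-of-freedom χ² goodness-of-fit test on control genotype counts. The sketch below uses hypothetical counts and a hypothetical function name; it covers only the χ² variant, whereas an exact test would be preferable when the variant allele is as rare as it is in the East Asian controls.

```python
import math

def hwe_chi_square(hom_ref, het, hom_alt):
    """Chi-square test (1 df) for Hardy-Weinberg equilibrium.

    Returns (chi2, p_value). Both alleles must be present, otherwise an
    expected count is zero and the statistic is undefined.
    """
    n = hom_ref + het + hom_alt
    p = (2 * hom_ref + het) / (2.0 * n)     # reference allele frequency
    q = 1.0 - p                             # variant allele frequency
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (hom_ref, het, hom_alt)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical control genotype counts (GG, GA, AA).
chi2, p_value = hwe_chi_square(500, 400, 100)
print("chi2 = %.3f, P = %.3f" % (chi2, p_value))
```

A P value above 0.05, as in this hypothetical example, gives no evidence of departure from equilibrium, which is the criterion applied to the control groups here.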

Overall Analysis
Due to the sparseness of the mutant alleles of both studied polymorphisms in East Asians and to maximize the statistical power to detect an association, we considered the risk effect of two polymorphisms under both allelic and dominant models.
For both polymorphisms, as reflected by the visual funnel plot inspection (Figure 2-A and 2-C) and Egger's regression asymmetry statistic, there was low probability of publication bias (P = 0.742 and 0.682 for rs1051730 and rs8034191, respectively). Further evidence of selective publication suggested that there were no missing studies required to make the funnel plot symmetrical for both polymorphisms (

Cumulative and Influential Analyses
In the cumulative meta-analysis, across all genetic models, we found no evidence that the first published study reporting a potentially significant result triggered subsequent replication publications. The influential analysis showed that no single study significantly influenced the overall results for either polymorphism (data not shown).

Subgroup Analysis
In view of the significant heterogeneity, and to seek its potential sources, we performed a panel of subgroup analyses on ethnicity, matched information, study design, and disease type.
Grouping studies by descent of populations indicated that the odds of developing lung cancer was significantly augmented in African-Americans for both polymorphisms, and was non-signif-  Table 3). In contrast, there were no material changes in risk estimates in Caucasians for both polymorphisms. Upon stratification by the matched information on age or gender or smoking status between cases and controls, the risk estimates were relatively weakened in matched studies for both polymorphisms under both allelic and dominant models (Tables 2 and 3), and the quality of heterogeneity was not improved.
In subgroup analysis by study design, association of both studied polymorphisms with lung cancer was potentiated in hospital-based studies under allelic model (rs1051730: OR = 1.56, 95% CI: 1.

Meta-regression analysis
To identify other sources of heterogeneity, we undertook meta-regression analysis of age (mean or median value), sex (male percent), and smoking rate (percentage of current and former smokers). Among these variables, the association of CHRNA3 gene rs1051730 (correlation coefficient: 0.48, P = 0.069) and AGPHD1 gene rs8034191 (correlation coefficient: 0.57, P = 0.043) polymorphisms with lung cancer risk was observed in cases with a high smoking rate under the allelic model (Figure 3).
Figure 3. For each study, OR is shown by the middle of the blue solid circle whose upper and lower extremes represent the corresponding 95% CI. OR values were calculated for the current smokers against nonsmokers (including former smokers) when available or ex-smokers against never-smokers otherwise. The green dotted line is plotted by fitting OR and smoking percent in cases for the included studies. doi:10.1371/journal.pone.0037970.g003
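A meta-regression of this kind can be sketched as a weighted least-squares fit of the log OR on a single study-level covariate such as smoking rate. The sketch below is an assumption-laden illustration: the function name and data are hypothetical, the between-study variance tau2 is supplied by the caller rather than estimated, and it returns a regression slope rather than the correlation coefficient reported above.

```python
import math

def weighted_meta_regression(log_ors, ses, covariate, tau2=0.0):
    """Weighted least squares of log OR on one covariate.

    Weights are 1 / (SE^2 + tau2); tau2 is assumed known.
    Returns (intercept, slope, slope_se).
    """
    w = [1.0 / (se ** 2 + tau2) for se in ses]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, covariate)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, log_ors)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, covariate))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar)
              for wi, xi, yi in zip(w, covariate, log_ors))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    slope_se = math.sqrt(1.0 / sxx)     # valid when the weights are true inverse variances
    return intercept, slope, slope_se

# Hypothetical studies: smoking rate in cases (%) and log OR with its SE.
smoking_pct = [55.0, 70.0, 82.0, 90.0]
log_ors = [0.10, 0.22, 0.30, 0.41]
ses = [0.10, 0.08, 0.06, 0.09]

intercept, slope, slope_se = weighted_meta_regression(log_ors, ses, smoking_pct)
print("slope per 1%% smoking = %.4f (SE %.4f), z = %.2f" % (
    slope, slope_se, slope / slope_se))
```

A positive slope would correspond to the pattern described in the text, namely larger ORs in studies whose cases have a higher smoking rate.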

Discussion
Via a comprehensive meta-analysis, we evaluated the association of two common polymorphisms on 15q25.1 with the risk of lung cancer. Overall results demonstrated that the CHRNA3 gene rs1051730-A allele and the AGPHD1 gene rs8034191-C allele might be risk-conferring factors for the development of lung cancer in Caucasians, but not in East Asians. Although potential sources of heterogeneity could not be easily eliminated, the present study, to our knowledge, is the first meta-analysis to date dealing with the association of these two polymorphisms with lung cancer susceptibility.
We identified ethnicity as a potential source of between-study heterogeneity by subgroup analysis. Genetic heterogeneity is inevitable in disease identification strategy [27]. We found that the association of rs1051730 and rs8034191 polymorphisms with lung cancer risk was heterogeneous between Caucasians and East Asians. Significance was observed only in the former, which is consistent with the results of GWA studies in Western populations. We also noticed remarkable differences in the frequencies of the CHRNA3 gene rs1051730-A allele and the AGPHD1 gene rs8034191-C allele between Caucasians and East Asians, making it very difficult to detect a weak association in Asians without examining a very large population. This suggests that different genetic backgrounds may cause this discrepancy or that different populations may have different linkage disequilibrium patterns. The studied polymorphisms may be in linkage with another causal variant in one ethnic population but not in another [28]. For example, the rs1051730 polymorphism is in complete linkage disequilibrium with the potentially pathogenic allele of rs16969968 (D398N) in the CHRNA5 gene [17,29]. We therefore speculate that polymorphism rs1051730 may have a pleiotropic effect on the etiology of lung carcinogenesis across different ethnic groups. In view of the divergent genetic backgrounds, it is necessary to construct a database of polymorphisms related to lung cancer in each ethnic/racial group.
In addition to the influence of ethnicity on the overall estimate, estimates stratified by study design should be treated with caution. In this meta-analysis, for both polymorphisms, the risk estimates in hospital-based studies were stronger than those in population-based studies. Besides the relatively small sample size, other drawbacks of hospital-based studies should not be disregarded, as population stratification remains an important issue [30]. Two studies had recruited subjects from only one hospital, and thus there might be a narrow socioeconomic profile for both cases and controls. In addition, poor comparability between cases and controls in hospital-based studies might exert a confounding effect on the true association in light of a regional specialty for the disease and the differential hospitalization rates between cases and controls [31]. In contrast, subjects drawn from the community or the general population might be more representative of the population, making the results from population-based studies more convincing. Considering the wider confidence intervals of the estimates, more studies are required to quantify the effect size reliably.
Furthermore, our meta-regression analysis found an association of two studied polymorphisms with lung cancer risk in patients with a higher smoking rate. We defined smoking rate based on the percentage of current and former smokers if available. This definition is unlikely to undermine our observation since the exclusion of ever-smoking might lead to an underestimation of the risk for lung cancer. Moreover, our data on smoking and other confounders were extracted from recent publications (after the year 2008) from professional cancer journals as reflected by the high quality score. Additionally, smoking is by far the major contributor to lung cancer, accounting for about 90% of the lung cancer incidence [32]. Previous studies demonstrated that polymorphisms in the CHRNA3 gene were associated with an increased risk of smoking initiation, indicating a potential genotype-phenotype interaction [33].
The strengths of this study include the relatively large sample size, no deviation from Hardy-Weinberg equilibrium, and the high quality of the qualified studies. However, our current study should be interpreted with several technical limitations in mind. Firstly, most of the studies in this meta-analysis were case-control studies, which are susceptible to selection bias by including only nonfatal cases. Secondly, because only published studies in English were retrieved and the "grey" literature (articles in languages other than English) was not included, publication bias might be possible, even though our funnel plots and statistical tests did not show it. However, asymmetry in the funnel plot, whether visually interpreted or statistically tested, may result from an essential difference between the small and larger studies that arises from inherent between-study heterogeneity [34]. Because there is currently no gold standard against which to compare the results of funnel plot tests [34], Egger's test and the usual funnel plot have been challenged. We cannot completely rule out a low probability that small negative studies are missing from the plot. Nevertheless, the trim and fill method suggested that no missing studies were required to make the funnel plot symmetrical for either polymorphism. Thirdly, the single locus-based nature of the meta-analysis precluded the evaluation of gene-gene and gene-environment interactions, as well as haplotype-based effects, suggesting that additional studies assessing these aspects are necessary. Fourthly, we focused only on two polymorphisms on 15q25.1 and did not consider other candidate genes or polymorphisms. It is likely that the studied polymorphisms by themselves make a minor contribution to risk prediction in lung cancer patients, but whether the two polymorphisms, when integrated with other risk factors, will enhance the prediction requires further investigation.
Taken together, we have expanded on previous individual studies by providing convincing evidence that the CHRNA3 gene rs1051730-A allele and the AGPHD1 gene rs8034191-C allele might be risk-conferring factors for the development of lung cancer in Caucasians, but not in East Asians. We have strengthened the previous findings on the association of high smoking rate with increased lung cancer risk. Further studies should investigate the markers on and adjacent to 15q25.1 to clarify whether the present association is causal or due to linkage disequilibrium.

Supporting Information
Table S1 Criteria for quality assessment of genetic association of CHRNA3 gene rs1051730 polymorphism and AGPHD1 gene rs8034191 polymorphism with lung cancer.