Evaluation of association studies and a systematic review and meta-analysis of CYP1A1 T3801C and A2455G polymorphisms in breast cancer risk

Background Nine previous meta-analyses have analyzed the association of the CYP1A1 T3801C and A2455G polymorphisms with breast cancer (BC) risk. However, they did not assess the credibility of statistically significant associations. In addition, many new studies on these polymorphisms have since been reported. Hence, we conducted an updated systematic review and meta-analysis to further explore these issues. Objectives To explore the association of the CYP1A1 T3801C and A2455G polymorphisms with BC risk. Methods The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines were followed. Results This study included 63 case–control studies from 56 publications on the CYP1A1 T3801C polymorphism (20,825 BC cases and 25,495 controls) and 51 case–control studies from 46 publications on the CYP1A1 A2455G polymorphism (20,124 BC cases and 29,183 controls). Overall, the CYP1A1 T3801C polymorphism was associated with significantly increased BC risk in the overall analysis, especially in Asians and Indians; the CYP1A1 A2455G polymorphism was associated with BC risk in the overall analysis, in Indians, and in postmenopausal women. However, after correction using the Bayesian false discovery probability (BFDP), associations remained significant only in Indians (CC vs. TT + TC: BFDP < 0.001) for the CYP1A1 T3801C polymorphism, and none remained significant for the CYP1A1 A2455G polymorphism. In addition, in the sensitivity analysis, no significant association was found in the overall analysis or in any subgroup. Moreover, all studies in Indians were of low quality. Therefore, the results may not be credible. Conclusion This meta-analysis strongly indicates that there is no significant association between the CYP1A1 T3801C and A2455G polymorphisms and BC risk. The increased BC risk is most likely attributable to false-positive results.



Introduction
Breast cancer (BC) is one of the most common cancers and the main cause of cancer mortality among women worldwide. Moreover, the incidence of BC differs across regions and races [1,2]. Cumulative evidence indicates that environment, lifestyle, tobacco, alcohol consumption, genetic factors, and several reproductive factors are important risk factors for BC [3][4][5][6].
In recent years, studies of gene polymorphisms in the development of BC have received much attention worldwide [7,8].
Cytochrome P450 1A1 (CYP1A1), which encodes the enzyme cytochrome P450 1A1, is a pivotal gene in the metabolism of carcinogens, particularly polycyclic aromatic hydrocarbons (PAHs) [9][10][11]. PAHs become carcinogenic once they are activated by xenobiotic-metabolizing enzymes into highly reactive metabolites [12]. Phase-I metabolic reactions are catalyzed by cytochrome P450 enzymes, and CYP1A1 is considered the foremost enzyme catalyzing the conversion of PAHs to highly reactive metabolites [13]. Therefore, CYP1A1 plays an important role in the etiology of BC. CYP1A1 T3801C and A2455G are two common polymorphisms whose potential impact on BC risk has been explored, and a potential role of CYP1A1 polymorphisms in BC risk has been hypothesized [14,15].
Both candidate-gene and genome-wide association studies (GWAS) have revealed several significant loci, in different cancer-regulating pathways, that modify the risk of breast carcinogenesis [16][17][18]. However, the genetic association studies in the subcontinent are primarily candidate-gene association studies and have often reported contradictory results. Moreover, in the past decade, nine meta-analyses have investigated the association between the CYP1A1 T3801C and A2455G polymorphisms and BC risk [19][20][21][22][23][24][25][26][27]. However, the results of these meta-analyses were also contradictory and heterogeneous (S1 Table). In total, 88 studies [S1 Appendix References] have evaluated the association between the CYP1A1 T3801C and A2455G polymorphisms and BC risk in different populations, and their results remain contradictory. Hence, we performed an updated systematic review and meta-analysis to assess these two associations.
cytochrome P-450 OR cytochrome P450) AND (polymorphism OR variant OR variation OR mutation OR SNP OR genome-wide association study OR genetic association study OR genotype OR allele) AND breast. No language restriction was applied. Additional studies were identified from the reference lists of reviews and meta-analyses published in the past decade. Eligible studies were identified by reading the title, abstract, and full text of each article. Moreover, we contacted the corresponding authors by e-mail to obtain detailed information when necessary.

Inclusion and exclusion criteria
Studies were included if they met the following criteria: (1) the study had a case-control or cohort design; (2) genotype frequencies or odds ratios (ORs) with 95% confidence intervals (CIs) were provided; (3) the study investigated the association between the CYP1A1 T3801C and A2455G polymorphisms and BC risk. Exclusion criteria were as follows: (1) the article was not on BC; (2) the study did not provide genotype data or ORs with 95% CIs; (3) the publication duplicated data reported elsewhere, in which case only the largest or the latest study was included.

Data extraction and quality assessment
Data extraction and quality score assessment were performed independently by two authors (Yang and He) using pre-designed tables, and the results were cross-checked to ensure accuracy. Disagreements were resolved by discussion between the two authors. The following information was collected from each study: first author, year of publication, country, ethnicity, source of controls, sample size, genotype distribution for cases and controls, and matching.
Quality assessment was performed by the two authors independently using a pre-designed scoring scale from a previous meta-analysis [29] (Table 1). The total score ranged from 0 to 20. Following two previously published meta-analyses [30,31], studies with scores of 0-7, 8-13, and 14-20 were classified as low, moderate, and high quality, respectively.

Statistical analysis
Crude ORs and 95% CIs were used to estimate the association between the CYP1A1 T3801C and A2455G polymorphisms and the risk of BC. Both polymorphisms were analyzed using the following five genetic models (given as T3801C contrast; A2455G contrast): homozygote comparison (CC vs. TT; GG vs. AA), heterozygote comparison (TC vs. TT; AG vs. AA), recessive model (CC vs. TC + TT; GG vs. AG + AA), dominant model (CC + TC vs. TT; GG + AG vs. AA), and allele comparison (C vs. T; G vs. A).
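To make the five contrasts concrete, the following minimal Python sketch computes the crude OR and 95% CI for each model from the genotype counts of a single study. It is an illustration only, not the Stata procedure used in this study: the function names are hypothetical, the CI is assumed to follow the standard Woolf (log) method, and zero cells are not handled.

```python
import math

def odds_ratio(a, b, c, d):
    """Crude OR with a Woolf 95% CI for a 2x2 table.

    a, b: cases carrying / not carrying the risk category
    c, d: controls carrying / not carrying the risk category
    (assumes all four cells are non-zero)
    """
    or_ = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log(OR)
    low = math.exp(math.log(or_) - 1.96 * se)
    high = math.exp(math.log(or_) + 1.96 * se)
    return or_, low, high

def genetic_models(case, ctrl):
    """case, ctrl: genotype counts ordered (TT, TC, CC)."""
    w1, h1, v1 = case   # wild-type, heterozygote, variant homozygote in cases
    w0, h0, v0 = ctrl   # same ordering in controls
    return {
        "CC vs. TT": odds_ratio(v1, w1, v0, w0),
        "TC vs. TT": odds_ratio(h1, w1, h0, w0),
        "CC vs. TC + TT": odds_ratio(v1, h1 + w1, v0, h0 + w0),
        "CC + TC vs. TT": odds_ratio(v1 + h1, w1, v0 + h0, w0),
        # Allele contrast: each subject contributes two alleles.
        "C vs. T": odds_ratio(2 * v1 + h1, 2 * w1 + h1,
                              2 * v0 + h0, 2 * w0 + h0),
    }
```

The same function applies to A2455G by passing counts in the order (AA, AG, GG) and reading the labels as the corresponding G/A contrasts.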
We used the Q test and the I² statistic to assess between-study heterogeneity (heterogeneity was considered significant if P < 0.01 and/or I² > 50%) [32]. For each genetic model contrast, summary ORs were calculated using a random-effects model [33,34]. The random-effects model was chosen for two main reasons: (1) the Q test has low statistical power to detect between-study heterogeneity, which is especially relevant when few studies are available; (2) the random-effects model is usually the more conservative choice when heterogeneity is present, whereas it reduces to the fixed-effect model when heterogeneity is absent. Subgroup analyses were performed to assess the effects in Asians, Caucasians, Africans, and Indians. Further subgroup analysis was conducted by menopausal status. Moreover, a meta-regression analysis was applied to explore the source of heterogeneity. Furthermore, a sensitivity analysis was performed in two ways: a single study was removed each time, and the dataset was restricted to studies that were of high quality, used matching, were in Hardy-Weinberg equilibrium (HWE), and performed genotyping blindly or with quality control [35]. The chi-square goodness-of-fit test was used to check HWE in control groups, and a deviation was considered statistically significant if P < 0.05 [36]. In addition, the Bayesian false discovery probability (BFDP) was used to correct for multiple comparisons [37]. A BFDP cutoff of 0.8 and a prior probability of 0.001 were used to assess whether positive associations were noteworthy. Finally, publication bias was assessed by Begg's funnel plot [38] and Egger's test [39]. All statistical analyses were performed using Stata version 12.0 (Stata Corporation, College Station, TX, USA).
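The pooling step can be sketched as follows. This is a minimal illustration that assumes the DerSimonian-Laird estimator for the between-study variance (the text specifies only "random-effects model") and takes per-study log ORs and their standard errors as input; the function name is hypothetical. The P-value of the Q test is omitted because it requires a chi-square distribution function outside the Python standard library.

```python
import math

def dersimonian_laird(log_ors, ses):
    """Random-effects pooling of log ORs (DerSimonian-Laird, assumed)."""
    w = [1 / se ** 2 for se in ses]              # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * y for wi, y in zip(w, log_ors)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_ors))
    df = len(log_ors) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0   # between-study variance
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    # Random-effects weights add tau2 to each within-study variance.
    w_re = [1 / (se ** 2 + tau2) for se in ses]
    pooled = sum(wi * y for wi, y in zip(w_re, log_ors)) / sum(w_re)
    se_pooled = math.sqrt(1 / sum(w_re))
    return {
        "Q": q,
        "I2": i2,
        "tau2": tau2,
        "or": math.exp(pooled),
        "ci": (math.exp(pooled - 1.96 * se_pooled),
               math.exp(pooled + 1.96 * se_pooled)),
    }
```

When the estimated between-study variance is zero, the random-effects weights equal the inverse-variance weights, which is the sense in which the model reduces to the fixed-effect model in the absence of heterogeneity.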

Study characteristics
Fig 1 shows the flow diagram for identifying and including studies. Overall, a total of 108 studies were involved in the present study. Then, 7 studies were excluded because their data overlapped with those of another 7 studies. Finally, 75 articles were eligible for this meta-analysis. S2 Table lists [...]. In addition, ten and twelve studies analyzed the CYP1A1 T3801C and A2455G polymorphisms, respectively, in premenopausal women, and thirteen and seventeen studies analyzed the CYP1A1 T3801C and A2455G polymorphisms, respectively, in postmenopausal women, as shown in S3 Table.

Quantitative synthesis
[...] vs. AA + AG: OR = 3.59, 95% CI: 1.09-11.80) and postmenopausal women (OR = 1.27, 95% CI: 1.07-1.50 for GG vs. AA + AG) for the CYP1A1 A2455G polymorphism. However, after BFDP correction, no significant associations were found in the overall analysis, in Indians, or in postmenopausal women.

Heterogeneity and sensitivity analyses
Significant heterogeneity was observed in this study. A meta-regression analysis was therefore conducted to explore the source of heterogeneity by ethnicity, sample size, source of controls, type of controls, matching, HWE, and quality score. A source of heterogeneity was found only in quality score (AG vs. AA: P = 0.031, G vs. A: P = 0.030) for the CYP1A1 A2455G polymorphism. A sensitivity analysis was then performed to assess the stability of the results (as shown in Tables 2 and 3). The results did not change when a single study was deleted each time in the meta-analysis (figures not shown). However, when we included only studies that were of high quality, were in HWE, used matching, and performed genotyping blindly or with quality control, no significant association was observed between the CYP1A1 T3801C and A2455G polymorphisms and risk of BC.
S4 Table shows the results of published meta-analyses of the CYP1A1 T3801C and A2455G polymorphisms and BC risk in different ethnic groups. Only one study [19] found that the CYP1A1 T3801C polymorphism was associated with significantly increased BC risk in Indians. Concerning the CYP1A1 A2455G polymorphism, two studies [20,21] observed a significantly increased BC risk in Caucasians and one study [22] found a clearly decreased BC risk in East Asians. However, after BFDP correction, only the association of the CYP1A1 T3801C polymorphism in Indians remained significant (CC vs. TT: BFDP < 0.001; TC + CC vs. TT: BFDP < 0.001).
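The leave-one-out part of the sensitivity analysis described in this section can be sketched as below. This is a minimal illustration, assuming per-study log ORs and standard errors as input and DerSimonian-Laird random-effects pooling; the function names and the study labels are hypothetical, and the code is not the Stata routine used in this study.

```python
import math

def pool_random(log_ors, ses):
    """DerSimonian-Laird pooled log OR and its standard error (assumed)."""
    w = [1 / s ** 2 for s in ses]
    sw = sum(w)
    fixed = sum(wi * y for wi, y in zip(w, log_ors)) / sw
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_ors))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (len(ses) - 1)) / c) if c > 0 else 0.0
    w_re = [1 / (s ** 2 + tau2) for s in ses]
    pooled = sum(wi * y for wi, y in zip(w_re, log_ors)) / sum(w_re)
    return pooled, math.sqrt(1 / sum(w_re))

def leave_one_out(names, log_ors, ses):
    """Re-pool after omitting each study in turn.

    Returns {omitted study: (OR, CI low, CI high, significant?)}.
    """
    out = {}
    for i, name in enumerate(names):
        ys = log_ors[:i] + log_ors[i + 1:]
        ss = ses[:i] + ses[i + 1:]
        est, se = pool_random(ys, ss)
        low, high = math.exp(est - 1.96 * se), math.exp(est + 1.96 * se)
        # The pooled estimate is significant if the 95% CI excludes 1.
        out[name] = (math.exp(est), low, high, not (low <= 1.0 <= high))
    return out
```

A result is regarded as stable under this check when the significance flag is the same regardless of which study is omitted.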

Discussion
Cytochrome P450s are enzymes that catalyze phase-I metabolic reactions. Cytochrome P450 1A1 (CYP1A1) is a member of the CYP family and plays an important role in phase-I metabolism of polycyclic aromatic hydrocarbons as well as in estrogen metabolism. Dysfunction of CYP1A1 can cause damage to DNA, lipids, and proteins, which may further lead to carcinogenesis.
Overall, the CYP1A1 T3801C polymorphism was associated with significantly increased BC risk in the overall analysis, especially in Asians and Indians; the CYP1A1 A2455G polymorphism was associated with BC risk in the overall analysis, in Indians, and in postmenopausal women. A published meta-analysis [19] found that the CYP1A1 T3801C polymorphism was associated with significantly increased BC risk in South Indians. Concerning the CYP1A1 A2455G polymorphism, two meta-analyses [20,21] observed a significantly increased BC risk in Caucasians and one study [22] found a clearly decreased BC risk in East Asians. Meta-analyses of gene polymorphisms and disease risk typically examine several subgroups and genetic models, which entails multiple comparisons; under these circumstances, the pooled P-value must be adjusted [40]. Wakefield et al. [37] proposed a precise Bayesian measure of false discovery in genetic epidemiology studies. Therefore, the BFDP was used to assess the significant associations in this study. After BFDP correction, associations remained significant only in Indians (CC vs. TT + TC: BFDP < 0.001) for the CYP1A1 T3801C polymorphism. However, in the sensitivity analysis, no significant association was found in the overall analysis or in any subgroup. Moreover, all studies in Indians were of low quality. Therefore, the results may not be credible. Further high-quality studies are needed to confirm the association in Indians.
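For readers unfamiliar with the BFDP, the following minimal sketch shows how it can be computed from a pooled OR and its 95% CI using the approximate Bayes factor of the Wakefield method [37]. The prior probability of 0.001 is the value used in this study; the prior upper bound on the OR (`or_upper = 1.5`) and the function name are assumptions made for illustration, since the text does not state the prior on the effect size.

```python
import math

def bfdp(or_hat, ci_low, ci_high, prior_h1=0.001, or_upper=1.5):
    """Bayesian false discovery probability from an OR and its 95% CI.

    prior_h1: prior probability of a true association (0.001 in this study)
    or_upper: assumed 97.5% prior upper bound on the OR (illustrative)
    """
    theta = math.log(or_hat)
    # Variance of the log OR recovered from the width of the 95% CI.
    v = ((math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)) ** 2
    # Prior variance of the log OR under the alternative hypothesis.
    w = (math.log(or_upper) / 1.96) ** 2
    z2 = theta ** 2 / v
    # Approximate Bayes factor for the null relative to the alternative.
    abf = math.sqrt((v + w) / v) * math.exp(-z2 * w / (2 * (v + w)))
    prior_odds_h0 = (1 - prior_h1) / prior_h1
    return abf * prior_odds_h0 / (1 + abf * prior_odds_h0)
```

An association is regarded as noteworthy when the BFDP falls below the cutoff of 0.8. Under the assumptions above, the A2455G estimate for postmenopausal women reported in this study (OR = 1.27, 95% CI: 1.07-1.50) yields a BFDP above 0.8, in line with the conclusion that it is not noteworthy after correction.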
Obvious publication bias was observed by Begg's funnel plots and Egger's test for the CYP1A1 T3801C polymorphism and BC risk in the current meta-analysis. Small studies are more likely to be published if they report positive results, and they tend to produce false-positive results because they may lack rigor and are often of low quality. In addition, random error and bias are common in studies with small sample sizes; therefore, their conclusions on gene polymorphisms and disease risk may be unreliable. Figs 2-4 also indicate that the asymmetry of the funnel plots was caused by some low-quality studies with small samples.
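The idea behind Egger's test can be sketched as follows: the standardized effect (log OR divided by its standard error) is regressed on precision (the reciprocal of the standard error), and an intercept that differs from zero indicates funnel-plot asymmetry. The code below is a minimal ordinary-least-squares illustration with a hypothetical function name; it returns the t statistic for the intercept but not its P-value, which would require a t distribution with n - 2 degrees of freedom.

```python
import math

def egger_regression(log_ors, ses):
    """Egger's regression of standardized effect on precision (n >= 3)."""
    x = [1 / se for se in ses]                      # precision
    y = [lo / se for lo, se in zip(log_ors, ses)]   # standardized effect
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx                     # asymmetry measure
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(r * r for r in resid) / (n - 2)        # residual variance
    se_int = math.sqrt(s2 * (1 / n + mx ** 2 / sxx))
    # Guard against a perfect fit, where the intercept SE is exactly zero.
    t_stat = intercept / se_int if se_int > 0 else float("nan")
    return {"intercept": intercept, "slope": slope,
            "se_intercept": se_int, "t": t_stat}
```

If small studies report systematically larger effects, the effect grows with the standard error, and this shows up as a positive intercept.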
S4 Table shows the results of published meta-analyses of the CYP1A1 T3801C and A2455G polymorphisms and BC risk in different ethnic groups (see also S1 Table). A significant inconsistency was observed in the classification of ethnic groups among the published meta-analyses, especially for studies from the USA, India, and Brazil (cells with red color in S1 Table). Moreover, we found that the published meta-analyses included some duplicate studies and omitted many eligible studies. Furthermore, none of them adjusted positive results for multiple comparisons using the BFDP test.
Of these published meta-analyses, one involved studies only from African populations [23], one from the Chinese population [25], one from Indians [27], and the remaining ones examined all races [19-22, 24, 26]. The largest previous meta-analyses were performed in 2014 for CYP1A1 T3801C (47 studies; 16,272 cases and 20,930 controls) and A2455G (38 studies; 15,969 cases and 24,931 controls) and BC risk [19,21]. The number of studies and the sample size of the present meta-analysis (63 studies including 20,825 BC cases and 25,495 controls for T3801C, and 51 studies including 20,124 BC cases and 29,183 controls for A2455G) were larger than those of the published meta-analyses. Comparison with the present study reveals several deficiencies in the previous meta-analyses. First, none of the previous meta-analyses [19][20][21][22][23][24][25][26][27] assessed the quality of the included literature. Second, the previous meta-analyses [19][20][21][22][23][24][25][26] did not adjust positive results for multiple comparisons, except for one study that used the FDR method [27]. Third, several published meta-analyses did not perform a sensitivity analysis. Moreover, the previous meta-analyses were incomplete in the studies they included, and some duplicate studies were not excluded (S1 and S4 Tables). Finally, an obvious inconsistency was found in the classification of ethnic groups between these published meta-analyses, especially for studies from the USA, India, Brazil, and so on (cells with blue color in S1 Table). Hence, we performed an updated meta-analysis to further explore the association of the CYP1A1 T3801C and A2455G polymorphisms with BC risk. In the current meta-analysis, a larger sample size was collected. In addition, we assessed the quality of the eligible studies. Moreover, we applied meta-regression analysis to investigate the source of heterogeneity.
Further, we performed a sensitivity analysis; in particular, we used a dataset that included only studies that were of high quality, used matching, were in HWE, and performed genotyping blindly or with quality control (an attempt to avoid the random errors and confounding bias that sometimes distort the results of molecular epidemiological studies). Finally, we used the BFDP method to assess the significant associations.
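The HWE criterion used to restrict the sensitivity dataset can be illustrated with the chi-square goodness-of-fit test on control genotype counts. The sketch below is a minimal version with a hypothetical function name; it uses the fact that, for one degree of freedom, the chi-square upper-tail probability equals `erfc(sqrt(chi2 / 2))`, and it assumes both alleles are present in the controls.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    n_aa, n_ab, n_bb: control genotype counts (e.g. TT, TC, CC).
    Returns (chi2, p_value) with one degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)      # frequency of the first allele
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper-tail probability of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

Following the threshold stated in the Statistical analysis section, a study would be retained in the HWE-restricted dataset only when the P-value for its control group is at least 0.05.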
Despite our efforts, this study still has several limitations. First, only published articles were included, so publication bias may be unavoidable. Second, some subgroup analyses included few studies; for instance, there were only five studies on the CYP1A1 T3801C polymorphism and BC risk in Indians and four studies on the CYP1A1 A2455G polymorphism and BC risk in Africans. Third, data were not stratified by age, family history, smoking status, or other environmental factors. Hence, a more precise analysis should be performed when enough data are available in the future.

Conclusions
In summary, this meta-analysis strongly indicates that there is no significant association between the CYP1A1 T3801C and A2455G polymorphisms and BC risk. The increased BC risk is most likely attributable to false-positive results. Significant associations should be interpreted with caution, and it is essential that future analyses be based on sample sizes well powered to identify variants having modest effects on BC risk, especially combined effects such as gene-gene and gene-environment interactions.