Epigenetic, Genetic and Environmental Interactions in Esophageal Squamous Cell Carcinoma from Northeast India

Background Esophageal squamous cell carcinoma (ESCC) develops as a result of complex epigenetic, genetic and environmental interactions. Epigenetic changes such as promoter hypermethylation of multiple tumour suppressor genes are frequent events in cancer, and certain habit-related carcinogens are thought to be capable of inducing aberrant methylation. However, the effects of environmental carcinogens depend on the extent to which they are metabolized by carcinogen-metabolizing enzymes. Key interactions between habit-related factors and carcinogen-metabolizing gene polymorphisms in modulating promoter methylation are therefore likely, but they remain largely unexplored in ESCC. Here, we studied the interaction of various habit-related factors and polymorphisms of the GSTM1/GSTT1 genes in inducing promoter hypermethylation of multiple tumour suppressor genes. Methodology/Principal Findings The study included 112 ESCC cases and 130 age- and gender-matched controls. Conditional logistic regression was used to calculate odds ratios (OR), and multifactor dimensionality reduction (MDR) was used to explore high-order interactions. Tobacco chewing and smoking were the major individual risk factors for ESCC after adjusting for all potential confounding factors. With regard to methylation status, significantly higher methylation frequencies were observed in tobacco chewers than in non-chewers for all four genes under study (p<0.01). In logistic regression analysis, betel quid chewing, alcohol consumption and the null GSTT1 genotype imparted the maximum risk for ESCC without promoter hypermethylation, whereas tobacco chewing, smoking and GSTT1 null variants were the most important risk factors for ESCC with promoter hypermethylation.
MDR analysis revealed two predictor models, one for ESCC with promoter hypermethylation (Tobacco chewing/Smoking/Betel quid chewing/GSTT1 null) and one for ESCC without promoter hypermethylation (Betel quid chewing/Alcohol/GSTT1), with testing balanced accuracy (TBA) of 0.69 and 0.75 respectively and cross-validation consistency (CVC) of 10/10 in both models. Conclusion Our study identified a possible interaction between tobacco consumption and carcinogen-metabolizing gene polymorphisms in modulating promoter methylation of tumour suppressor genes in ESCC.


Introduction
Esophageal cancer (EC) is the sixth most common cancer in men worldwide, with distinct geographical differences in its incidence rate and pattern. The incidence and mortality rates of EC are highest in certain Asian countries, stretching from Northern Iran through the central Asian republics to North-Central China, a region referred to as the "esophageal cancer belt". Around 90% of the esophageal cancers in these areas are squamous cell carcinomas (SCCs) and are thought to develop as a result of complex interactions between environmental, genetic and epigenetic factors [1]. However, these interactions are not well understood in ESCC. Environmental and dietary factors like smoking and smokeless tobacco consumption, betel quid chewing, alcohol intake and poor nutrition are considered to be associated with ESCC in the high-risk areas [2,3]. Moreover, polymorphisms in various carcinogen-metabolizing genes modulate the effect of these environmental carcinogens and further increase the risk of ESCC [4]. The interaction of tobacco-related carcinogens with carcinogen-metabolizing genes like GSTM1 and GSTT1 was found to modify the effect of tobacco exposure, thereby increasing the susceptibility to ESCC [5,6].
Epigenetic events like aberrant DNA methylation of tumour suppressor genes (TSGs) are considered important factors in the development and progression of ESCC. The TSGs affected are involved in different cellular pathways, such as cell cycle regulation (p16), apoptosis (DAPK), DNA repair (BRCA1) and protection of DNA (GSTP1) [7,8]. There is growing evidence that tobacco smoke associated carcinogens and carcinogen-metabolizing gene polymorphisms are capable of modulating DNA methylation in cell cultures, in animal models and in certain tobacco-related cancers like lung cancer [9-12]. Cigarette smoke has also been found to induce promoter methylation of particular genes in esophageal epithelial and ESCC cell lines; however, no study involving human subjects has been carried out [13,14]. Furthermore, the null genotype of the GSTM1 gene was associated with an increased susceptibility to CpG island hypermethylation in gastric mucosa [15]. Although ESCC is one of the most important tobacco-related cancers, the interaction of smoking and smokeless tobacco, carcinogen-metabolizing gene polymorphisms and aberrant DNA methylation in ESCC has remained largely unexplored.
This study was conducted on a unique population of Northeast (NE) India, where tobacco-related habits like tobacco chewing and beedi and cigarette smoking are common. Moreover, consumption of a combination of areca nut, betel leaf and slaked lime with or without tobacco, called 'betel quid (BQ)' or locally 'pan' or 'tambul', is customary in this population. The Assam and Mizoram states of NE India are among the regions with the highest incidence of esophageal cancer, with age-adjusted rates of around 17 to 27 per 100,000 population [16]. Although previous studies on the risk factors of ESCC in the NE Indian population point to an association of tobacco and BQ chewing with its carcinogenesis, very little is established about the environmental, genetic or epigenetic risk factors [17]. Moreover, no studies have been conducted on DNA methylation signatures of ESCC patients in this population. Here, we analyzed the association of various habit-related factors (tobacco chewing, beedi and cigarette smoking, BQ chewing and alcohol consumption) and carcinogen-metabolizing gene polymorphisms (GSTM1, GSTT1) with ESCC, overall and stratified by promoter hypermethylation of the TSGs p16, DAPK, BRCA1 and GSTP1, using logistic regression analysis. Multifactor dimensionality reduction (MDR) and false positive report probability (FPRP) were used to predict high-order interactions among these epigenetic, genetic and environmental factors in ESCC in the NE Indian population.

Study Population
Surgically excised cancer tissues (prior to chemo-radiation therapy), biopsy specimens or formalin-fixed paraffin-embedded tissues of 112 histopathologically confirmed ESCC patients, collected from different cancer hospitals of NE India between January 2011 and October 2012, were included. Histologically proven normal margins from 30 patients undergoing curative surgery for ESCC were used for comparison. Oral swabs from the inner oral cavity of 130 age- and gender-matched healthy controls were also collected. Cases and controls with a family history of esophageal or other cancers were excluded. All possible precautions were taken to avoid cross-contamination during collection and processing of the samples.

Ethics Statement
The study was approved by the Institutional Review Board of Cachar Cancer Hospital and Research Centre (http://cacharcancerhospital.org), Assam, and written consent was obtained from the subjects (IRB No: IRB/CCHRC/01/2010).

Exposure to Environmental Factors
Demographic and habit-related data, such as dietary factors, lifetime betel quid and tobacco chewing, smoking and alcohol consumption details, family history of cancer in first-degree relatives, co-morbid conditions and clinical features of esophageal cancer with complete medical history, were collected using a structured questionnaire. Tobacco and betel quid chewing, smoking and alcohol consumption were included in the analysis as ever or never. Betel quid chewing was defined as the chewing of betel leaf, areca nut (raw/dried/fermented) and slaked lime without tobacco. Similarly, tobacco chewing was defined as the chewing of dried tobacco leaf, zarda (moist or dry tobacco mixed with a variety of colourings and spices) or khaini (tobacco mixed with lime and flavours), either alone or with betel quid. For tobacco and betel quid chewing, subjects who had never chewed, had chewed fewer than 100 times, or were non-chewers at the time of data collection were considered never chewers. Subjects who had never smoked, had smoked fewer than 100 cigarettes/beedis in their lifetime, or were non-smokers at the time of data collection were considered never smokers. The majority of the subjects came from a rural background and worked in agriculture, business or small jobs, which do not expose them to substantial occupational hazards.

DNA Extraction
Genomic DNA was isolated from cancerous biopsy samples, surgically excised cancer tissues and inner oral swabs by the standard phenol/chloroform protocol [18]. The isolated DNA was then dissolved in Tris-EDTA buffer and stored at −80 °C for further analysis. Genomic DNA from formalin-fixed paraffin-embedded tissues was isolated using the Bioline Isolate Genomic DNA minikit (Bioline, UK) following the manufacturer's instructions.

Bisulfite Conversion of DNA and Methyl Specific PCR
Bisulfite modification of genomic DNA was performed using the Imprint® DNA Modification kit (Sigma-Aldrich), following the manufacturer's instructions. Promoter methylation status of p16, DAPK, GSTP1 and BRCA1 was determined by Methylation Specific PCR (MSP) using previously described primers and conditions [20]. We used two sets of primers for each gene, one specific for methylated DNA at the promoter region and the other specific for unmethylated DNA. DNA from peripheral blood lymphocytes treated with SssI methyltransferase was used as the positive control, and DNA from peripheral blood lymphocytes of healthy individuals was used as the negative control for methylated genes. PCR products were visualized on a 3% agarose gel.

Statistical Analysis
Associations between the environmental, genetic and epigenetic factors were assessed by conditional logistic regression, and a p-value <0.05 was considered statistically significant. Categorical data were compared using Fisher's exact test or the chi-square test, as appropriate.
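As a rough illustration of how an odds ratio and its 95% confidence interval are derived from a 2x2 table, the sketch below computes a crude (unadjusted) OR with a Woolf-type interval. It is not the conditional logistic regression used in the study and does not adjust for confounders. The counts in the usage note are back-calculated from the reported GSTT1 null frequencies (41.07% of 112 cases, 29.23% of 130 controls) and should be treated as an assumption; the function name is hypothetical.

```python
import math

def crude_odds_ratio(exp_cases, unexp_cases, exp_controls, unexp_controls, z=1.96):
    """Crude odds ratio with a Woolf (log-based) confidence interval.

    Unadjusted illustration only; the study itself used conditional
    logistic regression with adjustment for confounders.
    """
    odds_ratio = (exp_cases * unexp_controls) / (unexp_cases * exp_controls)
    # Standard error of log(OR) from the four cell counts
    se_log_or = math.sqrt(
        1 / exp_cases + 1 / unexp_cases + 1 / exp_controls + 1 / unexp_controls
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Assumed counts: 46 of 112 cases and 38 of 130 controls carry GSTT1 null
or_, lo, hi = crude_odds_ratio(46, 112 - 46, 38, 130 - 38)
print(f"crude OR = {or_:.2f}, 95% CI = {lo:.2f}-{hi:.2f}")
```

With these assumed counts the crude OR is about 1.69, which is close to, but not the same as, the adjusted estimate reported for GSTT1 null variants.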

Multifactor Dimensionality Reduction (MDR) Analysis
The MDR software package (www.multifactordimensionalityreduction.org) was used to detect gene-gene and gene-environment interactions. MDR is a model-free, non-parametric approach that can detect higher-order interactions even in a small population by reducing the dimensionality of multi-locus information to identify the polymorphisms or factors associated with an increased risk of disease. This helps to overcome the low statistical power that results from the very high degrees of freedom when logistic regression is used to study higher-order interactions. The best model for each order of interaction was selected by maximum cross-validation consistency (CVC) and testing balanced accuracy (TBA). Interaction models showing the highest TBA and CVC were further tested by 1000-fold permutation tests and the χ² test at the 0.05 significance level.
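The core MDR step described above can be sketched as follows: every combination of factor levels (a "cell") is labelled high-risk when its case/control ratio exceeds the overall ratio, and the resulting one-dimensional high/low-risk variable is scored by balanced accuracy. This is a minimal sketch on toy data, not the MDR software itself; it reports training accuracy only and omits the cross-validation that yields TBA and CVC. All function and variable names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def mdr_balanced_accuracy(rows, labels, factors):
    """Balanced accuracy of the MDR high/low-risk rule for one factor set.

    rows    : list of tuples of categorical factor levels
    labels  : list of 0 (control) / 1 (case)
    factors : indices of the factors combined into one model
    """
    n_case = sum(labels)
    n_ctrl = len(labels) - n_case
    threshold = n_case / n_ctrl  # overall case/control ratio
    counts = defaultdict(lambda: [0, 0])  # cell -> [controls, cases]
    for row, y in zip(rows, labels):
        counts[tuple(row[f] for f in factors)][y] += 1
    true_pos = true_neg = 0
    for ctrl, case in counts.values():
        # High-risk cell: case/control ratio above the overall ratio
        if case > threshold * ctrl:
            true_pos += case
        else:
            true_neg += ctrl
    return 0.5 * (true_pos / n_case + true_neg / n_ctrl)

def best_model(rows, labels, names, order):
    """Exhaustively score all factor sets of a given order."""
    scored = [
        (mdr_balanced_accuracy(rows, labels, combo), combo)
        for combo in combinations(range(len(names)), order)
    ]
    accuracy, combo = max(scored)
    return [names[i] for i in combo], accuracy

# Toy data: disease status depends only on the joint pattern of A and B
rows = [(a, b, n) for a in (0, 1) for b in (0, 1) for n in (0, 1)]
labels = [a ^ b for a, b, _ in rows]
print(best_model(rows, labels, ["A", "B", "noise"], order=2))
```

In the toy data neither A nor B is informative alone, yet the two-factor model separates cases from controls, which is the kind of interaction that a main-effects logistic model can miss.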

Interaction Entropy Graphs
The entropy-based analysis included in the MDR software package was used to determine synergistic and non-synergistic interactions among the variables. The graphs comprise nodes showing the entropy removed by individual variables, and pairwise connections between them showing the entropy of interaction. Positive entropy signifies synergy, negative entropy indicates redundancy, and zero entropy indicates independence.
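The quantities behind such a graph can be illustrated with standard information-theoretic definitions: the entropy removed by a single variable is its information gain about case-control status, and the pairwise interaction term is the joint gain minus the two individual gains. The sketch below assumes these textbook definitions and returns values in bits; how the MDR package scales them to percentages is not reproduced here. Names are hypothetical.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (bits) of a list of categorical values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def info_gain(x, y):
    """Entropy of y removed by knowing x: H(y) - H(y | x)."""
    n = len(y)
    conditional = 0.0
    for level in set(x):
        subset = [yi for xi, yi in zip(x, y) if xi == level]
        conditional += len(subset) / n * entropy(subset)
    return entropy(y) - conditional

def interaction_gain(a, b, y):
    """Positive: synergy; negative: redundancy; zero: independence."""
    joint = list(zip(a, b))
    return info_gain(joint, y) - info_gain(a, y) - info_gain(b, y)

# Toy example: status depends only on the combination of a and b (synergy)
a = [0, 0, 1, 1]
b = [0, 1, 0, 1]
status = [0, 1, 1, 0]
print(info_gain(a, status), info_gain(b, status), interaction_gain(a, b, status))
```

In this toy case each variable removes no entropy on its own while the pair removes all of it, so the interaction term is positive; two identical predictors would instead give a negative (redundant) value.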

False Positive Report Probability (FPRP)
Results of higher-order gene-environment interactions are often affected by the risk of being false positives. To assess the false positive report probability (FPRP) and the consistency of our MDR results, we used odds ratios and 95% confidence intervals from the MDR analysis, observed p-values and power to detect odds ratios (ORs) of 1.5 and 2.0 in a Bayesian approach [21]. Given our small sample size, the FPRP was computed using prior probabilities ranging from 0.25 to 10⁻⁵ with a preset FPRP threshold for noteworthiness of 0.5.
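The dependence of FPRP on the prior probability can be sketched with the commonly used expression FPRP = α(1 − π) / [α(1 − π) + (1 − β)π], where α is the observed p-value, 1 − β the power to detect the chosen OR, and π the prior probability of a true association. It is assumed here that this is the formulation referred to in [21]. The p-value, power and grid of priors in the example are hypothetical placeholders, not values from this study.

```python
def fprp(p_value, power, prior):
    """False positive report probability for one prior probability."""
    false_positive = p_value * (1.0 - prior)  # alpha * (1 - pi)
    true_positive = power * prior             # (1 - beta) * pi
    return false_positive / (false_positive + true_positive)

def is_noteworthy(p_value, power, prior, threshold=0.5):
    """A finding is noteworthy when its FPRP falls below the preset threshold."""
    return fprp(p_value, power, prior) < threshold

# Hypothetical inputs: p = 0.0001, power = 0.9; priors span 0.25 to 1e-5
for prior in (0.25, 0.1, 1e-2, 1e-3, 1e-4, 1e-5):
    value = fprp(0.0001, 0.9, prior)
    print(f"prior = {prior:g}: FPRP = {value:.3f}, noteworthy = {value < 0.5}")
```

Lower priors push the FPRP upward, which is why a model that stays below the threshold at priors such as 10⁻³ or lower is regarded as more robust than one that is noteworthy only at 0.25 or 0.1.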

Characteristics of the Subjects Under Study
The cases comprised 63% males and 37% females, and the controls 66% males and 34% females. The median age was 55 years (range = 30-76 years) for cases and 57 years (range = 25-85 years) for controls. Most of the subjects belonged to rural areas (73% of cases and 79% of controls) and had weak financial conditions. Betel quid chewing with or without tobacco was the most prevalent habit, as it is customary in this population of NE India. Null variants of the GSTM1 and GSTT1 genes were present in 39.28% and 41.07% of the cases and in 31.53% and 29.23% of the controls, respectively.
The methylation index (MI) (calculated as the ratio of the number of methylated promoters to the total number of promoters under study) ranged from 0 to 1. Of the 112 patients, 27 (24.10%) had an MI of 0, 46 (41.07%) had an MI of 0.25-0.5 and 39 (34.82%) had an MI of 0.75-1.0. The frequency of promoter methylation was significantly higher in tobacco chewers than in non-chewers for all the genes under study (p<0.001, Figure 2A), whereas smokers had a higher frequency of p16, DAPK and GSTP1 methylation than non-smokers (Figure 2B).
Tobacco chewing and smoking were found to be the major risk factors for ESCC after adjusting for potential confounding factors like age, gender, betel quid chewing and alcohol consumption. Although the practice of betel quid chewing was very common, it was not found to be significantly associated with ESCC independently. Similarly, alcohol consumption showed only a modest association with ESCC (adjusted OR = 1.23 [95% CI = 0.67-2.46]). Null variants of GSTM1 and GSTT1 conferred a moderately increased risk of ESCC; however, the increase in risk was significant for GSTT1 null variants only (Table 1).
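The methylation index defined in this section is a simple proportion, so with four genes it can only take the values 0, 0.25, 0.5, 0.75 and 1. The sketch below computes it from per-gene MSP calls and assigns the three strata used in the analysis; the function names and the example patient are hypothetical.

```python
GENES = ("p16", "DAPK", "GSTP1", "BRCA1")

def methylation_index(msp_calls):
    """Fraction of studied promoters called methylated by MSP.

    msp_calls maps each gene name to True (methylated) or False.
    """
    return sum(bool(msp_calls[gene]) for gene in GENES) / len(GENES)

def mi_stratum(mi):
    """Assign the MI groups used for stratified analysis."""
    if mi == 0:
        return "MI = 0"
    if mi <= 0.5:
        return "MI = 0.25-0.5"
    return "MI = 0.75-1.0"

# Hypothetical patient with DAPK and GSTP1 promoters methylated
calls = {"p16": False, "DAPK": True, "GSTP1": True, "BRCA1": False}
mi = methylation_index(calls)
print(mi, mi_stratum(mi))
```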

Risk Assessment of ESCC with Promoter Hypermethylation
The effects of environmental factors and genetic polymorphisms on ESCC stratified by promoter hypermethylation status, as compared to controls, are shown in Table 2. Both smokeless and smoked forms of tobacco consumption carried the highest risk of ESCC with promoter methylation for all four genes under study. Tobacco chewing conferred 4.84-, 5.69-, 5.28- and 6.27-fold increased risk, and smoking conferred 5.14-, 2.67-, 2.63- and 2.84-fold increased risk, of ESCC with promoter hypermethylation of the p16, DAPK, GSTP1 and BRCA1 genes respectively. In addition, a significant association was observed between GSTM1 null genotypes and promoter methylation of the p16, DAPK and GSTP1 genes. Null genotypes of the GSTT1 gene were associated with p16 and BRCA1 methylation only.

MDR Analysis
The best predictive models of interaction between environmental and genetic parameters up to four orders of interaction, showing the CVC, training and testing balanced accuracy, and p-values of the chi-square and 1000-fold permutation tests, are summarized in Table 4. In addition, tobacco chewing was found to be the major first-order interaction term for both ESCC with methylation index of

Interaction Entropy Models
Interaction entropy graphs were constructed from the MDR results for ESCC with and without promoter methylation (Figure 3A and 3B). The model constructed for ESCC cases without promoter hypermethylation and controls showed a strong independent effect of betel quid chewing, with a synergistic interaction with tobacco chewing (0.73%). Substantial entropy (2.43%) was removed by the GSTT1 null genotype and by its interaction with the GSTM1 null genotype (0.52%). Although only a small percentage of the entropy in the case-control group was explained by the GSTM1 null genotype (0.81%) and alcohol consumption (1.71%) individually, their interaction removed 2.02% of the entropy. Moreover, alcohol consumption showed strong synergy with betel quid chewing (1.71%) and the GSTT1 null genotype (1.49%). In the model for ESCC cases with promoter hypermethylation and controls, sizeable entropy was removed from the case-control group by tobacco chewing (8.48%) and smoking (3.85%) individually, by their interaction with each other (3.85%) and by their interaction with GSTT1 null genotypes (0.66%). In addition, only a minuscule proportion of the entropy could be explained by betel quid chewing (0.13%) and GSTT1 null (0.66%) on their own, but a larger percentage was removed by the interaction of these two factors (1.45%).

False Positive Report Probability (FPRP) of MDR Best Models
The FPRPs of the best models selected in MDR analysis are summarized in Table 5. The best interaction models for ESCC, ESCC with promoter hypermethylation and ESCC with methylation index of 0.25-0.5 were noteworthy even for very low prior probability assumptions (up to 10⁻³ to 10⁻⁵) for detecting ORs of 1.5 and 2.0 at an FPRP value of 0.5. The predicted best models for ESCC without promoter methylation and ESCC with methylation index 0.75-1.0 were noteworthy for low prior probability assumptions (10⁻² to 10⁻³) when detecting OR = 2.0, but they indicated true associations only for high to moderate prior probability assumptions (0.25-0.10) for OR = 1.5.

Discussion
In this case-control study, we examined the association and interaction of various habit-related factors and carcinogen-metabolizing gene polymorphisms in ESCC, overall and stratified by promoter hypermethylation of multiple tumour suppressor genes, using both conventional logistic regression and the MDR approach. We used this non-parametric, genetic model-free approach to study complex genetic, environmental and epigenetic interactions in ESCC. We identified tobacco consumption as the major risk factor for ESCC and found evidence for its probable role in modulating promoter hypermethylation. Moreover, the two distinct interaction models for ESCC with and without promoter hypermethylation suggest that different gene-gene or gene-environment interactions operate in the two groups.
Tobacco smoking and alcohol consumption are considered among the prominent causes of esophageal cancer worldwide [22,23]. Although our study confirmed tobacco smoking (beedi and cigarette) as a predominant risk factor for ESCC, the highest risk in this population was associated with tobacco chewing. Tobacco is chewed in various forms, either alone or with slaked lime or betel quid, and the spit is often swallowed. Like tobacco smoke, smokeless forms of tobacco are known to contain several carcinogenic compounds, the most potent of which are the tobacco-specific N-nitrosamines such as N'-nitrosonornicotine (NNN) and 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanone (NNK) [24]. Although environmental and lifestyle factors are undoubtedly associated with ESCC development, only a minuscule proportion of exposed individuals actually develop cancer in due course. This is largely due to differences in the inherent carcinogen detoxification capabilities of these individuals, defined by the potency of the various carcinogen-metabolizing enzymes that catalyze the breakdown of carcinogens present in the body. The GSTM1 and GSTT1 genes are responsible for the degradation of several carcinogenic compounds present in tobacco [25]. Null genotypes of GSTM1 and GSTT1 have been considered to be associated with an increased risk of ESCC [26,27]. In the present study, null genotypes of both GSTM1 and GSTT1 genes were more frequent in cases than in controls, imparting 1.44-fold and 1.74-fold risk of developing ESCC respectively. A study from a Chinese population documented a 2.17-fold increased risk of ESCC in GSTM1 null individuals compared with GSTM1 carriers [27]. However, two other studies failed to establish any association between GSTM1 and GSTT1 polymorphisms and risk of ESCC [28,29]. The best model for ESCC in MDR analysis was the interaction of tobacco chewing, betel quid chewing, smoking and GSTT1 null genotype, with an OR of 8.66 [95% CI = 4.41-17.01, p<0.0001].
Although no prior report used MDR to study these risk factors in ESCC, an earlier case-control study using logistic regression in the population of north-east India documented the highest risk of esophageal cancer in betel quid and tobacco chewers with a smoking habit (OR = 15.3 in males, OR = 27.4 in females) [17]. A study conducted on a South Asian population established a 21.4-fold increased risk of ESCC in betel quid and tobacco chewers who smoked cigarettes [30]. However, neither study took genetic factors into consideration.
Promoter hypermethylation of tumour suppressor genes is recognized as one of the key events in the initiation and progression of cancer, acting by repressing the expression of the corresponding genes. Here, we studied the promoter methylation status of key tumour suppressor genes that are involved in different cellular pathways and thought to be important in cancer development and progression, namely p16 (cell cycle regulation), DAPK (apoptosis), BRCA1 (DNA repair) and GSTP1 (protection of DNA). Although promoter methylation of the p16, DAPK, GSTP1 and BRCA1 genes is a frequent event in several carcinomas, including ESCC [26,31-34], very few studies have considered these genes together in ESCC. In our study group, 37.5%, 61.60%, 58.92% and 20.53% of the ESCC tumours had p16, DAPK, GSTP1 and BRCA1 promoter methylation respectively, which was significantly higher than in adjacent normal tissues. A study conducted by Guo et al. [35] found a comparatively higher proportion of p16 (52%) and a lower percentage of DAPK (24%) promoter hypermethylation in ESCC tumours, which might be due to ethnic variations; however, they reported a similar proportion of BRCA1 promoter methylation (28%).
Based on promoter hypermethylation status, the cases were further categorized as ESCC with and without promoter hypermethylation. Tobacco chewing and smoking were the main risk factors for ESCC with promoter hypermethylation of all four genes under study when compared to controls. When the cases were further classified according to methylation index, tobacco chewing carried the highest risk of ESCC with methylation index 0.25-0.50, followed by betel quid chewing. However, cases with higher methylation index (0.75-1.0) had the strongest associations with tobacco consumption, both chewing and smoking, with odds ratios of 6.04 and 5.29 respectively. The same was reflected in MDR, as tobacco chewing was the best one-factor model in ESCC with promoter hypermethylation overall and also when stratified by methylation index. A similar study of Indian oral cancer patients found a significantly higher percentage of p16 and DAPK promoter methylation in tobacco chewers than in non-chewers [35]. That certain tobacco-specific nitrosamines and polyaromatic hydrocarbons (PAHs) like NNK and benzo[a]pyrene are capable of modulating DNA methylation is evident from both in vitro and human studies. In previous studies, NNK was found to induce hypermethylation of multiple tumour suppressor genes like p16, DAPK and Rarb in liver and lung tumours of rat and mouse models [11,36,37]. A recent study combining cell, animal and clinical lung cancer tissues as a model found that NNK attenuates DNMT1 degradation and also induces its nuclear accumulation, resulting in subsequent hypermethylation of the promoters of tumour suppressor genes [10]. Benzo[a]pyrene present in tobacco is converted to its carcinogenic form BPDE (benzo[a]pyrenediolepoxide) by phase I enzymes such as CYP1A1 and CYP1B1, and BPDE is in turn metabolized by the GSTs. In a study on esophageal cancer cells, BPDE was found to suppress Rarb expression via promoter hypermethylation by recruiting DNMT3A [38].
Moreover, DNA repair and carcinogen-metabolizing gene polymorphisms are believed to predispose cells towards promoter hypermethylation of genes [15]. In this study, a significant association was observed between GSTM1 null genotypes and promoter methylation of the p16, DAPK and GSTP1 genes, whereas null genotypes of the GSTT1 gene were associated with p16 and BRCA1 methylation only. In addition, both GSTM1 and GSTT1 null polymorphisms were significantly associated with ESCC having MI = 0.75-1.0. In a case-control study on lung cancer, the null genotype of GSTM1 was found to increase the risk of promoter hypermethylation of DAPK and Rarb; the same study also identified a significant interaction of tobacco smoking and null GSTM1 genotype in modulating promoter hypermethylation of multiple TSGs [12]. In our study, the interaction of tobacco and betel quid chewing, smoking and null GSTT1 genotype was the best MDR model both for ESCC with promoter hypermethylation and for ESCC with MI = 0.25-0.50. Additionally, the interaction of tobacco chewing, smoking, and GSTT1 and GSTM1 null genotypes was the optimal model for ESCC with MI = 0.75-1.0. This further supports the hypothesis that a complex interplay between tobacco-related habits and carcinogen-metabolizing gene polymorphisms promotes aberrant DNA methylation in ESCC.
In ESCC without promoter hypermethylation, betel quid chewing was the most prominent risk factor, followed by alcohol drinking and GSTT1 polymorphism. The same was reflected in MDR, as betel quid chewing was the best one-factor model and the interaction of betel quid, alcohol and GSTT1 null genotype was the best model identified. In earlier studies, the alcohol-betel quid interaction was found not only to modify the risk of ESCC but also to be associated with methylation of certain genes [23,39]. However, we were not able to establish any strong association of betel quid chewing or alcohol consumption with promoter hypermethylation, except for a moderate association of betel quid chewing with ESCC having a comparatively lower methylation index (MI = 0.25-0.50).
Entropy graphs were drawn for visualization and interpretation of MDR interactions. Tobacco chewing and smoking showed the highest individual effects as well as the strongest synergistic effect with each other in ESCC with promoter hypermethylation, supporting the role of tobacco carcinogens in promoting DNA methylation in ESCC. In ESCC without promoter hypermethylation, the interactions of alcohol consumption with betel quid chewing and GSTT1 null genotype were the most striking.
This study has both strengths and limitations. It is the first case-control study on the association and interaction of environmental, genetic and epigenetic factors in ESCC using both logistic regression (LR) and MDR approaches. We further strengthened the data by testing the robustness and consistency of the best interaction models obtained from MDR using false positive report probability (FPRP) analysis. The best models for the total ESCC data set and for ESCC with promoter hypermethylation showed excellent reliability even at low prior probabilities. The relatively small sample size in our study might be a drawback for predicting high-order interactions; however, MDR is known to predict interactions reliably even with small sample sizes. Moreover, the duration and frequency of use were not considered when studying the habit-related factors, so a dose-response relationship could not be established.
In conclusion, our study not only confirmed tobacco consumption as the main risk factor for ESCC in NE India, but also indicated its possible interaction with carcinogen-metabolizing genes in modulating promoter hypermethylation of TSGs. Nevertheless, it is only a pilot association study and requires in-depth investigations involving larger populations and in vitro models to establish the role of these interactions in ESCC.