Evaluating the Association of Eight Polymorphisms with Cancer Susceptibility in a Han Chinese Population

Background The identification of susceptibility genes for specific types of cancer can provide necessary information for the complete characterization of cancer syndromes. Eight single nucleotide polymorphisms (SNPs), rs465498, rs17728461, rs4488809, rs753955, rs13361707, rs9841504, rs2274223, and rs13042395, were reported by genome-wide association studies (GWASs) to be closely related to susceptibility to lung cancer (LC), gastric cancer (GC) or esophageal cancer (EC) in Han populations from northern or southern China. However, Han Chinese from different geographic areas may have different genetic backgrounds. This study aimed to assess the genetic associations of these eight SNPs with the risk of the three cancers in a Han population from northwest China.
Methods A total of 186 cancer-free controls and 436 cases with non-small cell lung cancer (NSCLC) (159 cases), non-cardia GC (167 cases) or EC (110 cases) were enrolled in this study. The chi-square test and polytomous logistic regression analyses were used to estimate the associations between the eight cancer-related SNPs and the three cancers. The logistic regression results were adjusted for confounding factors, and the Benjamini and Hochberg False Discovery Rate (FDR) method was used to correct for multiple hypothesis testing. Associations stratified by cigarette smoking or alcohol drinking status were examined by crossover analyses.
Results One of the eight SNPs, rs17728461, was associated with NSCLC susceptibility (in a heterozygous model, OR = 0.44, 95% CI = 0.27–0.72, p = 0.001). Two SNPs, rs753955 and rs13042395, were associated with the risk of non-cardia GC in different genetic models (p < 0.05). No SNPs were associated with EC. The crossover analyses showed that the rs13042395 CT genotype, combined with cigarette smoking or alcohol drinking, could further increase the risk for non-cardia GC (p < 0.05).
Conclusions These results indicated that rs17728461 may be specifically associated with the risk of NSCLC. rs753955 and rs13042395 were specifically associated with susceptibility to non-cardia GC in Ningxia Han Chinese. Susceptibility-associated polymorphisms in the northwestern Han Chinese were not very consistent with those in the northern Han Chinese or southern Han Chinese. The validation of these findings with a functional evaluation and a larger population is still required.


Introduction
Cancer is now understood to have both a genetic and an environmental component. Studies of hereditary cancer syndromes and targeted and genome-wide mutation analyses have provided ample evidence for the participation of genetic alterations in carcinogenesis [1]. Understanding genetic factors in relation to cancer is important because identifying such factors might be useful for risk prediction and for the development of chemo-preventive agents and other preventive measures [2].
To date, genome-wide association studies (GWASs), a powerful method to investigate the genetic determinants of complex diseases, have successfully identified hundreds of SNPs related to the risk of cancers [3], including lung cancer (LC), gastric cancer (GC), and esophageal cancer (EC). The GWAS by Hu et al. indicated that four SNPs, rs465498 at 5p15.33, rs17728461 at 22q12, rs4488809 at 3q28, and rs753955 at 13q12, were associated with LC risk in Han Chinese [4]. Two new SNPs, rs13361707 at 5q13.1 and rs9841504 at 3q13.31, identified by Shi et al., were significantly associated with the risk of GC [5]. Two SNPs, rs2274223 at 10q23 and rs13042395 at 20p13, were associated with the risk of EC in a large number of Han Chinese [6]. However, the Han populations enrolled in these three GWASs were mainly selected from northern China (Beijing city) or southern China (Nanjing city of Jiangsu province). Han Chinese from different places may have different genetic backgrounds due to their complex origins and long history of interaction with many surrounding ethnic groups [7]. Therefore, the Han people from the Ningxia Hui Autonomous Region, which is located in northwest China, may have a different genetic background from those in northern or southern China. To further explore the associations and specificities of the eight SNPs (rs465498, rs17728461, rs4488809, rs753955, rs13361707, rs9841504, rs2274223, and rs13042395) with susceptibility to the three cancers in the Han Chinese population of the Ningxia region of northwest China, a hospital-based case-control study was performed.

Study population and data collection
All samples were from Han residents of Ningxia whose ancestral home was the Ningxia Hui Autonomous Region and whose families had been Han for at least three generations. Patients with primary non-small cell lung cancer (NSCLC), non-cardia GC or EC were recruited between 2009 and 2012 at the General Hospital of Ningxia Medical University (Ningxia Region, China). All enrolled cancer patients were diagnosed by pathological means, and their diagnoses were histologically confirmed. Patients with chronic diseases, conditions that involved vital organs, or severe endocrinological, metabolic, or nutritional diseases were excluded from this study. Patients were also excluded if they had received a blood transfusion during the past 6 months, or had received immunosuppressive therapy, chemotherapy or radiation therapy.
Healthy, unrelated individuals were randomly recruited as control subjects from the health check-up center at the General Hospital during the same period. The inclusion criterion for the controls was the absence of cancer history. None of these healthy people had a history of contact with a strong carcinogen such as asbestos, arsenic trioxide, or benzene.
Each subject was personally questioned by trained interviewers using a pre-tested questionnaire to obtain demographic and lifestyle information, including the age at diagnosis, gender, ethnicity, family history of cancer, residential region, occupation, and living and eating habits (e.g., cigarette smoking and alcohol consumption). Individuals who smoked one cigarette per day for more than 1 year were classified as smokers, and those who consumed alcoholic drinks three or more times a week for more than 6 months were defined as alcohol drinkers. After the interview, a 2-ml sample of venous blood was collected from each participant for DNA preparation and genotyping. In total, 440 cases (162 NSCLC patients, 168 non-cardia GC patients, and 110 EC patients) and 186 cancer-free controls were initially enrolled in this study.
This hospital-based case-control study was approved by the Medical Ethics Review Committee of Ningxia Medical University (Ningxia Region, China). Signed informed consent was obtained from each participant.

SNP selection and genotyping
Literature on GWASs published until December 2012 was reviewed, and candidate SNPs that were previously reported to be associated with LC (rs465498, rs753955, rs17728461, and rs4488809), GC (rs13361707 and rs9841504) and EC (rs2274223 and rs13042395) were selected for investigation. Genomic DNA was extracted from the peripheral blood leukocytes of the participants using a QIAamp DNA Mini kit (Qiagen, Hilden, Germany) following the manufacturer's instructions and stored at -20°C until use.
SNP genotyping was performed by an improved multiplex ligase detection reaction method (iMLDR, Genesky Bio-Tech Co., Ltd., Shanghai, China) as previously described [8]. The primers for the polymerase chain reaction (PCR) and the probes for the LDR are listed in Table 1. Two negative controls were included on each plate, with all other conditions kept the same: one with double-distilled water as the template and the other with a DNA sample but no primers. Duplicate tests were performed and the results were consistent. The genotyping call rates for all eight SNPs were above 99% (Table 2). Three NSCLC cases and one non-cardia GC case were excluded due to low genotyping quality. rs9841504 and rs2274223 were also genotyped by both the iMLDR method and direct sequencing to evaluate the concordance of the genotyping. The concordance rates for these two SNPs were 100%. After these exclusions, 436 cases (159 NSCLC patients, 167 non-cardia GC patients, and 110 EC patients) and 186 cancer-free controls were included in the analysis.

Statistical analysis
A chi-squared test was used to evaluate the differences in the distributions of demographic characteristics, selected variables and genotypes between the cases and controls. Continuous variables were analyzed using Student's t test. Genotype frequencies in the control subjects for each SNP were tested for departure from Hardy-Weinberg equilibrium (HWE) using the goodness-of-fit χ² test. Polytomous logistic regression analyses were used to estimate the adjusted odds ratios (ORs) and the 95% confidence intervals (CIs) for the association between genetic variants and cancer risk under different genetic models (additive, heterozygous, homozygous, dominant, and recessive), adjusted for age, gender, family history of cancer, cigarette smoking and alcohol drinking status. Benjamini and Hochberg False Discovery Rate

Demographic data of the participants
Demographic data for all subjects recruited in this study, including age, gender, family history of cancer, cigarette smoking, and alcohol drinking, are summarized in Table 3. Age differed significantly between each case group (NSCLC, non-cardia GC, or EC) and the control group (p < 0.05). No statistical difference in gender was found between the NSCLC group (64.2% male) and the control group (60.8% male). However, the percentages of males in the non-cardia GC (72.5%) and EC (74.5%) groups were higher than that in the control group (60.8%). The difference in alcohol drinking between the NSCLC and control groups was not significant, but more NSCLC patients were cigarette smokers (p = 0.027). Statistically significant differences from the control group were found for the non-cardia GC and EC groups in cigarette smoking, alcohol drinking, and family history of cancer (p < 0.05). The results of the logistic regression analyses in Table 4 indicated that the association between NSCLC and cigarette smoking exposure was significant (OR = 1.65, 95% CI = 1.06-2.57, p = 0.027). Similar results were found between non-cardia GC or EC and cigarette smoking exposure (OR = 2.91, 95% CI = 1.88-4.51, p < 0.001, and OR = 2.72, 95% CI = 1.67-4.42, p < 0.001, respectively) or alcohol drinking exposure (OR = 3.22, 95% CI = 1.92-5.39, p < 0.001, and OR = 3.11, 95% CI = 1.76-5.48, p < 0.001, respectively).
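To make the exposure odds ratios above concrete, the following minimal sketch computes an unadjusted OR and its Wald 95% CI from a 2x2 exposure table. It is an illustration only: the function name `odds_ratio_wald_ci` and the counts are hypothetical and are not taken from Table 3 or Table 4, and the values reported in this study come from logistic regression rather than from this hand calculation.

```python
import math

def odds_ratio_wald_ci(exposed_cases, unexposed_cases,
                       exposed_controls, unexposed_controls, z=1.96):
    """Return (OR, lower, upper) for a 2x2 exposure table.

    Uses the Wald interval on the log-odds scale; every cell must be > 0.
    """
    cells = (exposed_cases, unexposed_cases, exposed_controls, unexposed_controls)
    if min(cells) <= 0:
        raise ValueError("all four cell counts must be positive")
    # Cross-product ratio: odds of exposure in cases over odds in controls.
    odds_ratio = (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log_or = math.sqrt(sum(1.0 / c for c in cells))
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts, NOT taken from this study's tables.
or_, lo, hi = odds_ratio_wald_ci(60, 40, 50, 100)
print(f"OR = {or_:.2f}, 95% CI = {lo:.2f}-{hi:.2f}")
```

An interval that excludes 1 corresponds to a significant association at the 5% level, which is how the intervals quoted above can be read.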

Genetic distribution of the SNPs in a case-control study
Basic information on the selected SNPs is shown in Table 2. The genotype distributions of all SNPs in the control population were consistent with HWE. The genotype frequencies of the eight SNPs in the cases and controls are shown in Table 5.
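As an illustration of the HWE check, the sketch below runs a goodness-of-fit chi-square test with one degree of freedom on the genotype counts of a biallelic SNP. The function name `hwe_chi_square` and the example counts are hypothetical (they are not the values in Table 5), and the sketch assumes both alleles are observed so that every expected count is positive.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Goodness-of-fit chi-square test for Hardy-Weinberg equilibrium.

    n_aa, n_ab, n_bb are observed genotype counts for a biallelic SNP.
    Returns (chi2, p_value) with 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    if p == 0.0 or q == 0.0:
        raise ValueError("SNP is monomorphic; the test is undefined")
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical control genotype counts summing to 186; not study data.
chi2, p_value = hwe_chi_square(100, 70, 16)
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

A p value above the chosen threshold (commonly 0.05) gives no evidence of departure from HWE in the controls.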

Association between SNPs and cancers
Polytomous logistic regression analyses were used to estimate the association between SNPs and cancer risk using different genetic models. The results are listed in Tables 6 and 7, and all of the results were adjusted for age, gender, family history of cancer, cigarette smoking, and alcohol drinking.
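The genetic models differ only in how a genotype is coded before it enters the regression. The sketch below shows one conventional coding, assumed here for illustration rather than taken from the study's methods: the additive model counts variant alleles, the dominant and recessive models collapse genotypes into two groups, and the heterozygous and homozygous models compare one genotype with the reference homozygote and drop the other. The function `encode_genotype` and the `ref`/`alt` labels are hypothetical names.

```python
from typing import Optional

MODELS = ("additive", "dominant", "recessive", "heterozygous", "homozygous")

def encode_genotype(genotype: str, ref: str, alt: str, model: str) -> Optional[int]:
    """Code a two-letter genotype (e.g. "CG") under a genetic model.

    Returns None when the genotype is excluded from that comparison.
    Assumes ref and alt are two different single-letter alleles.
    """
    if len(genotype) != 2 or any(a not in (ref, alt) for a in genotype):
        raise ValueError(f"unexpected genotype: {genotype!r}")
    n_alt = genotype.count(alt)  # number of variant alleles: 0, 1 or 2
    if model == "additive":
        return n_alt
    if model == "dominant":      # carriers (1 or 2 copies) vs. non-carriers
        return int(n_alt >= 1)
    if model == "recessive":     # variant homozygotes vs. everyone else
        return int(n_alt == 2)
    if model == "heterozygous":  # heterozygote vs. reference homozygote
        return {0: 0, 1: 1}.get(n_alt)
    if model == "homozygous":    # variant homozygote vs. reference homozygote
        return {0: 0, 2: 1}.get(n_alt)
    raise ValueError(f"unknown model: {model!r}")

# Example with C as the reference allele and G as the variant allele.
for g in ("CC", "CG", "GG"):
    print(g, {m: encode_genotype(g, "C", "G", m) for m in MODELS})
```

Under this coding, a heterozygous-model comparison contrasts CG carriers with CC homozygotes and leaves GG carriers out of that particular contrast.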
After adjustment by the Benjamini and Hochberg False Discovery Rate (FDR) method, only one of the eight SNPs, rs17728461 in the HORMAD2-LIF gene, was associated with NSCLC risk. The CG heterozygote of rs17728461, when compared to the CC homozygote, was associated with a decreased risk of NSCLC (adjusted OR = 0.44, 95% CI = 0.27-0.72, p = 0.0010).
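The Benjamini and Hochberg adjustment itself is short enough to sketch. The version below is a plain-Python illustration of the standard step-up procedure; the function name `benjamini_hochberg` and the example p values are hypothetical, and the family of tests to which the correction was applied in this study is not assumed here.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw p values for four tests; not results from this study.
raw = [0.001, 0.04, 0.03, 0.5]
adj = benjamini_hochberg(raw)
for p, q in zip(raw, adj):
    print(f"raw p = {p:.3f} -> adjusted p = {q:.4f}")
```

A test is declared significant when its adjusted value falls below the chosen FDR level, so a nominally significant raw p value can lose significance after adjustment.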
However, no association between the eight SNPs and EC risk was observed in this study.

Crossover analysis by cigarette smoking or alcohol drinking status
The results of crossover analyses by cigarette smoking or alcohol drinking status are shown in Table 8. Crossover analysis indicated that there was a significantly increased risk of non-cardia GC among participants with the rs13042395 CT+TT genotype who were also cigarette smokers or alcohol drinkers when compared to non-smokers or non-drinkers who carried the rs13042395 CC genotype (adjusted OR = 6.03, 95% CI = 2.63-13.86, p < 0.0001, and adjusted OR = 4.62, 95% CI = 1.87-11.39, p = 0.0009, respectively). When

Discussion
Cancer is the second-leading cause of death and disability in the world, behind only heart disease [9]. According to GLOBOCAN 2012, the online database of the International Agency for Research on Cancer (IARC), LC, non-cardia GC, and EC rank in the top ten in incidence and mortality among the more than two dozen distinct cancers that have been examined [10]. Approximately 40% of the world's new cases and deaths from these three cancers have occurred in China, which makes LC, non-cardia GC and EC a great threat to human health in China [10].
To assess the associations between SNPs and cancers, a large number of studies have been conducted within the last decade in different ethnic populations, including the Han Chinese. The Han Chinese population is the largest ethnic group in China, composing 98% of the entire Chinese population [7]. Some studies have reported that the genetic background of the Han population in northern China is different from that in southern China [11][12][13][14]. Therefore, the Han Chinese population, although seemingly homogeneous, actually has a complicated substructure. In this study, we analyzed the associations of the eight SNPs, previously reported in GWASs of Han populations from two different geographic regions, with NSCLC, non-cardia GC and EC in a Han Chinese population from northwest China. Among the eight SNPs, only rs17728461 was specifically associated with NSCLC in our study. rs17728461 is located at 22q12.2, approximately 38 kb downstream of the LIF gene that encodes leukemia inhibitory factor (LIF). LIF plays an important role in the process of LC through the signal transducer and activator of transcription 3 (STAT3) pathway [15,16]. Hu et al.'s GWAS first identified the rs17728461 G-allele as a risk factor for LC in Han populations from northern and southern China. In contrast, our work found that the rs17728461 G-allele was associated with a decreased risk of NSCLC in a Han population from northwest China. The minor allele frequency (MAF) of rs17728461 differed markedly between the population from northwest China and the populations from northern and southern China (0.26 vs. 0.17).
Two other SNPs, rs753955 and rs13042395, were each specifically associated with non-cardia GC risk in this study. rs753955 is located at 13q12.12 in the MIPEP gene, which encodes mitochondrial intermediate peptidase (MIPEP). MIPEP may contribute to frataxin deficiency and iron utilization, but its exact role in carcinogenesis remains to be established [17]. Hu et al.'s GWAS first reported that the rs753955 G-allele was correlated with susceptibility to LC in populations from both northern and southern China. In our study, however, the rs753955 G-allele was found to be related to non-cardia GC risk instead of LC risk in the Han population from northwest China. In addition, we noticed that the rs13042395 T-allele was associated with an increased risk of non-cardia GC. The results of crossover analysis by cigarette smoking or alcohol drinking status further indicated that rs13042395 CT carriers showed a significantly increased risk of non-cardia GC when they were also cigarette smokers or alcohol drinkers. rs13042395, located in the C20orf54 gene at 20p13, was first reported as a risk factor for esophageal squamous cell carcinoma (ESCC) in Wang et al.'s GWAS of a northern Han population. However, five studies from different groups, including ours, failed to find any association with EC [18][19][20][21]. Additionally, in a later correction, Wang et al. reported that the published association for the rs13042395 T-allele could not be replicated in additional analyses of data from the same region; the original finding might have resulted from inadequate control for population stratification using genetically unmatched subjects or, less likely, from chance alone [6]. Overall, we analyzed the associations between the eight reported SNPs and the risk of three cancers (NSCLC, non-cardia GC and EC) in a Han population from northwest China.
Our statistical data showed that rs17728461 was specifically associated with the risk of NSCLC, whereas rs753955 and rs13042395 were specifically related to susceptibility to non-cardia GC. The susceptibility polymorphisms in the northwestern Han Chinese were not very consistent with those in the northern or southern Han Chinese.

Conclusions
In conclusion, our work and that of others have shown that susceptibility polymorphisms identified in GWASs can vary between ethnic populations, even between closely related Han populations living in different geographic areas. However, several limitations of our work should be mentioned. First, because all of the participants were enrolled from a hospital and the controls were on average younger than the cases, selection bias cannot be excluded. Second, our results were obtained with a limited sample size, allowing us to draw only preliminary conclusions. Finally, functional assays are still needed. Validation of these findings by functional evaluation and in a larger population is therefore required.