Our previous genome-wide association study (GWAS) identified three independent single nucleotide polymorphisms (SNPs) in the human major histocompatibility complex (MHC) region that were associated with esophageal squamous cell carcinoma (ESCC). In this study, we increased the GWAS sample size for the MHC region and performed validation in an independent set of ESCC cases and normal controls, with the aim of finding additional loci in the MHC region associated with an increased risk of ESCC.
A total of 1,077 ESCC cases and 1,733 controls were genotyped using the Illumina Human610-Quad BeadChip, and 451 cases and 374 controls were genotyped using the Illumina Human660W-Quad BeadChip. After quality control, the selected SNPs were replicated by TaqMan genotyping assay in another 2,026 ESCC cases and 2,384 normal controls.
After excluding low-quality SNPs in the primary GWAS screening, we selected 2,533 SNPs in the MHC region for association analysis and identified 5 SNPs with P < 10⁻⁴. Validation in an independent case-control cohort confirmed that one of the 5 SNPs (rs911178) was significantly associated with ESCC. rs911178 (PGWAS = 6.125E-04, OR = 0.644; Preplication = 1.406E-22, OR = 0.489) is located upstream of SCAND3.
The SNP rs911178 (SCAND3 gene) in the MHC region is significantly associated with high risk of ESCC. This study not only reveals the potential role of the MHC region in the pathogenesis of ESCC, but also provides important clues for establishing tools and methods to screen populations at high risk of ESCC.
Citation: Zhang P, Li X-M, Zhao X-K, Song X, Yuan L, Shen F-F, et al. (2017) Novel genetic locus at MHC region for esophageal squamous cell carcinoma in Chinese populations. PLoS ONE 12(5): e0177494. https://doi.org/10.1371/journal.pone.0177494
Editor: Zongli Xu, National Institute of Environmental Health Sciences, UNITED STATES
Received: January 31, 2017; Accepted: April 27, 2017; Published: May 11, 2017
Copyright: © 2017 Zhang et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the paper and its Supporting Information files.
Funding: This study was funded by the National Natural Science Foundation of China (No. 81472323, http://www.nsfc.gov.cn/) and the NSFC-Guangdong Joint Fund (No. U1301227, http://www.nsfc.gov.cn/) for LDW.
Competing interests: The authors have declared that no competing interests exist.
Esophageal cancer (EC) is the sixth most common cause of cancer death worldwide and the fourth leading cause of cancer death in China. There are an estimated 455,800 new EC cases and 400,200 deaths per year worldwide, with 291,238 new cases and 218,958 deaths in China [1,2]. The Tai-Hang Mountain area at the junction of Henan, Hebei and Shanxi provinces is the highest-risk area for EC in China. ESCC and esophageal adenocarcinoma are the two main histological types. More than 90% of cases in China are ESCC, compared to about 20% in Western countries [3]. In high-risk regions, known risk factors include nutritional deficiencies, low intake of fresh fruits and vegetables, intake of pickled vegetables, intake of nitrosamine-rich or mycotoxin-contaminated foods, drinking beverages at high temperatures, and low socioeconomic status [4,5]. The striking geographic variation and apparent familial aggregation suggest that both environmental and genetic factors may play important roles in the pathogenesis of EC [6,7].
Several studies suggest that immune defense mechanisms may play an important role in esophageal carcinogenesis [8]. The human MHC, the most important genomic region for autoimmunity, encodes human leukocyte antigens (HLA) responsible for antigen presentation to T cells. The HLA gene complex is located on the short arm of chromosome 6 and covers a segment of about 3.5 Mb that includes three genomic regions: class I (1.9 Mb; HLA-A, HLA-B, and HLA-C), class III (0.7 Mb), and class II (0.9 Mb). The HLA complex appears to have been generated through repeated gene duplication and conversion during evolution [9]. The extended MHC (xMHC) region, which is densely populated with genes critical for innate and adaptive immunity in humans, spans about 7.6 Mb and covers over 250 known expressed loci. The xMHC is divided into five sub-regions: the extended class I region (3.9 Mb), the classical class I, III, and II clusters, and the extended class II region (0.2 Mb) [10–12].
In our previous studies, we reported 10 SNP loci and 8 corresponding genes associated with an increased risk of developing ESCC through GWAS in Chinese populations [4,13,14]. We also reported an association of 3 independent SNPs in the MHC region with ESCC. However, the significance of the 3 MHC risk SNPs remains unknown [15]. To validate the association of the MHC loci, in this study we analyzed all available SNPs in the MHC region in an increased number of GWAS samples and performed TaqMan-based genotyping in independent ESCC cases and normal controls.
Materials and methods
This study was approved by the ethical review committee of Zhengzhou University and conducted in accordance with the principles of the Declaration of Helsinki. All patients and normal controls in this study provided written informed consent.
A hospital-based ESCC case-control design was used for this study. Chinese Han ESCC patients and normal controls for the GWAS scan and replication were recruited from high-incidence areas for ESCC in northern China and from Endoscopic Screening Centers within multiple hospitals for early detection of upper gastrointestinal tumors. ESCC was confirmed in patients by histopathology, and controls were confirmed to be free of early ESCC and other upper gastrointestinal tumors by upper gastrointestinal endoscopy. The ESCC patients and normal controls were interviewed to obtain demographic and lifestyle histories related to cancer risk. Family histories of upper gastrointestinal cancers in the first-, second-, and third-degree relatives of ESCC patients were obtained through questionnaires. None of the normal controls had a family history of cancer.
In the screening phase of the GWAS, the previously published 1,077 ESCC patients and 1,733 normal controls of Chinese Han descent had been genotyped using the Illumina Human610-Quad BeadChip. A new cohort of 451 ESCC patients and 374 normal controls was genotyped using the Illumina Human660W-Quad BeadChip. A total of 3,635 samples of Chinese Han descent were screened, including 1,528 ESCC patients (921 males, 607 females, mean age 61 ± 9 years) and 2,107 controls (1,052 males, 1,055 females, mean age 31 ± 15 years) (Table 1).
In the validation phase, TaqMan genotyping assays were used for replication in a separate set of 4,410 samples, including 2,026 Chinese Han ESCC cases (1,256 males, 770 females, mean age 60 ± 9 years) and 2,384 normal controls (1,198 males, 1,186 females, mean age 50 ± 11 years) (Table 1).
Genomic DNA was extracted from peripheral blood using FlexiGene DNA kits (Qiagen, Hilden, Germany). DNA concentration was normalized to 50 ng/μl using a Nanodrop 2000 Spectrophotometer (Thermo Fisher Scientific, Waltham, USA). For genotyping, 200 ng of DNA was used. Genome-wide genotyping was conducted using Illumina Human610-Quad and Human660W-Quad BeadChips (Illumina, San Diego, USA) in the Key Laboratory of Dermatology (Anhui Medical University, Hefei, China).
Each DNA sample underwent whole-genome amplification, fragmentation, precipitation, and re-suspension in hybridization buffer. Denatured samples were hybridized on prepared Illumina Human610-Quad or Human660W-Quad BeadChips. After hybridization, the BeadChip oligonucleotides were extended by a single labeled base, which was detected by fluorescence imaging with an Illumina BeadArray Reader. Normalized bead intensity data obtained from each sample were loaded into the Illumina BeadStudio 3.2 software, which converted fluorescence intensities into SNP genotypes. Genotype clustering was carried out with Gen-Call 22.214.171.124 software, which assigns a quality score to each locus and an individual genotype confidence score that is based on the distance of a genotype from the center of the nearest cluster.
We performed principal components analysis (PCA) to identify genetic outliers and removed genetically deviant samples using the EIGENSTRAT 3.0 software package [16]. The original script was modified to extract the principal components. The quality-control criteria were: 1) drop if the genotype call rate was < 0.90 in the cases or normal controls; 2) drop if the minor allele frequency (MAF) was < 0.01 in the cases and normal controls; 3) drop if the P value for Hardy-Weinberg equilibrium (HWE) was < 10⁻⁷ in the normal controls; 4) all SNPs on the X, Y and mitochondrial chromosomes, as well as the copy number probes, were excluded from the statistical analysis; 5) only SNP markers shared between the different Illumina BeadChips were considered. SNPs in the MHC region potentially associated with high risk of ESCC were selected after removing the SNPs that failed quality control.
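The five exclusion rules above can be sketched as a single per-SNP filter. This is only an illustration under stated assumptions: the `SnpSummary` record, its field names, and the `passes_qc` helper are hypothetical and are not part of the study's pipeline. Criterion 2 is implemented according to its literal wording (MAF below the threshold in both groups), and the exclusion of copy number probes is assumed to have happened upstream.

```python
from dataclasses import dataclass

# Thresholds taken from the quality-control criteria in the text.
MIN_CALL_RATE = 0.90
MIN_MAF = 0.01
MIN_HWE_P = 1e-7
EXCLUDED_CHROMOSOMES = {"X", "Y", "MT"}


@dataclass
class SnpSummary:
    """Per-SNP summary statistics (hypothetical record layout)."""
    snp_id: str
    chromosome: str
    call_rate_cases: float
    call_rate_controls: float
    maf_cases: float
    maf_controls: float
    hwe_p_controls: float
    on_all_chips: bool  # True if the marker is present on every BeadChip used


def passes_qc(snp: SnpSummary) -> bool:
    """Return True if the SNP survives all five exclusion rules."""
    # 1) Call rate below 0.90 in either the cases or the controls.
    if min(snp.call_rate_cases, snp.call_rate_controls) < MIN_CALL_RATE:
        return False
    # 2) MAF below 0.01 in the cases and in the controls (literal reading).
    if snp.maf_cases < MIN_MAF and snp.maf_controls < MIN_MAF:
        return False
    # 3) HWE P value below 1e-7 in the controls.
    if snp.hwe_p_controls < MIN_HWE_P:
        return False
    # 4) Markers on the X, Y, or mitochondrial chromosomes.
    if snp.chromosome in EXCLUDED_CHROMOSOMES:
        return False
    # 5) Markers not shared between the genotyping chips.
    if not snp.on_all_chips:
        return False
    return True
```

Keeping the thresholds as named constants makes it straightforward to reuse the same function with stricter cut-offs, such as those applied when selecting SNPs for replication.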
The criteria for selecting SNP loci for the TaqMan-based assay were: 1) MAF > 0.02 in the cases and controls; 2) HWE P value ≥ 0.001 in the controls; 3) GWAS P value (Cochran-Armitage trend test) < 10⁻⁴; 4) the related genes had a functional role in carcinogenesis. The high-risk SNPs were selected according to these criteria. For the replication study, DNA concentration was normalized to 15–20 ng/μl using a Nanodrop 2000 Spectrophotometer. Approximately 15 ng of genomic DNA was used to genotype each sample. Genotypes for the selected SNPs were obtained using the TaqMan genotyping assay on a 7900HT Fast Real-Time polymerase chain reaction (PCR) system (Applied Biosystems, Foster City, USA) in the Key Laboratory of Dermatology (Anhui Medical University).
The SNP call rates, MAF, and HWE were calculated. Quality-controlled genotyping data were analyzed and exported with PLINK 1.07 software [17]. Association analyses were performed on ESCC cases and genetically matched controls using the Cochran-Armitage trend test with genomic control correction for population stratification. The P value, odds ratio (OR) and 95% confidence interval (95% CI) were calculated using the Cochran-Armitage trend test.
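For readers unfamiliar with the test, the following minimal sketch shows how a Cochran-Armitage trend statistic can be computed from genotype counts, how a genomic control correction rescales it, and how an allelic OR with a Wald 95% CI can be derived. It is not the PLINK implementation used in this study. The function names, the additive weights (0, 1, 2), and the choice of an allelic OR with a Wald interval are assumptions made for illustration, and the code assumes a polymorphic SNP with non-zero allele counts in both groups.

```python
import math


def trend_test(cases, controls, weights=(0, 1, 2)):
    """Cochran-Armitage trend test on genotype counts ordered (AA, AB, BB).

    Returns the 1-df chi-square statistic and its P value.
    """
    n_cases, n_controls = sum(cases), sum(controls)
    n_total = n_cases + n_controls
    col = [r + s for r, s in zip(cases, controls)]  # totals per genotype
    sum_wr = sum(w * r for w, r in zip(weights, cases))
    sum_wn = sum(w * n for w, n in zip(weights, col))
    sum_w2n = sum(w * w * n for w, n in zip(weights, col))
    t = n_total * sum_wr - n_cases * sum_wn
    var = n_cases * n_controls * (n_total * sum_w2n - sum_wn ** 2) / n_total
    chi2 = t * t / var
    # Survival function of a chi-square distribution with 1 df.
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))


def genomic_control(chi2, lambda_gc):
    """Divide the statistic by the inflation factor (never by less than 1)."""
    adjusted = chi2 / max(lambda_gc, 1.0)
    return adjusted, math.erfc(math.sqrt(adjusted / 2.0))


def allelic_odds_ratio(cases, controls, z=1.96):
    """Allelic OR for allele B with a Wald confidence interval."""
    a = cases[1] + 2 * cases[2]        # B alleles in cases
    b = 2 * cases[0] + cases[1]        # A alleles in cases
    c = controls[1] + 2 * controls[2]  # B alleles in controls
    d = 2 * controls[0] + controls[1]  # A alleles in controls
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper
```

An OR below 1 from `allelic_odds_ratio` means that allele B is less frequent in cases than in controls, so which allele is coded as B determines the direction of the reported effect.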
Outlier removal and SNP selection
After PCA removed 1,051 deviant samples from the controls, 1,528 ESCC cases and 1,056 genetically matched normal controls remained (genome-wide χ² inflation factor λgc = 1.07). Genome-wide SNP scanning was conducted on the DNA of the matched samples. After quality control, 488,919 SNPs were left for final analysis, of which 2,533 were located in the MHC region. Further analysis identified 5 SNPs (PGWAS < 10⁻⁴) in the MHC region for validation: rs17533090 (PGWAS = 9.709E-06, OR = 1.503, 95%CI = 1.254–1.802), rs35399661 (PGWAS = 6.070E-06, OR = 1.712, 95%CI = 1.354–2.166), rs1536501 (PGWAS = 8.874E-04, OR = 1.805, 95%CI = 1.268–2.568), rs911178 (PGWAS = 6.125E-04, OR = 0.644, 95%CI = 0.500–0.830) and rs6901869 (PGWAS = 2.523E-05, OR = 1.973, 95%CI = 1.430–2.722) (Table 2).
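The genome-wide inflation factor is commonly defined as the median of the observed 1-df χ² statistics divided by the median of the χ² distribution with 1 degree of freedom (about 0.455). The text does not state how λgc was computed in this study, so the sketch below assumes that median-based definition. The function name is hypothetical.

```python
import statistics

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def inflation_factor(chi2_stats):
    """Median-based genomic inflation factor for 1-df association tests."""
    return statistics.median(chi2_stats) / CHI2_1DF_MEDIAN
```

A value close to 1 indicates little residual inflation of the test statistics after the removal of genetic outliers.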
The 5 SNPs are located in different linkage disequilibrium (LD) blocks of the MHC region. rs17533090 and rs35399661 on 6p21.32 fall within a high-LD block between HLA-DQA1 and HLA-DRB1. rs1536501 on 6p21.31 is located between LEM domain containing 2 (LEMD2) and inositol hexakisphosphate kinase 3 (IP6K3). rs911178 on 6p22.1 is located 35 kb upstream of SCAN domain containing 3 (SCAND3). rs6901869 on 6p21.33 is located between HLA-C and HLA complex group 27 (HCG27).
TaqMan validation of the 5 selected SNPs was conducted on 2,026 cases and 2,384 controls. The concordance rate between genotypes from the Illumina and TaqMan analyses was >99%. We checked the cluster patterns of the 5 SNPs in the genotyping data from both platforms to confirm their quality. rs17533090 and rs35399661 did not pass the HWE test (PHWE < 10⁻⁵). rs1536501 and rs6901869 did not reach significance (Preplication > 0.01). Finally, one SNP, rs911178, was validated with Preplication = 1.406E-22 and OR = 0.489. This SNP is located 35 kb upstream of SCAND3 on 6p22.1 (Fig 1, Table 3).
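As an illustration of the HWE check applied to the replication genotypes, the sketch below computes a 1-df χ² goodness-of-fit test from genotype counts. This is an assumption made for illustration: the study does not state which HWE test was applied to the TaqMan data, and PLINK offers an exact test. The function name is hypothetical, and the code assumes that both alleles are observed.

```python
import math


def hwe_chi2_test(n_aa, n_ab, n_bb):
    """Chi-square test for Hardy-Weinberg equilibrium from genotype counts.

    Returns the 1-df chi-square statistic and its P value.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square distribution with 1 df.
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))
```

A SNP would fail a threshold such as PHWE < 10⁻⁵ when the observed heterozygote count departs strongly from the expected value of 2npq, which can signal genotyping error as well as true population structure.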
Several studies have shown that chromosomal region 6p21–6p22 is a hot spot for loss of heterozygosity in ESCC, which results in the downregulation of HLA class I genes [18–20]. Loss of HLA class I and gain of class II protein expression are frequently observed in ESCC. The HLA-DRB1 allele has been correlated with the risk of ESCC. These findings support the notion that structural variation in the MHC region might be a major mechanism related to genetic susceptibility to ESCC [18–22]. In a joint analysis by NCI, Beijing, and our laboratory, we found that SNP rs35597309 in the MHC class II gene region was associated with ESCC [13]. In this study, we identified another important risk locus in the MHC region using GWAS followed by TaqMan validation. The SNP rs911178 is located upstream of SCAND3 (also known as ZBED9 or ZNF452). This gene encodes a protein of unknown function with CHCH (C-terminal coiled-coil-helix-coiled-coil-helix motif) and hATC (N-terminal hAT family dimerisation motif) domains. It is down-regulated during mouse embryonic stem cell differentiation. SCAND3 is involved in the self-renewal of mouse embryonic stem cells. SCAN domains are typically associated with transcriptional regulation of gene expression, which suggests that SCAND3 is a transcription factor [23,24].
GWAS can identify susceptibility loci for cancers by simultaneously comparing hundreds of thousands of SNPs between the genomes of cases and healthy individuals. Newly identified genetic loci may also further elucidate the factors involved in cancer development. Several GWAS not only add to the known genetic factors that predispose individuals to ESCC, but also highlight the importance of genetic factors and genetic heterogeneity in the development of ESCC, which could advance our understanding of the pathogenesis and carcinogenesis of ESCC [25–27].
In summary, using GWAS and a TaqMan genotyping assay, we identified and validated rs911178 (SCAND3 gene) in the MHC region as significantly associated with high risk of ESCC. Our study improves understanding of the role of the MHC region in the pathogenesis of ESCC and provides important clues for establishing tools and methods for screening high-risk populations. Further functional studies are needed to elucidate the molecular mechanisms underlying the effect of rs911178 on ESCC. Additionally, fine mapping and sequencing of these loci will be required to determine the optimal genetic variants to be studied in laboratory systems to explain these association signals.
We thank all the ESCC patients, healthy controls and their family members whose contributions made this work possible. We acknowledge the personnel and working arrangements for the genotyping and bioinformatics analyses at the Key Laboratory of Dermatology (Anhui Medical University, Hefei, China).
- Conceptualization: LDW.
- Data curation: PZ XML XKZ XS LY FFS ZMF LDW.
- Formal analysis: PZ XML XS LDW.
- Funding acquisition: LDW.
- Investigation: PZ XML XKZ XS LY FFS ZMF.
- Methodology: PZ XML XKZ XS LDW.
- Project administration: XKZ XS ZMF LDW.
- Resources: PZ XML XKZ XS ZMF LDW.
- Software: PZ XS LDW.
- Supervision: ZMF LDW.
- Validation: PZ XML XKZ XS LY FFS ZMF LDW.
- Visualization: PZ XML XKZ XS LY FFS LDW.
- Writing – original draft: PZ LDW.
- Writing – review & editing: PZ XML XKZ XS LY FFS ZMF LDW.
- 1. Torre LA, Bray F, Siegel RL, Ferlay J, Lortet-Tieulent J, Jemal A. Global cancer statistics, 2012. CA Cancer J Clin. 2015;65(2):87–108. pmid:25651787
- 2. Chen W, Zheng R, Zeng H, Zhang S. The updated incidences and mortalities of major cancers in China, 2011. Chin J Cancer. 2015;34(11):502–507. pmid:26370301
- 3. Tran GD, Sun XD, Abnet CC, Fan JH, Dawsey SM, Dong ZW, et al. Prospective study of risk factors for esophageal and gastric cancers in the Linxian general population trial cohort in China. Int J Cancer. 2005;113(3):456–463. pmid:15455378
- 4. Wang LD, Zhou FY, Li XM, Sun LD, Song X, Jin Y, et al. Genome-wide association study of esophageal squamous cell carcinoma in Chinese subjects identifies susceptibility loci at PLCE1 and C20orf54. Nat Genet. 2010;42(9):759–763. pmid:20729853
- 5. Wu M, Liu AM, Kampman E, Zhang ZF, Van't Veer P, Wu DL, et al. Green tea drinking, high tea temperature and esophageal cancer in high- and low-risk areas of Jiangsu Province, China: a population-based case-control study. Int J Cancer. 2009;124(8):1907–1913. pmid:19123468
- 6. Hongo M, Nagasaki Y, Shoji T. Epidemiology of esophageal cancer: Orient to Occident. Effects of chronology, geography and ethnicity. J Gastroenterol Hepatol. 2009;24(5):729–735. pmid:19646015
- 7. Ko JM, Zhang P, Law S, Fan Y, Song YQ, Zhao XK, et al. Identity-by-descent approaches identify regions of importance for genetic susceptibility to hereditary esophageal squamous cell carcinoma. Oncol Rep. 2014;32(2):860–870. pmid:24890309
- 8. Schwartz M, Zhang Y, Rosenblatt JD. B cell regulation of the anti-tumor response and role in carcinogenesis. J Immunother Cancer. 2016;4:40. pmid:27437104
- 9. Mizuki N, Ando H, Kimura M, Ohno S, Miyata S, Yamazaki M, et al. Nucleotide sequence analysis of the HLA class I region spanning the 237-kb segment around the HLA-B and -C genes. Genomics. 1997;42(1):55–66. pmid:9177776
- 10. Horton R, Wilming L, Rand V, Lovering RC, Bruford EA, Khodiyar VK, et al. Gene map of the extended human MHC. Nat Rev Genet. 2004;5(12):889–899. pmid:15573121
- 11. Urayama KY, Chokkalingam AP, Metayer C, Hansen H, May S, Ramsay P, et al. SNP association mapping across the extended major histocompatibility complex and risk of B-cell precursor acute lymphoblastic leukemia in children. PLoS One. 2013;8(8): e72557. pmid:23991122
- 12. Urayama KY, Thompson PD, Taylor M, Trachtenberg EA, Chokkalingam AP. Genetic variation in the extended major histocompatibility complex and susceptibility to childhood acute lymphoblastic leukemia: a review of the evidence. Front Oncol. 2013;3:300. pmid:24377085
- 13. Wu C, Wang Z, Song X, Feng XS, Abnet CC, He J, et al. Joint analysis of three genome-wide association studies of esophageal squamous cell carcinoma in Chinese populations. Nat Genet. 2014;46(9):1001–1006. pmid:25129146
- 14. Abnet CC, Wang Z, Song X, Hu N, Zhou FY, Freedman ND, et al. Genotypic variants at 2q33 and risk of esophageal squamous cell carcinoma in China: a meta-analysis of genome-wide association studies. Hum Mol Genet. 2012;21(9):2132–2141. pmid:22323360
- 15. Shen FF, Yue WB, Zhou FY, Pan Y, Zhao XK, Jin Y, et al. Variations in the MHC region confer risk to esophageal squamous cell carcinoma on the subjects from high-incidence area in northern China. PLoS One. 2014;9(3): e90438. pmid:24595008
- 16. Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, Reich D. Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet. 2006;38(8):904–909. pmid:16862161
- 17. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 2007;81(3):559–575. pmid:17701901
- 18. Zhao X, Sun Q, Tian H, Cong B, Jiang X, Peng C. Loss of heterozygosity at 6p21 and HLA class I expression in esophageal squamous cell carcinomas in China. Asian Pac J Cancer Prev. 2011;12(10):2741–2745. pmid:22320985
- 19. Yang Y, Zhang J, Miao F, Wei J, Shen C, Shen Y, et al. Loss of heterozygosity at 6p21 underlying [corrected] HLA class I downregulation in Chinese primary esophageal squamous cell carcinomas. Tissue Antigens. 2008;72(2):105–114. pmid:18721270
- 20. Nie Y, Yang G, Song Y, Zhao X, So C, Liao J, et al. DNA hypermethylation is a mechanism for loss of expression of the HLA class I genes in human esophageal squamous cell carcinomas. Carcinogenesis. 2001;22(10):1615–1623. pmid:11577000
- 21. Rockett JC, Darnton SJ, Crocker J, Matthews HR, Morris AG. Expression of HLA-ABC, HLA-DR and intercellular adhesion molecule-1 in oesophageal carcinoma. J Clin Pathol. 1995;48(6):539–544. pmid:7665697
- 22. Lin J, Deng CS, Sun J, Zheng XG, Huang X, Zhou Y, et al. HLA-DRB1 allele polymorphisms in genetic susceptibility to esophageal carcinoma. World J Gastroenterol. 2003;9(3):412–416. pmid:12632487
- 23. Makeyev AV, Bayarsaihan D. New TFII-I family target genes involved in embryonic development. Biochem Biophys Res Commun. 2009;386(4):554–558. pmid:19527686
- 24. Llorens C, Bernet GP, Ramasamy S, Feschotte C, Moya A. On the transposon origins of mammalian SCAND3 and KRBA2, two zinc-finger genes carrying an integrase/transposase domain. Mob Genet Elements. 2012;2(5):205–210. pmid:23550032
- 25. Hindorff LA, Sethupathy P, Junkins HA, Ramos EM, Mehta JP, Collins FS, et al. Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proc Natl Acad Sci U S A. 2009;106(23):9362–9367. pmid:19474294
- 26. Matejcic M, Iqbal Parker M. Gene-environment interactions in esophageal cancer. Crit Rev Clin Lab Sci. 2015;52(5):211–231. pmid:26220475
- 27. Chen J, Kwong DL, Cao T, Hu Q, Zhang L, Ming X, et al. Esophageal squamous cell carcinoma (ESCC): advance in genomics and molecular genetics. Dis Esophagus. 2015;28(1):84–89. pmid:23796192