A Genome Wide Study of Copy Number Variation Associated with Nasopharyngeal Carcinoma in Malaysian Chinese Identifies CNVs at 11q14.3 and 6p21.3 as Candidate Loci

  • Joyce Siew Yong Low,

    Affiliations Institute of Biological Sciences, Faculty of Science, University of Malaya, Kuala Lumpur, Malaysia, Translational Genomics Lab, High Impact Research Building (Level 2), University of Malaya, Kuala Lumpur, Malaysia

  • Yoon Ming Chin,

    Affiliations Institute of Biological Sciences, Faculty of Science, University of Malaya, Kuala Lumpur, Malaysia, Translational Genomics Lab, High Impact Research Building (Level 2), University of Malaya, Kuala Lumpur, Malaysia

  • Taisei Mushiroda,

    Affiliation Laboratory for Pharmacogenetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan

  • Michiaki Kubo,

    Affiliation Laboratory for Genotyping Development, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan

  • Gopala Krishnan Govindasamy,

    Affiliation Department of Otorhinolaryngology, Faculty of Medicine, University of Malaya, Kuala Lumpur, Malaysia

  • Kin Choo Pua,

    Affiliation Department of Otorhinolaryngology, Hospital Pulau Pinang, Penang, Malaysia

  • Yoke Yeow Yap,

    Affiliation Department of Surgery, Faculty of Medicine and Health Sciences, Universiti Putra Malaysia, Kuala Lumpur, Malaysia

  • Lee Fah Yap,

    Affiliation Department of Oral Biology & Biomedical Sciences and Oral Cancer Research & Coordinating Centre, Faculty of Dentistry, University of Malaya, Kuala Lumpur, Malaysia

  • Selva Kumar Subramaniam,

    Affiliation Department of Otorhinolaryngology, Head and Neck Surgery, Sarawak General Hospital, Sarawak, Malaysia

  • Cheng Ai Ong,

    Affiliation ENT Department, Hospital Queen Elizabeth, Karung Berkunci No. 2029, Kota Kinabalu, Sabah, Malaysia

  • Tee Yong Tan,

    Affiliation Department of Otorhinolaryngology, Sarawak General Hospital, Kuching, Sarawak, Malaysia

  • Alan Soo Beng Khoo,

    Affiliation Molecular Pathology Unit, Cancer Research Centre, Institute for Medical Research, Kuala Lumpur, Malaysia

  • The Malaysian NPC Study Group,

    The complete membership of the Malaysian NPC Study Group can be found in the Acknowledgments.

    Affiliation The Malaysian Nasopharyngeal Carcinoma Study Group: Hospital Pulau Pinang, Hospital Kuala Lumpur/Universiti Putra Malaysia, University of Malaya, Institute for Medical Research, Cancer Research Initiatives Foundation, Sarawak General Hospital/Universiti Malaysia Sarawak, Queen Elizabeth Hospital and Hospital Universiti Sains, Malaysia

  • Ching Ching Ng

    Affiliations Institute of Biological Sciences, Faculty of Science, University of Malaya, Kuala Lumpur, Malaysia, Translational Genomics Lab, High Impact Research Building (Level 2), University of Malaya, Kuala Lumpur, Malaysia


Abstract

Nasopharyngeal carcinoma (NPC) is a neoplasm of the epithelial lining of the nasopharynx. Despite various reports linking genomic variants to NPC predisposition, very few studies have examined copy number variations (CNVs). CNVs are inherent structural variations that have been found to be involved in cancer predisposition.


A discovery cohort of Malaysian Chinese descent (NPC patients, n = 140; healthy controls, n = 256) was genotyped using the Illumina® HumanOmniExpress BeadChip. The PennCNV and cnvPartition algorithms were applied for CNV calling. Taqman CNV assays and digital PCR were used to validate CNV calls and replicate candidate copy number variant region (CNVR) associations in a follow-up Malaysian Chinese cohort (NPC cases, n = 465; healthy controls, n = 677) and a Malay cohort (NPC cases, n = 114; healthy controls, n = 124).


Six putative CNVRs overlapping GRM5, MICA/HCP5/HCG26, LILRB3/LILRA6, DPY19L2, RNase3/RNase2 and GOLPH3 genes were jointly identified by PennCNV and cnvPartition. CNVs overlapping GRM5 and MICA/HCP5/HCG26 were subjected to further validation by Taqman CNV assays and digital PCR. Combined analysis in Malaysian Chinese cohort revealed a strong association at CNVR on chromosome 11q14.3 (Pcombined = 1.54 × 10⁻⁵; odds ratio (OR) = 7.27; 95% CI = 2.96–17.88) overlapping GRM5 and a suggestive association at CNVR on chromosome 6p21.3 (Pcombined = 1.29 × 10⁻³; OR = 4.21; 95% CI = 1.75–10.11) overlapping MICA/HCP5/HCG26 genes.


Our results demonstrate an association between CNVs and NPC susceptibility, implicating a possible role of CNVs in NPC development.

Introduction

Nasopharyngeal carcinoma (NPC) (OMIM 161550 and 607107) is a malignant tumour arising in the nasopharyngeal mucosa. NPC displays a geographic bias. It is a rare malignancy in most parts of the world (age-standardised rate (ASR) = 1 case per 100,000 per year), but a highly endemic disease in selected populations such as the Southern Chinese population of Guangdong (as high as 15 to 25 cases per 100,000 per year) [1], native Greenlanders [2] and Southern Chinese migrants in America, Australia, and Malaysia [3, 4]. The skewed demographic distribution of the disease, coupled with familial aggregation [5], Epstein-Barr virus (EBV) infection and environmental factors such as air pollutants [6] and dietary intake of salted fish and preserved food [7], suggests a possible interplay between genetic and environmental factors in NPC pathogenesis. Many genetic studies on single nucleotide polymorphisms (SNPs) have reported associations with NPC. Genes linked to NPC include HLA-A (Human leukocyte antigen A) [8, 9], HLA-B (Human leukocyte antigen B) [10], GABBR1 (Gamma-aminobutyric acid (GABA) B receptor 1) [11], CYP2E1 (cytochrome P450 2E1) and hOGG1 (human 8-oxoguanine DNA N-glycosylase 1) [12]. Despite many reports linking SNP variants to NPC predisposition, structural variants such as CNVs and their possible influence on NPC predisposition remain largely neglected. Thus far, only deletions at chromosome 3p [13] and 6p21.3 [14] have been reported to be linked to NPC.

Copy number variants are structural variants present in multiple copies involving DNA segments of ≥ 1 kb [15]. CNVs can have simple structures such as tandem duplication or complex structures like sequence complexity at junctions that arise from complex amplification and deletion rearrangements. Some CNVs remain neutral in function, whereas others convey differences in phenotype by disrupting genes, creating fusion genes, changing the copy number of dosage-sensitive genes or altering the regulation of transcription levels [16]. These changes in phenotype can be benign or can have detrimental implications in diseases such as osteoporosis [17], schizophrenia [18], Li-Fraumeni-syndrome-associated cancers [19] and neuroblastoma [20].

The Malaysian Chinese population is largely made up of descendants of migrants from Southern China. This population comprises various dialect groups, including the Hokkien, Hakka, Cantonese, Teochew and Hainanese. NPC incidence is high among the Malaysian Chinese, especially in Chinese males (ASR = 14 per 100,000) [21]. The Malaysian Malay population is a heterogeneous ethnic group with different ancestral origins based on migration patterns of centuries ago. The Malays are descendants of the Proto-Malays, admixed with Siamese, Javanese, Sumatran, Indian, Thai, Arab and Chinese traders [22]. The Malays make up the majority of the Malaysian population. NPC incidence is low among the Malays, including Malay males (ASR = 4.0 per 100,000) [21].

This study aims to evaluate the role of CNVs in NPC predisposition in the Malaysian Chinese and Malay populations. We report a case-control, genome-wide SNP microarray approach to identify CNVs associated with NPC in these populations.

Materials and Methods

Study samples

Samples were recruited from the University Malaya Medical Centre (UMMC), Penang General Hospital (HPP), Kuala Lumpur General Hospital (HKL), Sarawak General Hospital (HUS) and Queen Elizabeth Hospital Sabah (QES) from 2006 to 2013. All cases were histopathologically diagnosed according to the World Health Organization (WHO) classification. The GWAS discovery cohort consisted of 193 cases and 290 controls (140 cases and 256 healthy controls after QC filtering) of Malaysian Chinese descent, and the replication cohort consisted of 465 NPC patients and 677 healthy controls of Malaysian Chinese descent, giving a combined cohort of 605 cases and 933 controls after QC filtering. The Malaysian Malay replication cohort consisted of 114 cases and 124 controls. Demographic characteristics of Malaysian Chinese study participants are described in S1 Table and demographic characteristics of Malaysian Malay study participants are described in S2 Table.

Ethical approval and consent

All cases and controls gave written informed consent and the study was approved by Medical Research Ethic Committee (MREC) Ministry of Health Malaysia (Registration ID: NMRR-13-969-17878), ethical committees of the Yokohama Institute, The Institutes of Physical and Chemical Research (RIKEN), Yokohama, Japan and Medical Ethics Committee of University Malaya Medical Centre.

Genotyping, quality control and correction for population structure

Genome-wide genotyping of Malaysian Chinese NPC cases and healthy controls was conducted using the Illumina® HumanOmniExpress_12 v1.1 Genotyping BeadChip (San Diego, CA, USA) according to the manufacturer's protocol. Genomic DNA was extracted from peripheral blood leukocytes using a conventional phenol-chloroform method. Samples with call rates <0.99, a mismatch between recorded and estimated gender, or cryptic relatedness, as well as population outliers, were excluded from subsequent analyses. Population outliers were identified using principal component analysis (PCA) performed in EIGENSTRAT version 2.0 [23].

Generation of CNV calls

Samples passing quality control measures were used for generating CNV calls. Log R ratio (LRR) and B allele frequency (BAF) were generated from intensity data of the genome-wide genotyping using GenomeStudio (v.3.1.6) (San Diego, CA, USA) and CNVs were called using PennCNV (v.2011 Jun16) [24] and cnvPartition (v.3.1.6) (San Diego, CA, USA). For uniformity, samples with standard deviation (SD) of LRR ≥ 0.20, SD of BAF ≥ 0.2, BAF drifting value of ≥ 0.01 or waviness factor ≥ 0.05, and samples with more than 50 CNV calls, were removed from analysis using PennCNV prior to CNV calling. Centromeric regions, telomeric regions, XY chromosomes and regions coding for immunoglobulin genes were also excluded from CNV calling. A minimum of 5 markers was set as the threshold for all CNV calling. For cnvPartition analysis, the confidence score threshold was set to 35 and the minimum number of probes was set to 5.
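The sample-level filters described above can be summarised in a short sketch. It is an illustration only and does not reproduce PennCNV's own filtering script: the `SampleQC` container, its field names and the use of the absolute waviness factor are assumptions made here, while the thresholds are those quoted in the text.

```python
from dataclasses import dataclass

# Thresholds quoted in the text; a sample is removed if any one of them is met.
LRR_SD_MAX = 0.20      # SD of Log R ratio
BAF_SD_MAX = 0.20      # SD of B allele frequency
BAF_DRIFT_MAX = 0.01   # BAF drifting value
WAVINESS_MAX = 0.05    # waviness factor
MAX_CNV_CALLS = 50     # samples with more than 50 calls are removed


@dataclass
class SampleQC:
    """Per-sample summary statistics (hypothetical container, not a PennCNV type)."""
    sample_id: str
    lrr_sd: float
    baf_sd: float
    baf_drift: float
    waviness: float
    n_cnv_calls: int


def passes_qc(s: SampleQC) -> bool:
    """Return True if the sample survives all five sample-level filters."""
    return (
        s.lrr_sd < LRR_SD_MAX
        and s.baf_sd < BAF_SD_MAX
        and s.baf_drift < BAF_DRIFT_MAX
        # Assumption: the waviness factor is compared by magnitude.
        and abs(s.waviness) < WAVINESS_MAX
        and s.n_cnv_calls <= MAX_CNV_CALLS
    )


if __name__ == "__main__":
    example = SampleQC("sample_1", 0.15, 0.04, 0.001, 0.01, 12)
    print(example.sample_id, "passes" if passes_qc(example) else "fails")
```

Keeping the thresholds as named constants makes it easy to see how many samples each criterion would remove if applied on its own.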

CNV validation and replication

Validation of GWAS CNV calls and replication of CNV association at candidate loci on chromosome 11q14.3 and chromosome 6p21.3 were performed using Taqman CNV genotyping assays (ABI, Foster City, CA) on the GWAS discovery cohort (Malaysian Chinese only) and replication cohort (Malaysian Chinese and Malaysian Malay). The quantification cycle (Cq) of each region of interest was determined by QuantStudio 12K Flex Software (v.1.0, Applied Biosystems, Foster City, CA) and copy numbers were determined using Copy Caller (v.2.0, Applied Biosystems, Foster City, CA) according to a comparative delta Cq method. Random samples with a mixture of copy number (CN) states of 1, 2 and 3 were subjected to validation with Droplet digital PCR (ddPCR) (Bio-Rad Laboratories, Hemel Hempstead, UK) following the manufacturer's protocol.
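As an illustration of the comparative delta Cq calculation, the sketch below uses the textbook relative-quantification formula, CN = calibrator copies × 2^(−ΔΔCq). It assumes 100% amplification efficiency and a calibrator sample carrying two copies; the function and parameter names are hypothetical, and the internal model of the Copy Caller software is not reproduced.

```python
def relative_copy_number(cq_target: float, cq_reference: float,
                         cal_cq_target: float, cal_cq_reference: float,
                         calibrator_copies: int = 2) -> float:
    """Estimate copy number with the comparative delta Cq formula.

    delta Cq       = Cq(target assay) - Cq(reference assay), per sample
    delta delta Cq = delta Cq(test sample) - delta Cq(calibrator sample)
    copy number    = calibrator_copies * 2 ** (-delta delta Cq)
    """
    delta_sample = cq_target - cq_reference
    delta_calibrator = cal_cq_target - cal_cq_reference
    delta_delta = delta_sample - delta_calibrator
    return calibrator_copies * 2.0 ** (-delta_delta)


if __name__ == "__main__":
    # Hypothetical Cq values: the target amplifies one cycle later than in
    # the calibrator, which corresponds to half the template.
    estimate = relative_copy_number(26.0, 25.0, 25.0, 25.0)
    print(f"estimated copy number: {estimate:.2f}, called as {round(estimate)}")
```

Rounding the continuous estimate to the nearest integer gives the discrete copy number state; estimates that fall far from an integer are the ones most worth confirming with digital PCR.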

Gene-set enrichment analysis

Five pathways previously found to be associated with NPC were subjected to enrichment analysis [25]. Enrichment analysis was carried out using cnv-enrichment-test [26] implemented in PLINK v.1.07 [27]. Gene sets were compiled from the Molecular Signatures Database (MSigDB) v.3.1, namely the KEGG WNT signalling pathway, ST ERK1 ERK2 MAPK pathway, KEGG NOTCH signalling pathway, REACTOME regulation of apoptosis and REACTOME signalling by EGFR in cancer. One-sided empirical P-values, based on 10,000 permutations, for enrichment of a subset of genes of a pathway relative to all genes were acquired.
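The idea of a one-sided empirical P-value can be sketched with a generic label-permutation test. This is not the PLINK cnv-enrichment-test, which is regression-based; the test statistic (difference in the proportion of individuals whose CNVs hit the pathway), the function name and the input layout are assumptions chosen for illustration.

```python
import random
from typing import Sequence


def empirical_p_one_sided(is_case: Sequence[bool],
                          hits_pathway: Sequence[int],
                          n_perm: int = 10_000,
                          seed: int = 0) -> float:
    """One-sided empirical P-value obtained by permuting case/control labels.

    is_case[i]      : True for a case, False for a control
    hits_pathway[i] : 1 if individual i carries a CNV overlapping a pathway gene
    Both groups must contain at least one individual.
    """
    rng = random.Random(seed)

    def statistic(labels: Sequence[bool]) -> float:
        case_hits = [h for lab, h in zip(labels, hits_pathway) if lab]
        ctrl_hits = [h for lab, h in zip(labels, hits_pathway) if not lab]
        return sum(case_hits) / len(case_hits) - sum(ctrl_hits) / len(ctrl_hits)

    observed = statistic(is_case)
    labels = list(is_case)
    at_least_as_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if statistic(labels) >= observed:
            at_least_as_extreme += 1
    # Adding one to numerator and denominator keeps the estimate above zero.
    return (at_least_as_extreme + 1) / (n_perm + 1)


if __name__ == "__main__":
    labels = [True] * 20 + [False] * 20
    hits = [1] * 20 + [0] * 20   # hypothetical, perfectly enriched toy data
    print(empirical_p_one_sided(labels, hits, n_perm=2000))
```

With 10,000 permutations the smallest attainable value under this convention is 1/10,001, so very small empirical P-values are bounded by the number of permutations rather than by the data.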

Global CNV burden analysis

PLINK v.1.07 [27] was used to perform global CNV burden analysis. CNVs were classified as rare when found in <1% of total GWAS samples or common when found in ≥ 5% of total GWAS samples. Tests for CNV burden (one-sided) were done for number of CNV segments per individual, number of genes overlapped by CNVs per individual and number of genes intersected per total CNV kb using UCSC RefSeq (hg18) gene annotation. CNVs were considered co-localized if they overlapped by at least 50% of their length.
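The frequency classes defined above can be written as a small helper. The function name and the label "intermediate" for CNVs found in at least 1% but fewer than 5% of samples are assumptions; the text names only the rare and common classes.

```python
def classify_cnv_frequency(n_carriers: int, n_samples: int) -> str:
    """Classify a CNV by the fraction of GWAS samples carrying it."""
    frequency = n_carriers / n_samples
    if frequency < 0.01:
        return "rare"
    if frequency >= 0.05:
        return "common"
    # Not named in the text: CNVs falling between the two thresholds.
    return "intermediate"


if __name__ == "__main__":
    n_samples = 140 + 256  # cases and controls remaining after quality control
    for carriers in (3, 10, 20):
        print(carriers, classify_cnv_frequency(carriers, n_samples))
```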

HLA allele imputation

SNPs on chromosome 6 that passed quality control filtering were input into the SNP2HLA software [28] in PLINK binary format (.bed/.bim/.fam). SNP allele annotations were mapped to the forward strand and to physical coordinates corresponding to hg18. Default parameter settings for SNP2HLA were used in the phasing and imputation. The Pan-Asian reference panel was used for this study [29, 30].

Statistical association analysis

Copy number variant regions (CNVRs) were identified using reciprocal overlap (minimum threshold 50%). Logistic regression using PLINK v.1.07 was carried out to assess the frequency difference between NPC cases and healthy controls using gender and age as covariates. Gene-based association was carried out using ParseCNV v.17 [31]. GWAMA v.1.4 [32] was used for I² heterogeneity analysis. G*Power v. was employed for power analysis for z tests logistic regression, using post-hoc computation of achieved power, one tail at α = 0.05 setting [33].
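A reciprocal overlap criterion requires that the shared segment covers at least the threshold fraction of each of the two CNVs, not just one of them. The sketch below is a minimal illustration with a hypothetical function name; treating the coordinates as half-open intervals when computing lengths is an assumption. The example coordinates are the two recurrent 11q14.3 CNVs (hg18) reported in the Results.

```python
def reciprocal_overlap(a_start: int, a_end: int,
                       b_start: int, b_end: int,
                       min_fraction: float = 0.5) -> bool:
    """Return True if the overlap covers >= min_fraction of BOTH intervals."""
    overlap = min(a_end, b_end) - max(a_start, b_start)
    if overlap <= 0:
        return False
    return (overlap / (a_end - a_start) >= min_fraction
            and overlap / (b_end - b_start) >= min_fraction)


if __name__ == "__main__":
    # chr11:88,336,310-88,384,073 versus chr11:88,336,310-88,387,195
    print(reciprocal_overlap(88_336_310, 88_384_073, 88_336_310, 88_387_195))
    # A small CNV nested in a much larger one fails the reciprocal test.
    print(reciprocal_overlap(0, 1_000, 0, 100))
```

The second call shows why the criterion is reciprocal: the small interval is fully covered, but the shared segment is only 10% of the large one.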

Results

Genome-wide CNV Distribution in the Malaysian Chinese GWA Sample

Our CNV analysis workflow is detailed in Fig 1. From the 733,202 genotyped SNPs, 13,365 SNPs were removed due to low call frequency (<0.99). In total, 9 cases and 24 controls were removed as population outliers or because of a mismatch between recorded and estimated gender or cryptic relatedness. CNV calling quality control measures removed a further 44 cases and 13 controls due to high standard deviation of Log R ratio (SD LRR ≥ 0.20), excessive CNV calls (≥50 CNVs) and drifting B allele frequency values (BAF drifting value of ≥ 0.01). After quality control, 140 cases and 256 controls remained and were used for downstream CNV analysis.

Fig 1. Flowchart illustrating workflow for whole genome CNV analysis and NPC susceptibility associated candidate gene selection from our GWAS discovery cohort.

For global CNV analysis, we used only stringent consensus CNVs from both the PennCNV and cnvPartition calling algorithms. A total of 1,679 CNVs were called (mean size = 147,798 bp; median size = 66,739 bp; size range = 720 bp–4,979,575 bp). CNV burden analysis revealed no significant difference in the global distribution of rare CNVs between cases and controls (Table 1). There was also no significant difference in common CNV rates (CNVfreq ≥ 5%) (P = 0.5326) (Table 1), although common CNVs were 5.62 times more enriched with genes in cases than in controls (P = 7.00 × 10⁻⁴) (Table 1). NPC cases also had an over-representation of common genic deletions compared to controls (case/control ratio = 2.26, P = 0.03). There was, however, no significant difference between cases and controls in terms of rare CNV (CNVfreq < 1%) rate and gene enrichment.

CNVRs associated to NPC susceptibility and pathway enrichment

We focused on consensus CNVs called by both the PennCNV and cnvPartition calling algorithms. From the consensus CNVRs, we selected regions overlapping genes and genomic regions previously linked to NPC and other cancers. We also checked for genes differentially expressed in NPC tissue specimens compared to normal nasopharyngeal tissue [34, 35]. Six candidate CNVRs fulfilling the selection criteria are presented in Table 2. Further qPCR validation was done on CNVRs 11q14.3 and 6p21.3, and Fig 2 shows example LRR and BAF values in these CNVs. LRR and BAF from SNP data within CNVs in chromosome 6 (S1 Text) and chromosome 11 (S2 Text) can be found in the supporting information. Concordance rates between CNV calls generated by the calling algorithms and qPCR are shown in Table 3. PennCNV calls had the higher concordance for both CNVRs, on chromosome 6p21.3 (99.75%) and 11q14.3 (91.67%). Taqman genotyping of CNVRs 11q14.3 and 6p21.3 confirmed suggestive association in the Malaysian Chinese discovery cohort (P11q14.3_Discovery = 4.60 × 10⁻³; P6p21.3_Discovery = 0.051) (Table 4). CNVRs 11q14.3 and 6p21.3 were selected for further replication in separate Malaysian Chinese (465 cases and 677 controls) and Malaysian Malay (114 cases and 124 controls) replication cohorts.

Fig 2. An illustration of B allele frequency, Log R ratio, and CNV value for CNVs at chromosome 6p21.3 (left) and 11q14.3 (right) generated from GenomeStudio (v.3.1.6).

Table 2. List of most significantly associated candidate CNVRs identified from GWAS cohort.

Table 3. Results from multiple CNV calling algorithms and qPCR validation of CNVRs on chromosome 6p21.3 and chromosome 11q14.3.

Table 4. Association of CNVRs on chromosome 11q14.3 and chromosome 6p21.3 with susceptibility in discovery and validation cohorts of Malaysian Chinese.

From our combined analysis, the CNVR on chromosome 11q14.3 showed the strongest association with NPC susceptibility in our Malaysian Chinese cohort (Pcombined = 1.54 × 10⁻⁵; OR = 7.27; 95% CI = 2.96–17.88) (Table 4). In our study, the most common recurrent CNVs at chromosome 11q14.3 were hg18 chr11:88,336,310–88,384,073 (8 cases) and hg18 chr11:88,336,310–88,387,195 (6 cases) (Fig 3). CNVR 11q14.3 was, however, not associated with NPC susceptibility in the Malaysian Malay cohort (P = 0.99; OR = 10.14; 95% CI = 0.54–190.47) (Table 5). A heterogeneity test between the discovery and validation cohorts of Malaysian Chinese, performed using GWAMA v.1.4, gave an I² value of 0.7; nevertheless, the association of CNV 11q14.3 was significant and consistent in direction in both the discovery (P = 4.60 × 10⁻³) and validation (P = 2.60 × 10⁻³) cohorts.
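For readers unfamiliar with the I² statistic, the sketch below shows the standard definition based on Cochran's Q with inverse-variance weights, I² = max(0, (Q − df) / Q). It is a generic illustration, not GWAMA's code; the function name is hypothetical and the effect sizes in the example are invented values, not estimates from this study.

```python
from typing import Sequence


def i_squared(betas: Sequence[float], standard_errors: Sequence[float]) -> float:
    """Heterogeneity I^2 from per-cohort effect sizes (for example log odds ratios)."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    degrees_of_freedom = len(betas) - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - degrees_of_freedom) / q)


if __name__ == "__main__":
    # Hypothetical log odds ratios and standard errors for two cohorts.
    print(i_squared([1.0, 2.0], [0.5, 0.5]))
```

With only two cohorts there is a single degree of freedom, so I² is sensitive to small differences between the estimates and is best read alongside the direction and significance of the individual effects.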

Fig 3. Schematic representation of the genomic organization according to the human genome hg18 at chromosome 11q14.3 with the positions of CNVs called in the region and reported CNVs in the vicinity found in the Database of Genomic Variants, accessed by July 2015.

Blue bars represent gains identified by other studies that were deposited in the database while red bars represent losses. Brown bars depict CNVs with both losses and gains documented.

Table 5. Association of CNVRs on chromosome 11q14.3 and chromosome 6p21.3 with susceptibility in Malaysian Malay replication cohort.

The combined association of CNVR on chromosome 6p21.3 was more modest in the Malaysian Chinese cohort (Pcombined = 1.29 × 10⁻³; OR = 4.21; 95% CI = 1.75–10.11) (Table 4) (Fig 4). From our results, we observed that carriers of the 6p21.3 deletion also carried the HLA-B*4801 (8/10) or HLA-C*0801 (8/10) allele (Table 6). CNVR 6p21.3 was not associated with NPC susceptibility in the Malay cohort (P = 0.32) (Table 5).

Fig 4. Schematic representation of the genomic organization according to the human genome hg18 at chromosome 6p21.3 with the positions of CNVs called in the region and reported CNVs in the vicinity found in the Database of Genomic Variants, accessed by July 2015.

Blue bars represent gains identified by other studies that were deposited in the database while red bars represent losses. Brown bars depict CNVs with both losses and gains documented.

Table 6. HLA-A, HLA-B, and HLA-C alleles of samples with heterozygous deleted MICA from HLA imputation software SNP2HLA.

The CNVR 11q14.3 and 6p21.3 copy numbers identified through Taqman qPCR were highly concordant with digital PCR copy numbers though comparison of CNV copy numbers was limited to only 10 random samples.

The CNVs were subjected to pathway enrichment using a gene set-enrichment method [26] implemented in PLINK v1.07. The ERK1/ERK2-MAPK pathway and the NOTCH signalling pathway were significantly enriched (PERK-MAPK = 7.00 × 10⁻⁴; PNotch = 8.30 × 10⁻³) (Fig 5) (Table 7).

Fig 5. -Log10 of one-tailed empirical p-values from pathway enrichment analysis implemented using PLINK v.1.07 cnv-enrichment-test.

Table 7. P-values of gene-set enrichment analysis for five pathways previously associated with nasopharyngeal carcinoma.

Discussion

Common CNVs have been associated with predisposition to cancers such as neuroblastoma [20] and breast cancer [36]. The over-representation of common deletion genic CNVs in cases as compared to controls (Case/control ratio = 2.26, p-value = 0.03) could potentially bring about deleterious and pathogenic consequences that would lead to susceptibility to NPC. No previous studies on global CNV burden analysis were reported in NPC.

The most significantly associated CNVR from our study was within the metabotropic glutamate receptor 5 (GRM5) gene, suggesting its potential role in NPC pathogenesis. GRM5 gene (approximately 563kbp) located on the minus strand of chromosome 11q14.3 encodes a group I Gq-coupled receptor. Alternative 5' splicing and usage of multiple promoters were involved in the regulatory mechanisms of GRM5 expression [37]. Evidence from studies suggests a role for glutamatergic signalling in the biology of cancer in peripheral tissues [38, 39], especially those of mucosal nature. Different groups have also shown evidence for a role of glutamate and its receptors in regulation of tumour growth [40]. Examples of metabotropic glutamate receptors being implicated in development of cancers are the involvement of GRM1 [41] and GRM5 [42] in the induction of melanoma in transgenic mice. In addition, GRM5 was found to play a role in tumour cell migration and invasion in oral squamous cell carcinoma [43] and found to be overexpressed in lung cancer cells [44].

The deletion on chromosome 6p21.3, which overlaps the MHC Class I Polypeptide-Related Sequence A (MICA), HLA complex P5 (HCP5) and HLA complex group 26 (HCG26) genes, was previously associated with NPC predisposition. MICA encodes a ligand for the natural killer-cell receptor NKG2-D type II. MICA is highly expressed on cancer cells and can activate antitumor effects from natural killer cells and CD8+ T cells [45]. The function of HCP5, meanwhile, is not fully understood.

The MICA/HCP5 region has been linked to NPC susceptibility and HCV-associated hepatocellular carcinoma [14, 46]. MICA-STR has been implicated in NPC predisposition among the male Southern Chinese Han population [47]. The frequency of the deletion in our combined Chinese cases and controls was 0.02, similar to the frequency reported by Tse et al., 2011 [14]. Our results also resemble those of a Japanese study cohort in which the deleted MICA gene was frequently found in the HLA-B48 (B*4801)-associated haplotype [48]. That study found that 62.5% (5/8) of samples homozygous for the HLA-B*4801 allele lacked an intact MICA gene, while we found that 53.33% (8/15) of our samples with a single copy of the HLA-B*4801 allele (HLA allele imputation results) had one copy of the MICA gene. The concordance rate between the results from the HLA allele imputation software and previous genotyping results for the HLA-A allele in our laboratory [49] was 98.52%.

CNVs identified in our study were enriched for the ERK1/2 and Notch signalling pathways. In transgenic mice that showed a tumour/melanoma phenotype, mutations in GRM5 influenced Ca²⁺ oscillations, and a dramatic increase in ERK phosphorylation was observed in the tumour samples. That work implicated ERK as a downstream effector of GRM5 signalling in tumours. The possible role of the ERK pathway in NPC pathogenesis has also been corroborated by Lan et al. [50]. Further studies are needed to better understand the dynamics and interaction of GRM5 within the ERK pathway and its role in NPC.

We report the association of CNVs at 11q14.3 and 6p21.3, as well as the ERK1/2 and Notch signalling pathways, with NPC susceptibility in the Malaysian Chinese cohort using a case-control, genome-wide SNP microarray approach. A caveat of SNP microarray CNV detection remains the diversity of calling algorithms, which results in considerable variation in CNV calls. We employed extensive validation and replication using Taqman real-time PCR, followed by a more sensitive digital PCR method, to exclude false CNV calls for CNVs 11q14.3 and 6p21.3. Our study was able to detect association of CNVs 11q14.3 and 6p21.3 in the Malaysian Chinese cohort but not in the Malaysian Malay cohort. However, power analysis showed that the achieved power in the Malay replication cohort was 0.26 at 11q14.3 and 0.11 at 6p21.3; a larger sample size would therefore be needed to test these associations with adequate power. Replication in new NPC cohorts will help to confirm the link between CNVs 11q14.3 and 6p21.3 and NPC.

Supporting Information

S1 Table. Basic characteristics of Malaysian Chinese NPC patients and healthy controls in the study.



S2 Table. Basic characteristics of Malaysian Malay NPC patients and healthy controls in the study.



S1 Text. Log R ratio and B allele frequency of SNPs involved in CNVR on chromosome 6.



S2 Text. Log R ratio and B allele frequency of SNPs involved in CNVR on chromosome 11.



Acknowledgments

The authors would like to thank the Director General of Health, Ministry of Health of Malaysia for his permission to publish this article, and the Director of the IMR for his support. We would like to thank the Malaysian NPC Study Group: Hospital Pulau Pinang- K.C. Pua (Project Leader), S. Subathra, N. Punithavati, B.S. Tan, Y.S. Ee, L.M. Ong, R.A. Hamid, M. Goh, J.C.T. Quah, J. Lim; Hospital Kuala Lumpur/Universiti Putra Malaysia- Y.Y. Yap, B.D. Dipak, R. Deepak, F.N. Lau, P.V. Kam, S. Shri Devi; Queen Elizabeth Hospital- C.A. Ong, C.L. Lum, Ahmad N.A., Halimuddin S., M. Somasundran, A. Kam, M. Wodjin; Sarawak General Hospital/Universiti Malaysia Sarawak- S.K. Subramaniam, T.S. Tiong, T.Y. Tan, U.H. Sim, T.W. Tharumalingam, D. Norlida, M. Zulkarnaen, W.H. Lai; University of Malaya- G. Gopala Krishnan, C.C. Ng, A.Z. Bustam, S. Marniza, P. Shahfinaz, O. Hashim, S. Shamshinder, N. Prepageran, L.M. Looi, O. Rahmat, J. Amin, J. Maznan; Hospital Universiti Sains Malaysia- S. Hassan, B. Biswal; Cancer Research Initiatives Foundation- S.H. Teo, L.F. Yap; Institute for Medical Research- A.S.B. Khoo (Program Leader), A. Munirah, A. Subasri, L.P. Tan, N.M. Kumaran, M.S. Nurul Ashikin, M.S. Nursyazwani, B. Norhasimah, R. Sasela Devi, S. Shri Devi, C.Y. Koh. We also extend our gratitude to all the participants of this study, staff of the Department of Otorhinolaryngology, UMMC and staff of the Blood Bank, UMMC.

Author Contributions

Conceived and designed the experiments: JSYL YMC CCN. Performed the experiments: JSYL. Analyzed the data: JSYL YMC. Contributed reagents/materials/analysis tools: TM MK GKG KCP YYY LFY SKS CAO TYT ASBK The Malaysian NPC Study Group CCN. Wrote the paper: JSYL YMC ASBK CCN.

References

  1. 1. Hepeng J (2008) Zeng Yi profile. A controversial bid to thwart the 'Cantonese cancer'. Science (New York, NY) 321: 1154–1155.
  2. 2. Chou J, Lin YC, Kim J, You L, Xu Z, He B, et al. (2008) Nasopharyngeal carcinoma—review of the molecular mechanisms of tumorigenesis. Head & neck 30: 946–963.
  3. 3. Armstrong RW, Kutty MK, Dharmalingam SK (1974) Incidence of nasopharyngeal carcinoma in Malaysia, with special reference to the state of Selangor. British journal of cancer 30: 86–94. pmid:4413823
  4. 4. McCredie M, Williams S, Coates M (1999) Cancer mortality in East and Southeast Asian migrants to New South Wales, Australia, 1975–1995. British journal of cancer 79: 1277–1282. pmid:10098772
  5. 5. Jia WH, Collins A, Zeng YX, Feng BJ, Yu XJ, Huang LX, et al. (2005) Complex segregation analysis of nasopharyngeal carcinoma in Guangdong, China: evidence for a multifactorial mode of inheritance (complex segregation analysis of NPC in China). European journal of human genetics: EJHG 13: 248–252. pmid:15483644
  6. 6. Armstrong RW, Imrey PB, Lye MS, Armstrong MJ, Yu MC, Sani S (2000) Nasopharyngeal carcinoma in Malaysian Chinese: occupational exposures to particles, formaldehyde and heat. International journal of epidemiology 29: 991–998. pmid:11101539
  7. 7. Gallicchio L, Matanoski G, Tao XG, Chen L, Lam TK, Boyd K, et al. (2006) Adulthood consumption of preserved and nonpreserved vegetables and the risk of nasopharyngeal carcinoma: a systematic review. International journal of cancer Journal international du cancer 119: 1125–1135. pmid:16570274
  8. 8. Simons MJ, Wee GB, Day NE, Morris PJ, Shanmugaratnam K, De-The GB (1974) Immunogenetic aspects of nasopharyngeal carcinoma: I. Differences in HL-A antigen profiles between patients and control groups. International journal of cancer Journal international du cancer 13: 122–134. pmid:4131857
  9. 9. Bei JX, Li Y, Jia WH, Feng BJ, Zhou G, Chen LZ, et al. (2010) A genome-wide association study of nasopharyngeal carcinoma identifies three new susceptibility loci. Nature genetics 42: 599–603. doi: 10.1038/ng.601. pmid:20512145
  10. 10. Chan SH, Day NE, Kunaratnam N, Chia KB, Simons MJ (1983) HLA and nasopharyngeal carcinoma in Chinese—a further study. International journal of cancer Journal international du cancer 32: 171–176. pmid:6874140
  11. Tse KP, Su WH, Chang KP, Tsang NM, Yu CJ, Tang P, et al. (2009) Genome-wide association study reveals multiple nasopharyngeal carcinoma-associated loci within the HLA region at chromosome 6p21.3. American journal of human genetics 85: 194–203. doi: 10.1016/j.ajhg.2009.07.007. pmid:19664746
  12. Cho EY, Hildesheim A, Chen CJ, Hsu MM, Chen IH, Mittl BF, et al. (2003) Nasopharyngeal carcinoma and genetic polymorphisms of DNA repair enzymes XRCC1 and hOGG1. Cancer epidemiology, biomarkers & prevention: a publication of the American Association for Cancer Research, cosponsored by the American Society of Preventive Oncology 12: 1100–1104.
  13. Xiong W, Zeng ZY, Xia JH, Xia K, Shen SR, Li XL, et al. (2004) A susceptibility locus at chromosome 3p21 linked to familial nasopharyngeal carcinoma. Cancer research 64: 1972–1974. pmid:15026332
  14. Tse KP, Su WH, Yang ML, Cheng HY, Tsang NM, Chang KP, et al. (2011) A gender-specific association of CNV at 6p21.3 with NPC susceptibility. Human molecular genetics 20: 2889–2896. doi: 10.1093/hmg/ddr191. pmid:21536588
  15. Feuk L, Carson AR, Scherer SW (2006) Structural variation in the human genome. Nature reviews Genetics 7: 85–97. pmid:16418744
  16. Gu W, Zhang F, Lupski JR (2008) Mechanisms for human genomic rearrangements. PathoGenetics 1: 4. pmid:19014668
  17. Yang TL, Chen XD, Guo Y, Lei SF, Wang JT, Zhou Q, et al. (2008) Genome-wide copy-number-variation study identified a susceptibility gene, UGT2B17, for osteoporosis. American journal of human genetics 83: 663–674. doi: 10.1016/j.ajhg.2008.10.006. pmid:18992858
  18. Need AC, Ge D, Weale ME, Maia J, Feng S, Heinzen EL, et al. (2009) A genome-wide investigation of SNPs and CNVs in schizophrenia. PLoS genetics 5: e1000373. doi: 10.1371/journal.pgen.1000373. pmid:19197363
  19. Shlien A, Tabori U, Marshall CR, Pienkowska M, Feuk L, Novokmet A, et al. (2008) Excessive genomic DNA copy number variation in the Li-Fraumeni cancer predisposition syndrome. Proceedings of the National Academy of Sciences of the United States of America 105: 11264–11269. doi: 10.1073/pnas.0802970105. pmid:18685109
  20. Diskin SJ, Hou C, Glessner JT, Attiyeh EF, Laudenslager M, Bosse K, et al. (2009) Copy number variation at 1q21.1 associated with neuroblastoma. Nature 459: 987–991. doi: 10.1038/nature08035. pmid:19536264
  21. Omar ZA, Ali ZM, Tamin NSI (2006) Malaysian Cancer Statistics-Data and Figure, Peninsular Malaysia 2006. National cancer registry, ministry of health Malaysia.
  22. Hatin WI, Nur-Shafawati AR, Zahri MK, Xu S, Jin L, Tan SG, et al. (2011) Population genetic structure of peninsular Malaysia Malay sub-ethnic groups. PloS one 6: e18312. doi: 10.1371/journal.pone.0018312. pmid:21483678
  23. Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, Reich D (2006) Principal components analysis corrects for stratification in genome-wide association studies. Nature genetics 38: 904–909. pmid:16862161
  24. Wang K, Li M, Hadley D, Liu R, Glessner J, Grant SF, et al. (2007) PennCNV: an integrated hidden Markov model designed for high-resolution copy number variation detection in whole-genome SNP genotyping data. Genome research 17: 1665–1674. pmid:17921354
  25. Tulalamba W, Janvilisri T (2012) Nasopharyngeal Carcinoma Signaling Pathway: An Update on Molecular Biomarkers. International journal of cell biology 2012: 594681. doi: 10.1155/2012/594681. pmid:22500174
  26. Raychaudhuri S, Korn JM, McCarroll SA, Altshuler D, Sklar P, Purcell S, et al. (2010) Accurately assessing the risk of schizophrenia conferred by rare copy-number variation affecting genes with brain function. PLoS genetics 6: e1001097. doi: 10.1371/journal.pgen.1001097. pmid:20838587
  27. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, et al. (2007) PLINK: a tool set for whole-genome association and population-based linkage analyses. American journal of human genetics 81: 559–575. pmid:17701901
  28. Jia X, Han B, Onengut-Gumuscu S, Chen W-M, Concannon PJ, Rich SS, et al. (2013) Imputing Amino Acid Polymorphisms in Human Leukocyte Antigens. PloS one 8: e64683. doi: 10.1371/journal.pone.0064683. pmid:23762245
  29. Okada Y, Kim K, Han B, Pillai NE, Ong RT, Saw WY, et al. (2014) Risk for ACPA-positive rheumatoid arthritis is driven by shared HLA amino acid polymorphisms in Asian and European populations. Human molecular genetics.
  30. Pillai NE, Okada Y, Saw WY, Ong RT, Wang X, Tantoso E, et al. (2014) Predicting HLA alleles from high-resolution SNP data in three Southeast Asian populations. Human molecular genetics 23: 4443–4451. doi: 10.1093/hmg/ddu149. pmid:24698974
  31. Glessner JT, Li J, Hakonarson H (2013) ParseCNV integrative copy number variation association software with quality tracking. Nucleic Acids Res 41: e64. doi: 10.1093/nar/gks1346. pmid:23293001
  32. Magi R, Morris AP (2010) GWAMA: software for genome-wide association meta-analysis. BMC Bioinformatics 11: 288. doi: 10.1186/1471-2105-11-288. pmid:20509871
  33. Faul F, Erdfelder E, Buchner A, Lang A-G (2009) Statistical power analyses using G*Power 3.1: Tests for correlation and regression analyses. Behavior Research Methods 41: 1149–1160. doi: 10.3758/BRM.41.4.1149. pmid:19897823
  34. Dodd LE, Sengupta S, Chen IH, den Boon JA, Cheng YJ, Westra W, et al. (2006) Genes involved in DNA repair and nitrosamine metabolism and those located on chromosome 14q32 are dysregulated in nasopharyngeal carcinoma. Cancer epidemiology, biomarkers & prevention: a publication of the American Association for Cancer Research, cosponsored by the American Society of Preventive Oncology 15: 2216–2225.
  35. Sengupta S, den Boon JA, Chen IH, Newton MA, Dahl DB, Chen M, et al. (2006) Genome-wide expression profiling reveals EBV-associated inhibition of MHC class I expression in nasopharyngeal carcinoma. Cancer research 66: 7999–8006. pmid:16912175
  36. Long J, Delahanty RJ, Li G, Gao YT, Lu W, Cai Q, et al. (2013) A common deletion in the APOBEC3 genes and breast cancer risk. Journal of the National Cancer Institute 105: 573–579. doi: 10.1093/jnci/djt018. pmid:23411593
  37. Corti C, Clarkson RW, Crepaldi L, Sala CF, Xuereb JH, Ferraguti F (2003) Gene structure of the human metabotropic glutamate receptor 5 and functional analysis of its multiple promoters in neuroblastoma and astroglioma cells. The Journal of biological chemistry 278: 33105–33119. pmid:12783878
  38. Cavalheiro EA, Olney JW (2001) Glutamate antagonists: deadly liaisons with cancer. Proceedings of the National Academy of Sciences of the United States of America 98: 5947–5948. pmid:11371628
  39. Hinoi E, Takarada T, Ueshima T, Tsuchihashi Y, Yoneda Y (2004) Glutamate signaling in peripheral tissues. European journal of biochemistry / FEBS 271: 1–13. pmid:14686914
  40. Ishiuchi S, Tsuzuki K, Yoshida Y, Yamada N, Hagimura N, Okado H, et al. (2002) Blockage of Ca(2+)-permeable AMPA receptors suppresses migration and induces apoptosis in human glioblastoma cells. Nature medicine 8: 971–978. pmid:12172541
  41. Pollock PM, Cohen-Solal K, Sood R, Namkoong J, Martino JJ, Koganti A, et al. (2003) Melanoma mouse model implicates metabotropic glutamate signaling in melanocytic neoplasia. Nature genetics 34: 108–112. pmid:12704387
  42. Choi KY, Chang K, Pickel JM, Badger JD 2nd, Roche KW (2011) Expression of the metabotropic glutamate receptor 5 (mGluR5) induces melanoma in transgenic mice. Proceedings of the National Academy of Sciences of the United States of America 108: 15219–15224. doi: 10.1073/pnas.1107304108. pmid:21896768
  43. Park SY, Lee SA, Han IH, Yoo BC, Lee SH, Park JY, et al. (2007) Clinical significance of metabotropic glutamate receptor 5 expression in oral squamous cell carcinoma. Oncology reports 17: 81–87. pmid:17143482
  44. Li S, Huang S, Peng SB (2005) Overexpression of G protein-coupled receptors in cancer cells: involvement in tumor progression. International journal of oncology 27: 1329–1339. pmid:16211229
  45. Kumar V, Yi Lo PH, Sawai H, Kato N, Takahashi A, Deng Z, et al. (2012) Soluble MICA and a MICA Variation as Possible Prognostic Biomarkers for HBV-Induced Hepatocellular Carcinoma. PloS one 7: e44743. pmid:23024757
  46. Lange CM, Bibert S, Dufour JF, Cellerai C, Cerny A, Heim MH, et al. (2013) Comparative genetic analyses point to HCP5 as susceptibility locus for HCV-associated hepatocellular carcinoma. Journal of hepatology 59: 504–509. doi: 10.1016/j.jhep.2013.04.032. pmid:23665287
  47. Tian W, Zeng XM, Li LX, Jin HK, Luo QZ, Wang F, et al. (2006) Gender-specific associations between MICA-STR and nasopharyngeal carcinoma in a southern Chinese Han population. Immunogenetics 58: 113–121. pmid:16547745
  48. Ota M, Bahram S, Katsuyama Y, Saito S, Nose Y, Sada M, et al. (2000) On the MICA deleted-MICB null, HLA-B*4801 haplotype. Tissue antigens 56: 268–271. pmid:11034563
  49. Chin YM, Mushiroda T, Takahashi A, Kubo M, Krishnan G, Yap LF, et al. (2014) HLA-A SNPs and amino acid variants are associated with nasopharyngeal carcinoma in Malaysian Chinese. International journal of cancer Journal international du cancer.
  50. Lan YY, Hsiao JR, Chang KC, Chang JS, Chen CW, Lai HC, et al. (2012) Epstein-Barr virus latent membrane protein 2A promotes invasion of nasopharyngeal carcinoma cells through ERK/Fra-1-mediated induction of matrix metalloproteinase 9. Journal of virology 86: 6656–6667. doi: 10.1128/JVI.00174-12. pmid:22514348
  51. Bauer S, Groh V, Wu J, Steinle A, Phillips JH, Lanier LL, et al. (1999) Activation of NK cells and T cells by NKG2D, a receptor for stress-inducible MICA. Science (New York, NY) 285: 727–729.
  52. Job B, Bernheim A, Beau-Faller M, Camilleri-Broët S, Girard P, Hofman P, et al. (2010) Genomic Aberrations in Lung Adenocarcinoma in Never Smokers. PloS one 5: e15145. doi: 10.1371/journal.pone.0015145. pmid:21151896
  53. Zhang Y, Ma M, Han B (2014) GOLPH3 high expression predicts poor prognosis in patients with resected non-small cell lung cancer: an immunohistochemical analysis. Tumour Biol 35: 10833–10839. doi: 10.1007/s13277-014-2357-3. pmid:25081375
  54. Hua X, Yu L, Pan W, Huang X, Liao Z, Xian Q, et al. (2012) Increased expression of Golgi phosphoprotein-3 is associated with tumor aggressiveness and poor prognosis of prostate cancer. Diagn Pathol 7: 127. doi: 10.1186/1746-1596-7-127. pmid:23006319
  55. Peng J, Fang Y, Tao Y, Li K, Su T, Nong Y, et al. (2014) Mechanisms of GOLPH3 associated with the progression of gastric cancer: a preliminary study. PloS one 9: e107362. doi: 10.1371/journal.pone.0107362. pmid:25286393
  56. Maeda T, Mahara K, Kitazoe M, Futami J, Takidani A, Kosaka M, et al. (2002) RNase 3 (ECP) is an extraordinarily stable protein among human pancreatic-type RNases. J Biochem 132: 737–742. pmid:12417023
  57. Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, et al. (2005) Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles. Proceedings of the National Academy of Sciences 102: 15545–15550.