Replication of Association between Schizophrenia and Chromosome 6p21-6p22.1 Polymorphisms in Chinese Han Population

Chromosome 6p21-p22.1, spanning the extended major histocompatibility complex (MHC) region, is a highly polymorphic, gene-dense region. It has been identified as a susceptibility locus for schizophrenia in Europeans, Japanese, and Chinese. In our previous two-stage genome-wide association study (GWAS), polymorphisms of zinc finger with KRAB and SCAN domains 4 (ZKSCAN4), nuclear factor-κB-activating protein-like (NKAPL), and piggyBac transposable element derived 1 (PGBD1), localized to chromosome 6p21-p22.1, were strongly associated with schizophrenia. To further investigate the association between polymorphisms at this locus and schizophrenia in the Chinese Han population, we selected eight other single-nucleotide polymorphisms (SNPs) distributed in or near these genes for a case-control association study in an independent sample of 902 cases and 1,091 healthy controls in an attempt to replicate the GWAS results. Four of these eight SNPs (rs12214383, rs1150724, rs3800324, and rs1997660) displayed a nominal difference in allele frequencies between the case and control groups. The associations of two further SNPs with schizophrenia remained significant after Bonferroni correction (rs12000: allele A>G, P = 2.50E-04, odds ratio [OR] = 1.27, 95% confidence interval [CI] = 1.12–1.45; rs1150722: allele C>T, P = 4.28E-05, OR = 0.55, 95% CI = 0.41–0.73). Haplotype ATTGACGC, comprising these eight SNPs (rs2235359, rs2185955, rs12214383, rs12000, rs1150724, rs1150722, rs3800324, and rs1997660), was significantly associated with schizophrenia (P = 6.60E-05). We also performed a combined study of this replication sample and the first-stage GWAS sample. The combined study revealed that rs12000 and rs1150722 were still strongly associated with schizophrenia (rs12000: allele G>A, P combined = 0.0019, OR = 0.81; rs1150722: allele G>A, P combined = 3.00E-04, OR = 0.61).
These results support our findings that locus 6p21-p22.1 is significantly associated with schizophrenia in the Chinese Han population and encourage further studies of the functions of these genetic factors.


Introduction
Schizophrenia is a highly heritable psychiatric disorder with a worldwide prevalence of up to 1%. It is characterized by disturbances in thinking, emotion, cognition, and social function, including hallucinations, delusions, and apathy. Prenatal immune activation appears to be a risk factor that may play an etiological role in schizophrenia [1] and may also influence neurodevelopmental processes and neurotransmitter-dependent functions [2].
Many previous genetic studies reported an association between schizophrenia and locus 6p22-24, which includes the human major histocompatibility complex (MHC) region [3][4][5][6][7]. Although some studies have reported negative results [8][9][10][11][12][13], this locus, especially the MHC region, is still regarded as a major susceptibility locus for schizophrenia [14]. Classical MHC gene products, including MHC class I and II molecules, are known as leukocyte antigens, and their primary function is to provide protection against pathogens. In addition to the classical MHC genes, the extended MHC region contains the histone supercluster, the zinc-finger supercluster, the heat shock cluster, and other immune-related and immune-unrelated genes [15].
In recent years, genome-wide association studies (GWASs) have been a powerful and efficient approach to identifying genetic variants that are associated with complex human diseases. The methodology of GWASs has been facilitated by technological advances that enable the simultaneous and cost-effective analysis of large numbers of genetic markers in the human genome [16,17]. GWASs of schizophrenia have identified several genes within the extended MHC region (6p21.2-p22.3) as a susceptibility locus in European [18][19][20] and Japanese [21] individuals. Additionally, our earlier two-stage GWAS found that three single-nucleotide polymorphisms (SNPs; i.e., rs1233710 in ZKSCAN4, rs1635 in NKAPL, and rs2142731 in PGBD1) within this locus had a strong association with schizophrenia in a Chinese Han population [22]. These three SNPs span 35 kb and may not fully represent chromosome 6p21-p22.1 (rs1233710 and rs1635 showed the strongest linkage disequilibrium [LD]). To further verify our evidence of an association, we selected eight other SNPs located in or adjacent to these three genes and performed another independent case-control association study in Chinese Han subjects who were unrelated to the previous GWAS samples. We also performed a combined study using this replication sample and the first-stage GWAS sample. Because no identified functions have been reported for these three proteins, verification of the previous associations may also confirm the importance of these genetic factors and encourage further research.

Ethics Statement
Approval for this study was obtained from the Ethical Committee of the Institute of Mental Health, Peking University. All of the participants were adults. The objective and procedures of this study were explained to all of the subjects and the patients' guardians, and written consent was obtained. Healthy control subjects signed informed consent forms themselves, whereas all of the patients' guardians signed informed consent forms on behalf of the patients. If the patients were in a stable period according to their clinical features and could understand the consent, they also signed the consent forms themselves, accompanied by their guardians. Otherwise, guardians alone consented on behalf of patients whose capacity to consent was compromised.

Subjects
In the present replication study, 902 unrelated schizophrenia patients (484 males and 418 females; mean age, 39 ± 7 years) and 1,091 healthy controls (559 males and 532 females; mean age, 45 ± 9 years) were recruited. All of these participants were Chinese Han descendants and unrelated to the previous GWAS samples. The patients were recruited from the Institute of Mental Health, Peking University, Beijing, China. The clinical diagnosis was made according to the Diagnostic and Statistical Manual of Mental Disorders, 4th edition (DSM-IV). None of the participants had severe medical complications. Healthy controls were selected by psychiatrists using a simple non-structured interview, excluding individuals with histories of mental health problems or neurological diseases. Both the schizophrenia patients and healthy controls were Chinese Han and lived in the same area of northern China. The two groups were matched for gender and age. In the combined study, the subjects from the first-stage study were exactly the same as those we described previously [22].

SNP Selection
We consulted the dbSNP (http://www.ncbi.nlm.nih.gov/snp; accessed October 12, 2012) and HapMap (release #24, CHB; http://hapmap.ncbi.nlm.nih.gov/; accessed October 12, 2012) databases and determined the LD block using the criterion of D′ > 0.80 and Haploview version 4.0 [23]. The locus that we determined spanned 63 kb and covered four genes (from telomere to centromere: ZKSCAN4, NKAPL, ZNF187, and PGBD1) and nearby regions. Single-nucleotide polymorphisms with minor allele frequencies (MAFs) < 5% were excluded from the genetic analysis according to the HapMap CHB population. Therefore, no SNPs were chosen in the ZNF187 gene. Additionally, for the combined analysis, our GWAS data were consulted. Single-nucleotide polymorphisms with P values that did not reach genome-wide significance but were < 0.05 were considered in this study [22]. Finally, eight SNPs were selected, and their positions are shown in Table 1. All eight SNPs were tag SNPs and displayed strong associations with schizophrenia in our previous GWAS. Three of the SNPs (rs12000, rs3800324, and rs1997660) were located in exons, and their alleles were recognized as missense mutations.

Sample Preparation and Genotyping
Peripheral blood samples were collected from all of the subjects. Genomic DNA was extracted using the QIAamp DNA Mini Kit (Qiagen). The genotyping of denatured samples was performed using the Sequenom MassARRAY system (Sequenom iPLEX; Bioyong Technologies, Beijing, China) according to the manufacturer's instructions. Approximately 15 ng of genomic DNA was used to genotype each sample. Locus-specific polymerase chain reaction (PCR) and detection primers were designed using MassARRAY Assay Design 3.0 (primer sequences not shown). The sample DNAs were amplified by multiplex PCR reactions. Polymerase chain reaction products were then used for locus-specific single-base extension reactions. The resulting products were desalted and transferred to a 384-element SpectroCHIP array. Allele detection was performed using MALDI-TOF mass spectrometry. The mass spectrograms were analyzed using MassARRAY TYPER. To evaluate genotyping quality, 5% of the samples were randomly selected, and the rs2185955 and rs12214383 SNPs were genotyped again; no inconsistencies were found.

Statistical Analysis
Hardy-Weinberg equilibrium (HWE) was tested separately in the case and control groups using χ² goodness-of-fit tests with one degree of freedom (df). The Pearson χ² test was used to compare the categorical variable gender, and Student's t-test was used for the continuous variable age. All of the statistical analyses were performed using SPSS 17 software. The pairwise LD analysis was applied to detect inter-marker relationships using the D′ value. This and the case-control association analyses were performed using Haploview version 4.0 [23] and PLINK version 1.07 [24]. Odds ratios (ORs) and 95% confidence intervals (CIs) were calculated to evaluate the effects of different alleles and haplotypes. Bonferroni corrections and permutation tests were applied to correct the P values of alleles (corrected α = 0.05/8 = 0.00625) and haplotypes (10,000 permutations), respectively, to control inflation of the Type I error rate. All of the tests were two-tailed, with statistical significance set at P < 0.05. In the combined study, heterogeneity across the two samples was evaluated using the Cochran Q statistic to determine the heterogeneity statistic (I²) and P value. The combined analyses were performed using RevMan version 5. If I² was less than 50% (P > 0.05), the fixed-effect (Mantel-Haenszel) model was used to combine the results from the two different cohorts; otherwise, the random-effect (DerSimonian-Laird) model was used [25,26].
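Two of the calculations described above can be illustrated with a short Python sketch: the 1-df χ² goodness-of-fit test for HWE and a Cochran Q / I² heterogeneity check for two cohorts. This is a minimal sketch under stated assumptions, not the SPSS, PLINK, or RevMan implementation. All function names and any counts passed to them are hypothetical, and the heterogeneity statistic is computed here with inverse-variance weights on log odds ratios, which is one common way to obtain Q and may differ in detail from the software used in the study.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def hwe_chi2(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for HWE from genotype counts (1 df)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2.0 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        raise ValueError("monomorphic marker: HWE test is undefined")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, chi2_sf_1df(chi2)


def log_or_and_var(a, b, c, d):
    """Log odds ratio and its Woolf variance for a 2x2 allele-count table.

    a, b: counts of the two alleles in cases; c, d: the same in controls.
    """
    return math.log((a * d) / (b * c)), 1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d


def heterogeneity_two_cohorts(table1, table2):
    """Cochran Q, its P value (1 df for two cohorts), and I^2 in percent."""
    effects = [log_or_and_var(*t) for t in (table1, table2)]
    weights = [1.0 / var for _, var in effects]
    pooled = sum(w * y for w, (y, _) in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, (y, _) in zip(weights, effects))
    df = 1  # number of cohorts minus one
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, chi2_sf_1df(q), i2


def choose_model(i2_percent):
    """Model choice following the I^2 < 50% rule stated in the text."""
    return "fixed" if i2_percent < 50.0 else "random"
```

With hypothetical allele counts for the two cohorts, `heterogeneity_two_cohorts` returns Q, its P value, and I², and `choose_model` then applies the 50% rule to select between the fixed-effect and random-effect models.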

Results
All of the eight SNPs we selected showed MAFs greater than 5% in our samples. The genotype distributions of the eight SNPs in the control group did not show significant deviations from HWE (Table 1). No significant differences in age or gender distributions were found between the case and control samples. The genotypes and allele frequencies of the eight SNPs are shown in Table 1. The rs2235359 and rs2185955 SNPs did not show significant differences in allele frequencies between the case and control groups. Four SNPs (rs12214383, rs1150724, rs3800324, and rs1997660) showed nominal differences in allele frequencies that disappeared after Bonferroni corrections. Significant differences remained for rs12000 and rs1150722 after the corrections (rs12000: allele A>G, P = 2.50E-04, OR = 1.27, 95% CI = 1.12-1.45; rs1150722: allele C>T, P = 4.28E-05, OR = 0.55, 95% CI = 0.41-0.73).
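The allelic statistics reported above can be sketched in Python as follows. The snippet computes an odds ratio with a Woolf-type 95% CI from a 2×2 allele-count table, applies the Bonferroni threshold of 0.05/8, and back-calculates the approximate Wald P value implied by a reported OR and CI as a rough consistency check. The function names are hypothetical, the allele counts in any call are placeholders (Table 1 counts are not reproduced here), and the Wald approximation is an assumption; the P values in the study come from PLINK and Haploview and need not match it exactly.

```python
import math


def allelic_or_ci(case_a, case_b, ctrl_a, ctrl_b, z=1.96):
    """Odds ratio of allele A (cases vs. controls) with a Woolf 95% CI."""
    odds_ratio = (case_a * ctrl_b) / (case_b * ctrl_a)
    se = math.sqrt(1.0 / case_a + 1.0 / case_b + 1.0 / ctrl_a + 1.0 / ctrl_b)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


def p_from_or_ci(odds_ratio, lower, upper, z=1.96):
    """Approximate two-sided Wald P value implied by an OR and its 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2.0 * z)
    stat = abs(math.log(odds_ratio)) / se
    return math.erfc(stat / math.sqrt(2.0))


# Bonferroni-corrected significance threshold for eight SNPs.
BONFERRONI_ALPHA = 0.05 / 8

# Values reported in the text: (P, OR, CI lower, CI upper).
reported = {
    "rs12000": (2.50e-4, 1.27, 1.12, 1.45),
    "rs1150722": (4.28e-5, 0.55, 0.41, 0.73),
}

for snp, (p_value, odds_ratio, lower, upper) in reported.items():
    implied = p_from_or_ci(odds_ratio, lower, upper)
    survives = p_value < BONFERRONI_ALPHA
    print(f"{snp}: reported P={p_value:.2e}, implied P~{implied:.1e}, "
          f"passes Bonferroni: {survives}")
```

Both reported P values fall below the corrected threshold of 0.00625, and the P values implied by the rounded ORs and CIs are of the same order of magnitude as those reported.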
Linkage disequilibrium was computed between every pair of SNPs to further analyze the haplotype structure. Fig. 1 shows the LD plot constructed using the eight SNPs. The D′ value of each pair was > 0.8 in the combined case and control group. Therefore, we used the LD block that consisted of these eight markers in the haplotype analysis (Table 2). Haplotypes ATCAGTGT and ATTGATAC showed nominal differences between the patient and control groups. ATTGACGC is a protective haplotype that is highly associated with schizophrenia (P = 6.60E-05, OR = 0.551, 95% CI = 0.410-0.741). This result was still significant after permutation testing with 10,000 permutations. Table S1 shows the results of the haplotype-based association test using the same eight SNPs based on the previous GWAS dataset. The results of the present study and previous GWAS were highly consistent.
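For reference, the pairwise D′ statistic used to define the LD block can be sketched in Python as below. This is an illustrative sketch only: it assumes that haplotype and allele frequencies are already known, whereas Haploview estimates haplotype frequencies from unphased genotypes. The function name and the example frequencies are hypothetical.

```python
def d_prime(p_ab, p_a, p_b):
    """|D'| for two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and B at SNP 2
    p_a:  frequency of allele A at SNP 1
    p_b:  frequency of allele B at SNP 2
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1.0 - p_b), (1.0 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1.0 - p_a) * (1.0 - p_b))
    return abs(d) / d_max if d_max > 0 else 0.0


def in_same_block(p_ab, p_a, p_b, threshold=0.8):
    """Apply the D' > 0.8 criterion used in the text to one SNP pair."""
    return d_prime(p_ab, p_a, p_b) > threshold


# Hypothetical example: allele A never occurs without allele B.
print(d_prime(p_ab=0.30, p_a=0.30, p_b=0.40))
```

A pair of SNPs is assigned to the same block in this sketch when D′ exceeds the 0.8 threshold; independent SNPs (p_ab equal to p_a × p_b) give D′ = 0.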

Discussion
In recent years, the extended MHC region has been implicated as a major factor in schizophrenia pathogenesis, supported by GWASs of schizophrenia in different populations [18][19][20][21][22]. Association studies for markers in the human MHC region have been inconsistent, which may be attributable to heterogeneity [13,27,28]. The ZKSCAN4, NKAPL, and PGBD1 genes, located in this region, are in one LD block in the Chinese Han population. In our replication study, all six SNPs within the NKAPL and PGBD1 genes displayed an association with schizophrenia, but neither of the two SNPs in the ZKSCAN4 gene showed an association. Both rs12000 and rs1150722 showed the strongest associations in the present replication study and combined analysis. The haplotype analysis also indicated that the entire LD block that comprised these eight SNPs was significant in the pathogenesis of schizophrenia. Unexpectedly, Ma et al. [29] failed to validate our GWAS results in their replication study. Two research groups have investigated the genetic structure of the Chinese population and demonstrated a population substructure among northern, central, and southern Chinese Han populations [30,31]. The study by Chen et al. showed a strong correlation between the genetic and geographic maps of the Chinese Han population, indicating that geographic location may be a good indicator of ancestral origin, and thus geographic matching may be a good proxy for genetic matching [31]. In our study, all of the patients and controls in both the GWAS and replication samples were determined to be of northern Chinese Han origin according to their birth places and the birth places of their parents in Hebei, Shandong, and Henan provinces. In contrast, the subjects in the study by Ma et al. were recruited from Hunan province in central China [29]. The validation failure points to the role of regional differences in association studies of Chinese Han populations.
These results also support the notion of the high genetic heterogeneity of schizophrenia.
ZKSCAN4, also referred to as ZNF307, is recognized as a member of the zinc-finger protein family. This gene did not show an association with schizophrenia, similar to our GWAS, which may be attributable to the limited sample size and deviation of the patient group from HWE (Table 1). NKAPL is a novel gene first reported in our GWAS with a strong association on rs1635, and its product was suggested to play a role in neurodevelopmental processes [22]. It is approximately 55% homologous to NKAP, the protein of which has been shown to be a transcriptional repressor of Notch signaling and required for T-cell development and maturation [32,33]. Both of these proteins have a SynMuv product domain at their C-terminus [34]. Similar to the rs1635 SNP, the allele of rs12000 is a missense mutation. Moreover, LD was found between rs12000 and rs1635 using the HapMap database (release #24, CHB). This evidence suggests that disruption of the structure of NKAPL may affect the development of the central nervous system. PGBD1 has been reported to be a susceptibility gene for both schizophrenia and Alzheimer's disease [27,35]. It is a member of the piggyBac transposable element-derived (PGBD) gene subfamily. It is also known as SCAND4 because of the conserved SCAN domain at the N-terminus of its products (ZKSCAN4 also contains this domain). The SCAN domain is one type of zinc-finger protein domain and has been found in several vertebrate proteins that contain C2H2 zinc-finger motifs, many of which may be transcription factors that play roles in cell survival and differentiation [34]. This protein-interaction domain is able to mediate the homo- and hetero-oligomerization of SCAN-containing proteins [36,37]. Although the rs2142731 and rs1150722 SNPs are both within introns of PGBD1, their alleles may affect the expression of PGBD1.
Figure 1. The linkage disequilibrium (LD) block structure of the eight SNPs located in the ZKSCAN4, NKAPL, and PGBD1 genes. The LD pattern was derived from the combined group (i.e., both case and healthy control subjects). The LD block was defined by a D′ value threshold of 0.8. The color scale ranges from red to white (color intensity decreases with decreasing D′ value). This locus was identified as one block, and the plot was generated by Haploview. doi:10.1371/journal.pone.0056732.g001
Numerous other genes in the extended MHC region have been shown to be involved in the pathogenesis of schizophrenia. MICB, HLA-A, and HLA-B are classic MHC molecules that play central roles in the development of host defense and immunity [38,39]. Tumor necrosis factor α (TNFα), located in the classic class III subregion, encodes a cytokine involved in systemic inflammation and acute-phase reactions [40,41]. Products of the MHC region implicated in the pathogenesis of schizophrenia not only contribute to immune responses but also have general functions in different molecular biological processes [15]. Heat shock protein 70 (HSP70) works as a protein-folding machine [42]. DDR1 is a type of receptor tyrosine kinase that promotes prosurvival effects through Notch1 activation [43]. Dysbindin (the product of DTNBP1) has been characterized as a stable component of a multi-subunit complex termed BLOC-1, which has been implicated in intracellular protein trafficking and the biogenesis of specialized organelles of the endosomal-lysosomal system [44]. NOTCH4 is a Notch family member that plays a role in various developmental processes by controlling cell fate decisions. All of their encoding genes are located in the extended MHC region and have been shown to be associated with schizophrenia [45][46][47][48][49][50]. Agartz et al. also reported that common sequence variants in the MHC region are associated with cerebral ventricular size in schizophrenia [51].
In summary, extended MHC molecules may participate in numerous life processes, in addition to the immune response, and may influence central nervous system development.
Psychosis comprises a set of complex mental disorders that are caused by interactions between polygenic and environmental factors and diagnosed by clinical symptoms. Many genetic factors have been shown to play etiological roles in diverse mental disorders, including PGBD1 in schizophrenia and Alzheimer's disease [22,35,52] and TNFα in schizophrenia and bipolar disorder [40]. This evidence suggests that multiple assemblies of polygenic risk alleles may lead to variable phenotypic outcomes during different stages of life that are recognized as various psychotic disorders.
In conclusion, we confirmed the association between chromosome 6p21-p22.1 and schizophrenia in a northern Chinese Han population. Further research is needed to fully understand the functions of these various genetic factors and their roles in the pathogenesis of schizophrenia and other mental disorders.