Identification of shared risk loci and pathways for bipolar disorder and schizophrenia

Bipolar disorder (BD) is a highly heritable neuropsychiatric disease characterized by recurrent episodes of mania and depression. BD shows substantial clinical and genetic overlap with other psychiatric disorders, in particular schizophrenia (SCZ). The genes underlying this etiological overlap remain largely unknown. A recent SCZ genome-wide association study (GWAS) by the Psychiatric Genomics Consortium identified 128 independent genome-wide significant single nucleotide polymorphisms (SNPs). The present study investigated whether these SCZ-associated SNPs also contribute to BD development by performing association testing in a large BD GWAS dataset (9,747 patients, 14,278 controls). After re-imputation and correction for sample overlap, 22 of 107 investigated SCZ SNPs showed nominal association with BD. The number of shared SCZ-BD SNPs was significantly higher than expected (p = 1.46 × 10⁻⁸). This provides further evidence that SCZ-associated loci contribute to the development of BD. Two SNPs remained significant after Bonferroni correction. The most strongly associated SNP was located near TRANK1, which is a reported genome-wide significant risk gene for BD. Pathway analyses for all shared SCZ-BD SNPs revealed 25 nominally enriched gene-sets, which showed partial overlap in terms of the underlying genes. The enriched gene-sets included calcium and glutamate signaling, neuropathic pain signaling in dorsal horn neurons, and calmodulin binding. The present data provide further insights into shared risk loci and disease-associated pathways for BD and SCZ. This may suggest new research directions for the treatment and prevention of these two major psychiatric disorders.

Introduction
Bipolar disorder (BD) is a severe neuropsychiatric disease characterized by recurrent episodes of mania and depression. BD has an estimated lifetime prevalence of around 1% [1], and a heritability of around 70% [2]. BD shows substantial clinical and genetic overlap with other psychiatric disorders [3,4]. An analysis of the genome-wide genotype data of the Psychiatric Genomics Consortium (PGC) revealed a 68% genetic correlation between BD and schizophrenia (SCZ), which was the highest correlation with BD of all psychiatric diseases investigated [3]. However, the genes involved in this etiological overlap remain largely unknown. Although research into BD and SCZ has identified a number of susceptibility genes, the respective biological pathways still await identification. For BD, recent genome-wide association studies (GWAS) have identified a number of risk loci [5-13].
For SCZ, a PGC meta-analysis of data from >36,000 patients and 113,000 controls identified 128 independent genome-wide significant single nucleotide polymorphisms (SNPs) in 108 genetic loci [14].
The aim of the present study was to investigate whether these 128 SCZ-associated SNPs also contribute to the development of BD. For this purpose, we performed association testing of these SNPs in our large BD GWAS dataset [12]. In addition, we analyzed whether the genome-wide significant BD-associated SNPs identified in our BD GWAS [12] show association with SCZ.

Sample description
The analyses were performed using data from our previous GWAS of BD (9,747 patients and 14,278 controls) [12]. This GWAS dataset combined: (i) the MooDS data (collected from Canada, Australia, and four European countries); and (ii) the GWAS results for BD of the large multinational PGC [5]. The patients were assigned the following diagnoses (DSM-IV, DSM-III-R, Research Diagnostic Criteria): BD type 1 (n = 8,001; 82.1%); BD type 2 (n = 1,212; 12.4%); schizoaffective disorder (bipolar type; n = 269; 2.8%); and BD not otherwise specified (n = 265; 2.7%) [12]. The study was approved by the local ethics committees of the participating centers [12]. Written informed consent was obtained from all participants prior to inclusion [12].

Genome-wide significant loci for SCZ and BD
For the 128 linkage disequilibrium (LD)-independent genome-wide significant SNPs for SCZ, genetic information was obtained from the supplementary information of the SCZ GWAS of the PGC [14]. This is the largest GWAS of SCZ to date.
Genome-wide significant SNPs for BD were obtained from our BD GWAS [12].

Imputation and meta-analysis
Different reference panels were used for the imputation of the MooDS and PGC BD genotype data (1000 Genomes Project, February 2012 release; and HapMap phase 2 CEU, respectively). Therefore, the summary statistics of the PGC BD GWAS [5] were re-imputed using the 1000 Genomes Project reference panel and ImpG-Summary. The latter is a recently proposed method for the rapid and accurate imputation of summary statistics [15]. This resulted in z-scores for >20 million SNPs. A total of 111 SCZ-associated SNPs could be mapped to the re-imputed PGC BD GWAS data. The remaining variants were either located on the X-chromosome (n = 3), or represented insertions or deletions (n = 14), which could not be imputed by the applied method. In total, 107 of the 111 SCZ-associated SNPs could be identified in the MooDS BD GWAS. A meta-analysis for these 107 SNPs was then performed by combining the PGC BD GWAS and the MooDS BD GWAS, using the sample-size-based strategy implemented in METAL [16].
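As an illustration of a sample-size-based meta-analysis, the following minimal Python sketch combines per-study z-scores using weights proportional to the square root of each study's sample size. It does not call METAL. The function names, the example z-scores, and the sample sizes are hypothetical and are not values from the present study.

```python
import math

def sample_size_weighted_z(z_scores, sample_sizes):
    """Combine per-study z-scores with weights w_i = sqrt(N_i).

    All z-scores are assumed to be aligned to the same reference allele.
    """
    weights = [math.sqrt(n) for n in sample_sizes]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    return numerator / denominator

def two_sided_p(z):
    """Two-sided p-value for a standard normal z-score."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical inputs: one SNP observed in two studies.
z_meta = sample_size_weighted_z([2.1, 1.4], [16000, 8000])
p_meta = two_sided_p(z_meta)
```

For case-control data, an effective sample size per study is often used in place of the raw sample size; this choice is left open in the sketch.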

Analysis of shared BD-SCZ SNPs
The risk alleles for all nominally significant SNPs in our BD GWAS [12] were compared to those reported in the PGC SCZ GWAS.
The SCZ discovery meta-analysis comprised data from 35,476 patients and 46,839 controls. Our BD GWAS comprised data from 9,747 patients and 14,278 controls [12]. To correct for an overlap between the two studies of around 500 patients and 9,200 controls [17,18], we applied the framework of a bivariate normal distribution for the z-scores from both studies, corresponding to a specific SNP. Since the significant hits from a study were selected from different chromosomal regions, we assumed that the z-scores within a study are independent. According to the LD Score regression method [19], the mean inflation of the test statistics provides an approximation of the variance of the z-scores. By considering the set of SNPs in the HapMap3 reference panel [20], the calculated variance was approximately 1.82 for SCZ and 1.24 for BD. From equation (16) in Bulik-Sullivan et al. [19] (Supplementary Material), the covariance between z-scores was calculated to be 0.1644, under the assumption of no genetic correlation. This yielded a correlation of approximately 0.109. To confirm the validity of these theoretical calculations, we estimated the covariance of z-scores due to sample overlap by applying the LD Score regression software directly to the results of the PGC SCZ GWAS and our BD GWAS. After restriction to the well-imputed SNPs of HapMap3, the software estimated a covariance of 0.1707. This result provides further evidence that the degree of sample overlap was correctly estimated in the present study.
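The reported correlation follows directly from the covariance and the two variances. The short Python check below uses only the values stated above; the variable names are illustrative.

```python
import math

var_scz = 1.82         # variance of SCZ z-scores (from the text)
var_bd = 1.24          # variance of BD z-scores (from the text)
cov_theoretical = 0.1644  # covariance from the theoretical calculation
cov_ldsc = 0.1707         # covariance estimated by the LD Score regression software

# Correlation = covariance / (sd_SCZ * sd_BD)
scale = math.sqrt(var_scz * var_bd)
corr_theoretical = cov_theoretical / scale
corr_ldsc = cov_ldsc / scale
```

The theoretical covariance yields a correlation of approximately 0.109, and the software estimate yields a slightly larger value, so the two approaches agree closely.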
The z-scores for the 107 SCZ-associated SNPs were extracted from the PGC SCZ discovery study. The corresponding z-scores were extracted from our BD GWAS [12]. Using the values above, the mean and the variance of the normal distribution for the BD z-scores were determined, given the z-scores from the PGC SCZ discovery study. After the transformation of the initial z-scores from our BD GWAS, a total of 22 of 107 z-scores for BD had corresponding two-sided association p-values of <5% (Table 1).
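One way to read this transformation is as standard conditioning in a bivariate normal distribution. The following Python sketch is an interpretation under that assumption and is not the authors' code: it assumes zero-mean z-scores under the null hypothesis, uses the variances and covariance reported above, and standardizes the BD z-score by its conditional mean and variance. The function names and the example z-scores are hypothetical.

```python
import math

VAR_SCZ = 1.82  # variance of SCZ z-scores (from the text)
VAR_BD = 1.24   # variance of BD z-scores (from the text)
COV = 0.1644    # covariance due to sample overlap (from the text)

def adjust_bd_z(z_bd, z_scz, var_scz=VAR_SCZ, var_bd=VAR_BD, cov=COV):
    """Standardize a BD z-score conditional on the SCZ z-score.

    Assumed model: (z_scz, z_bd) is bivariate normal with zero means.
    """
    cond_mean = (cov / var_scz) * z_scz
    cond_var = var_bd - cov ** 2 / var_scz
    return (z_bd - cond_mean) / math.sqrt(cond_var)

def two_sided_p(z):
    """Two-sided p-value for a standard normal z-score."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical z-scores for one SNP.
z_adj = adjust_bd_z(z_bd=3.0, z_scz=6.0)
p_adj = two_sided_p(z_adj)
```

When both z-scores point in the same direction, the adjusted BD z-score is smaller in magnitude than the initial one, which is the intended conservative effect of the correction.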
Analogously, the z-scores for the genome-wide significant BD SNPs were extracted from our BD GWAS [12], and the corresponding z-scores were extracted from the PGC SCZ discovery study. Of the five BD-associated lead SNPs in our BD GWAS, one SNP (rs6550435) was in high LD (r² = 0.897, SNAP [21]) with a genome-wide SCZ-associated SNP (rs75968099), and was thus excluded from this additional analysis. For the remaining four SNPs, the transformation was computed in the other direction. After correction for sample overlap, no BD SNP showed association with SCZ.
Bonferroni correction for multiple testing was performed by multiplying the nominal p-values by the number of investigated SNPs (n = 107 + 4 = 111).
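The correction can be checked with a few lines of Python. The sketch below applies it to the nominal p-value of 2.03 × 10⁻⁵ reported in the Results for the SNP near TRANK1; the function name is illustrative, and capping the corrected value at 1 is a common convention assumed here.

```python
def bonferroni(p_value, n_tests):
    """Bonferroni-corrected p-value, capped at 1."""
    return min(1.0, p_value * n_tests)

N_TESTS = 107 + 4  # SCZ SNPs tested in BD plus BD SNPs tested in SCZ

# Nominal p-value of the most strongly associated SNP (from the Results).
p_corr = bonferroni(2.03e-5, N_TESTS)
```

The result agrees with the corrected p-value of 2.25 × 10⁻³ reported for this SNP.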
Pathway analysis
In Ingenuity Pathway Analysis (IPA), each gene is represented in a global molecular network, which is designed using information from the Ingenuity Pathway Knowledge Base. 'Networks' were generated algorithmically, and on the basis of their connectivity in terms of activation, expression, and transcription. Molecular relationships between genes are represented by connecting lines between nodes, as supported by published data stored in the Ingenuity Pathway Knowledge Base and/or PubMed. For the purposes of the present study, the canonical pathway analysis available in IPA was applied. Here, an SNP is mapped to a gene if it falls within the gene-coding region or within the 2 kilobase (kb) upstream/0.5 kb downstream range of the gene-coding region. This resulted in the inclusion of 13 genes in the pathway analysis. Significant pathways were filtered in order to achieve a minimum of two genes per set. The significance of the association between the SNP-associated genes mapped by IPA and the canonical pathway was measured using Fisher's exact test.
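To illustrate how Fisher's exact test measures pathway enrichment, the following Python sketch computes a one-sided (over-representation) p-value as a hypergeometric upper tail. The one-sided direction, the function name, and all counts other than the 13 input genes are assumptions made for illustration; the sketch does not reproduce IPA's proprietary gene universe or pathway definitions.

```python
from math import comb

def fisher_enrichment_p(overlap, n_input, n_pathway, n_background):
    """One-sided Fisher's exact test for over-representation.

    Returns P(X >= overlap), where X is the number of pathway genes among
    n_input genes drawn without replacement from n_background genes, of
    which n_pathway belong to the pathway.
    """
    max_overlap = min(n_input, n_pathway)
    total = comb(n_background, n_input)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_input - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total

# 13 input genes (from the text); the remaining counts are hypothetical.
p_pathway = fisher_enrichment_p(overlap=3, n_input=13,
                                n_pathway=120, n_background=20000)
```

Because only 13 genes enter the analysis, even a small overlap with a pathway can produce a small p-value, which is one reason for requiring a minimum of two genes per set.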
INRICH [24] was used as a secondary pathway analysis tool, as it enables examination of enriched association signals of LD-independent genomic intervals. Gene Ontology (GO) gene sets were extracted from the Molecular Signatures Database (MSigDB), version 5.0 (Broad Institute, http://software.broadinstitute.org/gsea/msigdb/index.jsp, downloaded in September 2015). The size of the extracted gene sets ranged from 10 to 200 genes, resulting in 1,268 target sets for testing. The intervals around the 22 SNPs of interest were based on empirical estimates of LD from PLINK (http://pngu.mgh.harvard.edu/purcell/plink/). SNPs were assigned to genes using 50 kb up- and downstream windows. In total, 21 intervals were tested for the 1,268 target sets. In IPA, correction for multiple testing was performed using the Benjamini-Hochberg method. In INRICH, the empirical gene set p-value was corrected for multiple testing using bootstrapping-based re-sampling.
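For readers unfamiliar with the Benjamini-Hochberg method, the following Python sketch shows the standard step-up adjustment of a list of p-values. It is a generic implementation and is not taken from IPA; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical nominal pathway p-values.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Unlike Bonferroni correction, this procedure controls the false discovery rate, so the adjusted values are never larger than the Bonferroni-corrected ones.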

Results
A total of 107 of the 128 SCZ-associated SNPs could be mapped to both the re-imputed PGC BD GWAS and the MooDS BD GWAS data. A meta-analysis of these 107 SNPs was then performed using METAL [16].
After correction for sample overlap, 22 of the 107 SCZ-associated SNPs showed nominally significant p-values in our BD GWAS (Table 1, S1 Table). For all 22 SNPs, the direction of the effect was identical to that observed in the PGC SCZ GWAS [14]. Of the five genome-wide significant BD-associated SNPs identified in our BD GWAS, one SNP (rs6550435) was in high LD (r² = 0.897) with a genome-wide SCZ-associated SNP (rs75968099). None of the remaining four genome-wide significant BD-associated SNPs showed a nominally significant association with SCZ after correction for sample overlap (data not shown).
The number of SCZ SNPs with a p-value of <0.05 in our BD GWAS (n = 22) was significantly higher than expected (p = 1.46 × 10⁻⁸, binomial test). This provides further evidence that SCZ-associated loci contribute to the development of BD.
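The binomial test can be sketched in Python as follows. The sketch assumes a one-sided test, a null probability of 0.05 that any single SNP is nominally significant, and independence between the 107 SNPs; the function name is illustrative.

```python
from math import comb

def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i)
               for i in range(k, n + 1))

# 22 nominally significant SNPs among 107 tested, null probability 0.05.
p_enrichment = binomial_upper_tail(22, 107, 0.05)
```

Under the null hypothesis, only about five of the 107 SNPs (107 × 0.05) would be expected to reach nominal significance, and the tail probability under these assumptions should be of the same order as the reported p-value.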
The most strongly associated SNP was located near the gene TRANK1 (Table 1, p = 2.03 × 10⁻⁵), which is a reported genome-wide significant risk gene for BD [7,12]. The other nominally associated SCZ-BD SNPs implicated loci which contain interesting candidate genes for BD and SCZ. These include the chromatin remodeling gene SATB2, the glutamate receptor genes GRM3 and GRIN2A, and the calcium channel subunit gene CACNB2. The latter is a reported genome-wide significant risk gene for a number of psychiatric disorders, including BD and SCZ [17]. After Bonferroni correction for multiple testing, two SNPs (rs75968099, rs2535627) showed significant association with BD (p_corr = 2.25 × 10⁻³ and p_corr = 5.19 × 10⁻³, respectively).
Pathway analysis using IPA revealed nine pathways with nominally significant enrichment (Fig 1). Of these, eight remained significantly enriched after Benjamini-Hochberg correction for multiple testing. The pathway with the strongest enrichment was synaptic long term potentiation (p_corr = 0.003, Fig 2, S2 Table). In addition, significant enrichment was found for glutamate receptor and calcium signaling; neuropathic pain signaling in dorsal horn neurons; and CREB signaling in neurons.
These findings are consistent with previous pathway analyses of BD and SCZ [5,25-27]. The present analysis also confirmed the glutamatergic signaling pathway, which was considered provisional in a recent review [28].
Pathway analysis using INRICH identified a total of 16 nominally significant gene-sets, which showed partial overlap in terms of the underlying genes. The enriched gene-sets include voltage-gated calcium channel complex/activity; calmodulin binding; glutamate receptor activity; and M phase of the mitotic cell cycle (Fig 3). None of these gene-sets remained significantly enriched for associations after correction for multiple testing (Fig 3, S3 Table).

Discussion
The present analyses revealed a significant enrichment of BD-associated SNPs within known SCZ-associated loci (p = 1.46 × 10⁻⁸). This is consistent with previous reports of overlapping genetic susceptibility for BD and SCZ [4,29,30].
The most strongly associated SNP was located near TRANK1, which is a reported genome-wide significant risk gene for BD [7]. The second SNP with significant BD association after correction for multiple testing (rs2535627, Table 1) was located in a genomic region on chromosome 3. This region contains multiple genes, including inter-alpha-trypsin inhibitor heavy chain 3 (ITIH3) and -4 (ITIH4). Common variation at the ITIH3-ITIH4 region has been identified as a genome-wide significant risk factor for five different psychiatric disorders, including SCZ and BD [17].
Interestingly, the GWAS index SNP rs2535627 represents a Bonferroni-significant fetal brain methylation quantitative trait locus (mQTL), as it has been associated with DNA methylation at cg11645453. The latter is located in the 5' untranslated region of ITIH4 [31]. This suggests that the SCZ-BD associated SNP rs2535627 might contribute to disease susceptibility by altering the expression of ITIH4 in the brain [32]. This hypothesis is supported by a recent study, which found that the G-allele of the SNP rs4687657, which is in moderate LD with rs2535627 (r² = 0.426, D' = 1.000, SNAP [21]), was significantly associated with reduced ITIH4 expression in the postmortem dorsolateral prefrontal cortex of controls [33].
SNPs with nominal association implicated several other plausible susceptibility genes for BD and SCZ (Table 1). These include SATB2, which is a highly conserved chromatin remodeling gene [34]. A previous animal study demonstrated that SATB2 was an essential regulator of axonal connectivity in the developing neocortex [35]. In addition, mutations spanning SATB2 have been reported in patients with neurodevelopmental disorders, including autism [36,37].
The present SCZ-BD associated SNPs implicated three promising candidate genes for shared BD-SCZ etiology, i.e., CACNB2, GRM3, and GRIN2A. The gene CACNB2 encodes an L-type voltage-gated calcium channel subunit, and is a reported genome-wide significant risk gene for several psychiatric disorders, including SCZ and BD [17].
The gene GRM3 encodes a metabotropic glutamate receptor. GRM3 is expressed predominantly in astrocytes, and has been investigated by previous authors as a potential therapeutic target in SCZ [14]. A further SCZ-BD SNP was located near GRIN2A, which encodes an NMDA receptor subunit involved in glutamatergic neurotransmission and synaptic plasticity [14]. Interestingly, rare mutations in GRIN2A have been reported in patients with SCZ [38].
The present pathway analysis implicated calcium and glutamate signaling, and neuropathic pain signaling in dorsal horn neurons. These findings are consistent with previous pathway analyses of BD and SCZ [5,25-27]. These results thus provide further evidence that neurotransmitter signaling and synaptic processes are involved in the development of BD and SCZ. Our enrichment analysis identified a total of 25 enriched gene-sets, which showed partial overlap in terms of the underlying genes. One of the major characteristics of the GO database is its hierarchical structure. This structure involves the use of broad 'parent' terms, which can be divided into more distinctive 'child' terms [39]. After taking these relations into account, we categorized our findings from the GO database into five different parent gene-set families: channel activity, lipase activity, mitotic cell cycle, calmodulin binding, and glutamate receptor signaling (S3 Table).
The results generated by IPA and INRICH were broadly consistent, despite the fact that the underlying databases were different. In some cases, pathways were implicated by the same genes, e.g., glutamate signaling was implicated by GRIN2A and GRM3 in both IPA and INRICH. In other cases, pathways were implicated by differing genes, e.g., calcium channel activity/calcium signaling was implicated by NFATC3 and GRIN2A in IPA, and by CACNB2 and CACNA1C in INRICH (S2 and S3 Tables). This provides further support for the involvement of these pathways in the development of BD and SCZ.
The most strongly enriched pathway according to IPA was synaptic long term potentiation (Fig 2). This pathway has been implicated in learning and memory mechanisms [40]. Interestingly, several previous studies have provided evidence for the involvement of impaired long term potentiation in the pathophysiology of SCZ [41,42]. In the present study, this pathway result was driven by the genes GRIN2A, GRM3, and CACNA1C. The products of all three genes are located in the postsynaptic membrane (Fig 2), which may suggest that dysfunction at the postsynaptic level is an early step in the development of BD and SCZ [43].
The identified pathways support specific hypotheses regarding the shared neurobiology of BD and SCZ. Notably, our results provide further evidence that glutamate signaling might be involved in the development of both SCZ and BD [44]. This would be consistent with the observation from routine clinical practice that SCZ drugs which target glutamate signaling are also effective in BD patients with psychosis or mania [44].
A limitation of the present study was the substantial sample overlap between our BD GWAS [12] and the SCZ GWAS of the PGC [14], since shared samples inflate the apparent association between the two disorders. To address this, the correlation of z-scores between the two studies was calculated. Based on this information, the initial z-scores were then transformed to correct for sample overlap. To estimate the correlation of test statistics, the publicly available summary statistics of the PGC SCZ GWAS were used, which comprise the results of the discovery phase (35,476 patients, 46,839 controls). As the effect of shared samples might be stronger in the discovery sample than in the complete meta-analysis, we may have overestimated the correlation of test statistics between the two GWAS. Therefore, our correction for sample overlap may have been too conservative. However, since the inflation effect introduced by shared samples might be different for independent SNPs compared to the average correlation of test statistics, we assume that our conservative approach was appropriate in terms of reducing false positive results. In future cross-disorder studies, shared samples should be identified and removed from one study on the basis of individual genotype data. This was not possible in the present study, as the analyses were based on summary statistics.
The present data provide further insights into shared risk loci and disease-associated pathways for BD and SCZ.
However, further research is required to determine precisely how the genetic risk variants correlate with particular diagnoses or clinical symptoms. For example, in a previous study, we showed that common variation at the NCAN locus was associated with both BD [8] and SCZ [45]. Genetic variation at the NCAN locus thus represents a cross-diagnosis contributory factor, which may relate to a specific mania symptom-complex [46]. Therefore, future studies are warranted to determine the specific BD and SCZ phenotypic dimensions to which the present variants contribute. Such findings may suggest new research directions for the treatment and prevention of BD and SCZ.
Supporting information
S1 Table. Overview of the 107 investigated schizophrenia-associated SNPs and respective test statistics. Single nucleotide polymorphisms (SNPs) are shown according to their p-values in our bipolar disorder (BD) GWAS [12] following correction for sample overlap. Chromosomal positions refer to genome build GRCh37 (hg19). An imputation accuracy metric of 1 indicates that the respective SNP was not imputed using ImpG-Summary. Abbreviations: Chr, chromosome; A1, the allele to which the z-score is predicted; A2, other allele; Z/P BD Meta, z-score/p-value in our BD GWAS [12] after correction for sample overlap; Pcorr BD Meta, p-value in our BD GWAS [12] after correction for sample overlap and Bonferroni correction for multiple testing; Z/P PGC SCZ (discovery), derived z-score/p-value in the PGC schizophrenia GWAS (discovery phase) [14].