Improved Detection of Common Variants Associated with Schizophrenia and Bipolar Disorder Using Pleiotropy-Informed Conditional False Discovery Rate

Several lines of evidence suggest that genome-wide association studies (GWAS) have the potential to explain more of the “missing heritability” of common complex phenotypes. However, reliable methods to identify a larger proportion of single nucleotide polymorphisms (SNPs) that impact disease risk are currently lacking. Here, we use a genetic pleiotropy-informed conditional false discovery rate (FDR) method on GWAS summary statistics data to identify new loci associated with schizophrenia (SCZ) and bipolar disorders (BD), two highly heritable disorders with significant missing heritability. Epidemiological and clinical evidence suggest similar disease characteristics and overlapping genes between SCZ and BD. Here, we computed conditional Q–Q curves of data from the Psychiatric Genome Consortium (SCZ; n = 9,379 cases and n = 7,736 controls; BD: n = 6,990 cases and n = 4,820 controls) to show enrichment of SNPs associated with SCZ as a function of association with BD and vice versa with a corresponding reduction in FDR. Applying the conditional FDR method, we identified 58 loci associated with SCZ and 35 loci associated with BD below the conditional FDR level of 0.05. Of these, 14 loci were associated with both SCZ and BD (conjunction FDR). Together, these findings show the feasibility of genetic pleiotropy-informed methods to improve gene discovery in SCZ and BD and indicate overlapping genetic mechanisms between these two disorders.


Introduction
Converging evidence suggests that complex human phenotypes are influenced by numerous genes each explaining a small proportion of the variance [1]. Though thousands of single nucleotide polymorphisms (SNPs) have been identified by genome-wide association studies (GWAS) [2,3], these SNPs fail to explain a large proportion of the heritability of most complex phenotypes studied. This is commonly referred to as the 'missing heritability' problem. Recent findings indicate that GWAS have the potential to explain a greater proportion of the heritability of common complex phenotypes [4][5][6], and more SNPs are likely to be identified in larger samples [7]. Due to the polygenic nature of most complex traits and disorders, a large number of SNPs are likely to have associations too small in magnitude to be identified with currently available sample sizes [8]. New analytical methods are therefore needed to reliably identify a larger proportion of SNPs associated with complex diseases and phenotypes, since recruitment and genotyping of sufficiently large samples for existing methods may be impractical and prohibitively expensive.
Genetic pleiotropy is defined as a single gene or variant being associated with more than one distinct phenotype. In the present study we use a new genetic pleiotropy-informed approach for GWAS to capture more of the polygenic effects in complex phenotypes. Given the high number of traits in humans, and the relatively small number of genes (~20,000), some genes have to affect multiple traits (genetic pleiotropy) [10]. By combining independent GWAS from associated disorders, we hypothesize that for disorders with related etiologies a genetic pleiotropy-informed approach can significantly improve gene discovery and help capture more of the missing heritability.
Recent findings suggest overlapping SNPs between several human traits [9,11] and disorders [12][13][14]. To date, methods to assess this genetic pleiotropy have not taken full advantage of the existing GWAS data, and the majority of studies have focused on the subset of SNPs exceeding a Bonferroni-corrected threshold of significance for each trait or disorder [12][13][14]. However, this approach cannot detect SNPs that only reach genome-wide significance in the combined analysis but do not meet Bonferroni-corrected significance in the individual phenotype (hereafter referred to as polygenic pleiotropy). Combining GWAS statistics from two disorders also provides increased power to discover genes associated with common biological mechanisms, and can thus inform on overlapping pathophysiological relationships between the disorders. In the current study, we use a pleiotropy-informed statistical approach to improve gene discovery in schizophrenia and bipolar disorder, two disorders with high heritability [15], where most of the underlying genetic architecture remains unknown [13,14], despite recent discoveries [13,14,16,17].
Schizophrenia and bipolar disorder share several clinical characteristics [18][19][20], including psychotic symptoms, disorders of thought and impairment of cognitive functions [21]. The disorders are often also treated with similar pharmacological agents [18,19]. Whether schizophrenia and bipolar disorder should be regarded as separable disease entities or as a single disease with a spectrum of symptoms [18][19][20], as proposed in the continuum hypothesis of psychosis [22], has been much discussed. With the forthcoming revision of the Diagnostic and Statistical Manual of Mental Disorders (DSM), this question has received renewed attention [19,20,23]. Both disorders have an estimated heritability of 0.7-0.8, and are regarded as complex disorders with a polygenic architecture. Several lines of evidence have suggested overlapping genetic susceptibility in bipolar disorder and schizophrenia [15,24,25,26]. Recently, a combined analysis of two large GWAS (16,374 cases and 12,044 controls) revealed three loci (CACNA1C, rs4765905, p = 7.0×10^-9; ANK3, rs10994359, p = 2.5×10^-8; ITIH3/4 region, rs2239547, p = 7.8×10^-9) significantly associated with both disorders (Fisher's combined p in combined samples) [13,14]. Still, given the high degree of heritability and large similarities in clinical phenotypes, there are likely several more undiscovered overlapping genetic factors.
Here, using summary statistics from two independent large GWAS, we applied a model-free statistical analysis method to identify SNPs exhibiting pleiotropic relationships between schizophrenia and bipolar disorder. First, we separated out the common controls in the bipolar disorder and schizophrenia samples [13,14], ensuring non-overlapping samples. After applying genomic inflation control, we computed the conditional empirical cumulative distribution functions (cdfs) of the corrected p-values. Empirical cdfs for schizophrenia SNP p-values were determined conditional on the significance of the corresponding nominal p-values in bipolar disorder, and vice versa. For each nominal p-value, an estimate of the conditional False Discovery Rate (FDR) was obtained from the conditional empirical cdfs [5]. Using this conditional FDR method, we constructed two-dimensional FDR "look-up" tables, with FDR in schizophrenia SNPs computed conditional on nominal bipolar disorder p-values, and vice versa. Using these tables we identified 58 loci associated with schizophrenia and 35 loci associated with bipolar disorder at a conditional FDR level of 0.05. We used a conjunction method to investigate SNPs significantly associated with both schizophrenia and bipolar disorder. Specifically, we computed the conditional FDR for schizophrenia given bipolar disorder nominal p-values, and conditional FDR for bipolar disorder given schizophrenia nominal p-values, and took the maximum of both values as the conjunction FDR. With this approach we identified 14 pleiotropic loci indicating several overlapping genetic risk factors for the two disorders. Finally, using mixture model-based analyses we estimated the proportion and distribution of non-null SNPs, demonstrating that the large increase in power from using conditional vs. unconditional FDR methods is derived from the high polygenicity of both phenotypes with many test statistics just below significance thresholds, and the largely overlapping distribution (high degree of pleiotropy) of non-null SNPs for schizophrenia and bipolar disorder.
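The conditional FDR estimate outlined above can be illustrated with a short sketch. This is not the authors' implementation: it assumes the conservative estimate FDR ≈ p/q, where q is the empirical cdf of the primary-phenotype p-values among SNPs whose p-value in the conditioning phenotype falls at or below a threshold. The function name `conditional_fdr` and the toy p-values are hypothetical.

```python
from bisect import bisect_right


def conditional_fdr(p_primary, p_cond, p, threshold):
    """Estimate the FDR at primary p-value `p`, using only SNPs whose
    conditioning-phenotype p-value is <= `threshold`."""
    # Primary p-values of the SNPs in the conditioning stratum, sorted.
    stratum = sorted(a for a, b in zip(p_primary, p_cond) if b <= threshold)
    if not stratum:
        raise ValueError("no SNPs satisfy the conditioning threshold")
    # Conditional empirical cdf evaluated at p.
    q = bisect_right(stratum, p) / len(stratum)
    if q == 0:
        return 1.0  # no SNP this extreme in the stratum: be conservative
    return min(1.0, p / q)


# Hypothetical p-values for ten SNPs in two phenotypes (illustration only).
p_scz = [0.0001, 0.0004, 0.02, 0.3, 0.5, 0.7, 0.9, 0.6, 0.45, 0.8]
p_bd = [0.0005, 0.002, 0.009, 0.6, 0.4, 0.008, 0.7, 0.9, 0.3, 0.55]

unconditional = conditional_fdr(p_scz, p_bd, 0.0004, threshold=1.0)
conditional = conditional_fdr(p_scz, p_bd, 0.0004, threshold=0.01)
```

With a threshold of 1.0 every SNP is included, so the call reduces to an unconditional estimate. In this toy data the same nominal p-value receives a lower estimated FDR once the stratum is restricted to SNPs that are also associated with the second phenotype.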

Author Summary
Genome-wide association studies (GWAS) have thus far identified only a small fraction of the heritability of common complex disorders, such as severe mental disorders. We used a conditional false discovery rate approach for analysis of GWAS data, exploiting "genetic pleiotropy" to increase discovery of common gene variants associated with schizophrenia and bipolar disorders. Leveraging the increased power from combining GWAS of two associated phenotypes, we found a striking overlap in polygenic signals, allowing for the discovery of several new common gene variants associated with bipolar disorder and schizophrenia that were not identified in the original analysis using traditional GWAS methods. Some of the gene variants have been identified in other studies with large targeted replication samples, validating the present findings. Our pleiotropy-informed method may be of significant importance for detecting effects that are below the traditional genome-wide significance level in GWAS, particularly in highly polygenic, complex phenotypes, such as schizophrenia and bipolar disorder, where most of the genetic signal is missing (i.e., "missing heritability"). The findings also offer insights into mechanistic relationships between bipolar disorder and schizophrenia pathogenesis.

Q-Q plots of schizophrenia SNPs conditional on association with bipolar disorder and vice versa
Under large-scale testing paradigms, such as GWAS, quantitative estimates of likely true associations can be estimated from distributions of summary statistics [27,28]. A common method for visualizing the 'enrichment' of statistical association relative to that expected under the global null hypothesis is through Q-Q plots of nominal p-values obtained from GWAS summary statistics. The usual Q-Q curve has the nominal p-value, denoted by "p", as the y-ordinate and the corresponding value of the empirical cdf, here denoted by "q", as the x-ordinate. Under the global null hypothesis the theoretical distribution is uniform on the interval [0,1]. As is common in GWAS, we instead plot -log10(p) against -log10(q) to emphasize tail probabilities of the theoretical and empirical distributions. As such, genetic 'enrichment' refers to a leftward shift in the Q-Q curve, corresponding to a larger fraction of SNPs with nominal -log10(p) greater than or equal to a given threshold. Conditional Q-Q plots are formed by creating subsets of SNPs based on values of an additional variable (auxiliary measure) for each SNP, and computing Q-Q plots separately for each subset of SNPs. If SNP enrichment is captured by variation in the auxiliary measure, this is expressed as successive leftward deflections in conditional Q-Q plots as values of the additional variable increase.
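The construction of a conditional Q-Q curve can be sketched as follows. This is a minimal illustration under stated assumptions rather than the plotting code used in the study: the empirical cdf q is taken as rank/n within each stratum, and the function name `conditional_qq` and its arguments are hypothetical.

```python
import math


def conditional_qq(p_primary, p_aux, min_neglog_aux):
    """Return Q-Q coordinates (x, y) = (-log10 q, -log10 p) for the SNPs
    whose auxiliary p-value satisfies -log10(p_aux) >= min_neglog_aux."""
    stratum = sorted(
        p for p, a in zip(p_primary, p_aux)
        if -math.log10(a) >= min_neglog_aux
    )
    n = len(stratum)
    xs, ys = [], []
    for rank, p in enumerate(stratum, start=1):
        q = rank / n  # empirical cdf at p within this stratum
        xs.append(-math.log10(q))
        ys.append(-math.log10(p))
    return xs, ys


# One curve per stratum; cutoffs 0, 1, 2, 3 mirror the categories in the text.
# curves = {c: conditional_qq(p_scz, p_bd, c) for c in (0, 1, 2, 3)}
```

Plotting the curves for successive cutoffs on shared axes, together with the line y = x for the null, would make enrichment visible as curves that lie progressively further to the left.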
Conditional Q-Q plots for schizophrenia given nominal p-values of association with bipolar disorder (SCZ|BD; Figure 1A) show enrichment across different levels of significance for bipolar disorder. The earlier departure from the null line (leftward shift) suggests a greater proportion of true associations for a given nominal schizophrenia p-value. Successive leftward shifts for decreasing nominal bipolar disorder p-value thresholds indicate that the proportion of non-null effects in schizophrenia varies considerably across different levels of association with bipolar disorder. For example, the proportion of SNPs in the -log10(p_BD) ≥ 3 category reaching a given significance level for schizophrenia (e.g., -log10(p_SCZ) ≥ 4) is roughly 50 times greater than for the -log10(p_BD) ≥ 0 category (all SNPs), indicating a high level of enrichment. An even stronger pleiotropic enrichment can be seen for bipolar disorder conditioned on nominal p-values of association with schizophrenia (BD|SCZ; Figure 1B). Here, the proportion of SNPs in the -log10(p_SCZ) ≥ 3 category reaching a given significance level for bipolar disorder (e.g., -log10(p_BD) ≥ 4) is roughly 500 times greater than for the -log10(p_SCZ) ≥ 0 category (all SNPs), indicating a very high level of enrichment.
Figure 1. Upper panel: conditional Q-Q plots (corrected for inflation) in A) schizophrenia (SCZ) below the standard GWAS threshold of p < 5×10^-8 as a function of significance of the association with bipolar disorder (BD) at the level of -log10(p) > 0, -log10(p) > 1, -log10(p) > 2, -log10(p) > 3, corresponding to p < 1, p < 0.1, p < 0.01, p < 0.001, respectively, and in B) BD below the standard GWAS threshold of p < 5×10^-8 as a function of significance of association with SCZ at the level of -log10(p) > 0, -log10(p) > 1, -log10(p) > 2, -log10(p) > 3, corresponding to p < 1, p < 0.1, p < 0.01, p < 0.001, respectively. Dotted lines indicate the null hypothesis. Lower panel: Stratified True Discovery Rate (TDR) plots illustrating the increase in TDR associated with increased pleiotropic enrichment in C) SCZ conditional on nominal BD p-values (SCZ|BD), and D) BD conditional on nominal SCZ p-values (BD|SCZ). For more information about Q-Q plots, see Text S1. doi:10.1371/journal.pgen.1003455.g001

Conditional True Discovery Rate (TDR) in schizophrenia is increased by bipolar disorder, and vice versa
Since categories of SNPs with stronger pleiotropic enrichment are more likely to be associated with schizophrenia, tag SNPs should not all be treated as exchangeable if power for discovery is to be maximized. Specifically, variation in enrichment across pleiotropic categories is expected to be associated with corresponding variation in the TDR (equivalent to 1 - FDR) [29] for association of SNPs with schizophrenia. A conservative estimate of the TDR for each nominal p-value is equivalent to 1 - (p/q), easily read off from conditional Q-Q plots (see Material and Methods). This relationship is shown for schizophrenia conditioned on nominal bipolar disorder p-values (SCZ|BD; Figure 1C) and bipolar disorder conditioned on nominal schizophrenia p-values (BD|SCZ; Figure 1D). For a given conditional TDR the corresponding estimated nominal p-value threshold varies by a factor of 100 from the most to the least enriched SNP category for schizophrenia conditioned on bipolar disorder (SCZ|BD), and by approximately a factor of 500 for bipolar disorder conditioned on schizophrenia (BD|SCZ).
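As a small worked illustration of the relationship TDR = 1 - (p/q), the sketch below evaluates the conservative estimate for one nominal p-value under two different values of the conditional empirical cdf q. The numbers are hypothetical and chosen only to show the direction of the effect; the function name `conservative_tdr` is likewise an assumption.

```python
def conservative_tdr(p, q):
    """Conservative TDR estimate 1 - p/q, floored at zero.

    p is the nominal p-value; q is the empirical cdf value at p within the
    SNP category of interest (the x-ordinate of the Q-Q curve)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if not 0.0 < q <= 1.0:
        raise ValueError("q must lie in (0, 1]")
    return max(0.0, 1.0 - p / q)


# Same nominal p-value, two hypothetical SNP categories:
p = 1e-4
tdr_all_snps = conservative_tdr(p, q=2e-4)   # weakly enriched category
tdr_enriched = conservative_tdr(p, q=1e-2)   # strongly enriched category
```

The more a category is enriched, the larger q is at a given p, and the higher the estimated TDR for the same nominal p-value.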

Schizophrenia gene loci identified with conditional FDR
We constructed a "conditional" Manhattan plot for schizophrenia showing the FDR conditional on bipolar disorder (Figure 2) and identified significant loci on a total of 18 chromosomes (1-4, 6-16, 18, 20 and 22) associated with schizophrenia, leveraging the reduced FDR obtained by the associated bipolar disorder phenotype. To estimate the number of independent loci, we 'pruned' the associated SNPs (removed SNPs with linkage disequilibrium (LD) > 0.2), and identified a total of 58 independent loci with a significance threshold of conditional FDR < 0.05 (Table 1). Using the more conservative conditional FDR threshold of 0.01, 9 independent loci remained significant. One locus was located in the HLA region on chromosome 6. Of note, using a standard Bonferroni-corrected approach, no loci would have been discovered. Using the FDR method in schizophrenia alone, 4 loci were identified. Of these, the regions close to TRIM26 (6p21.3), MMP16 (8q21.3) and NT5C2 (10q24.32) have been identified in earlier GWAS after including large replication samples [13]. The remaining loci would not have been identified in the current sample without using the pleiotropy-informed conditional FDR method. Of interest, the VRK2 region (2p16.1) was identified in the previous sample after including a large schizophrenia replication sample [30], and the ITIH4 region (3p21.1), ANK3 (10q21) and CACNA1C (12p13.3) were discovered previously in the same, combined schizophrenia and bipolar disorder sample [13,14]. Thus, the current pleiotropy-informed FDR method validated 7 loci discovered in considerably larger samples, and discovered 51 new loci.
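The pruning step can be sketched as a greedy procedure. The text specifies only that SNPs with LD > 0.2 were removed; the greedy ordering by conditional FDR, the use of a pairwise r² matrix as the LD measure, and the name `prune_by_ld` are assumptions made for this illustration.

```python
def prune_by_ld(fdr, r2, max_r2=0.2):
    """Greedy LD pruning.

    fdr: conditional FDR value per SNP.
    r2:  symmetric matrix (list of lists) of pairwise LD between SNPs.
    Returns indices of retained SNPs, most significant first."""
    order = sorted(range(len(fdr)), key=lambda i: fdr[i])
    kept = []
    for i in order:
        # Keep SNP i only if it is not in LD with any SNP already kept.
        if all(r2[i][j] <= max_r2 for j in kept):
            kept.append(i)
    return kept


# Hypothetical example: SNPs 0 and 1 tag the same locus (r2 = 0.8).
fdr = [0.001, 0.004, 0.03, 0.02]
r2 = [
    [1.0, 0.8, 0.0, 0.0],
    [0.8, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.1],
    [0.0, 0.0, 0.1, 1.0],
]
independent = prune_by_ld(fdr, r2)
```

In this toy case SNP 1 is dropped because it is in strong LD with the more significant SNP 0, leaving three independent loci.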

Bipolar disorder gene loci identified with conditional FDR
We constructed a "conditional" Manhattan plot for bipolar disorder showing the FDR conditional on schizophrenia (Figure 3) and identified significant loci on a total of 16 chromosomes (1-3, 5-8, 10-14, 16 and 19-22) associated with bipolar disorder, leveraging the reduced FDR obtained by the associated schizophrenia phenotype. To estimate the number of independent loci, we pruned the associated SNPs (removed SNPs with LD > 0.2), and identified a total of 35 independent loci with a significance threshold of conditional FDR < 0.05 (Table 2). Of these, one locus was complex, i.e. included several significant SNPs, and the rest were single gene loci. Using the more conservative conditional FDR threshold of 0.01, 5 independent loci remained significant. The most significant locus was close to ANK3 on chromosome 10 (10q21). This is the only locus that would have been discovered using standard methods based on p-values (Bonferroni correction). Using the FDR method in bipolar disorder alone, an additional locus was identified, close to CACNA1C (12p13.3). Both these loci have been discovered earlier [14,31]. The remaining 33 loci would not have been identified in the current sample without using the pleiotropy-informed conditional FDR method. Of these, the regions close to SYNE1 (6q25) and ODZ4 (11q14.1) have been identified in earlier GWAS after including large replication samples [14,32]. Of interest, the ITIH3 region (3p21.1), ANK3 (10q21) and CACNA1C (12p13.3) were discovered previously in the same, combined schizophrenia and bipolar disorder sample [13,14]. Thus, pleiotropy-informed conditional FDR validated 5 loci discovered in considerably larger samples, and discovered 30 new loci.

Pleiotropic gene loci in both schizophrenia and bipolar disorder identified with conjunctional FDR
To identify pleiotropic loci in schizophrenia and bipolar disorder, we performed a conjunction FDR analysis, using this to construct a "conjunction" Manhattan plot (Figure 4). We detected 14 independent pleiotropic loci (pruned based on LD > 0.2, black line around large circles) with conjunction FDR < 0.05, all single gene loci, located on a total of 10 chromosomes (chr. 1, 3, 6, 7, 10, 12, 14, 16, 20, 22; for further details, see Table 3). Of these loci, 3 have been implicated in bipolar disorder and schizophrenia earlier: NOTCH4 (6p21.2) with schizophrenia using a larger replication sample [13,17], and the ITIH4 (3p21.1) and CACNA1C (12p13.3) regions, both discovered previously in the same, combined schizophrenia and bipolar disorder sample [13,14]. Interestingly, only one conjunctional locus was found on chromosome 6, suggesting that there are several schizophrenia loci on this chromosome not overlapping with bipolar disorder. The ANK3 locus was not indicated in the conjunctional analysis, which indicates that the overlap is mostly driven by the association in bipolar disorder (Table 2). The direction of the effect (z-scores) across all the pleiotropic SNPs was the same for bipolar disorder and schizophrenia, except for locus 33 (BC039673, 20p13), which could be due to differences in LD structure in this region. These findings suggest overlapping genetic pathways in schizophrenia and bipolar disorders.
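The conjunction statistic itself is simple to express: for each SNP it is the larger of the two conditional FDR values. The sketch below assumes the conditional FDR values have already been computed; the function names and the three example SNPs are hypothetical.

```python
def conjunction_fdr(cfdr_a_given_b, cfdr_b_given_a):
    """Conjunction FDR per SNP: the maximum of the two conditional FDRs."""
    if len(cfdr_a_given_b) != len(cfdr_b_given_a):
        raise ValueError("inputs must contain one value per SNP")
    return [max(a, b) for a, b in zip(cfdr_a_given_b, cfdr_b_given_a)]


def pleiotropic_hits(cfdr_a_given_b, cfdr_b_given_a, threshold=0.05):
    """Indices of SNPs whose conjunction FDR is below the threshold."""
    conj = conjunction_fdr(cfdr_a_given_b, cfdr_b_given_a)
    return [i for i, value in enumerate(conj) if value < threshold]


# Hypothetical conditional FDR values for three SNPs.
scz_given_bd = [0.004, 0.200, 0.030]
bd_given_scz = [0.010, 0.001, 0.080]
hits = pleiotropic_hits(scz_given_bd, bd_given_scz)
```

The second SNP in this toy data is strongly associated with one phenotype only, so taking the maximum excludes it, which is the same pattern the text describes for a locus whose signal is driven mostly by one disorder.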

Model-based power analyses
Our model-free conditional FDR analyses circumvent the issue of bias due to model misspecification. However, to ascertain the impact of effective sample size and conditioning on relative power over using unconditioned FDR on current sample sizes, it is necessary to use a model-based approach that estimates the proportion and distribution of non-null SNPs [33]. We thus posit a mixture of null and non-null Gaussian distributions [34] (see Methods and Text S1). Resulting model fits are displayed in Figure 5 for schizophrenia and bipolar disorder for absolute z scores ≥ 3. Left panels are actual data, whereas right panels are hypothetical realizations from a doubling of effective sample size, generated from mixture model fits. Null densities largely coincide with the overall densities except for z scores with absolute value larger than 4, at which point the ratio of null to total SNPs, equal to the local false discovery rate (local FDR), is less than 0.5 (left panels of Figure 5). Thus, while highly polygenic, most non-null SNPs have local FDR much larger than 0.05. The local FDR does not drop below 0.05 until absolute z scores exceed 5. Far more of the "hidden" non-null SNPs lie below this significance threshold than above it. Many of these hidden SNPs lie just below the significance threshold, so that an effective doubling of the sample size produces a ~30 times increase in number of rejected non-null SNPs with local FDR ≤ 0.05 (right panels of Figure 5).
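The local FDR under a two-component mixture can be written as the null density times its weight, divided by the total mixture density. The sketch below is a simplified illustration, not the fitted model from the study: it assumes a standard normal null, a single zero-mean Gaussian for non-null z scores, and that doubling the effective sample size doubles the excess variance of the non-null component. The mixing proportion and standard deviation are hypothetical.

```python
import math


def normal_pdf(z, sd):
    """Density of a zero-mean Gaussian with standard deviation sd."""
    return math.exp(-0.5 * (z / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def local_fdr(z, pi0, sd_nonnull):
    """Local FDR: ratio of the weighted null density to the mixture density."""
    null = pi0 * normal_pdf(z, 1.0)
    nonnull = (1.0 - pi0) * normal_pdf(z, sd_nonnull)
    return null / (null + nonnull)


def sd_after_doubling(sd_nonnull):
    """Non-null z-score sd if the effective sample size doubles.

    Assumes var(z) = 1 + N * tau^2 for non-null SNPs, so the excess
    variance (sd^2 - 1) scales linearly with sample size N."""
    return math.sqrt(1.0 + 2.0 * (sd_nonnull ** 2 - 1.0))


PI0 = 0.988        # hypothetical proportion of null SNPs
SD_NONNULL = 1.8   # hypothetical sd of non-null z scores

before = local_fdr(4.0, PI0, SD_NONNULL)
after = local_fdr(4.0, PI0, sd_after_doubling(SD_NONNULL))
```

Under these assumptions the local FDR at a fixed z score falls when the sample size doubles, because the non-null component spreads further into the tails while the null component stays fixed.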
Another model-based analysis using a bivariate mixture of Gaussians showed that a very high proportion of the non-null schizophrenia SNPs are also non-null for bipolar disorder (and vice versa), leading to large increases in power when using the conditional FDR approach. This increase in power is also due to the large number of SNPs with p-values just below the Bonferroni threshold. Figure 6 shows the power, or sensitivity, to detect non-null SNPs for differing local FDR cut points from unconditional and conditional local FDR and, for comparison, from a hypothetical doubling of the number of subjects. Using conditional over unconditional local FDR results in an increase of 15-20 times the number of non-null SNPs discovered for a local FDR ≤ 0.05. The increase in power for conditional FDR, while dramatic, is not as large as what would be obtained by doubling the sample size. This is not unexpected, given that the highly polygenic non-null SNPs for schizophrenia and bipolar disorder, many just below the given significance thresholds, are largely but not completely overlapping. Note that, given their highly polygenic distribution, the vast majority of non-null SNPs remain undiscovered even using conditional FDR approaches or under an effective doubling of the number of subjects.
To test for enrichment with a "control trait" with little or no polygenic overlap with psychiatric disease, we performed pleiotropy analysis using type 2 diabetes (T2D) GWAS data. The analyses confirmed that there was a very small level of pleiotropic enrichment between schizophrenia and T2D, leading to little if any improvement in statistical power (see Text S1 and Figure S7).

Discussion
In the present study we leveraged the power of GWAS data from two independent schizophrenia and bipolar disorder samples, and demonstrated how GWAS from associated psychiatric disorders can improve discovery of novel susceptibility loci. Using standard GWAS analytical methods, we identified only one significant locus. By applying traditional FDR methods in the separate GWAS samples, we found an additional 6 loci (2 in bipolar disorder, 4 in schizophrenia). Combining the independent schizophrenia and bipolar disorder GWAS samples, we identified a total of 58 loci in schizophrenia and 35 in bipolar disorders, with conditional FDR < 0.05 as a threshold. Nine of the current loci have been identified earlier in larger samples using standard GWAS analytical methods (7 in schizophrenia, 5 in bipolar disorder, and 3 in combined samples), while 10 other loci have been reported to show borderline association with bipolar disorder or schizophrenia (Table S1). These results demonstrate the feasibility of using a cost-effective, pleiotropy-informed conditional FDR approach to discover common variants in schizophrenia and bipolar disorders.
The proposed statistical approach is based on the observation that not all SNPs should be treated as exchangeable. Rather, a SNP with large effects in two associated phenotypes has a higher probability of being a true non-null effect, and hence also a higher probability of being replicated in independent studies. We thus applied a conditional FDR approach we have previously developed for GWAS p-values [36], adapted from methods originally used for linkage analysis and microarray expression data [5,37]. Decreased conditional FDR (equivalently, increased conditional TDR) for a given nominal p-value increases power to detect true non-null effects. Increased conditional TDR is directly related to increased replication effect sizes and replication rates in de novo samples. Using this conditional approach we were able to increase power to detect true non-null signals in independent studies for given nominal p-value cut-offs. Equivalently, the conditional approach can be used to control the FDR at a given level while increasing power to discover non-null SNPs over approaches that treat all SNPs as interchangeable. We also applied a previously developed conjunction FDR approach [36] to investigate which SNPs are pleiotropic, impacting risk of both schizophrenia and bipolar disorder. The conjunction statistic used is the maximum of the conditional FDR for schizophrenia given bipolar disorder and vice versa. SNPs that exceed a stringent conjunction threshold are thus highly likely to be non-null in the two phenotypes simultaneously. The extra number of significant loci identified in the current study compared to 'conventional' GWAS methods is remarkable. The power analyses suggest that the large increase in power is due to the conditional FDR method, and not an implicit higher false discovery rate. Compared to conventional GWAS methods, traditional FDR methods only identified a few extra loci.
The large increase in power came from using conditional FDR, which identified 14.5 times as many schizophrenia SNPs and 17.5 times as many bipolar SNPs (at the FDR ≤ 0.05 level) compared to traditional FDR methods. This large increase in power seems to be due to two factors: the highly polygenic distribution of non-null SNPs and the high degree of pleiotropy between schizophrenia and bipolar disorder. We quantified this using a model-based mixture of null and non-null Gaussian distributions [34]. Mixture models estimate that roughly 1.2% of tag SNPs are non-null in both bipolar disorder and schizophrenia. With over 1 million assayed SNPs in common between both phenotypes, the number of un-pruned, non-null SNPs is thus in excess of 12,000 in each phenotype. The vast majority of these non-null SNPs are hidden within the large proportion (~99%) of null SNPs. Results are in line with recent findings of a high proportion of variation in schizophrenia susceptibility captured by common SNPs [6]. Taken together, these findings strongly suggest that empirical Bayes methods, as outlined by Efron [27], should be the method of choice for analyzing GWAS of polygenic human phenotypes, and for leveraging pleiotropy with other complex human traits.
The current findings of polygenic enrichment suggest that genetic pleiotropy is important in severe mental disorders, as has been indicated earlier [13][14][15]24,25]. However, by using conditional FDR, we were able to leverage the overlapping polygenic architecture to identify more of the specific SNPs involved. The current approach identified 58 loci in schizophrenia compared to 7 in the original publication [13]. In bipolar disorder, the added power from schizophrenia GWAS identified 35 loci compared to two loci in the original study [14]. It is important to note that this improvement in gene discovery was obtained despite the much smaller number of controls in the current analyses, because the original analyses of the two disorders used largely overlapping control samples. Since we used data from the 1000 Genomes Project (1KGP) to calculate LD structure, the number of loci can vary somewhat compared to the original analysis. For both disorders, most of the current findings were borderline significant in the original GWAS mega-analysis, or identified in other GWAS of partly overlapping samples, such as TRANK1 [38] and SYNE1 [32]. Several of the currently identified genes have been associated in previous candidate gene studies, such as DAOA [39]. Further, we identified 14 loci strongly associated with both disorders, compared to three in the original combined analysis [13,14]. Previous studies have mainly used Fisher combined tests for joint analysis, which test the null hypothesis of no association in any phenotype, which means that the signal can be driven by one of the phenotypes. In contrast, conjunction FDR analyses assess the evidence that both phenotypes are non-null, so a significant result cannot be driven by one phenotype alone. It is therefore difficult to directly compare the current findings with previous results. However, of the three identified loci in previous combined analysis [13,14], both the ITIH3-4 and CACNA1C regions were confirmed with the conjunctional analyses, but not the ANK3 region. We found the latter to be associated with bipolar disorder in the current analysis, which suggests that previous results found with Fisher combined statistics were driven by the stronger association in bipolar disorder [13,14].
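The way a Fisher combined test can be driven by a single phenotype can be seen in a small numerical sketch. It uses Fisher's method for two independent p-values, for which the chi-square survival function with 4 degrees of freedom has the closed form exp(-x/2)(1 + x/2). The p-values are hypothetical, and the maximum of the two p-values is used here only as a simple stand-in for a conjunction-style statistic; the study's conjunction statistic is the maximum of two conditional FDR values.

```python
import math


def fisher_combined_p(p1, p2):
    """Fisher's combined p-value for two independent p-values.

    X = -2 (ln p1 + ln p2) follows a chi-square distribution with 4
    degrees of freedom under the joint null, whose survival function
    is exp(-x/2) * (1 + x/2)."""
    x = -2.0 * (math.log(p1) + math.log(p2))
    return math.exp(-x / 2.0) * (1.0 + x / 2.0)


# Hypothetical SNP: very strong in one phenotype, no signal in the other.
p_bd, p_scz = 1e-10, 0.5
combined = fisher_combined_p(p_bd, p_scz)   # passes 5e-8 on one signal alone
conjunction_like = max(p_bd, p_scz)         # stays at 0.5
```

The combined p-value remains below the genome-wide threshold even though the second phenotype contributes no evidence, whereas a maximum-based statistic is limited by the weaker of the two associations.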
The current findings suggest some interesting gene candidates related to overlapping biology of bipolar disorder and schizophrenia. The Major Histocompatibility Complex loci associations with schizophrenia in previous studies [13,17] seem not to be strengthened by the combined analysis with bipolar disorder, as they are minimally represented among the current pleiotropic loci (conjunction FDR analyses). The only pleiotropic gene on chromosome 6 was NOTCH4, which has recently also been implicated in bipolar disorder [26,40]. The current findings strengthen the involvement of genes related to calcium homeostasis and receptor functioning. In schizophrenia, both CACNA1C and ANK3 were identified, and in bipolar disorder TRANK1 and CACNB2 were also significantly associated. CACNA1C and CACNB2 are related to key proteins involved in unifying the generation of calcium spikes in neocortical pyramidal neurons, which is a closely integrated process [41]. It is likely that such functional processes could be involved in generation of symptoms in severe mental disorders, and may thus be a potential therapeutic target. Interestingly, PPM1F, a Mg2+/Mn2+ dependent protein phosphatase, related to calcium/calmodulin-dependent protein kinase II gamma, was also associated with both disorders, and seems to further strengthen the hypothesis that alterations in electrophysiological function play a role in the pathophysiology of these disorders. It is also noteworthy that SNPs located close to MAD1L1 were significantly associated with both schizophrenia and bipolar disorder. MAD1L1 is located in a human accelerated region in the genome, which shows a large difference between humans and chimpanzees [42], and thus is suggested to be involved in human-specific traits.
In addition to uncovering more of the missing heritability of bipolar disorder and schizophrenia, the current findings support the notion that genetic pleiotropy is important for variation in human phenotypes [9], and suggest that there is substantial polygenic pleiotropy between bipolar disorder and schizophrenia which warrants further exploration. In the current study we defined pleiotropy as a single gene or variant being associated with more than one distinct phenotype (diseases) [9]. It is possible that some of the loci identified in the current study are not pleiotropic but rather underlie common aspects of the schizophrenia and bipolar disorder phenotypes [9]. This possibility warrants further investigation, but requires samples with more detailed information on clinical characteristics. In the current analyses we focused on SNPs, but gene-based pleiotropy is also of interest [10], as is the use of the current approach for developing methods for risk prediction across traits. However, these applications require raw data from individual participants and these data are not currently available.
In conclusion, the current findings demonstrate that in schizophrenia and bipolar disorder, pleiotropy-informed conditional FDR can improve the statistical power for detecting novel polygenic effects. Results from conditional and conjunction FDR analyses also offer insights into potential shared mechanistic relationships between these two mental disorders.

Ethics statement
The relevant institutional review boards or ethics committees approved the research protocol of the individual GWAS used in the current analysis and all human participants gave written informed consent.

Participant samples
We obtained complete GWAS results in the form of summary statistics p-values from the Psychiatric GWAS Consortium (PGC) Schizophrenia and Bipolar Disorder Working Groups. The schizophrenia (SCZ) GWAS summary statistics results were obtained from the PGC Schizophrenia Work Group [13], which consisted of 9,394 cases with schizophrenia or schizoaffective disorder and 12,462 controls (52% screened) from a total of 17 samples from 11 countries. Semi-structured interviews were used by trained interviewers to collect clinical information, and operational criteria were used to establish diagnosis. The quality of phenotypic data was verified by a systematic review of data collection methods and procedures at each site, and only studies that fulfilled these criteria were included. Controls were selected from the same geographical and ethnic populations as cases. For further details on sample characteristics and quality control procedures applied, please see Ripke et al. [13].
The bipolar disorder (BD) GWAS summary statistics results were obtained from the PGC Bipolar Disorder Working Group [14], which consisted of n = 16,731 participants, including 7,481 cases and 9,250 controls, from 11 studies from 7 countries. Standardized semi-structured interviews were used by trained interviewers to collect clinical information about lifetime history of psychiatric illness, and operational criteria were applied to make lifetime diagnoses according to recognized classifications. All cases had experienced pathologically relevant episodes of elevated mood (mania or hypomania) and met operational criteria for a BD diagnosis. The sample consisted of BD I (84%), BD II (11%), schizoaffective disorder bipolar type (4%), and BD NOS (1%). Controls were selected from the same geographical and ethnic populations as cases. For further details on sample characteristics and quality control procedures applied, please see Sklar et al. [14].
Due to overlapping control samples in these studies, the common controls were split randomly, and divided between the two case-control analyses. All results presented here are based on these non-overlapping control samples, with n = 9379 cases and n = 7736 control samples in schizophrenia, and n = 6990 cases and n = 4820 controls in bipolar disorder analyses.

Statistical analyses
Analyses implemented here were motivated by previously published stratified FDR methods [5,37]. However, we found that stratified empirical cdfs exhibited a high degree of variability. Instead, we computed empirical cdfs for the first phenotype conditional on nominal p-values of the second being at or below a given threshold. These conditional empirical cdfs vary more smoothly as a function of p-value thresholds in the second (associated) phenotype than do empirical cdfs employing disjoint strata. Conditional FDR estimates derived from the conditional empirical cdfs are a simple extension of Efron's Empirical Bayes FDR methods [33].

Table 3. Independent complex or single gene loci ($r^2 < 0.2$) with SNP(s) with a conjunctional FDR (conjFDR) $< 0.05$ in schizophrenia (SCZ) and bipolar disorder (BD). All SNPs with a conjFDR value $< 0.05$ (bidirectional association, i.e. association with SCZ given association with BD (condFDR $< 0.05$) and association with BD given association with SCZ (condFDR $< 0.05$)) are listed and sorted in each LD block. We defined the most significant SNP in each LD block based on the minimum conjFDR. All independent loci are listed consecutively, and the same locus numbers are used as in the condFDR $< 0.05$ results (Table 1). Chromosome (Chr). Z-scores for each pleiotropic locus are provided, with minor allele (A1) and major allele (A2). All data were first corrected for genomic inflation.
One advantage of the model-free empirical cdf approach is the avoidance of bias in conditional FDR estimates from model misspecification. However, there are inherent limitations to model-free approaches, especially with respect to inferring properties of the non-null distribution and, consequently, estimating power to detect non-null effects. We present complementary model-based analyses in the Supporting Information that estimate conditional and conjunctional local false discovery rate (fdr) [27]. Results presented in the Supporting Information using this model-based fdr corroborate the results of the model-free approaches presented here.

Genomic control
The empirical null distribution in GWAS is affected by global variance inflation due to population stratification and cryptic relatedness [43] and deflation due to over-correction of test statistics for polygenic traits by standard genomic control methods [34]. We applied a control method leveraging only intergenic SNPs which are likely depleted for true associations (Schork et al., under review). First, we annotated the SNPs to genic (5'UTR, exon, intron, 3'UTR) and intergenic regions using information from the 1KGP. As illustrated in Figure S1, there is an enrichment of functional genic regions in schizophrenia compared to the intergenic SNP category. We used intergenic SNPs because their relative depletion of associations suggests that they provide a robust estimate of true null effects and thus seem a better category for genomic control than all SNPs. We converted all p-values to z-scores and for each phenotype we estimated the genomic inflation factor $\lambda_{GC}$ for intergenic SNPs. We computed the inflation factor $\lambda_{GC}$ as the median z-score squared divided by the expected median of a chi-square distribution with one degree of freedom and divided all test statistics by $\lambda_{GC}$. The conditional Q-Q plots for schizophrenia after control for genomic inflation are shown in Figure S1.

Figure 5. Histograms of absolute z-scores for bipolar disorder (BD, top panels) and schizophrenia (SCZ, bottom panels) for z-scores $\ge 3$. Left panels are actual data, whereas right panels are hypothetical realizations from a doubling of effective sample size, generated from mixture model fits of $f(z) = \pi_0 f_0(z) + (1 - \pi_0) f_1(z)$ (see Text S1). Black lines are null sub-densities $\pi_0 f_0(z)$ and red lines are the full mixture densities $f(z)$. The local false discovery rate is the ratio $\mathrm{fdr} = \pi_0 f_0(z)/f(z)$. Vertical black bars in each plot indicate the cut-points where local fdr $\le 0.05$. doi:10.1371/journal.pgen.1003455.g005
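The genomic control step described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the authors' implementation; the function names `genomic_control` and `z_to_p` and the boolean `is_intergenic` annotation vector are hypothetical. It assumes the test statistics being rescaled are the chi-square statistics $z^2$, so that dividing them by the inflation factor is the same as dividing each z-score by its square root.

```python
from statistics import NormalDist, median


def genomic_control(z_scores, is_intergenic):
    """Estimate lambda_GC from intergenic SNPs and rescale all z-scores.

    z_scores: list of z-scores, one per SNP.
    is_intergenic: list of booleans (hypothetical annotation), same length.
    """
    # Median of a chi-square(1) variable equals the squared 75th percentile
    # of the standard normal distribution (about 0.455).
    expected_median = NormalDist().inv_cdf(0.75) ** 2
    intergenic_chi2 = [z * z for z, flag in zip(z_scores, is_intergenic) if flag]
    lambda_gc = median(intergenic_chi2) / expected_median
    # Assumption: dividing chi-square statistics by lambda_GC corresponds to
    # dividing z-scores by sqrt(lambda_GC).
    scale = lambda_gc ** 0.5
    corrected = [z / scale for z in z_scores]
    return lambda_gc, corrected


def z_to_p(z):
    """Two-sided p-value for a z-score under the standard normal null."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))
```

After correction, the median squared z-score among intergenic SNPs matches the chi-square expectation by construction, which is a convenient sanity check on real data.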

Conditional Q-Q plots for assessing pleiotropic enrichment
To assess pleiotropic enrichment, we used Q-Q plots conditioned on 'pleiotropic' effects. For a given associated phenotype, enrichment for pleiotropic signals is present if the degree of deflection from the expected null line is dependent on SNP associations with the second phenotype. We constructed conditional Q-Q plots of empirical quantiles of nominal $-\log_{10}(p)$ values for SNP association with schizophrenia for all SNPs, and for subsets of SNPs determined by the nominal p-values of their association with bipolar disorder being at or below a given threshold. Specifically, we computed the empirical cumulative distribution of nominal p-values for a given phenotype for all SNPs and for SNPs with significance levels below the indicated cut-offs for the other phenotype ($-\log_{10}(p) \ge 0$, $-\log_{10}(p) \ge 1$, $-\log_{10}(p) \ge 2$, $-\log_{10}(p) \ge 3$, corresponding to $p \le 1$, $p \le 0.1$, $p \le 0.01$, $p \le 0.001$, respectively). The nominal p-values ($-\log_{10}(p)$) are plotted on the y-axis, and the empirical quantiles ($-\log_{10}(q)$, where q = 1-empirical cdf(p)) are plotted on the x-axis. To assess for polygenic effects below the standard GWAS significance threshold, we focused the conditional Q-Q plots on SNPs with nominal $-\log_{10}(p) < 7.3$ (corresponding to $p > 5 \times 10^{-8}$).
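As a rough illustration of how one curve of a conditional Q-Q plot could be computed, the sketch below returns the plotted coordinates for a single threshold in the second phenotype. The function name `conditional_qq` is hypothetical, and the empirical quantile q is taken here as the fraction of SNPs in the subset with a p-value at or below p, which is an assumption about the convention used; ties and zero p-values are not handled.

```python
import math


def conditional_qq(p_primary, p_secondary, threshold):
    """Return (x, y) points for one conditional Q-Q curve.

    x = -log10(q), y = -log10(p) for the primary phenotype, restricted to
    SNPs whose p-value in the secondary phenotype is <= threshold.
    """
    subset = sorted(p1 for p1, p2 in zip(p_primary, p_secondary) if p2 <= threshold)
    n = len(subset)
    points = []
    for rank, p in enumerate(subset, start=1):
        q = rank / n  # fraction of subset SNPs with p-value <= p (ties ignored)
        points.append((-math.log10(q), -math.log10(p)))
    return points


# One curve per cut-off, e.g. p <= 1, 0.1, 0.01, 0.001 in the other phenotype.
thresholds = [1.0, 0.1, 0.01, 0.001]
```

Calling the function once per entry in `thresholds` and overlaying the resulting curves would give a plot in which successive leftward shifts indicate enrichment.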

Conditional false discovery rate
Enrichment seen in the conditional Q-Q plots can be directly interpreted in terms of the FDR. Specifically, for a given p-value cutoff, the Bayes FDR [33], closely related to the q-value of Storey [44], is defined as

$$\mathrm{FDR}(p) = \frac{\pi_0 F_0(p)}{F(p)}, \tag{1}$$

where $\pi_0$ is the proportion of null SNPs, $F_0$ is the null cdf, and $F$ is the cdf of all SNPs, both null and non-null; see Text S1 for details on this simple mixture model formulation [33]. Under the null hypothesis, $F_0$ is the cdf of the uniform distribution on the unit interval [0,1], so that Eq. [1] reduces to

$$\mathrm{FDR}(p) = \frac{\pi_0 \, p}{F(p)}. \tag{2}$$

The cdf $F$ can be estimated by the empirical cdf $q = N_p/N$, where $N_p$ is the number of SNPs with p-values less than or equal to $p$, and $N$ is the total number of SNPs. Replacing $F$ by $q$ and replacing $\pi_0$ with unity in Eq. [2], we get

$$\mathrm{FDR}(p) \approx \frac{p}{q}, \tag{3}$$

which is biased upwards as an estimate of Eq. [2] [33]. If $\pi_0$ is close to one, as is likely true for most GWAS, the increase in bias by setting it to unity in Eq. [3] is minimal. The quantity $1 - p/q$ is therefore biased downward, and hence is a conservative estimate of the TDR = 1 - FDR. Note, Eq. [3] is the Empirical Bayes estimate of the Bayesian FDR described by Efron [33]. Referring to the formulation of the Q-Q plots, we see that Eq. [3] is equivalent to the nominal p-value divided by the empirical quantile, as defined earlier. Given the $-\log_{10}$ construction of the Q-Q plots we easily obtain

$$-\log_{10}(\mathrm{FDR}(p)) \approx \log_{10}(q) - \log_{10}(p), \tag{4}$$

demonstrating that the (conservatively) estimated FDR is directly related to the horizontal shift of the curves in the conditional Q-Q plots from the expected line x = y, with a larger shift corresponding to a smaller FDR. This is illustrated in Figure 1. For each p-value threshold in the associated trait (e.g. bipolar disorder), we calculated the conditional TDR as a function of p-value in the primary trait (e.g. schizophrenia, indicated by different colored curves) in Figure 1 according to Eq. [4].
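A small numerical sketch may help make the estimate concrete: the nominal p-value is divided by the empirical quantile, with the null proportion set to one. The function name `empirical_fdr` is hypothetical, and capping the estimate at 1 is an added convention of this sketch rather than something stated in the text.

```python
import math


def empirical_fdr(p_values, p):
    """Conservative Empirical Bayes FDR estimate p / q, with pi_0 set to 1."""
    n_p = sum(1 for v in p_values if v <= p)  # N_p: SNPs with p-value <= p
    if n_p == 0:
        return 1.0  # no SNPs at this cut-off; report the trivial bound
    q = n_p / len(p_values)  # empirical cdf evaluated at p
    return min(1.0, p / q)  # capped at 1 (assumption of this sketch)


# Toy data: 1000 SNPs, of which 20 have p <= 0.001.
toy = [0.0005] * 20 + [0.5] * 980
fdr = empirical_fdr(toy, 0.001)  # q = 0.02, so the estimate is 0.001 / 0.02
shift = math.log10(0.02) - math.log10(0.001)  # equals -log10(fdr), about 1.3
```

In this toy example 2% of SNPs fall at or below a cut-off where only 0.1% would be expected under the null, so about 5% of them are estimated to be false discoveries.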
Conditional statistics-probability of association with one disorder
We define the conditional FDR as the posterior probability that a given SNP is null for the first phenotype given that the p-values for both phenotypes are as small as or smaller than the observed p-values. Formally, this is given by

$$\mathrm{FDR}(p_1 \mid p_2) = \frac{\pi_0(p_2) \, p_1}{F(p_1 \mid p_2)}, \tag{5}$$

where $p_1$ is the p-value for the first phenotype, $p_2$ is the p-value for the second, and $F(p_1 \mid p_2)$ is the conditional cdf and $\pi_0(p_2)$ the conditional proportion of null SNPs for the first phenotype given that p-values for the second phenotype are $p_2$ or smaller. Eq. [5] makes the assumption, reasonable for independent GWAS, that summary statistics are independent across phenotypes if they are null for at least one phenotype. We produce a conservative estimate of $\mathrm{FDR}(p_1 \mid p_2)$ by setting $\pi_0(p_2) = 1$ and using the empirical conditional cdf in place of $F(p_1 \mid p_2)$ in Eq. [5]. This is a straightforward generalization of the Empirical Bayes approach developed by Efron [33]. We assign a conditional FDR value for schizophrenia given bipolar disorder p-values (denoted by $\mathrm{FDR}_{SCZ \mid BD}$) to each SNP by computing conditional FDR estimates on a grid and interpolating these estimates into a two-dimensional look-up table (Figure S2). All SNPs with conditional FDR $< 0.05$ ($-\log_{10}(\mathrm{FDR}) > 1.3$) in schizophrenia given association with bipolar disorder are listed in Table 1 after 'pruning' (removing all SNPs with $r^2 > 0.2$ based on 1KGP LD structure). We used the same procedure, in the opposite direction, to assign a conditional FDR value (denoted as $\mathrm{FDR}_{BD \mid SCZ}$) for bipolar disorder given schizophrenia p-values to each SNP. All SNPs with FDR $< 0.05$ ($-\log_{10}(\mathrm{FDR}) > 1.3$) in bipolar disorder given schizophrenia are listed in Table 2 after pruning. A significance threshold of FDR $< 0.05$ nominally corresponds to 5 false positives per 100 reported associations. We present a complementary model-based approach to estimating conditional FDR in Text S1.
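The conservative conditional FDR estimate can be sketched as follows. Note that this illustration evaluates the empirical conditional cdf directly at each SNP's own pair of p-values, whereas the procedure described above computes estimates on a grid and interpolates them into a look-up table; the function name `conditional_fdr` and the cap at 1 are assumptions of the sketch.

```python
def conditional_fdr(p1, p2, p_primary, p_secondary):
    """Estimate FDR(p1 | p2) as p1 / F(p1 | p2), with pi_0(p2) set to 1.

    p_primary, p_secondary: per-SNP p-values for the first and second phenotype.
    """
    # Condition on SNPs whose second-phenotype p-value is p2 or smaller.
    subset = [a for a, b in zip(p_primary, p_secondary) if b <= p2]
    if not subset:
        return 1.0
    n_hits = sum(1 for a in subset if a <= p1)
    if n_hits == 0:
        return 1.0
    cond_cdf = n_hits / len(subset)  # empirical conditional cdf F(p1 | p2)
    return min(1.0, p1 / cond_cdf)  # capped at 1 (assumption of this sketch)


# Toy example: small p-values in the first phenotype cluster among SNPs
# that also have small p-values in the second phenotype.
p_scz = [0.001, 0.002, 0.3, 0.6, 0.9, 0.4, 0.7, 0.8, 0.5, 0.2]
p_bd = [0.01, 0.02, 0.03, 0.04, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95]
conditioned = conditional_fdr(0.002, 0.05, p_scz, p_bd)
unconditioned = conditional_fdr(0.002, 1.0, p_scz, p_bd)
```

With enrichment of this kind, the conditioned estimate is smaller than the unconditioned one for the same first-phenotype p-value, which is the gain in power that the method exploits.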

Conjunction statistics-test of association with both phenotypes
In order to identify which of the SNPs were associated with both schizophrenia and bipolar disorder we used a conjunction FDR procedure similar to that described for p-value statistics in Nichols et al. [45]. This minimizes the effect of a single phenotype driving the common association signal. Conjunction FDR is defined as the posterior probability that a given SNP is null for both phenotypes simultaneously when the p-values for both phenotypes are as small as or smaller than the observed p-values. Formally, conjunction FDR is given by

$$\mathrm{FDR}_{SCZ \,\&\, BD}(p_1, p_2) = \frac{\pi_0(p_1, p_2) \, F_0(p_1, p_2)}{F(p_1, p_2)}, \tag{6}$$

where $\pi_0(p_1, p_2)$ is the proportion of SNPs null for both phenotypes simultaneously, $F_0(p_1, p_2) = p_1 p_2$ is the joint null cdf, and $F(p_1, p_2)$ is the joint overall cdf. Conditional empirical cdfs provide a model-free method to obtain conservative estimates of Eq. [6]. This can be seen as follows. Estimate the conjunction FDR by

$$\mathrm{FDR}_{SCZ \,\&\, BD} = \max\left(\mathrm{FDR}_{SCZ \mid BD}, \; \mathrm{FDR}_{BD \mid SCZ}\right), \tag{7}$$

where $\mathrm{FDR}_{SCZ \mid BD}$ and $\mathrm{FDR}_{BD \mid SCZ}$ (the estimated conditional FDRs described above) are conservative (upwardly biased) estimates of Eq. [5]. Thus, Eq. [7] is a conservative estimate of Eq. [6].
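As a minimal sketch of the conjunction step, the code below assumes that the conjunction estimate for a SNP is the larger of its two conditional FDR values, which is consistent with the bidirectional criterion used for Table 3 (both conditional FDRs below 0.05). The function names and the example SNP identifiers are hypothetical.

```python
def conjunction_fdr(fdr_scz_given_bd, fdr_bd_given_scz):
    """Conjunction estimate: the larger of the two conditional FDR values."""
    return max(fdr_scz_given_bd, fdr_bd_given_scz)


def pleiotropic_snps(snp_ids, fdr_scz_given_bd, fdr_bd_given_scz, threshold=0.05):
    """Return (snp_id, conjFDR) pairs with conjFDR below the threshold."""
    selected = []
    for snp, a, b in zip(snp_ids, fdr_scz_given_bd, fdr_bd_given_scz):
        conj = conjunction_fdr(a, b)
        if conj < threshold:
            selected.append((snp, conj))
    return selected


# Hypothetical identifiers and values, for illustration only.
hits = pleiotropic_snps(
    ["snp_a", "snp_b", "snp_c"],
    [0.01, 0.001, 0.20],
    [0.03, 0.30, 0.002],
)
```

Only the first SNP is retained here, because taking the maximum prevents a strong signal in one phenotype from carrying a weak signal in the other.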

Conditional Manhattan plots
To illustrate the localization of the genetic markers associated with schizophrenia given their association with bipolar disorder, and vice versa, we used a 'Conditional Manhattan plot', plotting all SNPs within an LD block in relation to their chromosomal location. As illustrated in Figure 2 for schizophrenia, the large points represent the SNPs with conditional FDR $< 0.05$, whereas the small points represent the non-significant SNPs. All SNPs without 'pruning' (removing all SNPs with $r^2 > 0.2$ based on 1KGP LD structure) are shown. The strongest signal in each LD block is illustrated with a black line around the circles. This was identified by ranking all SNPs in increasing order, based on the conditional FDR value for schizophrenia given bipolar disorder, and then removing SNPs in LD $r^2 > 0.2$ with any higher ranked SNP. Thus, the selected locus was the most significantly associated with schizophrenia in each LD block (Figure 2). A similar procedure was used in the conditional Manhattan plot for bipolar disorder given schizophrenia (Figure 3).
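The ranking and pruning step used to pick the strongest signal per LD block might look like the sketch below. It reads the description literally: a SNP is dropped if it is in LD above the threshold with any better-ranked SNP, whether or not that SNP was itself retained. The function name `prune_by_ld` and the pairwise LD dictionary keyed by `frozenset` pairs are hypothetical data structures, and pairs missing from the dictionary are assumed to have LD of zero.

```python
def prune_by_ld(fdr_by_snp, r2_by_pair, r2_max=0.2):
    """Return lead SNPs after ranking by FDR and pruning on pairwise LD.

    fdr_by_snp: dict mapping SNP id -> conditional (or conjunction) FDR.
    r2_by_pair: dict mapping frozenset({snp1, snp2}) -> LD r^2 (hypothetical
                representation; pairs not listed are treated as r^2 = 0).
    """
    ranked = sorted(fdr_by_snp, key=fdr_by_snp.get)  # smallest FDR first
    lead = []
    for i, snp in enumerate(ranked):
        in_ld_with_better = any(
            r2_by_pair.get(frozenset((snp, other)), 0.0) > r2_max
            for other in ranked[:i]
        )
        if not in_ld_with_better:
            lead.append(snp)
    return lead


# Hypothetical identifiers: the second SNP tags the same block as the first.
leads = prune_by_ld(
    {"snp_a": 0.001, "snp_b": 0.01, "snp_c": 0.02},
    {frozenset(("snp_a", "snp_b")): 0.8},
)
```

This pairwise scan is quadratic in the number of SNPs, so a genome-wide analysis would in practice restrict comparisons to SNPs within a window on the same chromosome.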

Conjunction Manhattan plots
To illustrate the localization of the pleiotropic genetic markers associated with both schizophrenia and bipolar disorder, we present a 'Conjunction Manhattan plot', plotting all SNPs with a significant conjunction FDR within an LD block in relation to their chromosomal location. As illustrated in Figure 4, the large points represent the significant SNPs (FDR $< 0.05$), whereas the small points represent the non-significant SNPs. All SNPs without 'pruning' (removing all SNPs with $r^2 > 0.2$ based on 1KGP LD structure) are shown, and the strongest signal in each LD block is illustrated with a black line around the circles. We ranked all SNPs based on the conjunction statistic and removed SNPs in LD $r^2 > 0.2$ with any higher ranked SNP.

Model-based power analyses
While model-free approaches avoid assumptions that may bias results, it is necessary to take a model-based approach for assessing the power to detect non-null SNPs [33]. As in Eq. [1], let $\pi_0$ be the proportion of null SNPs and let $\pi_1 = 1 - \pi_0$ be the proportion of non-null SNPs. Following Yang et al. [34], the probability density $f(z_i)$ of the test statistic (z score) for the ith SNP is given by

$$f(z_i) = \pi_0 f_0(z_i) + \pi_1 f_1(z_i), \tag{8}$$

where the null density $f_0(z_i)$ corresponds to a $N(0, \sigma_0^2)$ distribution and the non-null density $f_1(z_i)$ corresponds to a $N(0, \sigma_0^2 + \sigma_1^2)$ distribution. Both $\sigma_0^2$ and $\sigma_1^2$ are estimated from the data (see Text S1). The local false discovery rate, defined as the posterior probability that a SNP is null given the observed z score, is given by Efron and Tibshirani [35]

$$\mathrm{fdr}(z_i) = \frac{\pi_0 f_0(z_i)}{f(z_i)}. \tag{9}$$

Using this mixture of Gaussians formulation, we can assess relative power for gene discovery by determining the proportion of non-null SNPs with local fdr less than a given cut-off, e.g., 0.05. We can also determine the impact of scaling the effective sample size on the distribution $f_1(z_i)$ of non-null SNPs. We extend this model to a bivariate framework by postulating a four groups model of bivariate Gaussians. Let $z_i$ be the bivariate z scores for the ith SNP for schizophrenia and bipolar disorder. The four groups mixture model is given by

$$f(z_i) = \pi_0 f_0(z_i) + \pi_1 f_1(z_i) + \pi_2 f_2(z_i) + \pi_3 f_3(z_i), \tag{10}$$

where $\pi_0$ is the proportion of SNPs which are null for both phenotypes, $\pi_1$ and $\pi_2$ are the proportions of SNPs which are non-null for schizophrenia and null for bipolar disorder (and vice versa), and $\pi_3$ is the proportion of SNPs non-null for both simultaneously.
The component densities $f_0$, $f_1$, $f_2$, and $f_3$ are bivariate Gaussian with zero mean and variance-covariance matrices estimated from the data. From the four groups mixture model, we can compute conditional local fdr, similar to the conditional FDR described above. We can also determine the degree of pleiotropy from the estimated value of $\pi_3$. Details of the methods for mixture models, local false discovery rate, and estimates of polygenicity, the degree of pleiotropic overlap, and power are presented in Text S1 and Figures S4, S5, S6, S7.
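For the univariate two-group Gaussian mixture described in this section, the local fdr and the proportion of non-null SNPs falling below a local fdr cut-off can be sketched as follows. The parameter values in the example are invented for illustration and are not estimates from the study; the function names, the simple grid scan for the cut-point, and its step size are all assumptions of this sketch, and the fitting of the mixture parameters (Text S1) is not shown.

```python
from statistics import NormalDist


def local_fdr(z, pi0, sigma0, sigma1):
    """Posterior probability that a SNP with z-score z is null."""
    f0 = NormalDist(0.0, sigma0).pdf(z)  # null density N(0, sigma0^2)
    f1 = NormalDist(0.0, (sigma0 ** 2 + sigma1 ** 2) ** 0.5).pdf(z)  # non-null density
    return pi0 * f0 / (pi0 * f0 + (1.0 - pi0) * f1)


def proportion_detectable(pi0, sigma0, sigma1, cutoff=0.05, z_max=20.0, step=0.001):
    """Fraction of non-null SNPs whose |z| gives local fdr below the cutoff."""
    # Local fdr decreases with |z| because the non-null density is wider,
    # so a simple scan finds the cut-point (assumption: grid resolution `step`).
    z = 0.0
    while z <= z_max and local_fdr(z, pi0, sigma0, sigma1) >= cutoff:
        z += step
    if z > z_max:
        return 0.0
    nonnull = NormalDist(0.0, (sigma0 ** 2 + sigma1 ** 2) ** 0.5)
    return 2.0 * (1.0 - nonnull.cdf(z))  # two-sided tail mass beyond the cut-point


# Invented parameters for illustration: 95% null SNPs, sigma0 = 1, sigma1 = 2.
power = proportion_detectable(0.95, 1.0, 2.0)
```

Re-running `proportion_detectable` with a larger `sigma1` gives a crude feel for how a wider non-null distribution, as would be expected with a larger effective sample size, increases the share of non-null SNPs that pass the cut-off.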