Gene-Based Analysis of Regionally Enriched Cortical Genes in GWAS Data Sets of Cognitive Traits and Psychiatric Disorders

Background. Despite its estimated high heritability, the genetic architecture leading to differences in cognitive performance remains poorly understood. Different cortical regions play important roles in normal cognitive functioning and impairment. Recently, we reported on sets of regionally enriched genes in three different cortical areas (frontomedial, temporal and occipital cortices) of the adult rat brain. It has been suggested that genes preferentially, or specifically, expressed in one region or organ reflect functional specialisation. Employing a gene-based approach to the analysis, we used the regionally enriched cortical genes to mine a genome-wide association study (GWAS) of the Norwegian Cognitive NeuroGenetics (NCNG) sample of healthy adults for association to nine psychometric test measures. In addition, we explored GWAS data sets for the serious psychiatric disorders schizophrenia (SCZ) (n = 3 samples) and bipolar affective disorder (BP) (n = 3 samples), to which cognitive impairment is linked.
Principal Findings. At the single gene level, the temporal cortex enriched gene RAR-related orphan receptor B (RORB) showed the strongest overall association, namely to a test of verbal intelligence (Vocabulary, P = 7.7E-04). We also applied gene set enrichment analysis (GSEA) to test the candidate genes, as gene sets, for enrichment of association signal in the NCNG GWAS and in GWASs of BP and of SCZ. We found that genes differentially expressed in the temporal cortex showed a significant enrichment of association signal in a test measure of non-verbal intelligence (Reasoning) in the NCNG sample.
Conclusion. Our gene-based approach suggests that RORB could be involved in verbal intelligence differences, while the genes enriched in the temporal cortex might be important to intellectual functions as measured by a test of reasoning in the healthy population. These findings warrant replication in independent samples on cognitive traits.


Introduction
Cognitive abilities (e.g. intelligence, memory, attention and speed of processing) vary to a great extent in the population, considerably affecting the life outcome of individuals. Despite being highly heritable, with estimates ranging from 30-80%, little is known about the genetic mechanisms involved in cognitive functioning (reviewed in [1]). It is, however, widely accepted that a polygenic mechanism underlies the differences in cognition, each genetic factor having a very small effect size (reviewed in [1,2]). A recent genome-wide association study (GWAS) showed, for the first time, that common genetic variants account for ~40-50% of the variation in human intelligence [3]. However, despite an extensive search by linkage and association studies, only a limited number of genes has so far been implicated in normal cognitive functioning (e.g. ALDH5A1, APOE, COMT, BDNF, DCLK1) [1,4-9].
Cognitive dysfunction is one of the main clinical problems observed in patients suffering from major psychiatric disorders, such as schizophrenia (SCZ) and bipolar affective disorder (BP). High heritability has been estimated for both SCZ and BP [10][11][12], and common alleles of small effect are thought to increase susceptibility to these complex disorders. However, for SCZ, some rare variants (e.g. copy number variations) have also been linked to disease susceptibility [13,14]. Although great efforts have been made over the last decades to identify genetic factors causing susceptibility to SCZ and BP, surprisingly few genes have so far been implicated [15][16][17][18][19][20][21][22]. By considering cognition as an intermediate biological phenotype (endophenotype) for major psychiatric illnesses, one might come closer to identifying causative genetic factors. An overlap in genetic factors linked to both cognition and psychiatric disorders has already been observed (e.g. ZNF804A and DISC1) [18,23], which supports the validity of testing the same genes in both normal cognitive function and in psychiatric illnesses.
Several areas of the brain, in particular different cortical regions, play important roles in normal cognitive functioning and impairment, as well as in psychiatric disease. A network consisting of areas in the dorsolateral prefrontal, parietal, anterior cingulate, temporal and occipital cortices (parieto-frontal integration theory) has been associated with differences in intellectual function [24]. The prefrontal cortex is particularly important for working memory, attention and planning, and structural and functional changes in this region have been linked to psychiatric disorders. Regions within the temporal and occipital lobes have also been implicated in cognitive abilities and psychiatric disorders, as these regions are critical for early auditory and visual sensory information processing and interpretation. In general, reduction of cortical thickness has been observed in patients suffering from SCZ and BP, particularly in the frontal and temporal lobes [25], while total brain volume (gray and white matter) and cortical thickness have been correlated to measures of intelligence [26,27].
Previously, we examined the global gene expression in the frontomedial (FMCx), temporal (TCx) and occipital (OCx) cortices from the normal adult rat brain, and identified distinct sets of regionally enriched cortical genes [28,29]. While the overall gene expression in the different cortical areas was highly similar, 65 genes showed marked regional enrichment (30, 24 and 11 genes in the FMCx, TCx and OCx, respectively). Based upon the assumption that genes highly or specifically expressed within a certain region or organ are likely to reflect its functional specialisation [28,30,31], and considering the implications of different areas of the cortex in human cognition and psychiatric disorders, we hypothesised that these enriched genes might serve as candidates for individual differences in cognitive function and for psychiatric disorders.
In this study, we used the regionally enriched cortical genes as candidates to mine existing GWASs of relevant cognitive traits and of SCZ and BP, taking a gene-based approach. First, we applied a novel tool, LDsnpR (Christoforou et al., under revision), to assign single nucleotide polymorphism (SNP) marker information from the GWAS data to the corresponding genes, and then to score the genes. Applying this gene-based approach, we tested the association of regionally enriched cortical genes to normal cognitive functioning using a GWAS recently conducted by our group (Christoforou et al., unpublished data). Next, we analysed these genes, as gene sets, using gene set enrichment analysis (GSEA) [32] to search for enrichment of association signal in the aforementioned GWAS of cognition and in GWASs of psychiatric disorders (SCZ and BP).

Candidate genes
Selection of candidate genes. Recently, we described sets of genes that show differential expression in three different cortical regions in the adult rat brain (FMCx, TCx and OCx) [29], based on global gene expression analysis of several brain regions (three cortical regions, as well as hippocampus, striatum and cerebellum) and three non-CNS samples (liver, kidney and spleen) [28]. Sixty-five genes were found to display enriched expression in certain cortical regions (30, 24 and 11 genes in the FMCx, TCx and OCx, respectively) [29]. The Ensembl Genome Browser (release 54) was searched to identify the Ensembl ID for the human homologues of the rat genes (http://may2009.archive.ensembl.org/) [33]. Three genes were not represented in Ensembl release 54 (i.e. two unassigned Celera genes, rCG46329 and rCG41008, and Clec2l), resulting in 62 genes eligible for the subsequent gene-based analysis in cognition and psychiatric disorders (Tables 1-3).
Expression and functional characterisation of candidate genes. The expression pattern of the human homologues of the rat genes was analysed in the Allen Human Cortex Study (Whole Brain Microarray Survey) from The Allen Institute for Brain Science [34] (http://humancortex.alleninstitute.org). Functional characterisation of the human homologous genes was performed using the Panther Classification System version 7 (http://www.pantherdb.org/) [35,36], as previously described [28]. One gene was not represented in Panther (i.e. HTR5B).

GWAS datasets
GWAS of cognition in the Norwegian Cognitive NeuroGenetics sample. The Norwegian Cognitive NeuroGenetics (NCNG) sample consists of 670 healthy adult individuals of Norwegian origin (214 males, 456 females), extensively tested for cognitive abilities. The participants were between 18 and 79 years of age (mean: 47.6), and were recruited through advertisements in local newspapers to participate at the University of Bergen (n = 171) and Oslo (n = 499) areas. In this study we focused on nine different tests, covering four major cognitive functions, namely: intellectual function (the Vocabulary and Matrix Reasoning subtests from the Wechsler Abbreviated Scale of Intelligence, and the estimated Full-Scale Intelligence Quotient (FSIQ) [37]), memory (the total number of words learned across five trials (CVLT-L) and the delayed free recall score (CVLT-DR) from the California Verbal Learning Test [38]), executive attention (the third condition from the D-KEFS Color-Word Interference Test (Stroop3) [39]) and attention (Cued Discrimination Task: CDT-Valid, CDT-Invalid and CDT-Neutral [40]) (Table S1). Correlation estimates between the psychometric tests are listed in Table S2. The individuals were genotyped using the Illumina platform (Human610-Quad), and after quality control, 554,225 SNPs were incorporated into the analysis. Further details on the sample, genotyping and quality control can be found in Davies et al. 2011 [3].
GWAS of non-psychiatric phenotypes. As a control for the specificity of our analyses on cognitive traits and psychiatric illnesses, we also analysed non-psychiatric phenotypes. We performed GSEA on the non-psychiatric GWAS data sets from the WTCCC: Crohn's disease (1,748 cases), coronary heart disease (1,926 cases), hypertension (1,952 cases), rheumatoid arthritis (1,860 cases), type 1 diabetes (1,963 cases) and type 2 diabetes (1,924 cases). The GWAS data sets from the WTCCC included 2,938 healthy controls common for the six disorders. The individuals were genotyped using Affymetrix GC500K [42].

SNP to gene assignment using LDsnpR
In order to analyse the GWAS data on cognition, psychiatric disorders and non-psychiatric phenotypes at the gene level, we implemented a novel linkage disequilibrium (LD)-based SNP-to-gene assignment tool, LDsnpR. This tool assigns SNP marker information and P-values from GWAS data sets to individual genes based both on the chromosomal position of the SNP and on the LD profile of the SNP (positional and LD-based binning, respectively). Thus, a SNP is assigned, or binned, to a gene if it is physically located within the pre-defined boundaries of the gene, or if it is in LD with another SNP (genotyped or not) that is physically located within these boundaries. Gene bin definitions were based on Human Ensembl release 54 (May 2009) and were further extended 10 kb on either side to best capture potential regulatory regions. The LD data were taken from the CEU (CEPH (Utah residents with ancestry from northern and western Europe)) sample from HapMap Phase II release 27. The pairwise LD threshold was set at r² ≥ 0.8.
Table 1. The 29 frontomedial enriched cortical genes [29] were used as candidates to search for association to nine test measures of cognitive functions [37][38][39][40], at the single gene- and gene set-based level. The HUGO Gene Nomenclature Committee (HGNC) symbol, Ensembl Genome Browser (release 54) identification [33] and gene description are shown.
Table 3. The 11 occipital cortex enriched genes [29] were used as candidates to search for association to nine test measures of cognitive functions [37][38][39][40], at the single gene- and gene set-based level. The HUGO Gene Nomenclature Committee (HGNC) symbol, Ensembl Genome Browser (release 54) identification [33] and gene description are shown. doi:10.1371/journal.pone.0031687.t003
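The binning rule described in this section can be illustrated with a minimal Python sketch. It is not the LDsnpR implementation: the `Gene` class, the `bin_snps` function and the input layouts are hypothetical names chosen for illustration, and only the 10 kb flank and the r² ≥ 0.8 threshold are taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    start: int  # gene start (bp)
    end: int    # gene end (bp)


def bin_snps(gene, snp_pos, ld_pairs, flank=10_000, r2_min=0.8):
    """Return the sorted SNP ids binned to `gene` (illustrative only).

    snp_pos  : dict mapping SNP id -> (chromosome, position)
    ld_pairs : iterable of (snp_a, snp_b, r2) pairwise LD records
    """
    lo, hi = gene.start - flank, gene.end + flank
    # Positional binning: SNPs inside the gene boundaries plus flanks.
    positional = {
        snp for snp, (chrom, pos) in snp_pos.items()
        if chrom == gene.chrom and lo <= pos <= hi
    }
    binned = set(positional)
    # LD-based binning: SNPs in strong LD with a positionally binned SNP.
    for a, b, r2 in ld_pairs:
        if r2 >= r2_min:
            if a in positional:
                binned.add(b)
            if b in positional:
                binned.add(a)
    return sorted(binned)


# Toy data (hypothetical identifiers and coordinates).
toy_gene = Gene("GENE1", "1", 100_000, 200_000)
toy_positions = {
    "rs1": ("1", 150_000),  # inside the gene
    "rs2": ("1", 205_000),  # inside the 10 kb flank
    "rs3": ("1", 300_000),  # outside, but in strong LD with rs1
    "rs4": ("1", 400_000),  # outside, weak LD with rs1
    "rs5": ("2", 150_000),  # different chromosome
}
toy_ld = [("rs1", "rs3", 0.9), ("rs1", "rs4", 0.5)]
toy_bin = bin_snps(toy_gene, toy_positions, toy_ld)
```

In a real analysis the resulting bin would then be restricted to SNPs that carry a GWAS P-value, since an LD partner inside the gene need not itself be genotyped.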

Gene scoring
The genes were scored with the minimum P-value observed among all the SNPs within each "gene bin", adjusted for the number of SNPs assigned to each gene with a modified version of Sidak's correction [45], as implemented in LDsnpR. This method has been shown to perform as well as a powerful regression-based method in correcting for the bias due to SNP number [46]. Furthermore, we performed PLINK's permutation-based set method [47] on an in-house data set and demonstrated a high correlation between the modified Sidak's corrected P-values and the permutation-based P-values (r² > 0.95, data not shown).
Results from gene- and gene set-based analysis, using raw minimum P-values (unadjusted for SNP number), are provided in Tables S3, S4, and S5.
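As a rough illustration of minimum-P gene scoring, the sketch below applies the standard Sidak adjustment to the smallest SNP P-value in a gene bin. The modified correction of [45] used by LDsnpR differs from this textbook form and is not reproduced here; the function name `sidak_gene_score` is hypothetical.

```python
def sidak_gene_score(snp_pvalues):
    """Standard Sidak-adjusted minimum P-value for one gene bin.

    Assumption: plain Sidak, 1 - (1 - p_min)^n, with n the number of
    SNPs in the bin. This is NOT the modified version cited as [45].
    """
    n = len(snp_pvalues)
    if n == 0:
        raise ValueError("gene bin contains no SNPs")
    p_min = min(snp_pvalues)
    return 1.0 - (1.0 - p_min) ** n


# A gene with one SNP keeps its P-value; more SNPs inflate the score.
single_snp_score = sidak_gene_score([0.01])
two_snp_score = sidak_gene_score([0.01, 0.5])
```

The adjustment penalises genes with many assigned SNPs, which would otherwise reach small minimum P-values by chance more often than genes with few SNPs.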

Gene Set Enrichment Analysis
The 62 FMCx, TCx or OCx genes were analysed as gene sets for enrichment of association signal in the GWAS data sets on cognition, psychiatric and non-psychiatric phenotypes, using GSEA [32]. As described above, the GWAS SNPs were assigned to "gene bins" and scored using the modified Sidak's P-values. The genes were organised into ranked lists, upon which the gene sets were queried.
The candidate genes were treated as four separate gene sets. Gene set 1: all cortex region enriched genes (FMCx, TCx and OCx, n = 62); Gene set 2: FMCx enriched genes (n = 29); Gene set 3: TCx enriched genes (n = 22); and Gene set 4: OCx enriched genes (n = 11) (Tables 1-3). The GSEA 2.0 programme (http://www.broadinstitute.org/gsea/index.jsp) [32] was used to analyse the distribution of the candidate genes in the pre-ranked lists of genes from the different GWAS data sets. The gene sets were analysed in the ranked files, using weighted enrichment statistics (p = 1) and 1,500 permutations. The analysis was repeated three times to ensure consistency of results, and the false discovery rate (FDR) q-values were extracted for each trait/GWAS. See Figure 1 for a schematic overview of the different steps in the procedure.
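The weighted enrichment statistic can be sketched as a running-sum score over a pre-ranked gene list. The code below is a simplified illustration, not the GSEA 2.0 programme: it omits the permutation step, normalisation and FDR estimation, and the ranking scores in the toy data are hypothetical (in practice they could be, for example, a transformation of the gene-level P-values).

```python
def enrichment_score(ranked, gene_set, weight=1.0):
    """Running-sum enrichment score for a pre-ranked gene list.

    ranked   : list of (gene, score) tuples, sorted by descending score
    gene_set : set of gene names to test
    weight   : exponent on |score| for hits (weight=1 is the weighted form)
    """
    n_hit = sum(1 for gene, _ in ranked if gene in gene_set)
    n_miss = len(ranked) - n_hit
    hit_total = sum(abs(s) ** weight for g, s in ranked if g in gene_set)
    if n_hit == 0 or n_miss == 0 or hit_total == 0:
        raise ValueError("gene set must partly overlap the ranked list")
    running, best = 0.0, 0.0
    for gene, score in ranked:
        if gene in gene_set:
            running += abs(score) ** weight / hit_total  # step up on a hit
        else:
            running -= 1.0 / n_miss                      # step down on a miss
        if abs(running) > abs(best):
            best = running  # keep the maximum deviation from zero
    return best


# Toy ranked list (hypothetical genes and scores).
toy_ranked = [("A", 3.0), ("B", 2.0), ("C", 1.0), ("D", 0.5)]
es_top = enrichment_score(toy_ranked, {"A", "B"})
es_bottom = enrichment_score(toy_ranked, {"C", "D"})
```

A gene set concentrated at the top of the ranking yields a positive score, one concentrated at the bottom a negative score. Significance would then be judged against scores obtained from permuted data.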

Assessment of significance threshold
Gene scores and multiple-testing correction. All reported gene-based P-values are uncorrected for the multiple psychometric traits and genes tested. Multiple-testing correction in such a study is not straightforward, particularly due to the correlated nature of the tests performed and the increased prior evidence supporting the relevance of these tests. However, a threshold corrected for these tests was determined as follows. Nine psychometric traits were tested in the NCNG sample. These traits are highly correlated, as shown in Table S2. Matrix Spectral Decomposition (matSpD; http://gump.qimr.edu.au/general/daleN/matSpD/) was applied to determine the equivalent number of independent traits tested, using the pairwise correlations between the traits [48][49][50][51]. The effective number of independent traits (VeffLi) was estimated to be six, resulting in a Sidak-corrected threshold of 0.0085 required to keep the type 1 error rate at 5%. We further adjusted this threshold conservatively to account for the 62 genes tested, resulting in an experiment-wide threshold of 0.00014.
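The two thresholds quoted above can be checked with a few lines of arithmetic. The sketch assumes that the adjustment for the 62 genes is a simple division of the trait-level threshold (a Bonferroni-style step); the text only states that the adjustment was conservative, so this is one plausible reading rather than the documented procedure.

```python
def sidak_threshold(alpha, n_tests):
    """Per-test significance threshold under Sidak's correction."""
    return 1.0 - (1.0 - alpha) ** (1.0 / n_tests)


ALPHA = 0.05        # family-wise type 1 error rate
N_EFF_TRAITS = 6    # effective number of independent traits (from matSpD)
N_GENES = 62        # candidate genes tested

trait_threshold = sidak_threshold(ALPHA, N_EFF_TRAITS)
# Assumed Bonferroni-style adjustment for the number of genes.
experiment_threshold = trait_threshold / N_GENES

print(f"trait-level threshold:     {trait_threshold:.4f}")
print(f"experiment-wide threshold: {experiment_threshold:.5f}")
```

Both values agree with the thresholds reported in the text after rounding.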

GSEA
We employed three approaches to assess the validity and significance of our findings. First, we ran the same analysis on the GWASs of the six non-psychiatric phenotypes in the WTCCC [42] for comparison. Second, in addition to the cortical gene sets, we included a gene set consisting of various "housekeeping genes", testing it across all cognitive, psychiatric and non-psychiatric phenotypes (TaqMan endogenous controls from Applied Biosystems and a set of genes from Warrington et al. [52]; Gene set 5: Housekeeping genes, n = 36, Table S6). Finally, for the significant gene sets, we ran the GSEA on 100 random gene sets. The random gene sets were generated using a pseudorandom number generator, randomly selecting genes from the Ensembl 54 definition. They were designed to mimic the significant gene sets, both with respect to the number of genes and the number of GWAS SNPs assigned (i.e. by LDsnpR) to the genes making up the gene set.
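The construction of matched random gene sets can be sketched as follows. This is an assumption-laden illustration: it matches each target gene to a random gene with exactly the same number of assigned SNPs, whereas the actual matching tolerance and sampling procedure are not specified in the text. The function name and input layout are hypothetical.

```python
import random


def matched_random_sets(target_snp_counts, genome_snp_counts, n_sets=100, seed=0):
    """Draw random gene sets matched to a target set on SNP number.

    target_snp_counts : list with the SNP count of each target gene
    genome_snp_counts : dict mapping gene name -> number of assigned SNPs
    Assumption: exact matching on SNP count, sampling without
    replacement within each random set.
    """
    rng = random.Random(seed)
    by_count = {}
    for gene, n_snps in genome_snp_counts.items():
        by_count.setdefault(n_snps, []).append(gene)
    for genes in by_count.values():
        genes.sort()  # deterministic order for a given seed
    random_sets = []
    for _ in range(n_sets):
        chosen = set()
        for n_snps in target_snp_counts:
            pool = [g for g in by_count.get(n_snps, []) if g not in chosen]
            if not pool:
                raise ValueError(f"no unused gene with {n_snps} SNPs")
            chosen.add(rng.choice(pool))
        random_sets.append(sorted(chosen))
    return random_sets


# Toy genome (hypothetical gene names and SNP counts).
toy_genome = {"G1": 3, "G2": 3, "G3": 5, "G4": 5, "G5": 7}
toy_sets = matched_random_sets([3, 5], toy_genome, n_sets=10, seed=1)
```

Each random set would then be run through the same enrichment analysis as the candidate set, giving a reference distribution of q-values for sets of identical size and SNP coverage.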

Regionally enriched cortical genes show association to cognitive abilities
Based on our initial study of regional enrichment of genes in different parts of the rat neocortex, 62 genes were selected as candidates (Tables 1-3) to search for association to nine different neurocognitive traits in the NCNG GWAS data set, covering four major cognitive domains: intellectual function, memory, executive attention and attention (Table S1). We took a candidate gene-based approach to the analysis, using a novel tool, LDsnpR, to assign SNPs to single genes based on chromosomal position and LD. LDsnpR was further used to score the genes, using the minimum P-value approach, adjusted for the number of SNPs in the gene "bins" with a modified Sidak's correction [45].
Several of the candidate genes displayed significant association to test measures of cognitive functions at the nominal, uncorrected significance level of 0.05 (Tables 4-6, Table S7a-c), but none at the experiment-wide threshold of 0.00014. The overall strongest association in the analysis was observed between the TCx enriched gene RAR-related orphan receptor B (RORB) and the measure of verbal intelligence (Vocabulary, modified Sidak's P = 7.7E-04). In addition, the FMCx enriched gene Huntingtin-associated protein 1 (HAP1) displayed strong association to the measure of verbal intelligence (Vocabulary, modified Sidak's P = 8.9E-04) and nominal association to the full-scale measure of intellectual function (FSIQ, modified Sidak's P = 0.033). We also observed that three of the candidate genes showed nominal association to all the tests of attention (i.e. Complement component 1, q subcomponent-like 3 (C1QL3), Hypocretin (orexin) receptor 1 (HCRTR1) and Calcium binding protein 1 (CABP1)).
Table 4. Gene-based analysis of frontomedial cortex enriched genes for association to cognitive abilities. Columns: HGNC symbol, SNPs, intellectual function, memory, executive attention, attention. The frontomedial cortex enriched genes (n = 29) were analysed for allelic association to nine test measures from the NCNG GWAS.
Table 5. The temporal cortex enriched genes (n = 22) were analysed for allelic association to nine test measures from the NCNG GWAS. For trait abbreviations see Table 4. Modified Sidak's minimum P-value for each candidate gene was extracted [45].
Table 6. Gene-based analysis of occipital cortex enriched genes for association to cognitive abilities. The occipital cortex enriched genes (n = 11) were analysed for allelic association to nine test measures from the NCNG GWAS. For trait abbreviations see Table 4. Modified Sidak's minimum P-value for each candidate gene was extracted [45].

Genes with preferential expression in the temporal cortex show enrichment of association signal to the Reasoning performance in GSEA
Next, we performed GSEA to test the candidate genes for enrichment of association signal in test measures of cognitive functions. GSEA was originally developed to analyse the distribution of genes identified from microarray experiments, but has recently been implemented in the analysis of GWAS [32,53].
We divided the candidate genes into gene sets based on their observed regional differences in expression (one set including all the differentially expressed cortical genes regardless of region and three gene sets composed of the genes enriched in the FMCx, TCx or OCx). In addition, we included a gene set comprising various "housekeeping" genes (from the Applied Biosystems list of TaqMan endogenous controls and from Warrington et al. [52]). In order to test whether the candidate gene sets would show an overall enrichment for association to the nine cognitive test scores (Table S1), we used the "gene bins" and their assigned modified Sidak's P-values generated by LDsnpR as described above (see Methods section for details and Figure 1).
We found that the TCx gene set showed significant enrichment of association signal to a test measure of non-verbal intelligence (Reasoning, FDR q-value = 0.06, cut-off FDR q-value set to 0.1, Table 7, Figure S1). The gene set comprising "housekeeping" genes, used as a control for the specificity of our analysis, did not show significant enrichment to any of the neurocognitive tests. Furthermore, in order to validate the observed enrichment of association signal of the TCx genes (n = 22) in the test measure of non-verbal intelligence, 100 random gene sets were generated. Each of the 100 random gene sets comprised 22 arbitrary genes, each with the same number of assigned SNPs as the corresponding gene in the TCx gene set (see Methods for further details). Each random gene set was analysed using GSEA in the Reasoning GWAS, employing the same analysis statistics as applied for the TCx gene set. None of the random gene sets displayed significant enrichment of association signal (FDR q-values ranging from 0.52 to 1.0; for FDR q-value details see Table S8). This finding supports the robustness of the enrichment of association signal observed for the set of TCx genes to the test of non-verbal intelligence (Reasoning).
We also observed an enrichment of association signal for the gene set comprising genes differentially expressed in the OCx and a test measure of attention (CDT-Invalid, FDR q-value = 0.04, Table 7, Figure S1). Again, none of the random gene sets mimicking the OCx gene set showed enrichment of association signal in the CDT-Invalid GWAS (FDR q-values ranging from 0.14 to 1.0; for FDR q-value details see Table S8), suggesting a role for genes expressed in the OCx in performance of an attention task.

GSEA of genes differentially expressed in the frontomedial, temporal and occipital cortex in GWAS data of psychiatric disorders and non-psychiatric phenotypes
Since cognitive impairment constitutes a major endophenotype in patients suffering from SCZ and BP, and several cortical regions have been linked to disease susceptibility, we analysed the same gene sets by GSEA in three BP GWASs (the Norwegian TOP BP study, the British WTCCC BP and a German BP sample) and three SCZ GWASs (the Norwegian TOP SCZ study, a German SCZ sample and a Danish SCZ sample). In addition, we analysed six non-psychiatric phenotype GWASs from the WTCCC as controls (coronary heart disease, Crohn's Disease, hypertension, rheumatoid arthritis, type 1 diabetes and type 2 diabetes).
We found that the OCx gene set displayed enrichment of association signal to the Danish SCZ sample (FDR q-value = 0.04, cut-off FDR q-value 0.1, Table 8, Figure S1). None of the cortical gene sets were enriched in the two other SCZ GWASs or in the three BP GWASs. When analysing the gene sets in the six non-psychiatric phenotype GWASs, no enrichment of association signal was observed (FDR q-value > 0.1).
In this analysis, we also included a gene set consisting of "housekeeping" genes as a control for the specificity of our analysis. We did not observe any enrichment of association signal for this gene set in any of the psychiatric disorder or non-psychiatric phenotype GWASs analysed (FDR q-value > 0.1). As a second control, GSEA was performed in the Danish SCZ GWAS using 100 random gene sets, consisting of 11 arbitrary genes (as previously described for the TCx and OCx gene sets in the test measures of reasoning and attention, respectively). None of the random gene sets displayed significant FDR q-values (FDR q-values ranging from 0.68 to 1.0; for FDR q-value details see Table S8). These findings support that the enrichment of association signal observed between the OCx gene set and the Danish SCZ GWAS was due to the genes contained in the OCx gene set, and not a result of unspecific association signals.
Table 7. The differentially expressed cortical genes were analysed as gene sets for enrichment of association signal in nine traits from the NCNG GWAS data [37][38][39][40], using GSEA [32]. Five gene sets were analysed; Gene set 1: combined list of all differentially expressed cortical genes, n = 62; Gene set 2: FMCx genes, n = 29; Gene set 3: TCx genes, n = 22; Gene set 4: OCx genes, n = 11; and Gene set 5: "housekeeping" genes, n = 36 (control gene set, Table S6). The analysis was based on extraction of modified Sidak's minimum P-values [45], as implemented in LDsnpR. FDR q-value < 0.1 was set as cut-off value for significant enrichment. For trait abbreviations see Table 4. doi:10.1371/journal.pone.0031687.t007

Functional annotation and gene expression patterns of the regionally enriched cortex genes in human
The candidate genes analysed in this study were previously predicted to have a significant over-representation for particular biological processes and molecular functions in the rat, such as signal transduction, developmental processes and receptor activity [29]. In order to examine whether the candidate genes shared similar functional annotations in human, we mapped the entire set of regionally enriched genes, and in addition the gene sets composed of differentially expressed genes in the FMCx, TCx or OCx individually, to the Panther annotation categories. By comparing the distribution of the candidate genes to the human reference gene set provided (19,911 genes), we searched for significant over-representations of particular biological processes and molecular functions. Overall, the candidate genes were linked to cellular, developmental and neurological system processes (Figure 2). Furthermore, the candidate genes were found to be involved in receptor activity, primarily in cation transmembrane transporter activity and ion channel activity (Figure 2). Notably, the TCx gene set showed the strongest over-representation for most of the biological processes, and especially the molecular function annotations, as compared to the FMCx and OCx gene sets.
We next analysed the expression pattern of a sub-set of the human homologues of the regionally enriched rat genes in the Allen Human Cortex Study (i.e. selected genes showing significant association in the NCNG sample). Although no quantitative differential gene expression could be detected, the homologous genes were expressed in corresponding regions in the human brain (e.g. FMCx, TCx or OCx enriched genes were expressed in the frontal, temporal or occipital lobe, respectively) (Figure S2A-C).
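The over-representation comparison against the reference gene list can be illustrated with an exact binomial tail probability. This is a sketch under assumptions: it treats the test as binomial, which may differ in detail from the procedure Panther applies, and it uses the figures from the Figure 2 legend (59% of 61 mapped genes, i.e. about 36 genes, against an expected 31% for "cellular process").

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): the chance of observing at
    least k category members among n genes if each gene falls in the
    category with the reference-list probability p."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


# Assumed counts derived from the percentages in the Figure 2 legend.
N_MAPPED_GENES = 61
N_IN_CATEGORY = 36        # about 59% of 61
REFERENCE_FRACTION = 0.31  # share of the reference list in the category

p_over = binomial_upper_tail(N_IN_CATEGORY, N_MAPPED_GENES, REFERENCE_FRACTION)
print(f"over-representation P-value (binomial sketch): {p_over:.2e}")
```

A small tail probability indicates that the category is more frequent among the candidate genes than expected from the reference list.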

Gene-based analysis of regionally enriched cortical genes for association to cognition
At the global level, the gene expression in different cortical regions is surprisingly similar, although highly specific functions are attributed to distinct cortical regions. Genes displaying differential expression in cortical regions might play an important role for the specialised normal function attributed to certain areas [29]. In this study, we used a novel set of differentially expressed cortical genes, identified from microarray gene expression profiling in the adult rat brain, to search for association at the single gene level to neurocognitive traits in human. In addition, we used a gene set-based approach to search for enrichment of association signal to cognitive traits and psychiatric disorders.
By mining GWAS data from a sample of healthy adults characterised by nine psychometric tests of cognitive function (the NCNG sample), and scoring the genes using LDsnpR, we found strong association between the TCx enriched gene RORB and a test of verbal intelligence (Vocabulary). This circadian clock gene has not previously been associated to cognitive abilities, but it is worth noting that the gene was recently ranked as one of the top candidate genes for susceptibility to BP in a large meta-analysis, and in a pediatric cohort of individuals suffering from BP [54,55]. In the developing and adult rat brain, the gene is expressed in several regions associated with processing of sensory information, and behavioural changes (i.e. reduced anxiety and learned helplessness-related behaviour) have been observed in Rorb-/- mice [56,57]. The HAP1 gene also displayed a strong association to the measure of verbal intelligence (Vocabulary), and in addition, we observed a nominal association of HAP1 to the estimated full-scale IQ (FSIQ). This gene has been shown to have an enriched expression in neurons, and the encoded protein is thought to be involved in intracellular trafficking and regulation of gene transcription. Dysfunction of HAP1 has been linked to the neuropathology in Huntington disease, a disease where cognitive decline and psychiatric symptoms are often prominent (reviewed in [58]).
Table 8. [...] [20,41,42]), three SCZ GWASs (the Norwegian TOP sample, the German part of a combined German-Dutch SCZ GWAS and a Danish sample [19,43,44]) and six non-psychiatric phenotypes (from WTCCC; CD: Crohn's disease, CHD: coronary heart disease, HT: hypertension, RA: rheumatoid arthritis, T1D: type 1 diabetes and T2D: type 2 diabetes, [42]). The analysis was based on extraction of modified Sidak's minimum P-values [45], as implemented in LDsnpR. FDR q-value < 0.1 was set as cut-off value for significant enrichment.
Furthermore, we observed that the two FMCx enriched genes C1QL3 and HCRTR1, and the TCx enriched gene CABP1, displayed nominally significant association to all the tests of attention. Hcrtr1 has previously been shown to be involved in attentional processing by activating the basal forebrain cholinergic system in rats (reviewed in [59]). Interestingly, an association between HCRTR1 and major mood disorders was recently reported [60]. Neither C1QL3 nor CABP1 has previously been linked to cognitive abilities. Notably, a reduction of neurons expressing CABP1, accompanied by an increase in protein expression in the remaining neurons, has been observed in post-mortem brain tissue from patients suffering from SCZ [61]. While none of these genes met the experiment-wide threshold of significance, P = 0.00014, which conservatively corrects for the number of traits and genes tested, these findings should be taken in the context of the prior evidence conferred on these candidate genes through the multiple relevant positive association, expression and functional results.
Figure 2. Functional characterisation of the human homologues to the rat regionally enriched cortical genes. Search for over- and under-represented biological processes and molecular functions was performed by using Panther [35,36]. The significance of over- and under-represented Panther classification categories in the complete list of candidate genes (i.e. all the cortical regions, column 2), the FMCx enriched genes (column 3), TCx enriched genes (column 4) and OCx enriched genes (column 5), is illustrated by a heat map. The statistical significance of each gene set (negative log P-value) is illustrated by colour intensity (red: over-represented, blue: under-represented, white: as expected). Number of genes in each gene set is listed. The OCx gene HTR5B was not represented in Panther. The percentage of genes within a gene set that map to the given category is indicated on the heat map, e.g. 59% of the 61 enriched genes map to the biological process "cellular process". The first column states the overall distribution of a term among the 19,911 genes from the default human reference gene list, e.g. 31% of the 61 regional genes were expected to map to "cellular process", hence this category is significantly over-represented among the regional genes. Exp: expected (based on default human reference gene list), FMCx: frontomedial cortex, TCx: temporal cortex, OCx: occipital cortex, #: number of genes in each gene set, %: percentage of genes. doi:10.1371/journal.pone.0031687.g002

Genes differentially expressed in the TCx show enrichment of association signal to a test measure of non-verbal intelligence in gene set-based analysis
In order to analyse whether the candidate genes as a group would show an association to cognitive traits, we chose to analyse them as gene sets, using GSEA in combination with the NCNG GWAS dataset. We found that the TCx gene set showed a significant enrichment of association signal to a test measure of non-verbal intelligence (Reasoning). In addition to analysing the gene set using modified Sidak's P-value, we also applied random gene sets that would mimic the TCx gene set in regard to number of genes contained in the set, and also SNP number assigned to each random gene. This analysis gave no significant enrichment of association signal, and it is therefore likely that the observed association is due to biological effects of the genes contained in the TCx gene set, and not a result of unspecific association signal. We also included a gene set comprising "housekeeping" genes in the analysis. This gene set showed no enrichment of association signal to any of the cognitive tests, further supporting the validity of the finding.
The parieto-frontal integration theory network, consisting of the dorsolateral prefrontal, parietal, anterior cingulate, temporal and occipital cortices, is suggested to explain differences in cognitive performance, including performance on a test measure of reasoning [24]. The set of TCx genes analysed in this study could be involved in this network, although the importance of this set of genes in intellectual function remains to be explored.
In the GSEA, we also observed an enrichment of association signal for the OCx gene set in one of the measures of attention (CDT-Invalid). The random gene sets used as a control gave no significant association, indicating that the observed enrichment was not a result of spurious association. However, the OCx gene set is fairly small (n = 11), and the finding could be a result of inflated scoring. The GSEA program estimates an enrichment score, and normalizes the score by taking the number of genes in the gene set into account. For very small gene sets (n < 10), the probability of generating a false positive result will therefore increase, and caution has to be exercised with respect to the validity of this finding [32].
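To make the scoring step above more concrete, the sketch below implements a weighted running-sum enrichment score of the kind used in GSEA, together with one common form of normalisation (dividing the observed score by the mean of same-sign scores from random gene sets of equal size). This is a simplified illustration under stated assumptions, not the GSEA program itself: the function names, the gene-label permutation scheme and the toy scores are hypothetical, and the exact normalisation applied in the analysis reported here is not specified in this section.

```python
import random


def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Weighted running-sum enrichment score over a ranked gene list.

    ranked_genes: genes ordered from strongest to weakest association.
    scores:       dict gene -> association score (e.g. -log10 of a gene-based P-value).
    """
    hits = [g in gene_set for g in ranked_genes]
    n_miss = hits.count(False)
    hit_total = sum(abs(scores[g]) ** weight for g, h in zip(ranked_genes, hits) if h)
    if hit_total == 0 or n_miss == 0:
        raise ValueError("gene set must partly overlap the ranked list and have non-zero scores")

    running = best = 0.0
    for g, h in zip(ranked_genes, hits):
        # Step up (weighted by score) at set members, step down otherwise.
        running += (abs(scores[g]) ** weight / hit_total) if h else (-1.0 / n_miss)
        if abs(running) > abs(best):
            best = running  # keep the maximum deviation from zero
    return best


def normalised_enrichment_score(ranked_genes, scores, gene_set, n_perm=1000, seed=0):
    """Observed score divided by the mean same-sign score of random sets of equal size."""
    rng = random.Random(seed)
    observed = enrichment_score(ranked_genes, scores, gene_set)
    size = sum(g in gene_set for g in ranked_genes)
    null = [enrichment_score(ranked_genes, scores, set(rng.sample(ranked_genes, size)))
            for _ in range(n_perm)]
    same_sign = [abs(es) for es in null if (es >= 0) == (observed >= 0)]
    if not same_sign:
        return float("nan")
    return observed / (sum(same_sign) / len(same_sign))


# Toy example: six ranked genes, gene set made of the two top-ranked genes.
toy_genes = ["g1", "g2", "g3", "g4", "g5", "g6"]
toy_scores = dict(zip(toy_genes, [6, 5, 4, 3, 2, 1]))
toy_es = enrichment_score(toy_genes, toy_scores, {"g1", "g2"})
```

Because the null scores depend on the number of genes in the set, running such a sketch with different set sizes gives a feel for how much the reference distribution changes for very small sets, which is the concern raised above for the OCx gene set.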
Genes differentially expressed in the occipital cortex show enrichment of association signal to the Danish SCZ sample, in gene set-based analysis
Since impairments of cognitive functions are observed in individuals suffering from SCZ and BP, we also analysed the differentially expressed cortical genes, as gene sets, in GWASs of psychiatric illnesses, using GSEA.
We found that the OCx gene set displayed significant enrichment of association signal in the Danish SCZ GWAS. None of the cortical gene sets examined showed enrichment of association in the other SCZ GWASs or in the three BP GWASs analysed. In order to validate the findings, we generated 100 random gene sets mimicking the OCx gene set with regard to the number of genes and the number of SNPs assigned to each gene. We did not observe an enrichment of association when analysing the random gene sets in GSEA, which could indicate that the observed association signal was due to the genes contained in the OCx gene set. In addition, we tested the validity of the GSEA in psychiatric disorder GWASs by analysing the same candidate genes, as gene sets, in GWASs of non-psychiatric phenotypes from the WTCCC [42]. None of the gene sets showed enrichment of association signal. Furthermore, we also analysed a set of "housekeeping" genes in the six psychiatric disorder data sets and in the non-psychiatric phenotype data sets, and found no significant enrichment of association. Taken together, the results could indicate an actual role for the genes contained in the OCx gene set in SCZ. On the other hand, the enrichment of association signal for the OCx gene set in the Danish SCZ GWAS was not observed in the other SCZ GWAS data sets examined. It is difficult to pinpoint the cause of this discrepancy. It is possible that it represents a false-positive finding. The OCx gene set comprised a small number of genes (n = 11), increasing the risk of generating a false-positive result [32]. Alternatively, the genetic heterogeneity between the Norwegian, German and Danish populations might explain the observed differences [62]. In either case, this finding should be interpreted with caution, and further replication studies are warranted.
Regionally enriched cortical candidate genes; translation from rat to human
The candidate genes analysed in this study were identified from microarray gene expression profiling of the adult rat brain as differentially expressed genes in certain cortical regions. Despite the substantial differences in size, connectivity and cortical fields, some features of cortical organisation have been conserved in major groups of mammals [63,64]. Areas within the OCx (i.e. primary and second visual areas), somatosensory areas and regions within the TCx (primary auditory area) are known to share common cortical fields in a large group of mammals [64]. The similarity in broad cortical field organisation is thought to be caused by genetic factors specifying regional identity, inherited from the common ancestor of all mammalian species [64]. Interestingly, a recent study showed that the genetically influenced cortical regionalisation in the human brain was similar to the regionalisation in rodents [65]. Furthermore, it has been demonstrated that regional gene expression in the adult mouse anterior cortex, striatum and cerebellum is very similar to that in the anatomically and functionally homologous human brain regions [66].
We found that the human homologues of the regionally enriched rat brain genes showed over-representations of functional annotations similar to those previously identified for the rat [29]. A sub-set of the human homologous genes were also found to be expressed in corresponding areas (i.e. human frontal, temporal or occipital lobes), as observed in the rat. Moreover, some of the candidate genes have previously been linked to psychiatric and neurological disorders (e.g. RORB, HAP1, HCRTR1 and CABP1) [54,55,58,60,61], further emphasising the potential importance of these candidate genes in the human brain. On the other hand, some cortical areas are not well conserved in all mammals, e.g. the human frontal/prefrontal cortex, perisylvian cortex and Broca's area (the site of speech generation). The prefrontal cortex is highly specialised in humans, being linked to higher order thinking, certain cognitive abilities and personality traits, whereas the frontomedial cortex of the rat is mostly involved in motor functioning. It is therefore not surprising that we did not observe an enrichment of association signal for the FMCx gene set in GSEA.
Furthermore, the global gene expression in different cortical areas in the human brain has been shown to vary more between individuals than among regions within one individual [67]. Also, the inter-individual variation is apparently larger among humans than among chimpanzees [67]. Rodents are a well-established model system for studying human biology, given the ethical and practical limitations in using samples from the human brain. In rats, the inter-individual variance in gene expression is substantially smaller, and the rat therefore serves as a useful model for identifying differentially expressed genes in the adult neocortex.

Conclusion
Our findings suggest an association between regionally enriched cortical genes and intellectual function. RORB, a promising candidate for susceptibility to BP, showed the strongest overall association in the analysis, namely to a test of verbal intelligence. Moreover, we found that genes displaying preferential gene expression in the TCx showed enrichment of association signal to a test of non-verbal intelligence. We suggest that the TCx genes may be important to intellectual function in the healthy adult population. A replication of the findings is, however, essential to establish whether the TCx differentially expressed genes play a role in the neuronal mechanisms of intelligence.

Supporting Information
Figure S2 Cortical expression patterns of the human homologues to the regionally enriched rat genes. The Whole Brain Microarray Survey in the Allen Human Cortex Study from the Allen Institute for Brain Science [34] was explored using Brain Explorer 2, in order to analyse the gene expression pattern of a sub-set of the candidate genes (i.e. selected genes showing significant association in the NCNG). Each sample from the microarray survey had been mapped to a 3D illustration of the MR picture of the donors (two donors in total). Orientation of the donor brains is indicated above the panels and also in the upper right part of each expression analysis picture. The left and middle panels illustrate the gene expression in either the frontal (A), temporal (B) or occipital (C) lobe, only. The right panel illustrates the overall gene expression in the cortex (all cortical regions selected). Red or green colour indicates high or low relative gene expression, respectively, compared to the different samples/structures in the brain. The human homologues to the rat genes were expressed in corresponding regions in the human brain (e.g. FMCx, TCx or OCx enriched genes were expressed in the frontal (A), temporal (B) or occipital (C) lobe, respectively). (TIF)
Table S3 Gene-based analysis of regionally enriched cortical genes for association to cognitive abilities using uncorrected minimum P-values. The cortical enriched genes were analysed for allelic association to nine tests from the NCNG GWAS [37-40]: FSIQ: estimated Full-Scale Intelligence Quotient, Vocabulary: Wechsler Abbreviated Scale of Intelligence, Vocabulary, Reasoning: Wechsler Abbreviated Scale of Intelligence, Matrix Reasoning, CVLT-L: California Verbal Learning Test, learning measure, CVLT-DR: California Verbal Learning Test, Delayed free Recall, Stroop3: the third condition from the D-KEFS Color-Word Interference Test, CDT: Cued Discrimination Task, Valid, Invalid and Neutral. The minimum P-value for each candidate gene was extracted, without adjusting for the number of SNPs assigned. Only uncorrected minimum P-values < 0.05 are reported. "-": non-significant P-value, HGNC: HUGO Gene Nomenclature Committee, SNPs: number of SNPs assigned to each gene by LDsnpR. Table S3a: Frontomedial cortex enriched genes, n = 29, Table S3b: Temporal cortex enriched genes, n = 22, and Table S3c: Occipital cortex enriched genes, n = 11. (DOC)
Table S4 GSEA of differentially expressed cortical genes in neurocognitive traits using uncorrected minimum P-values. The differentially expressed cortical genes were analysed, as gene sets, for enrichment of association signal in nine test measures of cognitive functions [37-40] from the NCNG GWAS data, using GSEA [32]. Five gene sets were analysed: gene set 1: combined list of all differentially expressed cortical genes, n = 62, gene set 2: FMCx genes, n = 29, gene set 3: TCx genes, n = 22, gene set 4: OCx genes, n = 11, and gene set 5: "housekeeping" genes, n = 36 (control gene set, Table S6). The analysis was based on extraction of minimum P-values, without correcting for the number of SNPs assigned to each gene in the GWAS data sets. FDR q-value < 0.01 was set as cut-off value for significant enrichment. "*": Nominal P-value < 0.0006 (1/number of permutations (1,500) in the analysis). For trait abbreviations see Table S1 and S3. (DOC)
Table S5 GSEA of differentially expressed cortical genes in psychiatric disorders and non-psychiatric phenotypes using uncorrected minimum P-values. GSEA was used to analyse the regionally enriched cortical genes, as gene sets, for enrichment of association signal in three different BP GWASs (German, TOP and WTCCC [20,41,42]), three SCZ GWASs (the German part of a combined German-Dutch SCZ GWAS, TOP and a Danish SCZ sample [19,43,44]) and six non-psychiatric phenotypes (from WTCCC; CD: Crohn's disease, HT: hypertension, RA: rheumatoid arthritis, CHD: coronary heart disease, T1D: type 1 diabetes and T2D: type 2 diabetes [42]). The analysis was based on extraction of minimum P-values, without correcting for the number of SNPs assigned to each gene in the GWAS data sets. FDR q-value < 0.01 was set as cut-off value for significant enrichment. The GSEA was performed 3 times, using 1,500 permutations and weighted enrichment statistics. Each run gave a slightly different FDR q-value, and the ranges for significant results are listed: a: (0.0020-0.0046), b: (0.013-0.021), c: (0.0088-0.014). *: One FMCx gene was not represented in the data set. **: Two FMCx genes were not represented in the data set. (DOC)
Table S6 Housekeeping genes used as a control gene set in GSEA. HGNC symbol, Ensembl ID (Release 54) and description of housekeeping genes (n = 36) used as a control gene set in GSEA of the cognitive tests, psychiatric disorders and non-psychiatric phenotypes. The genes are from Applied Biosystems' list of TaqMan endogenous controls and from a list of housekeeping genes from Warrington et al. [52]. (DOC)
Table S7 Gene-based analysis of regionally enriched cortical genes for association to cognitive abilities (corrected). The cortical enriched genes were analysed for allelic association to nine traits [37-40] from the NCNG GWAS. All modified Sidak's P-values are listed. HGNC: HUGO Gene Nomenclature Committee, SNPs: number of SNPs assigned to each gene by LDsnpR. For trait abbreviations see Table S1 and S3. Table S7a: Frontomedial cortex enriched genes, n = 29, Table S7b: Temporal cortex enriched genes, n = 22, and Table S7c: Occipital cortex enriched genes, n = 11. (DOC)
Table S8 Validation of observed enrichment by random gene sets. The validity of the observed enrichment signal of the TCx gene set in the test measure of non-verbal intelligence (Reasoning), and of the OCx gene set in both one of the attention tasks (CDT-Invalid) and the Danish SCZ sample, was analysed using random gene sets mimicking the gene sets with respect to gene set size and SNP markers assigned to each gene in the gene set. The ten best q-values are reported. RGS: Random Gene Set. FDR: False Discovery Rate. (DOC)