Comprehensive cross-disorder analyses of CNTNAP2 suggest it is unlikely to be a primary risk gene for psychiatric disorders

The contactin-associated protein-like 2 (CNTNAP2) gene is a member of the neurexin superfamily. CNTNAP2 was first implicated in the cortical dysplasia-focal epilepsy (CDFE) syndrome, a recessive disease characterized by intellectual disability, epilepsy, language impairments and autistic features. Associated SNPs and heterozygous deletions in CNTNAP2 were subsequently reported in autism, schizophrenia and other psychiatric or neurological disorders. We aimed to comprehensively examine evidence for the role of CNTNAP2 in susceptibility to psychiatric disorders, by the analysis of multiple classes of genetic variation in large genomic datasets. In this study we: i) used summary statistics from the Psychiatric Genomics Consortium (PGC) GWAS for seven psychiatric disorders; ii) examined all reported CNTNAP2 structural variants in patients and controls; iii) performed cross-disorder analysis of functional or previously associated SNPs; and iv) conducted burden tests for pathogenic rare variants using sequencing data (4,483 ASD and 6,135 schizophrenia cases, and 13,042 controls). The distribution of CNVs across CNTNAP2 in psychiatric cases from previous reports did not differ from that in controls from the Database of Genomic Variants. Gene-based association testing did not implicate common variants in autism, schizophrenia or other psychiatric phenotypes. The association of proposed functional SNPs rs7794745 and rs2710102, reported to influence brain connectivity, was not replicated; nor did predicted functional SNPs yield significant results in meta-analysis across psychiatric disorders at either SNP-level or gene-level. The burden of disruptive rare variants in CNTNAP2 was not higher in autism or schizophrenia compared to controls. Finally, in a CNV microarray study of an extended bipolar disorder family with 5 affected relatives we previously identified a 131kb deletion in CNTNAP2 intron 1, removing a FOXP2 transcription factor binding site. Quantitative-PCR validation and segregation analysis of this CNV revealed imperfect segregation with BD. This large comprehensive study indicates that CNTNAP2 may not be a robust risk gene for psychiatric phenotypes.


Introduction
The contactin-associated protein-like 2 (CNTNAP2) gene is located on chromosome 7q35-36.1, and consists of 24 exons spanning 2.3Mb, making it one of the largest protein coding genes in the human genome. This gene encodes the CASPR2 protein, related to the neurexin superfamily, which localises with potassium channels at the juxtaparanodal regions of the nodes of Ranvier in myelinated axons, playing a crucial role in the clustering of potassium channels required for conduction of action potentials [1]. CNTNAP2 is expressed in the spinal cord, prefrontal and frontal cortex, striatum, thalamus and amygdala; this pattern of expression is preserved throughout development and adulthood [2,3]. Its function is related to neuronal migration, dendritic arborisation and synaptic transmission [4]. The crucial role of CNTNAP2 in the human brain became clear in 2006 when Strauss et al. reported homozygous mutations in Old Order Amish families segregating with a severe Mendelian condition, described as cortical dysplasia-focal epilepsy (CDFE) syndrome (OMIM 610042) [5]. In 2009, additional patients with recessive mutations in CNTNAP2 were reported, with clinical features resembling Pitt-Hopkins syndrome [6]. To date 33 patients, mostly from consanguineous families, have been reported with homozygous or compound deletions and truncating mutations in CNTNAP2 [5-9], and are collectively described as having CASPR2 deficiency disorder [7]. The common clinical features in this phenotype include severe intellectual disability (ID), seizures with age of onset at two years and concomitant speech impairments or language regression. The phenotype is often accompanied by dysmorphic features, autistic traits, psychomotor delay and focal cortical dysplasia.
CNTNAP2 is also thought to contribute to diverse phenotypes in patients with interstitial or terminal deletions at 7q35 and 7q36. Interstitial or terminal deletions encompassing CNTNAP2 and several other genes have been described in individuals with ID, seizures, craniofacial anomalies, including microcephaly, short stature and absence of language [10]. The severe language impairments observed in patients with homozygous mutations or karyotypic abnormalities involving CNTNAP2 suggested a possible functional interaction with FOXP2, a gene for which heterozygous mutations lead to a monogenic form of language disorder [11]. Interestingly, Vernes et al., found that the FOXP2 transcription factor has a binding site in intron 1 of CNTNAP2, regulating its expression [12]. Considering that a large proportion of autistic patients show language impairments and most individuals with homozygous mutations in CNTNAP2 manifest autistic features, several studies investigated the potential involvement of CNTNAP2 in autism spectrum disorder (ASD). In particular, two pioneering studies showed that single nucleotide polymorphism (SNP) markers rs2710102 and rs7794745 were associated with risk of ASD [13,14]. These studies were the first implicating CNTNAP2 in autism, and opened a chapter of additional analyses in ASD and other psychiatric phenotypes during the next decade. In subsequent studies, rs2710102 was implicated in early language acquisition in the general population [15], and showed functional effects on brain activation in neuroimaging studies [16][17][18][19]. Furthermore, genotypes at rs7794745 were associated with reduced grey matter volume in the left superior occipital gyrus in two independent studies [20,21], and alleles of this SNP were reported to affect voice-specific brain function [22]. Genetic associations with ASD for these, and several other SNPs in CNTNAP2, have been reported in a number of studies [23][24][25][26][27][28]. 
Along with the first reports of SNPs associated with ASD, copy number variant (CNV) deletions have also been described in ID or ASD patients, which were proposed to be highly penetrant disease-causative mutations [13, 29-38]. To better understand the role of CNTNAP2 in ASD pathophysiology, knockout mice were generated. Studies of these mice reported several neuronal defects when both copies of CNTNAP2 are mutated: abnormal neuronal migration, reduction of GABAergic interneurons, deficiency in excitatory neurotransmission, and the delay of myelination in the neocortex [2,39,40].
While CNTNAP2 is now considered a strong candidate gene for ASD and psychiatric disease more generally (summarised in Table 1), several of these early supportive studies were performed with limited sample sizes, or were individual case reports which lacked comparison with control individuals, hence providing only circumstantial evidence for CNTNAP2 as a psychiatric risk gene. We therefore aimed in this current study to perform systematic genetic analyses with large datasets to examine the evidence for a role of the CNTNAP2 gene in multiple psychiatric phenotypes, performing a comprehensive analysis of common and rare variants, CNVs and de novo mutations, using both publicly available datasets and in-house data.

50], as well as several other psychiatric phenotypes [41-45, 49]. In Table 2, we summarise all markers found significantly associated in these previous studies, and report the corresponding P-value from the Psychiatric Genomics Consortium GWAS for seven major psychiatric disorders: ADHD, anorexia nervosa, ASD, bipolar disorder, MDD, OCD and schizophrenia. Nominal associations were found with ASD for the following markers: rs802524 (P = 0.016), rs802568 (P = 0.008), rs17170073 (P = 0.008), and rs2710102 (P = 0.036; which is highly correlated with 4 SNPs: rs759178, rs1922892, rs2538991, rs2538976). Furthermore, nominal association was also observed with schizophrenia for rs1859547 (P = 0.044); with ADHD for rs1718101 (P = 0.038); with MDD for rs12670868 (P = 0.047), rs17236239 (P = 0.006), rs4431523 (P = 0.001); and with anorexia nervosa for rs700273 (P = 0.013). The nominal associations with ASD at rs17170073 and rs2710102 represent the only cases in which the association in the original report replicates in the PGC dataset for the same phenotype.
The two SNPs rs7794745 and rs2710102, which were repeatedly reported as being associated in earlier studies with smaller sample size and proposed to be functional SNPs, were not strongly associated with any phenotype (the most significant signal being P = 0.036 for rs2710102 in ASD). None of those associations survived corrections for multiple comparisons (Table 2).
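The comparison against the multiple-testing threshold can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the P-values are the nominal ones quoted above, the threshold is the one quoted in the Table 2 notes (P < 3.8E-04), and treating that threshold as a Bonferroni bound at alpha = 0.05 is an assumption made here for illustration. The function names are hypothetical.

```python
# Nominal P-values quoted in the text for previously reported CNTNAP2 SNPs,
# keyed by SNP, with the PGC phenotype in which the signal was seen.
nominal_hits = {
    "rs802524": ("ASD", 0.016),
    "rs802568": ("ASD", 0.008),
    "rs17170073": ("ASD", 0.008),
    "rs2710102": ("ASD", 0.036),
    "rs1859547": ("SCZ", 0.044),
    "rs1718101": ("ADHD", 0.038),
    "rs12670868": ("MDD", 0.047),
    "rs17236239": ("MDD", 0.006),
    "rs4431523": ("MDD", 0.001),
    "rs700273": ("AN", 0.013),
}

# Study-wide significance threshold quoted in the Table 2 notes.
THRESHOLD = 3.8e-4


def survivors(hits, threshold):
    """Return the SNPs whose P-value falls below the corrected threshold."""
    return sorted(snp for snp, (_, p) in hits.items() if p < threshold)


def implied_number_of_tests(threshold, alpha=0.05):
    """Number of tests implied IF the threshold is a Bonferroni bound (assumption)."""
    return alpha / threshold


passing = survivors(nominal_hits, THRESHOLD)
print("SNPs surviving correction:", passing)
print("Implied number of independent tests:", round(implied_number_of_tests(THRESHOLD)))
```

Even the strongest nominal signal (rs4431523 in MDD, P = 0.001) is above the threshold, which is why none of the listed markers is retained.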

Gene-based analysis of cross-disorder associations
Next, we explored the contribution of common variants across CNTNAP2 by performing a gene-based association study in MAGMA using GWAS summary statistics from PGC data of seven psychiatric disorders in European populations (Table 3). Association plots for all SNPs included in analysis of each individual phenotype are shown in supporting information (S1 Fig).
The test included a dense coverage of SNPs across CNTNAP2: from 1,214 SNPs in MDD up to 12,264 SNPs in schizophrenia. The results suggest that common variants overall do not contribute to disease susceptibility of these phenotypes (Table 3). The most significant association observed was for MDD phase 1 analysis (P = 0.029), which is the dataset with the most modest coverage of markers.
To explore whether any gene-based signal is not being detected due to a low signal-to-noise ratio (i.e. inclusion of a large number of SNPs of no functional consequence), we selected 63 predicted functional SNPs in CNTNAP2 and performed a meta-analysis across psychiatric disorders (for regional association plot, see S2 Fig). Nominal significance of association was observed for 11 predicted functional SNPs with P-values ranging from 0.01 to 0.05, but none survived correction for multiple comparisons (Table 4). The only predicted functional SNP which was previously reported as being associated with ASD was rs34712024 [25], but this variant was not associated with autism in the PGC dataset (P = 0.67; Table 2), nor other psychiatric phenotypes examined separately or together (Tables 2-4). MAGMA gene-based association analysis using this more restricted pool of common putative functional variants revealed significant association with ADHD after correction for multiple testing (corrected P-value = 0.033) and a nominal association with schizophrenia which did not survive multiple testing correction (S1 Table). However, this signal is reduced to trend level in the cross-disorder meta-analysis for this functional SNP-set (P-value = 0.11; S1 Table).

Table 2 notes. The disease for which association at each listed SNP is given, along with the reference number for each study and the approximate location of each variant within the CNTNAP2 gene structure. On the right, the P-value from each Psychiatric Genomics Consortium (PGC) dataset is reported. Where the associated SNP was not found in the GWAS summary statistic data, results for an alternative SNP are shown in parentheses (r² = 1). Putative functional SNPs rs7794745 and rs2710102 are underlined. No association survives correction for multiple independent tests (P < 3.8E-04), but P-values < 0.05 are shown in bold. Abbreviations: ASD, autism spectrum disorder; SLI, specific language impairment; DYS, dyslexia; ANX, social anxiety; LAN, language in general population; SCZ, schizophrenia; BD, bipolar disorder; ALD, alcohol dependence; OPN, openness in general population; MDD, major depressive disorder; SSD, speech sound disorder; N/A, SNP not genotyped. &, r² > 0.97 across the following SNPs: rs851715 and rs10246256. #, r² > 0.97 across the following SNPs: rs2710102, rs759178, rs1922892, rs2538991 and rs2538976. Where indicated, summary data at this SNP was not included in the latest autism GWAS (PGC2) but was present in the previous data set, which included 5,305 ASD cases and 5,305 controls. https://doi.org/10.1371/journal.pgen.1007535.t002

De novo variants in CNTNAP2
De novo variants in protein-coding genes which are predicted to be functionally damaging are considered to be highly pathogenic and have been extensively explored to implicate genes in psychiatric diseases, especially in ASD and schizophrenia [77]. We explored publicly available sequence data from previous projects in psychiatric disorders to assess the rate of coding de novo variants in CNTNAP2 using two databases (NPdenovo, http://www.wzgenomics.cn/NPdenovo/; and denovo-db, http://denovo-db.gs.washington.edu/denovo-db/). No truncating or missense variants were identified across CNTNAP2 in 15,539 families (including 2,163 controls), and synonymous variants were reported in only two probands with developmental disorder (Table 5).

Pathogenic Ultra-Rare Variants (URV) of CNTNAP2 in ASD and Schizophrenia
Finally, we explored the potential impact of pathogenic ultra-rare variants (URV) in CNTNAP2 using available sequencing datasets of 4,483 patients with ASD and 6,135 patients with schizophrenia compared with 13,042 controls. We considered only those variants predicted to be pathogenic in both SIFT and PolyPhen and which are ultra-rare (MAF < 0.0001 in the Non-Finnish European population; S3 Table). No difference in the total number of URV was observed between controls and patients with ASD (P = 0.11) or schizophrenia (P = 0.78) (Table 6).
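The variant selection rule described above can be sketched as a simple filter. This is an illustration under stated assumptions: the `Variant` record, its field names, and the example variants are hypothetical, and only the two criteria given in the text (damaging in both predictors, MAF < 0.0001 in non-Finnish Europeans) are encoded.

```python
from dataclasses import dataclass

MAF_CUTOFF = 0.0001  # ultra-rare threshold quoted in the text


@dataclass
class Variant:
    variant_id: str          # hypothetical identifier, for illustration only
    sift_damaging: bool      # predicted damaging by SIFT
    polyphen_damaging: bool  # predicted damaging by PolyPhen
    maf_nfe: float           # allele frequency in non-Finnish Europeans


def is_pathogenic_urv(v: Variant) -> bool:
    """True when both predictors agree and the variant is ultra-rare."""
    return v.sift_damaging and v.polyphen_damaging and v.maf_nfe < MAF_CUTOFF


# Made-up records showing each way a variant can fail the filter.
candidates = [
    Variant("var_A", True, True, 0.00002),   # kept
    Variant("var_B", True, False, 0.00002),  # predictors disagree
    Variant("var_C", True, True, 0.002),     # too common
]

kept = [v.variant_id for v in candidates if is_pathogenic_urv(v)]
print(kept)
```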

Structural variants affecting CNTNAP2 amongst psychiatric phenotypes
Several deletions and duplications have been described in neuropsychiatric phenotypes thus far. In Fig 1, we present a comprehensive representation of all previously reported structural variants found in CNTNAP2 in psychiatric disorders such as ASD or ID [13, 29-38], schizophrenia or bipolar disorder [51-54, 78], ADHD [55], neurologic disorders such as epilepsy, Tourette syndrome or Charcot-Marie-Tooth [56-61]; and finally language-related phenotypes such as speech delay, childhood apraxia of speech and dyslexia [62-65]. Interestingly, the reported structural variants frequently map in intron 1, and extend to exon 4 in some cases. The distribution of those structural variants across different phenotypes does not differ from that found in control populations from the database of genomic variants (http://dgv.tcag.ca/dgv/app/home) (Fig 1), suggesting that structural variants in CNTNAP2 are not rare events associated exclusively with disease but are present at low frequency in the general population. Unfortunately, as many reported CNVs come from individual case reports for which the number of subjects screened is not reported, direct frequency comparisons of these data are not meaningful.

Examination of an intronic deletion in CNTNAP2 in an extended family with bipolar disorder
CNV microarray analysis was previously performed in two affected individuals from an extended family which included five relatives affected with bipolar I disorder [78]. A drop in signal intensity for 340 consecutive probes was compatible with a deletion of 131 kb in intron 1 of CNTNAP2 (hg19/chr7:146203548-146334635; Fig 2A), encompassing the described binding site for the transcription factor FOXP2 (hg19/chr7:146215016-146215040) [12]. The deletion was detected in one of the two affected individuals examined by CNV array. To infer deletion segregation amongst additional relatives, WES-derived genotypes were used to create haplotypes across chromosome 7q35 (Fig 2B). CNV segregation (by haplotype inference) was uninformative due to: 1) incomplete genotype data (unaffected descendants of deceased patient 8404 were not included in the WES study) and 2) a likely recombination at 7q35 in the family. Thus experimental validation and CNV genotyping were performed in all individuals with DNA available to assess the presence of the CNTNAP2 intronic deletion and its disease association. Using quantitative PCR, the deletion was validated in proband subject 8401, and was detected in one unaffected descendant of deceased patient 8404 (Fig 2B and 2C), implying that: 1) affected subject 8404 would have carried the deletion, had DNA been available; and 2) the CNV is unlikely to be highly penetrant as it was observed in an unaffected adult relative. The structural variant was not detected in the remaining affected relatives and therefore did not segregate with disease status in this family (Fig 2B).
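The coordinates given above can be checked with a short interval calculation. This is a minimal sketch: the two intervals are the hg19 coordinates quoted in the text, while the helper names and the treatment of both ends as inclusive are assumptions made for illustration.

```python
# hg19 chr7 coordinates quoted in the text.
DELETION = (146203548, 146334635)    # intron 1 deletion
FOXP2_SITE = (146215016, 146215040)  # described FOXP2 binding site


def length_bp(interval):
    """Interval length in base pairs, assuming inclusive start and end."""
    start, end = interval
    return end - start + 1


def contains(outer, inner):
    """True when `inner` lies entirely within `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


print("Deletion size (kb):", round(length_bp(DELETION) / 1000))
print("FOXP2 site removed:", contains(DELETION, FOXP2_SITE))
```

Under these assumptions the deletion spans about 131 kb and fully covers the 25 bp binding site, consistent with the description in the text.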

Discussion
During the last decade, the CNTNAP2 gene has received considerable attention in the psychiatric genetics field, with a number of studies examining gene dosage, and common or rare variant associations across multiple major psychiatric disorders, which together provided compelling evidence that CNTNAP2 may be a risk gene with pleiotropic effects in psychiatry. While homozygous mutations in this gene lead to a rare and severe condition described as CASPR2 deficiency disorder (CDD) [7], characterized by profound intellectual disability, epilepsy, language impairment or regression [7, 8], heterozygous mutations or common variants have been suggested to be implicated in autism, whose clinical features overlap with some observed in CDD. CNTNAP2 is categorised in the SFARI database as a syndromic gene and one of the highest-ranking "strong candidate" genes for ASD (https://gene.sfari.org). Heterozygous deletions encompassing the CNTNAP2 gene were described not only in autism but in a wide range of phenotypes, including psychiatric or neurologic disorders, and language-related deficiencies. These structural variants were generally described as causative or highly penetrant [13, 29, 31, 55, 57, 59]. Examination of the distribution of all structural variants described thus far in psychiatric or neurologic patients showed comparable localisation to those found in the general population, suggesting that structural variants affecting CNTNAP2 may be less relevant in disease susceptibility than previously considered. We were not able to directly compare frequencies of observed structural variants in cases versus controls due to reporting bias in case reports and a lack of information on how many cases were screened to identify those subjects with reportable CNTNAP2 CNVs, which is a limitation of this study. In the ExAC database, CNTNAP2 had fewer CNVs than expected (11 observed vs. 16 expected, z = 0.43; http://exac.broadinstitute.org), and its haploinsufficiency score of 0.59 is in the 8th decile of all genes [79], suggesting that CNTNAP2 has a tendency to be intolerant to structural variants. A specific case-control CNV analysis is needed to examine CNV frequency differences, but would require a very large sample due to the rarity of CNVs at this locus. A close clinical psychiatric examination of the 66 parents of CDD patients, who carry heterozygous deletions across CNTNAP2, provides information on the prevalence of psychiatric conditions in individuals carrying CNTNAP2 CNVs. All heterozygous family members carrying deletions or truncating mutations were described as phenotypically healthy, suggesting a lack of correlation between these deletions and any major psychiatric condition. Furthermore, parents who were carriers for heterozygous deletions in psychiatric/neurologic patients were described as unaffected at the time of reporting [13, 29, 31, 37, 54, 62], with two exceptions: one father of a proband with neonatal convulsion, and another father of an epileptic patient, were reported as affected [56,59]. Moreover, discordant segregation for deletions in CNTNAP2 was also observed in an ASD sib-pair [13]. Several psychiatric patients who were reported to carry heterozygous structural variants in CNTNAP2 were also described with translocations or other chromosomal abnormalities [29, 30, 33, 34, 56, 58, 62-65]; it is therefore possible that these aberrations explain the phenotype independently of the observed CNVs in CNTNAP2.
We also describe a new CNV deletion which does not segregate with disease in an extended family with bipolar disorder. This CNV removes the FOXP2 transcription factor binding site in intron 1 of CNTNAP2, and overlaps with structural variants described in a number of other psychiatric patients. This heterozygous deletion was identified in two individuals with bipolar I disorder from an extended family with five affected members, but was observed also in one unaffected relative (who underwent diagnostic interview at age >40 and therefore was beyond the typical age of symptom onset). Hence, the deletion was not segregating with the disease and is unlikely to represent a highly penetrant risk variant in this family, although we cannot exclude a multiple hit model where the CNV deletion interacts with other etiologic risk variants at other loci to exert phenotypic effect.
CNTNAP2-/- knockout mice have been proposed as a valid animal model for ASD considering the phenotypic similarities between ASD and the CASPR2 deficiency disorder [2]. CNTNAP2-/- knockout mice showed abnormalities in the arborisation of dendrites, maturation of dendritic spines, defects in migration of cortical projection neurons, and reduction of GABAergic interneurons [2,4]. However, ASD is not a core feature in the most recent patient series reported with CASPR2 deficiency disorder [7,8]. The previously proposed relationship between heterozygous deletions in CNTNAP2 and ASD is not supported by mouse models, as heterozygous mice did not show any of the behavioural or neuropathological abnormalities that were observed in homozygous knockouts [2]. Notwithstanding this, it is possible that the combination of heterozygous CNTNAP2 deletions in a genomic background of increased risk (through inheritance of other common and rare risk variants at other loci) may lead to psychiatric, behavioural or neuropathological abnormalities.
Common variants in CNTNAP2 are another class of genetic variation associated with several psychiatric or language-related phenotypes. The most interesting findings from studies of this variant class converge on markers rs7794745 and rs2710102, originally reported in ASD [13,14], and replicated later in ASD or implicated in other phenotypes [12,15,23,24,46-48]. Neuroimaging studies have supported the notion that these common variants play a role in psychiatric disorders. SNP rs2710102 has been implicated in brain connectivity in healthy individuals [16,18,19], and rs7794745 was implicated in audio-visual speech perception [80], voice-specific brain function [22], and was associated with reduced grey matter volume in left superior occipital gyrus [20,21]. These studies focused principally on language tasks in the general population, given the reported suggestive implications of CNTNAP2 in language impairment traits of ASD or language-related phenotypes. However, the direct role of CNTNAP2 in language is still unclear; indeed the language regression observed in patients with CASPR2 deficiency disorder is concomitant with seizure onset and may represent a secondary phenotypic effect caused by seizures [7]. On the other hand, the first genetic associations of rs7794745 and rs2710102 with ASD, as well as with other psychiatric diseases, were based on studies with limited sample size, and recent studies failed to replicate associations between the two markers and ASD [81,82]. Individual alleles associated in the past in limited numbers of patients warrant replication in adequately powered samples, such as that attempted here, to ascertain bona fide findings considering the small effect sizes of common variants [83]. Using the largest case-control cohorts currently publicly available (PGC datasets), we did not find evidence for significant association of previously reported common variants, or a combined effect for common variants of CNTNAP2 in the susceptibility of psychiatric disorders, nor did we find predicted functional SNPs with a role across disorders.
Finally, we examined evidence for rare variant contributions in CNTNAP2. Rare variants in the promoter or coding region were reported to play a role in the pathophysiology of ASD [25, 33], although a recent study including a large number of cases and controls did not find association of rare variants of CNTNAP2 in ASD [84]. Here we report the largest sample investigated thus far in ASD and schizophrenia, which suggests that rare variants in CNTNAP2 do not play a major role in these two psychiatric disorders. Furthermore, examination of de novo variants in combined psychiatric sequencing projects of over 15,500 trios suggests that de novo variants in CNTNAP2 do not increase risk for psychiatric disorders.
While functional studies show a relationship between certain deletions or rare variants of CNTNAP2 with neuronal phenotypes relevant to psychiatric illness [25, 54,85], we show that the genetic link between these variants and psychiatric phenotypes is tenuous. However, this does not dispel the evidence that the CNTNAP2 gene, or specific genetic variations within this gene, may have a real impact on neuronal functions or variability of brain connectivity in the general population.
It is now possible to combine large datasets to ascertain the real impact of candidate genes described in the past in psychiatric disorders. Here we performed analyses using large publicly available datasets investigating a range of mutational mechanisms which impact variability of CNTNAP2 across several psychiatric disorders. In conclusion, our results converge to show a limited or likely neutral role of CNTNAP2 in the susceptibility of psychiatric disorders. However, the impact of this gene in language deficit per se is not directly examined in this study and warrants additional investigation.

Common variant association in CNTNAP2 using publicly available datasets
We sought to replicate previously reported CNTNAP2 SNP associations in a range of psychiatric phenotypes or traits using GWAS summary-statistic data of the Psychiatric Genomics Consortium (https://med.unc.edu/pgc/results-and-downloads).
Firstly, we report the corresponding P-values of specific previously associated markers for case-control cohorts with autism spectrum disorder (ASD), schizophrenia (SCZ), bipolar disorder (BD), attention-deficit hyperactivity disorder (ADHD), major depressive disorder (MDD), anorexia nervosa (AN), and obsessive compulsive disorder (OCD). If a specific SNP marker was not reported in an individual GWAS dataset, we selected another marker in high linkage disequilibrium (r² ≈ 1, using genotype data from the CEU, TSI, GBR and IBS European populations in the 1000 Genomes Project; http://www.internationalgenome.org).
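The linkage disequilibrium measure used to pick a proxy marker can be illustrated with the standard r² formula for two biallelic SNPs. This is a sketch only: the function name and the toy haplotypes are hypothetical, phased haplotypes coded 0/1 are assumed as input, and no claim is made about which software the authors used to compute r².

```python
def r_squared(haplotypes):
    """LD r^2 between two biallelic SNPs from phased haplotypes.

    `haplotypes` is a list of (allele_at_snp_A, allele_at_snp_B) pairs,
    each allele coded 0 or 1.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when either SNP is monomorphic")
    return d * d / denom


# Toy data: alleles always travel together (perfect proxy) vs. no correlation.
perfect_proxy = [(1, 1), (1, 1), (0, 0), (0, 0)]
unlinked = [(1, 1), (1, 0), (0, 1), (0, 0)]
print(r_squared(perfect_proxy), r_squared(unlinked))
```

A marker with r² close to 1 carries essentially the same association information as the missing SNP, which is the rationale for substituting it.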
Next, a gene-based association for common variants was calculated with MAGMA [86], using variants within a 5 kb window upstream and downstream of CNTNAP2. Selected datasets were of European descent, derived from GWAS summary statistics of the Psychiatric Genomics Consortium (https://med.unc.edu/pgc/results-and-downloads): SCZ (33,640 cases and 43,456 controls), BD (20,352 cases and 31,358 controls), ASD (6,197 cases and 7,377 controls), ADHD (19,099 cases and 34,194 controls), MDD (9,240 cases and 9,519 controls), OCD (2,688 cases and 7,037 controls), and AN (3,495 cases and 10,982 controls) [87][88][89][90][91][92][93]. Analyses were performed combining two different models for higher statistical power and sensitivity when the genetic architecture is unknown: the combined P-value model, which is more sensitive when only a small proportion of key SNPs in a gene show association; and the mean SNP association, which is more sensitive when allelic heterogeneity is greater and a larger number of SNPs show nominal association.
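The 5 kb flanking window that defines which SNPs enter the gene-based test can be sketched as a position filter. This is not MAGMA itself, which performs its own annotation step; the function name, the gene boundaries, and the SNP positions below are hypothetical placeholders chosen only to show the window logic.

```python
def snps_in_gene_window(snps, gene_start, gene_end, flank_bp=5000):
    """Keep SNPs located within the gene body plus a flank on each side.

    `snps` is a list of (snp_id, position) pairs on the gene's chromosome.
    """
    lo, hi = gene_start - flank_bp, gene_end + flank_bp
    return [snp_id for snp_id, pos in snps if lo <= pos <= hi]


# Placeholder gene boundaries and SNP positions (not real coordinates).
GENE_START, GENE_END = 100_000, 200_000
toy_snps = [
    ("snp_upstream_far", 94_999),    # just outside the 5 kb flank
    ("snp_upstream_near", 95_000),   # on the flank boundary, kept
    ("snp_intronic", 150_000),       # inside the gene, kept
    ("snp_downstream_far", 205_001), # just outside the 5 kb flank
]
print(snps_in_gene_window(toy_snps, GENE_START, GENE_END))
```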
Finally, we selected SNPs predicted to be functional within a 5kb window upstream/downstream of CNTNAP2 (e.g. located in transcription factor binding sites, miRNA binding sites etc; https://snpinfo.niehs.nih.gov), and assessed a potential cross-disorder effect using GWAS summary statistics data of the PGC by performing a meta-analysis in PLINK [94]. The Cochran's Q-statistic and I² statistic were calculated to examine heterogeneity amongst studies. The null hypothesis was that all studies were measuring the same true effect, which would be rejected if heterogeneity exists across studies. For all functional SNPs, when heterogeneity between studies was I² > 50% (P < 0.05), the pooled OR was estimated using a random-effects model.
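The heterogeneity statistics and the choice between fixed- and random-effects pooling can be sketched with the standard inverse-variance formulas. This is an illustration, not PLINK's implementation: the function name and the input values are hypothetical, the random-effects estimate uses the DerSimonian-Laird estimator as an assumption, and the switch is made on I² alone (the P-value of Q is omitted for brevity).

```python
import math


def inverse_variance_meta(betas, ses):
    """Pool per-study log odds ratios; report Q, I^2 and the chosen model."""
    weights = [1.0 / se ** 2 for se in ses]
    sum_w = sum(weights)
    fixed = sum(w * b for w, b in zip(weights, betas)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(w * (b - fixed) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1

    # I^2: percentage of total variation attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # DerSimonian-Laird estimate of the between-study variance tau^2.
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    re_weights = [1.0 / (se ** 2 + tau2) for se in ses]
    random = sum(w * b for w, b in zip(re_weights, betas)) / sum(re_weights)

    # Rule from the text: random-effects pooling when I^2 exceeds 50%.
    model = "random" if i2 > 50.0 else "fixed"
    pooled = random if model == "random" else fixed
    return {"Q": q, "I2": i2, "tau2": tau2, "model": model,
            "pooled_or": math.exp(pooled)}


# Hypothetical log(OR) values and standard errors for one SNP in two studies.
result = inverse_variance_meta([0.0, 0.5], [0.1, 0.1])
print(result["model"], round(result["I2"], 1), round(result["pooled_or"], 3))
```

Reporting both estimates side by side is a common alternative; the single switch on I² is kept here only because it mirrors the rule stated in the text.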

Analysis of rare variants in CNTNAP2 in ASD and schizophrenia, and de novo variants across psychiatric cohorts
The impact of rare variants of CNTNAP2 was assessed using sequencing-level data from the following datasets: WES from the Sweden-Schizophrenia population-based Case-Control cohort (6,135 cases and 6,245 controls; dbGAP accession: phs000473.v2.p2); ARRA Autism Sequencing Collaboration (490 BCM cases, 486 BCM controls, and 1,288 unrelated ASD probands from consent code c1; dbGAP accession: phs000298.v3.p2); Medical Genome Reference Bank (2,845 healthy Australian adults; https://sgc.garvan.org.au/initiatives/mgrb); individuals from a Caucasian Spanish population (719 controls [95,96]); in-house ASD patients (30 cases; [97]); and a previously published dataset in ASD (2,704 cases and 2,747 controls [84]). The selection of potentially etiologic variants was performed based on their predicted pathogenicity (missense damaging in both SIFT and PolyPhen-2, canonical splice variants, stop mutations and indels) and minor allele frequency (MAF < 0.0001 in non-Finnish European populations using the Genome Aggregation Database; http://gnomad.broadinstitute.org/). A chi-square statistic was used to compare separately the sample of schizophrenia patients (6,135 cases) and the combined ASD datasets (4,512 cases) with the combined control datasets (13,042 individuals).
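The case-control comparison can be sketched as a Pearson chi-square test on a 2x2 table of carriers versus non-carriers. This is an illustration under stated assumptions: the cohort sizes are those quoted above, but the carrier counts are hypothetical placeholders (the actual counts are in Table 6), no continuity correction is applied, and the function name is invented for this example.

```python
import math


def chi_square_2x2(case_carriers, case_total, control_carriers, control_total):
    """Pearson chi-square (1 df, no continuity correction) and its P-value."""
    a, b = case_carriers, case_total - case_carriers
    c, d = control_carriers, control_total - control_carriers
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column of the table is empty")
    stat = n * (a * d - b * c) ** 2 / denom
    # Survival function of a chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Cohort sizes from the text; carrier counts are HYPOTHETICAL placeholders.
SCZ_CASES, CONTROLS = 6135, 13042
stat, p = chi_square_2x2(30, SCZ_CASES, 60, CONTROLS)
print(f"chi2 = {stat:.3f}, P = {p:.3f}")
```

With counts this small, an exact test would be a reasonable alternative; the chi-square form is used here only because it is the statistic named in the text.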

Extended family with bipolar disorder and CNV in CNTNAP2
The extended family presented here (Fig 2B) provides a molecular follow-up from a previously reported whole exome sequencing (WES) study of multiplex BD families, augmented with CNV microarray data [78]. This multigenerational pedigree was collected through the Mood Disorders Unit and Black Dog Institute at the Prince of Wales Hospital, Sydney, and the School of Psychiatry (University of New South Wales in Sydney) [100][101][102][103][104]. Consenting family members were assessed using the Family Interview for Genetic Studies (FIGS) [105], and the Diagnostic Interview for Genetic Studies (DIGS) [106]. The study was approved by the Human Research Ethics Committee of the University of New South Wales, and written informed consent was obtained from all participating individuals. Blood samples were collected for DNA extraction by standard laboratory methods. Three of the five relatives with bipolar disorder type I (BD-I) had DNA and WES-derived genotype data available, and six unaffected relatives with DNA and WES data were available for haplotype phasing and segregation analysis (Fig 2B).
Genome-wide CNV analysis was performed via CytoScan HD Array (Affymetrix, Santa Clara, CA, USA) in 2 distal affected relatives (individuals 8410 and 8401; Fig 2B), using the Affymetrix Chromosome Analysis Suite (ChAS) software (ThermoFisher, Waltham, MA, USA). Detailed information on CNV detection and filtering criteria has been previously described [78]. We identified a 131kb deletion in intron 1 of CNTNAP2 in individual 8401. WES-derived genotypes were used for haplotype assessment to infer CNV segregation amongst relatives, as previously described [78]. Next, we experimentally validated the CNTNAP2 CNV via quantitative PCR (qPCR) in all available family members. Validation was performed in quadruplicate via a SYBR Green-based qPCR method using two independent amplicon probes, each compared with two different reference amplicon probes in the FOXP2 and RNF20 genes (S4 Table). Experimental details are available upon request.
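One common way to turn such qPCR measurements into a copy-number estimate is the comparative Ct (2^-ΔΔCt) calculation, sketched below. The text does not state which quantification method was used, so this is an assumption for illustration; the function name, the Ct values, the use of a two-copy calibrator sample, and the assumed 100% amplification efficiency are all hypothetical.

```python
from statistics import mean


def relative_copy_number(target_ct, reference_ct,
                         calibrator_target_ct, calibrator_reference_ct,
                         calibrator_copies=2):
    """Estimate copy number by the comparative Ct method.

    Each argument is a list of replicate Ct values. The calibrator is a
    sample assumed to carry `calibrator_copies` copies of the target.
    Assumes equal, 100% efficient amplification of all amplicons.
    """
    delta_sample = mean(target_ct) - mean(reference_ct)
    delta_calibrator = mean(calibrator_target_ct) - mean(calibrator_reference_ct)
    delta_delta_ct = delta_sample - delta_calibrator
    return calibrator_copies * 2 ** (-delta_delta_ct)


# Hypothetical quadruplicate Ct values: the target amplifies one cycle later
# in the test sample than in the calibrator, relative to the reference.
copies = relative_copy_number(
    target_ct=[26.0, 26.0, 26.0, 26.0],
    reference_ct=[24.0, 24.0, 24.0, 24.0],
    calibrator_target_ct=[25.0, 25.0, 25.0, 25.0],
    calibrator_reference_ct=[24.0, 24.0, 24.0, 24.0],
)
print(copies)
```

Under these assumptions a one-cycle delay halves the estimate, so a heterozygous deletion would be expected to give a value near 1 and a non-carrier a value near 2.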