Novel Loci for Non-Syndromic Coarctation of the Aorta in Sporadic and Familial Cases

Background
Coarctation of the aorta (CoA) accounts for 5-8% of all congenital heart defects. CoA can be detected in up to 20% of patients with Ullrich-Turner syndrome (UTS), in which a part or all of one of the X chromosomes is absent. The etiology of non-syndromic CoA is poorly understood. In the present work, we test the hypothesis that rare copy number variations (CNVs), especially on the gonosomes, contribute to the etiology of non-syndromic CoA.
Methods
We performed high-resolution genome-wide CNV analysis using the Affymetrix SNP 6.0 microarray platform for 70 individuals with sporadic CoA, 3 families with inherited CoA (n=13) and 605 controls. Our analysis comprised genome-wide association, CNV burden and linkage. CNVs were validated by multiplex ligation-dependent probe amplification.
Results
We identified a significant abundance of large (>100 kb) CNVs on the X chromosome in males with CoA (p=0.005). Eleven of 51 (~22%) male cases had these large CNVs. Association analysis in the sporadic cohort revealed 14 novel loci for CoA. The locus on 21q22.3 in the sporadic CoA cohort overlapped with a gene locus identified in all familial cases of CoA (candidate gene TRPM2). We identified one CNV locus within a locus with high multipoint LOD score from a linkage analysis of the familial cases (SEPT9); another locus overlapped with a region implicated in Kabuki syndrome. In the familial cases, we identified a total of 7 CNV loci that were exclusively present in cases but not in unaffected family members.
Conclusion
Of all candidate loci identified, the TRPM2 locus was the most frequently implicated autosomal locus in sporadic and familial cases. However, the abundance of large CNVs on the X chromosome of affected males suggests that gonosomal aberrations are not only responsible for syndromic CoA but also involved in the development of sporadic and non-syndromic CoA and their male dominance.


Introduction
Recent studies of sporadic, non-syndromic congenital anomalies implicated rare de novo variants and copy number variation (CNV) in their etiology [1][2][3]. Several groups associated CNVs with the pathogenesis of congenital heart disease (CHD), including pulmonary atresia, tetralogy of Fallot, and left-sided outflow tract obstruction [4][5][6][7][8]. Left-sided outflow tract obstruction represents the most severe cardiac malformation syndrome and includes bicuspid aortic valve (BAV), aortic valve stenosis (AS), coarctation of the aorta (CoA), and hypoplastic left heart syndrome (HLHS).
In this study we focus on CoA, which represents the third most frequent cardiac malformation with a prevalence of ~8% of all CHD. Mendelian inheritance of CoA is very rare, accounting for only 1-2% of cases; in ~90% of cases, congenital CoA presents as a sporadic and non-syndromic malformation [9,10]. CoA can also be part of a syndrome, mainly Ullrich-Turner syndrome (UTS) and Kabuki syndrome (KS) [11,12]. The prevalence of CoA is 7-18% in UTS and 23% in KS [12][13][14][15]. The genetic basis of KS is poorly understood, except for cytogenetic abnormalities that have been associated with KS. In the majority of cases, changes on the gonosomes were identified, particularly on the X chromosome with ring X, monosomy X, and mosaic mutations; ring Y has also been associated with KS [16,17]. Recently, a locus on 8p22-23.1 was implicated in the pathogenesis of KS [18][19][20][21][22].
The most common chromosomal aberration in UTS is a complete monosomy X (50-75%) resulting from meiotic nondisjunction [23][24][25]. To a lesser degree, a partial or complete deletion of the short arm of the X chromosome has been implicated [23][24][25][26][27]. Strikingly, deletions of the short arm of the Y chromosome (Yp) can also lead to the complete phenotype of UTS including CoA [23,24]. Such deletions usually involve the sex-determining region on Yp (SRY). Individuals with SRY deletions are clinically indistinguishable from female patients with UTS. This phenomenon has been attributed to haploinsufficiency of gonosomal homologue genes (GHG), which are non-recombinant gene pairs encoded on the X and Y chromosomes, analogous to autosomal gene pairs [23,24][28][29][30]. As GHG escape X-inactivation in females, two copies of each gene are expressed in both males and females [29]. Previously we hypothesized that loss-of-function mutations in selected GHG are involved in the development of non-syndromic CoA through a reduced gene dosage effect. We investigated a cohort of patients with sporadic CoA using a focused screening approach (gonosomal candidate gene approach) and identified transducin (beta)-like 1, Y-linked (TBL1Y), a gonosomal homologue gene associated with the Notch signaling pathway, as being involved in non-syndromic CoA through loss-of-function mutations [31]. In our current work we investigated a second, new cohort of 70 individuals with sporadic CoA and 3 families with inherited non-syndromic CoA using a wider screening concept (genome-wide analyses of copy number variation). We tested the hypothesis that CNVs, particularly on the gonosomes, contribute not only to the development of syndromic CoA (as in UTS) but also to the development of non-syndromic sporadic CoA with male dominance.

Ethics Statement
The study and particularly the collection and processing of the human tissue samples have been approved by the ethics committee of the University of Erlangen (Re.-No. 3818). Written informed consent was obtained from all participating individuals. All samples used have been de-identified for the study.

Patient cohort
Our cohort included 70 individuals with isolated CoA (19 females; 51 males), 3 families (n = 13; 1 affected female, 5 affected males), and 605 controls; all individuals were of European genetic background. CoA was diagnosed by both echocardiography and aortic angiography. In some patients, an MR imaging study was also performed. Twenty-two individuals with CoA (31.4%) showed no other cardiac anomalies. Additional structural abnormalities in CoA, including bicuspid aortic valve and septal defects, are very common. Our cohort represents a typical distribution of concomitant structural abnormalities. Twenty-five (35.7%) of our patients had bicuspid aortic valve, 17 (24.2%) had a ventricular septal defect, 5 (7.1%) had persistent ductus arteriosus and 1 (1.4%) presented with Wolff-Parkinson-White syndrome, a common disorder of the conduction system. Individuals with genetically confirmed syndromic disorders, including UTS and KS, were excluded from the study. At the time of inclusion in our study, individuals were between 5 days and 48 years of age (average ~14.4 years) with a male-to-female ratio of 2.6 to 1. The control cohort included 605 individuals (382 males; 223 females) with no known heart disease. Genomic DNA was extracted from peripheral venous blood lymphocytes with the QIAamp DNA Blood Kit (Qiagen). Quality assessments for DNA extraction and array preparation were performed before genotyping.

DNA sequence variation and CNV genotyping
All sporadic cases and families with CoA (n = 83) were genotyped with the Affymetrix Genome-Wide Human SNP Array 6.0. Single nucleotide polymorphism (SNP) genotypes were called with the birdseed-v2 algorithm as implemented in the Affymetrix Power Tools. Segments with CNVs were called using birdseye, as implemented in the birdsuite software [32]. By design, this algorithm also searches for novel, and therefore rare, CNV loci outside known copy number polymorphism (CNP) sites. The same procedure was applied to all individuals from the 3 families with CoA. A group of 605 individuals with psoriatic arthritis (PSA), who had previously been genotyped using the procedure mentioned above, were used as controls [33]. The data from the 605 control individuals were lifted over to hg19 coordinates, as they had been genotyped using hg18 coordinates. At the beginning of birdsuite genotyping, individual arrays with poor quality were routinely removed by a standardized QC algorithm [32]. Segments from the sporadic cases and the 605 controls were then combined and translated into 379,666 pseudo-markers (start/endpoints of segments) to be used for association analysis using PLINK. For CNV burden analysis, the difference between the median and the minimum number of CNV segments within a cohort was added to the median, defining a cutoff value (for the PSA controls, this was done after the liftover). All samples with more CNV segments than this cutoff value were excluded from the burden analysis. Both raw and processed copy number data have been submitted to Gene Expression Omnibus (GEO) and are available through the accession numbers GSE67929 (sporadic cases) and GSE67930 (familial samples).
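The cutoff rule for the burden analysis can be illustrated with a minimal Python sketch. Only the rule itself (median plus the difference between median and minimum) is taken from the description above; the function names and the per-sample counts are hypothetical and are not part of the birdsuite or PLINK workflows.

```python
from statistics import median


def burden_cutoff(segment_counts):
    """Return median + (median - minimum) of per-sample CNV segment counts."""
    med = median(segment_counts)
    return med + (med - min(segment_counts))


def samples_passing(counts_by_sample):
    """Keep samples whose segment count does not exceed the cohort cutoff."""
    cutoff = burden_cutoff(list(counts_by_sample.values()))
    return {s: n for s, n in counts_by_sample.items() if n <= cutoff}


# Hypothetical per-sample segment counts (not data from this study).
counts = {"s1": 80, "s2": 95, "s3": 100, "s4": 110, "s5": 400}
kept = samples_passing(counts)
```

In this toy cohort the median is 100 and the minimum is 80, so the cutoff is 120 and only the outlying sample with 400 segments is dropped. The rule is symmetric around the median by construction, which makes it insensitive to a few samples with very many segments.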

Genome-wide SNP association
SNP genotypes, as determined by birdseed-v2 (part of the birdsuite algorithms), were filtered to exclude SNPs with more than 5% missing genotypes, a minor allele frequency of less than 1% or a marked deviation from Hardy-Weinberg equilibrium (HWE p-value < 0.001), reducing the original 909,622 SNP markers to 722,556. A standard chi-square-based association analysis was then performed using PLINK.
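The three SNP filters can be sketched in Python as follows. This is an illustration under stated assumptions rather than the pipeline used here: the function names are hypothetical, and the HWE check uses a simple 1-degree-of-freedom chi-square approximation, which may differ from the test applied by PLINK. Only the three thresholds are taken from the text.

```python
from math import erfc, sqrt


def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """Chi-square (1 df) p-value for deviation from Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    return erfc(sqrt(chi2 / 2))


def passes_qc(n_aa, n_ab, n_bb, n_missing,
              max_missing=0.05, min_maf=0.01, min_hwe_p=0.001):
    """Apply the missingness, minor-allele-frequency and HWE filters to one SNP."""
    called = n_aa + n_ab + n_bb
    if called == 0:
        return False
    if n_missing / (called + n_missing) > max_missing:
        return False
    p = (2 * n_aa + n_ab) / (2 * called)
    if min(p, 1 - p) < min_maf:
        return False
    return hwe_chi2_pvalue(n_aa, n_ab, n_bb) >= min_hwe_p


# Hypothetical genotype counts (AA, AB, BB, missing) for a single SNP.
ok = passes_qc(250, 500, 250, 0)
```

The filters are applied in the order missingness, allele frequency, HWE, so a monomorphic SNP is rejected before the HWE statistic is ever computed.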

Linkage Analysis
We performed genome-wide linkage analysis in three unrelated families. All DNA samples were genotyped with the Genome-Wide Human SNP Array 6.0 (Affymetrix, Santa Clara, Calif., USA). Genotypes were called by the Genotyping Console software v4.0 (Affymetrix). Simulation analysis for expected LOD scores (ELOD) for the given family pedigrees was performed with a parametric 2-point analysis with the software FastSLink [34]. Relationships of family members were verified by checking the genotype data with the software Graphical Representation of Relationships (GRR) [35]. Mendelian errors due to genotype errors were checked with the software Pedcheck and erroneous genotypes were removed from the data file [36]. Parametric multipoint linkage analysis and haplotype construction were done with the program Allegro [37][38][39]. Data handling was done with the software easyLinkage-Plus under the assumption of either an autosomal-recessive or an autosomal-dominant mode of inheritance with 100% penetrance, a disease allele frequency of 0.01% and an intermarker distance of at least 0.01 cM [40]. Visualization of haplotype data was performed using the software haplopainter [41].

MLPA validation of CNVs
All detected CNV regions were reviewed for gene content or gene proximity. From these, we validated eight exemplary regions detected by SNP arrays using multiplex ligation-dependent probe amplification (MLPA, MRC Holland) in up to 58 samples. We selected these regions according to technical MLPA assay design criteria. These eight regions contain the genes GSTM1, the DEFB cluster on chromosome 8, NF1P2, ERVV1, FAM115C, SEPT9, TPTE and TRPM2, respectively. For all regions we could validate the copy number status in the analyzed samples except for the region containing NF1P2. In this region, eight of 32 samples showed a duplication instead of the deletion called from the array data, possibly due to the highly variable pericentromeric structural variation region on chromosome 15q (S1 Table).

Cardiac gene list
A cardiac gene list from CHDwiki was matched against the CNV loci from the association analysis of the sporadic cohort, the overlapping regions from the familial analysis and the linkage regions. The cardiac gene list consists of 290 genes, representing most of the currently known gene-phenotype links for 139 cardiac defects. The use of this specialized ontology maximizes the relevance of the collected information to the CHD community and improves the consistency of this information. Each gene was reported to be involved in a cardiac phenotype in at least two independent publications [42].
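Matching a gene list against CNV loci or linkage regions amounts to an interval-overlap query. The following Python sketch shows one way to do this; the `Interval` type, the function names, and all gene names and coordinates are hypothetical placeholders, not entries from CHDwiki or loci from this study.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int
    end: int


def overlaps(a, b):
    """True if two closed intervals on the same chromosome share at least one base."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def genes_in_regions(genes, regions):
    """Return the sorted names of genes overlapping any of the given regions."""
    return sorted(name for name, locus in genes.items()
                  if any(overlaps(locus, region) for region in regions))


# Placeholder gene coordinates and regions (purely illustrative).
genes = {
    "GENE_A": Interval("chr1", 1_000, 5_000),
    "GENE_B": Interval("chr1", 50_000, 60_000),
    "GENE_C": Interval("chr2", 1_000, 5_000),
}
regions = [Interval("chr1", 4_000, 20_000), Interval("chr2", 6_000, 9_000)]
hits = genes_in_regions(genes, regions)
```

With a list of 290 genes and a few dozen regions, the quadratic scan is sufficient; an interval tree would only be worth the extra complexity for much larger inputs.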

CNV burden analysis in the sporadic cohort
We included 70 individuals with sporadic CoA and 490 of the 605 PSA controls, as well as 3 families with 2 affected individuals per family with CoA (Fig 1) for a separate burden analysis in familial cases. All samples met in-house quality criteria. Overall, we detected 7,160 CNVs larger than 10 kb (3,127 deletions and 4,033 duplications) in CoA individuals and a total of 17,990 CNVs larger than 10 kb (8,600 deletions and 9,390 duplications) in the control cohort. We set a size threshold of 100 kb and distinguished between autosomes and gonosomes. The optimal size threshold for this type of analysis has been demonstrated previously to be around 100 kb [43]. Ninety-three CNVs >100 kb were identified in CoA (~1.33/individual) and 2,151 in the control cohort (~4.39/individual). We then tested for a possible abundance of large CNVs in cases versus controls, both on the autosomes and on the X chromosome, the latter being tested separately for males (a CNV being defined as a segment with a copy number other than one outside the pseudo-autosomal regions) and females (a CNV being defined as on the autosomes). We thereby tested for a higher proportion of large CNV segments in cases compared to controls, as shown previously in a study on the genetics of short stature [43]. Addressing our hypothesis that CNVs, particularly on the gonosomes, contribute to sporadic and non-syndromic CoA, we identified a significant abundance (p = 0.005, with a Bonferroni-corrected significance threshold of 0.0167) of large (>100 kb) X-chromosomal CNVs in males by applying a one-sided Fisher's exact test with the alternative hypothesis being an odds ratio greater than 1 (Table 1). Of the 51 male cases, 11 had CNV segments >100 kb on the X chromosome. Only a subset of these CNV segments are located in coding regions. Genes contained within these CNV segments are SPACA5, ZNF630, SSX6 (all
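The one-sided Fisher's exact test used for the burden comparison can be sketched in pure Python. The function name and the control counts below are hypothetical (the actual counts are those of Table 1); only the 11 of 51 male cases and the threshold of 0.0167 come from the text. The threshold equals 0.05/3 after rounding, which would correspond to a correction for three tests; that reading is an assumption.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a) under the hypergeometric null, i.e. the p-value for
    the alternative hypothesis that the odds ratio is greater than 1.
    """
    row1 = a + b          # e.g. number of cases
    col1 = a + c          # e.g. number of individuals with a large CNV
    n = a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k)
               for k in range(a, upper + 1))
    return tail / comb(n, col1)


# Assumed number of tests behind the reported threshold of 0.0167.
BONFERRONI_ALPHA = 0.05 / 3

# 11 of 51 male cases carry a large X-chromosomal CNV (from the text);
# the control counts (30 carriers among 300 males) are hypothetical.
p_value = fisher_exact_greater(11, 40, 30, 270)
significant = p_value < BONFERRONI_ALPHA
```

Because the alternative is one-sided, only tables with at least as many carriers among cases as observed contribute to the p-value; an excess of carriers among controls can never yield a small p-value under this test.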

CNV association analysis in sporadic samples
The pseudo-markers from the 70 sporadic cases and 605 control individuals were tested for association using a one-sided permutation-based approach as implemented in PLINK, with 100,000 permutations. Only segments with at least 5 markers and spanning 10 kb or more were evaluated. Of the associated loci, 5 exceeded the significance threshold of 5E-05 as deletions and 9 as duplications (Table 2). The duplication locus on chromosome 17 did not quite pass the significance threshold, but was retained due to its overlap with a linkage region (Table 3). The DEFB4 cluster on chromosome 8 is a known CNP locus also present in PSA cases. Since one-sided tests were used, and in view of the previous literature (see Discussion), this locus was also retained. X-chromosomal association analysis was carried out for males and females separately, but showed no significantly associated loci.
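The idea of a one-sided permutation test at a single pseudo-marker can be sketched as follows. This is a generic label-shuffling illustration, not PLINK's implementation: the function name, the test statistic (number of carriers among cases) and the carrier counts are assumptions made for the example.

```python
import random


def permutation_pvalue(case_flags, control_flags, n_perm=10_000, seed=0):
    """One-sided empirical p-value for an excess of CNV carriers among cases.

    case_flags / control_flags hold 1 for individuals carrying a CNV at the
    marker and 0 otherwise. Case/control labels are shuffled n_perm times.
    """
    rng = random.Random(seed)
    n_cases = len(case_flags)
    pooled = list(case_flags) + list(control_flags)
    observed = sum(case_flags)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        # Count permutations that are at least as extreme as the observation.
        if sum(pooled[:n_cases]) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical marker: 12 carriers among 70 cases, 5 among 605 controls.
cases = [1] * 12 + [0] * 58
controls = [1] * 5 + [0] * 600
p_emp = permutation_pvalue(cases, controls, n_perm=2_000)
```

With the add-one convention used in this sketch, the smallest p-value attainable from 100,000 permutations is 1/100,001, or about 1E-05, which lies below the significance threshold of 5E-05 stated above.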

SNP analysis in sporadic CoA cohort
A SNP association analysis using 722,556 SNPs in the unrelated cases and controls showed no evidence of association beyond the genome-wide significance threshold (data not shown).

CNV burden in familial cases
CNV burden analysis for large CNVs >100 kb was performed analogously to the analysis in sporadic cases. It revealed no difference in gonosomal CNV burden. However, a difference was seen for the autosomes (Table 4). Familial CoA cases showed significantly more large deletions >100 kb (p<0.01) and fewer large duplications >100 kb (p<0.01) compared to controls.

Linkage analysis in families with CoA
Even though we did not expect to achieve significant LOD scores due to the small pedigrees, we performed a linkage analysis. A total of 521,435 SNPs with a perfect call rate were used in a multipoint linkage analysis assuming dominant or recessive inheritance for all pedigrees, producing a union of all regions with a LOD score rising above the background. This approach yielded a total of 19 linkage regions with LOD scores of up to 1.62 (Fig 2).
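Why small pedigrees limit the attainable LOD score can be illustrated with the textbook two-point LOD score for phase-known, fully informative meioses. This is a deliberately simplified sketch; it does not reproduce the parametric multipoint analysis performed with Allegro, and the function name and the meiosis counts are hypothetical.

```python
from math import log10


def two_point_lod(recombinants, meioses, theta):
    """LOD score log10[L(theta) / L(0.5)] for phase-known informative meioses."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    non_recombinants = meioses - recombinants

    def log_likelihood(t):
        if t == 0.0 and recombinants > 0:
            return float("-inf")  # a recombinant is impossible at theta = 0
        value = 0.0
        if recombinants:
            value += recombinants * log10(t)
        if non_recombinants:
            value += non_recombinants * log10(1.0 - t)
        return value

    return log_likelihood(theta) - log_likelihood(0.5)


# Best case (no recombinants, theta = 0) for hypothetical pedigree sizes.
lod_5_meioses = two_point_lod(0, 5, 0.0)
lod_10_meioses = two_point_lod(0, 10, 0.0)
```

In the best case of no recombinants, the score is n times log10(2), or about 0.301 per informative meiosis, so roughly ten such meioses are needed to reach the conventional significance threshold of 3. Under this simplification, a family with only a handful of informative meioses cannot reach that threshold on its own.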

CNV analysis in families with CoA
CNV calling with the birdseye software produced 1,783 CNV segments larger than 10 kb (444 deletions, 1,339 duplications), which were checked for overlap among the 6 affected individuals of the 3 families. One duplication locus was present in all 6 affected individuals but not in the unaffected controls, one locus was present in 5 cases and 0 controls, and 5 loci were present in 4 cases and 0 controls (Table 5). The duplication locus on chromosome 6 shared by 4 cases and 0 controls partially overlaps with a linkage region.
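The search for CNV loci shared by affected family members and absent from unaffected ones can be sketched in Python. This is a minimal illustration under assumptions: the `Segment` type, the function names and all coordinates are hypothetical, and overlapping segments with slightly different boundaries are reported separately rather than merged into one locus.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    chrom: str
    start: int
    end: int
    kind: str  # "del" or "dup"


def overlaps(a, b):
    """True if two segments of the same type overlap on the same chromosome."""
    return (a.chrom == b.chrom and a.kind == b.kind
            and a.start <= b.end and b.start <= a.end)


def carriers(locus, calls):
    """Individuals with at least one segment overlapping the locus."""
    return {person for person, segments in calls.items()
            if any(overlaps(locus, s) for s in segments)}


def case_only_loci(affected, unaffected, min_cases):
    """Segments seen in >= min_cases affected and in no unaffected individual."""
    found, seen = [], set()
    for segments in affected.values():
        for locus in segments:
            if locus in seen:
                continue
            seen.add(locus)
            n_cases = len(carriers(locus, affected))
            if n_cases >= min_cases and not carriers(locus, unaffected):
                found.append((locus, n_cases))
    return found


# Hypothetical calls for three affected and two unaffected relatives.
affected = {
    "A1": [Segment("chrA", 100, 200, "dup"), Segment("chrB", 10, 50, "del")],
    "A2": [Segment("chrA", 120, 220, "dup")],
    "A3": [Segment("chrA", 90, 180, "dup")],
}
unaffected = {"U1": [Segment("chrB", 20, 40, "del")], "U2": []}
shared = case_only_loci(affected, unaffected, min_cases=3)
```

In the toy data, the duplication on the placeholder chromosome `chrA` is carried by all three affected individuals and by no unaffected relative, whereas the deletion on `chrB` is excluded because an unaffected relative also carries it.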

Cardiac gene list
A list with 290 cardiac candidate genes was compared with all CNV segments of the sporadic and familial CoA cohort as well as the linkage regions of the familial cases [42]. An overlap between calculated linkage regions with this cardiac gene list identified 22 cardiac genes located within the linkage regions as novel candidate genes for CoA (Table 3).

Discussion
Due to the high incidence of CoA in UTS and KS patients and the male dominance of CoA in non-syndromic sporadic cases, we previously screened for mutations in GHG using a gonosomal candidate gene approach [31]. To follow up on our idea of gonosomal abnormalities in sporadic non-syndromic CoA, we now extended our mutation detection from single-gene analyses (candidate gene approach by Sanger sequencing) to a genome-wide CNV analysis. We tested whether CNVs, particularly on the gonosomes, contribute to the development of sporadic and non-syndromic CoA.
CNVs have been associated with the pathogenesis of complex congenital heart disease and left-sided outflow tract obstruction [5][6][7][8]. To our knowledge, our study is the first genome-wide CNV analysis in CoA cohorts. Our comprehensive genotype analyses included sporadic and familial cases with CoA and addressed sex-specific genetic modifiers. We performed CNV association and burden analysis. CNV burden was evaluated after assigning a size threshold of >100 kb and distinguishing between autosomes and gonosomes. Analysis was performed separately for sporadic CoA cases versus controls and familial CoA cases versus healthy family members. Gonosomes were tested separately for males and females. Additionally, a list of 290 cardiac candidate genes was compared with all CNV segments of the sporadic and familial CoA cohorts as well as the linkage regions of the familial cases to assess the overlap of gene content (Fig 3).
Regarding the CNV burden in the group of sporadic CoA cases, a significant gonosomal abundance (p = 0.005) of large X-chromosomal CNVs (>100 kb) in males was observed. This abundance of large X-chromosomal CNVs in sporadic male CoA cases may be related to the clinical finding of a 2:1 male-to-female incidence in CoA. Males cannot compensate for alterations of X-chromosomal gene dosage. Only a subset of the X-chromosomal CNV segments were within coding regions of chromosome X. However, we still have limited knowledge of intronic regulators of gene activity, particularly on the X chromosome. We could identify 4 genes within these chromosomal segments: SPACA5 (sperm acrosome-associated protein 5), ZNF630 (zinc finger protein 630), SSX6 (synovial sarcoma, X breakpoint 6), and PCDH11X (protocadherin 11 X-linked). In a comparative genomic hybridization (CGH) array analysis, a duplication of SPACA5 (and ZNF81/ZNF182) has been described in a boy with developmental delay, autistic features, and growth and speech delay [44]. SSX6 belongs to the SSX gene family; whereas SSX1, SSX2 and SSX4 have been found to be involved in synovial sarcomas, SSX6 is expressed in melanoma cell lines [45]. ZNF630 resides in an area of chromosome X that has been implicated in non-syndromic X-linked mental retardation [46]. PCDH11X belongs to the protocadherin gene family of calcium-dependent cell adhesion and recognition proteins and is located in a major X/Y block of homology [47]. The PCDH11X protein is thought to play a fundamental role in cell-cell recognition essential for the segmental development of the central nervous system. Alternative splicing in PCDH11X results in multiple transcript variants. It has been proposed as a potential candidate gene for late-onset Alzheimer disease [48,49]. However, the main interacting partners and pathways of PCDH11X as well as its distinct genotype-phenotype correlations are unknown.
In contrast to the sporadic cases of CoA, the CNV burden for gonosomes was not different in familial cases. Here we identified a significantly different CNV burden for the autosomes, with more deletions and fewer duplications in the familial cases of CoA. This different pattern of CNV burden may indicate that the pathogenesis of sporadic and familial cases differs. However, the number of families we investigated in this study was small compared to the number of patients in the sporadic cohort.
In the association analyses of the sporadic cohort we identified 14 CNV loci (threshold of 5E-05) derived from segments in 5-58 of 70 patients, which are mostly novel and therefore not common copy number polymorphisms (CNPs). One region on chromosome 8p23.1 was deleted in 9 patients and, strikingly, the same region was duplicated in 11 patients; only one control showed a deletion in this area. The ratio of the CoA subgroups (CoA only, CoA with BAV or VSD) within those 20 individuals did not differ significantly from the primary cohort of 70 individuals, although half of these patients had isolated CoA (10 patients = 50% with CoA only, 7 patients = 35% with additional BAV, 3 = 15% with additional VSD, see Table 6). This locus contains the genes DEFB4B, DEFB103A, SPAG11B, DEFB104A, DEFB106A, DEFB105A, DEFB107A, SPAG11A and DEFB4A. Various studies described copy number changes at this locus in patients with KS or Kabuki-like syndromes; some of these patients have been reported to have hypoplastic left ventricle and aorta. However, whether copy number changes on 8p22-23.1 are associated with KS is debated in the literature [18][19][20][21][22]50]. Hypoplastic aortic arch or CoA is only an intermediate phenotype of KS. In our analyses, 20 of the 70 sporadic cases with CoA had large CNVs on 8p22-23.1. Thus, we suggest that the region 8p22-23.1 is a candidate region for CoA regardless of additional phenotypic features of KS. While we are aware of the significance of this locus for psoriasis and psoriatic arthritis [51], we used one-sided tests throughout; our findings therefore do not reflect CNVs possibly enriched in the PSA controls but rather CNVs with a higher relative abundance in our patients.
The CNV analyses in the familial cases revealed 7 rare (non-CNP) CNVs, which overlapped within affected individuals from all 3 families. One CNV locus (TRPM2; transient receptor potential cation channel, subfamily M, member 2) was present in all 6 affected individuals of the 3 CoA families and absent in all unaffected individuals. Another CNV locus (PLA2G6; phospholipase A2, group VI) was present in 5 of 6 familial CoA cases and none of the unaffected individuals. Five CNV loci (SMTN, smoothelin; ARHGEF1, Rho guanine nucleotide exchange factor (GEF) 1; SCYL1, SCY1-like 1 (S. cerevisiae); CCND3, cyclin D3; EML6, echinoderm microtubule associated protein like 6) were present in 4 of 6 familial CoA cases and none of the unaffected individuals (Table 5). Our association analyses considered sporadic cases only due to the limited number of familial cases. However, we tested for overlap between the groups, comparing association loci of sporadic CoA cases with overlapping loci of familial CoA cases and with the linkage regions of the familial CoA cases.
The CNV locus of TRPM2 was detected not only in all 6 affected individuals of the familial CoA cases but also in 6 individuals of the sporadic CoA cohort, providing a potentially important new candidate or modifier gene for the CoA phenotype. TRPM2 is a well-known oxidant-sensitive Ca2+ permeable channel implicated in mediating endothelial apoptosis and in promoting vascular injury and inflammation [52]. TRPM2 also contributes to the production of vasoactive nitric oxide via the p38/JNK pathway [53]. In an established mouse model, cardiac TRPM2 channel activity protected the heart from ischemia/reperfusion injury by ameliorating mitochondrial dysfunction and reducing reactive oxygen species levels [54]. So far there are no data on alteration of TRPM2 activity in a model of endovascular stenosis or coarctation of the aortic isthmus. Endovascular stenosis can augment shear stress-induced endothelial cell apoptosis and inflammation [55]. The cause of CoA is certainly heterogeneous, polygenic and pleiotropic. Altered oxidant-sensitive Ca2+ permeable channel activity involved in endovascular apoptosis and inflammation and regulation of vascular tone via nitric oxide may represent a
Due to the limited number of familial cases, we decided to lower the threshold for relevant linkage loci and, in a first step, highlighted all loci above background. The analyses revealed 19 linkage regions with LOD scores of up to 1.62 (Fig 2). We filtered the number of candidate genes by matching them with a list of known cardiac genes involving 139 cardiac defects. Of these 19 loci, 9 contained 22 overlapping cardiac genes listed in Table 3.

Conclusion
Our study provides new insight into the genetic basis of non-syndromic CoA by identifying new candidate genes through CNV association and burden analysis. Of all candidate loci identified, the TRPM2 gene locus was the most frequently implicated autosomal locus in both sporadic and familial cases. The abundance of large CNVs on the X chromosome of affected males suggests that gonosomal aberrations are not only involved in the development of syndromic CoA, as in UTS or KS, but are also likely responsible for sporadic and non-syndromic CoA and the male dominance observed in this malformation. Further studies are needed, and are the subject of our current efforts, to better understand the genetic as well as the epigenetic factors leading to CoA.

Supporting Information
S1 Table. MLPA validation. From all detected CNVs, eight with gene content were chosen for validation with MLPA. Only those DNA samples with an aberrant CNV in the particular region were analyzed with an MLPA assay. All CNVs from the microarray analysis could be validated (+), except for the CNV status of the NF1P2 region in eight of 32 samples (duplications instead of deletions); n/a = not applicable because of absent CNV in microarray data. (DOCX)