Primary complex motor stereotypies are associated with de novo damaging DNA coding mutations that identify KDM5B as a risk gene

Motor stereotypies are common in children with autism spectrum disorder (ASD), intellectual disability, or sensory deprivation, as well as in typically developing children (“primary” stereotypies, pCMS). The precise pathophysiological mechanism for motor stereotypies is unknown, although genetic etiologies have been suggested. In this study, we perform whole-exome DNA sequencing in 129 parent-child trios with pCMS and 853 control trios (118 cases and 750 controls after quality control). We report an increased rate of de novo predicted-damaging DNA coding variants in pCMS versus controls, identifying KDM5B as a high-confidence risk gene and estimating 184 genes conferring risk. Genes harboring de novo damaging variants in pCMS probands show significant overlap with those in Tourette syndrome, ASD, and those in ASD probands with high versus low stereotypy scores. An exploratory analysis of these pCMS gene expression patterns finds clustering within the cortex and striatum during early mid-fetal development. Exploratory gene ontology and network analyses highlight functional convergence in calcium ion transport, demethylation, cell signaling, cell cycle and development. Continued sequencing of pCMS trios will identify additional risk genes and provide greater insights into biological mechanisms of stereotypies across diagnostic boundaries.


Introduction
Motor stereotypies are rhythmic, repetitive, prolonged, fixed, patterned, non-goal-directed movements that are often bilateral and temporarily stop with distraction. Complex motor stereotypies (CMS) include hand flapping, finger wiggling, head nodding, and rocking; these are often accompanied by mouth opening, head posturing, jumping, pacing, and occasional vocalizations [1]. Movements can last for minutes, occur multiple times per day, and tend to be exacerbated by excitement, fatigue, stress, boredom, or being engrossed in an activity. CMS are common in children with autism spectrum disorder (ASD), intellectual disability, or sensory deprivation, as well as in typically developing children. A favored classification subdivides CMS by etiology into primary (otherwise typically developing) and secondary categories. In both groups, stereotypies often result in social stigmatization, classroom disruption, and interference with academic activities.
In children with ASD, stereotypic behaviors ("secondary" stereotypies) occur in about 44% of patients and are recognized as a core phenotype of the disorder [2]. The severity and frequency of motor stereotypies are correlated with severity of illness, degree of intellectual disability, and impairments in adaptive functioning and symbolic play [3][4][5][6][7][8][9]. They are often associated with self-injurious behaviors [10,11]. A wide range of medications has been tried for treatment of stereotypies in ASD, but efficacy is inconsistent and inadequate, with potential for long-term side effects [12].
The precise pathophysiological mechanism for motor stereotypies remains obscure [31], though investigators have hypothesized abnormalities within cortico-striatal-thalamo-cortical pathways [32][33][34][35][36][37][38] and several neurotransmitter systems [33][39][40][41]. A recent study reported reduced functional connectivity between prefrontal cortical and striatal regions in pCMS [42]. A genetic etiology for stereotypies has been suggested in primary and secondary categories, although the specific gene(s) contributing to this movement disorder remain unclear. With respect to secondary stereotypies in ASD, family studies have demonstrated that these repetitive behaviors are highly heritable, with a genetic etiology that is likely independent from other core diagnostic features [43]. While there are no studies of recurrence risk or twin concordance reported for pCMS, a positive family history is reported in 25-40%, while remaining cases appear to be sporadic [16,27,44].
Considering these findings, we conducted the first pilot genetic study of pCMS in 129 typically developing children and their parents. We hypothesized that pCMS may represent a more genetically homogeneous group of individuals versus those with secondary stereotypies, thereby facilitating genetic discovery and insight into the biology of stereotypies more generally [48,49]. We studied rare de novo, or spontaneous, germline DNA mutations in these individuals. In disorders such as autism, obsessive-compulsive disorder, and Tourette syndrome [45][46][47], this approach has proven invaluable for identifying genetic variants of large effect, high-confidence risk genes, and enriched biological functions. Using whole-exome DNA sequencing, we identified an enrichment of de novo predicted-damaging coding mutations in pCMS and identified one high-confidence risk gene, Lysine Demethylase 5B (KDM5B), in our cohort. By further analysis of de novo damaging mutations in pCMS, we predict that there are approximately 184 pCMS risk genes and that sequencing more pCMS parent-child trios is a definite path toward discovering these genes. In this pilot study, we see a significant overlap between genes harboring de novo damaging mutations in pCMS and those in ASD as well as Tourette syndrome, a neurodevelopmental movement disorder characterized by motor and vocal tics. This overlap occurred despite excluding subjects with ASD or tics. Furthermore, owing to the two de novo damaging KDM5B mutations in our pCMS cohort, there is significant genetic overlap with ASD probands with the highest stereotypy scores, but not with those with low scores. Exploratory systems analyses of genes harboring de novo damaging mutations in pCMS show these genes to have peak expression in the cortex and striatum during early mid-fetal development. Finally, exploratory gene ontology analysis highlights functional convergence in calcium ion signaling, demethylation, cell signaling, cell cycle and development.

Data availability: the data underlying this study were submitted to Dryad and assigned a unique DOI (doi:10.5061/dryad.rfj6q57d5).

Subjects and assessment measures
This protocol was approved by the Johns Hopkins Medicine Institutional Review Board. Children with primary complex motor stereotypies (pCMS) were recruited from either the Johns Hopkins Pediatric Neurology Movement Disorder Outpatient Clinic (HSS, Director), or via email (singerlab@jhmi.edu). All participants verbally consented and provided signed parental consent. Using standardized forms via telephone, the study coordinator completed a brief screening general history, obtained baseline data about each child's stereotypies, and completed an Autism Spectrum Screening Questionnaire (ASSQ). The presence of stereotypic movements was confirmed, either via direct observation in clinic or by video review (HSS). If the subject passed the screening assessment, additional data were collected on the child and both parents via REDCap, an electronic web-based application for data capture and online questionnaires. The latter included the Stereotypy Severity Scale (Motor and Impairment scores) and comorbidity measures (Multidimensional Anxiety Scale for Children, MASC; ADHD-Rating Scale IV; Conners' Parent Rating Scale, CPRS; Repetitive Behavior Scale-Revised, RBS-R; Children's Yale-Brown Obsessive-Compulsive Scale, CYBOCS; and Social Responsiveness Scale, SRS) (see S1 File).
For this pilot study, we prioritized the study of "simplex" pCMS (children without known family history of affected first- or second-degree relatives) to increase the likelihood of detecting de novo DNA sequence variants. Eligibility required participants to have: (a) confirmed complex motor stereotypies; (b) onset before age 3 years; and (c) temporary suspension of movements by an external stimulus or distraction. Exclusion criteria included: (a) a total score >13 on the ASSQ or a prior autism spectrum disorder diagnosis; (b) historical evidence of intellectual disability; (c) seizures or a known neurological disorder; and (d) the presence of motor/vocal tics. The presence of inattentiveness, hyperactivity, or impulsivity (i.e., ADHD symptoms) and/or obsessive-compulsive behaviors was not exclusionary.

DNA whole-exome sequencing (WES)
DNA was collected from all children meeting eligibility criteria and from their parents, using the Oragene OG-500 collection kit and standard extraction protocols (DNA Genotek, Ottawa, Ontario, Canada). Exome capture and sequencing were performed at the Yale Center for Genome Analysis (YCGA), using the NimbleGen SeqCap EZ Exome V2 capture library (Roche NimbleGen, Madison, WI, USA) and the Illumina HiSeq 2500 platform (Illumina, San Diego, CA, USA). WES data from 853 unaffected parent-child trios (2,559 samples total) were obtained from the Simons Simplex Collection via the NIH Data Archive (https://ndar.nih.gov/edit_collection.html?id=2042). These children and their parents have no evidence of autism spectrum or other neurodevelopmental disorders [48]. The same exome capture and sequencing platforms were used for these control samples.

Fig 1. Overview of variant discovery and data analysis. We performed whole-exome DNA sequencing of 129 pCMS and 853 control parent-child trios. After quality control, 118 pCMS and 750 control trios remained for subsequent analyses. We performed a burden analysis, comparing the rates of de novo single nucleotide (SNVs) and insertion-deletion (indel) DNA variants between cases and controls. Next, we assessed the statistical significance of gene-level recurrence of de novo damaging variants in our pCMS group, identifying one high-confidence risk gene. Using the maximum likelihood estimation (MLE) method, de novo variant simulations, and TADA, we estimated the number of genes contributing to pCMS risk and used this estimate to predict the number of risk genes that will be discovered as more pCMS trios are sequenced. Finally, exploratory gene enrichment analyses were performed, assessing degree of overlap with gene sets harboring de novo damaging variants in other disorders, gene ontology terms, networks, and expression pattern clustering within certain brain regions across development. https://doi.org/10.1371/journal.pone.0291978.g001

Sequence alignment, variant calling, and quality control
Alignment and variant calling of the sequencing reads followed the latest Genome Analysis Toolkit (GATK) [49] Best Practices guidelines, as described previously [46]. Variants were annotated with ANNOVAR [50], using RefSeq hg19 gene definitions. Trios were omitted from downstream analyses if (a) genetic markers were not consistent with expected family relationships; (b) an excessive number of de novo variants was observed; or (c) they were outliers in principal components analysis (see S1 File). De novo variants were called and confirmed as previously described [46] and as detailed in S1 File.

Mutation rate and gene recurrence
Within each cohort, we calculated the rate of de novo DNA mutations per base pair, using methods previously described [46]. We included only those de novo variants present with a frequency of <0.001 (0.1%) in the ExAC v0.3.1 database [51] and compared de novo mutation rates in cases versus controls using a one-tailed rate ratio test (S1 File). Because our cases and controls were sequenced at different times, we took precautions to ensure that batch effects, including differences in sequencing depth and quality, did not influence our comparisons. First, we compared cases and controls that were sequenced on the same sequencing platform and using the same capture library. Second, we considered only "callable" bases, defined as loci with ≥20x sequencing depth in all family members, with base quality ≥20, and map quality ≥30; these thresholds match those required for GATK and de novo variant calling. Third, for each cohort, we summed the "callable" base pairs in every family and used this number as the denominator for de novo rate calculations. In this way, we normalized the de novo rates to guard against any residual differences in sequencing depth or quality, and we compared these normalized rates between cases and controls. This method of comparing different batches of sequencing data has been used in several prior studies [45,46,52,53].
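The normalization and comparison described above can be illustrated with a minimal Python sketch. It assumes that the one-tailed rate ratio test takes the exact conditional form, in which the case count, given the total count, is binomial under the null hypothesis; the exact procedure used in the study is detailed in S1 File and may differ. The function names and all numbers below are hypothetical, and the callable-locus check treats base quality and map quality as single per-locus values, which is also an assumption.

```python
from math import comb

def is_callable(depths, base_quality, map_quality):
    """Callable locus: >=20x depth in every family member, BQ >= 20, MQ >= 30."""
    return min(depths) >= 20 and base_quality >= 20 and map_quality >= 30

def denovo_rate(n_variants, callable_bp):
    """De novo rate per base pair: variant count over summed callable bases."""
    return n_variants / callable_bp

def one_sided_rate_ratio_p(x_case, bp_case, x_ctrl, bp_ctrl):
    """Exact conditional test that the case rate exceeds the control rate.

    Under equal rates, and conditional on the total count n, the case count
    follows Binomial(n, bp_case / (bp_case + bp_ctrl)).
    """
    n = x_case + x_ctrl
    p0 = bp_case / (bp_case + bp_ctrl)
    return sum(comb(n, k) * p0**k * (1 - p0) ** (n - k)
               for k in range(x_case, n + 1))

# Hypothetical counts and callable bases, for illustration only
# (these are NOT the study's data).
x_case, bp_case = 20, 4.0e9
x_ctrl, bp_ctrl = 60, 2.4e10

rr = denovo_rate(x_case, bp_case) / denovo_rate(x_ctrl, bp_ctrl)
p = one_sided_rate_ratio_p(x_case, bp_case, x_ctrl, bp_ctrl)
print(f"rate ratio = {rr:.2f}, one-sided p = {p:.4f}")
```

Because each cohort's denominator is its own callable base count, a cohort with shallower coverage contributes fewer bases rather than an artificially low rate.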
As described in our previous WES studies [45,46,52], we used the Transmitted And De novo Association (TADA-Denovo) test as a statistical method for risk gene discovery based on gene-level recurrence of de novo mutations within the classes of variants that we found enriched in pCMS [54,55]. This test generates random mutational data based on each gene's specified mutation rate to determine null distributions, then calculates a p-value and a false discovery rate (FDR) q-value for each gene using a Bayesian "direct posterior approach." A low q-value represents strong evidence for pCMS association. See S1 File for details.
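For intuition only, the idea of gene-level recurrence can be sketched with a much simpler calculation than TADA-Denovo: a Poisson tail probability for observing at least a given number of de novo variants in a single gene. This is a swapped-in simplification, not the TADA-Denovo model; it omits the Bayesian framework, the simulated null distributions, and the genome-wide FDR correction. The per-gene mutation rate below is a hypothetical placeholder.

```python
from math import exp

def prob_at_least_k(expected, k):
    """P(X >= k) for X ~ Poisson(expected)."""
    term = exp(-expected)  # P(X = 0)
    cdf = 0.0
    for i in range(k):
        cdf += term                 # add P(X = i)
        term *= expected / (i + 1)  # advance to P(X = i + 1)
    return 1.0 - cdf

def recurrence_p(n_trios, per_gene_rate, observed):
    """Chance of seeing >= `observed` de novo variants in one gene.

    Each proband carries two copies of the gene, so 2 * n_trios
    chromosomes are at risk.
    """
    expected = 2 * n_trios * per_gene_rate
    return prob_at_least_k(expected, observed)

# 118 trios as in the post-QC cohort; the per-chromosome, per-generation
# rate of 2e-6 is a hypothetical value chosen for illustration.
p_single_gene = recurrence_p(n_trios=118, per_gene_rate=2e-6, observed=2)
print(f"P(>= 2 de novo variants in this gene) = {p_single_gene:.2e}")
```

A single-gene probability of this kind still has to be weighed against the roughly 20,000 genes tested, which is one reason an FDR-based method is used in the study.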

Estimating the number of pCMS risk genes
As described previously [46,52], we used a maximum likelihood estimation (MLE) method [56] to estimate the number of genes contributing risk to pCMS, based on the observed number of de novo damaging variants in our dataset. See S1 File for details of these calculations.
Next, we used previously described methods [46,52] to predict the likely number of risk genes that will be discovered as additional pCMS parent-child trios are sequenced by WES. These predictions utilize the estimated number of pCMS risk genes along with pCMS de novo mutation rates observed in our study to perform mutation simulations, followed by TADA-Denovo testing (see S1 File).

Gene set overlap
We used DNENRICH [57] (https://statgen.bitbucket.io/dnenrich/index.html) to test whether genes harboring de novo damaging mutations in our pCMS subjects were significantly enriched among genes harboring de novo damaging mutations in several neuropsychiatric disorders, including autism (ASD), schizophrenia (SCZ), Tourette's disorder (TD), obsessive-compulsive disorder (OCD), developmental disorders (DD), intellectual disability (ID), and epileptic encephalopathy (EE). Additionally, we asked whether our pCMS cohort shares genes harboring de novo damaging mutations with ASD probands having high versus low stereotypy scores. To approach this question, we assembled lists of genes harboring de novo damaging mutations in ASD probands from the Simons Simplex Collection (SSC) for whom stereotyped behavior scores (Stereotyped Behavior Score from the RBS-R, Repetitive Behavior Scale-Revised) were available. We looked for overlap between our pCMS cohort and those SSC ASD probands with stereotypy scores in the 90th percentile (high stereotypies) and those scoring in the 10th percentile (low stereotypies). These gene lists are compiled in S4 Table. Further details about gene list curation and DNENRICH methods can be found in S1 File.
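The logic of a permutation-based overlap test can be sketched as follows. This is a simplified illustration of the general approach (randomly re-placing the observed mutations across genes in proportion to a per-gene weight, then counting how often the random overlap reaches the observed overlap); it is not the DNENRICH implementation, which models gene size and sequence context in more detail. The gene names, weights, and gene set are all hypothetical toy data.

```python
import random

def overlap_enrichment(hit_genes, gene_set, gene_weights, n_perm=2000, seed=0):
    """Permutation test for overlap between mutated genes and a gene set.

    hit_genes    : genes carrying an observed de novo damaging variant
    gene_set     : set of genes from another disorder
    gene_weights : dict gene -> relative mutability (e.g. coding length)
    Returns (observed overlap, fold enrichment, empirical one-sided p).
    """
    rng = random.Random(seed)
    genes = list(gene_weights)
    weights = [gene_weights[g] for g in genes]
    observed = sum(g in gene_set for g in hit_genes)
    null_counts = []
    for _ in range(n_perm):
        # Re-place each mutation at random, weighted by gene mutability.
        draw = rng.choices(genes, weights=weights, k=len(hit_genes))
        null_counts.append(sum(g in gene_set for g in draw))
    expected = sum(null_counts) / n_perm
    p_value = (1 + sum(c >= observed for c in null_counts)) / (1 + n_perm)
    fold = observed / expected if expected > 0 else float("inf")
    return observed, fold, p_value

# Toy data: 1,000 equally mutable genes, a 50-gene disorder set, and
# 52 mutated genes of which 10 fall in the set (all hypothetical).
toy_weights = {f"G{i}": 1.0 for i in range(1000)}
toy_set = {f"G{i}" for i in range(50)}
toy_hits = [f"G{i}" for i in range(10)] + [f"G{i}" for i in range(100, 142)]

observed, fold, p_value = overlap_enrichment(toy_hits, toy_set, toy_weights)
print(observed, round(fold, 2), p_value)
```

Adding one to both numerator and denominator of the empirical p-value keeps it from being reported as exactly zero when no permutation reaches the observed overlap.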

Exploratory gene ontology, network, and spatiotemporal analyses
To determine whether genes harboring de novo damaging variants in pCMS may perform similar biological functions, we used the list of pCMS genes harboring de novo damaging mutations to identify overlap with gene ontologies using two tools: Enrichr (https://maayanlab.cloud/Enrichr/) [58] and ConsensusPathDB (http://cpdb.molgen.mpg.de/). We identified gene ontology and pathway terms with an enrichment p-value < 0.05. We also used Ingenuity Pathway Analysis (IPA, Ingenuity Systems, http://www.ingenuity.com/) to identify potential gene networks based on this same gene list with the lowest likelihood of interactions due to chance.
Finally, using this same list of genes harboring de novo damaging variants in pCMS, we searched for possible enrichment of gene expression within certain brain regions across multiple developmental time periods, using data from the Brainspan Atlas of the Developing Human Brain [59,60]. See S1 File.
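As a rough illustration of what an over-representation p-value for a gene ontology term measures, the sketch below computes a one-sided hypergeometric tail and a Benjamini-Hochberg adjustment. It is a generic calculation, not the internal scoring of Enrichr, ConsensusPathDB, or IPA, and the choice of Benjamini-Hochberg as the adjustment is an assumption. The background size, term size, and overlap are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(n_background, n_term, n_list, n_overlap):
    """P(overlap >= n_overlap) when n_list genes are drawn without
    replacement from n_background genes, n_term of which carry the term."""
    total = comb(n_background, n_list)
    upper = min(n_term, n_list)
    tail = sum(comb(n_term, k) * comb(n_background - n_term, n_list - k)
               for k in range(n_overlap, upper + 1))
    return tail / total

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from largest to smallest p-value
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical example: 52 query genes, a 200-gene term, 20,000 background
# genes, and 4 query genes annotated with the term.
p_term = hypergeom_enrichment_p(20000, 200, 52, 4)
print(f"unadjusted p = {p_term:.4f}")
print(benjamini_hochberg([0.01, 0.04, 0.03]))
```

With many terms tested, an unadjusted p < 0.05 is expected by chance for some terms, which is why results that survive adjustment carry more weight than nominal ones.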

Results
We performed WES on 129 pCMS parent-child trios (387 samples total) meeting inclusion criteria. WES data from 853 unaffected control trios, already sequenced from the Simons Simplex Collection, were pooled with our pCMS trios for joint variant calling. After quality control, our sample size for the burden analysis was 118 pCMS and 750 unaffected trios (Table 1).

Increased burden of de novo damaging variants in pCMS
Based on work in other neurodevelopmental disorders, we expected to find an enrichment of de novo likely gene disrupting (LGD) variants (stop codon, frameshift, or canonical splice-site variants) in pCMS probands versus controls. We found a statistically significant increased rate of de novo LGD variants in pCMS cases, consistent with our hypothesis (rate ratio [RR] 1.95, 95% Confidence Interval [CI] 1.04-3.50, p = 0.04). Furthermore, de novo variants predicted to be damaging (LGD plus missense variants with Polyphen2-HDIV score <0.957 and ≥0.453) were also over-represented in pCMS probands (RR 1.37, CI 1.05-1.76, p = 0.03). We did not detect a difference in mutation rates for de novo synonymous variants, or when all de novo variants (coding +/- non-coding) were considered together (Table 1).

KDM5B is a high-confidence candidate risk gene in pCMS
Having established a higher rate of de novo damaging variants in pCMS probands, we next asked whether these variants cluster within specific genes. We identified one gene with more than one predicted damaging de novo variant in unrelated probands: KDM5B (Lysine Demethylase 5B) harbored two different LGD (stopgain) de novo variants in pCMS probands 1029-03 and 1050-03. Using TADA-Denovo [54] and previously established false discovery rate (FDR) thresholds, we found that KDM5B meets statistical criteria for a high-confidence risk gene (q<0.1) in pCMS (S3 Table).

Table 1 footnotes (continued): LGD variants are those altering a stop codon, canonical splice site, and frameshift indels. i "Unknown" variants are not included in the synonymous or nonsynonymous counts. j De novo mutation rates were calculated as the number of variants divided by the number of haploid "callable" bases (see Methods). k The estimated number of de novo mutations per individual was calculated by multiplying the mutation rate by the size of the RefSeq hg19 coding exome (33,

Approximately 184 genes contribute to pCMS risk
Based on the number of observed de novo damaging mutations in pCMS, the MLE method estimated the most likely number of pCMS risk genes to be 184 (S2 Fig). Next, we used this estimate along with de novo mutation rates observed in pCMS trios to predict the likely number of these 184 risk genes that will be discovered in larger pCMS cohorts. Based on these simulations, WES of 500 trios should find 16 probable and 7 high-confidence risk genes; 1000 trios should find 51 probable and 26 high-confidence risk genes (S3 Fig).

Using DNENRICH [57], we found significant overlap between genes harboring de novo damaging variants in pCMS (52 genes after excluding two genes with de novo damaging variants in controls, Table 2, S2 Table) and several gene sets curated from the literature (S4 Table). In particular, our pCMS cohort genes show significant overlap with genes from autism probands with high stereotypy scores (5.8x enrichment, p = 0.047), Tourette's disorder (4.5x enrichment, p = 0.019), and autism spectrum disorder (2.2x enrichment, p = 0.0055-0.0069). There was no significant overlap with OCD, schizophrenia, intellectual disability, developmental disorders, or epileptic encephalopathy (S4 Table).

Exploratory gene ontology, network, and spatiotemporal analyses
Using this same list of 52 genes harboring de novo damaging variants in pCMS (Table 2, S2 Table), we performed exploratory analyses to identify enrichment in biological, cellular, and molecular gene ontology terms. Using two enrichment tools, we identified significant enrichment for calcium ion transport and demethylation (adjusted p-value < 0.05 in either tool). By relaxing the statistical threshold to an unadjusted p < 0.05, we identified enrichment for these same gene ontology terms in results from both tools (S5 Table). Next, we performed an exploratory gene network analysis of these 52 genes using IPA and identified the potential importance of these genes in cell signaling, cellular assembly and organization (S5 Table). Finally, mapping our pCMS de novo damaging variant genes onto the Brainspan Atlas of the Developing Human Brain gene expression data, we see nominal enrichment of gene expression in early mid-fetal cortex and striatum, with a trend toward enrichment in early fetal hippocampus, late mid-fetal cerebellum, and young childhood cerebellum (S5 Table).

Discussion
Like prior studies of ASD, Tourette's disorder, and OCD, the current study demonstrates that the identification of de novo DNA coding variants can identify risk genes and provide a reliable entry point into understanding the biology of stereotypies. We are studying otherwise typically developing children with stereotypies (primary CMS), as this may represent a more genetically homogeneous group of individuals versus those with secondary stereotypies, thereby facilitating genetic discovery and insight into the biology of stereotypies more generally [61,62]. Despite our small cohort size, we identified two de novo nonsense mutations in KDM5B in unrelated probands, and we show that finding two such independent mutations in our cohort is highly unlikely to be a chance occurrence.

KDM5B is a lysine-specific demethylase that removes methyl groups from tri-, di-, and monomethylated lysine 4 on histone 3. KDM5B acts as a transcriptional repressor and has primarily been implicated in the pathogenesis of cancer [63]. More recently, this gene has also been implicated in congenital heart disease risk, embryonic development, DNA repair, adult cognitive function, and muscle strength [64][65][66][67][68][69]. KDM5B has been identified as a high-confidence risk gene in ASD via detection of heterozygous de novo damaging variants in WES studies [47,55] and for developmental disorders more broadly [70]. Individuals reported in the literature with intellectual disability/developmental delay harboring KDM5B mutations often show an autosomal recessive inheritance pattern, including inherited homozygous or compound heterozygous mutations in this gene [71,72], while heterozygous mutations occur more frequently in probands from the Deciphering Developmental Disorders Study [73]. A recent study by Chen et al. [68] identified KDM5B as one of eight genes associated with adult cognitive function through rare protein-truncating and damaging missense variants. Consistent with prior reports, they identified a gene dose effect, whereby individuals with rare heterozygous protein-truncating variants showed higher adult cognitive function measurements compared to those with homozygous damaging mutations [68]. Both this study and another recent report by Huang et al. [69] found a significant association between rare variants in KDM5B and hand grip strength, a phenotype related to muscle function, and one of several additional phenotypes found to be associated with KDM5B [68]. Interestingly, KDM5B has a relatively high rate of protein-truncating variants, approximately 1 in 1,900 subjects in the UK Biobank sample. This is in contrast to most other genes linked to neurodevelopmental phenotypes. Considering the substantial pleiotropic and gene dosage effects reported in several studies to date, it is interesting to see how different mutations and inheritance patterns in this gene can lead to a spectrum of phenotypic outcomes, including ASD, ID/DD, congenital heart disease, adult cognitive function, muscle strength, and now pCMS in childhood. Our team subsequently interviewed the families of the pCMS probands harboring KDM5B mutations and confirmed that there was no evidence for ASD, ID, or congenital heart disease.

Expression of KDM5B is normally restricted to the brain and the testis [74]. Within the brain, high expression levels of KDM5B are seen in the cerebellum (S4 Fig), and expression across all brain regions is highest prenatally (S5 Fig). Consistent with these data, a recent MRI study from our group found volumetric differences in the cerebellum of children with pCMS versus controls, and these changes correlated with Stereotypy Severity Scores [75]. Similarly, cerebellar volume was correlated with stereotyped activity in a deer mouse animal model with repetitive behaviors [75]. The identification of this risk gene in pCMS suggests that chromatin (dys)regulation of KDM5B target genes may be one contributing mechanism underlying stereotypies across diagnostic boundaries. Further studies are warranted to determine the downstream effects of these mutations in the developing brain. These studies are underway in our laboratory.
It is interesting that we find significant overlap between genes harboring de novo damaging mutations in pCMS and those reported in a recent study of Tourette syndrome (S4 Table; 4.5x enrichment, p = 0.019). While we have reported that approximately 25% of pCMS patients have co-existing tics [14], we find this overlap with Tourette syndrome despite excluding pCMS subjects with co-existing motor or vocal tics (see Methods). Enriched expression of pCMS genes in the cortex and striatum (S5 Table) is also consistent with the widely believed involvement of these regions in Tourette syndrome. While OCD was not exclusionary in our pCMS study, we saw no significant gene overlap with OCD. Similarly, we found no significant overlap with SCZ, ID, DD, or EE. We did, however, find significant overlap between pCMS and ASD risk genes (2.2x enrichment, p = 0.006-0.007), despite no evidence of ASD in our subjects.
With regard to stereotypies in ASD, we curated lists of genes harboring de novo damaging mutations in SSC probands with the highest (90th percentile) and lowest (10th percentile) stereotypies, measured by Stereotyped Behavior Scores (SBS) from the RBS-R. KDM5B mutations were found only in SSC probands with high stereotypy scores, yielding 5.8-fold enrichment over expectation (p = 0.047) when compared against our pCMS genes (S4 Table). To further examine the relation between de novo KDM5B mutations and stereotypies in SSC ASD probands, we compared SBS scores in four probands with KDM5B mutations versus 364 age-matched patients without (S6 Fig). Scores were higher in mutation carriers, but this did not reach statistical significance (p = 0.076), likely due to the low number of mutation carriers in this cohort.
In summary, we report an increased burden of de novo damaging heterozygous DNA coding variants in primary complex motor stereotypies. We identified one high-confidence risk gene for pCMS in our pilot cohort and estimate that there are 184 genes conferring risk for this phenotype. Whole-exome sequencing in parent-child pCMS trios provides a reliable way to make progress in gene discovery. Our exploratory analyses of genes harboring de novo damaging mutations in pCMS highlight several gene ontology terms (comprising biological processes, molecular functions, and cellular components), as well as brain regions and developmental time periods. These preliminary findings provide insights into possible etiologies of stereotypies, and this knowledge is a prerequisite for developing new treatments. Further sequencing and mechanistic studies are warranted to understand this phenotype, which has relevance across diagnostic boundaries.

Fig 1
Fig 1 provides an overview of the study methods.


Fig 2. Rates of de novo variants in pCMS cases versus controls. Bar chart comparing the rates of de novo variant classes between pCMS cases (red) and controls (blue). Comparisons are between per base pair (bp) mutation rates, using a one-tailed rate ratio test. Statistically significant comparisons (p<0.05) are marked with asterisks. Error bars show 95% confidence intervals. https://doi.org/10.1371/journal.pone.0291978.g002

Table 1. Distribution of de novo variants in pCMS cases and controls. De novo variant type a; Variant counts; Mutation rate (x10^-8) per bp (95% CI) j; Estimated coding variants per individual (95% CI) k
a Variants were annotated with ANNOVAR, using RefSeq hg19 gene definitions. b "All" includes coding and non-coding variants. c "Coding" variants include synonymous, nonsynonymous, nonframeshift, and those annotated as "unknown" by ANNOVAR. d "Nonsynonymous" variants include all missense and LGD variants. e "Mis-D" are "probably damaging" missense variants with a Polyphen2 (HDIV) score ≥0.957. f "Mis-P" are "possibly damaging" missense variants with a Polyphen2 (HDIV) score <0.957 and ≥0.453. g "Mis-B" are "benign" missense variants with a Polyphen2 (HDIV) score <0.453. Two pCMS missense variants and five control missense variants had no prediction by Polyphen2 but were included in the "All Missense (Mis)" variant type. h Rates were compared using a one-sided rate ratio test. Rate ratios, 95% CI, and p-values that are statistically significant (p<0.05) are underlined and in bold. A rate ratio greater than one indicates a higher rate in pCMS versus controls. Also see Fig 2. Variants are listed in S2 Table.
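The missense categories defined in the footnotes above can be expressed as a small Python function. This is a minimal sketch of the thresholds as stated (Mis-D at a score of 0.957 or higher, Mis-P from 0.453 up to but not including 0.957, Mis-B below 0.453); the function name and the label used for variants without a prediction are hypothetical.

```python
def classify_missense(hdiv_score):
    """Assign a missense variant to a Table 1 category from its
    Polyphen2 (HDIV) score; None means Polyphen2 gave no prediction."""
    if hdiv_score is None:
        # Counted only under "All Missense (Mis)" per the footnotes.
        return "Mis (no prediction)"
    if hdiv_score >= 0.957:
        return "Mis-D"  # probably damaging
    if hdiv_score >= 0.453:
        return "Mis-P"  # possibly damaging
    return "Mis-B"      # benign

# Hypothetical scores, for illustration only.
for score in (0.999, 0.957, 0.60, 0.453, 0.10, None):
    print(score, classify_missense(score))
```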