Coupling of autism genes to tissue-wide expression and dysfunction of synapse, calcium signalling and transcriptional regulation

Autism Spectrum Disorder (ASD) is a heterogeneous disorder that is often accompanied with many co-morbidities. Recent genetic studies have identified various pathways from hundreds of candidate risk genes with varying levels of association to ASD. However, it is unknown which pathways are specific to the core symptoms or which are shared by the comorbidities. We hypothesised that critical ASD candidates should appear widely across different scoring systems, and that comorbidity pathways should be constituted by genes expressed in the relevant tissues. We analysed the Simons Foundation for Autism Research Initiative (SFARI) database and four independently published scoring systems and identified 292 overlapping genes. We examined their mRNA expression using the Genotype-Tissue Expression (GTEx) database and validated protein expression levels using the human protein atlas (HPA) dataset. This led to clustering of the overlapping ASD genes into 2 groups; one with 91 genes primarily expressed in the central nervous system (CNS geneset) and another with 201 genes expressed in both CNS and peripheral tissues (CNS+PT geneset). Bioinformatic analyses showed a high enrichment of CNS development and synaptic transmission in the CNS geneset, and an enrichment of synapse, chromatin remodelling, gene regulation and endocrine signalling in the CNS+PT geneset. Calcium signalling and the glutamatergic synapse were found to be highly interconnected among pathways in the combined geneset. Our analyses demonstrate that 2/3 of ASD genes are expressed beyond the brain, which may impact peripheral function and involve in ASD comorbidities, and relevant pathways may be explored for the treatment of ASD comorbidities. mutations in chromatin modelling and transcription factors can contribute to altered developmental trajectory in brain formation, synaptogenesis, and organogenesis. Alterations in synaptic and ion channel genes may lead to perturbed action potentials and imbalance of E/I synapses that can contribute to the core ASD symptoms and CNS comorbidity such as epilepsy and motor functions. However, alterations in cellular, hormonal signalling and gene regulation can contribute to peripheral co- morbidities such as altered facial phenotypes and behaviour. Calcium signalling appears acting as a hub among the CNS and peripheral pathways.

Introduction can be accessed via Bioconductor in R, using the code provided in the supplementary information. The networks from Huri can be accessed from NDEX (http://www.ndexbio.org/#/user/69e7b21d-8981-11ea-aaef-0ac135e8bacf) Information from the scoring systems used can be found in the supplementary information sections of their respective publications. Differentially Expressed genes can be found in supporting information from respective publications and in Geo2r (GSE42133 for Pramparo, GSE28521 for Voineagu). Code is provided for data access to HPA and GeneOverlap and generating figures. Genelists were obtained from pysgenet(http://www.psygenet.org/web/ PsyGeNET/menu), Harmonizome(https:// maayanlab.cloud/Harmonizome/dataset/GWASdb +SNP-Disease+Associations) and the supplementary information from respective publications.

Datasets and shortlisting of the ASD risk factors
Five datasets were chosen as the starting point of this study [28,[31][32][33][34], and high-ranking genes were shortlisted from each dataset with defined criteria ( Table 1, Fig 1). For the Exome Aggregation Consortium (EXAC) with exome sequencing data from~60,000 cases (S1 Table), a high intolerance to mutation of pLi � 0.9 was applied, where pLi indicated the level of intolerance to mutations in a given gene, with many containing loss of function variants. In the Krishnan's geneset (S2 Table) created using human disease databases of GAD, OMIN, HUGE and SFARI in December 2013, along with a brain-specific functional interaction network, a qvalue of �0.05 was used, where q-value was the probability of the gene being an ASD risk  candidate after multiple testing for false positivity. For the Zhang's geneset (S3 Table) made using CNS microarray expression data from six brain regions derived from mice and validated using data from exome sequencing studies [11,14,17,29,36,37], mutations of the human homologues with a (DAMAGE) score of D�0 were shortlisted, where positive D-score was a measure of a mutation's likeliness to be associated with ASD. For the Duda's list (S4 Table) which was created from de novo mutation analysis, protein-protein interaction and phenotype information, the top 10% of genes in the list were selected, as this was used as a cut off in the original publication. Finally, for the SFARI collection, all genes up to January 2019 were included (S5 Table).

Overlap of shortlisted genes with SFARI
After shortlisting of high-ranking genes from each dataset, the Jvenn program was applied to identify the common ASD risk genes among the 5 sources [38]. A list of 519 genes, which were overlapped among 4 of 5 sources, was extracted ( Table 1, Fig 1). Among them, 319 genes appeared in the SFARI database were taken forward for expression and enrichment analyses.

Filtering using gene expression analysis data
Since ASD has a wide array of peripheral co-morbidities, we believe that some ASD genes are expressed in peripheral tissues. To explore this possibility, mRNA expression data were downloaded from the GTEx consortium (v7) in the form of median TPM (Transcripts Per Million) values from all tissues [39]. We excluded low expression genes based on the average of the median (TPM<3) in both CNS and PT groups. Additionally, we removed genes with a SFARI score of 6, which were not likely to be associated with ASD. The expression data was applied to the 319 selected genes and uploaded to Morpheus (https://software.broadinstitute.org/ morpheus) to generate heatmap (with settings for clustering: hierarchical, euclidean distance, linkage method complete, clustering based on columns). Genes with extremely low levels of mRNA expression, at an average TPM�3 in both the brain and peripheral tissues, were excluded from the subsequent analyses (S8 Table).

Overview of protein expression levels
As an additional level of verification, we used HPA data (v19.3) obtained via HPA analyze [40], a Bioconductor program that runs in R, to assess protein expression levels across multiple tissues among the two genesets, and used ggplot2 [41] to visualise the data.

Tissue specific interaction networks
The expression of ASD genes in other tissues indicates that they may interact with other factors in tissue-specific networks. To explore this, we used tissue-specific networks generated from Huri to see if ASD candidate genes were present in other networks, and if they had any interacting partners within these networks.

Functional enrichment analysis of the final geneset
The final list of ASD common risk factors was analyzed through STRING program for pathway analyses, except for SHANK3 that was misidentified as HOMER2 in the current human database of STRING v11. The resulting GO Terms (Biological Processes, Cellular Components, and Molecular Function) and KEGG pathways were downloaded. The same list was also loaded into Cytoscape [42] to identify sub-clusters of genes in interaction network.

Analysis of genesets for co-morbid phenotypes
To examine potential association of the ASD genesets with occurring co-morbidities, an overrepresentation analysis (ORA) was carried out using the tool WebGestalt [43] to assess other co-morbid conditions linking to the ASD genesets. The Human Phenotype Ontology database was used for the analysis [44]. The top 50 terms were used as a cut off to balance between the co-morbidities reported in ASD and to ensure that the final lists are not too broad and overly diluted.

Comparison of shortlisted ASD geneset with ASD expression studies
To examine the utility of the ASD geneset in the literature of ASD gene expression studies, we compared the ASD geneset with DEGs reported from post-mortem brain [45], blood [46][47][48] and GI tissue [46,49], as well as iPSC-derived cell models [50][51][52][53][54][55], to see if any of the ASD genes were significantly up or downregulated. The DEGs were obtained using autism versus control group with FDR (adj p-value) at 0.05, except for Voineagu [56] and Pramparo [47], which we used Geo2R [57] using autism versus controls as groups to obtain DEGS at 0.05 FDR (adj p-value).

Expression of genes across brain development and sex-bias
We used CSEA [58] tool to check for enrichment of our genes for human gene expression across developmental periods and brain tissues in the Brainspan dataset. We also used the genesets from the publications [59,60] to see if any of our shortlisted genes had sex-bias expression in prenatal stages [59], or if there was sex-specific splicing in our geneset [60] using the bioconductor package GeneOverlap [61].

Comparison with psychiatric and peripheral diseases
Our functional analyses of the ASD genesets showed enrichment for processes in areas relating to cardiac function and insulin. In addition, it is known that many ASD risk genetic factors share molecular pathways with other psychiatric conditions. To future explore this, we downloaded genesets for schizophrenia (SCZ), bipolar disorder (BP), and major depressive disorder (MDD) from Psygenet [62]. We also downloaded genelists from Harmonizome database [63,64] for 4 peripheral conditions of arrythmia (ARY), type 1 (T1D) and type 2 (T2D) diabetes based on our results, and inflammatory bowel disease (IBD) based on co-morbidity of gastrointestinal issues with ASD for comparison of ASD risk factors in the current study.

Results
To explore the hypothesis that ASD candidate genes were recurrent across different scoring systems, we selected the SFARI dataset and four other published scoring systems for the current study ( These shortlists were subsequently analysed by Jvenn web tool for overlapping ASD genes [38], and 114 genes were found to be shared by all five shortlists (Fig 2, S6 Table). Considering that some well-known ASD genes such as SHANK3 and CHD8 were excluded from the 114 geneset, we adjusted to include the genes that were overlapped in 4 of the 5 scoring systems, which resulted in a total to 519 ASD risk genes. Further examination of the 519 genes for the presence in the SFARI dataset has narrowed down the list to 319 SFARI genes, which were highly ranked in >3 of 4 other independent studies besides the SFARI dataset (Table 1, Fig 1).

Expression pattern of ASD genes
To assess the gene expression across different tissues, we next examined body-wide expression of the 319 genes using the GTEx dataset containing standardized mRNA expression in units of TPM. This further reduced the ASD genes to 292 genes to filter out low abundance genes with TMP�3 in both CNS and PT (S7 Table).
The expression levels of the 292 genes were presented in the Heatmap, with high expression coloured in red, low expression in green and no expression in grey. Analysis of the 292 genes with GTEx (S8 Table) showed that the ASD genes were clustered into 2 groups (Fig 3); 91 genes were mainly expressed in the CNS with TMP<3 in PT, whereas the remaining 201 genes were ubiquitously expressed in both the CNS and PT.
A similar expression profile for proteins was observed in HPA data (Figs 4 and 5), whereas CNS geneset showed protein expression mostly in the CNS (Fig 4), the CNS+PT geneset showed protein expression in both the CNS and other tissue types ( Fig 5). We also identified that some of the ASD factors in the CNS+PT geneset (S1-S19 Figs) were indeed interacting with other proteins in tissue-specific networks ( Table 2). Interestingly HDAC4 (S6) in the colon and STX1A (S15) in the pituitary gland were found to be tissue-specific genes in these organs.

Enriched neurodevelopment, synaptic function, and ion transport in the CNS geneset
To assess the biological context of ASD factor, the STRING (v 11) program was used to analyse two groups of the ASD factors, respectively [65]. The 91 CNS-specific geneset gave rise to 305 "Biological Processes" (S10 Table), 73 "Cellular Components" (S11 Table) and 62 "Molecular Function" (S12 Table). Go term results revealed an enrichment for neuronal development and synaptic function in the CNS geneset ( Fig 6A).

PLOS ONE
Coupling of autism genes to tissue-wide expression and dysfunction of synapse, signalling and gene regulation GRIA1, GRIN1, GRIN2A, GRIN2B, NOS1, NOS1AP) were the top pathways in the CNS geneset (Table 3). Together, these data suggest that the 91 CNS-specific ASD risk factors are involved in regulation of brain development, E/I balance and calcium signalling, which are closely related to the ASD core features, and to the CNS comorbidity such as epilepsy, intellectual disability and sleeping disorders.
However, the most prominent feature of the CNS+PT ASD geneset was transcription regulation, and this included Nucleus (128/200, FDR = 1.73E-14) in the "Cellular Components",

PLOS ONE
Coupling of autism genes to tissue-wide expression and dysfunction of synapse, signalling and gene regulation

PLOS ONE
Coupling of autism genes to tissue-wide expression and dysfunction of synapse, signalling and gene regulation  Table 3. Key Go terms of the CNS-specific and ubiquitous ASD genesets.   In consistency with this, STRING analysis of the combined 291 ASD genes gave rise to 47 "KEGG pathways" (S18 Table), 677 "Biological Processes" (S19 Table), 149 "Cellular Components" (S20 Table) and 177 "Molecular Function" (S21 Table). These data suggest that the CNS+PT geneset of ASD candidate genes may influence not only the core symptoms via CNS development and synaptic function, but also the comorbidities through dysregulated gene expression and hormonal signalling in the peripheral organs.
To further investigate the interconnectivity among the KEGG pathways (S18 Table), we constructed an interaction matrix with the combined ASD geneset KEGG pathways (S22 Table), similar to a previous analysis [30]. The calcium signalling (27/47), MAPK Signalling (12/47) and cAMP signalling (6/47) were identified as highly interconnected signalling pathways. Pre-synaptically, the genes in calcium signalling also appeared frequently in the other pathways, particularly the genes encoding the pore-forming subunit of voltage-gated calcium channel (CACNA1A, CACNA1B, CACNA1C, CACNA1D), the I3P receptor IPTR1, PLCB and the calmodulin proteins (CAMK2A, CAMK2B) are connected to the neurotransmitter releases PLOS ONE (Table 4). Post-synaptically, Glutamatergic synapse (8/47) had the largest number of interactions with other pathways, which was followed by Dopaminergic synapse (7/47). The NDMAR (GRIN2A, GRIN2B, GRIN1) and AMPA (GRIA1) receptors were also the highly interconnected notes in neural and synaptic, calcium and cAMP signalling pathways. In summary, this overlap analyses demonstrated that the E/I balance, and calcium signalling are the most significant pathways linking to the ASD core symptoms, and transcriptional regulation to the ASD comorbidities.

Enriched epilepsy/seizures in the CNS geneset and congenital abnormalities/developmental delay in CNS+PT geneset
The results from WebGestalt (Fig 7, Supplementary links 1 and 2) showed that epilepsy and seizures were enriched in 19 of the top 50 phenotype terms and movement disorders in 9 of the 50 terms in the CNS geneset. In the CNS+PT set, there was enrichment for behaviour issues (9/50) such as self-injurious (FDR = 1.04E-07), aggressive behaviour (1.04E-11), and congenital abnormalities of the face (25/50 items).

High proportion of the ASD genes were dysregulated in ASD
We compared the ASD geneset with DEGs between ASD and controls in the literature and found that 201 of the 292 genes in our ASD geneset were dysregulated, at least once across multiple ASD gene expression studies (Table 4, S23 Table). Recurrent DEGs such as RELN, FOXP1, GAD1, NRXN1, FOXG1 and CAMK2A were present in multiple studies (S23 Table). These data suggest that not only mutations but also dysregulated expression of the ASD risk genes can be linked to the development of ASD.

PLOS ONE
Coupling of autism genes to tissue-wide expression and dysfunction of synapse, signalling and gene regulation

ASD genes enriched in cortex development and sex-bias brain expression
We compared ASD shortlist with CSEA dataset and found that 280 of the 292 ASD genes were mapped to CSEA, with an enrichment in the cortex across most timepoints (S24 Table), with most significant enrichment in the early-mid fetal stage of brain development at pSI 0.05 (pvalue = 5.427e-16, FDR = 3.256e-14) pSI 0.01 (p-value = 5.560e-07 FDR = 3.336e-05) and pSI 0.001 p-value = 8.912e-04, FDR = 0.053) (Fig 8D). In addition, there was also enrichment of genes in the striatum at the early-mid fetal stage (p-value = 1.872e-08, FDR = 2.808e-07), in the cerebellum at childhood (p-value = 7.576e-07, FDR = 7.576e-06) and adolescence (pvalue = 9.970e-04, FDR = 0.005), in the thalamus in early fetal (p-value = 0.010, 0.043) and neonatal period (p-value 4.333e-04, FDR = 0.003), and in the amygdala from mid-early (pvalue = 0.009, FDR = 0.042) to mid-late (p-value = 0.017, FDR = 0.064) fetal stages. These data suggest that the ASD geneset plays essential role in brain development, especially corticogenesis. To investigate if the ASD genes are sex-related, we compared the ASD geneset with genes which were known to have sex-biased expression in prenatal male and female (S25 Table). Significant correlations were found with genes bias for female cerebellar cortex, dorsolateral

PLOS ONE
Coupling of autism genes to tissue-wide expression and dysfunction of synapse, signalling and gene regulation prefrontal cortex, medial prefrontal cortex, orbital frontal cortex, inferolateral temporal cortex and caudal superior temporal cortex (Fig 8A), and with genes bias for male primary somatosensory cortex, mediodorsal nucleus of thalamus and striatum (Fig 8B). In addition, 34 of the 292 genes were found to have sex-biased gene-splicing in at least one brain region (Fig 8C, S25  Table). These genes are likely to contribute to sex-bias occurrence of the ASD.

ASD genes present in other conditions
We next compared the ASD genes with other neuropsychiatric conditions, including schizophrenia, bipolar and major depression, and peripheral conditions such as arrythmia, inflammatory bowel disease and type 1 and type 2 diabetes (Fig 9, S25 Table), and identified 94 genes which were identified in Pyschgenet database (Fig 9A), and involved in synaptic transmission. The remaining ASD-unique genes were enriched for chromatin organization and gene expression. We also found significant overlaps of the ASD genes with factors associated with psychiatric and peripheral conditions (Fig 9B), including a large overlap with type 2 diabetes (134/

PLOS ONE
Coupling of autism genes to tissue-wide expression and dysfunction of synapse, signalling and gene regulation 292, FDR = 3e-42). These data suggest a common disturbance in neuronal communication with CNS other neuropsychiatric disorders and gene dysregulation with peripheral conditions.

Discussion
It is becoming apparent that ASD genes could influence other organ systems. This is reflected by the many co-morbidities occurring outside the CNS such gastrointestinal issues, metabolic disorders, auto-immune disorders, tuberous sclerosis, attention-deficit hyperactivity disorder, and sensory problems associated motor problems. However, little attention has been given to related organs of major comorbidities. Here we have identified 319 overlapping ASD candidates among the four independent scoring systems and the SFARI database [28, 31-34]. We also introduced gene expression using the GTEx database [39,71], which consists of mRNA data of 53 human tissues from approximately 1000 individuals at the age of 21-70. This resulted in a shortlist of 292 common ASD candidate genes with mRNA expression at TPM �3 transcripts. This also categorized the ASD factors into 2 genesets, the CNS-specific geneset of 91 genes (with a TMP<3 in PT) and the CNS+PT geneset of 201 genes. This was validated at the protein level across these tissues in the human protein atlas (HPA) dataset and Huri, showing that that ASD genes are not only expressed in other organs outside the brain, but also appear to interact with other proteins in tissue-specific networks.
STRING analyses show that the CNS geneset is enriched for nervous development, glutamatergic/GABAergic synapses, and calcium signalling. Phenotype analysis also showed high enrichment for epilepsy and seizures. Both results support the hypothesis that disruption of E/ I balance during CNS development as a major feature of ASD [72], which are related to CNS co-morbidities such as epilepsy occurring in 30% of ASD at severe end of the spectrum.
The expression of 201 ASD candidate genes in CNS+PT suggest that ASD genes may influence not only the CNS but also peripheral systems in the body. This geneset is enriched for nervous development and synapse, as well as for chromatin organisation and gene regulation, which are consistent with previous reports from exome sequencing studies [12,14,16,17,29]. Therefore, the genes involved in chromatin organisation and gene regulation could have an influence in peripheral co-morbidities. Many of the genes in our ASD geneset also show dysregulated expression in multiple studies (Table 4) pertaining to cortical tissue and iPSC derived models, and in the few studies carried on gene expression in the gastrointestinal tract of ASD.
Strong candidate genes such as CHD8, POGZ and DYRK1A were previously reported to be associated with not only autism but also gastrointestinal issues, facial dysmorpisms, visual and feeding problems [73][74][75][76]. Indeed, facial dysmorphism is also a recurrent phenotype that emerges among various subgroups of autistics with known genetic mutations [77][78][79]. Some enrichment of ASD genes reported to overlap with heart development (S10-21) and congenital heart deformation [80,81], and a high rate of ASD diagnosis was reported among children with congenital heart defects [82]. POGZ and ANKRD11 are also present in many tissue interaction networks ( Table 2) which are involved in neural proliferation. The POGZ is a zinc finger protein interacting with the transcription factor SP1 [83], and ANKRD11 is a chromatin regulator, modulating histone acetylation and inhibiting ligand-dependent activation of transcription [84]. ANKRD11 mutations have been associated with diseases with distinctive craniofacial Overlap of ASD genes with other conditions. A) Comparison of the ASD geneset with Pyschgenet (SCZ, BP, MDD) defines a network of 94 overlapping genes (green) and 198 ASD-unique genes (blue). B) matrix of overlaps between ASD/Psychiatric (SCZ, BP, MDD) and peripheral (IBD, ARY. T1D and T2D) conditions. C) Network of ASD genes and overlaps with peripheral (IBD, ARY. T1D and T2D) conditions, with orange circle for non-overlapping genes. IBD-Inflammatory bowel disease, Ary-Arrythmia, T1D -type 1 diabetes, T2D -type 2 diabetes. SCZ-schizophrenia, BP-bipolar disorder, MDD-major depressive disorder. https://doi.org/10.1371/journal.pone.0242773.g009

PLOS ONE
Coupling of autism genes to tissue-wide expression and dysfunction of synapse, signalling and gene regulation features, short stature, skeletal anomalies, global developmental delay, seizures and intellectual disability [85].
The strong enrichment for neuronal processes and functions in tissues beyond the brain is of curious interest, but researchers are starting to explore other aspects of ASD that could have an impact on other organs such as the heart via the sympathetic and parasympathetic nervous systems [86], and the gastrointestinal tract via the enteric nervous system [87]. In fact there is growing evidence that heart rate is affected among Autistics [88][89][90][91], along with evidence that ASD genes could be involved in aspects of gastrointestinal development and function [74,[92][93][94][95][96][97]. There is a potential for more work on how the parts of the brain control heart rate and gastrointestinal, how they are altered in autism, or even if reported autonomic issues in co-morbidities such gastroesophageal reflux [98,99] hold true in the autistic population as well. The expression of proteins such as STX1A, SNAP25 and FOXP1 expressed in endocrine tissues could be of interest in ASD, given how knockouts in these genes can impact the development and function of certain parts of the endocrine system [100][101][102] and how genes involved in neurotransmission could also be involved in secretion of hormones [103].
Another unaddressed question is if the peripheral nervous system and peripheral organs are affected by mutations in addition to the CNS. Amongst the results we found enrichment for neuromuscular and cardiac function in STRING analysis (S9-20). Some animal models such as FOXP1, SHANK3, NOS1 and CHD8 [74,92,94,96,97] have been developed for functional analysis of ASD candidate genes in the gastrointestinal system, which indicates that ASD genes may play an important role in this organ [95], and yet most research have been mainly focused on the brain in both human and animal models. A greater utilisation of the animal models to explore other systems such as the cardiac and gastrointestinal systems would be welcome.
In fact, 88/292 genes in our ASD geneset already have existing genetic mouse models, as well as rescue models for 28/292 genes according to the SFARI database (July 2020) at the time of writing. They are helpful in understanding function of these genes, and may also assist drug development to remediate related pathways, not just in the brain, but also throughout the peripheral nervous system that connects to co-morbidity organs, and even in the peripheral organs themselves.
The interconnectivity analyses from the current study reveal calcium, MAPK and glutamatergic signalling as three highest interconnected pathways, all are also involved with each other based on the interaction matrix. This is in line with a previous publication that ASD factors are converged upon MAPK and calcium signalling [30]. It is worth to note that MAPK signalling is also interlinked with calcium signalling in this study. Ten of the 14 MAPK pathway members, CACNA1A, CACNA1B, CACNA1C, CACNA1D, CACNA1G, CACNA2D1, CACNA2D3, CACNB2, ERBB4 and PRKCB, are overlapped with the calcium signalling (Table 5), and 8 of them are calcium channels. Furthermore, calcium channels appear in 16 top KEGG pathways including glutamatergic, GABAergic, dopaminergic, cholinergic and serotonergic synapses of the ASD genes (Table 5). Our results add to the evidence that calcium and glutamatergic signalling are the significant components in ASD pathways.
Calcium signalling is a highly integral system in the human body and is increasingly shown to be implicated in ASD [104,105]. In neurons, the arrival of the electric current induces Ca 2+ influx via voltage-gated calcium channels, and this triggers exocytosis and neurotransmitter release [106]. The voltage-gated calcium channels are tetramers containing three auxiliary subunits (β, α2δ, γ) and one pore-forming α1 subunit. Eight calcium channels (CACNA1A, CAC-NA1B, CACNA1C, CACNA1D, CACNA1G, CACNA2D1, CACNA2D3, CACNB2) are present in our 291 ASD geneset. Experiments using fibroblasts from monogenic [107] and non-syndromic autistic subjects [108] demonstrate aberrant calcium signalling mediated by I3PR. In Table 5. Converging ASD candidate genes on E/I balance and calcium signalling pathway.

PLOS ONE
Coupling of autism genes to tissue-wide expression and dysfunction of synapse, signalling and gene regulation Remarkably, various calcium signalling members are commonly appeared in other pathways that are perturbed in ASD (Fig 5C-5D, Table 5). For example, 11/14 molecules in Circadian entrainment, 10/14 in MAPK signalling, 7/15 in Wnt signalling, 10/10 in Oxytocin signalling, 9/12 in Aldosterone synthesis and secretion, 8/17 in Retrograde endocannabinoid signalling, 6/11 in insulin secretion, were found in calcium signalling pathway, which could be of great interest given their role in the pancreas [131]. In the top KEGG pathways identified from the CNS geneset (CNS), the CNS+PT geneset and combined (Com) 292 ASD candidates, calcium signalling members (bold) appeared in all other KEGG pathways. Therefore, calcium signalling is likely to play a major role not only in ASD but also in ASD comorbidities.

Limitations
Like other meta-analyses, biases and limitations should be considered in pathway analyses. We first assumed that any significant ASD candidate genes shall appear in 4/5 independent datasets. However, some strong candidate genes, such as FMR1 and PTEN, did not appear in the final 291 gene list. While they have much evidence to support their role in ASD, the genes did not overlap due to differences in ranking criteria of different systems. For example, PTEN was in the top ranking in SFARI, Zhang and EXAC, however it was not shortlisted from Duda's and Krishnan's score systems. Likewise, FMR1 was ranked top in Duda's, Zhang's and SFARI, but did not pass the threshold in EXAC or Krishnan's dataset. The biological influence of FMR1 and PTEN in ASD is very significant. For example, FMR1 is known to target 126/291 ASD candidates in the current study, and PTEN is a target of another strong ASD candidate, CHD8, on our list.
We also filtered out 200 genes (S26 Table), which were overlapped by four independent scoring systems but not existed in the SFARI database. 66 of them are targets of the FMR1, and 36 are targets of CHD8. Some of them like BCL11B, DPF1, ETV1, NFASC, PAK7, PLXNC1, SCN3A, SLC17A7 and SLITRK1 were also dysregulated in cerebral organoids derived from autistics with large head circumference [50], while others (NEDD4L, RICTOR, SLC8A1, CELF2, MLLT3, PPFIA2, EPHA7, LRRC4C and RORB) were identified from very recent single cell sequencing of autism cortical tissue [68]. This suggests they would still be modulated as downstream targets in some cases of ASD.
In the current study we hypothesized that a strong ASD candidate gene should be highly expressed in the brain and/or relative PT with strong comorbidity. However, this assumption could be potentially challenged by the following scenario. (1) If an ASD gene were development-specific but had low abundance in adult post-mortem tissues, it would be filtered out by GTEx dataset derived from donors of 20-79 age; (2) If a gene were highly expressed in a small subset of specific nuclei of the brain, which might appear not abundant in the total brain RNA; (3) If an ASD gene were modulated by ethnic genetic background, as 85.2% of GTEx donors were Caucasian European subjects, 12.7% were African-American, 1.1% were Asian and 0.3% were American-Indian. As for protein expression, the HPA is an evolving database, and genes with no protein expression data now can be updated in future releases. The same goes for cell types in tissue, which will also be incomplete, given the myriad cell types that are present in all organs. The use of single cell sequencing technology and flow cytometry will be useful in addressing the issue [132,133].
It is interesting that many of the ASD genes are found to have sex-bias expression and/or splicing, which may be associated with sex-bias diagnosis of ASD [134][135][136]. We are limited in understanding its biological base by the lack of transcriptomic analysis of ASD genes. Many of the transcriptomic studies have focused on male subjects, or the ratio of female samples are too few to illustrate any sex-specific effects in ASD. It is suggested that future transcriptomic studies should incorporate an even number of male and female subjects in both groups of case and control. Organoid cell lines may also be created from both sexes to make up for this short coming.
It would be desirable if gene expression datasets are available to compare from control and ASD patients at different time points not just in the brain, but also across the entire body. A recent publication [137] has proposed a paediatric cell atlas, which would collate and characterise gene expression at the single cell level across multiple tissues and time points of human development from birth to adulthood. This would be a fantastic initiative if it fully goes ahead, as such a resource would bring great insight into the co-morbidities associated with ASD. Single-cell expression data from cortical tissue of ASD subjects has become available to researchers recently [68], which could be a good starting point to analyse cell-types of interest and explore ASD heterogeneity. The availability of iPSCs from different ASD cases allows to culture and analyse different cell types in both CNS and peripheral organs, which are not easily accessed by conventional methodologies. This can be useful to explore how ASD genes may influence the biological processes during brain development, neuronal function, as well as cells of comorbidity peripheral organs.

Conclusion
By utilizing multiple scoring systems, we have identified recurrent ASD candidate genes, with convergence on multiple pathways and processes involved in ASD (Fig 10). The use of GTEx and HPA data also gives a glimpse into their body-wide expression patterns, which has not been explored previously using ASD gene lists, which we have done so in this study. The bioinformatic analyses of CNS-specific and/or CNS+PT candidate genesets enable us to pinpoint CNS development, E/I balance and calcium signalling as important pathways involved in not just ASD but also brain comorbidity such as epilepsy. The analysis of CNS+PT geneset suggests chromosomal organisation/transcription regulation, calcium-interconnected MARK/ WNT and secretion as major pathways with disruptive behaviour, developmental delay as well From fertilization to full development, mutations in chromatin modelling and transcription factors can contribute to altered developmental trajectory in brain formation, synaptogenesis, and organogenesis. Alterations in synaptic and ion channel genes may lead to perturbed action potentials and imbalance of E/I synapses that can contribute to the core ASD symptoms and CNS comorbidity such as epilepsy and motor functions. However, alterations in cellular, hormonal signalling and gene regulation can contribute to peripheral comorbidities such as altered facial phenotypes and behaviour. Calcium signalling appears acting as a hub among the CNS and peripheral pathways.
https://doi.org/10.1371/journal.pone.0242773.g010 as congenital abnormalities. Calcium signalling is highly interconnected amongst pathways, which could be informative in exploring complications and co-morbidities associated with ASD where calcium signalling could be involved, especially those subsets of autistic individuals who harbour mutations in these genes that can result in channelopathies.
Supporting information S1 Fig. ASD genes are present in tissue-specific interaction networks. Yellow nodes highlight ASD genes, green are tissue-specific genes, red edges are interactions between ASD genes and other partners. The graphs are in order of appearance; Adrenal Gland, Adipose subcutaneous, Artery aorta, Artery coronary, Artery tibial, Colon sigmoid, Colon transverse, Esophagusmucosa, Esophagus-muscularis, Heart-left atrial appendage, Heart-left ventricle, Kidney, Lung Yellow nodes highlight ASD genes, green are tissue-specific genes, red edges are interactions between ASD genes and other partners. The graphs are in order of appearance; Adrenal Gland, Adipose subcutaneous, Artery aorta, Artery coronary, Artery tibial, Colon sigmoid, Colon transverse, Esophagusmucosa, Esophagus-muscularis, Heart-left atrial appendage, Heart-left ventricle, Kidney, Lung, Muscle-skeletal, Pancreas, Pituitary, Small Intestine-terminal ileum, Spleen, Stomach. (TIF) S18 Fig. ASD genes are present in tissue-specific interaction networks. Yellow nodes highlight ASD genes, green are tissue-specific genes, red edges are interactions between ASD genes and other partners. The graphs are in order of appearance; Adrenal Gland, Adipose subcutaneous, Artery aorta, Artery coronary, Artery tibial, Colon sigmoid, Colon transverse, Esophagusmucosa, Esophagus-muscularis, Heart-left atrial appendage, Heart-left ventricle, Kidney, Lung, Muscle-skeletal, Pancreas, Pituitary, Small Intestine-terminal ileum, Spleen, Stomach. (TIF) S19 Fig. ASD genes are present in tissue-specific interaction networks. Yellow nodes highlight ASD genes, green are tissue-specific genes, red edges are interactions between ASD genes and other partners. The graphs are in order of appearance; Adrenal Gland, Adipose subcutaneous, Artery aorta, Artery coronary, Artery tibial, Colon sigmoid, Colon transverse, Esophagus-mucosa, Esophagus-muscularis, Heart-left atrial appendage, Heart-left ventricle, Kidney, Lung, Muscle-skeletal, Pancreas, Pituitary, Small Intestine-terminal ileum, Spleen, Stomach. (TIF) S1  Table. The expression data of the 292 genes, this is found in the supporting excel file. (XLSX) S10 Table. STRING results for the CNS, CNS+PT and all 292 genes, these are found in the supporting excel files. (XLSX) S11 Table. STRING results for the CNS, CNS+PT and all 292 genes, these are found in the supporting excel files. (XLSX) S12 Table. STRING results for the CNS, CNS+PT and all 292 genes, these are found in the supporting excel files. (XLSX) S13 Table. STRING results for the CNS, CNS+PT and all 292 genes, these are found in the supporting excel files. (XLSX) S14 Table. STRING results for the CNS, CNS+PT and all 292 genes, these are found in the supporting excel files. (XLSX) S15 Table. STRING results for the CNS, CNS+PT and all 292 genes, these are found in the supporting excel files. (XLSX) S16 Table. STRING results for the CNS, CNS+PT and all 292 genes, these are found in the supporting excel files. (XLSX) S17 Table. STRING results for the CNS, CNS+PT and all 292 genes, these are found in the supporting excel files. (XLSX) S18 Table. STRING results for the CNS, CNS+PT and all 292 genes, these are found in the supporting excel files. (XLSX) S19 Table. STRING results for the CNS, CNS+PT and all 292 genes, these are found in the supporting excel files. (XLSX) S20  Table. Gene lists for sex-biased expression in male and female prenatal brain, sexbiased splicing, and psychiatric conditions from Psygenet, and peripheral conditions from Harmonizome database. These are found in the supporting excel file.