Enrichment of Conserved Synaptic Activity-Responsive Element in Neuronal Genes Predicts a Coordinated Response of MEF2, CREB and SRF

A unique synaptic activity-responsive element (SARE) sequence, composed of the consensus binding sites for SRF, MEF2 and CREB, is necessary for control of transcriptional upregulation of the Arc gene in response to synaptic activity. We hypothesize that this sequence is a broad mechanism that regulates gene expression in response to synaptic activation and during plasticity, and that analysis of SARE-containing genes could identify molecular mechanisms involved in brain disorders. To search for conserved SARE sequences in the mammalian genome, we used the SynoR in silico tool and found the SARE cluster predominantly in the regulatory regions of genes expressed specifically in the nervous system; most were related to neural development and homeostatic maintenance. Two of these SARE sequences were tested in luciferase assays and proved to promote transcription in response to neuronal activation. Supporting the predictive capacity of our candidate list, up-regulation of several SARE-containing genes in response to neuronal activity was validated using external data and also experimentally, using primary cortical neurons and quantitative real-time RT-PCR. The list of SARE-containing genes includes several linked to mental retardation and cognitive disorders, and is significantly enriched in genes that encode mRNA targeted by FMRP (fragile X mental retardation protein). Our study thus supports the idea that SARE sequences are relevant transcriptional regulatory elements that participate in plasticity. In addition, it offers a comprehensive view of how activity-responsive transcription factors coordinate their actions and increase the selectivity of their targets. Our data suggest that analysis of SARE-containing genes will reveal yet-undescribed pathways of synaptic plasticity and additional candidate genes disrupted in mental disease.


Introduction
Neuronal plasticity and memory formation require changes in gene expression that are triggered by synaptic activity. The nature and organization of this response is the subject of intense research, and a number of transcription factors (TF) have been identified in recent years as necessary for long-term memory consolidation and storage. The Ca2+/cAMP response element-binding protein (CREB) was initially identified as the main interlocutor in the dialogue between the synapse and the nucleus [1]. Later studies revealed the complexity of this process and implicated other transcription factors, including the serum response factor SRF [2], MEF2 [3] and Npas4 [4]. The availability of efficient methods for gene expression analysis has also yielded a large collection of mRNAs, possible targets of these TF, whose expression is modulated by activity and experience [5,6].
The large number of potential targets for these factors makes it difficult to build a model that clarifies how TF establish a coordinated response and regulate transcription for efficient remodeling of neuronal connections. The description of a 100 bp cis-regulatory enhancer element containing a cluster of CREB, MEF2 and SRF binding sites suggests a mechanism that might help to explain the selectivity and coordination of the activity-dependent transcriptional response. This sequence, termed SARE, was identified in the gene that encodes the activity-regulated cytoskeleton-associated protein (Arc) [7]. The SARE sequence is conserved in mammalian Arc regulatory regions; it is sufficient to drive a rapid transcriptional response following synaptic activation and to reproduce, both in vitro and in vivo, the endogenous Arc activation pattern [7]. Despite the novelty and potential impact of this finding, the study restricted the description of this sequence to the Arc gene and did not determine whether SARE sequences appear in the regulatory regions of other genes, or whether this sequence is specific to the nervous system. We studied the broader implication of SARE sequences in the context of the response to neuronal activity, and validated SARE analysis as a means to identify elements of synaptic plasticity. Using the in silico tool SynoR [8], we analyzed the SARE sequences conserved in the mammalian genome. Comparison of mouse and human genome sequences showed enrichment in conserved SARE clusters in the regulatory regions of genes that are expressed specifically in neural tissues, that are involved in neural development and homeostatic maintenance, and that encode mRNA targeted by FMRP. These data support the concept that SARE sequences are true transcriptional regulatory elements, responsible for the coordinated response of TF that convey information from the postsynaptic compartment to the nucleus.
These findings might contribute to understanding the genetic causes of mental diseases linked to neuronal plasticity.

Results and Discussion
We used SynoR to study the possible relationship between the SARE regulatory region and genes related to the nervous system [8], specifically those involved in synaptic activity and mental processes. We sought sequence regions containing clusters of the consensus binding sites for CREB, MEF2 and SRF (Fig. 1A) in the human genome, and compared them to the mouse genome to identify conserved sequences. Based on these criteria, we identified 887 genetic regions containing SARE sequences (Table S1 and data deposited in the SynoR tool, ID: s1219104005847). The SARE regions are assigned to the gene(s) of which they form part or to which they are proximal, and are classified as intergenic, intronic, utr (untranslated), cds (coding sequence), or promoter, depending on their position within the gene (Table S1 and Fig. 1B). Control searches for clusters containing combinations of other, unrelated TFBS yielded significantly fewer regions, and these were not enriched in neural biological functions (Fig. 1C and Experimental Procedures). The original SARE sequence of the Arc promoter is not identified in our search because it contains only half of the CREB binding site, and its MEF2 binding sequence shows 2 nucleotide mismatches compared to the consensus [7]. Binding site predictions for individual TF using matrix analysis can be conducted with the Match tool of TRANSFAC. Using this tool, we manually validated the presence of the SARE cluster in more than ten of our candidates. These include ATF3, CUX1, CUX2, FOXP1, FOXP2, HOMER1, IMPDH2, NRG1, NPAS4, NR4A1, PLXNA4 and SEMA6A. Similarly, there might be additional SARE sequences not identified by this search because of the analysis procedure: it first identifies TFBS clusters in the human genome and subsequently searches for homology to this specific sequence in mice.
The analysis showed that SARE clusters were most abundant in intergenic and intronic regions (Fig. 1), potential areas for gene expression control. The list of SARE-containing genes included genes with central roles in the nervous system, such as the NMDA receptor subunit Grin2a (essential for excitatory synapses), Robo2 (with major functions in axon guidance) and Cutl/Cux2 (determinant for cerebral cortex layers II-IV) (examples in Table 1). Classification of the genes containing SARE sequences into GO categories using the ToppFun application (http://toppgene.cchmc.org/, "ToppFun") indicated that the processes potentially affected by SARE regulation are clearly related to the nervous system (Table S2). This analysis yielded 112 significantly enriched GO biological processes, of which 21 (18.75%) were related to neural functions (Table 2). All of these categories are specifically related to nervous system development and maintenance, and many showed significantly greater enrichment than other categories (Table 2 and Table S2). In accordance with our hypothesis, this prevalence of neural functions supports a potentially important, selective action of SARE-mediated mechanisms in the nervous system. Next, SARE-containing genes were grouped into two main categories representing potential distal and proximal regulatory sequences: intergenic; and intragenic, cds, promoter and utr regions (Table S2). GO analysis was performed separately with these two groups. Both groups showed similar enrichment in neural functions; the analysis therefore did not favor either proximal or distal regulatory regions as more relevant to plasticity.
The analysis of genes containing the SARE cluster appeared to be an appropriate approach for identifying mechanisms of homeostasis, plasticity and activity-dependent remodeling in the nervous system. This study disclosed a large number of genes known to participate in plasticity and synaptogenesis (examples in Table 1); Homer1 is an example in this category. Homer genes encode scaffolding proteins that bind Ca2+ signaling proteins and target them to their correct subcellular localization [9,10]; they are essential for dynamic regulation of the synapse, synaptic plasticity, and spatial learning [11,12]. Coincident expression of experience-triggered Homer and Arc proteins is found in hippocampal and cortical neurons [13], which supports simultaneous activation, as predicted by our analysis. We also identified axonal guidance molecules (Table 1 and Table S1), including PlxnA4 and its ligand Sema6A [14], as molecules potentially regulated by SARE. The semaphorin and plexin receptor families, together with neuropilins, are crucial during nervous system development and homeostasis, and mark the pathway for axon growth [14]. These proteins also control synaptogenesis, axon pruning, and the density and maturation of dendritic spines, and are implicated in a number of developmental, psychiatric and neurodegenerative disorders [15]. As with the axon guidance cues, we found a number of genes that encode cytoskeletal remodeling molecules at the synapse (Table 1). For example, ankyrins link integral membrane proteins to the underlying spectrin-actin cytoskeleton; they have key roles in activities such as cell motility, activation, proliferation, contact, and maintenance of specialized membrane domains. They might be involved in bipolar disorder and other mental alterations [16].
Less anticipated were SARE-containing genes not previously implicated in plasticity or structural maintenance of the synapse; in this category, we found neuronal subtype-specific TF such as Cux1 and Cux2 [17], Zic2 [18] and Sox6 [19,20] (Table 1 and Table S1). Cux TF expression is restricted to neurons of layers II-III and IV of the cerebral cortex. During development, Cux regulate dendritic branching, spine morphogenesis and synapse maturation [17]. Cux expression is maintained through adulthood, but nothing is yet known of their function in mature neurons. Whereas Cux functions could be associated with plasticity at the postsynaptic site [17], Zic2 might act on the presynaptic terminal, as it is associated with axon development in retinal ganglion cells; Sox6, in turn, is described as essential for neuronal differentiation [19,20]. These observations suggest that activity-dependent mechanisms act on pathways specific to neuronal subtypes.
To test the relevance of our findings and the capability of our gene set to predict genes up-regulated upon neuronal activation, we searched for experimental confirmation. Several studies of gene expression changes induced by neuronal activation have been reported and their data made available. Many of them use the GABAergic inhibitor bicuculline to trigger an excitatory neuronal response. We therefore compared the list of SARE-containing genes with the lists of genes whose expression was modified in studies analyzing the in vivo effects of bicuculline infusion into the accessory olfactory bulb [21], in vitro bicuculline treatment of cortical cells [22], and hippocampal neuronal cultures [23]. This allowed us to extend our validation to several neuronal types. In all three cases, the comparison revealed a highly significant overlap between the SARE-containing genes and those up-regulated upon neuronal activation, but no overlap, or one of lower statistical significance, with the genes that are down-regulated (Table 3). Several of the SARE genes, such as Homer, Atf3, Klf6 and Bdnf, are common to two or all three studies and may represent a general pan-neuronal response, while unique ones might represent tissue-specific responses. These significant overlaps validate our results with external independent data. We next took the reverse approach and tested the predictive capability of our study by measuring the expression of genes picked from our list upon neuronal activation. Cells from E18 mouse cortex were dissociated, neurons were cultured, and neuronal activity was triggered using bicuculline [7]. RNA was obtained and transcript expression of twelve SARE-containing genes, including Arc, was analyzed by quantitative real-time RT-PCR (Q-PCR) (Fig. 2A). Up-regulation of the Arc gene demonstrated efficient neuronal activation and, as also expected, the levels of the S-isoform of Homer1 were increased [24,25].
Six more genes showed up-regulation when neurons were activated. Atf3, Impdh2, and Npas4 up-regulation in cortical cells was in agreement with our own analysis of the raw data obtained from gene expression arrays reported by other investigators [22], and further confirmed our comparison with external sources (Table 3). Interestingly, up-regulation of Cux1, Cux2, and PlxnA4, genes not previously suspected to be regulated by activity, again confirmed the predictive capacity of our study. Four genes, Lmo4, Robo1, Robo2 and Klf6, did not show significant changes. This can be ascribed to the near-certain presence of some false positives in our list, to the fact that other splicing variants might be affected, or to the possibility that subsets of genes respond differently depending on the stimulus that triggers the neuronal response.
Next, two of these SARE sequences were cloned upstream of a minimal promoter into vectors containing luciferase reporters to test their ability to activate transcription in response to neuronal activity. Cells from E18 mouse cortex were transfected with reporter constructs, neuronal activity was triggered using bicuculline [7], and luciferase activity was compared to that of control tetrodotoxin (TTX)-treated neurons. These experiments demonstrated that the newly identified SARE sequences replicate the promoter activity of the SARE sequence corresponding to the Arc gene and significantly increase transcription upon depolarization (Fig. 2B).
Our analyses thus point to overlooked pathways that might participate in activity-dependent regulatory mechanisms and, by extension, suggest the identification of genes potentially linked to mental diseases caused by plasticity defects [26,27,28]. This is the case of genes reported as candidates for autism in which we found  Table S1). Validation of our prediction nonetheless required evaluation of true enrichment of genes involved in cognitive dysfunction. Fragile X syndrome (FXS) is a well-characterized form of autism, caused by loss of function of the Fragile X mental retardation protein (FMRP), which regulates local translation and plasticity at pre- and postsynaptic sites [28,35]. Based on a recent extensive list of genes targeted by FMRP, from which the authors extract a stringent set of 842 reliable targets [36], we hypothesized that the list of SARE-regulated genes would be enriched in FMRP targets. Comparison of SARE-containing genes (including those containing SARE clusters at intergenic locations) with the stringent list of FMRP targets resulted in 70 genes common to both (8.5% overlap), an enrichment of biological relevance (p = 4.3909e-13; see Methods) (Table 4). The relationship between the SARE-containing genes and FMRP targets thus strongly supports SARE involvement in activity-dependent regulation. In addition, it suggests that mutations in SARE or SARE-containing genes and pathways can contribute to mental retardation, autism spectrum disorders and other psychiatric diseases.
Correct function of nervous system networks and subnetworks is possible thanks to the extraordinary spatial and temporal coordination of gene expression that is guided by the TF subset expressed by each neuronal population. Our findings suggest that cooperation between CREB, SRF, and MEF2 transcription factors at the SARE region is one of the precisely regulated mechanisms that govern the transcriptional program of activated neurons. This transcriptional cooperation might also apply to other TF to initiate an appropriate, specific transcriptional response in other biological processes. This study also highlights the value of the development and use of computational tools and databases for the comprehensive analysis of biological events. We identified a subset of genes whose transcription is potentially regulated by the SARE cluster after synaptic activation. Most of these genes are directly related to nervous system development and maintenance; several of them are reported at the synapse, some are mutated in human mental disorders, and many form part of FMRP-regulated mechanisms. The identification and functional analysis of SARE-containing genes provided here is thus useful for implicating new candidate genes in plasticity, memory, and mental retardation, and suggests new approaches to the study of mental disorders in which synaptic activity might have a central role.

Genome Sequence Analysis
In silico analyses were performed using SynoR (Identifying synonymous regulatory elements in vertebrate genomes), a tool described by Ovcharenko and Nobrega [8]. SynoR is available at the National Center for Biotechnology Information (NCBI) DCODE.org Comparative Genomics Developments (http://synor.dcode.org/), and performs de novo identification of synonymous regulatory elements (SRE) using known patterns of transcription factor binding sites (TFBS) in active regulatory elements (RE) as seeds for genome scans. The search was performed on the human genome assembly (hg18; July 2007 NCBI Build 36.1) and compared to the mouse genome assembly (mm9; July 2007 NCBI Build 37). ECBR Browser performs whole genome Blastz-based alignments using the TFBS data of the transcription factors under study from the TRANSFAC Professional database. The TFBS studied were those of CREB (CREB_01, CREBP1_01, CREBP1CJUN_01, CREB_02, CREB_Q2, CREB_Q4, CREBP1_Q2, CREB_Q3, CREB_Q2_01, CREB_Q4_01, CREBATF_Q6), MEF2 (MEF2_01, MEF2_02, MEF2_03, MEF2_04, MEF2_Q6_01) and SRF (SRF_01, SRF_Q6, SRF_C, SRF_Q4, SRF_Q5_01, SRF_Q5_02). The maximum distance between two adjacent TFBS was set at 125 base pairs. Random relative position of TFBS was allowed. To test for the significance of the results, we performed analyses of several other TFBS combinations. A search for clusters combining TFBS for ZIC2, BRCA and PAX1 yielded only 1 cluster identified by SynoR, and the combination of ACAAT, Bach1 and Lbp1 gave 50 clusters. A combination of TF related to the nervous system (EGR1, MEF2 and NF-kB) resulted in 244 clusters. Substitution of any of the TFBS from our particular search of MEF2, CREB and SRF significantly decreased the number of identified clusters. For example, replacing CREB with ZIC2 (i.e. a search for MEF2, SRF and ZIC2) gave 142 clusters, compared to 842.
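The cluster criterion described above (a site for each of the three factors, at most 125 bp between adjacent sites, any relative order) can be illustrated with a minimal sketch. This is not SynoR's implementation: the `Site` tuple, the `find_clusters` function, the toy coordinates, and the choice to measure the gap from the end of one site to the start of the next are all assumptions made for illustration.

```python
from typing import NamedTuple


class Site(NamedTuple):
    factor: str  # hypothetical label, e.g. "CREB", "MEF2" or "SRF"
    start: int   # start coordinate of the predicted binding site
    end: int     # end coordinate (exclusive)


def find_clusters(sites, required=("CREB", "MEF2", "SRF"), max_gap=125):
    """Group sites into runs whose adjacent sites are at most max_gap bp apart,
    and keep the runs that contain every required factor, in any order."""
    needed = set(required)
    clusters, run = [], []
    for site in sorted(sites, key=lambda s: s.start):
        # Assumption: the gap is measured from the furthest end seen so far.
        if run and site.start - max(s.end for s in run) > max_gap:
            if needed <= {s.factor for s in run}:
                clusters.append(run)
            run = []
        run.append(site)
    if run and needed <= {s.factor for s in run}:
        clusters.append(run)
    return clusters


# Toy coordinates, not real genomic positions.
toy_sites = [
    Site("SRF", 100, 110),
    Site("CREB", 150, 158),
    Site("MEF2", 260, 270),
    Site("CREB", 1000, 1008),  # second group lacks an SRF site
    Site("MEF2", 1050, 1060),
]

for cluster in find_clusters(toy_sites):
    print([site.factor for site in cluster], cluster[0].start, cluster[-1].end)
```

In this toy input only the first group of sites qualifies, because the second group has no SRF site; tightening `max_gap` splits the first group and leaves no qualifying cluster.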

Gene Annotation and Analysis of Gene Ontology
The list of SARE genes obtained from SynoR was updated using the ToppFun application from the ToppGene Suite (Cincinnati Children's Hospital Medical Center; http://toppgene.cchmc.org/, "ToppFun") and manually curated. A small number of genes were not annotated in NCBI but appeared annotated in other data

Luciferase Reporter Assays
Mouse genomic sequences containing the SARE regulatory regions corresponding to those identified for the Cux1 and Cux2 genes (see below) were cloned into the pGL4.23 luciferase vector. The SARE sequence for Cux1 is CTGGCCGAATCTGGCTGCCTGGCGTCCTGTGAGTTAATTATAGCTCTGTTAACAGAGCAGGGAACAGGGAACACTTGCAGTGACGGAGAT. The sequence for Cux2 is TTAAATAAAGCTGTCACGACTCTTCCATCAGGAGGGATGGGCTCCAAACATGAGAGTTTCCAGAGCCGTGACTATAACAGGAGTGGAAATTTATCCCTTCTAATTATGTAATTGCATAATTTTAGGTAGCATTGGAAATTATGTAA. E18.5 neuronal cells cultured for 8 days were cotransfected with the corresponding firefly luciferase reporter constructs and an internal control Renilla luciferase plasmid, at a ratio of 4:1, using Lipofectamine 2000 (Invitrogen). Neuronal response was triggered using bicuculline/4-aminopyridine as described above. Control cells were treated with TTX. Luciferase and Renilla activity was measured using the Dual-Luciferase Reporter Assay System (Promega) following the manufacturer's protocol. Relative expression of each reporter construct was determined by normalizing the ratio of reporter activity to the activity in TTX-treated neurons.

reverse ACGCGATCAGCCTGTTTTCT) were tested. The results were normalized as indicated by the parallel amplification of β-actin (forward GGCTGTATTCCCCTCCATCG; reverse CCAGTTGGTAACAATGCCATGT).

Table 3. The number of SARE-containing genes is significantly enriched in genes up-regulated upon bicuculline triggering of neuronal activity. Comparative analysis of the list of SARE-containing genes with three independent studies reporting mRNA changes in gene expression in the accessory olfactory bulb [21], cortical cells [22], and hippocampal neuronal cultures [23]. The table shows the lists of the SARE genes that were found to be up-regulated (upper part) or down-modulated (lower part) in each study. P-values for random coincidence are shown. No significant coincidence was observed when compared to genes that are down-regulated. doi:10.1371/journal.pone.0053848.t003

Table 4. The list of genes identified as containing SARE sequences was compared to a list of 842 reliable FMRP targets. This resulted in an overlap of 70 genes common to both lists. This represents a significant enrichment (p = 4.3909e-13), far from the expected random distribution of coincidences between the genome and the mouse nervous system transcriptome (see Experimental Procedures). doi:10.1371/journal.pone.0053848.t004
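The reporter normalization described for the luciferase assays (firefly signal relative to the co-transfected Renilla control, then expressed relative to TTX-treated neurons) can be sketched as follows. The function names, the averaging over replicate wells, and the readings are hypothetical illustration values, not the authors' analysis script or data from this study.

```python
def firefly_to_renilla(firefly, renilla):
    """Ratio of reporter (firefly) signal to internal control (Renilla) signal."""
    if renilla <= 0:
        raise ValueError("Renilla reading must be positive")
    return firefly / renilla


def relative_activity(stimulated, ttx_control):
    """Mean firefly/Renilla ratio of stimulated wells divided by the mean ratio
    of TTX-treated wells. Each argument is a list of (firefly, renilla) pairs.
    Averaging replicates before dividing is an assumption of this sketch."""
    def mean_ratio(wells):
        ratios = [firefly_to_renilla(f, r) for f, r in wells]
        return sum(ratios) / len(ratios)

    return mean_ratio(stimulated) / mean_ratio(ttx_control)


# Hypothetical luminometer readings (arbitrary units), two wells per condition.
bicuculline_wells = [(6000.0, 1000.0), (9000.0, 1500.0)]
ttx_wells = [(2000.0, 1000.0), (3000.0, 1500.0)]

print(relative_activity(bicuculline_wells, ttx_wells))
```

Dividing by the Renilla signal first corrects for well-to-well differences in transfection efficiency, so the final value reflects the change in reporter activity upon stimulation rather than the amount of plasmid taken up.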

Statistical Analysis
The probability of overlap between the FMRP target gene list and the SARE-containing gene list was based on a binomial function, considering the size of the human genome as 28000 genes, the mouse nervous system transcriptome as 12000 transcripts, and the total number of genes associated with one or more SARE clusters as 827. We calculated the density function that describes the probability of having a number of genes within the transcriptome with the binomial $X = B(n_1 = 827,\ p = 12/27)$. We then obtained the density function $Y$, which describes the probability of having a number of coincidences between both lists, with a new binomial $Y = \sum_{i=0}^{n_1} B\left(n_2 = 842,\ p = \frac{i}{12000}\right) \times X(i)$ ($n_2$ = total number of FMRP targets). This probability distribution yielded a p value of 4.3909e-13 for Y(70) (for 70 coincidences). The same analysis was performed to calculate the significance of the overlap between the experimental datasets of bicuculline-modulated genes and the SARE-containing gene list.
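The two-step binomial calculation described above can be written as a short sketch. It is one reading of the text rather than the authors' code: the function names are hypothetical, Y(70) is taken to be the value of the density at exactly 70 coincidences, and the first-step probability is passed as a parameter, set in the example to the 12/27 given in the text.

```python
from math import exp, lgamma, log, log1p


def binom_pmf(k, n, p):
    """P(K = k) for K ~ Binomial(n, p), evaluated in log space for stability."""
    if k < 0 or k > n:
        return 0.0
    if p <= 0.0:
        return 1.0 if k == 0 else 0.0
    if p >= 1.0:
        return 1.0 if k == n else 0.0
    log_coeff = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_coeff + k * log(p) + (n - k) * log1p(-p))


def overlap_pmf(k, n1, p_in, n2, transcriptome):
    """Y(k) = sum over i of B(k; n2, i / transcriptome) * X(i),
    where X ~ Binomial(n1, p_in) is the number of SARE genes in the transcriptome."""
    return sum(
        binom_pmf(i, n1, p_in) * binom_pmf(k, n2, i / transcriptome)
        for i in range(n1 + 1)
    )


if __name__ == "__main__":
    # Values taken from the text: 827 SARE genes, 842 FMRP targets,
    # a 12000-transcript transcriptome, and 70 observed coincidences.
    y70 = overlap_pmf(70, n1=827, p_in=12 / 27, n2=842, transcriptome=12000)
    print(f"Y(70) = {y70:.4e}")
```

Working in log space with `lgamma` avoids overflow in the binomial coefficients, which become very large for n around 800.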

Supporting Information
Table S1 SARE sequences conserved in human and mouse assigned to genes according to proximity. Using SynoR, we searched for regions containing clusters of the consensus binding sequences for SRF, MEF2 and CREB in the human genome and compared them to the mouse genome to identify conserved sequences. Based on these criteria, we identified 887 genetic regions with conserved SARE sequences that are assigned to the proximal genes: 530 clusters were found in intragenic regions and 357 in intergenic regions (data deposited in the SynoR tool, ID: s1219104005847). An additional tab (OtherTFclusters) shows results from the analysis of other TFBS combinations, with graphics showing the relatively low number of clusters identified and their weaker relation to neuronal functions. (XLSX)