Differential regulation of germ line apoptosis and germ cell differentiation by CPEB family members in C. elegans

Cytoplasmic polyadenylation element binding (CPEB) proteins are evolutionarily conserved RNA-binding proteins that control mRNA polyadenylation and translation. Orthologs in humans and other vertebrates are mainly involved in oogenesis. This is also the case for the C. elegans CPEB family member CPB-3, whereas two further CPEB proteins (CPB-1 and FOG-1) are involved in spermatogenesis. Here we describe the characterisation of a new missense allele of cpb-3 and show that loss of cpb-3 function leads to an increase in physiological germ cell death. To better understand the interaction and effect of C. elegans CPEB proteins on processes such as physiological apoptosis, germ cell differentiation, and regulation of gene expression, we characterised changes in the transcriptome and proteome of C. elegans CPEB mutants. Our results show that, despite their sequence similarities, CPEB family members tend to have distinct overall effects on gene expression (both at the transcript and protein levels). This observation is consistent with the distinct phenotypes observed in the various CPEB family mutants.


Introduction
The central dogma of molecular biology states that genetic information generally flows from DNA through RNA to proteins [1]. In eukaryotic organisms this information transfer is highly regulated at each step of the process, including at the level of post-transcriptional regulation of mRNAs (such as mRNA splicing, transport, localization, stability, and translational activation), which often involves RNA-binding proteins (RBPs) and microRNAs (miRNAs) [2,3].
Cytoplasmic polyadenylation element binding (CPEB) proteins are RBPs that bind to cytoplasmic polyadenylation elements (CPEs) in the 3'-UTR region of mRNAs [4,5]. CPEB proteins act within a large ribonucleoprotein (RNP) complex and are mainly involved in the regulation of polyadenylation [4]. They are also indirectly involved in both translational activation and repression as well as in cell differentiation, division, and senescence [6]. CPEB proteins are characterized by two RNA recognition motifs (RRMs) and a zinc finger (ZnF)-like cysteine-histidine repeat region (C/H domain; composed of C4 and C2H2 motifs) [7]. Both RRMs and the C/H domain are essential for specific RNA binding [8]. CPEB proteins are evolutionarily conserved and present in all metazoans, from invertebrates to humans. Vertebrates often contain four paralogs, whereas invertebrates have been found to contain only one or two [9]. Caenorhabditis elegans is an exception as it contains four CPEB proteins (CPB-1, CPB-2, CPB-3, and FOG-1) [7,10]. All nematode species analysed had four CPEB family members, suggesting an expansion either very early in the nematode lineage or slightly prior to their separation from other lineages [10].
In contrast to CPB-3, CPB-1 and FOG-1 are both required for spermatogenesis, with FOG-1 participating in sperm cell fate determination and CPB-1 contributing to spermatocyte differentiation [7,22]. Little is known about the function of CPB-2, except that its transcript is highly expressed during spermatogenesis [7,23], suggesting that cpb-2, like cpb-1 and fog-1, might also be involved in this process.
Here we describe the isolation and characterisation of a new missense allele of cpb-3. We show that this mutation leads to defects in oogenesis and increased levels of p53-independent apoptosis in the adult hermaphrodite germ line. To better understand the interaction and the effect of C. elegans CPEB proteins on processes such as physiological apoptosis, germ cell differentiation, and regulation of gene expression, we characterised changes in the transcriptome and proteome of fog-1, cpb-2, and cpb-3 mutants. Consistent with the distinct phenotypes caused by loss of the various family members, we find that CPEB family members tend to have distinct overall effects on gene expression (both at the transcript and protein levels).

Results and discussion
Characterisation of the allele op234
The op234 allele was originally isolated in a forward genetic screen designed to identify novel genes that control germ line apoptosis (gla). op234 was originally assigned the novel gene name "gla-1" as it did not map to any previously known cell death gene [24].
Through classical two- and three-factor as well as SNP mapping, we localised the allele op234 to a roughly 46 kb region containing 12 genes on chromosome I (see Materials and methods). One of these 12 genes, B0414.5/cpb-3, gave rise to an increased number of germ cell corpses when inactivated by RNA interference (RNAi). We sequenced the cpb-3 locus from wild-type and op234 mutant worms and detected a G-to-A transition at position 1559 of the spliced gene (position 1724 of the unspliced gene). This transition leads to a missense mutation in the CPB-3 protein, resulting in a non-conservative substitution of an invariant cysteine to tyrosine (C520Y) within the conserved C/H domain (Fig 1A and 1B), which likely leads to a loss of function of this domain. As the molecular nature of the gene affected by op234 is now known, we will henceforth refer to this allele as cpb-3(op234) instead of gla-1(op234).
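The relationship between nucleotide position 1559 and residue 520 can be checked with simple codon arithmetic. The following is a minimal sketch, assuming that position 1 is the first base of the start codon in the spliced coding sequence; the wild-type codon itself is not given in the text, so both cysteine codons of the standard genetic code are tried. The function names are illustrative only.

```python
# Only the standard-genetic-code entries needed for this illustration.
CODON_TABLE = {"TGT": "C", "TGC": "C", "TAT": "Y", "TAC": "Y"}


def locate_in_codon(cds_position):
    """Map a 1-based coding-sequence position to (codon number, 1-based offset in codon)."""
    index = cds_position - 1
    return index // 3 + 1, index % 3 + 1


def mutate_codon(codon, offset, new_base):
    """Return the codon with the base at the 1-based offset replaced by new_base."""
    return codon[:offset - 1] + new_base + codon[offset:]


codon_number, offset = locate_in_codon(1559)
print("codon", codon_number, "offset", offset)

# Assumption: the wild-type codon is one of the two cysteine codons (TGT or TGC).
for cys_codon in ("TGT", "TGC"):
    mutant = mutate_codon(cys_codon, offset, "A")
    print(cys_codon, "->", mutant, ":", CODON_TABLE[cys_codon], "->", CODON_TABLE[mutant])
```

Under this assumption, position 1559 falls on the second base of codon 520, and a G-to-A change at the second base of either cysteine codon yields a tyrosine codon, in agreement with the C520Y substitution.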
To determine the epistatic relationship between cpb-3 and other apoptotic genes, we generated double mutants between cpb-3(op234) and the strong loss-of-function mutation ced-3(n717), which lacks the major C. elegans caspase and is characterised by a complete loss of apoptosis [26]. Virtually no germ cell death was observed in cpb-3(op234); ced-3(n717) animals, demonstrating that op234-induced cell death is apoptotic in nature and that cpb-3 functions upstream of the core apoptotic genes (Fig 1C). Since germ line apoptosis can be induced by both physiological signals and through various stresses, including DNA damage [27], we also performed an epistatic analysis with clk-2 and egl-1 mutations. clk-2(mn159ts) animals are characterised by a complete loss of DNA damage-induced apoptosis, but have normal levels of somatic and physiological germ line apoptosis [28], whereas egl-1(n1084 n3082) worms are defective in both DNA damage-induced and somatic apoptosis but show normal physiological germ line apoptosis [29,30]. We found that both clk-2 and egl-1 failed to suppress the high germ line apoptosis of cpb-3 (Fig 1C). Increased cell corpse number could in principle result either from an increase in the number of cells that die or from a decrease in the clearance efficiency of dead cells, which would lead to their accumulation. Based on our prior results [24], we can clearly exclude the latter possibility. Taken together, these observations thus support our previous, RNAi-based conclusion that loss of cpb-3 function results in increased physiological apoptosis in germ cells [21].
To further understand the role of cpb-3 in oogenesis and germ cell survival, we compared gonads of cpb-3(op234) hermaphrodites with wild type (Fig 2). We observed that in the gonads of older animals, cpb-3(op234) mutants contained fewer mature oocytes (full-sized oocytes which span across the whole gonad diameter) than wild type. This phenotype was further enhanced by inactivation of apoptosis in cpb-3; ced-3 double mutants, leading to a complete loss of mature oocytes in the gonad of older worms (Fig 2H and 2I). Consistent with the above, cpb-3(op234) mutants also showed a reduced brood size, which was further decreased in the absence of apoptosis (Fig 2J). These observations show that cpb-3(op234); ced-3(n717) animals still have oogenic defects even in the absence of apoptosis, suggesting that cpb-3 plays a broader role in oogenesis rather than simply regulating germ cell survival. The synergistic defect in oogenesis observed in the cpb-3(op234); ced-3(n717) double mutant likely arises from the fact that old ced-3 mutants also fail to properly generate mature oocytes [30].
We also analysed germ cell apoptosis in the other CPEB mutants. Unlike cpb-3(op234), the other mutants, cpb-1(tm2821), cpb-2(ok1772), and fog-1(q253ts), showed little to no change in cell corpse numbers (Fig 1D). Double mutants between cpb-3(op234) and cpb-2(ok1772) still had high apoptosis levels, albeit slightly lower than the cpb-3 single mutant (Fig 1D). This is consistent with the previous observation of Hasegawa et al., who failed to find any effect of cpb-2(RNAi) on the germ line differentiation defects that can also be found in cpb-3 [19]. This suggests that these two CPEB proteins might have distinct effects on C. elegans germ line development.

Proteomics and transcriptomics
To better understand the role of CPB-3 in the regulation of oogenesis and spermatogenesis, we characterised the effect of loss of cpb-3 function at the transcriptome and proteome levels. In parallel, we also characterised the changes induced by loss of two other CPEB family members, fog-1 (involved in spermatogenesis) and cpb-2 (function unknown). CPB-1 was not considered due to the absence of a fertile mutant strain.
All four C. elegans CPEB family members are expressed specifically in the germ line, but with distinct temporal expression patterns [31] (Fig 3A-3D). Based on these expression patterns and the known phenotypes of fog-1 and cpb-3 mutants, we selected larval stage 3 (L3) for fog-1(q253ts) and cpb-2(ok1772) animals, and L4 for cpb-3(op234) mutants. Those stages were chosen one stage prior to the full-blown phenotype in order to give us a chance to identify early changes in gene expression that might contribute to the mutant phenotype.
For proteome profiling we used a variation of the stable isotope labelling by amino acids in cell culture (SILAC) method called "spike-in" SILAC [32], where heavy SILAC labelling is used only to produce a reference proteome (in our case mixed stages of wild type; S2 Fig). We quantified between 1840 and 4725 protein groups (proteins sharing the same identified peptides [33]). Between 73 and 192 protein groups were found to be differentially expressed (|FC| > 1.5 and P-value < 0.05) between the various CPEB mutants and wild type at the corresponding developmental stage in three biological replicates (Fig 4A-4C). We did not detect any of the CPEB family members at the protein level. This is not surprising as C. elegans CPEB proteins are of low abundance (between 0.02 and 0.26 ppm based on PaxDb version 4 [34]) and thus unlikely to be detected via shotgun mass spectrometry (MS) after SILAC labelling (SILAC-based MS).
In parallel, we also performed transcriptome profiling of wild-type and CPEB mutant animals by RNA sequencing (RNA-seq) to complement the proteomics analysis (S2 Fig). Between 2840 and 4870 transcripts were found to be differentially expressed (|FC| > 1.5 and BH-adjusted P-value < 0.01) between the various CPEB mutants and wild type at the corresponding developmental stage in three biological replicates (Fig 4D-4F). The correlation between changes in proteome and transcriptome abundance in various CPEB mutants was generally weak (ranging between 0.07 and 0.22; S3 Fig). This is not surprising as CPEBs in other model systems are known to affect translation much more than mRNA stability. However, further factors such as indirect interactions or simply technical or biological noise can also contribute to low correlation.
We observed no increase in the mRNA abundance of the CPEB family members in the various CPEB mutants relative to wild type, with the exception of cpb-2 mutants, where we observed a 2-3 fold increase in the transcript levels of the other three CPEB family members (Fig 3E). This suggests that there are no general compensatory mechanisms between the various CPEB family members.
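The transcript-level cut-offs quoted above (|FC| > 1.5 and BH-adjusted P-value < 0.01) can be illustrated with a small, self-contained sketch. It is not the pipeline used in the study (which relied on edgeR); it only shows how a Benjamini-Hochberg adjustment and a symmetric fold-change threshold on the log2 scale might be applied. The function names and the example values are hypothetical.

```python
import math


def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value to the smallest so that the adjusted
    # values are forced to be monotone in the rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def is_differential(log2_fc, adj_p, fc_cutoff=1.5, p_cutoff=0.01):
    """Apply |FC| > fc_cutoff symmetrically on the log2 scale, plus the P-value cut-off."""
    return abs(log2_fc) > math.log2(fc_cutoff) and adj_p < p_cutoff


# Hypothetical example values, not data from the study.
raw_p = [0.01, 0.04, 0.03, 0.005]
log2_fc = [1.2, -0.3, 0.9, -2.0]
adj_p = bh_adjust(raw_p)
flags = [is_differential(fc, p, p_cutoff=0.05) for fc, p in zip(log2_fc, adj_p)]
print(adj_p, flags)
```

A fold-change cut-off of 1.5 corresponds to roughly 0.58 on the log2 scale, so both up- and down-regulated transcripts are treated symmetrically.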

CPEB proteins in C. elegans tend to have distinct effects on gene expression
As CPEB protein homologs have a high degree of sequence identity in their RNA-binding domains (RRMs and C/H domain), one could expect the C. elegans CPEB family members to bind to similar consensus sequences and thus show a certain degree of overlap in their target mRNAs. However, C. elegans CPEB proteins are also known to have non-redundant functions in germ cell differentiation pathways (with CPB-1 and FOG-1 acting in spermatogenesis; and CPB-3 being required for oogenesis) and germ line apoptosis. Because of these considerations, we analysed to what extent the changes in gene expression at the transcriptome and proteome levels would overlap in the various CPEB mutants. Overall, the three CPEB mutants had few differentially expressed genes in common (much less overlap than predicted by chance; Fig 5), both at the transcriptomics and proteomics levels. For example, only 494 transcripts were differentially expressed in all three mutants, whereas 794 transcripts were expected by chance (Fig 5C).
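The comparison of an observed overlap with the overlap expected by chance can be sketched as a permutation test. The exact permutation scheme is not described in this section, so the sketch below makes explicit assumptions: each mutant's differentially expressed set is replaced by a random sample of the same size from a common gene universe, and the two-sided P-value is based on the distance from the median of the permutation distribution. All names and numbers are hypothetical.

```python
import random


def overlap_permutation_test(universe_size, set_sizes, observed,
                             iterations=10000, seed=0):
    """Compare an observed overlap with overlaps of random sets of matching sizes.

    Returns (median, (2.5th percentile, 97.5th percentile), two-sided P-value).
    """
    rng = random.Random(seed)
    universe = range(universe_size)
    null = []
    for _ in range(iterations):
        # Assumption: random gene sets of the same sizes as the observed ones.
        sets = [set(rng.sample(universe, k)) for k in set_sizes]
        null.append(len(set.intersection(*sets)))
    null.sort()
    median = null[len(null) // 2]
    low = null[int(0.025 * (len(null) - 1))]
    high = null[int(0.975 * (len(null) - 1))]
    # Assumption: two-sided P-value from the distance to the permutation median.
    extreme = sum(1 for x in null if abs(x - median) >= abs(observed - median))
    p_value = (extreme + 1) / (iterations + 1)
    return median, (low, high), p_value


# Hypothetical toy numbers, not the values from the study.
median, interval, p_value = overlap_permutation_test(
    universe_size=1000, set_sizes=(300, 400, 500), observed=20, iterations=2000)
print(median, interval, p_value)
```

An observed overlap well below the permutation interval, as in this toy example, would indicate fewer shared genes than expected by chance.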
Given the role of the CPEB proteins in germ cell differentiation, we next focused on genes that have previously been shown to be specifically enriched during oogenesis or spermatogenesis (oogenic and spermatogenic genes, respectively) [23]. We did not observe any major changes in the protein and transcript expression levels of oogenic and spermatogenic genes (S4 and S5 Figs and S1 Table). This observation is not surprising, as we analysed the mutants one developmental stage prior to the fully expressed phenotype. Nevertheless, consistent with the reported role of cpb-3 in oogenesis, we observed that oogenic transcripts were as a group underexpressed in cpb-3 mutants (S1 Table).
To further explore the differentially expressed genes (both at the transcript and protein levels) in the various CPEB mutants, we performed enrichment or depletion analysis of gene ontology (GO; [35]) terms (S6 and S7 Figs and S2 Table). A small number of GO terms based on differentially expressed transcripts were found to be enriched or depleted both in fog-1 and cpb-2 mutants (more than expected by chance; Fig 6). On the other hand, GO terms based on differentially expressed protein groups showed no significant overlap (S7 Fig).
[...] and protein groups (B) in CPEB mutants relative to wild type. Columns of heat maps were clustered using functions "dist" and "hclust" from the R package "stats" (version 3.2.4) using "euclidean" distance and "complete" method. Blue and red shades represent statistically significant log2-scaled fold changes; grey is used for changes below the absolute fold change cut-off of 1.5 (~0.58 on log2 scale) or above the statistical significance cut-off. Only transcripts and protein groups quantified in all three mutants with differential expression in at least one mutant were considered. (C-D) Overlap between differentially expressed transcripts (C) and protein groups (D) in CPEB mutants. Each set shows (see key) the observed value and the median along with the 2.5 to 97.5 percentile range in parentheses, calculated from random permutation with 10,000 iterations. Green and magenta shades in each set represent the log2-scaled fold changes of the observed value relative to the median of the permutation distribution. Asterisks denote permutation test P-values (P_PT): *P ≤ 0.05; **P ≤ 0.01; ***P ≤ 0.001. For each set the total number of differentially expressed transcripts or protein groups out of the total transcripts or protein groups used in the analysis is shown in parentheses beside set names.
Taken together our observations show that loss of CPEB family members caused distinct changes in expression patterns, consistent with the hypothesis that they play different, specific roles in the C. elegans germ line.

Conclusion
Here, we described a new missense allele of cpb-3. This mutation (op234) leads to various oogenesis defects, similar to those described previously by Hasegawa et al. for the putative null alleles bt17 and tm1746 [19]. Moreover, we showed that loss of cpb-3 function gives rise to higher physiological germ cell apoptosis, likely in response to oogenic defects in cpb-3 mutants.
We also performed transcriptomics and proteomics analyses of the fog-1, cpb-2, and cpb-3 mutants to elucidate the role of CPEB proteins in C. elegans. We found significant changes in transcript and protein abundances in all three CPEB mutants relative to wild type. Further analysis of differentially expressed genes showed that although the CPEB proteins have a high degree of sequence level similarity, they have distinct overall effects on gene expression (both at the transcript and protein levels).
Taken together, our study has helped to further define the role of the CPEB family members in the regulation of germ cell differentiation in C. elegans. Further studies, including the identification of binding sequences and target mRNAs via cross-linking and immunoprecipitation (CLIP), will be required to understand the distinct roles of this gene family in C. elegans.

Isolation and cloning of op234
The op234 allele was independently isolated from a forward genetic screen for mutants with increased germ cell apoptosis, as previously described [38]. The op234 mutant was crossed back to the wild type three times before further analysis.
The op234 allele was mapped close to the middle of chromosome I by two-factor mapping using the marker dpy-5. The position of op234 was further refined by the ability of the mutation to complement the following chromosomal deficiencies: sDf4, qDf16, and mnDf111, and by three-factor mapping using the following marker combinations: unc-11 dpy-5, unc-57 dpy-5, dpy-5 unc-87, and dpy-5 unc-101. These analyses placed op234 closer to dpy-5 than unc-87. SNP mapping further refined the position of op234 to a roughly 46 kb region covered by cosmids B0414 and C32F10, which includes 12 genes. RNAi was performed on all 12 genes in this interval as previously described [38]. We sequenced the PCR amplified product of the B0414.5 locus from wild-type and op234 mutant worms to determine the molecular changes induced by op234.

Apoptotic cell corpse and brood size counting
Between 15 and 20 synchronized hermaphrodites (30 h post L4/adult molt) from different genotypes (Fig 1C and 1D) were used to count germ cell corpses using differential interference contrast (DIC) microscopy as previously described [39]. Brood size analysis was performed as previously described [40].
To perform spike-in SILAC, mixed stages of wild-type worms were labelled with heavy lysine and arginine (heavy-SILAC sample) and used as the reference sample. The heavy-SILAC sample was harvested after two generations (see S8 Fig for the labelling efficiency), and the protein extract was divided into single-use aliquots of 150 μg and stored at -80 °C until further use. Additionally, the wild-type and CPEB mutant worms were labelled with light lysine and arginine (light-SILAC sample), and harvested after two generations at different developmental stages (fog-1 and cpb-2 at L3; cpb-3 at L4; and N2 at L3, L4, and YA) in biological triplicates.

Proteomics
Protein extraction, digestion, and peptide pre-fractionation. Protein extraction was done as previously described [44]. 150 μg of total worm protein from three biological replicates of each light-SILAC sample was mixed with one aliquot (150 μg of total worm protein) of heavy-SILAC sample and digested with trypsin as previously described [44].
Peptide samples were pre-fractionated by hydrophilic interaction liquid chromatography (HILIC) on an Agilent 1200 series HPLC system using a YMC-pack polyamine II column (250 mm × 3 mm ID, particle size 5 μm, pore size 12 nm) at a flow rate of 0.5 ml/min into 26 fractions (pooled to 11 final fractions). The previously described [44] buffer composition and elution gradient profile was used with the total run time reduced to 60 min.
Eluted peptides were directly ionized by electrospray ionization (ESI) and transferred into the Q Exactive orifice using the Digital PicoView (DPV-550; New Objective) nanospray source.
Proteomics data analysis. We used the MaxQuant software (version: 1.5.0.30) [33,45] for identification and quantification of protein groups from SILAC-based MS data. In each MaxQuant run, all raw data files belonging to one sample were analysed as a single "Parameter group" (Group 0), further categorized into biological replicates referred to as "Experiment", where each "Experiment" contains 11 "Fractions". (The first biological replicate of the cpb-2 sample was lost during data acquisition; hence the cpb-2 sample was analysed with only two biological replicates.) The C. elegans protein database wormpep242 (downloaded in April 2014 with 27078 entries) combined with 261 common MS contaminants (yielding a total of 27339 entries) was used for peptide spectrum matching using the Andromeda search engine [46] from MaxQuant. This forward database was concatenated with a decoy database generated by MaxQuant (Decoy mode = Revert, Special AAs = KR, and Include contaminants = checked) prior to the search, to facilitate the calculation of the false discovery rate (FDR) [47]. Protein groups assigned to contaminants were also counted as forward hits for FDR calculations. Default search parameters for the Orbitrap instrument type were used with the following search settings: Fixed modifications = none; Variable modifications = acetylation of the protein N-terminus, deamidation of asparagine and glutamine, and oxidation of methionine; Digestion mode = Specific; Enzyme = Trypsin with one maximum missed cleavage. SILAC labelling pairs (Heavy labels = Arg10 and Lys8; maximum of three labelled amino acids per peptide) were extracted from the isotope patterns with "Re-quantify" and "Match between runs" (with default settings) enabled.
Protein quantification was performed using both unique and razor peptides (Peptides for quantification = unique + razor) with the above-mentioned variable modifications, and including only proteins with at least two SILAC pairs (Min. ratio count = 2). A protein-level FDR of 5% was used to export quantification results.
For each sample, further downstream analysis (based on the proteinGroups.txt file) was done using the R software environment [48]. First, all contaminants and decoy (reversed) protein groups were removed, and then for each sample only protein groups with normalised H/L ratios in at least two biological replicates were considered for calculating the fold change of protein groups between mutant and wild-type samples at the same developmental stage.
Briefly, for each protein group, deconvoluted ratios ("ratio of ratios", i.e. (H/L)_wild type / (H/L)_mutant) for all biological replicates between mutant and wild-type samples at the same developmental stage were calculated using values in the "Ratio H/L normalized" column. Only protein groups with deconvoluted ratios in at least two biological replicates were considered further. These deconvoluted ratios were appropriately scaled to make the median equal to 1 and then transformed to log2 scale. P-values were calculated by performing a one-sample Student's t-test (H_0: μ = 0 and H_1: μ ≠ 0) on log2-scaled deconvoluted ratios, and the average of these ratios represents the average log2-scaled fold change of protein groups (see S9-S11 Figs for reproducibility and distribution of protein groups within replicates and samples, and S3 Table for the list of protein groups quantified in each sample).
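The calculation described above can be sketched in a few lines. Because the heavy reference is shared, the ratio (H/L)_wild type / (H/L)_mutant reduces to the light mutant signal divided by the light wild-type signal, that is, a mutant-versus-wild-type fold change. The sketch below is only an illustration under stated assumptions: the median scaling is applied within each replicate, the inputs are plain dictionaries rather than a proteinGroups.txt table, and the P-value uses the closed-form Student's t distribution for one or two degrees of freedom (two or three replicates). The function names are hypothetical.

```python
import math
from statistics import mean, median, stdev


def log2_deconvoluted(hl_wild_type, hl_mutant):
    """Median-centred log2 ratio-of-ratios for one replicate.

    Both arguments map protein group -> normalised H/L ratio.
    """
    shared = hl_wild_type.keys() & hl_mutant.keys()
    ratios = {g: hl_wild_type[g] / hl_mutant[g] for g in shared}
    centre = median(ratios.values())  # assumption: scaling is done per replicate
    return {g: math.log2(r / centre) for g, r in ratios.items()}


def one_sample_t(values):
    """Two-sided one-sample t-test of H_0: mu = 0 for two or three values."""
    n = len(values)
    if n not in (2, 3):
        raise ValueError("closed-form P-value implemented for n = 2 or 3 only")
    sd = stdev(values)
    if sd == 0:
        raise ValueError("zero variance: t statistic is undefined")
    t = mean(values) / (sd / math.sqrt(n))
    if n == 2:   # 1 degree of freedom (Cauchy distribution)
        p = 1 - 2 * math.atan(abs(t)) / math.pi
    else:        # 2 degrees of freedom
        p = 1 - abs(t) / math.sqrt(2 + t * t)
    return t, p


# Hypothetical log2 deconvoluted ratios of one protein group in three replicates.
t_stat, p_value = one_sample_t([1.0, 1.2, 0.8])
print(round(t_stat, 2), round(p_value, 4))
```

With only two or three replicates the test has very few degrees of freedom, which is one reason a fold-change cut-off is combined with the P-value threshold.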

Transcriptomics
RNA extraction and sequencing. From three biological replicates of each light-SILAC sample (samples from the same batch were used for proteomics), RNA extraction was done using TRIzol reagent (15596-026; Life Technologies) [49] in accordance with the manufacturer's protocol and further purified using the DNA-free kit (AM1906; Ambion) to obtain high-quality RNA for sequencing.
RNA-seq was performed at GATC Biotech (Germany; http://www.gatc-biotech.com) using the "InView™ Transcriptome Explore" package. In brief, a random primed cDNA library generated from each sample was sequenced on an Illumina HiSeq 2500 (run type = single end, read length = 50). For some samples the cDNA library was sequenced more than once to achieve a total of at least 30 million reads.
Transcriptomics data analysis. RNA-seq data was analysed by a count-based approach using R and Bioconductor [50] as previously described [51]. In summary, raw sequencing data files (FASTQ format) were assessed for their quality using the R package "ShortRead" (version 1.22.0) [52] and FastQC (version 0.11.2, Babraham Bioinformatics, http://www.bioinformatics.babraham.ac.uk). After quality control checks (all FASTQ files passed), RNA-seq reads were aligned to the C. elegans reference genome WBcel235 (release-76; downloaded in September 2014 from Ensembl) using the aligner TopHat (version 2.0.9) [53]. For all samples, mapped reads were counted by the htseq-count script from HTSeq (version 0.6.1p1) [54] to generate a count table. Counts from different sequencing runs of the same cDNA library were summed prior to the differential analysis. Finally, differential analysis of counts between mutant and wild-type samples at the same developmental stage was done using the R package "edgeR" (version 3.6.8) [55] (see S12 and S13 Figs for reproducibility and distribution of transcripts within replicates and samples, and S3 Table for the list of transcripts quantified in each sample).

GO analysis
The R package "org.Ce.eg.db" (version 3.2.3) [56] was used to retrieve GO-to-genes annotations, and genes annotated with evidence codes ND, IEA, and NR were removed prior to the analysis. GO analysis was performed individually for biological process (BP), cellular component (CC), and molecular function (MF) ontologies using the R package "topGO" (version 2.22.0) [57] with nodeSize = 10. The set of all quantified protein groups or transcripts was used as the gene universe, and differentially expressed protein groups (|FC| > 1.5 and P-value < 0.05) or transcripts (|FC| > 1.5, BH-adjusted P-value < 0.01, and log2(CPM) > 8) were considered as interesting genes. A custom test statistic function for the two-tailed Fisher's exact test in combination with the "elim" algorithm [58] was used to perform gene-count-based enrichment or depletion tests (see S2 Table for the list of significantly enriched or depleted GO terms in proteomics and transcriptomics datasets).
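The gene-count-based test for a single GO term can be illustrated as follows. This is a minimal sketch of a two-tailed Fisher's exact test built on the hypergeometric distribution; it is not the custom topGO test statistic used in the study and it does not implement the "elim" algorithm. The two-sided P-value here sums the probabilities of all tables that are no more likely than the observed one, which is one common definition and is an assumption on our part. Function names are hypothetical.

```python
from math import comb


def fisher_two_tailed(k, n, K, N):
    """Two-tailed Fisher's exact test for one GO term.

    k: interesting genes annotated with the term
    n: interesting genes in total
    K: genes in the universe annotated with the term
    N: size of the gene universe
    """
    def pmf(x):
        # Hypergeometric probability of seeing x annotated genes among n drawn.
        return comb(K, x) * comb(N - K, n - x) / comb(N, n)

    lowest, highest = max(0, n - (N - K)), min(n, K)
    p_observed = pmf(k)
    # Sum all outcomes that are at most as probable as the observed one.
    total = sum(pmf(x) for x in range(lowest, highest + 1)
                if pmf(x) <= p_observed * (1 + 1e-7))
    return min(1.0, total)


def direction(k, n, K, N):
    """Label a term as enriched or depleted relative to the expected count."""
    expected = n * K / N
    return "enriched" if k > expected else "depleted"


# Hypothetical counts, not values from the study.
print(fisher_two_tailed(8, 10, 10, 20), direction(8, 10, 10, 20))
```

Using a two-tailed test allows the same procedure to flag both over-represented (enriched) and under-represented (depleted) terms, with the direction read from the comparison with the expected count.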
Supporting information
S1 Fig. C. elegans contains four distantly related CPEB homologs. Unrooted phylogenetic tree showing the evolutionary relationship between CPEB orthologs and paralogs across different organisms. Multiple protein sequence alignment was performed using CLUSTAL W [59] via the R package "msa" (version 1.2.1) [60] with default arguments. The tree was calculated (after filtering alignment positions for at least 10% non-gap) using the "dist.alignment" function from the R package "seqinr" (version 3.1-3) [61] and drawn using the neighbor-joining method with the "nj" and "plot.phylo" functions from the R package "ape" (version 3.4) [62]. Numbers on the internal nodes represent the bootstrapping score for 1000 iterations calculated using the "boot.phylo" function from the R package "ape". C. elegans proteins are in blue. The four clades containing the vertebrate CPEB proteins are highlighted. Length of proteins is shown in parentheses beside protein names (aa: amino acids). CPEB proteins from the following organisms are shown: C. elegans [...]
[...] depleted GO terms based on differentially expressed transcripts in CPEB mutants relative to wild type. GO analysis was performed on highly abundant, differentially expressed transcripts (|FC| > 1.5, BH-adjusted P-value < 0.01, and log2(CPM) > 8) using the R package "topGO" (version 2.22.0) [57]. GO terms from BP, CC, and MF ontologies with two-tailed Fisher's exact test P-value < 0.05 in three CPEB mutants are shown here as word clouds using the R package "GOsummaries" (version 2.4.7) [63]. In the word clouds the size of the words is proportional to -log10 of the P-value within one word cloud. Terms are coloured as follows: red for enriched and blue for depleted. (TIF)
S7 Fig. GO analysis of differentially expressed protein groups in CPEB mutants relative to wild type. (A) Word clouds of the significantly enriched or depleted GO terms. GO analysis was performed on differentially expressed protein groups (|FC| > 1.5 and P-value < 0.05) using the R package "topGO" (version 2.22.0) [57]. GO terms from BP, CC, and MF ontologies with two-tailed Fisher's exact test P-value < 0.05 in three CPEB mutants are shown here as word clouds using the R package "GOsummaries" (version 2.4.7) [63]. In the word clouds the size of the words is proportional to -log10 of the P-value within one word cloud. Terms are coloured as follows: red for enriched and blue for depleted. (B-D) Overlap between BP (B), CC (C), and MF (D) ontology terms in CPEB mutants. Each set shows (see key in Fig 5) the observed value and the median along with the 2.5 to 97.5 percentile range in parentheses, calculated from random permutation with 10,000 iterations. None of the overlaps was significant. For each set the total number of significant GO terms out of the total GO terms used in the analysis is shown in parentheses beside set names.