Characterizing Transcriptional Networks in Male Rainbow Darter (Etheostoma caeruleum) that Regulate Testis Development over a Complete Reproductive Cycle

Intersex is a condition that has been associated with exposure to sewage effluents in male rainbow darter (Etheostoma caeruleum). To better understand changes in the transcriptome that are associated with intersex, we characterized annual changes in the testis transcriptome in wild, unexposed fish. Rainbow darter males were collected from the Grand River (Ontario, Canada) in May (spawning), August (post-spawning), October (recrudescence), January (developing) and March (pre-spawning). Histology was used to determine the proportion of spermatogenic cell types that were present during each period of testicular maturation. Regression analysis determined that the proportion of spermatozoa versus spermatocytes in all stages of development (R2 ≥ 0.58) were inversely related; however this was not the case when males were in the post-spawning period. Gene networks that were specific to the transition from developing to pre-spawning stages included nitric oxide biosynthesis, response to wounding, sperm cell function, and stem cell maintenance. The pre-spawning to spawning transition included gene networks related to amino acid import, glycogenesis, Sertoli cell proliferation, sperm capacitation, and sperm motility. The spawning to post-spawning transition included unique gene networks associated with chromosome condensation, ribosome biogenesis and assembly, and mitotic spindle assembly. Lastly, the transition from post-spawning to recrudescence included gene networks associated with egg activation, epithelial to mesenchymal transition, membrane fluidity, and sperm cell adhesion. Noteworthy was that there were a significant number of gene networks related to immune system function that were differentially expressed throughout reproduction, suggesting that immune network signalling has a prominent role in the male testis. Transcripts in the testis of post-spawning individuals showed patterns of expression that were most different for the majority of transcripts investigated when compared to the other stages. Interestingly, many transcripts associated with female sex differentiation (i.e. esr1, sox9, cdca8 and survivin) were significantly higher in the testis during the post-spawning season compared to other testis stages. At post-spawning, there were higher levels of estrogen and androgen receptors (esr1, esr2, ar) in the testis, while there was a decrease in the levels of sperm associated antigen 1 (spag1) and spermatogenesis associated 4 (spata4) mRNA. Cyp17a was more abundant in the testis of fish in the pre-spawning, spawning, and post-spawning seasons compared to those individuals that were recrudescent while aromatase (cyp19a) did not vary in expression over the year. This study identifies cell process related to testis development in a seasonally spawning species and improves our understanding regarding the molecular signaling events that underlie testicular growth. This is significant because, while there are a number of studies characterizing molecular pathways in the ovary, there are comparatively less describing transcriptomic patterns in the testis in wild fish.


Introduction
The freshwaters of North America are populated by a rich variety of native fishes. Research has been restricted to a few common species, and relatively little attention is given to other groups, including the darters (Family Percidae). Often these lesser-studied species can function as valuable indicators of ecological integrity, and can provide important information about the health status of freshwater systems. Teleost fishes, including darters, have diverse reproductive strategies. As such, fish monitoring programs have grouped species with different reproductive strategies into three major categories: single spawners, multiple spawners with relatively few spawning events, and multiple spawners that show multiple spawning events [1]. Rainbow darter (RBD, Etheostoma caeruleum) is a multiple spawner, and the few spawning events tend to occur in late spring and early summer. RBD is currently used as a sentinel species to monitor sewage effluent impacts in river systems in southern Ontario, Canada [2] and is gaining popularity for environmental monitoring practices.
The main objective of this study was to determine the pathways related to RBD testis development over an annual cycle in RBD. Fish were collected from the Grand River, Ontario, Canada, in an area considered low impact in terms of anthropogenic pollution. A custom second generation RBD microarray was used to determine the major pathways involved in the male reproductive cycle in relation to the relative proportion of sperm types in the testis. In addition, a suite of genes were examined in males over a compete breeding season using real-time PCR to determine optimal sampling times for molecular endpoints in monitoring programs based on variability.

Fish sampling
Adult male RBD were collected from the Grand River, ON, Canada from a site outside the urban areas and away from anthropogenic inputs (43˚30' 17" N, 80˚28' 28" W) using a backpack electrofisher (Smith-Root Model 12-D). All fish captured by electrofishing were held in aerial buckets with water from the river. After species identification, fish that were juveniles or other species were released. For the RBD lethal sampling, fish were anesthetized in 0.1 g/L tricaine methanesulfonate (MS-222), and killed by cervical dislocation. All wild fish were collected in conjunction with the University of Waterloo under the approved animal care AUPP 10-17 by University of Waterloo Office of Research Ethics. RBD were measured for weight (± 0.001 g), length (± 1 mm), gonad weight (± 0.001 g) and liver weight (± 0.001 g). Morphological endpoints were used to calculate condition factor [k = 100 Ã (body weight/length3)], gonadosomatic index [GSI = 100 Ã (gonad weight/body weight)], and liversomatic index [LSI = 100 Ã (liver weight/body weight)]. Gonads were divided in half with one lobe collected and processed for histology (placed in Davidson's solution) and one lobe collected for molecular studies (flash frozen in liquid nitrogen). Five field collections were conducted from May 2011 to March 2012 with the goal being to collect males with distinct stages of testis development (i.e. May: spawning; August: post spawning; October: recrudescence; January: developing and March: pre -spawning). The Grand River Conservation Authority provided water quality data and this is provided in S1 Table. Histological Analysis Testis were excised and examined to determine developmental stage. A lobe of the testis was fixed in Davidson's solution for 7 d and placed in 70% ethanol. The tissues were imbedded with paraffin, and sectioned with a microtome (Leica RM 2155) at 5μm thickness. The tissue sections were placed on the microscope slides, and stained with hematoxylin and eosin. Approximately 15 males per site were randomly selected to examine differences between the reproductive stages. All sections from each male fish were examined in a random fashion. An image at 40x magnification was collected and a 13x17 point grid was overlaid before saving the image (221 points total). Each cell type (spermatogonia, spermatocytes, spermatids, spermatozoa, and others) at each grid intersection was scored, and five random images per fish were analyzed (total of up to 1,105 cells/fish; Northern Eclipse (v8.0) software package (Empix Imaging, 2006)).

Extraction of mRNA and preparation of cDNA
The QIAGEN1 RNeasy Mini Kit (Qiagen, Mississauga, ON, Canada) was used to isolate total RNA from the testis according to the protocol from post-spawning gonadal tissues because they were 1mg. For other fish, total RNA was isolated using TRIzol reagent (RNA Isolation Protocol, Life Technologies, Burlington, ON, Canada). Between 5 to 50 mg of tissue was homogenized with 1 ml of TRIzol reagent using the VWR1 VDI 12 Adaptable Homogenizer as per our methods [14]. All RNA was purified through a QIAGEN1 RNeasy Mini column prior to real-time PCR and microarray analysis, thus every sample was column purified. RNA concentration and A260/A280 ratio (between 1.8 and 2.0) was measured using the NanoDrop™ 2000 (Thermo Scientific, Canada). RNA integrity was evaluated using the 2100 Bioanalyzer (Agilent) with the RNA 6000 Nano kit. The integrity of total RNA, as measure by the RNA Integrity Number (RIN) was between 9.2 and 10 for all samples. RNA was stored in DEPC water at −80˚C prior to microarray and gene expression analysis. DNase treatment of 250 ng RNA was performed with TURBO DNA-free (Ambion, Burlington, ON, Canada) according to the manufacturer's protocol for real-time PCR. Reverse transcription for cDNA synthesis was performed with SuperScript II Reverse Transcriptase (Life Technologies, Canada) using established protocols [15]. The cDNA samples were kept at -20˚C until used for real-time PCR.
To confirm that a single PCR product was amplified, a PCR reaction was performed as previously outlined [15] and amplicons were subjected to electrophoresis on an agarose gel (5%) to visualize the unique band with SYBR safe DNA gel stain and a Low DNA Mass Ladder (Invitrogen). The two step real-time PCR using a CFX96 BioRad instrument began with enzyme activation at 95˚C for 30 s, followed by 39 cycles of 5 s at 95˚C and 30 s at the specific annealing temperature. The reaction was then measured at increasing temperatures in increments of 0.5˚C from 65˚C to 95˚C, every 5 s to generate a dissociation curve. There were 7-9 biological replicates for each reproductive stage, and each sample was analyzed in duplicate. Each gene plate also contained a 6-point standard curve in duplicate. In addition, two no template control (NTCs) and two no reverse transcriptase controls (NRT) were included as negative controls. The target stability value for the 4 control genes was M-value = 1.12 (CV = 0.47). Normalized gene expression was extracted using CFX Manager™ software and gene expression differences determined using the relative ΔΔCq method. All amplicons were verified as correct target genes by Sanger Sequencing at the McGill University and Génome Québec Innovation Centre (Montréal, QB, Canada). MIQE guidelines were carefully considered for real-time PCR analysis [17].

Microarray development
We designed a new RBD microarray platform by selecting annotated genes from a first generation microarray [14], and this became a 8x15K second-generation microarray. Briefly, a total of 6,202 annotated probes were included in duplicate on the 8x15k Agilent platform. Additional positions on the array contained unknown probes that were added randomly to the second-generation array based upon the criterion that the probe showed relatively high expression in the testis (intensity > 7.5). The arrays were printed by Agilent Technologies (Santa Clara, CA, USA). The RBD microarray platform has been deposited into Gene Expression Omnibus (GPL18038, GSE57865).
Gene expression analysis using the RBD oligonucleotide 8x15K microarray platforms Gene expression analysis was performed on fish at five stages: recrudescence (n = 5), developing (n = 6), pre-spawning (n = 7), spawning (n = 7), and post-spawning (n = 6) (n = 31 microarrays). Microarray hybridizations were performed according to the Agilent One-Color Microarray-Based Gene Expression Analysis protocol using Cyanine 3 (Cy3) and 100 ng total RNA per sample was used for the production of cDNA and labeled/amplified cRNA as per the Agilent Low RNA Input Fluorescent Amplification Kit. Labeling methods followed those previously described [18]. Each RBD testis sample showed a specific activity >6.0 pmol Cy3/mL and amounts were adjusted to a final mass of 0.825 μg for 8x15K microarray hybridizations. Fragmentation of the cRNA, hybridization, and slide washes after the 17 h hybridization followed instructions outlined by Agilent (Gene Expression Hybridization Kit). An ozone barrier slide was used to cover microarrays before scanning. Microarrays were scanned at 5 μm with the high density Agilent DNA Microarray Scanner (Agilent Technologies). Agilent Feature Extraction Software (v9.5) scanned arrays with a full dynamic range. The quality of microarray data was evaluated by manual inspection and the quality control report generated by Feature Extraction Software (Agilent Technologies). Each microarray was deemed to be of high quality based on visual inspection.

Bioinformatical analysis
Gene expression data were normalized using Quantile normalization [19]. Data were filtered prior to identification of differentially expressed genes (DEGs). All gene probes below an intensity of 3.5 were assigned a value of 3.5 as this was estimated to be the limit of detection of the microarray analysis (based upon the lower limit of the standard curve for Agilent spike in controls and negative controls across all microarrays). Approximately 5.5% of all spots across all microarrays fell below this intensity. This was a conservative limit for subsequent analyses to identify DEGs. Normalized data were subjected to hierarchical clustering in JMP Genomics to visualize global patterns of gene responses. Hierarchical clustering and k-means cluster analysis involved complete linkage and used normalized expression data. Functional enrichment of gene ontology terms and Principle Component Analysis (PCA) were performed in JMP Genomics (V6.0). Pathway Studio 9.1 (Ariadne, Rockville, MD, USA) and ResNet 9.0 (Mammals) (Nikitin et al., 2003) were used for gene set enrichment analysis (GSEA) and sub-network enrichment analysis (SNEA). These two approaches were employed to gain additional insight into patterns of gene expression in testis development. GSEA is a method widely used in microarray analysis to determine if molecular pathways are enriched for specific annotated cell pathways [20]. SNEA analysis differs from GSEA in that sub-networks are built based upon the relationship of the regulated gene with other genes or proteins in the same biological pathway. Entities are ''connected" to each other based upon relationships that are user-defined (i.e. binding partners, expression targets, protein modification targets, cell process). An algorithm compares the sub-network distribution of the regulated genes to the background distribution of other genes in the characterized interaction network. Thus, data mining of the ResNet 9.0 database yields interactions among molecular and cellular processes.
The GO term analysis was conducted using GOrilla (Gene Ontology enRIchment anaLysis and visuaLizAtion tool). We searched for enriched GO terms in the cluster list of genes compared to the background list of the total genes on the microarray. The algorithm recognized 5,838 genes out of the 11,762 gene terms that were entered. There were 4,316 duplicate genes that were removed (keeping the highest ranking instance of each gene), leaving a total of 1,522 genes.

Statistical Analysis
Morphometric data were tested for equal variance (Levene's test) and normality (Kolmogorov-Smirnov test). Log transformed data met the assumptions to test differences in fish length, body weights, condition factor (k), gonadosomatic index (GSI), and liversomatic index (LSI) using analysis of variance (ANOVA). A Tukey's post hoc test identified individual stage differences. For the histological analysis, the 5 replicate micrographs per individual were pooled together, the data were arcsine transformed, and an ANOVA was used to test for differences among the individual responses across reproductive stages.
A Kruskal-Wallis test was performed on the Cq values of the raw expression data to eliminate any unsuitable reference genes that varied with reproductive stage. Following this, normalized gene expression data were extracted using CFX Manager™ software using a relative ΔΔCq method. The data were tested for equal variances and for normality. Differences among group means were tested by ANOVA (p < 0.05) on the log transformed expression data, followed by Tukey's post hoc tests to identify if there were any differences between stages.
The k-means algorithm was used for the computation of different gene expression trends in the set of 9, 344 unique genes on the array in male gonads across the annual cycle. K-means clustering is an iterative procedure aimed to reduce the variance to a minimum within each cluster. An automated radius K-means was used with a correlation radius for clustering of 0.8. This yielded a total of 183 different patterns of gene expression over the different stages of development, and these were further pooled into 18 patterns of gene expression.

Rainbow darter morphological characterization
All males that were collected were adults, with a minimum length of 4.00 cm and a maximum length of 7.20 cm. Fish from May (6.11 ± 0.09 cm) were larger than males collected in August (5.56 ± 0.10 cm) and October (5.38 ± 0.12 cm) (p = 0.02 and p < 0.001, respectively). In general, the males also collected in May were heavier (3.12 ± 0.17 g) than those collected in other seasons (p < 0.001), with individuals collected in the pre-spawning being an exception (2.45 ± 0.17 g; p = 0.10). This same pattern was also observed with condition factor (k). Males collected in May had the highest k (k = 1.33 ± 0.02) compared to all other seasons. The k of spawning males was different from males collected in the post-spawning (k = 1.18 ± 0.02; p < 0.001), recrudescence (k = 1.13 ± 0.02; p < 0.001) and developing (k = 1.17 ± 0.02; p < 0.001) period. No differences were detected between pre-spawning and spawning males (p = 0.20; Fig 1A).
As expected, the lowest mean GSI occurred when individuals were collected in the postspawning stages (0.08 ± 0.01; p < 0.001) (Fig 1B). Shortly after post-spawning, the testis began to increase in size, reaching a peak GSI = 1.17 ± 0.05 in the pre-spawning and spawning period. The testis do not growing dramatically over the winter, and GSI was not significantly different in individuals collected between recrudescence and the developing stage (1.16 ± 0.09; p = 0.98). GSI increased in individuals at the pre-spawning season to 1.56 ± 0.06, and showed no difference when compared to males collected at the spawning season (1.59 ± 0.05; p = 0.99).
The LSI had a different pattern than GSI over the reproductive cycle ( Fig 1C). Developing and pre-spawning males had the largest LSI (2.26 ± 0.09 and 2.75 ±0.09, respectively) compared to other males collected throughout the year. Males collected in the recrudescence and spawning season followed this had smaller LSI than developing and pre-spawning; finally, the males with the smallest LSI (1.02 ± 0.05) belonged to those at the post-spawning season compared to any other stage (p < 0.001).

Histological analysis
Representative micrographs of the different stages of RBD testis development are shown in Fig  2A as well as a sperm maturation network ( Fig 2B, described further below), and the spermatogenic cell type proportion in Fig 3. In the recrudescence stage, the major spermatogenic cell type present in the testis was spermatocytes (~40%). The cell type proportion between recrudescence and developing stage was not different except for a decrease in the proportion of spermatids (p < 0.001). However, the spermatogenic cell proportions for the next stage, prespawning, was very different compared to the others and the most dominant cell type proportion was spermatocytes (~60%), followed by spermatozoa (~20%). There was no statistical difference in gamete proportions in males between the pre-spawning season and the spawning season. The morphology and gamete proportions were also different in individuals collected in the post-spawning season. The major spermatogenic cell type at this stage of sexual development was spermatogonia (~72%), while the presence of spermatids or spermatogonia were almost completely absent. There was also a significant increase of non-gamete characteristics, primarily due to connective tissue and epithelial cells (~16%, p <0.001).
The strongest relationship detected was between the % spermatocytes and % spermatozoa, which were correlated throughout the reproductive process (Fig 4). Cell types in post-spawning males did not correlate to other gamete stages due to the absence of spermatozoa. The increased proportion of spermatocytes was correlated with the decrease of spermatozoa at recrudescence (r 2 = 0.88), developing (r 2 = 0.93), pre-spawning (r 2 = 0.80) and spawning (r 2 = 0.59) seasons. No relationship was found between GSI and the proportion of any of the spermatogenic cell types (data not shown).

Hierarchical cluster analysis and Principal component analysis (PCA)
Hierarchical clustering of the differentially regulated genes showed that expression patterns were strongly associated to the specific stages of gonad development (Fig 5). Expression profiles from males in pre-spawning and spawning appeared to be most closely related compared to other transcriptomics profiles. Post-spawning fish also showed a transcriptomics profile that was most different from the other stages, and this stage also showed the most differences among stages in terms of cell proportion composition. The analysis of the co-expressed genes (columns) showed 10 different expression clusters (from A to J, Fig 5).
The first component of PCA analysis explained 93.8% of the variability in the expression data. PCA analysis showed the same pattern as that of the cluster analysis; males that were spawning, pre-spawning, and recrudescent showed a gene expression pattern that was most similar (Fig 6), while the developing males had a different distribution pattern, and the postspawning males were most different from all other male stages.

K-means cluster analysis
To identify the major temporal patterns of genes expression over the five stages of development, we performed a K-mean cluster analysis. An automated radius K-mean with a correlation radius for clustering of 0.8 resulted in a total of 183 patterns of gene expression over the different stages of development. Due to the large number of clusters, a manual inspection was applied to collapse the overall number of clusters. The data were reanalyzed by automated Kmeans with a total of 18 clusters that had similar patterns of gene expression or "waves" (S2 Table, Fig 7). The cluster with the highest number of genes was cluster 11 (5,297 transcripts), a cluster that contained a high proportion of genes that showed "no change" in relative expression over testicular development. The clusters that contained genes with the highest expression changes over testis development were 3 and 14. However, the large majority of genes that belonged to those clusters were not annotated, and we were unable to determine the biological processes that underlie those clusters.

GO term and SNEA/GSEA analysis within expression clusters
There were two approaches used to identify unique gene networks within each testis stage transition. We first took a targeted approach to determine what major functions were enriched within the genes lists that belonged to each one of the 18 clusters identified by k-means clustering (Fig 7). The rationale was that these expression waves included genes involved in specific   Each row numbered represents different individual fish within a treatment; each column represents a specific gene. Co-expressed genes were classified in 10 major clusters (from A to J). Blue indicates low mRNA levels (relative) and red indicates high mRNA levels (relative). processes. However, we were not able to perform this analysis for all clusters due to the fact that some clusters had too few genes annotated. Table 1 list the GO terms enriched at each cluster analyzed. Cluster 4, for example, was characterized by transcripts that were increasing  in abundance between the developing and pre-spawning season, followed by a decrease in abundance of the transcripts at the spawning and post-spawning season (Fig 7). This cluster Transcriptional Networks in Male Etheostoma caeruleum over Annual Testis Development had enriched GO terms that were related to lipid pathways. Cluster 10 and 11, which contained genes that showed no change in abundance over testis development, included enriched pathways associated with ribosomal proteins and cell cycle (Tables 1 and 2). Cluster 12 contained transcripts with high expression at the recrudescence, pre-spawning and spawning stages, followed by a decrease at post-spawning. This cluster contained transcripts that had functions related to reproduction.
Sub-network enrichment analyses (SNEA) were conducted for 9 different k-mean clusters ( Table 2) to determine if there were specific biological entities enriched in each group of genes included within a cluster. For example, cluster 2 (Fig 7), which showed the lowest expression for transcripts at pre-spawning and spawning, followed by a peak at post-spawning, contained genes that were related to the cell process of apoptosis, cell differentiation, and cell growth ( Table 2). Using the same analysis, clusters 10 and 11 contained genes that showed no change in abundance over the stages of development, and these transcripts tended to be involved in cell cycle and cell functions within the nucleus (Table 2). Genes in clusters 7 and 15 showed similar expression patterns of low abundance at post-spawning (Fig 7) and both clusters had an enrichment of cell processes related to spermatogenesis and meiosis (Table 2). Finally, genes within clusters 13, 17 and 18 shared a common pattern (Fig 7), with low expression at developing stages and a peak at post-spawning (Fig 7). SNEAs of these clusters showed and enrichment of genes related to immune system functions, in addition to lipid, fatty acid, and steroid metabolism. Cluster 18 also contained a subset of genes involved in sperm mobility.
Spermatid developmental pathways were investigated using pathway analysis to determine how genes related to this process changed from stage to stage (Fig 2B). The analysis indicated that there were no significant differences in the overall expression of genes that belonged to the spermatic development pathway from developing to spawning stages. However, there was a significant decrease in the expression of genes in this pathway from the spawning season to the post-spawning season (-1.7 fold change; p = 0.003). In addition, there was a significant increase in this pathway from the post-spawning season to the recrudescence stage (4.2 fold Not all clusters contained sufficient numbers of genes for analysis. 'P-value' is the enrichment p-value. This p-value is not corrected for multiple testing of 1438 GO terms. Enrichment (N, B, n, b) is defined as follows: N-is the total number of genes; B-is the total number of genes associated with a specific GO term; n-is the number of genes in the top of the user's input list or in the target set when appropriate; b-is the number of genes in the intersection.
doi:10.1371/journal.pone.0164722.t001 change; p 0.001). GSEA revealed several cell processes that changed over the reproductive cycle. Fig 8 shows that processes related with phagocytosis increased after the spawning season  towards post-spawning season. In addition, the data suggest that cell processes involved in the immune system increased in fish at the post-spawning stage of development, which is consistent with the SNEA results ( Table 2) of cluster 17 (Fig 7, peak abundance at post-spawning season). A second approach was employed to identify unique gene networks specific to the stage transitions. Venn diagram showed the number of genes that are significantly expressed (p < 0.05) at each stage of testes development (Fig 9) and they are later listed at Table 3. Enriched cell process that were affected when transitioning from Developing to Pre-spawning stages included nitric oxide biosynthesis, response to wounding, sperm cell function, and stem cell maintenance. The Pre-spawning to Spawning transition was characterized by gene networks related to amino acid import glycogenesis, Sertoli cell proliferation, sperm capacitation, and sperm motility. The Spawning to Post Spawning transition contained unique gene networks associated with chromosome condensation, endothelial cell adhesion, nuclear division, nucleotide biosynthesis, ribosome biogenesis and assembly, and mitotic spindle assembly. Lastly, Post Spawning to Recrudescence transition was characterized by gene networks associated with egg activation, epithelial to mesenchymal transition, germinal center formation, membrane fluidity, and sperm cell adhesion. All pathways are given in S3 Table. Targeted qPCR analysis Targeted qPCR analysis was conducted on genes of interest that were determined to be important for male reproduction in a previous study [14]. In total, 18 genes listed in Table 4 were analyzed using real-time PCR. None of our genes of interest were found in the "wave" shape of clusters 13, 17 and 18 (Fig 7). These included steroid hormone receptors, enzymes involved in steroid biosynthesis, genes related with spermatogenesis and female sex differentiation, and genes that are indicative of stress and biomarkers of exposure to endocrine disrupting compounds.
The expression of esr1 did not change in individuals collected from the recrudescence stage to the spawning stage; however, there was a significant peak in esr1 expression at the post-  Table 3.  spawning season (p = 0.004). The same pattern of expression was observed for both esrb and ar.
Since spermatogenesis is highly controlled by steroid hormones, there was a focus in this study on the expression of genes involved in steroidogenesis. Cyp17a was more abundant during the pre-spawning, spawning, and post-spawning seasons, and decreased significantly in individuals collected during recrudescence. Lastly, cyp19a, the enzyme responsible for the biosynthesis of 17β-estradiol from testosterone, did not show any significant changes over the different stages of testis development (p = 0.086).
We compared qPCR analysis with the K-means cluster analysis to determine if the qPCR data showed the same changes in abundance as those data collected by the microarray analysis, and this was indeed the case. For example, Cluster 7 (Fig 7) contained the expression pattern of spata4, and cluster 12 (Fig 7) contained the expression pattern of spag (sperm associated antigen 1). Spag and spata4 (spermatogenesis associated 4) showed similar expression patterns over testis development, and each transcript was abundant in testes at recrudescence, prespawning and spawning season, and was decreased in individuals collected during the developing (winter time) and post-spawning seasons (Table 4). Spata4 was included in cluster 7 (Fig 7) following the k-mean cluster analysis, and there was a significant decrease in mRNA abundance during developing and post-spawning stages (p < 0.001). Spag belonged to a different cluster 12 (Fig 7), but there was also a significant decrease in abundance during the postspawning season. Dmrt1 on the other hand, showed high expression during the post-spawning season, with a decrease in expression in the recrudescence season (p = 0.001). Genes related to oogenesis were also investigated in the testes over a breeding season because there is evidence that these genes are increased during intersex [14]. Female transcription factors sox9 and foxl2 were examined in the testis. Sox9 and foxl2 were below the detection limits of the microarray data, so we could not validate the microarray data with these genes. Interestingly, qPCR data  error for esr1, esrb, ar, cyp17α, cyp19α, spag, spata4, dmrt1, sox9, foxl2, hps90, hsp1, hsp70,  nanos3, cdca8, surviving, fabp, and zp analyzed using real-time PCR in male gonads over the different stages of testis development. Differences among treatment groups (p < 0.01) for each gene are denoted with different letters; outliers were included in the analysis.

Gene
Recrudescence ( showed that sox9 significantly increased at the post-spawning stage compared to the other stages, and no significant differences in the expression of foxl2 over the different stages of development was observed (Table 4). Genes related to general stress responses also showed different patterns of expression over the year in the testis, while other genes were localized to clusters 10 and 11 (Fig 7). Transcripts such as hsp90 showed an increase in mRNA abundance in recrudescent males towards the post-spawning stage (p = 0.001), while hsbp1 was most abundant in the testis during the prespawning and spawning seasons (Table 4). A significant decrease in hsbp1 was detected in males collected in the winter period and the post-spawning season (p < 0.001). No difference in hsp70 abundance over the different stages was detected (p = 0.076).
Cluster 2 contained genes that showed the lowest transcript abundance at pre-spawning and spawning seasons, with a peak at the post-spawning season (Fig 7). This cluster included nanos3, cdca8, and survivin, each of which were evaluated at the individual level with qPCR ( Table 4). The trend in the abundance change was the same for both the microarray and realtime PCR data. Interestingly, cdca8 and survivin interact with each other at the protein level [21] and these two transcripts showed comparable expression patterns over development in the testis.
Fig 7 also depicts some clusters that contain genes with a major peak in expression during the developing season. Transcript levels of fabp showed the highest amount in testis from individuals collected during the developing season, and showed a significant decrease in individuals collected at spawning, post-spawning and in recrudescent individuals (cluster 5; p = 0.001). Lastly, isoforms of zona pellucida (zp) belonged to cluster 5 and cluster 11 (Fig 7). The qPCR data demonstrated that zp increased in the testis of individuals at the developing stage (p = 0.002, Table 4).

Discussion
Rainbow darter spermatogenesis occurs over an annual cycle [22]. In this study, we examined 5 defined stages of the male reproductive cycle at the molecular level in order to better understand the mechanisms underlying testis development. The somatic indexes evaluated (k, GSI and LSI) showed expected patterns as previously reported by Tetreault in 2013 [23], confirming that the wild fish collected for the molecular analysis had the expected somatic changes over the year of monitoring. Noteworthy was that we observed the same pattern of change between the LSI and the gene cluster 4, suggesting that genes in that cluster are related to LSI. GO terms represented in cluster 4 include an enrichment of genes related to lipid pathways and further investigation is warranted to describe this relationship.
A negative relationship was detected between the percent of spermatozoa and the percent of spermatocytes. This was expected since the spermatogenesis cycle involves the developmental process during which a small number of diploid spermatogonia stem cells produce a large number of highly differentiated spermatozoa carrying a haploid, recombined genome [3]. There are two major developmental steps between spermatocytes and spermatozoa; the meiotic phase with the primary and secondary spermatocytes and the spermiogenic phase with the haploid spermatids emerging from meiosis and differentiating-without further proliferationinto motile, flagellated genome vectors, the spermatozoa. Thus, the percent of spermatozoa is dependent upon the percent of spermatocytes. Furthermore, it was hypothesized that there would be a relationship between the spermatogenic cell proportion and GSI (since both endpoints increase with development); however this was not observed in RBD. We were unable to locate any literature on multi-spawning, small-bodied fish that demonstrated the relationship between spermatogenic cells and GSI. In addition, no relationship was found between the proportion of spermatogonia and the proportion of any other cell type, since the mitotic phase can develop multiple generations of cells. A lack of a relationship between spermatogonia and the proportion of any other cell has also been reported in other fishes with an annual reproductive cycle, such as the dojo loach (Misgurnus anguillicaudatus) [24], and greenside darter (Etheostoma blennioides) [25].
Cluster analysis using gene expression grouped the individuals by stages. The analysis revealed that males in recrudescence, spawning and pre-spawning shared expression patterns that were more similar to each other when compared to males at the developing and postspawning stages (Fig 5). This result was confirmed by the PCA analysis (Fig 6). Expression data over a reproductive season are not widely available for other male teleosts fishes. In female largemouth bass [11] it was demonstrated that, based on global gene expression analysis, fish in primary growth stages formed a unique expression clade while individuals in secondary growth stages also formed a distinct clade. The expression profile for atresia, the active reabsorption of oocytes in the ovary that are not ovulated, was most different when compared to all other ovarian stages in the semi-synchronous spawner. These data are consistent with male RBD and, as expected, GSEA revealed that there was an increased prevalence of genes related with phagocytosis at the post-spawning level (Fig 8), when the non-released cells are absorbed, suggesting that this process is highly associated with the stage of development of the cells.
Increased abundance of genes in males collected during recrudescence, pre-spawning, and spawning were located in cluster 7 (as well as clusters 1, 9 and 12). SNEA revealed that genes related to meiosis, spermatogenesis, and spermatid development were preferentially expressed at recrudescence, pre-spawning and spawning stages. Many gene networks related to sperm maturation, for example Sertoli cell proliferation, sperm capacitation, sperm motility, and spermatid development decreased 20-30% in expression when progressing from the developing stage towards post-spawning, suggesting that early on in the testis, there is increased activity to prepare the sperm. Similar pathways were found to be differentially expressed at the same corresponding stage in the fathead minnow (Pimephales promelas) testis proteome [26] and the rainbow trout (Oncorhynchus mykiss) testis transcriptome [27]. The other stage of interest is the one that is dominated by early stages of spermatogenic cell development (i.e. post-spawning) (clusters 17 and 18; Fig 7). Noteworthy was that there were a number of gene networks related to the immune system that were differentially expressed with advanced stages of reproduction, suggesting that these networks play an important role in male reproduction. For example when the testis transitions from developing to pre-spawning stages, gene networks for T-cell and T-helper lymphocyte responses were decreased~20%. However, networks for monocyte recruitment and platelet activation were increased in expression, moving from prespawning to spawning stages. This trend continued into post-spawning, and there was a global up-regulation of gene networks related to immune function in the testis, moving from spawning to post spawning stages. These included B-cell response, immune response, and neurtophil adhesion gene networks among many others (S3 Table). In addition, SNEA identified several cell processes that are activated in the transition from spawning to postspawning stage of development. Those processes are then suppressed as the testis enters recrudescence and begins a new maturation cycle. For example, phagocytosis increased 1.73 fold change (p = 0.001) from spawning to post-spawning; spermatozoa phagocytosis is a process that plays a significant role in testis development, mainly during the regression stage [28].
Interestingly, the development of eggs and sperm, while very different processes, share some common patterns in gene expression during gametogenesis. Cluster 15 contained genes related to gonad development. For example, cdc20 (a gene present in cluster 15; Table 4) has been shown to have higher expression in gonads than other tissues (kidney, brain and whole body) in zebrafish [29]. Expression data support the hypothesis that this transcript plays an important role in the proliferation of spermatogenic tissue in RBD. In addition, other pathways that peak in females as well as in males in the early stage of gonad development are genes related to the immune system. Martyniuk et al., [11] showed the same peak in the expression of the immune system in early follicular growth of largemouth bass (Micropterus salmoides). As stated previously, genes included in cluster 10 and 11 showed the same pattern of expression over the different state of developments. Those clusters are enriched for GO terms related to nuclear cell function and also genes related with cell division and cell cycle processes (i.e. cell proliferation, cell growth, cell cycle, cell death SNEAs), in RBD, thus the different bioinformatics techniques converged in similar themes for pathways (Tables 1 and 2). In addition, similar cell division process findings were reported in ovaries from striped bass (Morone saxatilis) [30], and largemouth bass [11]. Thus, there are likely conserved molecular signaling cascades initiated during gamete maturation in male and female fishes.
We measured a number of genes by qPCR for several sex specific genes during testis development. Interestingly, many genes associated with female sex differentiation were significantly increased in the males at post-spawning season (i.e. esr1, sox9, cdca8 and survivin; Table 4). It was previously shown that steroid hormone receptors esr1 and ar can be more abundant in ovaries than testis in RBD [15], however this is not the case for esrb, in which there was no difference in mRNA levels between ovary and testis [15]. Bahamonde et al., [14] found that genes such as esr1, ar, sox9, cdca8 and others like foxl2 were significantly increased in intersex individuals. This might suggest a sensitive window for the effects of endocrine disrupting compounds (EDCs) on male fish, and may be related to the subsequent development of intersex.
It has been possible to generate the intersex condition in medaka (Oryzias latipes) in the laboratory [31,32], but it appears as though it is not as straightforward for other fish species [33]. In the field, roach (Rutilus rutilus) germ cell disruption can be substantial [34]. The authors state that the lack of disruption in the laboratory might be due to the fact that the municipal wastewater effluents (MWWE) was not estrogenic or that the germ cell disruption is a consequence of prolonged, chronic exposure [33]. Conversely, the early stages of testis differentiation may be most sensitive to endocrine disruptors and endogenous hormones, and the testis can be converted to an "ovary" most readily at this point in time. This is based on our data that show there are higher expression levels of androgen and estrogen receptors in this stage in rainbow darter testis compared to more advanced testicular stages. Based on the molecular responses observed here, future studies into the induction of intersex condition should investigate the post-spawning season and perhaps, the immune system, as a novel mechanism of intersex.

Summary
The post-spawning period was the dominant phase that separated individuals based upon gene expression patterns (i.e. was the most different). The post-spawning stage contained gene networks related to apoptosis, cell death, and cell cycle, as well as genes for RNA binding and ribosomal units (clusters 2, 10, and 11), spermatoproteosome complex (cluster 16), and genes involved in stress responses, for example hsp90 and hsp1 (cluster 10). Subnetwork enrichment analysis revealed that during the post-spawning season, genes related to oxidative stress and immune responses (cluster 13, 17 and 18) as well as ATPase binding and apoptosis were differentially expressed compared to other stages. There were also a wide variety of genes that showed lower relative expression at post-spawning compared to the other stages. These included genes related to meiosis, cytokinesis and spermatogenesis (cluster 7 and 15), synaptic transmission (cluster 7), protein localization to organelles (cluster 15), and cellular processes involved in reproduction (cluster 12). Developing males showed increased mRNA abundance of genes associated with lipid mobilization (cluster 4), and fabp which is involved during the lipid mobilization process. Fabp was most expressed at this period based upon the real-time PCR data. This study increases our understanding of testis development in teleost fish and identifies new genes and processes that may be useful in understanding exposure to hormones and EDCs.
Supporting Information S1 Table. Water quality data provided by The Grand River Conservation Authority. Water Temperature (deg C), pH, Specific Conductance (μS/cm), Dissolved oxygen (mg/L), and turbidity (NTU) was measured hourly (when it was possible) from may 11 th , 2011 to June 1 st , 2012. (XLSX) S2 Table. List of genes included in each one of the 18 different patterns of gene expression determined by the k-mean analysis. (XLSX) S3 Table. The complete list of pathways identified as differentially expressed relative to the stage before during RBD testis development. (XLSX)