Comparative expression profiling of testis-enriched genes regulated during the development of spermatogonial cells

The testis has been identified as the organ in which a large number of tissue-enriched genes are present. However, a large portion of transcripts related to each stage or cell type in the testis still remains unknown. In this study, databases combined with confirmatory measurements were used to investigate testis-enriched genes, localization in the testis, developmental regulation, gene expression profiles of testicular disease, and signaling pathways. Our comparative analysis of GEO DataSets showed that 24 genes are predominantly expressed in testis. Cellular locations of 15 testis-enriched proteins in human testis have been identified and most of them were located in spermatocytes and round spermatids. Real-time PCR revealed that expressions of these 15 genes are significantly increased during testis development. Also, an analysis of GEO DataSets indicated that expressions of these 15 genes were significantly decreased in teratozoospermic patients and polyubiquitin knockout mice, suggesting their involvement in normal testis development. Pathway analysis revealed that most of those 15 genes are implicated in various sperm-related cell processes and disease conditions. This approach provides effective strategies for discovering novel testis-enriched genes and their expression patterns, paving the way for future characterization of their functions regarding infertility and providing new biomarkers for specific stages of spermatogenesis.


Introduction
The testis has been identified by RNA sequencing as the organ in which the largest number of tissue-enriched genes is expressed among various organs. It has been estimated that expressions of more than 1000 genes are enriched in the testis [1], whereas, on average, there are approximately 200 signature genes in each tissue [2]. Tissue-enriched or tissue-specific genes are essential for the growth and development of specific cells and organs [3]. Thus, characteristic processes that occur in germinal cells in the testis, including meiosis, genetic recombination, spermatogenesis, and spermiogenesis, may largely be attributed to a number of differential gene expressions. Spermatogenesis is a complex process that is orchestrated by expression of multiple genes at various stages containing particular cell types, such as spermatogonial stem cells, spermatogonia, spermatocytes, and spermatids [4]. In addition to germinal cells, the somatic Sertoli cells play a role in testis formation and provide an essential environment for spermatogenesis [5], and Leydig cells produce androgen, which plays a key role in the regulation of spermatogenesis, and undergo changes in gene expression [6,7]. However, a large portion of transcripts and proteins related to each stage or cell type, as well as their functions, still remains unknown.
Investigation of gene expression and function during spermatogenesis has been hampered by a lack of immortalized cell lines for each stage [8]. Alternatively, testis transcriptome microarray analysis based on the Gene Expression Omnibus (GEO) repository (www.ncbi.nlm.nih.gov/geo) followed by protein profiling using immunohistochemical data from the Human Protein Atlas portal (www.proteinatlas.org) is a useful tool for discovering highly expressed genes in each stage of spermatogenesis in the testis. Furthermore, gene expression profiles under various developmental, disease, and knockout conditions produced in GEO microarray datasets offer a platform for functional genomic studies of spermatogenesis stage-specific gene expression.
Using these sources combined with confirmatory gene expression measurements and pathway analysis, in this study, protein localization and signaling pathways of 15 testis-enriched genes were analyzed. The objectives of this study were to identify novel testis-enriched genes using gene expression profiles and analyze protein localization, developmental regulation and biological implications of testis-enriched genes in humans and mice. The current approach provides an effective strategy for discovering novel testis-enriched genes and their unique stage-specific expression, paving the way for future studies of normal development of the testis and associated diseases.

Microarray data mining
The microarray-based, high-throughput gene expression data were obtained from GEO DataSets (GDS) in the GEO repository of the National Center for Biotechnology Information (NCBI) archives (www.ncbi.nlm.nih.gov/geo). For analyzing the tissue distribution pattern of gene expression in 12 male mouse tissues and 10 human tissues, GDS3142 for mice and GDS596 for humans were downloaded and sorted (Tables 1 and 2) as described in our previous reports [9,10]. Also, gene expression patterns in mouse sperm cells (GDS2390), developing mouse testis (GDS605, GDS606 and GDS607), semen samples collected from 14 teratozoospermic individuals aged 21-57 (GDS2697), and polyubiquitin knockout mice (GDS3906) were examined.

Animal use and sample preparation
All animal care and procedures were approved by the Institutional Animal Care and Use Committee (IACUC) at The Ohio State University. Mice were raised under ad libitum feeding conditions in a mouse housing facility at The Ohio State University. Mice were euthanized by carbon dioxide inhalation followed by cervical dislocation. For isolation of total RNA, testis, muscle, liver, brain, lung, kidney, adipose tissue, thymus, spleen, and small intestine were collected from 3-month-old FVB mice (n = 3), and RNA was extracted using Trizol reagent (Invitrogen, Carlsbad, CA, USA) [11]. Total RNAs from the adult human kidney, liver, lung, heart, muscle, testis, thymus, and brain were purchased from Agilent Technologies (Santa Clara, CA, USA) and adult human RNA from adipose tissue was purchased from Clontech Laboratories (Mountain View, CA, USA). For RNA isolation from mouse testis at 10 days postpartum (dpp), 21 dpp, and 91 dpp (three months postpartum), C57BL/6 mice (n = 4) were euthanized and both testes were harvested.

Reverse transcription PCR (RT-PCR)
To measure the quantity of RNA, a Nanodrop spectrophotometer (Thermo Scientific, Wilmington, DE) was used. The RNA samples were stored at -80°C until use. Approximately 1 μg of RNA was reverse-transcribed to cDNA in a 20 μL total reaction using Moloney murine leukemia virus (M-MLV) reverse transcriptase (Invitrogen). The thermal cycle of the reverse transcription was 65°C for 5 min, 37°C for 52 min, and 70°C for 15 min. Exactly 1 μL of each cDNA sample was used as a template for PCR in a 25 μL total reaction with AmpliTaq Gold DNA polymerase (Applied Biosystems, Carlsbad, CA). The conditions for this reaction were 95°C for 1 min 30 s, followed by 33 cycles of 94°C for 30 s, 55°C for 1 min, and 72°C for 1 min, with an additional extension step at 72°C for 10 min. PCR products were separated by 1% agarose gel electrophoresis. Forward and reverse primers for both humans and mice, listed in the supporting information, were designed on different exons for multi-exon genes to avoid amplification of contaminating genomic DNA.

Analysis of protein expression profiles from the Human Protein Atlas
Immunohistochemical data showing the expression patterns of selected proteins in human testes were obtained from the Human Protein Atlas portal (www.proteinatlas.org). A total of 15 testis-enriched proteins were analyzed for their localization in the human testis: for ten of them, localization has been published in mouse testis but not in human testis, and for the rest, localization has not been published in either human or mouse testis.

Real-time PCR
Quantitative real-time PCR (qPCR) was performed on an ABI 7300 Real-Time PCR instrument (Applied BioSystems, Foster City, CA) by using AmpliTaq Gold polymerase (Applied BioSystems) with SYBR green detection dye. Cyclophilin (CYC) was used as a housekeeping gene. Reactions were performed in duplicate 25 μL volumes and conditions for the qPCR were 95°C for 10 minutes followed by 40 cycles of 94°C for 15 seconds, 60°C for 40 seconds, 72°C for 30 seconds, and 82°C for 33 seconds. Relative quantification of gene expression was determined by using the 2^(-ΔΔCT) method [12].
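The relative quantification step can be illustrated with a minimal Python sketch. It assumes the standard form of the method, in which the threshold cycle (CT) of the target gene is normalized to the housekeeping gene and then to a calibrator sample, and in which amplification efficiency is close to 100%. The function names, the choice of 10 dpp as calibrator, and all CT values below are hypothetical and are not data from this study.

```python
from statistics import mean


def delta_ct(target_cts, reference_cts):
    """Mean CT of the target gene minus mean CT of the reference gene."""
    return mean(target_cts) - mean(reference_cts)


def relative_expression(sample_target, sample_ref, calib_target, calib_ref):
    """Fold change of a sample relative to a calibrator: 2^(-ddCT)."""
    ddct = delta_ct(sample_target, sample_ref) - delta_ct(calib_target, calib_ref)
    return 2.0 ** (-ddct)


# Hypothetical duplicate CT values (illustrative only, not measured data).
fold = relative_expression(
    sample_target=[22.5, 21.5],  # target gene at 21 dpp
    sample_ref=[18.0, 18.0],     # cyclophilin at 21 dpp
    calib_target=[27.0, 27.0],   # target gene at 10 dpp (assumed calibrator)
    calib_ref=[18.0, 18.0],      # cyclophilin at 10 dpp
)
print(fold)  # 32.0
```

In this made-up example the target gene crosses the threshold five cycles earlier at 21 dpp than at 10 dpp after normalization, which corresponds to a 32-fold increase.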

Signaling pathway analysis
Signaling pathways of spermatocyte- or spermatid-enriched proteins were analyzed using Pathway Studio (v 11.2.5.9, Elsevier, Amsterdam, Netherlands). A list of 15 testis-enriched proteins was entered into Pathway Studio. The resulting pathways were verified through the PubMed/Medline hyperlink embedded in each node.

Statistical analysis
For comparison of gene expression in testis versus other tissues, one-way ANOVA followed by a Fisher's protected least significant difference test was performed using SAS version 9.2 (SAS Institute Inc., Cary, NC). A Student's t test was conducted to compare the difference between two means. Comparison of multiple means was conducted by one-way ANOVA followed by a Tukey's post hoc test. The significance level was set at p < 0.05.
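As an illustration of the one-way ANOVA step only, the following Python sketch computes the F statistic from between-group and within-group sums of squares. It is not the SAS procedure used in this study, it does not compute p-values or the post hoc tests (Fisher's protected least significant difference, Tukey), and the function name and the example values are hypothetical.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of sample groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up relative expression values for three groups (e.g. three ages).
f_stat, df_b, df_w = one_way_anova_f(
    [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
)
print(f_stat, df_b, df_w)  # 27.0 2 6
```

The resulting F value would then be compared against the F distribution with the reported degrees of freedom to obtain a p-value, which statistical packages do directly.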

Microarray analyses identified common testis-enriched genes for the mouse and human
Comparative analysis of GEO DataSets (GDS3142 for mice and GDS596 for humans), a public microarray repository, revealed that expressions of 24 genes in both the mouse and human testis are more than 10-fold higher than an average expression value of other tissues (Tables 1 and 2). For example, murine Tnp1 and human TNP1 expressions are 276- and 186-fold greater in the mouse and human testis, respectively, than an average value of other tissues. In addition, these 24 genes are expressed at very low levels in the ovary, showing that they are male-specific genes. Our literature search revealed that some, but not all, genes were reported for testis enrichment and protein cellular location in testis. For instance, testis-specific expression of murine Ldhc gene and cellular protein location of LDHC were reported in the mouse testis [13][14][15], but not in human testis. In this study, testis enrichment of selected genes was confirmed by RT-PCR (Fig 1) and their localization profiles in humans were explored through the Human Protein Atlas (Fig 2).
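The enrichment criterion described above (testis expression more than 10-fold higher than the average of the other tissues, in both species) can be sketched in Python as follows. This is an illustrative sketch, not the authors' actual sorting procedure: the function names, gene names, and signal values are made up, and matching mouse and human orthologs by upper-casing the gene symbol is a crude assumption used only to keep the example short.

```python
from statistics import mean


def fold_enrichment(expression_by_tissue, target="testis"):
    """Target-tissue signal divided by the mean signal of all other tissues."""
    others = [v for tissue, v in expression_by_tissue.items() if tissue != target]
    return expression_by_tissue[target] / mean(others)


def enriched_genes(profiles, threshold=10.0):
    """Genes whose fold enrichment in the target tissue exceeds the threshold."""
    return {gene for gene, p in profiles.items() if fold_enrichment(p) > threshold}


# Made-up signal values for hypothetical genes (not taken from GDS3142/GDS596).
mouse = {
    "GeneA": {"testis": 2000.0, "liver": 10.0, "brain": 20.0, "kidney": 30.0},
    "GeneB": {"testis": 50.0, "liver": 40.0, "brain": 50.0, "kidney": 60.0},
}
human = {
    "GENEA": {"testis": 900.0, "liver": 20.0, "brain": 30.0, "kidney": 40.0},
    "GENEB": {"testis": 45.0, "liver": 45.0, "brain": 45.0, "kidney": 45.0},
}

# Assumption: orthologs share a symbol up to letter case (e.g. Tnp1 / TNP1).
common = {g.upper() for g in enriched_genes(mouse)} & enriched_genes(human)
print(fold_enrichment(mouse["GeneA"]), sorted(common))  # 100.0 ['GENEA']
```

A real analysis would map orthologs through a curated resource rather than by symbol case, and would work on the full expression matrices of the two datasets.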

RT-PCR confirmed testis enrichment of selected genes
To validate the microarray data, RT-PCR was performed for murine Smcp, Odf1, Crisp2, Phf7, Gapdhs, Ddx4 and Zmynd10 and human PRM2, TNP1, SPATA6, NEK2, LDHC, YBX2 and EFHC1, which have not been reported previously for expression in the testis. To prevent PCR saturation effects during amplification, the number of PCR cycles was reduced until saturation no longer occurred. These genes showed testis-enriched expression patterns among various tissues (Fig 1), which is consistent with the GEO DataSets (Tables 1 and 2).

Analysis of immunohistochemical data showed protein expression in specific stages of human testis
Using the latest version of the Human Protein Atlas, the cellular location of several testis-enriched genes in humans was analyzed. When these locations were reported in mouse, they were grouped in Fig 2A and 2B; otherwise they were grouped in Fig 2C and 2D. As shown in Fig 2, PHF7, SPINK2, LDHC, TCP11, EFHC1, and TCFL5 proteins were located in earlier stage cells (Fig 2A) than cells expressing ZPBP, ACTL7A, ACTL7B, and SPATA6 (Fig 2B). PHF7, SPINK2, LDHC, TCP11, EFHC1, and TCFL5 were localized in pachytene spermatocytes (PC) and round spermatids (RS). In detail, SPINK2 was expressed strongly in the cytoplasm of pachytene spermatocytes, LDHC showed expression in the tail of spermatozoa (SZ), and EFHC1 was highly expressed in Sertoli cells (ST) (Fig 2A). ZPBP was uniquely detected in the developing acrosomal granules (AG) of round spermatids. ACTL7A was expressed in round spermatids and exclusively in the acrosome granules, with a lesser degree in spermatozoa tails. ACTL7B showed a stronger expression than SPATA6 in round spermatids (Fig 2B). Other testis-enriched proteins shown in Fig 2C and 2D, such as YBX2, ZMYND10, STAG3, ODF1, and GAPDHS, have not been published regarding their localization in either human or mouse testis. YBX2 and ZMYND10 were strongly localized in the cytoplasm of pachytene spermatocytes (PC) and, to a lesser degree, the nucleus of pachytene spermatocytes in the case of ZMYND10. STAG3 was expressed in the nucleus of pachytene spermatocytes. These proteins were also present in round spermatids (RS) except for low expression of ZMYND10 in round spermatids. In addition, YBX2 was detected in the tail of spermatozoa (SZ), and ZMYND10 showed expression in Leydig cells (LD) (Fig 2C). ODF1 and GAPDHS were expressed in round spermatids (RS). Also, GAPDHS showed expression in the tail of spermatozoa (SZ). ODF1 was expressed in elongating spermatids (ES) with developing tails (Fig 2D).
In summary, most testis-enriched proteins selected in this study are expressed after the spermatogonia stage. Their gene expression profiles curated in GDS2390 also showed a significant increase of expression during the stages of pachytene spermatocytes and round spermatids (Table 3). On the other hand, two non-testis-enriched proteins, COL1A2 and ZBTB16, were mainly expressed in the early-stage cells such as type A and type B spermatogonia (Fig 2E) and similarly, their gene expression profiles showed significantly higher mRNA expression in spermatogonia stages (Table 3). Therefore, whether expression of testis-enriched genes is regulated during testis development was further analyzed.

Expression of testis-enriched genes showed an increasing pattern during normal testis development
qPCR revealed that expression of selected testis-enriched genes is significantly increased during testis development. To investigate stage-specific expression patterns, 10 days postpartum (dpp) with mostly spermatogonia, 21 dpp when spermatocytes are the most abundant cell type, and 91 dpp representing an adult stage with spermatids were selected. Compared to 10 days postpartum (dpp), expression of these selected genes was significantly increased at 21 dpp after weaning and/or at 91 dpp (three months postpartum) when sexual maturation occurs (Fig 3A-3D). These data are consistent with the microarray database in GDS605, GDS606, and GDS607, which also shows an increasing pattern of these genes during the period of 0 through 35 dpp (Table 4). It suggests that expression of these testis-enriched genes is up-regulated during testis development and plays a role in later stages of spermatogenesis. In contrast, expression of both Col1a2 and Zbtb16 was significantly decreased at 21 dpp (Fig 3E), and this pattern was also shown in GDS605, GDS606, and GDS607 (Table 4), suggesting that these non-testis-enriched genes may be involved in early stages of spermatogenesis rather than the later stages. In addition, our further data analysis showed that, compared to fertile normal males, expression of these testis-enriched genes was significantly decreased in teratozoospermic patients with abnormal sperm morphology according to GDS2697 (S1 Table). It appears that those testis-enriched genes are involved in normal testis development without morphological defects and may serve as a biomarker for teratozoospermic condition. Moreover, according to GDS3906, polyubiquitin knockout resulted in a decreased expression pattern of testis-enriched genes at 28 dpp compared to wild-type (S1 Table).

Testis-enriched genes are associated with various biological pathways in sperm
Signaling pathway analysis was conducted to identify corresponding pathways related to sperm-related biological functions and disease conditions. Schematic illustration was drawn to identify cellular and metabolic processes regulated by testis-enriched genes and showed that at least 12 out of 15 spermatocyte- or spermatozoa-enriched proteins were putatively associated with various sperm-related cell processes, clinical parameters, and disease conditions (Fig 4).

Discussion
In this study, testis-enriched genes in human and mouse were compiled from the microarray-based GEO database, and 15 genes that have not been published regarding their localization in human (Table 2) were selected to analyze their protein expression in testis using immunohistochemical data from the Human Protein Atlas portal.

Several proteins expressed in pachytene spermatocytes and round spermatids of human testis were analyzed
Proteins localized in pachytene spermatocytes (PC) and round spermatids (RS) are shown in Fig 2A and 2C. PHF7 is a male-specific transcription factor for germ cell development and sexual identity [16,17]. SPINK2 is a Kazal-type serine protease inhibitor or an acrosin-trypsin inhibitor that is synthesized in the testis [18][19][20]. LDHC is an enzyme related to aerobic glycolysis in spermatozoa for energy production, and it regulates the sperm motility and capacitation [21]. Based on its localization, we postulated that LDHC is associated not only with ATP generation in mature spermatozoa, but also with development of germ cells. TCP11 is a receptor of a fertilization promoting peptide that regulates sperm capacitation in the mouse [22]. EFHC1 has been found in mouse sperm flagella and is present in motile cilia, but not in immotile cilia [23]. In this study, EFHC1 was expressed in cytoplasmic regions of testicular cells. This suggests that EFHC1 may be associated with germ cell development and sperm motility. TCFL5 has been found in the testis during spermiogenesis, and it is associated with spermatogenesis and the formation of sperm flagellum in the mouse [24]. YBX2, also known as contrin, is a germ cell specific protein that is required for the formation of functional spermatozoa and has been implicated as a potential cause of azoospermia [25,26]. ZMYND10 has been found in motile cilia of Drosophila, and it is associated with male fertility [27]. STAG3 is the meiosis-specific cohesin subunit and is associated with meiotic division of gametes [28].

Some proteins expressed in the acrosome or cytoplasmic region of spermatids of human testis were presented
The acrosome reaction is required for zona pellucida penetration and fertilization with oocytes [29], and four proteins, ZPBP [30], ACTL7A (T-ACTIN2), ACTL7B (T-ACTIN1), and SPATA6, were localized in the acrosome or cytoplasmic region of spermatids (Fig 2B), implicating their roles in fertility. Other proteins that may also be involved in spermatogenesis during the spermatid phase are shown in Fig 2D. ODF1 is one of the heat shock proteins that play an important role as molecular chaperones in spermatozoa, and it is located in the sperm tails and supports the flagella motility [31,32]. GAPDHS is a testis-specific glycolytic enzyme and generally known to be present in the principal piece of spermatozoa, and it is associated with ATP production and flagella motility and capacitation [33,34].

Expression of testis-enriched genes increased during normal testis development
Those selected testis-enriched proteins were mostly expressed in cells in the late spermatogenesis stages. The stage-specific mRNA expression of these genes showed similar patterns as shown in the GDS2390 dataset (Table 3). Their expression was further analyzed during testis development. During postnatal testicular growth, the proportion of germ cell types in seminiferous tubules changes. Before 10 dpp, testes of the mouse (Mus musculus) contain mostly spermatogonia. Between 21 and 24 dpp (weaning ages), spermatocytes become the most abundant cell type and round spermatids develop as the most advanced germ cells, and in adults, spermatids are the predominant cell type [35,36]. Based on the changes in types of spermatogenic cells, qPCR was performed at the stage of spermatogonia (10 dpp), spermatocytes and round spermatids (21 dpp), and spermatids (91 dpp). Our qPCR data showed that expression of these testis-enriched genes was significantly increased around weaning ages when spermatocytes and round spermatids are present in testis (Fig 3). Some of them were further increased at 91 dpp when the mouse is sexually mature and spermatids are the predominant form of spermatogenic cells. These expression patterns were consistent with gene expression profiles in GDS605, GDS606, and GDS607 (Table 4). These results suggest that testis-enriched genes may be involved in advanced stages of spermatogenesis when spermatocytes and spermatids are dominant types of spermatogenic cells.

Testis-enriched genes tend to be repressed in diseases associated with male infertility
Expression of all of these testis-enriched genes was decreased in teratozoospermic patients compared to normal individuals (S1 Table). Spermatogenic cells are susceptible to impairment which causes spermatogenic cells to become arrested at a certain developmental stage. For example, spermatogenic arrest at spermatogonia leads to total germ cell depletion and Sertoli cell only (SCO) syndrome with a lack of germ cells, arrest at spermatocytes gives rise to azoospermia (no spermatozoa) and oligozoospermia (a reduced number of spermatozoa), and arrest at spermatids results in teratozoospermia (an abnormal shape of spermatozoa) [37]. Thus, decreased expression of genes enriched in spermatids in this study could be used as biomarkers for spermatid arrest, teratozoospermia, and subsequent infertility. In addition, polyubiquitin knockout mice showed a decreased expression of these testis-enriched genes compared to wild-types (S1 Table). The ubiquitin-proteasomal pathway (UPP) has been regarded to be a critical process for the successful maturation of spermatids into spermatozoa by tagging and degrading proteins related to morphological defects [38]. In addition, post-testicular presence of ubiquitin plays a role in disposal of defective mature spermatozoa [39,40]. It has been reported that total knockout of the polyubiquitin gene in mice resulted in a developmental arrest of spermatogenesis followed by infertility [41]. Therefore, the decreased expression of testis-enriched genes in polyubiquitin knockout models can be used as an indicator of failure in sperm maturation. A recent study has shown that knockout mice lacking several testis-enriched genes were fertile [42]; however, the relationship between these genes and normal testis development remains to be explored. Genes presented in the current study that are related to testis development may provide appropriate targets for future knockout studies.

Various signaling pathways in sperm are linked to testis-enriched genes
Pathway analysis, in this study, provided comprehensive insight into the underlying biological functions and diseases involved in spermatocyte- or spermatozoa-enriched expression. As such, most of the spermatocyte- or spermatozoa-enriched proteins being analyzed were implicated in a variety of sperm functions, including motility and capacitation, and multiple disease conditions such as infertility. On the other hand, three proteins (ZMYND10, ACTL7B, and TCFL5) out of those 15 spermatocyte- or spermatozoa-enriched proteins were not implicated in the biological conditions, possibly due to incomplete functional annotations (Fig 4).
In conclusion, testis-enriched genes were found based on GEO profiles, and among them, protein localization of 15 genes was identified using the Human Protein Atlas. Mostly, these testis-enriched proteins were expressed in spermatocytes and/or round spermatids, and their expression significantly increased during testis development. In testicular disease conditions, expressions of these genes were significantly decreased, suggesting their relation to normal spermatogenesis and testis development. Moreover, in our pathway analysis, most of these proteins exhibited multiple biological implications related to sperm function. Future studies should ascertain the potential involvement of these testis-enriched genes in male infertility.

Supporting information
S1 Table. Microarray analysis of testicular transcriptome. Samples were derived from normal and teratozoospermic individuals aged 21-57 (GDS2697) and from wild-type and polyubiquitin knockout mice at 28 dpp (GDS3906).