Gonadal transcriptome analysis of hybrid triploid loaches (Misgurnus anguillicaudatus) and their diploid and tetraploid parents

Hybrid triploid loaches (Misgurnus anguillicaudatus) were generated from natural tetraploid and diploid loaches. We studied the gonads of parents and offspring from direct and reciprocal crosses through histological and transcriptome analyses. The triploid offspring had inferior ability to form sperm and egg cells compared with diploid forms. After sequencing the transcriptomes, 655,109,158 clean reads were obtained, and 62,821 unigenes and 178,962 transcripts were assembled. Of these unigenes, 23,189 were annotated in the GO database, 18,525 in the KEGG database and 24,661 in the KOG database. 36 fertility-related genes were found to be differentially expressed between the direct cross (2n × 4n) progenies and their parents, while 53 fertility-related genes between the reciprocal cross (4n × 2n) progenies and their parents. Following protein-protein interaction network analyses, 54 differentially expressed genes, including PLCB4, cyp17a1 and Pla2g4d, were mined, yielding candidate genes involved in the poor fertility of hybrid triploid loaches. This is the first report of transcriptomes of gonads of hybrid triploid loaches and their parents, offering a substantial contribution to sequence resources for this species and providing a deep insight into the molecular mechanism controlling the fertility of hybrid triploid fish.


Introduction
Loach (Misgurnus anguillicaudatus; Cypriniformes; Cobitidae) is one of the endemic fishes of Asia and a common small freshwater fish in China. The loach is distributed in all parts of China. For delicious taste and traditional Chinese medicine value, it has become one of the most important cultured fish species in China. The loach exhibits a range of polyploidy: in addition to diploid, there are natural triploid [1], tetraploid [2][3][4][5] and hexaploid [3] forms in China. This ploidy has received close attention by scholars internationally. Thus, Arai et al. [4,[6][7][8] reported the distribution of natural triploid loach in Japan, but found no natural a1111111111 a1111111111 a1111111111 a1111111111 a1111111111 containing lysate until the tissue mass disappeared, and DAPI (4',6-diamidino-2-phenylindole) was added to each tube. Tests of ploidy were performed using flow cytometry (Partec PAS-III, Partec, Munster, Germany).

Artificially induced spawning and insemination
The parents were chosen from well-developed diploid and natural tetraploid loaches injected with human chorionic gonadotropin (HCG) (injection dose for females 20-25 IUÁg −1 ; males 10-12.5 IUÁg −1 ). After 12 h, the abdomens of female fishes were pressed gently to discharge the eggs into a 9 cm culture dish. Semen from male fishes was similarly extruded from genital pores on both sides of the body. It was collected in centrifuge tubes and diluted 100 times with Kurokura solution (750 mg NaCl, 20 mg CaCl 2 , 20 mg NaHCO 3 , and 20 mg KCl dissolved in 100 ml distilled water). Using dry fertilization, the hybrid combinations namely a direct cross (2n × 4n) and a reciprocal cross (4n × 2n) were obtained. During the incubation periods, the temperature was maintained at 25 ± 1˚C. The fresh water was aerated, and any dead fry were removed immediately from the nursery aquarium.

Sample collection
After the experimental loaches had reached sexual maturation, the method of euthanasia (i.e. decapitation) was used for loaches anesthetized with 100 mg/L tricaine methanesulfonate (MS-222). And then the loaches were placed on ice for tissues collections. The gonads of four parents and eight offspring were removed. Parts were fixed in Bouin's solution and stored in 80% alcohol for 24 h before being prepared for histology of gonadal tissue sections. The other parts were placed in liquid nitrogen for temporary storage, and then stored in tubes in a -80˚C freezer before RNA extraction.

Preparation and observation of gonadal tissue sections
Fixed samples were dehydrated in an alcohol gradient, cleared in xylene, embedded in paraffin wax, then serially sectioned using a microtome (Leica Instruments, Nussloch, Germany) at 6-8 μm thickness. Dewaxed sections were stained with hematoxylin and eosin (HE) and sealed on slides under neutral gum. Images were taken using an Olympus AH2 camera (Olympus, Tokyo, Japan).

Library construction and sequencing
RNA extraction used enrichment with magnetic beads coated with oligo (dT) primer. The extracted mRNAs were broken randomly into short fragments using fragmentation buffer, and the first strand cDNA was synthesized as random hexamers using a fragment of mRNA as a template. Subsequently, buffer, dNTPs, RNase H and DNA polymerase I were added to synthesize the second strand cDNA. Then we used AMPure XP beads to double-purify products, T4 DNA polymerase and Klenow DNA polymerase to fix the sticky end of the DNA to the flat end. 3 0 -end added base A and the joint, then used AMPure XP beads (Agencourt Bioscience, Beverly, MA, USA) for fragment selection. Finally, qPCR amplification was used to obtain the final sequencing library. Sequencing was done by LC-Bio Technologies Co., Ltd. (Hangzhou, P. R. China) (http://www.lc-bio.com) using an Illumina HiSeq2000/2500 machine (Illumina, San Diego, CA, USA) for paired-end sequencing.

Identification of differentially expressed genes (DEGs)
Gene expression level was normalized against the Reads Per Kilobase of transcripts per Million mapped reads (RPKM) value, which considers the effect of sequencing depth and gene length for the read count at the same time, and is currently the most commonly used method for estimating gene expression levels. DEGs and their corresponding P values were determined using the methods described by Audic [24] in 1997, and the significance threshold of the P value in multiple tests was set based on the false discovery rate (FDR). The fold changes (log 2 RERPKM/PERPKM) were also estimated according to the normalized gene expression level. For this study, "FDR< 0.001 and |log 2 fold change|> 2" were used as the thresholds to define DEGs. The protein-protein interaction network analyses of fertility-related DEGs were performed and the key genes causing the poor fertility of hybrid triploid loach were obtained.

Validation of RNA-seq results by qPCR
To test the reliability of the RNA-Seq results, DEGs involved in the development of gonads tissues were selected for validation using qPCR. The first strand synthesis of cDNA was carried out using Revert Aid Premium Reverse Transcriptase kits (# EP0733; Thermo Scientific, Waltham, MA, USA). The reactions were performed in an ABI Step One Plus machine (Thermo Scientific). The cDNA samples diluted 10 times served as test templates. The reaction mix included SYBR Green qPCR Master Mix 10 μl, 0.4 μl fragment F (10 μM), 0.4 μl fragment R (10 μM), 7.2 μl H 2 O and 2 μl template cDNA. The following thermal cycling parameters were used: initiation at 95˚C for 3 min followed by 45 cycles of 95˚C for 7s for denaturation, 57˚C for 10s for annealing, and 72˚C for 15 s for extension. All experiments were conducted with three biological replicates for each sample. The relative expression levels were normalized to the endogenous housekeeping gene βactin and expression ratios were calculated using the 2 −ΔΔCt [25] method.
The DNA content ratio in individual blood cells of diploid, triploid and tetraploid forms was 1:1.5:2.

Histology and comparison of mature gonads
Spermatogonia (SG), spermatocytes (SC), and large numbers of mature spermatid (ST) were observed in diploid and hybrid triploid testes, but the mature sperm content was lower than in normal diploid forms (Fig 2A, 2B and 2C). Thus the male hybrid triploid can produce spermatozoa, but the output is less than in the control diploid male testis. Diploid and hybrid triploid female ovaries also contained second, third and fourth phase oocytes, yolk granules (yg) and follicular membranes (fm) (Fig 2D, 2E and 2F).

Sequence assembly and unigene annotation
Preliminary analysis showed that the 12 samples (Table 1) used in the trial experiment had optical densities (OD) at 260/280 wavelength of >1.8, indicating no protein contamination. The RNA integrity numbers (RINs) were 8.1-9.3. The results showed that the RNAs were free of protein contamination and their integrities were high and met the requirements of sequencing. The detailed test result report is listed in S1 Table. Illumina-based RNA-Seq analysis was conducted with gonadal tissue samples from hybrid triploid loaches and their parents. A total of 661.34 million raw data sequences with an average length of 100 bp were generated. After trimming out low-quality reads and short read sequences, a total of 655.11 million sequences of clean data (99.05%) were obtained. The average GC content was 46.46% (S2 Table), and these reads were used for the following analysis of 178,962 transcripts and 62,821 unigenes. The N50 values of transcripts and unigenes were 2,149 and 2,121, respectively. A summary of the assembly data is given in Table 2   In all, 62,821 unigenes were annotated by Blastx and Blastn against six public databases (Swiss-Prot, NR, Pfam, KEGG, KOG, and GO), with a threshold of 10 −5 . According to the numbers and proportions of unigenes in the different databases, there were differences in the numbers of genes annotated successfully, because these databases use different filter conditions, such as protein sequence, protein topology, and the metabolic pathways involved. The NR database annotated 33,433 unigenes successfully, accounting for 53.22% of the total number of genes. The number of genes was least well covered in the KEGG database where 18,525 unigenes were annotated, accounting for 29.05% of the total (Fig 3).

Unigene classification
In all, 23,189 annotated genes were obtained by Gene Ontogeny (GO) annotation, accounting for 36.91% of the total number of unigenes. GO analysis identified three main functional pathways: biological pathways, cellular components and molecular functions. These were subdivided into 52 functional categories (Fig 4).
In all, 24,661 unigenes were annotated by the EuKaryotic Ortholog Groups (KOG) database and divided into 24 functional categories. These were mainly signal transduction mechanisms (20%) and general functions (18.76%; Fig 5).
In all, 18,525 unigenes were annotated by the Kyoto Encyclopedia of Genes and Genomes (KEGG) library into 262 pathways, covering metabolism, genetic information processing, environmental information processing, cellular process, human disease and six aspects of drug development (Fig 6).

Identification of differentially expressed genes (DEGs)
Significant DEGs were analyzed in each group. Most of the DEGs were found between diploid male parent and male offspring. The expression levels of FMN2, Foxm1, Sox30 and other genes were up-regulated in male offspring, whereas the expression levels of CYP17A1, CYP11A1, GNAQ and other genes were down-regulated in male offspring. The fewest DEGs were present between diploid female parent and female offspring, where the expression levels of MLL2,  Ccnb1, IGF2, and other genes were up-regulated in female offspring, and the expression levels of HMR-1, Cyp27b1, CYP2J2 and other genes were down-regulated in female offspring (Fig 7).
By screening for fertility-related DEGs, a total of 17 genes were firstly obtained from PF (2n × 4n)-VS-OF (2n × 4n), of which 8 genes were up-regulated in the progeny and 9 genes were down-regulated (S3 Table). The production of differential gene expression heat maps can give better understanding of the differences between gene distribution patterns. The expression levels of embryonic development-related (e.g., Foxq1) and DNA repair-related genes (e.g., SMC1A) were up-regulated in the progeny significantly. However, the expression levels of embryonic development-associated (e.g., hmr-1) and membrane-associated genes (e.g., Adcy2) were down-regulated in the progeny significantly (Fig 8).
A total of 20 fertility-related DEGs were obtained from PM (2n × 4n)-VS-OM (2n × 4n), of which 10 genes were up-regulated in the progeny and 10 genes were down-regulated (S4 Table). The expression level of meiosis-related (e.g., FMN2) and transcription factor-related genes (e.g., SOX9) were up-regulated in the progeny significantly. However, the expression levels of steroid biosynthesis-related genes (e.g., cyp17a and cyp11a1) were down-regulated in the progeny significantly (Fig 9).
A total of 24 fertility-related DEGs were obtained from PF (4n × 2n)-VS-OF (4n × 2n), of which 13 genes were up-regulated in the progeny and 11 genes were down-regulated (S5 Table). The expression level of embryonic development-related (e.g., LAMA3) and transcriptional regulation-related genes (e.g., hand2) were up-regulated in the progeny significantly. However, the expression levels of mitosis-related genes (e.g., Sycp1 and stag2) were down-regulated in the progeny significantly (Fig 10). A total of 30 fertility-related DEGs were obtained from PM (4n × 2n)-VS-OM (4n × 2n), of which 9 genes were up-regulated in the progeny and 21 genes were down-regulated (S6 Table). The expression level of reverse transcription (gnas) and protein coding genes (e.g., Pla2g4d) were up-regulated in the progeny significantly. However, the expression levels of embryonic development related (e.g., Sox30) and sperm structure-related genes (e.g., TSGA10) were down-regulated in the progeny significantly (Fig 11).

Protein-protein interaction (PPI) network analyses of the fertility-related DEGs
The protein-protein interaction network analyses of the fertility-related DEGs were performed. Of the fertility-related DEGs in the comparison of direct cross (2n × 4n) offspring and their parents, 27 genes were found to be linked to each other (S3 Fig). Of the fertility-related DEGs in the comparison of reciprocal cross (4n × 2n) offspring and their parents, 32 genes were linked to each other (S4 Fig). As a whole, 54 genes involved in poor fertility of hybrid triploid loach (direct or reciprocal crosses) were finally mined and 5 genes were coexisted in the two networks, namely PLCB4, PLA2G4E, Pla2g4d, Nme5 and cyp17a1. Gonadal transcriptome analysis of hybrid triploid loaches and their diploid and tetraploid parents

Validation of DEGs by quantitative real-time polymerase chain reaction (qPCR) amplification
The expression levels of genes obtained using qPCR were compared with those obtained by RNA-seq to verify the accuracy of the latter approach. Fig 12 shows a comparison of qPCR and RNA-seq results for dmrt1-a and cyp19a2. It confirmed that the results of qPCR were consistent with those of the RNA-seq analysis, showing the reliability and accuracy of our transcriptome sequence expression analysis.

Discussion
Triploid fish have the characteristics of fast growth, good meat quality, strong disease resistance, but poor fertility, so its breeding is of great significance [26]. These breeding Gonadal transcriptome analysis of hybrid triploid loaches and their diploid and tetraploid parents characteristics have been receiving increasing attention in fishery resource management. It is generally accepted that triploid fish-being infertile-can transform the energy used in gonadal development into muscle growth, thus giving them a potential growth advantage. In addition, the sterility of triploid fish is of great significance for controlling the over-growth of fish and protecting natural germplasm resources. Moreover, triploid fish can also be used as ideal vectors for generating transgenic lines to address ecological safety problems and ethical concerns [27]. Most artificially induced triploid fish proved to be sterile [28][29][30][31][32][33][34][35][36][37][38][39]. However, there have been differences between reports. Thus, Yang [40] found that male triploid rainbow trout can produce spermatozoa, but a large number of malformed embryos appeared after hybridization, and all the progeny died during maturation. Yin et al. [41] found that the gonadal functions of triploid catfish differ between genders, as there was no difference between the normal and diploid testis, but the female ovary showed arrest of egg development at the oogonium stage. Arai [42] studied direct and reciprocal hybridization by using tetraploid loaches and diploid loaches of unknown origin. The male hybrids were sterile but female hybrids were fertile, and produced large triploid eggs (3n), and small haploid eggs (1n). Both types of egg could be fertilized and produce viable offspring. In the present study, hybrid triploid offspring were obtained by hybridization between tetraploid and diploid loaches, which are unique to China. Histology of the testes showed that spermatogonia, spermatocytes and large numbers of mature spermatozoa could be observed in diploid and triploid testes, but that the mature sperm content was lower than in normal diploid fish. The diploid and triploid female ovaries also contained second, third, and fourth phase oocytes, yolk granules and follicular membranes. However, the molecular mechanisms controlling fertility in such triploid fish have not been reported so far.
Transcriptomes are collections of all the RNAs of a particular tissue or cell that are transcribed at given developmental stages or functional state, including mRNA-encoding proteins and various noncoding RNAs, such as rRNAs, tRNAs, snoRNAs, snRNAs, and microRNAs. However, transcriptomal studies are typically restricted to mRNAs, and are important for studying the phenotypes and functions of cells. Transcriptomics is the study of gene transcription and transcriptional regulation in cells at the RNA level. Currently, high-throughput sequencing techniques have been used widely in transcriptomal studies of plants and animals, and many studies have also been carried out on the transcriptomes of fish, such as Oncorhynchus mykiss [43], Lampetra japonica [44], Anguilla anguilla [45], Scophthalmus maximus [46], and Huso dauricus [47]. Here we used high-throughput sequencing to study the transcriptomes of gonads from the offspring and parents of direct and reciprocal hybrids produced by crossing between diploid and tetraploid loaches. A total of 62,821 unigenes was obtained. Gene function annotations, classifications and metabolic pathways were analyzed, and sequence information of gonadal development-related genes in hybrid triploid loaches were obtained. Through screened differentially expressed genes related to fertility between parents and offspring and these genes were used to map protein-protein interaction networks, we found that there were 5 important genes, namely PLCB4, PLA2G4E, Pla2g4d, Nme5 and cyp17a1, were up/down regulated in the offspring of whatever direct or reciprocal crosses. These genes participated in the estrogen signaling pathway and progesterone-mediated oocyte maturation. In 2015, Jiang [48] studied the proteomics and carried out transcriptomal analysis of homozygous male sterile individuals of the double haploid Paralichthys olivaceus, and screened the levels of CYP11A1, CYP11B2, CYP17A1 and other genes. These genes play important roles in regulating the synthesis of sterol hormones and show low expression levels in infertile flounders. This suggests that infertility might be linked to low levels of hormones, similar to the differences between male and female offspring of diploid parents in this study. Jia et al. [49] reported that Nme5 was expressed in pachytene primary spermatocytes, round spermatids and elongated spermatozoa, and participated in the whole process from meiosis to sperm deformation, and played an important role in spermatogenesis. PLCB4 is one of the subtypes of phospholipase C (PLC), which controls cellular processes including neural signaling, cell growth and synaptic plasticity [50]. The Pla2g4d gene can directly regulate ovarian follicle development, or indirectly influence leptin secretion involved in the regulation of the sheep estrous cycle when cycles are initiated at the end of the anestrous season [51]. In addition, we also found that some genes were specific to the offspring and parents. For example, the LACTBL1, SGK1, XLRS1, acbC, and SGK1 genes were expressed only in the offspring, in contrast, Chst7, Lemd3, CMAS, SNRPB2, and KLHL24 genes were only expressed in the parents, so these genes might influence the differences in fertility.
In conclusion, the main reason for the poor fertility of hybrid triploid loach is probably because of changes in the expression levels of genes that regulate sex hormone levels and hormone response. The expression of genes related to gonad development of triploid hybrid parental was systematically analyzed by transcriptome, to analyze the reason of its occurrence in the view of molecular biology, and provides a theoretical basis for future research.
Supporting information S1  Table. Table of