Transcriptomic variation of eyestalk reveals the genes and biological processes associated with molting in Portunus trituberculatus

Background Molting is an essential biological process throughout the life history of crustaceans, which is regulated by many neuropeptide hormones expressed in the eyestalk. To better understand the molting mechanism in Portunus trituberculatus, we used digital gene expression (DGE) to analyze single eyestalk samples during the molting cycle by high-throughput sequencing. Results We obtained 14,387,942, 12,631,508 and 13,060,062 clean sequence reads from inter-molt (InM), pre-molt (PrM) and post-molt (PoM) cDNA libraries, respectively. A total of 1,394 molt-related differentially expressed genes (DEGs) were identified. GO and KEGG enrichment analysis identified some important processes and pathways with key roles in molting regulation, such as chitin metabolism, peptidase inhibitor activity, and the ribosome. We first observed a pattern associated with the neuromodulator-related pathways during the molting cycle, which were up-regulated in PrM and down-regulated in PoM. Four categories of important molting-related transcripts were clustered and most of them had similar expression patterns, which suggests that there is a connection between these genes throughout the molt cycle. Conclusion Our work is the first molt-related investigation of P. trituberculatus focusing on the eyestalk at the whole transcriptome level. Together, our results, including DEGs, identification of molting-related biological processes and pathways, and observed expression patterns of important genes, provide a novel insight into the function of the eyestalk in molting regulation.

Introduction controlling its growth and development of P. trituberculatus, which would benefit the aquaculture industry in China.
To investigate and gain a better understanding of the molting mechanism in P. trituberculatus, we created a reference transcriptome and digital gene expression (DGE) profile for single eyestalk samples in three major molting stages (InM, PrM, and PoM) using Illumina sequencing. To our knowledge, this is the first transcriptome-wide gene expression profiling of eyestalk in P. trituberculatus. This study will be a foundational resource for further studies in molting regulation mechanisms.

Animals
Ninety healthy swimming crabs (P. trituberculatus) between 80 and 100 days of age were obtained from Haifeng Company (Changyi, Weifang, China). The crabs used in the present study were artificial breeding animals. This species is the most common edible crustacean in China and is not an endangered or protected species. All the samples were acclimated in the laboratory (33 ppt, 18˚C) for 1 week before beginning the experiment. Three molt stages were identified based on morphological features [24] and the expression level of MIH mRNA [26], which included intermolt (InM, stage C, all parts of the body were hard and the new carapace had not yet started secretion), premolt (PrM, substage D3/D4, the old and new carapaces were separated completely) and postmolt (PoM, stage A, parts of the body were flaccid). Nine male crabs from each molt stage (InM, PrM, and PoM) were selected. Then the crabs were placed in an ice bath until anesthetization (5-10 min), and eyestalk were dissected, cleaned of pigments and exoskeleton, and then stored in RNAlater (−20˚C) until RNA extraction.

RNA extraction, library preparation, and sequencing
Total RNA was extracted individually using Trizol (Invitrogen, Carlsbad, CA, USA) and then equivalent RNAs were pooled into each group. For library preparation, 3 μg RNA per group was used as input material. Sequencing libraries were generated using a NEBNext Ultra RNA Library Prep Kit for Illumina (NEB, USA) following the manufacturer's recommendations and index codes were added to attribute sequences to each sample. Library quality was assessed on the Agilent Bioanalyzer 2100 system. The clustering of the index-coded samples was performed on a cBot Cluster Generation System using TruSeq PE Cluster Kit v3-cBot-HS (Illumina) according to the manufacturer's instructions. Then, the library preparations were sequenced on an Illumina Hiseq 2000 platform and 100 bp single-end reads were generated.

Assembly and annotation
The raw reads from our two previous works were used to assemble the reference transcriptome [19,27]. The clean reads were assembled by Trinity as described previously [28], followed by TIGR Gene Indices clustering tools (TGICL) [29]. The longest assembled sequences were referred to as contigs. The reads were then mapped back to contigs with paired-end reads to detect contigs from the same transcript and the distances between these contigs. Finally, sequences were obtained that lacked Ns and could not be extended on either end [30]. Such sequences were defined as unigenes. The unigenes were annotated via public databases (Nr; Nt; Swiss-Prot; COG; and Kyoto Encyclopedia of Genes and Genomes, KEGG) by BlastX via an E-value cut-off of 1.0 x 10 −5 .

DEG identification
The read counts were obtained by RSEM after being mapped back to the reference transcriptome, which was normalized via RPKM [30]. Prior to differential gene expression analysis, for each sequenced library, the read counts were adjusted by the edgeR program package through one scaling normalized factor [31]. Differential expression analysis of two conditions was performed using the DEGSeq R package (1.12.0) [32]. The P values were adjusted using the Benjamini & Hochberg method. Corrected p-value of 0.005 and log2 (fold change) of 1 were set as the threshold for significantly differential expression. Selected differentially expressed genes (DEGs) were clustered by the STEM Clustering Method [33].

GO and KEGG pathway enrichment analysis
Gene Ontology (GO) enrichment analysis of DEGs was implemented by the GOseq R package, in which gene length bias was corrected. GO terms with a corrected P value of less than 0.05 were considered significantly enriched for DEGs. KEGG is a database resource for understanding high-level functions and utilities of the biological system, such as the cell, the organism, and the ecosystem, from molecular-level information, especially large-scale molecular datasets generated by genome sequencing and other high-throughput experimental technologies (http://www.genome.jp/kegg/). We used KOBAS to test the statistical enrichment of differential expression genes in KEGG pathways [34].

Real-time PCR
To test the reliability of RNA-seq, 18 unigenes were selected to confirm the accuracy of different expression patterns. Their specific primers were designed with Primer Premier 5 software (Premier Biosoft International) (S1 Table). First strand cDNA was synthesized with prepared RNA of Illumina sequencing using a PrimeScript RT reagent kit (Takara, Dalian, China). Quantitative real-time PCR was performed using a 7500 Fast Real-Time PCR System and QuantiFast SYBR Green PCR Kit (Qiagen, USA). The PCR was carried out in a total volume of 10 μl and performed with the following thermal profile: 95˚C for 5 min, 40 cycles of 95˚C for 10 s and 60˚C-65˚C for 30 s. Fluorescence levels were measured after the 60˚C step. At the end of the PCR cycles, a melt curve was generated to analyze product specificity.

Reference transcriptome assembly and annotation
In order to achieve a comprehensive P. trituberculatus reference transcriptome, previous RefSeq data obtained from a variety of tissues (including the eyestalk, gill, heart, hepatopancreas, and muscle) by Illumina deep sequencing was used to create the Trinity-assembled transcriptome. The overall Illumina sequencing data were deposited in the Short Read Archive database of the National Center for Biotechnology Information (NCBI) (SRR1013694, SRR1013695, SRR1013696, SRR1168416, and SRR1168417), which contained 155,713,331 read pairs and 31.02G clean bases (Table 1). Finally, we identified 186,081 transcripts with an average length of 1,028 bp (Fig 1), which contained 130,560 unigene sequences (submitted in TSA database of NCBI: SUB2309547).

DEG identification and analysis during different molting cycles
To identify DEGs for eyestalks in different molting cycles, we sequenced three eyestalk libraries for each molting stage (InM, PrM, and PoM). Finally, the three libraries generated 14,387,942, 12,631,508 and 13,060,062 clean reads, which were mapped to the reference transcriptome and represented 74.20%, 74.10%, and 77.49%, respectively (S2 Table). The overall Illumina sequencing data were deposited in the Short Read Archive database of the NCBI (SRR5166969, SRR5166967 and SRR5166965).
A total of 1,394 DEGs were identified by DESeq [38]. The number of genes that were up-or down-regulated at the different molt stages is shown in Fig 2. There were 445 DEGs during the InM and PrM stages, among which, 284 were down-regulated and 161 were up-regulated in PrM stages, and 1,181 DEGs (452 down-regulated and 666 up-regulated) were identified between the PrM and PoM stages. A total of 728 DEGs (476 down-regulated and 252 up-regulated) were identified between InM and PoM. Eighteen unigenes were subjected to qPCR assays in order to verify the results of differential gene expression analysis ( Table 2). Within the 54 comparisons, 40 (74.1%) matched well with the DEG data. Although qPCR data of 14 unigenes did not match the DEG data, they showed similar trends with the Illumina sequencing in up-or down-regulation. The results validated that the majority of the DEGs identified from the DEG analysis in different molting phases were authentic.

GO and KEGG enrichment analysis
DEGs were further analyzed according to GO functional enrichment analysis (Fig 3). The 188 DEGs of PrM vs. InM were enriched in 10 processes, among which, most of the DEGs (77.66%) were up-regulated in PrM. In the 10 processes, chitin metabolic process (GO:0006030), ribosome (GO:0005840) and peptidase inhibitor activity (GO:0030414) have DEGs were also analyzed according to KEGG enrichment analysis in a complete molting cycle. Six groups of DEGs were enriched in 34 KEGG pathways (Fig 4), of which DEGs up-regulated in PrM vs. InM were enriched to the largest number of KEGG pathways (18). During the PoM and PrM stages, the up-and down-regulated DEGs were enriched in six and 15 KEGG pathways, respectively. The pathway of 'amino sugar and nucleotide sugar metabolism' (ko00520) was enriched at the highest level (p-value = 2.43×10 −9 ). Eighteen KEGG pathways were enriched during PoM and InM stages, among which, 13 KEGG pathways were up-regulated and five KEGG pathways were down-regulated.

Expression pattern analysis of molting-related gene
Forty-seven molting-related genes were screened and clustered by the STEM Clustering Method [33], which included 6 neuropeptide transcripts, 4 molting-related receptor transcripts, 31 cuticle transcripts and 6 chitinase transcripts. (Fig 5 and Table 3). They were

Discussion
Molting is a vitally important metamorphosis process for P. trituberculatus and affects its physical growth, tissue regeneration, and gonad development. Apart from studies concentrating on MIH, molt hormone, and methyl farnesoate, there has been little molecular research on the molting process [4]. Our study is the first to use high-throughput filtration to investigate molting-related genes in the eyestalk of P. trituberculatus at the whole transcriptome level. Compared with other technologies, RNA-seq technology has several advantages, such as high throughput, higher sensitivity and the ability to detect new or unknown genes. This approach will assist in the discovery and annotation of novel genes that play key roles in the molting process.
The reference transcriptome used in this study was based on five groups of transcriptome data from our two previous works [19,27], in which equal quantities of RNA were extracted and mixed from five tissues, including the eyestalk, and sequenced by Illumina. The transcript library obtained covered all of the P. trituberculatus ESTs from NCBI [27]. The overall Illumina  Transcriptomic variation of eyestalk reveals the genes and biological processes associated with molting read pairs and clean bases for all samples used in this study are 155,713,331 and 31.02G. We finally identified 186,081 transcripts after assembly analysis, the number of transcripts was significantly higher compared to the previous studies (120,137 and 141,339, respectively). Therefore, the transcriptome assembled in this study could be used as a reference transcriptome with a high coverage rate for subsequent analysis. Identification of DEGs among the different molting phases is important to find the underlying candidate genes of molting regulation in P. trituberculatus. We detected a total of 1,394 DEGs when comparing the three important molting cycles (qvalue<0.005 & |log2 (fold change)|>1). Some genes with important roles in the regulation of molting were found in the DEGs, which included an ecdysteroid receptor (comp97869_c0) [39], chitinase (comp99034_c0 and comp108916_c0) [9,40], hemocyanin (comp101914_c0) and nuclear hormone receptor (comp86338_c0 and comp86338_c0) [41]. Many of the DEGs have not yet been reported to have functions in the molting cycle, in spite of their having important roles in other physiological processes. For example, heat shock protein 70 (comp84939_c0), sodium/ hydrogen exchanger (comp101931_c0) and methyltransferase (comp83893_c0) play an important role in immunity, osmoregulation, and epigenetic modification, respectively. This suggests that these genes might have multiple functions and may also play important roles in the regulation of molting. It also suggests that the process of molting is closely related to these physiological processes. It should be noted that most of the DEGs (746/1394) could not be aligned to any known protein by BlastX (E values less than 10 −5 ). This indicates that the regulation of molting was far more complicated than we originally thought, and that there is still a long way to go in clarifying the molting mechanism in crustaceans.
GO and KEGG enrichment analysis were able to improve our understanding of the underlying biological processes related to various biological traits. A total of 15 GO processes were enriched, of which five processes (chitin metabolic process, peptidase inhibitor activity, chitin binding, carbohydrate derivative binding and structural constituent of cuticle) were found in every comparison of the three groups, which suggests these processes play an important role in molting. Thirty-four pathways were enriched via KEGG enrichment analysis. From PoM to PrM, the enriched pathways were mainly up-regulated (85.7% and 72.2% in InM vs. PoM and PrM vs. InM, respectively), which included 'ribosome', 'amino sugar and nucleotide sugar metabolism', 'retrograde endocannabinoid signaling', 'circadian entrainment' and 'phosphatidylinositol signaling system'. The pathway of 'ribosome', with the highest enrichment level and the highest number of DEGs (p-value = 2.84461E-10, 24 DEGs), was enriched in PrM vs. InM. The ribosome is a complex molecular machine, found within all living cells, that serves as the site of biological protein synthesis (translation). Our results indicate that a large number of protein synthesis processes were triggered to prepare for molting during PrM. From PrM to PoM, 71.4% KEGG pathways were down-regulated, which suggested that the body was in a calm state after molting. However, the pathway of 'amino sugar and nucleotide sugar metabolism' with the highest enriched level (p-value = 4.14001E-08) was up-regulated in PoM vs.
PrM. An amino sugar is a sugar molecule in which a hydroxyl group has been replaced with an amine group. More than 60 amino sugars are known, with one of the most abundant being Nacetyl-d-glucosamine, which is the main component of chitin. This indicated that there could be more active cuticle metabolism in the body after the early phase of PoM.
Previous studies on Eriocheir sinensis found a DEGs pattern associated with energy metabolism during a molting cycle. To be specific, DEGs enriched in PoM were linked to energy consumption, whereas genes enriched in InM were related to carbohydrates, lipids metabolic and biosynthetic processes [42]. However, we did not find a similar pattern with energy metabolism in our study. We believe this difference is mainly caused by the different tissues used in the two studies. Unlike the hepatopancreas, which has a main role in energy metabolism and storage, the eyestalk is a major site for the production of neurohormones and controls a variety of physiological functions such as molting and reproduction [20]. Therefore, it was not surprising that there were little energy metabolism processes or pathways enriched in DEGs in this study. But it is worth noting that a number of neuromodulator-related pathways were enriched including the dopaminergic, serotonergic, cholinergic and glutamatergic pathways, which were not found in previous research [12,13,42]. Hormones from the XO-SG complex appear to be controlled by the neurotransmitters/neuromodulators dopamine, serotonin, and enkephalin [43][44][45][46][47]. The release of some SG hormones is controlled presynaptically by dopamine, metenkephalin, and leuenkephalin, which occurs via synaptic contacts in the medulla terminalis near neuron cell bodies [48]. These neuromodulator-related pathways were up-regulated in PrM and down-regulated in PoM, which indicates molting-related neurohormone release is triggered by neuromodulators in the early phase of PrM, then aroused a series of subsequent molting processes. However, the relationship between neurotransmitters and molting-related hormones is far form clear, and further research is necessary to clarify the underlying mechanisms regulating hormone release.
The process of molting is regulated by an elaborate interplay of neuropeptide hormones, which include positive and negative regulation [13]. The signals that arise by neuropeptide hormones trigger a series of physiological activities related to molting, including losing the extracellular cuticle, escaping from the confines of the cuticle relatively rapidly, taking up water, expanding the new flexible exoskeleton and then quickly hardening it for defense and locomotion. In particular, meticulous control of regulatory proteins and genes is required to form and harden a new exoskeleton as well as to dissolve, reabsorb and shed the old one [2,49]. Forty-seven DEGs were screened and clustered, which included six neuropeptide transcripts, four nuclear receptor (NR) transcripts, 31 cuticle transcripts and six chitinase transcripts. Most genes had similar expression patterns (Profile 1 and 2) via clustering analysis. This suggests that there is a connection between these genes throughout the molt cycle.
Crustacean neuropeptides and NRs play important roles in regulating many physiological activities, including molting, growth, feeding and reproduction [50][51][52]. In arthropods, the molting process is regulated by neuropeptide hormones acting via NR proteins [41]. Our results show that the highest number of these two kinds of genes were clustered in profile 2, including five neuropeptides (pigment-dispersing hormone [PDH], neuroparsins [NPs] 1 and 2, neuropeptide and neuroendocrine convertase) and 2 NRs (EcR and E75). Interestingly, the five neuropeptide transcripts were significantly up-regulated in PrM, and their functions in molting regulation have rarely been reported. For example, the PDH, which is released primarily from the eyestalk, is important in regulating pigment dispersion in crustaceans [53]. NPs were initially identified in insects with diverse roles including anti-juvenile, antidiuretic and hyperlipidemic effects [54,55]. Neuropeptide F (NPF) is a potent stimulator regulating feeding behavior, circadian rhythm and reproduction [56][57][58][59]. The five neuropeptides were up-regulated in PrM vs. InM and subsequently down-regulated in PoM vs. PrM indicating that they play important roles in regulating molting in PrM.
EcR belongs to the NR superfamily and acts as an important transcription factor regulating downstream genes involved in the molting process [60]. Consistenting with previous studies in crustacean molting [61,62], the expression of EcR in this research was significantly up-regulated in PrM and then down-regulated in PoM. E75, which also belongs to the NR superfamily, has been identified in numerous insects such as D. melanogaster, Aedes aegypti and Galleria mellonella [63][64][65]. Although the complex role of E75 is not fully understood, it was proved to be an responsive gene in ecdysteroids signaling pathway in crustacean [66,67]. In this study, the expression pattern of E75 was similar to that of EcR, and both of them were clustered in profile 2. Our results suggest that EcR and E75 play essential roles in molting of P. trituberculatus, and point to a possible synergy between them, which deserves further verification.
Chitin, a 1, 4-β-linked polymer of N-acetyl-β-glucosamine, is found in crustacean shells [68]. During the pre-molt period, the epidermis secretes chitinases that degrade the inner layers of the old exoskeleton while synthesizing a new exoskeleton [69]. In this study, one of the six chitinase transcripts was clustered in Profile 1, which was up-regulated in PrM vs. InM and down-regulated in PoM vs. PrM, suggesting a specific role in degrading the old exoskeleton in the PrM stage. However, the chitinases have diverse functions in a wide range of organisms, including morphogenesis [70], host defense against chitin-bearing pathogens, cell division and sporulation [71], and regulators of development [72]. Interestingly, five of the six chitinase transcripts were clustered in Profile 1, which were significantly up-regulated in PoM, indicating they may participate in other important physiological processes after the molting stage, such as growth and development [73].
The crustacean cuticle provides initial reinforcement by cross-linking cuticular proteins attached to the cuticle chitin-fiber matrix and the protein matrices generally have very complex compositions. Thirty-one cuticle proteins were differentially expressed across the molt cycle and mainly clustered in Profile 1. The majority of them have a common variation tendency: they were not significantly changed in PrM vs. InM, but significantly up-regulated in the PoM, which shows that the majority of cuticles in P. trituberculatus were highly active at the PoM stage. The endocuticle and membranous layer form during the PoM, accompanied by calcification and sclerotization [74]. Our data suggest that the process to generate a new cuticle mainly occurs in the PoM, which is similar to other reports of crustacean cuticle protein expression during the molting cycle [69,75].

Conclusions
Our work is the first molt-related investigation of P. trituberculatus that is focused on the eyestalk at the whole transcriptome level. A total of 1,394 molt-related DEGs were identified. Many important processes and pathways that play a key role in molting regulation were identified including chitin metabolism, peptidase inhibitor activity, ribosome and circadian entrainment. In particular, we were the first to observe a pattern associated with neuromodulatorrelated pathways and four kinds of important molting-related gene families during a molting cycle. Together, our results, including screened DEGs, identified molting-related biology processes and pathways, and the observed expression pattern of important genes, provide novel insights into the functions of the eyestalk in molting regulation.