Comparative Transcriptome Analysis Reveals Different Silk Yields of Two Silkworm Strains

Cocoon and silk yields are the most important characteristics of sericulture. However, few studies have examined the genes that modulate these features. Further studies of these genes will be useful for improving the products of sericulture. JingSong (JS) and Lan10 (L10) are two strains having significantly different cocoon and silk yields. In the current study, RNA-Seq and quantitative polymerase chain reaction (qPCR) were performed on both strains in order to determine divergence of the silk gland, which controls silk biosynthesis in silkworms. Compared with L10, JS had 1375 differentially expressed genes (DEGs; 738 up-regulated genes and 673 down-regulated genes). Nine enriched gene ontology (GO) terms were identified by GO enrichment analysis based on these DEGs. KEGG enrichment analysis results showed that the DEGs were enriched in three pathways, which were mainly associated with the processing and biosynthesis of proteins. The representative genes in the enrichment pathways and ten significant DEGs were further verified by qPCR, the results of which were consistent with the RNA-Seq data. Our study has revealed differences in silk glands between the two silkworm strains and provides a perspective for understanding the molecular mechanisms determining silk yield.


Introduction
Bombyx mori was domesticated from the wild silkworm (B. mandarina) and has been used in various economic applications for thousands of years. Indeed, silk produced by domestic silkworms is widely used as a textile material in daily life [1,2]. However, little is known about the molecular mechanisms contributing to different silk yields in domestic silkworm strains [3].
Hundreds of silkworm strains are preserved at the Sericultural Research Institute, Chinese Academy of Agricultural Sciences, and hereditary and ecological divergence of various genes has been shown to be responsible for the quality of silk produced by silkworm in many strains. Jingsong (JS) and Lan10 (L10) are typical strains with high and low silk yields, respectively. The JS strain is widely used in various applications as a commercial strain and has silk properties that are advantageous for silk yields. In contrast, L10 produces smaller cocoons and has poor silk-producing properties [4]. The silk yield of JS is about five-fold higher than that of L10. The silk yield is about 0.5 g per individual in Js, while that of L10 is about 0.1 g. Because cocoon and silk yields are determined by genetic quantitative traits, molecular marker techniques have been used widely in silkworms in order to identify many quantitative trait loci (QTLs) [5][6][7]; indeed, our laboratory has used such techniques to identify 14 QTLs [4]. Although technopositioning can be useful for finding the genetic loci of large fragments having relatively high location-dependent (LOD) values, RNA-Seq based transcriptome analysis may be more suitable for detecting minor genes [8,9].
The silk gland (SG), which is the primary organ responsible for silk yield, can be divided into three compartments: the anterior silk gland (ASG), middle silk gland (MSG), and posterior silk gland (PSG). The silk fibers are composed of two types of major silk proteins, fibroins and sericins, which are produced by the SGs. Fibroins and sericins are synthesized by the PSG and MSG, respectively [10]. Fibroins are composed of the fibroin heavy chain (fib-H), fibroin light chain (fib-L), and p25 proteins. The three proteins form a hexameric complex in a 6:6:1 ratio [11]. The mRNA levels of these genes and the corresponding fibroin protein accumulate in SGs during different developmental stages [12]. The lumen of the MSG contains more than seven major sericins and various uncharacterized minor proteins, which are encoded mainly by Ser1, Ser2, Ser3, MSGS 3 , MSGS 4 , and MSG 5 [13][14][15][16][17][18]. However, no studies have examined the molecular mechanisms in SGs that contribute to different silk yields in domestic silkworms.
In this study, we used the SGs from third day fifth-instar larvae of the JS and L10 strains to examine the silk yield differences between the two strains by RNA-Seq. Our results will provide insights into the molecular mechanisms of silk production and lead to improved cocoon silk yield [19,20].

Sample preparation
L10 and JS strains, which have different rates of silk production, were chosen from the silkworm resource library. The larvae were reared on live mulberry under a constant 14-h light/ 10-h dark photoperiod at 25 ± 1°C and 75% ± 3% relative humidity. Intact SGs were dissected and frozen in liquid nitrogen immediately for RNA-seq.

RNA extraction and library preparation
SG total RNAs were extracted using TRIzol reagent (Invitrogen, USA) following the manufacturer's protocol. An Agilent 2100 Bioanalyzer was used to determine RNA quality. Contaminating genomic DNA was removed from a 3 μg total RNA aliquot by treatment with 10 μg DNase I (Takara, Japan) for 30 min at 37°C. RNA was purified from the digest using Dynabeads 1 Oligo (dT) 25 (Life Tech, USA). Finally, 100 ng purified mRNA per sample were used to construct the respective cDNA libraries using an NEBNext 1 UltraTM RNA Library Prep Kit for Illumina (NEB, USA).
were removed from the raw data. All of the downstream analyses were based on high-quality, clean sequences. The clean reads were directly mapped to the silkworm reference genome [23] using the Bowtie short read aligner [24].

Analysis of differentially expressed genes (DEGs)
The expression levels of genes were normalized using the reads per kb per million reads (RPKM) method [25]. The gene expression levels were compared between the two strains using DEGSeq [26]. P values were adjusted for multiple comparisons using the Benjamini and Hochberg method [27]. DEGs were filtered with the following thresholds: false-discovery rate (FDR) 0.001 and |log2 Fold Change (Log 2 FC)| 1.

Annotation and enrichment analysis for gene ontology (GO) and KEGG analyses
All genes in the current transcriptome of the silkworm were annotated based on BLAST homology searches. The expressed genes were searched against the Swiss-Prot and TrEMBL databases by double-direction BLAST with an e-value of 1e -5 or less. Orthologous genes were then annotated to GO terms and KEGG pathways using GOPipe [28] and the KEGG database [29]. To explore the function of DEGs between the two silkworm strains, GO and KEGG enrichment analyses were performed using hypergeometric tests. An FDR of 0.05 or less was set as the threshold for screening the significantly enriched GO and KEGG terms.

Validation by quantitative real-time reverse transcription polymerase chain reaction (qRT-PCR)
To further validate the RNA-Seq results, SGs from three biological samples for each strain were dissected from 3-day-old fifth instar larvae and divided into three parts (ASG, MSG, and PSG). In addition to the SGs, the fat body, midgut and hemolymph were also dissected. RNAiso Plus (TaKaRa Dalian, China) was used to isolate total RNA according to the manufacturer's instructions. Gel electrophoresis and ultraviolet spectrophotometry were used to determine the integrity and purity of the RNA. One microgram of total RNA from each sample was used to synthesize cDNA using a PrimeScript™ RT reagent Kit with gDNA Eraser (Perfect Real Time, TaKaRa)-followed by storage at -20°C. Real-time quantitative PCR was carried out in a reaction volume of 20 μL, containing 2 μL of template, 10 μL of 2× SYBR Premix EX Taq (TaKaRa), 0.4 μL of 50× ROX Reference Dye (TaKaRa), and 0.4 μL of specific primers (10 μM). The PCR amplification efficiency (E) and R^2 of each primer pair was calculated from the slope of a standard curve, which was conducted according to MIQE (Minimum information for publication of quantitative real-time PCR experiments) guidelines [30]. The qRT-PCR primer sequences, which were designed based on the consensus sequence of each alignment, and their efficiencies are provided in S1 Table. qRT-PCR was performed with an ABI7300 real-time PCR system, using the following conditions: 95°C for 5 min, followed by 40 cycles of 95°C for 5 s, 60°C for 31 s and dissociation.
The mRNA quantity of each gene was calculated with the 2 -44Ct method [31] and normalized to the abundance of the house-keeping gene BmGAPDH (Accession No. XM_012690444) [32]. The relative mRNA levels of each gene are presented as fold changes relative to the expression level of BmGAPDH.
The expression levels of each gene in the two respective strains were compared based on a Student's T-test. Differences in gene expression between the two strains were considered significantly different at P 0.05.

Overview of transcriptome sequencing data
Pooled total RNAs were isolated from 3-day-old fifth instar larval SGs corresponding to either the JS or L10 strain of silkworm. cDNA libraries were then constructed and sequenced using an Illumina HiSeq2500 platform. 22,704,312 and 21,167,022 raw reads were generated from the cDNA libraries. The quality of the RNA-Seq data is listed in the S2-1 Table. In total, 22,635,272 and 21,099,316 clean reads were filtered from the raw data. The clean data has been submitted to the NCBI SRA database with the accession numbers: SRR3190017 and SRR3190036. These clean reads were mapped to reference genome sequences obtained from the Silkworm Genome Database (Silk DB; https://www.silkdb.org/silkdb/) using the Bowtie short read aligner [23,24]. A total of 14,772,549 (65.26%) clean reads were mapped to 10,849 genes in JS, and 13,787,558 (65.35%) clean reads were mapped to 11,218 genes in L10 (S2-2 Table). There were 10,097 commonly expressed genes between the two strains with~47% and 56% of the genes covered by more than 90% of the clean reads in the JS and L10 strains, respectively (S1 Fig).

DEGs in SGs
The gene expression levels were compared between the two strains using DEGSeq [26]. Based on the thresholds for screening DEGs, 1411 genes were identified as DEGs in JS compared to L10; of these, 738 were up-regulated, and 673 were down-regulated in JS. All of the DEGs are listed in S3 Table. KEGG and GO annotations and enrichment analyses Based on homologous proteins identified in BLAST searches, 2551 genes were mapped to 32 pathways comprising the five main KEGG database categories (S2 Fig). KEGG pathway enrichment analysis was performed based on hypergeometric tests. The DEGs between the two silkworm stains were significantly enriched in three pathways (Table 1), pyrimidine metabolism, ribosome biogenesis in eukaryotes and the spliceosome. These genes are mainly responsible for protein production. Compared with L10, more than 97% of DEGs in JS were significantly upregulated in ribosome biogenesis in eukaryotes and the spliceosome (Fig 1). Additionally, 72% of DEGs in JS were also significantly up-regulated compared with those in L10 in the pyrimidine metabolism pathway. These results suggested that the processing and biosynthesis of proteins were activated to a greater degree in JS than in L10. All DEGs in the enrichment pathways are listed in S4 Table. It is reasonable to hypothesize that silk fiber biosynthesis is dependent on the production of proteins in the SG. The results of GO enrichment analysis reflected the differences in the SG between the two strains. The GO terms for the DEGs were annotated using GOPipe [28], and the GO classification bar graph was drawn according to the GO IDs. From this analysis, 4110 genes were annotated with 51 GO terms across three main categories (S3 Fig). The DEGs between the two silkworm stains were significantly enriched in nine GO items ( Table 2). The dominant GO term was membrane-enclosed lumen, belonging to the cellular component, which is consistent with results from the pathway enrichment analysis. One of the DEG enriched pathways was ribosome biogenesis. Ribosomes are mainly attached to the membraneenclosed lumen, for example the endoplasmic reticulum. It indicated that the protein process  had significant differences between the L10 and JS SGs. The DEGs were also enriched in some biological processes, including metabolic processes, cellular component biogenesis, cell wall organization or biogenesis, and cellular process. We observed significant differences in DEGs involved in molecular functions for nutrient reservoir activity and catalytic activity between the two strains. The results implied that most of the DEGs may play a positive role in protein biosynthesis.

Validation of DEGs by qRT-PCR
To validate the RNA-Seq data, qPCR was performed for representative genes in the enrichment pathways and 10 additional significantly different genes selected from the DEGs. The gene sequences were obtained from the silkworm genome sequence [33]. The qPCR expression results were similar to the results obtained from the Illumina sequencing data. In enrichment pathways, the expressions of three representative genes in JS were significantly higher than that in L10 (P 0.05, Fig 2, Table 3). The expression of BGIBMGA012722 in JS was significantly higher than that in the L10 PSG. For the other 10 DEGs, genes in the second column of Fig 2 were up-regulated in the L10 SG compared to JS (P 0.05, Fig 2E~2I, Table 3). In contrast, the genes in the third column of Fig 2 were more highly expressed in JS SG (P 0.05, Fig 2J~2N, Table 3). Based on the qRT-PCR results, we found that the following genes were differentially expressed in the SGs, fat, midgut, and hemolymph (S4 Fig). Compared with JS, genes with higher expression in the SG of L10, most of which were more highly expressed in the fat body (BGIBMGA004401 and BGIBMGA004394) and some of which were expressed at significantly higher levels in the midgut (BGIBMGA002627 and BGIBMGA003132). Compared with L10, genes with higher expression in the SG of JS silkworms, were also expressed highly in the fat body, however, expression of the genes was lower than that observed in the hemolymph of L10 silkworms. As described above, the expression patterns of DEGs in SGs was different from that of fat body and hemolymph in L10 and JS. Moreover, we predicted that the biosynthesis and secretion of silk proteins may depend on the cooperation of multiple tissues. The expression profiles of these genes across six tissues are shown in S4 Fig.

Discussion
Enrichment pathways in SGs were mainly related to protein synthesis Ribosome biogenesis is necessary for cellular adaptation, growth, and proliferation and provides a major source of energy and biosynthesis for cells [34]. DEGs enriched in the ribosome biogenesis pathway suggested a relationship between protein expression and efficiency in silk protein synthesis. Ribosome biogenesis underlies the cell's capacity to grow because cell growth, or increases in cell mass, requires large numbers of ribosomes, the molecular factories that carry out protein synthesis [35]. The high expression of ribosome proteins reflects the rapid biosynthesis in SGs, and the SG size of Js is much bigger than that of L10. These phenomena might correspond with the expression of DEGs in JS and L10. Moreover, 39 genes were enriched in the pathway of ribosome biogenesis, most of which were up-regulated in JS. For example, the expression level of BGIBMGA007162 and BGIBMGA000474 in JS were exhibited up to four times fold change in MSG and PSG than the expression level of those genes in L10 (Fig 2C~2D), and these up-regulated genes that may result in more formation of silk fiber in JS than L10. On the other hand, ribosomal proteins are not only essential in the assembly of translational machinery but are also crucial for the transport of primary proteins [36]. Additionally, the spliceosome is also associated with protein synthesis through catalysis of pre-mRNA splicing [37,38]. Zhou et al predicted that Bm eIF5A, which is involved in cell proliferation, had two spliceosome sites and may be associated with silk expression in Bombyx mori [39]. Indeed, our results showed that DEGs were enriched in the spliceosome pathway in the SG, suggesting that genes related to silk yield may facilitate the expression of silk protein.

DEGs up-regulated in L10
We characterized 10 DEGs, five of which were up-regulated in L10, and five of which were upregulated in JS (Fig 2 and S4 Fig). We manually annotated the functions of these 10 DEGs. In L10, genes encoding a 30 kDa lipoprotein and phosphoserine aminotransferase were up-regulated. The gene encoding the 30 kDa lipoprotein is expressed mainly in the fat body and integument during the larval and pupal stages [40,41] and is involved in various physiological processes, including energy storage, embryonic development, and immune response in silkworms [42][43][44][45]. The 30 kDa protein is involved in the degeneration of the B. mori SG by a caspase-dependent pathway [46]. Thus, the highly efficient biosynthesis and secretion of silk proteins may be suppressed by the 30 kDa lipoprotein. We also identified the BGIBMGA012485 gene, which functions as a phosphoserine aminotransferase and is involved in serine biosynthesis [47,48], as being up-regulated in the PSG but not the ASG or MSG of L10; the opposite expression pattern was observed in JS. We speculated that this gene may be regulated by another component that could specifically block production in the MSG and ASG in L10. As we known, the spinneret is another important tissue for spinning. Filippi's glands (FG), which are a pair of tiny glands located at the distal ends of the two ASGs, join to form a common duct. Spinneret and FGs contribute to silk formation, spinneret-expressed genes are closely related to silk formation and provide a suitable environment for silk fiber formation. In a previous study, FGs were suggested to possibly function in transporting small solutes to the ASG duct [49]. DEGs in the spinneret and FG between day 3 of fifth instar larvae and wandering stage larvae were mainly enriched in the pathways associated with ion-transport, chitin binding, and energy metabolism. But these findings are different from our study in which it is mainly related to pyrimidine metabolism, ribosome biogenesis in eukaryotes, and the Log2 FC is referred to log2 fold change and P is referred to P value. P 0.05 is marked with "*" and P 0.01 is marked with "**".
spliceosome. That might be because the SG is responsible for biosynthesis of the silk ingredients and the spinneret and FG provide the environment for mixing the ingredients. It suggests that spinneret, FG and SG play different functional roles during the silk fiber formation. Despite high expression in the PSG of L10, the silk yield was low. Thus, of the up-regulated DEGs in the L10, some may play important roles in defense against pathogens and in low efficiency of protein synthesis. Additionally, some of the genes may be involved in signaling processes during cell communication.

DEGs up-regulated in JS
We identified a number of genes that were up-regulated in JS, including a gene (BGIBMGA0 03010) encoding a protein with similar function to Werner helicase-interacting protein 1 (WRNIP1). BGIBMGA003010 encodes a protein that exhibits ATPase activity and is involved in DNA synthesis and WRNIP1 has been reported to function in the metabolism of ubiquitinated proteins [50]. Several different DNA repair pathways have been shown to be regulated by ubiquitination. For example, the proper execution of nucleotide excision repair depends upon the polyubiquitination and degradation of a subset of its component enzymes [51,52], whereas translesion DNA synthesis is regulated by mono-and polyubiquitination of PCNA3 [53]. The BGIBMGA009392 gene encodes a protein with protein kinase functionality, which is fundamental to most signaling and regulatory processes in eukaryotic cells [54]. The DESs up-regulated in JS are mainly enriched in the pyrimidine metabolism, spliceosome and ribosome biogenesis pathways. The up-regulatory genes in the pyrimidine metabolism and spliceosome pathways will facilitate the transcription process of mRNAs which are required during the silk fiber synthesis and the highly expressed genes in ribosome biogenesis pathway will benefit the translation process of silk related proteins. The more proteins are translated, the more silk fiber will be produced. That might be the cause which led to the difference in silk production between two silkworm strains.
Supporting Information