Shotgun Proteomic Analysis on the Diapause and Non-Diapause Eggs of Domesticated Silkworm Bombyx mori

To clarify the molecular mechanisms of silkworm diapause, it is necessary to investigate its molecular basis at the protein level. Here, spectra of peptides digested from silkworm diapause and non-diapause eggs were obtained by liquid chromatography tandem mass spectrometry (LC-MS/MS) and analyzed by bioinformatics methods. A total of 501 and 562 proteins were identified from the diapause and non-diapause eggs, respectively, of which 309 proteins were shared. Among these common-expressed proteins, three main storage proteins (vitellogenin precursor, egg-specific protein and low molecular lipoprotein 30 K precursor), nine heat shock proteins (HSP19.9, 20.1, 20.4, 20.8, 21.4, 23.7, 70, 90-kDa heat shock protein and heat shock cognate protein), 37 metabolic enzymes and 22 ribosomal proteins were identified. There were 192 and 253 unique proteins identified in the diapause and non-diapause eggs, respectively, of which 24 and 48 had functional annotations. These unique proteins indicated that metabolism, translation of mRNA and synthesis of proteins were potentially more highly represented in the non-diapause eggs than in the diapause eggs. The relative mRNA levels of four identified proteins in the two kinds of eggs were also compared using quantitative reverse transcription PCR (qRT-PCR) and showed some inconsistencies with protein expression. GO signatures were available for 486 of the 501 and 545 of the 562 proteins identified in the diapause and non-diapause eggs, respectively. In addition, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis showed that the Metabolism, Translation and Transcription pathways were potentially more active in the non-diapause eggs at this stage.


Introduction
Diapause is a special physiological state of arrested development adopted by many insects to avoid unfavorable environments such as low temperature, drought or food shortage and to synchronize their life cycles to these changes [1,2]. Diapause is widespread among insects and can occur at any stage of the life cycle, i.e., adult, pupa, larva or egg, and each species enters diapause at a fixed stage. Diapause is endogenously controlled, and this dormancy typically begins well before conditions become too harsh to support normal growth and development.
In the silkworm Bombyx mori, the development of diapause-destined embryos is arrested during the G2 cell cycle stage immediately after formation of the cephalic lobe and telson and sequential segmentation of the mesoderm [3]. Embryonic diapause is determined by a diapause hormone that is secreted by the suboesophageal ganglion (SG) of the mother moth during the pupal period and acts on her developing ovaries [4,5]. The termination of diapause requires a low temperature of 5°C for 2-3 months [6]. Various stimuli can also artificially prevent or terminate diapause, such as HCl, physical stress, corona discharge, or higher oxygen pressure [7]. Once diapause terminates, the embryos are competent to resume development at 25°C and cells enter the M phase; cell division in the embryos then resumes [3].
However, the molecular mechanisms involved in generating, maintaining, and breaking diapause have not yet been fully elucidated [1]. Currently, only a few studies have elaborated the molecular mechanisms that regulate diapause [8,9,10,11]. Most of these studies have addressed the molecular regulation of diapause in the larval, pupal and adult stages. The molecular regulation of embryonic diapause is still unclear, perhaps because of the difficulty of extracting the limited amounts of RNA from a developmental stage that has relatively little tissue and large amounts of protein and lipid.
The gene expression profiles linked with diapause have been studied in many insects, such as northern malt fly species [12,13], flesh fly [14], Chinese oak silkworm [15], Allonemobius socius [16], Drosophila melanogaster [17] and Colorado potato beetle [18]. Nevertheless, gene expression profiles alone are not sufficient to explain gene functions, and mRNA levels correlate only incompletely with protein levels [19,20] because of alternative splicing and the dynamics of gene translation. Therefore, with the gradual completion of genome sequencing of insect model organisms, proteomics has become a focal point in entomological research, including research on the diapause mechanism. For a long time, two-dimensional gel electrophoresis (2-DE) combined with mass spectrometry (MS) has been frequently used in insect proteomic research [21,22]. Another relevant approach for large-scale characterization of proteome profiles is shotgun proteomics, which is based on in-gel or gel-free digestion of protein mixtures followed by liquid chromatography tandem mass spectrometry (LC-MS/MS) separation, MS detection and database searching, and provides an extremely sensitive and high-throughput way to determine the proteome components in a complex biological sample. This approach has been widely used in many organisms, such as Bombyx mori [23,24,25], Homo sapiens [26,27] and Orientia tsutsugamushi [28,29].
With the rapid development of proteomics and bioinformatics approaches, the credibility of results has greatly improved. In this study, we used the shotgun LC-MS/MS approach combined with bioinformatics analysis to characterize the differences between the protein identification profiles of the diapause and non-diapause eggs of the silkworm and to find clues regarding the molecular regulation of diapause at the early stage of embryogenesis in the silkworm.

Silkworm Rearing and Sample Collection
Bivoltine silkworm strain 932 was incubated at 20°C during the egg hatching period, with half of the eggs on a short-day photoperiod (8-h-light:16-h-dark, for the generation of non-diapause eggs) and half on a long-day photoperiod (16-h-light:8-h-dark, for the generation of diapause eggs). Larvae were reared on fresh mulberry leaves at 24 to 25°C and 85% relative humidity. Pupae were kept at 25°C. After female moths emerged, they copulated with males (usually within 5 h) and then laid eggs at 25°C. Diapause and non-diapause eggs laid within 1 h by 30 female moths each were collected separately to obtain synchronously developing egg batches. All the diapause and non-diapause eggs were collected 24 h after oviposition and divided into 0.1 g samples (corresponding to about 200 eggs each, in triplicate). All samples were stored at -80°C until use.

Sample Preparation and SDS-PAGE Separation
Samples of diapause eggs (D) and non-diapause eggs (ND) were mechanically homogenized on ice for 10 min in 10 μL lysis buffer (comprising 8 M urea, 2 M thiourea, 2% CHAPS, 20 mM Tris-HCl, 30 mM DTT) per 1 mg tissue. The samples were then sonicated in an ice bath for five cycles, each consisting of 30 s of sonication followed by a 30 s interval. The samples were then centrifuged at 12,000 × g at 4°C for 15 min. The supernatants were collected and centrifuged again, and the resultant supernatants were stored at -20°C for further study. The concentrations of the protein samples were determined by the Bradford method [30]. The samples were boiled for 2 min and centrifuged at 12,000 × g for 10 min before they were subjected to SDS-PAGE, using a 5% stacking gel and a 12.5% resolving gel. For each sample, a total of 300 μg protein was separated by SDS-PAGE on three lanes with 100 μg per lane. The gels were stained with Coomassie Brilliant Blue (CBB) R250 (Sigma, USA) after electrophoresis.

In-gel Digestion
Each gel lane was manually cut into 8 bands according to the intensity of Coomassie staining (Fig. 1). The gel bands were sliced into 1 × 1 mm pieces and subjected to in-gel tryptic digestion, which was essentially carried out as described by Shevchenko et al. [31]. Briefly, the gel pieces were rinsed three times with Milli-Q water (Millipore) and destained twice with 25 mM NH4HCO3 in 50% acetonitrile (ACN, Amersham) at 37°C until the stain was completely removed. The dried gels were incubated with 50 mM tris(2-carboxyethyl)phosphine (TCEP, Sigma) in 25 mM NH4HCO3 at 56°C for 1 h to reductively cleave the disulfide bonds of proteins, and the resulting sulfhydryl groups were then alkylated with 100 mM iodoacetamide (IAA, Amersham) in 25 mM NH4HCO3 at room temperature in the dark for 0.5 h. Gel pieces were washed twice with 25 mM ammonium bicarbonate in 50% acetonitrile solution, dehydrated twice with 100% acetonitrile, and dried in a vacuum centrifuge. Subsequently, the proteins were digested with 20 ng/μL modified trypsin (Sigma) for 20 h at 37°C. The resulting peptides were extracted twice from the gel pieces with 5% trifluoroacetic acid (TFA, Fluka, Milwaukee, WI, USA) in 50% ACN solution. The pooled extracts were evaporated in a vacuum centrifuge (Labconco, Kansas, MO), and resuspended in 0.1% methanoic acid (Sigma).
After tryptic digestion, the 8 slices were combined into 4 groups for each sample (slices 1 and 2, slices 3 and 4, slices 5 and 6, slices 7 and 8) prior to the LC separation and MS detection.
Figure 1. SDS-PAGE patterns of the diapause (D) and non-diapause (ND) egg protein samples. The samples were separated on a 12.5% resolving gel in triplicate. The three batches for both ND and D were pooled into two sets of eight prior to in-gel digestion. The eight slices were then combined into four groups for each sample (comprising slices 1 and 2, slices 3 and 4, slices 5 and 6, slices 7 and 8 for both ND and D) prior to the LC separation and MS detection. doi:10.1371/journal.pone.0060386.g001

Shotgun LC-MS/MS Analysis
All digested peptide mixtures were separated by reverse phase (RP) HPLC followed by tandem MS analysis. RP-HPLC was performed on a Surveyor LC system (Thermo Finnigan, San Jose, CA). Samples were first loaded onto a trap column (Zorbax 300SB-C18 peptide traps, 300 μm × 65 mm, Agilent Technologies, Wilmington, DE) at a 3 μL/min flow rate for peptide enrichment and desalting. After flow-splitting down to about 1.5 μL/min, peptides were transferred to the analytical column (RP-C18, 150 μm × 150 mm, Column Technology, Inc., Fremont, CA) for separation with a 195 min linear gradient from 95% buffer A (0.1% methanoic acid in water) to 50% buffer B (84% ACN, 0.1% methanoic acid in water) at a flow rate of 250 nL/min. The analytical column was regenerated for 20 min with buffer A before loading the next sample. A Finnigan LTQ linear ion trap mass spectrometer equipped with a nanospray source was used for the MS/MS [32,33] experiment in the positive ion mode. The temperature of the ion transfer capillary was set at 170°C. The spray voltage was 3.0 kV and the normalized collision energy was set at 35.0% for MS/MS. The MS analysis was performed with one full MS scan (m/z 400-1800 with a resolution R = 60,000 at m/z 400) followed by 10 MS/MS scans on the 10 most intense ions from the MS spectrum, with the following dynamic exclusion settings: repeat count 2, repeat duration 30 s, exclusion duration 90 s. Data were acquired in data-dependent mode using Xcalibur software (version 2.0.7, Thermo Fisher Scientific).

Database Search
The database search was carried out against an in-house database [34], built from protein sequences downloaded from the NCBInr protein database (http://www.ncbi.nlm.nih.gov/) for the domesticated silkworm (B. mori, 2224 entries), the wild silkworm (B. mandarina, 13 entries) and the fruit fly (D. melanogaster, 27777 entries), along with predicted B. mori protein sequence data (14,623 entries) [35]. The four raw datasets of each of the ND and D samples were searched against the in-house database on a local server using Turbo SEQUEST software (Bioworks version 3.2, Thermo Finnigan). The mass tolerances of the precursor ion and fragment ions were set to 1.5 Da and 1.0 Da, respectively. Tryptic cleavage was required at both ends of each peptide, and two missed cleavage sites were allowed. Only b and y fragment ions were taken into account. A static modification (carbamidomethyl) on cysteine and a variable modification (oxidation) on methionine were set for all searches. The results for each dataset were stored as .out format files. All the .out files were filtered by Buildsummary software. The protein identification criteria that we used were based on Delta CN (≥0.1) and Xcorr (one charge ≥1.9, two charges ≥2.2, three charges ≥3.75) [36].
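The identification criteria above amount to a simple charge-dependent threshold filter. The following is a minimal sketch of that logic only; the `PeptideMatch` record, the example peptides and the decision to reject charge states above three are illustrative assumptions and do not reproduce the Buildsummary software or the SEQUEST .out file format.

```python
from dataclasses import dataclass

# Thresholds quoted in the text: DeltaCN >= 0.1 and a charge-dependent Xcorr.
XCORR_MIN = {1: 1.9, 2: 2.2, 3: 3.75}
DELTA_CN_MIN = 0.1


@dataclass
class PeptideMatch:
    """Hypothetical record for one peptide-spectrum match (illustration only)."""
    peptide: str
    charge: int
    xcorr: float
    delta_cn: float


def passes_filter(match: PeptideMatch) -> bool:
    """Return True if the match meets both the DeltaCN and Xcorr criteria."""
    threshold = XCORR_MIN.get(match.charge)
    if threshold is None:
        # Assumption: charge states not covered by the stated criteria are rejected.
        return False
    return match.delta_cn >= DELTA_CN_MIN and match.xcorr >= threshold


# Made-up matches, used only to exercise the filter.
matches = [
    PeptideMatch("AAAK", 1, 2.0, 0.15),   # passes: Xcorr 2.0 >= 1.9
    PeptideMatch("CCCR", 2, 2.1, 0.30),   # fails: Xcorr 2.1 < 2.2
    PeptideMatch("DDDK", 3, 3.80, 0.05),  # fails: DeltaCN 0.05 < 0.1
    PeptideMatch("EEEK", 3, 3.75, 0.10),  # passes: both values at threshold
]
accepted = [m.peptide for m in matches if passes_filter(m)]
```

Keeping the thresholds in a single dictionary makes it easy to see that the criteria become stricter as the precursor charge increases.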

Quantitative RT-PCR
Total RNA was extracted from diapause and non-diapause egg samples using TRIzol reagent (Invitrogen, Carlsbad, CA, USA) according to the manufacturer's instructions and then reverse-transcribed into cDNA with random primers and M-MLV reverse transcriptase (Promega, Madison, WI, USA). The qRT-PCR was performed in 25 μL reactions with 100 ng reverse transcription product, 200 nM each of the forward and reverse primers, and SYBR® Premix Ex Taq™ (Takara, Tokyo, Japan). The cDNA was amplified in a Rotor-Gene 3000 real-time PCR system (Corbett Research, Sydney, Australia) according to the following program: initial denaturation at 95°C for 10 min, and 40 cycles of amplification at 95°C for 30 s, 56°C for 30 s and 72°C for 30 s, followed by an additional melting curve step in which the temperature was increased from 72 to 95°C at 0.5°C/6 s, and then 30°C for 30 s. The expression levels were calculated using Rotor-Gene software (version 6.0.19) on the basis of the difference in Ct values after normalization to the reference gene (BmactinA3, accession No. X04507). The diapause-unique gene BGIBMGA001595 (aliphatic nitrilase, BmnitA) and the non-diapause-unique genes BGIBMGA002594 (adenylate kinase 2, BmADK2), BGIBMGA011412 (isocitrate dehydrogenase, BmIDH) and BGIBMGA002521 (gamma-glutamyltransferase, BmGGT) were chosen for the qRT-PCR. The qRT-PCR was performed using the following primers: BmnitA forward, TCGGGAAACATCGCAAGAAC and reverse, GCCGTATCTGGTCGCAAATAC; BmADK2 forward, ATCACGCTCAAACGGTTCCTT and reverse, AGACGTCATCAGCGGCTTTC; BmIDH forward, CGTTGCGACCAGACATAAGGA and reverse, TTCACGGATTCGTGTTCCAA; BmGGT forward, GACAGCCTCAAACCCAATCAG and reverse, GCGGCCATAAAGCCATCTC.
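The text states only that expression levels were derived from Ct differences after normalization to BmactinA3; the exact model applied by the Rotor-Gene software is not given. As an illustration, the sketch below uses the common 2^-ΔΔCt calculation, which assumes roughly 100% amplification efficiency for both target and reference. The function name and all Ct values are hypothetical.

```python
def relative_expression(ct_target_sample: float, ct_ref_sample: float,
                        ct_target_calibrator: float, ct_ref_calibrator: float) -> float:
    """Fold change of a target gene in a sample relative to a calibrator sample.

    Assumption: the 2^-ddCt model with equal, near-100% primer efficiencies.
    """
    # Normalize the target Ct to the reference gene within each sample.
    delta_sample = ct_target_sample - ct_ref_sample
    delta_calibrator = ct_target_calibrator - ct_ref_calibrator
    # Compare the two normalized values and convert cycles to fold change.
    delta_delta_ct = delta_sample - delta_calibrator
    return 2.0 ** -delta_delta_ct


# Made-up Ct values: the target crosses threshold 4 cycles earlier in the
# sample than in the calibrator, with identical reference-gene Ct values.
fold = relative_expression(22.0, 18.0, 26.0, 18.0)
```

With these illustrative inputs, a 4-cycle difference corresponds to a 16-fold difference in relative mRNA level, which shows why small Ct shifts translate into large fold changes.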

Bioinformatics Analysis
Protein sequences were searched against InterPro member databases using the InterProScan software to identify protein signatures [37]. The compiled raw outputs were subjected to gene ontology (GO) category analysis using the Web Gene Ontology Annotation Plot (WEGO) [38]. The two groups of datasets were simultaneously subjected to online analysis (http://wego.genomics.org.cn/cgi-bin/wego/index.pl) so that they could be compared conveniently in one graph. The P-values were calculated using the Pearson Chi-Square test where available. The identified proteins were classified into cellular component, molecular function and biological process. The pathways related to the identified proteins were predicted by searching against the KEGG reference pathway database (http://www.genome.jp/kegg/tool/search_pathway.html) with the available protein sequences.
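To illustrate the kind of comparison a Pearson Chi-Square test makes for a single GO term, the sketch below tests a 2 × 2 table of proteins with and without the term in each sample. It is not the WEGO implementation: the function name, the absence of a continuity correction and the choice of the GO-annotated protein counts as denominators are assumptions. The example counts (194 of 486 D proteins and 231 of 545 ND proteins assigned to binding) are taken from the results reported in this paper.

```python
import math


def pearson_chi_square_2x2(hits_a: int, total_a: int, hits_b: int, total_b: int):
    """Pearson chi-square test (1 degree of freedom, no continuity correction).

    Rows are the two samples; columns are proteins with / without the GO term.
    Assumes every expected cell count is greater than zero.
    """
    observed = [[hits_a, total_a - hits_a], [hits_b, total_b - hits_b]]
    row_sums = [sum(row) for row in observed]
    col_sums = [observed[0][j] + observed[1][j] for j in range(2)]
    n = sum(row_sums)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# "Binding" term: 194 of 486 annotated D proteins vs 231 of 545 annotated ND proteins.
chi2, p_value = pearson_chi_square_2x2(194, 486, 231, 545)
```

Under these assumptions the p-value for the binding term is well above 0.05, in line with the observation that GO category distributions did not differ significantly between the two kinds of eggs.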

Proteome Profiles of the D Compared with ND Samples
The shotgun proteomics strategy, based on proteolytic digestion of complex protein mixtures, LC separation of peptides and tandem MS sequencing, has been widely utilized. However, database searching remains the bottleneck for many shotgun proteomics experiments. In our research, the 4 raw datasets of each sample from LC-MS/MS were therefore subjected to an in-house database search using SEQUEST. A total of 501 and 562 proteins were identified from D and ND, respectively (Fig. 2), from 8091 peptides including 1560 unique peptides and 8125 peptides including 1826 unique peptides, respectively. The number of common-expressed proteins between the two samples was 309, which was 61.68% and 54.98% of the proteins in D and ND, respectively. Moreover, there were 135 common-expressed proteins with functional annotations (Table S1); the others were predicted proteins from the silkworm genome database. There were 192 and 253 unique proteins identified in D and ND, respectively, of which 24 and 48 had functional annotations (Tables 1 and 2).
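The counts reported above are related by simple set arithmetic, which the following sketch reproduces from the three reported totals (501, 562 and 309). The function name and dictionary keys are illustrative assumptions.

```python
def overlap_summary(n_d: int, n_nd: int, n_shared: int) -> dict:
    """Derive unique counts, union size and shared percentages from two set sizes."""
    return {
        "d_unique": n_d - n_shared,           # proteins found only in D
        "nd_unique": n_nd - n_shared,         # proteins found only in ND
        "union": n_d + n_nd - n_shared,       # non-redundant proteins overall
        "shared_pct_d": round(100 * n_shared / n_d, 2),
        "shared_pct_nd": round(100 * n_shared / n_nd, 2),
    }


summary = overlap_summary(501, 562, 309)
```

The derived values match the reported 192 D-unique proteins, 253 ND-unique proteins, and shared fractions of 61.68% and 54.98%; the union of 754 proteins is the total later used for the KEGG query.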

Theoretical Two-dimensional Distribution of the Identified Proteins
The theoretical isoelectric point (pI) and molecular weight (Mw) of the identified proteins were calculated using the Compute pI/Mw tool (http://cn.expasy.org/tools/pi_tool.html) according to the predicted amino acid sequences. The results showed that 87.30% of the total proteins were distributed in the ranges of pI 4-7 and 8-10 (Fig. 3a). About 60.87% of the proteins were distributed in the range of 15-60 kDa (Fig. 3b). In the two kinds of eggs, less than 7.62% of proteins showed pI 7-8. Furthermore, 25 and 29 proteins with higher pI (more than 10), which are usually difficult to separate by 2-DE, were also identified from D and ND, respectively. The results also revealed that the protein distributions were nearly identical in the D and ND samples.
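The pI distribution described above can be tabulated by assigning each theoretical pI to a range. The sketch below shows one way to do this; the treatment of boundary values (for example, whether pI 7.0 falls in 4-7 or 7-8) is an assumption, since the text does not specify it, and the pI values are made up for illustration.

```python
from collections import Counter


def pi_bin(pi: float) -> str:
    """Assign a theoretical pI to one of the ranges discussed in the text.

    Assumption: lower bounds are inclusive, and pI exactly 10 counts as 8-10.
    """
    if pi < 4:
        return "<4"
    if pi < 7:
        return "4-7"
    if pi < 8:
        return "7-8"
    if pi <= 10:
        return "8-10"
    return ">10"


# Hypothetical pI values standing in for Compute pI/Mw output.
example_pis = [5.2, 6.8, 7.4, 9.1, 10.6, 4.9]
distribution = Counter(pi_bin(pi) for pi in example_pis)
```

Dividing each count by the number of proteins would give percentages comparable to those shown in Fig. 3a.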

Profiling of Common-expressed Proteins in the D and ND Samples
In the list of annotated common-expressed proteins of D and ND, many functional proteins were detected (Table S1). Among these common-expressed proteins, we identified vitellogenin precursor (gi:112983746), egg-specific protein (gi:187281695) and low molecular lipoprotein 30 K precursor (gi:156119320, gi:162461355 and gi:112984502), which comprise the three main storage proteins in silkworm eggs [39,40]. The relative concentration of a protein identified by MS in a biological sample is directly related to the number of identified peptides, neglecting possible effects such as the enzymatic digestion constraint, the detection mass range of the mass spectrometer and differential post-translational modifications. Therefore, the number of identified unique peptides assembled into a protein may reflect the protein's relative abundance [25,41]. According to the number of identified unique peptides, vitellogenin precursor, egg-specific protein, low molecular lipoprotein 30 K precursor, and some predicted proteins such as BGIBMGA013342-PA, BGIBMGA004585-PA and BGIBMGA004399-PA were highly represented in the samples. Among these predicted proteins, BGIBMGA013342-PA and BGIBMGA004585-PA were involved in lipid transport (GO: 0006869) with the molecular function of lipid transporter activity (GO: 0005319), while BGIBMGA004399-PA was identified as a low molecular weight lipoprotein located in the extracellular region (GO: 0005576). These highly represented proteins are important at the early stage of embryogenesis [40]. Moreover, many heat shock proteins were identified, such as heat shock protein HSP19.9, HSP20.1, HSP20.4, HSP20.8, HSP23.7, HSP70, HSP90 and heat shock cognate protein.
Of these proteins, HSP19.9, HSP20.1, HSP20.4, HSP20.8 and HSP23.7 belong to the small heat shock protein (sHSP) family, whose members mainly function as molecular chaperones that protect proteins from being denatured under extreme conditions, especially high temperature stress [42,43,44]. The sHSP family is functionally more diverse than other HSPs [29]. Moreover, sHSPs can also have a protective function under other stress conditions, such as cold, drought, oxidation, hypertonic stress, UV, and heavy metals [45,46], and even under high population densities of organisms [47]. The HSPs, including sHSPs, also play an important role in normal development [44]. However, insect orthologous sHSPs may not be associated with the response to environmental stresses and may be involved in basic metabolic processes. Moreover, silkworm sHSPs may play an important role in the development of the germocyte and have functions in immune defense mechanisms [42]. HSP70, heat shock cognate protein, HSP90 and sHSPs are associated with diapause in a number of species [48,49,50]. In this study, however, some HSPs were highly expressed in both diapause and non-diapause eggs, including HSP90, HSP20.8, HSP20.4 and HSP70. A possible explanation is that these HSPs may play an important role in initial embryonic development regardless of diapause or non-diapause. Furthermore, among the common-expressed proteins with functional annotations, 37 metabolic enzymes were identified (11.97% of the common-expressed proteins). The metabolic enzymes include ADP/ATP translocase, fibroinase, glucose-6-phosphate isomerase, glutamate dehydrogenase, isocitrate dehydrogenase, salivary secreted ribonuclease, transketolase, etc. The identification of these essential metabolic enzymes indicated that metabolism was active in the eggs during the embryogenesis stage.
Many ribosomal proteins (L7, L7A, L9, L10A, L11, L13, L13A, L15, L17, L18, L19, L23A, L38, S3, S5, S7, S8, S9, S15A, S27, S28, and S30) were found in D and ND samples. This abundance of ribosomal proteins on the first day after oviposition might be closely related to protein synthesis during embryogenesis.
In addition, 62 proteins that may also play important roles in egg development were identified in both samples, such as 14-3-3 epsilon protein, antennal binding protein, calreticulin, cellular retinoic acid binding protein, chemosensory protein 11, exuperantia, perilipin, profilin and transferrin. 14-3-3 epsilon protein (Bm14-3-3e) is a member of the 14-3-3 protein family. The 14-3-3 proteins, along with partner proteins (for example CDK11, PFTK1, GSK3beta, Chk1), are involved in the regulation of several crucial cellular processes including metabolism, signal transduction, cell development, differentiation, apoptosis, transcription, stress responses and malignant transformation [51,52,53,54,55,56]. In this study, we identified Bm14-3-3e in the D and ND samples. This result indicated that Bm14-3-3e may play a role in regulating embryonic development of the silkworm. Another important protein was translationally controlled tumor protein (TCTP), a highly conserved protein upregulated in various tumours. Hsu et al. [57] reported that reducing Drosophila TCTP (dTCTP) levels reduces cell size, cell number and organ size. Furthermore, calreticulin, located in the endoplasmic reticulum, has been implicated in many diverse functions, including regulation of intracellular Ca2+ homeostasis, chaperone activity, steroid-mediated gene regulation, and cell adhesion [58]. The highest level of calreticulin mRNA expression was found in the fat body of Bombyx mori [59]. Our results showed that calreticulin was present during initial embryogenesis in both D and ND samples, which suggests that calreticulin may be important during embryonic development.

Profiling of D-Unique Proteins
The characteristic proteins with functional annotations unique to the D samples are shown in Table 1. Among the D-unique proteins, BCP inhibitor (BCPI, Bombyx cysteine proteinase inhibitor) inhibits the processing of the enzymatically inactive proform of BCP (pro-BCP) to the activated mature BCP but has no effect on trypsin and pepsin, and is highly expressed in the metamorphosis stage of B. mori [60]. In our study, BCPI, which was exclusively identified in the D samples, may be involved in regulating the arrest of diapause-destined embryos during the G2 cell cycle stage.
Calcineurin A and FK506-binding protein were other proteins uniquely identified in the D sample. Calcineurin A, with four EF-hand type calcium-binding structures, is localized in the cytoplasm of the pheromone-producing cells and participates in the intracellular signal transduction of PBAN (pheromone biosynthesis activating neuropeptide) in B. mori [61]. The FK506-binding protein (FKBP) belongs to a ubiquitous family of proteins that participates in a variety of pathways, including protein folding, down-regulation of T-cell activation and inhibition of cell-cycle progression [62]. Both Calcineurin A and FK506-binding protein identified in the diapause eggs are related to Ca2+ release, suggesting that the regulation of Ca2+ might be more active in the diapause eggs than in the non-diapause eggs. Other proteins identified in the diapause eggs were cytochrome P450 CYP6AE9, cytochrome P450 CYP6AE7, titin1, titin2, topoisomerase II, etc. (Table 1).
In the diapause eggs, the cephalic lobe and telson and sequential segmentation of the mesoderm are formed before the embryos enter diapause [3]. In other words, after oviposition the diapause eggs are on the one hand proceeding with embryonic development, while on the other hand preparing to enter diapause. Thus, these D-unique proteins may be adapted to this physiological need.

Profiling of ND-Unique Proteins
The characteristic proteins unique to the ND samples are shown in Table 2. More metabolic enzymes were identified in the ND samples than in the D samples, including 6-phosphogluconate dehydrogenase, adenylate cyclase, alcohol dehydrogenase, ATP synthase (subunit B), carboxylesterase, glutathione peroxidase, malate dehydrogenase, ovarian serine protease, serine hydroxymethyltransferase, etc. (Table 2). In the non-diapause eggs, the development of the embryos was continuous and without interruption, so the metabolism of the non-diapause eggs was potentially more active than that of the diapause eggs.
In addition, we also identified eukaryotic translation initiation factor 3 subunit 2 beta, translation initiation factor 2 gamma subunit, eukaryotic translation termination factor 1 and more ribosomal proteins (L4, L10, S4, S6) in the non-diapause eggs. These components are important in regulating translation and protein synthesis, indicating that translation of mRNA and synthesis of proteins were potentially more active in the non-diapause eggs than in the diapause eggs. Proteasome beta-subunit, proteasome zeta subunit and proteasome 26S non-ATPase subunit 7, which play important roles in degrading cytosolic and nuclear proteins previously labeled with ubiquitin molecules [63], were also identified only in the non-diapause eggs. Other proteins identified in the non-diapause eggs were ecdysteroid-inducible angiotensin-converting enzyme-related gene product, whose transcription is directly induced by 20-hydroxyecdysone (20E), ubiquitin-conjugating enzyme E2M, stathmin and a ras oncoprotein, all of which are associated with embryonic development [64].

Quantitative RT-PCR Analysis
The diapause-unique gene BmnitA and the non-diapause-unique genes BmADK2, BmIDH and BmGGT were selected for analysis of their relative mRNA levels by qRT-PCR. Relative gene expression levels are shown in Fig. 4. In the diapause eggs, the relative mRNA levels of BmnitA, BmADK2 and BmGGT increased significantly relative to those in non-diapause eggs across developmental stages, whereas the relative mRNA level of BmIDH remained constant relative to that in non-diapause eggs, which indicated that BmIDH is a possible non-diapause-unique gene. Interestingly, at initial embryogenesis the relative mRNA levels of all four genes were significantly higher in the non-diapause eggs than in the diapause eggs.
In the non-diapause eggs, relative mRNA expression was significantly higher than in diapause eggs for all four genes: more than 21-fold for BmIDH, more than 21-fold for BmnitA, more than 1795-fold for BmADK2 and more than 57-fold for BmGGT. However, the mRNA levels were not always consistent with the corresponding protein expression. At the mRNA level, the expression of these four genes could be detected in both diapause and non-diapause eggs.

Gene Ontology Analysis of the Functional Categories
Gene ontology (GO) is now widely used to describe the function of genes and gene products in a standardized format (http://www.geneontology.org) [65]. To understand the functions of the proteins we identified, the protein sequences were queried against the InterPro databases and the resultant proteins were functionally categorized based on universal GO annotation terms using the online GO tool WEGO [38,66]. GO signatures were available for 486 of the 501 and 545 of the 562 proteins identified in the diapause and non-diapause eggs, respectively. They were classified into Cellular Component, Molecular Function and Biological Process according to the GO hierarchy using WEGO (Fig. 5).
In the Cellular Component category, proteins mapping to cell, cell part, macromolecular complex and organelle related GO terms were the most abundant, and those mapping to membrane-enclosed lumen were the fewest. In the subcategory of cell part, 114 and 131 proteins of D and ND, respectively, were ascribed to intracellular. In the subcategory of organelle, 30 and 35 proteins, respectively, were assigned to membrane-bounded organelle, and 39 and 48 proteins, respectively, were assigned to non-membrane-bounded organelle.
In the Molecular Function category, most proteins were assigned to binding (194/231 in the D and ND samples, respectively) and catalytic activity (177/215 in the D and ND samples, respectively), especially nucleotide, nucleoside, chromatin, ion and nucleic acid binding and hydrolase, oxidoreductase and transferase activities. The groups with far fewer proteins (only one protein annotated) include small protein activating enzyme activity, chromatin binding, odorant binding, metal cluster binding, enzyme activator activity, phosphatase regulator activity, transcription repressor activity and transcription initiation factor activity. Moreover, one protein with deaminase activity, two proteins with cyclase activity and one protein with thioredoxin-disulfide reductase activity were annotated only in the non-diapause eggs.
In the Biological Process category, most of the proteins were involved in metabolic and cellular processes. Far fewer proteins were involved in reproduction, reproductive process, biological adhesion and developmental process. In the metabolic process subcategories, most proteins were related to primary metabolic, cellular metabolic and macromolecule metabolic processes. In the cellular process subcategories, most proteins were associated with the cellular metabolic process, followed by the cell communication process. In addition, only one protein was functionally annotated to each of secondary metabolic process, cellular component disassembly, translational initiation, membrane docking and secretion by cell in the diapause eggs.
GO analysis of the identified proteins presented an overall view of the functional categories of the diapause and non-diapause egg proteomes. In addition, the GO annotations of proteins in the diapause eggs were not significantly different (at the p < 0.05 level) from those in the non-diapause eggs, indicating that the proteins expressed during the early stages of embryonic development have many functional similarities between diapause and non-diapause eggs.

KEGG Pathway Analysis
KEGG is a large resource that contains information for various model organisms about Molecular Interactions, Reaction Networks, Cellular Processes and Human Diseases [67]. In the present study, a total of 689 of 754 proteins were queried against the KEGG reference pathway database and 192 non-redundant pathways were indicated (Table S2), most of which were related to Metabolism and Organismal Systems. Other pathways, such as Genetic Information Processing, Environmental Information Processing, Cellular Processes and Human Diseases, were also detected. In total, 42 and 29 KEGG pathways involved only the diapause-unique proteins and the non-diapause-unique proteins, respectively (Tables 3 and 4).
The metabolism pathways were more highly represented in non-diapause eggs than in diapause eggs. Amino acid metabolism, carbohydrate metabolism and biosynthesis of other secondary metabolites were more highly represented in the non-diapause eggs, whereas lipid metabolism pathways such as steroid hormone biosynthesis, linoleic acid metabolism and biosynthesis of unsaturated fatty acids were more highly represented in the diapause eggs. Translation and transcription pathways such as RNA transport, the mRNA surveillance pathway and the spliceosome were detected only in the non-diapause eggs. The sensory system (olfactory transduction, taste transduction and phototransduction), excretory system and digestive system were also detected in the non-diapause eggs, which indicated that embryonic development was in progress.
The ErbB signaling pathway, phosphatidylinositol signaling system, mTOR signaling pathway and VEGF signaling pathway, which belong to the signal transduction pathways, were more active in the diapause eggs. ErbB signaling regulates diverse cellular functions, such as proliferation, migration, differentiation and survival/death, and participates in various developmental processes during both invertebrate and vertebrate early embryogenesis [68,69]. Interestingly, development and immune system pathways were also more highly represented in the diapause eggs; the activation of immune system pathways may reflect self-protection of the diapause eggs during the long diapause stage.

Conclusions
This study provides a catalogue and an initial analysis of the proteomes of the diapause and non-diapause eggs during embryogenesis of the silkworm by shotgun proteomic analysis. Proteins unique to each of the two kinds of eggs and proteins shared by both were identified and analyzed. The relative mRNA levels of four identified proteins in the two kinds of eggs were also compared using qRT-PCR and showed some inconsistencies with protein expression. GO analysis of these proteins also provided a global view of their functions. In addition, KEGG pathway analysis showed that the Metabolism, Translation and Transcription pathways were potentially more highly represented in the non-diapause eggs at this stage. These results will also help further research on finding diapause-associated proteins during egg development of the silkworm. However, shotgun proteomic analysis has some shortcomings; for example, protein assembly is complex and the approach depends heavily on bioinformatic methods. With the development of genomics and bioinformatics, shotgun LC-MS/MS will be a promising strategy in proteomics research.
Table S1 The common-expressed proteins of D and ND with functional annotation.