A Proteomic Analysis of Seed Development in Brassica campestris L.

To gain insight into protein dynamics during seed development, we carried out a proteomic study of developing Brassica campestris L. seeds whose embryos were at different stages of embryogenesis. Seed proteins at 10, 16, 20, 25 and 35 DAP (days after pollination) were separated by two-dimensional gel electrophoresis, and the identities of 209 spots with altered abundance were determined by matrix-assisted laser desorption ionization time-of-flight/time-of-flight mass spectrometry (MALDI-TOF/TOF MS). These proteins were classified into 16 groups according to their functions. The largest group of proteins was related to primary metabolism, indicating the heavy demand for materials during rapid embryo growth. In addition, the high proportion of proteins involved in protein processing and destination indicated the importance of protein renewal during seed development. The remaining proteins participated in oxidation/detoxification, energy, defense, transcription, protein synthesis, transport, cell structure, signal transduction, secondary metabolism, transposition, DNA repair, storage and other processes. Protein abundance profiles of each functional class were generated, and hierarchical cluster analysis established 8 groups of dynamic patterns. Our results reveal novel characteristics of protein dynamics during seed development in Brassica campestris L. and provide valuable information about the complex process of seed development in plants.


Introduction
The plant seed is an important organ that is evolutionarily advantageous for plant survival and contributes greatly to animal and human life [1]. Seed development goes through three overlapping phases (morphogenesis, seed filling and seed desiccation), which involve the coordinated growth of three seed components: the seed coat, the endosperm and the embryo [2]. Seed development involves highly dynamic processes of cell division, differentiation, growth, pattern formation and macromolecule production [3,4], and elucidating the underlying mechanisms will provide insight into the complex system that coordinates plant development and metabolism. In recent years, genetic and molecular analyses have identified critical players in seed development [1]. DNA microarray and RNA-seq techniques also allow large-scale, genome-wide study at the mRNA level [5][6][7]. However, mRNA levels do not always reflect protein abundance [8], and genomic tools cannot provide precise information on protein levels [9], which limits our understanding of the metabolic and molecular networks involved. Proteomics provides a more powerful tool for understanding the complex protein dynamics and the underlying regulatory mechanisms during seed development [10][11][12]. By examining temporal patterns and simultaneous changes in protein accumulation, extensive proteomic studies have profiled protein dynamics during seed development in legumes [13,14], Arabidopsis [15,16], rapeseed [17,18], rice [19], wheat [20,21] and many other species [10]. The most frequently identified proteins are those participating in central metabolism, followed by those related to cellular structure, and many previously unknown proteins have been implicated in embryo development [12]. In addition, proteome studies have revealed some important characteristics of seed proteins. For example, a proteome study of Medicago truncatula revealed a remarkable compartmentalization of enzymes involved in methionine biosynthesis between the seed tissues, which regulates the availability of sulfur-containing amino acids for embryo protein synthesis during seed filling [22]; in tomato seed, the most abundant proteins in both the embryo and endosperm were found to be seed storage proteins, such as legumins, vicilins and albumin [23]. These proteomic applications have greatly expanded our knowledge of seed development.
Plant embryo development, also known as embryogenesis, is an important developmental process in the life cycle of higher plants [24]. It starts with a double fertilization event in which two sperm nuclei fuse with the egg cell and central cell nuclei, respectively. The zygote then undergoes a series of cell divisions and differentiation events, passing consecutively through the globular, heart, torpedo and bent-cotyledon stages to produce the mature embryo [25][26][27]. Embryogenesis therefore covers part of the morphogenesis and seed filling phases of seed development. There are gaps in our understanding of the complete seed development process, because current proteomic studies focus mainly on protein dynamics during seed filling or seed desiccation. A systematic view of seed development that encompasses all embryo development stages is needed to complete our knowledge of seed development. This is especially meaningful for dicot plants, because the relative content of endosperm and embryo in the mature seed varies between species. In dicots such as Arabidopsis thaliana and Brassica napus, the embryo is normally the major part of the mature seed and the endosperm is almost completely absent, whereas in monocots such as wheat, maize and rice, endosperm tissues make up the majority of the seed mass.
To this end, oilseed (Brassica campestris L.) is advantageous because its embryo is relatively large compared with that of the model plant Arabidopsis and its embryo developmental stages are easy to distinguish accurately [28]. B. campestris belongs to the mustard family (Brassicaceae), and like most dicotyledonous plants, its embryo develops through morphologically defined globular, heart, torpedo and bent-cotyledon stages to produce the mature embryo [25,26,29]. Here, using B. campestris seeds with embryos at five sequential stages of embryogenesis, we carried out a proteomic study of protein dynamics aimed at understanding seed development in oilseed.

High-resolution Proteomes of the Developing B. campestris Seeds
To isolate proteins from seeds at different stages, developing B. campestris seeds were harvested at precisely 10, 16, 20, 25 and 35 DAP, when their embryos were at the globular, heart, torpedo, bent-cotyledon and C-shaped mature embryo stages, respectively (Figure 1). Total proteins were then resolved by high-resolution two-dimensional electrophoresis (2-DE) and detected by colloidal Coomassie brilliant blue (CBB) staining. Initial analyses were performed with immobilized pH gradient (IPG) strips ranging from pH 3 to 10. Because the region from pH 4 to 7 was densely populated on the proteome map, further analyses were performed with pH 4 to 7 IPG strips to obtain high-resolution proteome maps. The 2-DE maps showed a highly dynamic proteome during B. campestris seed development (Figure 1). Using ImageMaster 2D Platinum software 6.0 (GE Healthcare), more than 800 CBB R250-stained protein spots were reproducibly detected from at least three independent 2-D gels, suggesting that they were involved in seed development.

Identification of Dynamically Accumulated Seed Proteins of B. campestris
To select proteins that were differentially accumulated over the five developmental stages, the proteome profiles were compared using ImageMaster software, and 260 spots showing at least a two-fold change with statistical significance (P ≤ 0.05) were selected in combination with manual validation and quantification. These spots were excised from the 2-DE gels and identified by MALDI-TOF/TOF MS and MASCOT database searching. The identities of a total of 209 proteins with altered accumulation were established (Table 1 and Figure 2). GO analysis was carried out on the basis of protein function, and these proteins were classified into 16 groups, as shown in Table 1, including primary metabolism, protein processing and destination, oxidation and detoxification, energy, transcription, protein synthesis, cell structure, signal transduction, defense, secondary metabolism, DNA repair and storage, suggesting that these proteins are involved in a wide range of cellular activities during seed development. The proteins related to primary metabolism could be further classified into the TCA cycle, carbohydrate metabolism, fatty acid metabolism, nitrogen metabolism, amino acid metabolism and others (Table 1). Importantly, these spots represented 147 non-redundant proteins. For example, five spots were identified as enolase (127, 139, 248, 452, 132) and eight spots as triosephosphate isomerase (591, 592, 557, 583, 588, 560, 573, 589), indicating that some of the selected spots are isoforms or modified forms of the same protein (Table 1). Calculation of the relative proportions showed that the largest group of proteins participated in primary metabolism (32.1%), highlighting the dynamic metabolic requirements of the growing seed. The second largest group was related to protein processing/destination (23.4%), followed by energy production (8.1%), oxidation and detoxification (6.7%), and disease and defense (5.7%). The other processes in which these proteins were involved were protein synthesis (4.3%), signal transduction (3.8%), secondary metabolism (3.8%), cell structure (1.9%), transcription (1.9%), DNA repair (1.9%), transposon (1.4%), storage (1.4%), transporter (1.4%), unclear classification (1.0%) and unknown (1.0%) (Figure 3).
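The reported proportions can be checked against the total of 209 identified spots. The following is a minimal sketch, assuming that each percentage is a spot count divided by 209 and rounded to one decimal place. The per-class counts are not stated in the text and are back-calculated here only as a consistency check; the function names are illustrative.

```python
# Percentages as reported in the text (Figure 3). The per-class spot counts
# are NOT given in the text; they are back-calculated here as an assumption.
TOTAL_SPOTS = 209  # identified spots with altered abundance

REPORTED_PERCENT = {
    "primary metabolism": 32.1,
    "protein processing/destination": 23.4,
    "energy": 8.1,
    "oxidation/detoxification": 6.7,
    "disease and defense": 5.7,
    "protein synthesis": 4.3,
    "signal transduction": 3.8,
    "secondary metabolism": 3.8,
    "cell structure": 1.9,
    "transcription": 1.9,
    "DNA repair": 1.9,
    "transposon": 1.4,
    "storage": 1.4,
    "transporter": 1.4,
    "unclear classification": 1.0,
    "unknown": 1.0,
}


def infer_counts(percent_by_class, total):
    """Back-calculate integer spot counts from rounded percentages."""
    return {name: round(pct * total / 100) for name, pct in percent_by_class.items()}


def to_percent(counts, total):
    """Recompute one-decimal percentages from integer counts."""
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


inferred_counts = infer_counts(REPORTED_PERCENT, TOTAL_SPOTS)
recomputed = to_percent(inferred_counts, TOTAL_SPOTS)

for name, n in inferred_counts.items():
    print(f"{name:32s} {n:3d} spots  {recomputed[name]:4.1f}%")
print("total spots:", sum(inferred_counts.values()))
```

Under this assumption the inferred counts sum to 209 and reproduce each reported percentage, for example four spots for cell structure (1.9%) and three for storage (1.4%).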

Protein Abundance Profiles of Each Functional Class
To characterize the global abundance kinetics of proteins involved in different processes, composite expression profiles were generated by summing the protein abundance, expressed as relative volume [17,30,31], of each functional class over the five developmental stages. As shown in Figure 4, the relative abundance of metabolic proteins fluctuated over the experimental period, reflecting changes in metabolic activity during embryo maturation. The abundance of proteins responsible for protein synthesis, protein destination and secondary metabolism decreased during early seed growth, then increased to a peak at 25 DAP before declining again. Disease- and defense-related proteins were highly abundant at the late stage of seed development. Proteins involved in energy production, oxidation and detoxification, signal transduction, transposition, storage and transport shared very similar patterns, increasing to a peak at 25 DAP, whereas proteins related to cell structure, transcription and DNA repair continued to accumulate and reached their highest abundance at 20 DAP (Figure 4). In general, most of the protein groups showed relatively high abundance between 20 and 25 DAP, reflecting extensive cellular activity during seed development.
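The construction of a composite profile, summing the relative volumes of all spots in a functional class at each stage, can be sketched as follows. This is an illustration only: the spot labels and relative volumes are hypothetical values, not data from Table 1, and the function names are assumptions.

```python
from collections import defaultdict

STAGES_DAP = (10, 16, 20, 25, 35)  # sampling stages given in the text


def composite_profiles(spots):
    """Sum relative spot volumes per functional class at each stage."""
    totals = defaultdict(lambda: [0.0] * len(STAGES_DAP))
    for spot in spots:
        for i, volume in enumerate(spot["rel_volume"]):
            totals[spot["functional_class"]][i] += volume
    return dict(totals)


def peak_stage(profile):
    """Return the DAP at which a composite profile is highest."""
    best_index = max(range(len(profile)), key=profile.__getitem__)
    return STAGES_DAP[best_index]


# Hypothetical spots with made-up relative volumes (one value per stage).
example_spots = [
    {"spot": "A", "functional_class": "energy",
     "rel_volume": [0.10, 0.15, 0.20, 0.30, 0.12]},
    {"spot": "B", "functional_class": "energy",
     "rel_volume": [0.05, 0.05, 0.10, 0.25, 0.08]},
    {"spot": "C", "functional_class": "storage",
     "rel_volume": [0.00, 0.02, 0.10, 0.40, 0.90]},
]

profiles = composite_profiles(example_spots)
for functional_class, profile in profiles.items():
    print(functional_class, [round(v, 2) for v in profile],
          "peak at", peak_stage(profile), "DAP")
```

Because the profile is a plain sum, classes with many or very intense spots dominate the scale, so the shapes of the profiles are more comparable across classes than their absolute heights.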

Hierarchical Clustering Analyses of Seed Proteins
To further characterize the identified proteins, their abundance profiles were analyzed by hierarchical clustering. This generated a total of eight cluster groups (c1, c2, c3, c4, c5, c6, c7 and c8), each containing proteins with similar dynamics (Table 2 and Figure 5), suggesting complicated regulatory patterns among the identified proteins during seed development. The largest group (c4) contained 64 proteins, whose abundance increased from the early stage of seed development, peaked at 25 DAP and decreased at the late stage (35 DAP). The second largest group (c7) included 46 proteins, most of which were not detected until 16 DAP and remained highly accumulated even at 35 DAP, unlike those in c1. The smallest cluster, c6, had only four proteins, which displayed U-shaped abundance profiles (Figure 5). Clusters c1 and c3 consisted of 28 and 30 proteins, which had their highest abundance at 16 DAP and 20 DAP, respectively. Seventeen proteins were grouped into c8, and their abundance declined steadily during seed growth. Notably, most of the proteins involved in primary metabolism, energy production, protein destination and oxidation were included in the c4 group (Figure 5 and Table 2), suggesting that these cellular activities are essential for early-stage seed development.

Proteins Associated with Metabolism and Protein Renewal are Prevalent in the Developing Seed
A large number of proteomic studies have been carried out in different species to understand seed development [11,32]. Most of these studies, in both embryo-dominant and endosperm-dominant seeds, identify proteins involved in metabolism as the largest group, which is consistent with the rapid and complicated metabolic changes during seed development [12]. Our analysis revealed a similar pattern: 32.1% of the identified proteins participated in primary metabolism (Figure 3). For example, seven enzymes of the glycolytic pathway were identified, represented by 15 protein spots. In addition, five enzymes of the TCA cycle and three enzymes of lipid biosynthesis were identified (Table 1). A clear pattern shared by these enzymes is that their accumulation kept increasing over the five stages of seed development (Figure 5), indicating that these metabolic pathways were increasingly required for seed development. Many proteins with altered abundance in our analysis were related to other metabolic events, such as amino acid metabolism. In addition, proteins associated with energy and metabolism, defense, and oxidation/detoxification were prevalent in the developing seed (Table 1). Interestingly, a transcript study of Arabidopsis embryo development indicated that the transition from the globular to the torpedo stage is associated with up-regulation of genes involved in energetics and metabolism [33], which is consistent with our proteomic study. The abundance of these proteins probably suggests that their activities define the basal requirements of seed development. Our data revealed that 23.4% of the total proteins (25.9% of the non-redundant proteins) were involved in protein processing and destination (Figure 3). They included molecular chaperones that assist the folding of newly synthesized proteins (spots 42, 53 and 73), isomerases that change protein conformation (spots 353, 328, 406 and 675), the ubiquitin-proteasome group including 20S and 26S proteasome subunits (spots 91, 464, 592 and 569), and proteases (spots 328, 576 and 406), suggesting substantial protein turnover and rearrangement during seed development. The ubiquitination-mediated degradation pathway plays an important role in various aspects of plant growth and development [34]. Polyubiquitination of substrates is achieved through the action of three enzymes: E1 (ubiquitin-activating enzyme), E2 (ubiquitin-conjugating enzyme) and E3 (ubiquitin ligase), the last of which determines substrate specificity. The modified protein is then processed by the 26S proteasome, which consists of a core 20S protease capped at each end by a regulatory 19S complex [35]. In our analysis, four isoforms of E1 and eight proteasome components were observed (Table 1). Folding of nascent polypeptides into functional proteins is controlled by a number of molecular chaperones and protein-folding catalysts. Our analysis revealed six isoforms of protein disulfide isomerase, an endoplasmic reticulum-located protein that catalyzes the formation, isomerization and reduction/oxidation of disulfide bonds [36]. Seven chaperonins or chaperones were also observed, including the plant homolog of the immunoglobulin heavy-chain binding protein (BiP), an endoplasmic reticulum-localized member of the heat shock 70 family. BiP has been proposed to play a role in protein body assembly within the endoplasmic reticulum [37,38].
These proteins displayed different accumulation patterns during seed development. For example, spot 73 was identified as a protein disulfide isomerase that continued to accumulate and reached its highest level at 20 DAP. Consistent with this, our gene expression analysis revealed that disulfide isomerase transcripts can be detected at the late stage of embryogenesis [39]. Plant cysteine proteases are important for organ senescence, plant defense and nutrient mobilization during seed germination [40], and previous studies have shown that cysteine proteinases are up-regulated in various senescing plants, such as Arabidopsis, B. napus and Nicotiana tabacum [41]. In this study, we identified spot 542 as a senescence-associated cysteine protease and spot 576 as another cysteine proteinase whose abundance increased across all five stages (Figure 5), suggesting that cysteine proteinases also play an important role in seed maturation and senescence. The altered accumulation of these proteins indicates that active protein production and elimination occur during seed development, which might serve as a monitoring mechanism over the intricate processes of metabolism and energy production. It is also likely that these proteins are used during rapid cell division and cell structure construction. Nevertheless, the preponderance of these proteins seems to be particular to our study, because few previous reports have identified so many proteins with similar functions [13][42][43][44], which may have led to the importance of protein self-renewal being underestimated. Protein renewal could therefore be an essential regulatory mechanism for seed development.

Carbon Assimilation During Seed Development
Developing oilseeds take up sugars and amino acids from the surrounding endosperm liquid and synthesize large quantities of triacylglycerols and storage proteins. Previous work has characterized carbon assimilation during seed filling in Brassica napus and castor, both of which are oil plants [17,45]. It is therefore of interest to examine this important metabolic pathway during seed development. It has been demonstrated that glycolysis supplies most of the carbon for fatty acid synthesis (FAS) in developing rapeseed embryos in culture [46], suggesting that glycolysis is essential for carbon assimilation in developing seeds. However, relatively little is known about its regulation and control, and because parallel pathways operate in both the cytosol and plastids, glycolysis is more complex in plants [47,48].

Possible Seed Development-specific Proteins Indicated by This Study
In our analysis, four spots were identified as proteins related to cell structure, accounting for only 1.9% of the total proteins (Figure 3). This is clearly lower than in previous studies of seed development in Arabidopsis [16] or B. napus [18], in which such proteins often account for about 20%. The protein level of spot 318, RGP4, increased from the beginning of seed development until 16 DAP and could hardly be detected at the late stage (Figure 5), suggesting that it might be a novel protein associated with seed development. This is consistent with a recent report that expression of RGP4 is restricted to the seed and is important for development [49]. Proteins such as Actin1 (spots 241 and 252) have been found to be highly dynamic at nearly every stage of seed development, highlighting their importance. Although these studies used different materials at different developmental stages, the relatively low proportion of proteins contributing to cell structure in this report may suggest another novel characteristic specific to B. campestris seed development.
Our analysis identified only two storage proteins, napin (spot 558) and cruciferin (spots 461 and 540), which are the two major storage proteins in rapeseed (B. napus) and constitute 20% and 60% of the total protein in mature seeds, respectively [50]. Their biosynthesis has been reported to begin early in the expansion phase of embryo development [51]. Consistent with their roles as "molecular markers" of late embryogenesis, both proteins accumulated continually over the five embryo development stages and reached their highest levels at the last stage (Figure 5). In our parallel gene expression analysis, the napin gene was clearly up-regulated during embryogenesis and highly expressed only at the late embryo stage, and it was not detected at the globular embryo stage [39]. Interestingly, few previous embryo-related proteomic studies have reported napin, whereas cruciferin is frequently detected. The abundance of a novel protein, AKT2/3 (spot 476), increased in the early stage of seed development and began to decrease after 20 DAP (Figure 5). In Arabidopsis, AKT2/3 encodes a photosynthate- and light-dependent inward-rectifying potassium channel with unique gating properties that are regulated by phosphorylation [52,53]. The identification of AKT2/3 therefore suggests a novel role for it in seed development. Another interesting finding concerns an unclassified protein, TSJT1 (spot 795), which has been described as stem-specific and found in gene chip data [54]. During seed development, it was very abundant at 10 DAP, but its level decreased significantly after 20 DAP (Table 2 and Figure 5). Our analysis therefore indicates that it may be important for early seed development, which remains to be confirmed by further experiments.
An investigation of seed development should significantly enrich our knowledge of the molecular and physiological events in the whole seed growth process. In this study, we explored protein dynamics over five stages of B. campestris seed development using a proteomic approach. A total of 209 proteins were identified by mass spectrometry as differentially expressed during seed development, and they could be classified into 16 functional groups. Proteins participating in metabolism, energy production, oxidation/detoxification and stress/defense were highly dynamic in abundance. However, functional assignment of these altered proteins uncovered an unexpected abundance of proteins related to protein processing and destination, highlighting the importance of protein renewal in seed development, and the proportion of proteins associated with cell structure was rather low compared with previous proteomic analyses of seed development. Our study provides important information for better understanding seed development in oil plants.

Plant Materials and Sample Collection
Brassica campestris L. (cv. Jianghuangzhong) plants were grown in soil-based compost under natural conditions (Wuhan, China). Before flowering, nylon netting was used to prevent pollen contamination. To sample seeds at different developmental stages, flowers were tagged immediately after the buds opened, and seed development was monitored by checking the embryos under a dissecting microscope. Developing seeds were harvested at precisely 10, 16, 20, 25 and 35 days after pollination (DAP), when their embryos were at the globular, heart, torpedo, bent-cotyledon and C-shaped mature embryo stages, respectively. Five grams of seeds were sampled at each stage, frozen in liquid nitrogen and stored at −80 °C until use.

Protein Extraction
One gram of seed sample was ground to a fine powder in liquid nitrogen with a mortar and pestle and immediately homogenized in ice-cold extraction buffer (8 M urea, 2 M thiourea, 4% w/v CHAPS, 40 mM Tris-HCl, pH 8.0) containing protease inhibitors (1 mM PMSF, 10 mM DTT). The supernatant was collected by centrifugation at 20,000 × g for 30 min at 4 °C. The pellet was then resuspended in ice-cold lysis buffer and centrifuged as described above. After the oil layer above the supernatant was removed, proteins in the supernatant were precipitated with five volumes of ice-cold trichloroacetic acid-acetone (12.5% trichloroacetic acid in 100% acetone) at −20 °C for 2 h and collected by centrifugation at 20,000 × g for 30 min. The protein pellet was resuspended in ice-cold 80% acetone containing 20 mM DTT and centrifuged as above; this wash was performed twice before the pellet was dried under vacuum. The dried proteins were dissolved in lysis buffer (8 M urea, 4% CHAPS, 10 mM DTT and 2% Pharmalyte 4-7) at room temperature, vortexed vigorously and centrifuged. The final supernatants were transferred to fresh tubes. Protein concentration was quantified by the Bradford method [55] on a UV-2000 UV-visible spectrophotometer (UNICO), with bovine serum albumin (BSA) as the standard. The final protein samples were stored at −70 °C until two-dimensional gel electrophoresis (2-DE).
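Quantification against a BSA standard typically means fitting a standard curve and reading the sample concentration from it. The sketch below assumes a linear fit of absorbance against BSA concentration. The standard concentrations, absorbance readings and dilution factor are hypothetical values chosen for illustration, not values from this study, and the function names are illustrative.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_from_absorbance(absorbance, slope, intercept, dilution=1.0):
    """Read a concentration off the standard curve and undo the dilution."""
    return (absorbance - intercept) / slope * dilution


# Hypothetical BSA standards (mg/ml) and their absorbance readings.
bsa_mg_per_ml = [0.0, 0.25, 0.5, 0.75, 1.0]
absorbance = [0.05, 0.30, 0.55, 0.80, 1.05]

slope, intercept = fit_line(bsa_mg_per_ml, absorbance)

# Hypothetical sample: diluted 10-fold before the assay, absorbance 0.45.
sample_mg_per_ml = concentration_from_absorbance(0.45, slope, intercept, dilution=10.0)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, sample={sample_mg_per_ml:.2f} mg/ml")
```

A straight-line fit is only a reasonable approximation within the linear range of the assay, so samples reading above the highest standard would normally be diluted and measured again.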

Imaging and Statistical Analysis
CBB-stained 2-DE gels were scanned at a resolution of 300 dpi and 16-bit pixel depth and analyzed with ImageMaster 2-D Platinum 6.0 software (GE Healthcare) according to the manufacturer's protocols. After automatic spot detection, manual spot editing was carried out. Spots matched in at least two of the three gels for each protein extraction were considered reproducible and included in the synthetic 2-DE gel images. Spot matching was further confirmed by visual inspection. To determine differences in protein abundance across 2-DE gels, the normalized (relative) protein spot volume (area multiplied by stain intensity) calculated by the software was used as the parameter, and protein spots with more than a two-fold change (P < 0.05) in statistical analysis were considered differentially accumulated. Significantly dynamic spots (P < 0.05) were then re-examined by eye so that only the spots varying most clearly across stages were included for further study.
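The selection criteria described above (matched in at least two of three gels, more than a two-fold change, P below 0.05) can be sketched as a simple filter. The text does not name the statistical test or state how the fold change was computed, so this sketch makes two assumptions: the P value is supplied from outside, and the fold change is the ratio of the highest to the lowest stage mean. All function names and numbers are hypothetical.

```python
from statistics import mean


def is_reproducible(replicate_volumes, min_gels=2):
    """True if the spot was matched in at least `min_gels` gels (None = unmatched)."""
    return sum(v is not None for v in replicate_volumes) >= min_gels


def stage_mean(replicate_volumes):
    """Mean relative volume over the gels in which the spot was matched."""
    matched = [v for v in replicate_volumes if v is not None]
    return mean(matched) if matched else 0.0


def max_fold_change(stage_means):
    """Ratio of highest to lowest stage mean; infinite if absent at some stage."""
    highest, lowest = max(stage_means), min(stage_means)
    if lowest == 0:
        return float("inf") if highest > 0 else 1.0
    return highest / lowest


def is_differential(volumes_by_stage, p_value, fold=2.0, alpha=0.05):
    """Apply the reproducibility, fold-change and significance criteria."""
    # Assumption: a stage without a reproducible match counts as absent (0.0).
    means = [stage_mean(r) if is_reproducible(r) else 0.0 for r in volumes_by_stage]
    if not any(m > 0 for m in means):
        return False
    return max_fold_change(means) > fold and p_value < alpha


# Hypothetical spot: three gels at each of the five stages, made-up volumes.
rising_spot = [
    [0.10, 0.12, None],
    [0.15, 0.14, 0.16],
    [0.20, 0.22, 0.21],
    [0.30, 0.31, 0.29],
    [0.35, 0.36, 0.34],
]
flat_spot = [[0.20, 0.21, 0.19] for _ in range(5)]

print(is_differential(rising_spot, p_value=0.01))  # passes all three criteria
print(is_differential(flat_spot, p_value=0.01))    # fails the fold-change criterion
```

In the workflow described in the text, spots passing such a filter were additionally re-examined by eye, a step this sketch does not attempt to model.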

Protein Identification by MALDI-TOF/TOF MS
Protein spots that were dynamically accumulated among the five developmental stages were manually excised from the 2-D gels and digested with trypsin (Promega). Each dried peptide mixture was dissolved in a volume of 50% ACN/0.1% TFA chosen according to its relative abundance in the gel. Salts and detergents were then removed using C18 ZipTips (Millipore). Bound peptides were eluted from the ZipTip with approximately 3 μl of 60% methanol/3% formic acid. Then 0.5 μl of sample solution or calibration standard was mixed with an equal volume of CHCA (α-cyano-4-hydroxycinnamic acid) matrix (10 mg/ml CHCA in 50% ACN/0.1% TFA) and spotted onto a freshly cleaned target plate. After air drying, the crystallized spots were analyzed by MALDI-TOF/TOF (4800 Plus Analyzer, Applied Biosystems). After calibration, parent mass peaks were scanned with 1000 laser shots over a mass range of 800 to 4000 Da. The minimum signal-to-noise ratio was 10. The five most intense parent mass peaks were selected for tandem TOF/TOF analysis, each with 1500 laser shots. Combined MS and MS/MS spectra were searched against the NCBInr protein database, taxonomy Viridiplantae (green plants), using GPS Explorer™ Workstation (Applied Biosystems). The search parameters were set as follows: carbamidomethylation (C) and oxidation (M) as variable modifications, up to one missed cleavage, precursor ion tolerance of 200 ppm, fragment ion tolerance of 0.3 Da and peptide charge of 1+. Protein hits with a protein score C.I.% (confident identification percentage, based on combined MS and MS/MS spectra) over 95 were retained. Most identified proteins also had a total ion score C.I.% (based on MS/MS spectra) over 95.
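The two tolerances in the search parameters are expressed in different units: the precursor tolerance is relative (ppm of the mass) and the fragment tolerance is absolute (Da). The sketch below shows the generic arithmetic under the standard definition of ppm. It is not a description of how the search software applies these settings internally, and the function names and example masses are hypothetical.

```python
def ppm_window(mz, tol_ppm):
    """Lower and upper m/z bounds for a relative tolerance given in ppm."""
    delta = mz * tol_ppm / 1e6
    return mz - delta, mz + delta


def precursor_matches(observed_mz, theoretical_mz, tol_ppm=200.0):
    """Relative (ppm) tolerance check, as used for precursor ions."""
    return abs(observed_mz - theoretical_mz) <= theoretical_mz * tol_ppm / 1e6


def fragment_matches(observed_mz, theoretical_mz, tol_da=0.3):
    """Absolute (Da) tolerance check, as used for fragment ions."""
    return abs(observed_mz - theoretical_mz) <= tol_da


def in_scan_range(mz, low=800.0, high=4000.0):
    """True if a parent mass lies within the scanned mass range."""
    return low <= mz <= high


# Hypothetical singly charged peptide at m/z 1500.
low, high = ppm_window(1500.0, 200.0)
print(f"200 ppm at m/z 1500 spans {low:.2f} to {high:.2f}")
print(precursor_matches(1500.25, 1500.0))  # within 200 ppm
print(precursor_matches(1500.35, 1500.0))  # outside 200 ppm
```

Because the ppm window scales with mass, 200 ppm corresponds to about 0.16 Da at the lower end of the scanned range (800 Da) and 0.8 Da at the upper end (4000 Da).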

Hierarchical Cluster Analysis
Hierarchical clustering was performed with Gene Cluster 3.0/TreeView software on the mean relative volume of each protein spot. Cluster boundaries were set by visual inspection of the relative similarities and differences between clusters, and the number of clusters was chosen so that the dynamics of the functional categories differed most clearly between clusters.
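The general idea of grouping spots by the shape of their abundance profiles can be sketched in a few lines. The similarity metric and linkage method used in the software are not stated in the text, so this sketch assumes Pearson correlation distance with average linkage, and it takes the number of clusters as an input where the text describes choosing it by inspection. The profiles are hypothetical.

```python
from statistics import mean


def pearson_distance(a, b):
    """1 - Pearson correlation, so profiles with the same shape are close."""
    mean_a, mean_b = mean(a), mean(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = (sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)) ** 0.5
    return 1.0 if den == 0 else 1.0 - num / den


def average_linkage(cluster_a, cluster_b, profiles):
    """Mean pairwise distance between the members of two clusters."""
    return mean(pearson_distance(profiles[i], profiles[j])
                for i in cluster_a for j in cluster_b)


def cluster_profiles(profiles, n_clusters):
    """Agglomerative clustering: merge the closest pair until n_clusters remain."""
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_linkage(clusters[a], clusters[b], profiles)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical mean relative volumes at 10, 16, 20, 25 and 35 DAP.
example_profiles = [
    [0.10, 0.20, 0.30, 0.50, 0.20],  # peaks at 25 DAP
    [0.20, 0.40, 0.60, 1.00, 0.40],  # same shape, twice the abundance
    [0.90, 0.70, 0.50, 0.30, 0.10],  # declines steadily
    [0.80, 0.65, 0.45, 0.30, 0.05],  # declines steadily
]

print(cluster_profiles(example_profiles, n_clusters=2))
```

With a correlation-based distance, the first two hypothetical profiles fall into the same cluster even though one is twice as abundant as the other, which is the behavior wanted when clusters are meant to capture dynamic patterns rather than absolute levels.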