Identification of Proteins Enriched in Rice Egg or Sperm Cells by Single-Cell Proteomics

In angiosperms, female gamete differentiation, fertilization, and subsequent zygotic development occur in embryo sacs deeply embedded in the ovaries. Despite the importance of these processes in plant reproduction and development, it remains unclear how the egg cell becomes specialized, fuses with the sperm cell, and converts into an active zygote for early embryogenesis. This lack of knowledge is partly attributable to the difficulty of analyzing angiosperm gametes directly. In the present study, proteins from egg and sperm cells obtained from rice flowers were separated by one-dimensional polyacrylamide gel electrophoresis and globally identified by highly sensitive liquid chromatography coupled with tandem mass spectrometry. Proteome analyses were also conducted for seedlings, callus, and pollen grains to compare their protein expression profiles with those of gametes. The proteomics data have been deposited in ProteomeXchange with the identifier PXD000265. A total of 2,138 and 2,179 expressed proteins were detected in egg and sperm cells, respectively, and 102 and 77 proteins were identified as preferentially expressed in egg and sperm cells, respectively. Moreover, several rice or Arabidopsis lines with mutations in genes encoding the putative gamete-enriched proteins showed clear phenotypic defects in seed set or seed development. These results suggest that the proteomic data presented in this study provide foundational information for understanding the mechanisms of reproduction and early development in angiosperms.


Introduction
In angiosperms, the female gametophyte develops from a functional megaspore via three or more rounds of mitosis without cytokinesis. Subsequently, plasma membranes and cell walls are formed between the nuclei, resulting in a cellularized female gametophyte, generally known as an embryo sac. The embryo sac most commonly consists of one egg cell, one central cell, two synergid cells, and three antipodal cells [1,2] and plays critical roles in pollen tube guidance, double fertilization, and seed development [3,4]. In the anthers, male microspores undergo an initial asymmetric mitotic division, and the smaller generative cell, which establishes the male germ line, migrates into the larger vegetative cell. The generative cell within the vegetative cell divides into two sperm cells before or after germination of the pollen tube.
Upon double fertilization, one sperm cell from the pollen tube fuses with the egg cell, and the resultant zygote develops into an embryo transmitting genetic material from the parents to the next generation. The central cell fuses with the second sperm cell to form a triploid primary endosperm cell, which develops into the endosperm nourishing the developing embryo and later seedling [5][6][7]. Within the embryo sac, the haploid egg cell is specially differentiated for fertilization and subsequent embryogenesis. However, it remains unclear how the egg cell specializes, fuses with the sperm cell, and converts into an active zygote for early embryogenesis, despite its importance in plant reproduction and development. This lack of knowledge is partly attributable to the fact that gametogenesis, fertilization, and embryogenesis all occur in the embryo sac, which is deeply embedded within the ovary, making direct observation and characterization of the female and male gametes in the embryo sac difficult.
Recently, single-cell proteomic approaches have been widely employed to dissect the functions of specific cells, because cellular-level information is diluted when organs or tissues, which comprise various differentiated cells, are used as starting materials [8]. For example, more than 1,000 unique proteins have been identified in pollen grains [9], guard cells [10][11][12], trichomes [13,14], and root hairs [15,16]. However, such global proteome analyses have not been conducted for plant egg and sperm cells, presumably because of the difficulty in obtaining sufficient numbers of highly pure, homogeneous cells, especially egg cells. Procedures for isolating viable gametes have been reported for a wide range of plant species, including maize, wheat, tobacco, rape, rice, barley, Plumbago zeylanica, and Alstroemeria [17][18][19]. We previously conducted proteomic analyses using 75-290 maize and rice egg cells, although only six and four major protein components could be identified, respectively [20,21]. However, in those studies, trace amounts of protein from just 10-20 egg cells could be detected by minimizing the size of the gels. Moreover, state-of-the-art proteomics technologies enable high-throughput and high-resolution analyses using such limited numbers of cells.
In the present study, large numbers of rice gametes were isolated from flowers, and proteins expressed in egg and sperm cells were globally detected by highly sensitive liquid chromatography coupled with tandem mass spectrometry (LC-MS/MS). Proteins preferentially expressed in gametes were then identified by comparing protein expression profiles between gametes and somatic cells or pollen grains. We hypothesized that the gamete-enriched proteins function in reproductive and/or developmental processes such as gamete differentiation, gamete fusion, and early zygotic development, and that functional defects in these proteins affect such processes. Therefore, we checked the seed-set fertility of rice plants carrying transposon insertions in several genes encoding putative gamete-enriched proteins. In addition, we obtained Arabidopsis plants carrying T-DNA insertions in putative orthologs of the rice genes encoding gamete-enriched proteins and observed seed development in these mutants. Several of these rice and Arabidopsis mutants showed clear fertility defects, suggesting that the present proteomic results for rice gametes provide useful basic information for understanding reproductive and/or developmental processes in angiosperms.

Plant Materials
Oryza sativa L. cv. Nipponbare was grown in environmental chambers (K30-7248, Koito Industries, Yokohama, Japan) at 26 °C in a 13/11 h light/dark cycle, and gametes were isolated from flowers. Tos17 insertional rice plants were obtained from the Rice Genome Resource Center, National Institute of Agrobiological Sciences (NIAS, Tsukuba, Japan), and grown in an experimental field at our university from mid-May to October to check fertility. Arabidopsis thaliana ecotype Columbia and T-DNA insertion lines obtained from the Arabidopsis Biological Resource Center (ABRC, Columbus, OH, USA) were grown in an air-conditioned room at 23 °C in a 16/8 h light/dark cycle.

Isolation of Gametes for Proteomic Analyses
Rice egg cells were isolated according to Uchiumi et al. [22], except that the mannitol solution was adjusted to 370 mOsmol/kg H2O instead of 0.3 M mannitol (Fig. S1). After the cells were washed three times by transfer into fresh droplets of mannitol solution on coverslips, 50 to 70 isolated egg cells were transferred into a 1 µL droplet of SDS-sample buffer (2% SDS, 25 mM Tris-Cl pH 6.8, 30% glycerol, 5% 2-mercaptoethanol), treated at 98 °C for 3 min, and then stored at -80 °C until use.
Sperm cells were isolated according to Zhang et al. [23] and Gou et al. [24] with modifications. Approximately 120 anthers harvested from rice flowers were collected in plastic dishes (ø 3.5 cm) filled with 3 mL of 370 mOsmol/kg H2O mannitol. After the anthers were washed by gentle shaking, they were transferred to four plastic dishes (ø 3.5 cm) filled with 3 mL of 15% sucrose, and the tissues were broken with forceps to free the pollen grains. After gentle shaking for 30 min, the sucrose solution, in which pollen grains had released their sperm cells, was filtered twice, first through 20-µm and then through 10-µm nylon bolting cloth. To the filtrate, an equal volume of 15% sucrose containing 60% Percoll (GE Healthcare UK Ltd., UK) was added, and then 4 mL of the mixture was transferred into a 13PA centrifugation tube (Hitachi, Japan). Over the mixture, 20% and 5% Percoll in 15% sucrose were layered to form a discontinuous Percoll gradient in the tube.
After centrifugation at 3,000 × g for 30 min at 4 °C, the interface between the 5% and 20% Percoll layers was collected, and an equal volume of 15% sucrose was added to the collected fraction. The fraction, containing sperm cells, was then centrifuged at 5,000 × g for 4 min at 4 °C. The bottom 15 µL of the tube, containing concentrated sperm cells, was used as the isolated sperm cell fraction after the number of cells in the fraction was counted. An equal volume of SDS-sample buffer was added to the fraction, and the mixture was treated at 98 °C for 3 min and then stored at -80 °C until use.

Preparation of Lysates from Rice Callus, Seedlings, and Pollen Grains
Pollen grains released from 10-20 anthers were homogenized with 0.5 mL of SDS-sample buffer using a mortar and pestle. Rice seeds were cultured on N6D medium containing 2,4-D for 7 days at 30 °C under continuous light according to Toki et al. [25], and the calli derived from the scutellum were harvested. To obtain seedlings, rice seeds were sown in water and grown at 26 °C in darkness for 4 days, and the germinated seeds were then grown at 26 °C in a 13/11 h light/dark cycle for a further 7 days. Seedlings with shoots 2.5-3 cm long were harvested. Callus (0.3 g) or five seedlings were homogenized with 0.4 mL of SDS-sample buffer using a mortar and pestle. Each homogenate was treated at 98 °C for 5 min and then centrifuged at 10,000 × g for 5 min at room temperature. The supernatants were used as lysates from pollen grains, callus, and seedlings, respectively, and stored at -80 °C until use. Lysate protein concentrations were measured using the Pierce 660 nm protein assay kit (Thermo Scientific, MA, USA) with bovine plasma gamma globulin (Bio-Rad, CA, USA) as the standard.

SDS-polyacrylamide Gel Electrophoresis
SDS-polyacrylamide gel electrophoresis (SDS-PAGE) gels (12.5%) were prepared according to Laemmli [26] in a small mold (50 × 60 × 1 mm; Atto, Tokyo, Japan), and lysates of egg cells, sperm cells, seedlings, callus, and pollen grains in SDS-sample buffer were separated on them. Proteins in the gel were detected by conventional silver staining [27]. When gels were used for subsequent LC-MS/MS analyses, proteins in the gel were visualized by modified silver staining according to Taoka et al. [28].
Table 2. Proteins preferentially expressed in egg or sperm cells with >5 identified spectra.

Identification of Proteins by Tandem Mass Spectrometry
SDS-PAGE gels were cut into 15 pieces. Proteins in each piece were in-gel-digested with trypsin [29] and identified by liquid chromatography coupled with tandem mass spectrometry (LC-MS/MS) using a direct nanoflow LC-MS system equipped with an Orbitrap XL (Thermo Scientific) mass spectrometer as described elsewhere [30]. A dataset of protein sequences obtained from the Rice Annotation Project Database (Tsukuba, Japan; http://rapdb.dna.affrc.go.jp/download/irgsp1.html) was searched using Mascot software (ver. 2.2.1, Matrix Science, MA, USA) with the following parameters. The fixed modification was propionamide (Cys), and the variable modifications were pyro-Glu, acetylation (protein N-terminus), and oxidation (Met). The maximum number of missed cleavages was set at 3, with a peptide mass tolerance of ±15 ppm. Peptide charge states from +2 to +4 and an MS/MS tolerance of ±0.8 Da were allowed. The criteria for peptide identification were based on the vendor's definitions (expectation value < 0.05, Matrix Science), and a protein was designated as "identified" if at least two of its peptides were identified. The mass spectrometry proteomics data have been deposited in the ProteomeXchange Consortium (http://proteomecentral.proteomexchange.org) via the PRIDE partner repository [31] with the dataset identifier PXD000265 and DOI 10.6019/PXD000265.
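The search tolerances and identification rule described above can be illustrated with a minimal Python sketch. It is not the Mascot implementation; the function names and the representation of a protein as a list of per-peptide expectation values (one value per distinct peptide) are assumptions made for illustration only.

```python
def ppm_error(observed_mass: float, theoretical_mass: float) -> float:
    """Relative mass error in parts per million (ppm)."""
    return (observed_mass - theoretical_mass) / theoretical_mass * 1e6


def within_peptide_tolerance(observed_mass: float,
                             theoretical_mass: float,
                             tol_ppm: float = 15.0) -> bool:
    """True if the peptide mass error lies within +/- tol_ppm (15 ppm in the text)."""
    return abs(ppm_error(observed_mass, theoretical_mass)) <= tol_ppm


def protein_identified(peptide_expect_values,
                       max_expect: float = 0.05,
                       min_peptides: int = 2) -> bool:
    """Hypothetical helper: a protein counts as 'identified' when at least
    min_peptides distinct peptides have an expectation value below max_expect.
    Assumes each list entry corresponds to a different peptide."""
    confident = [e for e in peptide_expect_values if e < max_expect]
    return len(confident) >= min_peptides
```

For example, a peptide observed at 1000.010 Da against a theoretical mass of 1000.000 Da has an error of about 10 ppm and would fall inside the ±15 ppm window, whereas one observed at 1000.020 Da (about 20 ppm) would not.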

cDNA Synthesis and Quantitative PCR
Rice egg cells were isolated as above, and sperm cells were manually isolated according to Uchiumi et al. [22]. Isolated egg or sperm cells were washed three times by transfer into fresh droplets of mannitol solution on coverslips. The washed cells were submerged in 5 µl of the extraction buffer supplied in a PicoPure RNA Isolation Kit (Arcturus, CA, USA) and stored at -80 °C until use. Total RNA was prepared from 15 egg cells or 150 sperm cells using the PicoPure RNA Isolation Kit according to the manufacturer's instructions. Roots and shoots were harvested from 7-day-old rice seedlings, and total RNA was isolated from these tissues using an RNeasy Plant Mini Kit (Qiagen, Hilden, Germany). cDNAs were synthesized from total RNA of egg cells, sperm cells, roots, and shoots using the High Capacity RNA-to-cDNA Kit (Life Technologies, CA, USA) according to the manufacturer's instructions. For quantitative PCR analysis, 0.5 µl of first-strand cDNA was used with LightCycler 480 SYBR Green I Master (Roche Applied Science, Penzberg, Germany) according to the manufacturer's protocol. PCR cycling was conducted as follows: 94 °C for 10 s, 55 °C for 10 s, and 72 °C for 10 s. Relative quantification was calculated by the delta-delta Ct method with ubiquitin (Os02g0161900) as a reference. Primer sequences used for PCR analyses are listed in Table S1.
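The delta-delta Ct calculation mentioned above can be sketched as follows. This is a generic illustration rather than the exact procedure used in the study: the choice of calibrator sample, the assumed amplification efficiency of 2 (perfect doubling per cycle), and the function name are assumptions.

```python
def relative_expression(ct_target: float,
                        ct_reference: float,
                        ct_target_calibrator: float,
                        ct_reference_calibrator: float,
                        efficiency: float = 2.0) -> float:
    """Relative expression by the delta-delta Ct method.

    ct_target / ct_reference: Ct of the gene of interest and of the reference
    gene (ubiquitin in the text) in the sample of interest.
    *_calibrator: the same two Ct values in a calibrator sample
    (hypothetical here, e.g. a somatic tissue).
    efficiency: assumed amplification factor per cycle (2.0 = ideal).
    """
    delta_ct_sample = ct_target - ct_reference
    delta_ct_calibrator = ct_target_calibrator - ct_reference_calibrator
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return efficiency ** (-delta_delta_ct)
```

With illustrative Ct values of 22 (target) and 18 (reference) in the sample and 27 and 18 in the calibrator, the delta-delta Ct is -5, which corresponds to a 32-fold higher relative expression in the sample.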

Fertility of Tos17 Insertional Rice Mutants and T-DNA Insertional Arabidopsis Mutants
The Tos17 insertion at each gene locus was confirmed by genomic PCR according to the instructions of the Tos17 database (http://tos.nias.affrc.go.jp/index.html.en) using a left border primer (5′-ATTGTTAGGTTGCAAGTTAGTTAAGA-3′) together with primers specific for the Tos17 insertion lines (Table S2). The mutants carrying Tos17 insertions were grown in the experimental field as mentioned above, and fertility was checked by counting the numbers of developed and undeveloped seeds after the fully grown plants were harvested. For the Arabidopsis T-DNA insertion lines (Table S3), seed set and/or seed development were checked by dissecting developing siliques and observing the seeds within them.
[Figure 2: see Table S1 for primer sequences. doi:10.1371/journal.pone.0069578.g002]

SDS-PAGE of Lysates from Isolated Gametes, Somatic Tissues, and Pollen Grains
Homogeneous egg cells were manually isolated from rice flowers (Figs. 1A and S1). Observation of the sperm cell fraction revealed that the sperm cells were almost pure, although a small amount of cell or tissue debris, possibly derived from pollen grains, was also present in the fraction (Fig. 1B).
We first separated lysates from a range (375-6,000) of sperm cells alongside the lysate of 25 egg cells by SDS-PAGE (Fig. S2A). Based on band intensities, we judged that 25 egg cells yielded approximately the same amount of protein as 1,500 sperm cells. Similar calibrations were performed using lysates of callus, seedlings, and pollen grains. The protein yield from 25 egg cells was roughly equivalent to 0.038-0.077 µg of callus protein, 0.031-0.062 µg of seedling protein (Fig. S2B), and 0.014-0.028 µg of pollen-grain protein (data not shown). Based on these comparisons, lysates from 50 egg cells, 3,000 sperm cells, callus (0.1 µg protein), seedlings (0.12 µg protein), and pollen grains (0.06 µg protein) were separated on the same SDS-PAGE gel (Fig. 1C). Protein band intensities for egg and sperm cells were equivalent to each other and approximately 2-3-fold weaker than those of callus, seedlings, and pollen grains (Figs. 1C and S2B). Although the protein amounts from callus, seedlings, and pollen grains were higher than those from gametes, the proteome data from somatic tissues and pollen grains were used in this study mainly as subtractive factors to identify gamete-enriched proteins; proteins detected in these tissues were not considered candidates for gamete-enriched proteins. Therefore, protein amounts 2-3-fold higher than those of gametes were considered suitable for our purposes. Maintaining this ratio, lysates of 500 egg cells, approximately 3 × 10^4 sperm cells, 1 µg of callus protein, 1.2 µg of seedling protein, and 0.6 µg of pollen protein were separated by SDS-PAGE for subsequent LC-MS/MS analyses.
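The scale-up from the calibration gel (50 egg cells) to the preparative gel (500 egg cells) is a simple proportional calculation, which the following Python sketch makes explicit. The dictionary keys and function name are illustrative; the protein amounts are given in the same mass unit as in the text.

```python
# Loading used on the calibration gel, taken from the text (Fig. 1C).
REFERENCE_EGG_CELLS = 50
REFERENCE_LOADING = {
    "sperm_cells": 3000,       # number of cells
    "callus_protein": 0.1,     # protein amount, same mass unit as in the text
    "seedling_protein": 0.12,
    "pollen_protein": 0.06,
}


def scale_loading(egg_cells: int) -> dict:
    """Scale every sample proportionally to the number of egg cells loaded,
    keeping the ratios of the calibration gel unchanged."""
    factor = egg_cells / REFERENCE_EGG_CELLS
    scaled = {name: amount * factor for name, amount in REFERENCE_LOADING.items()}
    scaled["egg_cells"] = egg_cells
    return scaled
```

Calling `scale_loading(500)` applies a factor of 10 and reproduces the preparative loading stated above: about 3 × 10^4 sperm cells and 1, 1.2, and 0.6 units of callus, seedling, and pollen protein, respectively.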

Identification of Proteins Preferentially Expressed in Rice Gametes
By analyzing proteins from egg and sperm cell lysates, 2,138 and 2,179 proteins were detected, respectively (Tables S4 and S5). Among these proteins, 1,276 and 1,076 proteins were assigned by at least two peptides (Tables S4 and S5). In callus, seedlings, and pollen grains, 2,877, 2,473, and 2,246 proteins were detected, respectively (Tables S6-S8). All proteins from egg cells with >25 identified spectra are listed in Table 1, along with the numbers of identified spectra in the other cell types. Among these proteins, polyubiquitin; molecular chaperones, including heat shock proteins (HSPs) and protein disulfide isomerase (PDI); enzymes of the glycolytic pathway, including phosphoglycerate kinase, glyceraldehyde-3-phosphate dehydrogenase, fructose-bisphosphate aldolase, and enolase; ascorbate peroxidase; ATP synthase subunit; and annexin protein were abundant. Interestingly, HSP, phosphoglycerate kinase, glyceraldehyde-3-phosphate dehydrogenase, ascorbate peroxidase, and annexin protein were reported as major protein components of rice or maize egg cells in our previous studies [20,21]. In addition, these housekeeping proteins were also detected in the other cell types with at least five identified spectra, suggesting that they can be treated as internal controls between cell types and that proteins preferentially expressed in gametes can be found by using the number of identified spectra. Next, the number of identified spectra for each protein was compared between cell types to identify proteins preferentially expressed in gametes. To screen for proteins enriched in egg cells, proteins with >2 identified spectra in egg cells and none in other cell types were selected. In addition, proteins with >3 identified spectra in egg cells and one in another cell type were also chosen as candidates for egg cell-enriched proteins.
Table 3. Tos17 insertional rice mutants used to check seed-set fertility.
In total, 109 putative egg-enriched proteins were identified (Table S9). Similarly, 79 proteins were identified as enriched in sperm cells (Table S10). Although a small amount of pollen-derived cell or tissue debris was present in the sperm cell fraction (Fig. 1B), subtracting the proteins detected in pollen grains from the list of proteins detected in the sperm cell fraction should exclude pollen-derived proteins from the set identified as sperm proteins. Table 2 presents the putative gamete-enriched proteins with >5 identified spectra. Interestingly, to our knowledge, none of these proteins except heat shock protein 70 (HSP70) has been reported to play a role in reproductive or developmental processes. Investigating these proteins further may uncover novel molecular mechanisms of gamete development, gamete fusion, and early embryogenesis. An abundance of molecular chaperones, including HSPs and PDI, has been suggested to be a common characteristic of mammalian and plant eggs [32][33][34]. In addition, HSPs are suggested to buffer the expression of genetic variation when divergent ecotypes are crossed and to profoundly affect developmental plasticity in response to environmental cues [35]. Chaperones in egg cells may function following fertilization by a sperm cell, because conversion of an egg cell into a zygote represents a major genetic and environmental change. Thus, the specific expression of HSP70 in egg cells may be related to fertilization.
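The spectral-count screening criteria used for egg cells can be expressed as a short Python sketch. It assumes that "other cell types" means every other sample analyzed (the other gamete, callus, seedlings, and pollen grains) and that the same rule is applied to sperm cells; the function name and the dictionary layout are hypothetical.

```python
def is_gamete_enriched(spectral_counts: dict, gamete: str) -> bool:
    """Apply the screening rule described in the text to one protein.

    spectral_counts maps each cell type to the number of identified spectra
    for that protein; gamete is the key of the gamete being screened.
    """
    own = spectral_counts[gamete]
    spectra_elsewhere = sum(
        count for cell_type, count in spectral_counts.items() if cell_type != gamete
    )
    if spectra_elsewhere == 0:
        # More than 2 spectra in the gamete and none in any other cell type.
        return own > 2
    if spectra_elsewhere == 1:
        # More than 3 spectra in the gamete and a single spectrum in one other cell type.
        return own > 3
    return False
```

For example, a protein with counts `{"egg": 4, "sperm": 0, "callus": 1, "seedling": 0, "pollen": 0}` would pass the egg screen, whereas the same protein with only three egg spectra would not.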
To monitor transcript levels for the putative gamete-enriched proteins, the expression levels of six genes listed in Table 2 and of a control gene encoding glyceraldehyde 3-phosphate dehydrogenase were measured in somatic tissues and gametes using quantitative RT-PCR. All six genes showed gamete-preferential expression (Fig. 2), suggesting that the levels of the gamete-enriched proteins encoded by these genes are regulated at the transcriptional level.

Fertility of Rice and Arabidopsis Mutants for Putative Genes Encoding Proteins Preferentially Expressed in Egg or Sperm Cells
Seed-set fertility of plants with mutations in genes encoding gamete-enriched proteins was checked to determine whether functional defects in these proteins could affect reproductive and/or developmental processes such as gamete differentiation, gamete fusion, or early zygotic development. Rice lines in which the genes specifically or predominantly expressed in egg cells (109 genes) or sperm cells (79 genes) are disrupted by insertion of the Tos17 retrotransposon (Tables S9 and S10) were searched for in the Tos17 mutant panel (http://tos.nias.affrc.go.jp/index.html.en). Seven and four Tos17 mutant lines were obtained for egg- and sperm-cell proteins, respectively (Table 3), and their seed-set fertilities were checked. Among these 11 lines, four showed reduced fertility (Fig. 3A and B). Interestingly, three of the four mutant lines for sperm proteins showed clear defects in fertility, whereas only one of the seven mutant lines for egg proteins exhibited reduced fertility (Table 3).
Using rice mutants, we could check the insertional effects of Tos17 on only 11 of the 188 genes that were searched in the Tos17 mutant panel, probably because Tos17 tends to insert, in clusters, at sites within a palindromic consensus sequence, ANGTT-/-AACNT [36]. Therefore, we next employed T-DNA insertional mutants of Arabidopsis, for which abundant mutant stocks are available. First, Arabidopsis genes putatively orthologous to the rice genes encoding gamete-enriched proteins were searched using the Surveyed Conserved Motif Alignment Diagram and the Associating Dendrogram (SALAD) database version 1.0 (http://salad.dna.affrc.go.jp/salad/) [37]. Arabidopsis orthologs were searched for 61 and 39 genes encoding egg-cell- and sperm-cell-enriched proteins with >3 identified spectra, respectively (Fig. S3, Tables S9 and S10), resulting in the detection of 21 and 13 putative orthologous genes, for which 12 and seven T-DNA insertional mutant lines, respectively, were available from ABRC. Seed set and seed development in siliques of these mutants were observed. Among these 19 mutant lines, five showed abnormal seed-set phenotypes (Table 4). In developing siliques of three lines (SALK_018293, SALK_142670, and SALK_095847), ovules that had completely failed to develop were observed (Fig. 4), suggesting that defects in gametophyte formation, gamete function, or fertilization occurred in these mutants. In the other two lines (SALK_027157 and SAIL_6_C02), seeds arrested at immature stages were observed (Fig. 4), indicating that these mutants may have defects in embryo or endosperm development. We further conducted reciprocal crossing experiments using two mutant lines, SALK_018293 and SAIL_6_C02, since they showed different seed-set phenotypes. In the heterozygous SALK_018293 line, seed fertility was clearly decreased when the mutant pistils were pollinated with wild-type pollen grains or self-pollinated (Table S11).
This result suggests that a functional defect occurred on the female side of the mutant, which is consistent with the fact that the gene carrying the T-DNA insertion (At3g01910) is orthologous to a rice gene encoding a putative egg-enriched protein (Table 4). However, fertility of the crossed or self-pollinated siliques of the heterozygous SALK_018293 line was reduced to one-quarter, a larger reduction than expected, since fertility is typically reduced to half when a defect occurs in the female gametophyte of a heterozygous mutant. The possibility that the phenotype of SALK_018293 is due to indirect or pleiotropic effects cannot be excluded. When the heterozygous SAIL_6_C02 line was used for reciprocal crossing with the wild type, no seed abortion was observed. However, upon self-pollination of the mutant, seed fertility was reduced to approximately three-quarters, suggesting that the homozygous mutation results in defects in post-fertilization events, including embryogenesis and/or endosperm development.
Table 4. T-DNA insertional Arabidopsis mutants used to observe seed set and seed development.
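The expected fertility values referred to above (one-half for a female gametophytic defect in a heterozygote, three-quarters for a recessive post-fertilization defect after selfing) follow from simple Mendelian segregation. The sketch below computes them under the assumptions of a single locus, normal transmission through the unaffected parent, and an adjustable penetrance; the function names are illustrative.

```python
from fractions import Fraction


def expected_seed_set_female_gametophytic(penetrance=Fraction(1)) -> Fraction:
    """Heterozygous mother: half of the embryo sacs carry the mutant allele,
    so with full penetrance half of the ovules fail."""
    return 1 - Fraction(1, 2) * penetrance


def expected_seed_set_recessive_zygotic_selfed(penetrance=Fraction(1)) -> Fraction:
    """Selfed heterozygote: one quarter of the zygotes are homozygous mutant,
    so with full penetrance one quarter of the seeds abort."""
    return 1 - Fraction(1, 4) * penetrance


print(expected_seed_set_female_gametophytic())       # 1/2
print(expected_seed_set_recessive_zygotic_selfed())  # 3/4
```

Under these assumptions, the reduction to approximately three-quarters in selfed SAIL_6_C02 matches the recessive zygotic model, whereas the reduction to one-quarter in SALK_018293 is stronger than the one-half expected for a purely female gametophytic defect.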
In SAIL_6_C02, the function of At1g63160, which encodes replication factor C2 (RFC2), is presumed to be disrupted by the T-DNA insertion. Replication factor C is composed of five subunits, RFC1-5, and is known to function in DNA replication, DNA repair, and cell-cycle checkpoint control [38][39][40]. Interestingly, Xia et al. (2007) revealed that AtRFC1 plays an important role in embryo development [41]. The putative defect in RFC2, a different subunit of RFC, also affected a post-fertilization event, which is consistent with the putative function of the RFC complex during embryogenesis. For the other four Arabidopsis mutants and the four Tos17 mutants showing reduced fertility, however, little is known about the possible functions in reproduction or development of the proteins putatively defective in these mutants. These findings suggest that further investigation of the molecular functions of these proteins will uncover novel aspects of plant reproduction and/or development.

Conclusion
In this study, more than 1,000 proteins expressed in egg cells and sperm cells were globally identified. In addition, we identified proteins that were preferentially expressed in egg or sperm cells. These data ameliorate the lack of proteomic information for gametes in angiosperms and provide fundamental information for dissecting the specific functions of gametes. Moreover, several rice or Arabidopsis lines with mutations in genes encoding putative gamete-enriched proteins clearly showed reproductive or developmental defects. Addressing the functions of these proteins during reproduction and/or zygotic development will improve our knowledge of these processes. Analyses using several mutant plants are currently underway in our laboratories.

Figure S3. A typical SALAD analysis result for Os01g0771100 encoding an egg cell-enriched protein. A dendrogram of sequences clustered according to the presence and similarity of conserved motifs and a diagram that displays positional information of the motifs in each sequence are presented by SALAD database 1.0 [37]. The arrow and arrowhead indicate the Os01g0771100 protein and a putative Arabidopsis orthologue, respectively. Numbers on the dendrogram indicate bootstrap values, and boxes of the same color represent the same motif. (TIF)