Enrichment and Analysis of Intact Phosphoproteins in Arabidopsis Seedlings

Protein phosphorylation regulates diverse cellular functions and plays a key role in the early development of plants. To complement and expand upon previous investigations of protein phosphorylation in Arabidopsis seedlings, we used an alternative approach that combines protein extraction under non-denaturing conditions with immobilized metal-ion affinity chromatography (IMAC) enrichment of intact phosphoproteins in Rubisco-depleted extracts, followed by identification using two-dimensional gel electrophoresis (2-DE) and liquid chromatography-tandem mass spectrometry (LC-MS/MS). In-gel trypsin digestion and analysis of selected gel spots identified 144 phosphorylated peptides and residues, of which only 18 phosphopeptides and 8 phosphosites were found in the PhosPhAt 4.0 and P3DB Arabidopsis thaliana phosphorylation site databases. More than half of the 82 identified phosphoproteins were involved in carbohydrate metabolism, photosynthesis/respiration or oxidative stress response mechanisms. Enrichment of intact phosphoproteins prior to 2-DE and LC-MS/MS appears to enhance detection of phosphorylated threonine and tyrosine residues compared with methods that utilize peptide-level enrichment, suggesting that the two approaches are somewhat complementary in terms of phosphorylation site coverage. Comparing results for young seedlings with those obtained previously for mature Arabidopsis leaves identified five proteins that are differentially phosphorylated in these tissues, demonstrating the potential of this technique for investigating the dynamics of protein phosphorylation during plant development.


Introduction
Seedling establishment is a critical stage in plant development, involving the transition from heterotrophic to autotrophic growth. [1] In Arabidopsis, seed germination is driven largely by the metabolism of storage products other than lipids, whereas seedling establishment involves the mobilization of seed oil reserves. [2] Triacylglycerol (TAG) is the predominant source of carbon in the seeds of Arabidopsis and related species, including Brassica napus (canola), [3] and mobilization of TAG supplies the energy and molecular building blocks required for seedling establishment. [1,4] Utilization of TAG and other seed reserves is thought to be controlled and regulated by multiple pathways, [5] and although considerable progress has been made in understanding dormancy and seed germination [6][7][8] the cellular mechanisms involved in seedling establishment are less well understood.
Following germination the glycerol released from TAG through lipase action is converted to glyceraldehyde-3-phosphate (G-3-P) and then by isomerization to dihydroxyacetone phosphate (DHAP), which can either undergo glycolysis to pyruvate or conversion to hexose via gluconeogenesis. [9] The free fatty acids are catabolized by β-oxidation in the glyoxysome. A more complete understanding of how this metabolic program is regulated in Arabidopsis would increase our knowledge of post-embryonic development in plants and assist in the improvement of canola and other oilseed crops.
One way to achieve this is to study protein phosphorylation during the early stages of seedling establishment, because reversible phosphorylation of proteins regulates a wide variety of cellular processes during plant growth and development. [10] However, the analysis of protein phosphorylation can be challenging due to the low relative abundance of phosphoproteins and the possibility of phosphorylation at multiple sites within a given protein. [11,12] Affinity enrichment of phosphorylated proteins and/or the component phosphopeptides obtained by proteolysis can significantly enhance the identification of such proteins and the mapping of phosphorylation sites. However, phosphoproteome analysis of certain plant tissues is complicated by the presence of D-ribulose bisphosphate carboxylase/oxygenase (Rubisco), [11,13] an abundant phosphoprotein that hinders the detection and analysis of other, less abundant plant proteins. Rubisco depletion columns (e.g. Seppro IgY spin columns; GenWay Biotech, San Diego CA, USA) are commercially available and have been used successfully to deplete Rubisco in total protein extracts. [14,15] Advances in phosphopeptide enrichment strategies have also enabled large-scale phosphoproteomic studies in Arabidopsis, providing new insights regarding the potential involvement of protein phosphorylation in various stages of plant development. [11,16-23] Despite these efforts, our knowledge of protein phosphorylation events during the transition from heterotrophic to photoautotrophic growth in young seedlings remains incomplete.
Much of the information currently available in protein phosphorylation databases has been generated using peptide-level enrichment strategies, and although affinity purification of intact phosphoproteins has been demonstrated [11,24] the use of protein-level enrichment for phosphoproteome analysis in plants remains largely unexplored. To complement and expand upon previous investigations involving phosphopeptide enrichment we carried out a survey of protein phosphorylation in post-embryonic Arabidopsis seedlings (hereafter referred to as young seedlings) using Rubisco depletion and enrichment of intact phosphoproteins by immobilized metal-ion affinity chromatography (IMAC) combined with two-dimensional gel electrophoresis (2-DE) and liquid chromatography-tandem mass spectrometry (LC-MS/MS). The results of this study were then compared with those obtained previously using Rubisco depletion and protein-level enrichment of phosphoproteins from Arabidopsis mature leaves [11] to evaluate this approach for monitoring the dynamics of protein phosphorylation during plant development.

Plant growth and protein extraction
Arabidopsis thaliana (L) Heynh (Col-0) seeds were treated with 50% bleach in Milli-Q water (v/v) containing 5.25% sodium hypochlorite for 2 min and then with 50% (v/v) ethanol for 2 min before washing 4 times with sterilized Milli-Q water and cultivating in Petri dishes containing 0.5x Murashige and Skoog [25] mineral salts with BactoAgar. Seeds were allowed to imbibe at 4°C for 4 days in the dark before transferring them to a growth chamber maintained at 22°C and a 16-h light/8-h dark cycle. Week-old whole seedlings (including roots) with 2 seed leaves were collected for protein extraction. One gram of seedlings was ground to a powder in liquid nitrogen with 0.5% (w/w) PVPP and homogenized in 2 ml of ice-cold extraction buffer (pH 7.4) containing 10 mM Tris-HCl, 150 mM NaCl, the serine protease inhibitor PMSF (1 mM, freshly prepared in DMSO) and a protease inhibitor cocktail developed for plant cell and tissue extracts (Sigma P-9596, 0.2% v/v), together with phosphatase inhibitors 20 mM sodium fluoride, 1 mM sodium molybdate, 1 mM sodium orthovanadate, and 1 mM sodium β-glycerophosphate. The slurry was stirred for 30 min on ice, filtered through two layers of cheesecloth and centrifuged at 10,000 × g for 15 min at 4°C. After discarding the pellet the amount of protein in the supernatant was determined using the Bradford assay (Bio-Rad) with BSA as the standard, and the final concentration of the sample adjusted to 1 mg/ml using the extraction buffer.

Rubisco depletion
Each protein sample was filtered through a 0.45 μm spin filter (Millipore) and 500 μl of the extract, containing about 500 μg of protein, was loaded onto a Seppro IgY column. Rubisco was removed according to the manufacturer's instructions. The protein flow-through and the bound fraction were collected separately, and each precipitated with 5 volumes of ice-cold methanol and 100 mM ammonium acetate at -20°C overnight. After centrifugation at 10,000 × g for 20 min at 4°C, the resulting pellets were thoroughly washed twice with ice-cold 100% methanol and then with 80% ice-cold methanol. Each pellet was briefly dried using a SpeedVac, re-dissolved in the column incubation buffer (6 M urea, 0.25% CHAPS, 50 mM sodium acetate, pH 4.0) to approximately 1 mg/ml, and used for phosphoprotein enrichment by immobilized metal-ion affinity chromatography (IMAC).

Phosphoprotein enrichment
Enrichment of intact phosphoproteins from Rubisco-depleted samples was carried out as previously described. [11] Briefly, a 500 μl slurry of PHOS-Select iron affinity gel beads (Sigma) was washed 3 times with 0.1% TFA in 30% acetonitrile and equilibrated 3 times with 500 μl of incubation buffer (6 M urea, 0.25% CHAPS, 50 mM sodium acetate, pH 4.0) with centrifugation at 1,000 × g for 1 min between each step before being loaded onto the column. Two ml of Rubisco-depleted protein sample in incubation buffer were loaded onto each spin column (about 2 mg total protein per 500 μl of bead slurry) and incubated for 1 h at room temperature with gentle shaking. Phosphoproteins bound to the IMAC columns were eluted three times with 200 μl of elution buffer (6 M urea, 50 mM Tris-acetate pH 7.5, 0.1 M EDTA, 0.1 M EGTA, 0.25% CHAPS), each time incubating at room temperature for 10 min with gentle shaking, and then centrifuged at 1,000 × g for 1 min. The 3 eluates were pooled and precipitated with methanol as previously described before re-suspension in lysis buffer (7 M urea, 2 M thiourea, 2% CHAPS) to obtain a total protein concentration of 1 μg/μl prior to 2-D gel electrophoresis.

Gel electrophoresis and in-gel digestion
One-dimensional gel electrophoresis was used to resolve proteins from the bound and flow-through fractions obtained during Rubisco depletion on the Seppro column. Ten μl of each fraction containing approximately 10 μg of protein was mixed with 10 μl of gel sample buffer (0.2 M Tris-HCl, pH 6.8, 2% SDS, 10% glycerol, 0.02% bromophenol blue) and separated on a 1.0 mm, 12.5% Criterion Tris/HCl gel in a Criterion Cell (Bio-Rad) (13.3 cm × 8.7 cm) at a constant voltage of 150 V. The separated proteins were visualized using Bio-Safe Coomassie Blue stain (Bio-Rad).
For phosphoproteome analysis, 200 μl (200 μg) of IMAC-enriched phosphoprotein sample was mixed with 200 μl of rehydration buffer (7 M urea, 2 M thiourea, 2% CHAPS, 10 mM DTT, 0.5% IPG buffer, pH 3-10), resolved by 2-DE and visualized by silver staining. [10] Gel images were recorded using an ImageScanner (GE Healthcare) and Phoretix 2D software (v2004) was used to measure the total number of protein spots visualized in each 2-DE gel image. Proteins of interest were excised manually from each gel and digested with trypsin using a MassPREP protein digestion station, according to the protocol (digestion 5.0) recommended by the manufacturer (Micromass, Manchester, UK). Preparation of tryptic peptide samples for LC-MS/MS analysis was carried out as previously described. [11]

Mass spectrometry and protein identification
Six μl of each 2-D gel protein digest was analyzed using a nanoACQUITY UPLC system (Waters, Milford, MA, USA) interfaced to a quadrupole time-of-flight (Q-TOF) Ultima Global hybrid tandem mass spectrometer (Waters, Mississauga, ON, Canada). Separations were performed using a Waters BEH130 C18 nanoACQUITY UPLC analytical column (75 μm, 1.75 mm × 100 mm) at an initial flow rate of 400 nl/min. Mobile phase solvent A was 0.2% formic acid in water and solvent B was 0.2% formic acid in 100% acetonitrile. Separations were performed using the following 55-min solvent program: 99:1 (%A:%B) for 1 min, changing to 90:10 at 16 min, 55:45 at 45 min, and 20:80 at 46 min, at which point the flow rate was changed to 800 nl/min and the gradient held until 52 min before reverting to 99:1 at 53 min. A 5 min seal wash with 10% acetonitrile in water was carried out after the completion of each run.
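The solvent program described above can be written down as a set of time/composition breakpoints. The following is a minimal sketch, not the instrument method itself: it assumes that the composition changes linearly between the stated breakpoints and that the final 99:1 composition is held from 53 min to the end of the 55-min program. The names GRADIENT and percent_b are illustrative only.

```python
from bisect import bisect_right

# (time in min, %B) breakpoints transcribed from the solvent program in the text.
# Assumptions: linear change between breakpoints; 99:1 held from 53 to 55 min.
GRADIENT = [(0, 1), (1, 1), (16, 10), (45, 45), (46, 80), (52, 80), (53, 1), (55, 1)]


def percent_b(t_min: float) -> float:
    """Return the percentage of solvent B at time t_min (linear interpolation)."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time lies outside the 55-min solvent program")
    times = [t for t, _ in GRADIENT]
    i = bisect_right(times, t_min) - 1  # index of the segment start
    if i == len(GRADIENT) - 1:          # exactly at the final breakpoint
        return float(GRADIENT[-1][1])
    (t0, b0), (t1, b1) = GRADIENT[i], GRADIENT[i + 1]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


def percent_a(t_min: float) -> float:
    """Solvent A makes up the remainder of the mobile phase."""
    return 100.0 - percent_b(t_min)
```

For example, percent_b(30.5) falls halfway along the 16 to 45 min segment, between 10% and 45% B.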
The Q-TOF MS was operated in the positive ion mode and TOF MS spectra were acquired over the m/z range 400-1900 at the rate of one scan/s. Of the multiply charged (2+, 3+, or 4+) peptide ion peaks rising over a threshold, the three most abundant were automatically selected for CID, and product-ion spectra were acquired over the m/z range 50-1900 in TOF MS/MS mode. The CID collision energy was selected automatically according to the m/z ratio and charge state of the precursor ion. A real-time exclusion window was used to prevent precursor ions with the same m/z from being selected for CID and TOF MS/MS within 2 min of their initial acquisition. Data were also acquired using pre-programmed exclusion lists for keratin and trypsin.
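The precursor selection rule described above (three most abundant multiply charged ions above a threshold, with a 2-min exclusion window) can be sketched as follows. This is an illustration of the logic only, not the acquisition software: the Peak type, the function name, the intensity threshold and the m/z matching tolerance (mz_tol) are all assumptions, since the text does not state them.

```python
from typing import Dict, List, NamedTuple


class Peak(NamedTuple):
    mz: float
    charge: int
    intensity: float


def select_precursors(peaks: List[Peak], now_s: float, recent: Dict[float, float],
                      threshold: float, top_n: int = 3,
                      exclusion_s: float = 120.0, mz_tol: float = 0.2) -> List[Peak]:
    """Pick up to top_n precursors for CID from one survey scan.

    recent maps previously selected m/z values to their selection time (s)
    and is updated in place. mz_tol is a hypothetical matching tolerance.
    """
    # Forget precursors whose 2-min exclusion window has expired.
    for mz in [m for m, t in recent.items() if now_s - t >= exclusion_s]:
        del recent[mz]
    # Keep 2+, 3+ or 4+ ions above threshold that are not currently excluded.
    candidates = [p for p in peaks
                  if p.charge in (2, 3, 4)
                  and p.intensity > threshold
                  and not any(abs(p.mz - m) <= mz_tol for m in recent)]
    chosen = sorted(candidates, key=lambda p: p.intensity, reverse=True)[:top_n]
    for p in chosen:
        recent[p.mz] = now_s
    return chosen
```

In this sketch, a precursor selected at time 0 becomes eligible again for any scan acquired 120 s or more later.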
Data were processed using MassLynx 4.1 (Waters, Milford, MA) and searched against the NCBInr protein sequence database for Arabidopsis thaliana (thale cress) using an in-house Mascot server (Version 2.2, Matrix Science, UK) and the following parameters: carbamidomethylation of cysteine as the fixed modification; oxidation of methionine and phosphorylation of serine, threonine and tyrosine as variable modifications; mass tolerances of 0.2 Da for MS and 0.5 Da for MS/MS data; and one missed cleavage for tryptic peptides. Peptide MS/MS spectra used for protein identification had to be of sufficient quality, with a signal-to-noise ratio of 3 or greater for annotated fragment ions, including neutral loss peaks associated with de-phosphorylation during CID. Only peptides matched with significant ion scores (P <0.05) and low expectation values (E-value <0.01) were selected. For unambiguous identification, each peptide MS/MS spectrum had to contain at least three sequential y- or b-type ions. Protein identification was regarded as positive if the Mascot score exceeded the 95% confidence threshold, the matched protein contained at least four top-ranking unique peptides, and protein sequence coverage by the matching peptides was >15%. If the same set of peptides matched multiple members of a protein family, or a protein appeared under different names and accession numbers in the database, the entry with the highest score and/or most descriptive name was reported. When protein isoforms were observed, the data were inspected manually. If several isoforms shared the same set of identified peptides, the protein with the most matching peptides was accepted as the correct result. The presence of protein isoforms was confirmed and reported based on the identification of at least two unique peptides.
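The numerical acceptance criteria listed above can be expressed as two simple filters. The sketch below is illustrative and is not part of Mascot or MassLynx: the class and field names are hypothetical, and the manual inspection steps (spectrum quality, isoform handling) are not modeled.

```python
from dataclasses import dataclass


@dataclass
class PeptideMatch:
    p_value: float        # significance of the ion score
    e_value: float        # expectation value
    longest_ion_run: int  # longest run of sequential y- or b-type ions


@dataclass
class ProteinHit:
    mascot_score: float
    threshold_95: float   # score at the 95% confidence threshold for the search
    unique_peptides: int  # top-ranking unique peptides
    coverage_pct: float   # sequence coverage by matching peptides (%)


def accept_peptide(m: PeptideMatch) -> bool:
    """P < 0.05, E-value < 0.01, and at least three sequential y- or b-ions."""
    return m.p_value < 0.05 and m.e_value < 0.01 and m.longest_ion_run >= 3


def accept_protein(h: ProteinHit) -> bool:
    """Score above the 95% threshold, at least 4 unique peptides, coverage > 15%."""
    return (h.mascot_score > h.threshold_95
            and h.unique_peptides >= 4
            and h.coverage_pct > 15.0)
```

A hit with exactly 15% coverage is rejected in this sketch because the criterion in the text is a strict inequality.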
Since the error tolerance of the MS method used (200 mDa) was greater than the mass difference between phosphorylation and sulfation (9.5 mDa), a second error-tolerant search reporting masses to 0.1 mDa was performed to allow sulfation and phosphorylation to be distinguished. Raw MS/MS spectra matched to phosphorylated peptides in the Mascot search were manually inspected and validated using MassLynx 4.1. The spectra were processed to give singly charged, monoisotopic, centroided peaks and compared with the in silico fragmentation masses for the matched peptide to confirm neutral loss of phosphoric acid for serine and threonine phosphorylation, or the mass increment of 80 Da associated with phosphorylation of tyrosine.
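The 9.5 mDa difference quoted above follows from the monoisotopic masses of the two modifications. The short calculation below is a sketch for illustration, assuming that phosphorylation adds HPO3 and sulfation adds SO3 to the modified residue, and using standard monoisotopic isotope masses; the variable names are illustrative only.

```python
# Monoisotopic masses (Da) of the most abundant isotopes.
MASS = {"H": 1.007825, "O": 15.994915, "P": 30.973762, "S": 31.972071}


def composition_mass(formula: dict) -> float:
    """Sum the monoisotopic mass of an elemental composition, e.g. {"H": 1, "P": 1, "O": 3}."""
    return sum(MASS[element] * count for element, count in formula.items())


HPO3 = {"H": 1, "P": 1, "O": 3}             # mass added by phosphorylation
SO3 = {"S": 1, "O": 3}                      # mass added by sulfation
PHOSPHORIC_ACID = {"H": 3, "P": 1, "O": 4}  # neutral loss from pS/pT during CID

# Difference between the two modifications, in mDa.
delta_mda = (composition_mass(HPO3) - composition_mass(SO3)) * 1000.0

print(f"phosphorylation: +{composition_mass(HPO3):.4f} Da")
print(f"sulfation:       +{composition_mass(SO3):.4f} Da")
print(f"difference:       {delta_mda:.1f} mDa")
print(f"H3PO4 loss:       {composition_mass(PHOSPHORIC_ACID):.4f} Da")
```

The same masses account for the nominal 80 Da increment on phosphotyrosine and the neutral loss of about 98 Da from phosphoserine and phosphothreonine.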

Phosphoproteome analysis of young seedlings
A schematic representation of our analytical approach is shown in Fig 1. The molecular weight distributions of proteins in the bound and flow-through samples following Rubisco depletion were investigated by 1-DE (Fig 2). Results show that the Seppro IgY Rubisco-depletion columns are efficient at removing Rubisco from the protein extracts of young seedlings. The Rubisco protein concentrated in the bound fraction is predominantly the small subunit (SSU), whereas both small and large subunits of Rubisco were evident in a previous study of mature Arabidopsis leaves. [11] That study also found that Rubisco depletion significantly increased the number of identified phosphoproteins, even without IMAC enrichment, and that only Rubisco and other relatively abundant phosphoproteins were recovered from non-depleted extracts using IMAC, whereas IMAC enrichment more than doubled the number of phosphoproteins identified in depleted extracts. It has recently been demonstrated that the Rubisco SSU up-regulates expression of the Rubisco large subunit (LSU) at the transcriptional level. This coordinated expression of subunits may explain the relatively small amount of Rubisco LSU observed during early growth in young seedlings. [26]
IMAC-purified phosphoproteins from the Rubisco-depleted flow-through fraction were subsequently resolved by 2-DE (Fig 3). The reproducibility of both 1- and 2-DE experiments was confirmed by analyzing and comparing three biological replicates (not shown). An average of 175 protein spots were detected in replicate 2-DE gels following IMAC enrichment of Rubisco-depleted extracts. These were excised, trypsinized and analyzed by LC-MS/MS, which identified 156 of the spots based on our acceptance criteria for protein identification (see above). Of these, 105 spots (i.e. 60% of the 175 detected following IMAC) were found to contain a total of 82 different phosphoproteins based on the detection of 144 tryptic phosphopeptides, not counting methionine-oxidized and non-oxidized forms of the same peptide (Table 1).
The 144 detected phosphopeptides contained a total of 144 unique sites of protein phosphorylation, of which 48% (69) were serine, 48% (69) were threonine, and 4% (6) were tyrosine residues (Table 2; Fig 4A). To assess any differences in phosphorylation occupancy among the S, T and Y residues, we compared our results with those from previous studies that utilized different enrichment methods and plant tissues (Table 2). The distribution observed in this study for Arabidopsis seedlings is similar to that obtained using Rubisco depletion and IMAC enrichment of intact phosphoproteins from mature Arabidopsis leaves, [11] which contained 52% phosphoserine (pS), 40% phosphothreonine (pT), and 8% phosphotyrosine (pY) residues (Table 2). However, these results differ from those obtained using IMAC to enrich phosphopeptides generated by trypsin digestion of plant phosphoproteins. For example, previous results reported 88% pS, 11% pT and 1% pY in 22-day-old Arabidopsis seedlings [22]; 85% pS, 13% pT and 2% pY in 9-day-old Arabidopsis seedlings [23]; 85% pS, 11% pT and 4% pY in cultured Arabidopsis cells [21]; 86% pS, 13% pT and 1% pY in Medicago truncatula roots [26]; and 81% pS, 17% pT, and 2% pY in dormant poplar (Populus simonii × P. nigra) buds [27] when IMAC enrichment was performed at the phosphopeptide level. Rao and Moller [28] reported the occurrence of 77% pS, 17.5% pT and 5.5% pY in eukaryotic phosphoproteins based on a combined Uniprot, Phospho.ELM and Phosida database analysis, which also differs from the present study.
By way of comparison, the average pS:pT:pY ratio observed for cellular phosphoproteins in mammals is approximately 1800:200:1, [29] corresponding to 89.95% pS, 10.00% pT and 0.05% pY. These results suggest that peptide- and protein-level enrichment strategies complement each other to some extent and that the latter provides access to a greater proportion of phosphorylated threonine and tyrosine residues, at least in plant phosphoproteins.
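The percentages quoted in this comparison follow directly from the residue counts and ratios. As a minimal illustration (the function name to_percent is ours, not from the study), the conversion can be checked as follows:

```python
def to_percent(counts: dict) -> dict:
    """Convert residue counts (or ratio terms) into percentages of the total."""
    total = sum(counts.values())
    return {residue: 100.0 * n / total for residue, n in counts.items()}


# Phosphosites identified in young seedlings in this study (144 in total).
seedlings = to_percent({"pS": 69, "pT": 69, "pY": 6})

# Approximate pS:pT:pY ratio reported for mammalian phosphoproteins.
mammals = to_percent({"pS": 1800, "pT": 200, "pY": 1})

for name, dist in (("seedlings", seedlings), ("mammals", mammals)):
    print(name, {residue: round(pct, 2) for residue, pct in dist.items()})
```

The seedling counts give roughly 48% pS, 48% pT and 4% pY, and the 1800:200:1 ratio gives roughly 90% pS, 10% pT and 0.05% pY, in line with the rounded values in the text.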
These findings are of particular significance given the emerging importance of tyrosine phosphorylation in plant processes such as germination, growth, development, and abiotic stress responses. [30] In particular, our discovery of 6 new tyrosine phosphorylation sites (Table 1) in proteins involved in the mobilization of seed reserves (NAD+ MDH), cell defence (MLP), cellular signaling (cyclase family protein), oxidative stress response (GST9), protein degradation (20S proteasome alpha subunit B) and protein folding (chaperonin 20) represents a significant contribution to the list of potential substrates for known and predicted protein tyrosine kinases in plants. [30] It also helps to address the apparent discrepancy between the predicted frequency of pY residues in the Arabidopsis proteome [13] and that observed using peptide-level affinity enrichment strategies, during which the phosphorylated residues in each protein are distributed between tryptic peptides containing only one or two such residues, of which those carrying the more abundant pS modification are likely to predominate in terms of recovery and analysis.
Of the 144 phosphopeptides and 144 phosphosites reported in the present study, only 10 peptides and 1 phosphorylation site matched those identified during a recent survey of the phosphoproteome in hydroponically-grown Arabidopsis seedlings, which utilized Ti4+-IMAC enrichment of tryptic phosphopeptides from whole protein digests (Table 1). [23] Of those 144 phosphopeptides, 10 phosphopeptides and 5 phosphorylation sites were found in both the [...]

Fig 3. Two-dimensional gel electrophoresis of Rubisco-depleted phosphoproteins enriched by immobilized metal-ion affinity chromatography using PHOS-Select iron affinity gel beads. Phosphoproteins identified by liquid chromatography-tandem mass spectrometry are indicated using arrows and numbers (see Table 1).

Table 1. Phosphorylated proteins, peptides and residues (phosphosites) identified by mass spectrometry in protein extracts from young Arabidopsis seedlings after Rubisco depletion, [...]

[...] 18 and 19). Detection of the novel phosphopeptide pTLLFGEKPVTVFGIR in both isoforms indicates that both are phosphorylated at T70. However, a second phosphopeptide SDLDIVpSNASCTTNCLAPLAK, which had previously been detected in Arabidopsis seedlings [23] (though with a different site of phosphorylation), was also identified in one of the isoforms (spot 18), indicating phosphorylation at S152 (Table 1). The concomitant reduction in pI relative to the other isoform (spot 19) is consistent with horizontal separation of these two proteins on the 2-DE gel (Fig 3), demonstrating the utility of our gel-based approach for resolving differentially phosphorylated forms of a given protein. Similarly, vertical separation of two Rubisco polypeptides (Fig 3, spots 65 and 66) reflects the difference in molecular weight between the matched proteins, each of which contained the same number (3) of identified phosphorylation sites (Table 1).
IMAC purification, 2-DE separation, and digestion of intact phosphoproteins to produce a mixture of phosphorylated and non-phosphorylated peptides may have contributed to the relatively small number of multiply-phosphorylated peptides identified during this study, compared with studies in which only phosphorylated peptides were enriched and analyzed by mass spectrometry. [16-18,21,26,27,31] However, the average number of phosphopeptides identified per plant protein (1.8 in young Arabidopsis seedlings and 1.9 in mature leaves [11]) compares well with studies that utilize peptide-level enrichment [27]. Furthermore, the phosphoproteins we identified in young seedlings using protein-level enrichment include basic proteins (e.g. APX1, APX4, nucleotide diphosphate kinase) and proteins previously identified as plasma membrane proteins (e.g. CA2, PGK, DHAR1) in Arabidopsis seedlings, [12,16,32] suggesting minimal bias towards proteins of a particular polarity, pI or molecular weight. [11] By enabling protein identification using both phosphorylated and non-phosphorylated peptides our approach also provides high confidence in the identification of phosphoproteins and hence, their selection as candidates for further investigation of the role of protein phosphorylation during plant development (which lies beyond the scope of the present study).

Functional classification of phosphoproteins
The identified phosphoproteins were sorted into functional groups using the KEGG Pathway database (http://www.genome.jp/kegg/pathway.html). The two largest groups were those involved in carbohydrate/energy metabolism (22%) and oxidative stress/redox regulation (20%), which together with photosynthesis and respiration (11%) accounted for more than half of the identified phosphoproteins (Fig 4). Many of these, including glyceraldehyde-3-phosphate dehydrogenase (GAPC-2), triosephosphate isomerase (TPI), phosphoglycerate kinase (PGK1), fructose bisphosphate aldolase (FBA), and malate dehydrogenase (MDH), play important roles in processes such as glycolysis, gluconeogenesis and the Calvin cycle during seed germination and the early stages of seedling establishment. Identification of phosphorylated 20S proteasome subunits, proteases, chaperonins, thioredoxins, glutathione transferases (GSTs), dehydroascorbate reductase (DHAR1) and manganese superoxide dismutase (MSD1) is also consistent with the role of proteolytic events in mobilizing TAG and other seed reserves. Comparison of our experimental results with the PhosPhAt 4.0 Arabidopsis thaliana phosphorylation site database [33,34], P3DB database [35] and with supplementary information from a recently published survey of the Arabidopsis seedling phosphoproteome [23] showed that 43 of the 82 phosphoproteins identified in our study have not been reported before (Table 1), and that we were able to identify new phosphorylation sites in previously characterized phosphoproteins such as FBA, GAPC-2, TPI and PMDH1 (S1 Fig), GSTs (S2A Fig and S2B Fig), PRK and IDH. New and known phosphorylation sites were also identified in 12S seed storage proteins (Table 1), further demonstrating the utility of our approach for identifying novel sites of protein phosphorylation in plant tissues.

Phosphorylation of enzymes involved in post-embryonic development
Many of the enzymes known to be important during the early stages of plant growth were found to be phosphorylated in Arabidopsis young seedlings. The glycolytic enzyme triosephosphate isomerase (TPI), for example, plays a central role in chloroplast development [36] and other biochemical pathways by equilibrating the cytosolic pool of DHAP and G-3-P. The latter is required for ribulose-1,5-bisphosphate production in the Calvin cycle, whereas DHAP suppresses the production of chlorophyll and ribulose-1,5-bisphosphate. Phosphorylation of human TPI has been shown to reduce its activity in converting G-3-P to DHAP, and although it has been suggested that TPI can be phosphorylated at S21 there is evidence that other sites may be subject to phosphorylation. [37] Our discovery of phosphorylated S106 and S178 residues in Arabidopsis TPI (Table 1, spot 64) provides new information with which to investigate the role of protein phosphorylation in controlling the activity of this enzyme and thus regulating chloroplast development in young seedlings.
NAD+ MDH, a key enzyme in carbohydrate metabolism, is responsible for regenerating NAD+ and is involved in the mobilization of seed oil reserves [4] and the photosynthetic assimilation of carbon in developing leaves. [38] We identified several sites of phosphorylation in mitochondrial NAD+ MDH (Table 1, spot 9), as well as a single site of phosphorylation in cytosolic MDH (Table 1, spot 10). We also observed phosphorylation of 3-isopropyl malate dehydrogenase (spot 16), which is primarily involved in leucine biosynthesis. [39] Carbonic anhydrase (CA), a major chloroplast protein, is involved in photosynthesis [40] and the mobilization of seed reserves during the early stages of post-embryonic growth. CA1 is also known to form part of a Rubisco-containing Calvin cycle enzyme complex. [40] Identification of phosphorylation sites in CA (spots 8 and 12), ribose-5-phosphate isomerase (spot 63), Rubisco SSU (spot 66) and PRK (spot 22) may help to elucidate the role of protein phosphorylation in controlling the assimilation and utilization of carbon reserves during the early stages of seedling establishment. [41] Other identified phosphoproteins include members of the jacalin-lectin (Fig 5A), cupin, and cyclase families (spots 32 to 34), all of which are involved in cell signaling. A cupin domain protein (AtPirin1) has also been found to interact with G protein α-subunit GPA1 in Arabidopsis to regulate seed germination and seedling development. [42] Phosphorylation of 20S proteasome subunit PtrPBA1, and increased expression of 20S proteasome α-subunit B and regulatory subunit RPN10, have been observed in poplar dormant terminal buds. [27] We observed phosphorylation of 20S proteasome α-subunit B (spot 75) at S54, T179, T194 and Y101 and of the 20S proteasome subunit PAA2 (spot 73) at S64 and T166 in Arabidopsis young seedlings (Table 1).
ATP-dependent Clp protease proteolytic subunit (CLPP) is a highly conserved, multimeric serine protease [43] that degrades large globular proteins in the presence of an AAA ATPase complex. [44] CLPP (spot 72) was found to be phosphorylated at S107 and S241, and a Clp amino terminal domain-containing protein (spot 71) at S224.
Although there is growing evidence of crosstalk between redox signaling and hormonal response pathways during seed germination, [45] the molecular components involved in this process during post-embryonic development remain elusive. We identified phosphorylated forms of several proteins known to be key regulators of stress response, including APX1 (spot 56), APX4 (spot 45), GST6 (spot 58), ATGSTF9, -10 and -6 (spots 47 to 49 and S2A Fig and S2B Fig), dehydroascorbate reductase 1 (spot 46), thioredoxins M2 and M4 (spots 53 and 54), peroxiredoxins (spots 51 and 59), and manganese superoxide dismutase (spot 60). Phosphorylation of APX1, APX4, peroxiredoxin type-2, GST6, and MSD1 was also observed in mature leaves [11] but at sites other than those observed in young seedlings (Table 3). Thioredoxins and other H2O2-scavenging enzymes help to protect plants from damage caused by the production of reactive oxygen species (ROS) during seed germination and seedling development. [46] Germin-like protein, which generates H2O2 from the oxidative breakdown of oxalate, [47] was also found to be phosphorylated in our study (Fig 5B).
Heat shock proteins (HSPs) are involved in bud dormancy [48] and phosphorylation of HSPs and chaperonin has been reported in Arabidopsis [21,49] and poplar. [27] Our results confirm phosphorylation of these proteins in Arabidopsis seedlings and identify sites of phosphorylation in HSP60 (T80) and chaperonin 20 (adjacent residues Y59, T60 and S61) that, to the best of our knowledge, have not been reported before (S2C Fig and S2D Fig).

Comparing protein phosphorylation at different stages of development
In a previous study we used IMAC to recover and identify 132 phosphoproteins with 252 component phosphopeptides in mature Arabidopsis leaf extracts (Fig 4A), following polyethylene glycol (PEG) fractionation to deplete Rubisco. [11] Having now used IMAC to recover and identify intact phosphoproteins in Rubisco-depleted extracts from young seedlings we decided [...]

Table 3. Changes in protein phosphorylation between post-embryonic seedlings and mature leaves. Phosphorylated proteins, peptides and residues (S = serine, T = threonine, Y = tyrosine) identified in post-embryonic seedlings and mature leaves of Arabidopsis thaliana. Common phosphopeptides with conserved phosphosites are highlighted in bold and common phosphopeptides with different phosphosites are highlighted in bold and italics.

[...] Table 3). For example, phosphorylation of the Rubisco small chain 1A at T58, T14 and T78 was observed in both seedlings and leaves, confirming phosphorylation of the protein at those sites. However, some of the phosphopeptides spanning the same amino acid sequence in both tissues showed a difference in protein phosphorylation state between young seedlings and mature leaves. For example, the CA2 peptide VLAESESSAFEDQCGR was identified in both tissues but was phosphorylated at S191 in seedlings and at S193 in leaves.
Tryptic peptides showing differential phosphorylation of four other proteins (NAD+ MDH, PBP1, avirulence responsive protein, and ribose 5-phosphate isomerase) were also observed (Table 3), suggesting that these proteins may play a significant role in Arabidopsis development.
Comparing the phosphorylation status of 12S seed storage protein (cruciferin) in young seedlings (Table 1, spot 80) and dormant Arabidopsis seeds [50] shows that certain phosphorylation sites (T395 and T420) are common to both tissues, thereby validating the current method with reference to results obtained during a previous in-depth study of cruciferin phosphorylation. However, an apparent shift in phosphorylation site from S367 in dormant seeds to S366 in post-embryonic seedlings again demonstrates the ability to detect subtle changes in phosphorylation status that may have implications for seed storage protein mobilization and other processes during plant development, [50] although further investigations are required to confirm the significance of these findings.

Conclusions
Seedling establishment involves the efficient utilization of endogenous protein reserves and external resources, requiring that developmental and metabolic programs adapt to the prevailing environmental conditions. [51] Using a combination of Rubisco depletion and IMAC enrichment of intact phosphoproteins we identified and characterized the phosphorylated forms of 82 proteins expressed in Arabidopsis young seedlings. These included enzymes involved in chloroplast development, mobilization of TAG, and other processes known to be important during the early stages of plant development. Comparison of our results for young seedlings with those obtained previously for Arabidopsis seeds [50] and mature leaves [11] shows that some of these proteins undergo differential phosphorylation during plant growth, and that protein-level enrichment appears to enhance detection of pT and pY residues. Our study complements previous investigations by identifying an additional 43 proteins and 136 residues that undergo phosphorylation in Arabidopsis young seedlings. By purifying and enriching phosphorylated proteins under non-denaturing conditions our approach also lends itself to the study of phosphorylation in endogenous protein complexes and during protein-protein interactions.
Supporting Information