Identification and Analysis of the Acetylated Status of Poplar Proteins Reveals Analogous N-Terminal Protein Processing Mechanisms with Other Eukaryotes

Background The N-terminal protein processing mechanism (NPM), which includes N-terminal Met excision (NME) and N-terminal acetylation (Nα-acetylation), is a common co-translational process in eukaryotes. However, whether this NPM operates in woody plants remains unknown. Methodology/Principal Findings To reveal the NPM in poplar, we investigated the Nα-acetylation status of poplar proteins during dormancy by combining tandem mass spectrometry with TiO2 enrichment of acetylated peptides. We identified 58 N-terminally acetylated (Nα-acetylated) proteins. Most of these proteins (47, >81%) are Nα-acetylated following removal of the N-terminal Met, indicating that Nα-acetylation and NME represent a common NPM of poplar proteins. Furthermore, analysis of the N-terminal features of these acetylated proteins, combined with genome-wide identification of the methionine aminopeptidases (MAPs) and N-terminal acetyltransferases (Nats) involved, indicates that NME and Nα-acetylation in poplar are analogous to those of other eukaryotes. The Nα-acetylation reactions of these poplar proteins, and the enzymes involved, were also assigned on the basis of those known in yeast and human, together with the subcellular location of the proteins. Conclusions/Significance This study represents the first extensive investigation of Nα-acetylation events in woody plants, and its results provide a useful resource for unraveling the regulatory mechanisms of protein Nα-acetylation in poplar.


Introduction
The N-terminal protein processing mechanism (NPM) represents a common protein modification that occurs in eukaryotes, and primarily involves the co-translational processes of N-terminal Met excision (NME) and N-terminal acetylation (Nα-acetylation) [1][2][3][4]. In all eukaryotes, the nuclear-encoded protein synthesis machinery requires newly synthesized peptides to begin with methionine (Met), whereas plastid-encoded nascent proteins begin with a Met carrying an N-formyl group (Fo) [5]. Therefore, NME of nuclear-encoded proteins requires only methionine aminopeptidase (MAP; EC 3.4.11.18) activity, which proteolytically removes the N-terminal Met [4,6]. NME of plastid-encoded proteins requires both MAP activity and peptide deformylase (PDF) activity [7]. The latter is required to remove the Fo group, thereby unmasking the amino group of the first Met and allowing the subsequent action of MAP [1,5,8].
Following peptide synthesis in eukaryotes, cytosolic MAPs may remove the first Met residue if the residue at position two has a sufficiently small side chain, resulting in an N-terminal Ala, Val, Ser, Thr, Cys, Gly, or Pro [9]. Approximately two-thirds of mature proteins undergo NME by MAP [1]. Unlike eubacteria, which possess only one type of MAP (MAP1), eukaryotes possess a second type, MAP2, whose substrate specificity is similar to that of MAP1 [4]. Experimental data have shown that, in higher eukaryotes, MAP1s are found in mitochondria, plastids, and the cytoplasm, whereas MAP2s are found specifically in the cytoplasm, suggesting that NME occurs in all compartments where de novo protein synthesis takes place [1,5,10]. Nα-acetylation is an enzyme-catalyzed reaction in which the protein α-amino group accepts an acetyl group from acetyl-CoA [9]. Currently, six types of N-terminal acetyltransferases (Nats), conserved from yeast to humans, are known to be responsible for these Nα-acetylation events: each of the three major Nats (NatA, NatB and NatC) contains a catalytic subunit and one or two auxiliary subunits, whereas NatD, NatE and NatF are composed of only a catalytic subunit [11][12]. Each type of Nat appears to acetylate a distinct subset of substrates defined by the first N-terminal amino acids [13]. NatA is often responsible for the Nα-acetylation of small N-terminal residues, including Ser, Ala, Thr, Val, Gly and Cys, following NME by MAP [2,14,15]. Interestingly, NatF also has the potential to acetylate these types of N-termini when the Met has not been cleaved [12]. NatB potentially recognizes and acetylates Met-Asp-, Met-Glu- and Met-Asn- N-termini [12]. The hydrophobic Met-Leu-, Met-Ile- and Met-Phe- N-termini are acetylated by NatC. Moreover, these hydrophobic termini are also recognized by NatF and NatE in vitro, suggesting that some redundancy in activity exists between particular Nats [12].
In yeast, NatD was found to acetylate the Ser- N-termini of histones H2A and H4 in vitro and in vivo, whereas no such activity has yet been observed in higher eukaryotes [16]. Furthermore, all of the genes encoding the catalytic or auxiliary subunits of NatA-NatF have been identified and described in yeast and humans [12][13]. However, there is still no systematic and comprehensive characterization of Nats in Arabidopsis and poplar.
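The NME rule described above can be written down as a minimal sketch. The function name `apply_nme` and the example sequences are hypothetical; the set of permissive residues at position two (Ala, Val, Ser, Thr, Cys, Gly, Pro) is taken from the text, and the sketch ignores any influence of residues beyond position two.

```python
# Residues at position two that, per the text, allow MAP to remove the initiator Met.
SMALL_RESIDUES = set("AVSTCGP")


def apply_nme(protein_seq: str) -> str:
    """Return the sequence after applying the simple NME rule (illustrative helper)."""
    if (
        len(protein_seq) >= 2
        and protein_seq[0] == "M"
        and protein_seq[1] in SMALL_RESIDUES
    ):
        return protein_seq[1:]  # initiator Met is excised
    return protein_seq  # initiator Met is retained


# Made-up example N-termini, not taken from the dataset.
print(apply_nme("MSDE"))  # Met removed: position two is Ser
print(apply_nme("MEDK"))  # Met retained: position two is Glu
```

A rule of this kind only predicts whether the Met is likely to be excised; whether the exposed residue is then acetylated depends on the Nat specificities summarized above.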
Previous evidence suggests that the NPM operates through similar mechanisms across several eukaryotes [1][2]. However, the NPM present in poplar remains poorly defined. Here, we identified 58 Nα-acetylated proteins in dormant terminal buds of poplar using tandem mass spectrometry combined with TiO2 enrichment of acetylated peptides. The site-specific acetylation data provide a wealth of resources for decoding the NPM present in poplar. As far as we know, this study represents the first extensive investigation of Nα-acetylation events in woody plants.

Characterization of the Identified Acetylated Proteins in Poplar
The Nα-acetylation of proteins was investigated to explore the NPM of woody plant proteins. Proteins from poplar were isolated and digested with trypsin in solution, and the tryptic peptides were subjected to nanoUPLC-ESI-MS/MS for the identification of acetylation following TiO2 enrichment. The spectra representing all of these acetylated peptides and the original data collected are listed in File S1. As outlined in Table 1 and File S2, we identified 58 N-terminally acetylated (Nα-acetylated) proteins. These 58 proteins were divided into two groups: (i) the NME-independent Nα-acetylation group, in which the N-terminal Met residue (iMet) is retained and subsequently acetylated; and (ii) the NME-dependent Nα-acetylation group, in which the N-terminal iMet residue is removed and acetylation occurs at the exposed residue located at position two. In this study, most of the Nα-acetylated proteins (47, >81%) belong to group (ii), whereas the remaining proteins (11, <19%) belong to group (i), suggesting that Nα-acetylation and NME could represent a common NPM of poplar proteins (Table 1 and File S2). Interestingly, we found that the N-termini of sixteen of the identified Nα-acetylated proteins (27.6%, 16/58) are also phosphorylated: fifteen belong to group (ii) (31.9%, 15/47) and one belongs to group (i) (9.1%, 1/11) (File S2). Notably, in ten of these proteins the phosphorylation lies within the N-terminal region, whereas in six proteins (37.5%, 6/16), including four translation initiation factor eIF-5A proteins (717121, 832646, 835953 and 724093) and two metallopeptidase M24 proteins (819223 and 577003), the N-terminal Ser residue itself was found to be both Nα-acetylated and phosphorylated (File S2). A similar event has been observed in spinach chloroplasts, where the N-terminal Ser residues of three proteins carry both phosphoryl and acetyl groups [17].
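As a quick consistency check on the proportions quoted above, the counts reported in the text can be recomputed directly. This is only an arithmetic sketch; the variable names are ours, and the counts are those stated in the paragraph (Table 1 and File S2 themselves are not reproduced here).

```python
# Counts as reported in the text.
TOTAL = 58            # all N-alpha-acetylated proteins
GROUP_II = 47         # NME-dependent (Met removed, position two acetylated)
GROUP_I = 11          # NME-independent (Met retained and acetylated)
PHOSPHO_ALL = 16      # acetylated proteins that are also phosphorylated
PHOSPHO_GROUP_II = 15
PHOSPHO_GROUP_I = 1
SER_DUAL = 6          # N-terminal Ser both acetylated and phosphorylated


def percent(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


summary = {
    "group (ii) of all proteins": percent(GROUP_II, TOTAL),
    "group (i) of all proteins": percent(GROUP_I, TOTAL),
    "phosphorylated of all proteins": percent(PHOSPHO_ALL, TOTAL),
    "phosphorylated within group (ii)": percent(PHOSPHO_GROUP_II, GROUP_II),
    "phosphorylated within group (i)": percent(PHOSPHO_GROUP_I, GROUP_I),
    "dual-modified Ser of phosphorylated": percent(SER_DUAL, PHOSPHO_ALL),
}

for label, value in summary.items():
    print(f"{label}: {value}%")
```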
Two classes of enzymes, PDFs and MAPs, are involved in NME, whereas Nats are responsible for the Nα-acetylation of proteins in yeast and humans [1,3,4,5,18,19]. Based on these observations, NME and Nα-acetylation in poplar may also depend on the orthologs of these enzymes, which are described in the following sections.

Identification of the Enzymes Involved in NME Processing of these Poplar Acetylated Proteins
PDF and MAP activities are needed successively for NME of plastid-encoded proteins [1,5,8]. For nuclear-encoded proteins in eukaryotes, NME requires only MAP activity, which proteolytically removes the N-terminal Met [4,6]. To determine which enzyme orthologs were involved in NME of the acetylated proteins from poplar, the location of the genes encoding the identified acetylated proteins was determined by searching the Populus trichocarpa genome database (http://genome.jgi.doe.gov/poplar/). We found that all of the identified acetylated proteins are encoded by nuclear genes. Accordingly, we propose that MAPs, but not PDFs, are the only enzymes responsible for NME of the 47 identified acetylated proteins from group (ii). However, further work is required to determine which MAPs carry out NME of these proteins in poplar.
Although MAPs have been systematically and comprehensively characterized in Arabidopsis [4][5], such information has not yet been documented for poplar. To obtain all members of the MAP families in Populus, the P. trichocarpa protein sequence data [20] were used as a query for searching the Conserved Domain Database (CDD) [21]. We found five non-redundant putative MAP1s that significantly matched the MetAP1 domain (cd01086), and two MAP2s that significantly matched the MetAP2 domain (cd01088) (File S3 and Table 2). A phylogenetic tree was generated from all complete MAP protein sequences of Arabidopsis and poplar (Figure 1). Phylogenetic analysis showed two distinct clusters, MAP1 and MAP2, which are encoded by evolutionarily divergent genes (Figure 1). The identified poplar MAPs were named after their closest evolutionary MAP orthologs in Arabidopsis. For example, one poplar MAP1 (730835) is most closely related to Arabidopsis MAP1A (Ath MAP1A, At2g45240), and was therefore termed poplar MAP1A (Ptr MAP1A) (Figure 1 and Table 2). Notably, another poplar MAP1 (588331) was considered a novel member of the MAP1s because of its divergence from the other MAP1s (MAP1A-D). This MAP1 was termed Ptr MAP1E (Figure 1 and Table 2). Furthermore, we found that the MAP1 domain of Ptr MAP1E has high sequence similarity to those of the other MAP1s of Arabidopsis and poplar (Figure S1a), whereas only Ptr MAP1E lacks the N-terminal extension, which could account for its divergence from MAP1A-D of Arabidopsis and poplar (Figure S1a). Surprisingly, Ptr MAP2A and Ptr MAP2B, as well as Ath MAP2A and Ath MAP2B, share near-identical amino acid sequences, suggesting a conservation of function (Figure S1b).
In Arabidopsis, three organelle-targeted MAPs (MAP1B, MAP1C and MAP1D) and three cytosolic MAPs (MAP1A, MAP2A and MAP2B) have been characterized as members of the NME machinery [1,4,5]. However, the role of these MAPs in NME in poplar remains unclear. Using TargetP [22], it was predicted that, in poplar, the two Ptr MAP2s (Ptr MAP2A and Ptr MAP2B) are specifically targeted to the cytoplasm, whereas the Ptr MAP1s are targeted either to the organelles (Ptr MAP1B-E) or to the cytoplasm (Ptr MAP1A). Because none of the identified proteins is plastid-encoded (Table 1), the proteins of the NME-dependent Nα-acetylation group (ii) should be subjected to NME by the three cytosolic MAPs (Ptr MAP1A, Ptr MAP2A and Ptr MAP2B) in poplar (Figure 2 and Table 2).

The Nats Involved in Nα-acetylation of the Identified Poplar Proteins
Confirmed acetylation sites are recognized footprints of Nat activity. Eukaryotic proteins subject to Nα-acetylation have a variety of N-terminal sequences, and the consensus motifs extracted from them reflect the activity of particular Nats. To identify the Nat orthologs responsible for Nα-acetylating these poplar proteins, the N-terminal 14 amino acids of the Nα-acetylated proteins of the NME-independent group (i) and the NME-dependent group (ii) were aligned separately in sequence logo plots using WebLogo [23] (Figure 2A and 2B). From the alignment of the 11 acetylated proteins belonging to group (i), we extracted four motifs: Met-Glu- (8/11), Met-Asp- (1/11), Met-Leu- (1/11) and Met-Gly- (1/11) (Figure 2A). The first two motifs (Met-Glu- and Met-Asp-) match the NatB substrate motifs previously identified in yeast and humans (Figure 2A). The third motif (Met-Leu-) is consistent with one of the substrate motifs of NatC, NatE or NatF identified in yeast and humans [12] (Figure 2A). The last motif, Met-Gly-, was assigned to one of the NatF substrate motifs of yeast [2,12,15,24] (Figure 2A).
Removal of the N-terminal Met of nuclear-encoded proteins by cytosolic MAP frequently leads to Nα-acetylation of the resulting N-terminal Ala, Val, Ser, Thr, or Cys residues [9]. We found that similar events were also present for the 47 acetylated proteins of group (ii), based on the alignment of their N-termini (Figure 2B). As illustrated in Figure 2B, there was an amino acid preference at positions two and three ([Ala/Ser/Gly/Thr/Val] and [Ser/Gly/Asp/Glu], respectively), with the first position representing the removed Met. Accordingly, the preference likely represents a combination of the consensus motifs for the MAPs and the Nats. Furthermore, the acetylated residues are represented by five amino acids: Ala (25/47), Ser (13/47), Gly (7/47), Thr (1/47) and Val (1/47), which is consistent with the substrate profile of NatA in yeast [9,12,24] (Figure 2B). This result suggests that Nα-acetylation of these 47 proteins from group (ii) most likely involves the corresponding NatA orthologs in poplar. In summary, the major acetylases involved in the acetylation of these poplar proteins are the NatA orthologs (acetylating 47 proteins, 81%) and the NatB orthologs (acetylating nine proteins, >15%) (Figure 2A and 2B).
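The motif-based assignment described in this section can be sketched as a small rule table. The function `assign_nat` and its labels are illustrative only; the rules follow the substrate motifs cited in the text (NatA for small residues exposed after NME, NatB for Met-Asp/Glu/Asn, NatC/NatE/NatF for Met-Leu/Ile/Phe, and NatF for a retained Met followed by a small residue), and the tally uses the motif counts reported above rather than the actual sequences in File S2.

```python
from collections import Counter

NATA_RESIDUES = set("SATVGC")   # small residues exposed after Met removal
NATB_SECOND = set("DEN")        # Met-Asp-, Met-Glu-, Met-Asn-
NATC_SECOND = set("LIF")        # Met-Leu-, Met-Ile-, Met-Phe-


def assign_nat(nterm: str) -> str:
    """Assign a candidate Nat to an observed (mature) N-terminus."""
    if not nterm:
        return "unassigned"
    first = nterm[0]
    if first != "M":
        return "NatA" if first in NATA_RESIDUES else "unassigned"
    second = nterm[1] if len(nterm) > 1 else ""
    if second in NATB_SECOND:
        return "NatB"
    if second in NATC_SECOND:
        return "NatC/NatE/NatF"
    if second in NATA_RESIDUES:
        return "NatF"  # Met retained in front of a small residue
    return "unassigned"


# Observed N-termini reconstructed from the reported motif counts.
observed = (
    ["ME"] * 8 + ["MD"] + ["ML"] + ["MG"]                          # group (i)
    + ["A"] * 25 + ["S"] * 13 + ["G"] * 7 + ["T"] + ["V"]          # group (ii)
)

tally = Counter(assign_nat(n) for n in observed)
for nat, count in tally.most_common():
    print(f"{nat}: {count} ({100 * count / len(observed):.1f}%)")
```

Because several Nats overlap in specificity, a label such as "NatC/NatE/NatF" is deliberately left ambiguous rather than resolved to a single enzyme.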

Identification of Nats in Poplar and Arabidopsis
Although we suggested that NatA, NatB, NatC, NatE and NatF orthologs may be involved in acetylation of the identified proteins, based on the substrate motifs recognized by the known Nats of yeast and humans [12], it remained unexplored whether the poplar genome contains genes encoding Nat orthologs similar to those found in yeast and humans. To obtain all members of each type of Nat ortholog in Populus, domain profile files representing the subunits of the individual types [25] were used as queries to identify the Nat orthologs in the P. trichocarpa genome [20]. As a result, we identified 16 non-redundant putative Nat orthologous proteins, which together comprise all catalytic and auxiliary subunits of the six types of Nats (NatA-F) (Table 3). With the exception of the NatD catalytic subunit (Ptr Naa40p), the amino acid sequences of the identified Nat catalytic subunit orthologs contain the consensus acetyl coenzyme A (AcCoA) binding motif, RxxGxG/A, which is a sequence feature of the N-acyltransferase superfamily (Figure S2). To test whether this observation holds in other model plants, we extended the search to the Arabidopsis protein sequence database (http://www.arabidopsis.org/). Similarly, the Arabidopsis genome also contains genes encoding the six types of Nats (NatA-F) (Table 3). Each Nat catalytic subunit in poplar and Arabidopsis shares high sequence similarity with its counterparts in yeast and humans (Figure S3a-f), suggesting that they are highly conserved from lower to higher eukaryotes. Notably, the AcCoA binding motif RxxGxG/A is present in the catalytic subunit of each of NatA, NatB, NatC, NatE and NatF, whereas it is absent from the catalytic subunit of NatD (Naa40p) in Arabidopsis, poplar, yeast and human (Figure S3a-f).
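A consensus such as RxxGxG/A can be screened for with a simple regular expression. The sketch below is an assumption about how one might do this, not the procedure used to produce Figure S2; the helper name `find_accoa_motifs` and the test sequence are made up for illustration.

```python
import re

# R, any two residues, G, any residue, then G or A.
# The lookahead lets overlapping occurrences be reported as well.
ACCOA_MOTIF = re.compile(r"(?=(R..G.[GA]))")


def find_accoa_motifs(sequence: str):
    """Return (1-based position, matched text) for each RxxGxG/A occurrence."""
    return [(m.start() + 1, m.group(1)) for m in ACCOA_MOTIF.finditer(sequence.upper())]


# Made-up sequence fragment used only to exercise the pattern.
example = "MKLRRLGIGSAL"
print(find_accoa_motifs(example))
```

A plain pattern match of this kind will also report chance occurrences, so hits are best read alongside the alignment context rather than on their own.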
Although we identified the NatD orthologs of Arabidopsis (Ath Naa40p) and poplar (Ptr Naa40p), which are homologous to yeast Nat4p (Table 3), NatD activity in poplar was not observed in this study because the Nα-acetylation status of the NatD substrates, histones H2A and H4 [26], was not determined in the MS experiment. Therefore, further experiments are required to confirm such activity for the Naa40p orthologs in poplar. In summary, our results suggest that poplar and Arabidopsis share the common NatA-F system present in yeast and humans.

Discussion
The NPM, including NME and Nα-acetylation, has been well characterized in yeast. This focus on yeast is primarily due to the accessibility of mutants affecting the pathway. In contrast, negligible progress has been made in woody plants such as poplar. Furthermore, MS-based peptide proteomics combined with selective enrichment technologies has been widely used for the simultaneous identification of acetylated proteins in yeast and humans [27][28]. Based on the large amount of data identifying exact acetylation sites, the substrate spectra of the enzymes involved in the pathway have been well characterized, which has promoted research targeting the entire NPM [19]. In combination, these techniques and methods enable large-scale identification and analysis of acetylated proteins.
It is noteworthy that the TiO2 column is considered one of the most effective methods for selective enrichment of phosphopeptides, owing to the strong specific interaction between TiO2 and the phosphate groups of phosphopeptides. For this reason, we previously used mass spectrometry combined with TiO2 phosphopeptide-enrichment strategies to investigate the phosphoproteome of dormant terminal buds in poplar (Populus simonii × P. nigra) [29]. As a result, 161 phosphopeptides with 161 unique phosphorylated sites from 151 proteins were identified [29].
Surprisingly, we identified 51 N-terminally acetylated peptides from 58 proteins with high confidence, of which fourteen (27.5%, 14/51) were also phosphorylated (Table 1 and File S2). To explore why these N-terminally acetylated peptides were enriched together with phosphopeptides by TiO2 affinity, we performed an in silico analysis of the theoretical pI and the acidic amino acid composition (D and E) of the Nα-acetylated peptides that were not phosphorylated, using the ProtParam tool (http://web.expasy.org/protparam/). We did so mainly because highly acidic peptides, or those containing multiple acidic residues, tend to adsorb to TiO2 despite the availability of a number of improved TiO2 phosphopeptide-enrichment procedures [30][31][32]. We found that almost all of these Nα-acetylated peptides could be considered highly acidic peptides or peptides containing multiple acidic residues, because of their low theoretical pI or high acidic amino acid composition (File S5). Accordingly, we infer that the enrichment of these Nα-acetylated peptides on the TiO2 microcolumn was mainly due to the strong interaction between TiO2 and either their additional phosphate groups or their multiple acidic residues. However, further experiments are required to investigate whether an interaction between the N-acetyl moiety of the protein and TiO2 also occurs and contributes to this enrichment.
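The acidic-residue part of this analysis is simple to reproduce. The sketch below counts Asp and Glu in a peptide and reports their fraction; it does not compute the theoretical pI, which in this study was obtained from ProtParam. The function name and the example peptide are hypothetical.

```python
def acidic_composition(peptide: str):
    """Return (number of D/E residues, fraction of D/E residues) for a peptide."""
    peptide = peptide.upper()
    if not peptide:
        raise ValueError("peptide must not be empty")
    n_acidic = peptide.count("D") + peptide.count("E")
    return n_acidic, n_acidic / len(peptide)


# Made-up peptide used only for illustration; it is not from File S5.
count, fraction = acidic_composition("SDEEHHFESK")
print(f"acidic residues: {count}, fraction: {fraction:.2f}")
```

Any cut-off for calling a peptide "highly acidic" would be a separate choice; none is stated in the text, so none is applied here.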
In contrast to phosphopeptide enrichment, there is still no comparably suitable method for the enrichment of N-terminally acetylated peptides. As a direct consequence, protein Nα-acetylation remains poorly explored [33]. The recent emergence of N-terminomics technologies, which allow the isolation of protein N-terminal peptides, has greatly boosted the field of Nα-acetylation [33]. However, these technologies can involve many cumbersome and complicated steps; for example, the N-termini of the tryptic cleavage products require extensive chemical modification (with trinitrobenzene or by biotinylation) and/or consecutive, repeated separation of the sample [33][34][35]. Therefore, until recently, few data resources on protein Nα-acetylation were available, especially for woody plants.
The 51 unique acetylated peptides from 58 proteins were identified incidentally in our study of the poplar phosphoproteome, because the phosphate groups or multiple acidic residues within these peptides contributed to their affinity for the TiO2 microcolumn. Nevertheless, the site-specific acetylation data provide a valuable resource to help decode the NPM present in poplar.

Removal of the N-terminal Met in Poplar Follows the NME Rule
In this study, we used a proteomics approach to investigate the acetylation status of poplar proteins. Fifty-eight Nα-acetylated proteins were identified, and the majority of these proteins (47, >81%) undergo N-terminal Met cleavage and subsequent acetylation of the exposed N-terminal residue at position two. Moreover, the residues at position two (Ala, Ser, Gly, Thr and Val) comply with the above-mentioned NME rule. Surprisingly, among the 11 proteins belonging to group (i), we found that the N-terminal Met residue of one PPO (794816) was acetylated (Table 1), although the adjacent residue at position two is Gly, which would normally permit NME (File S2). Specifically, the Met of the N-terminal motif (Met-Gly-) of the poplar PPO should have been removed according to the NME rule; however, it was retained and acetylated. This observation is in accord with a recent study showing that the Met-Gly- sequence represents a substrate motif of a newly identified NatF type in yeast [12] (Figure 2A).

One Extended Nat Catalytic Subunit System Occurs in the Poplar Genome
Currently, six types of Nats (NatA-NatF) represent the full set of enzymes of the Nat system from yeast to humans [12]. In this study, we found that both the Arabidopsis and the poplar genomes contain the full Nat system composed of NatA-F (Table 3). Most of the Nat catalytic subunits in poplar exist as two paralogous isoforms, such as the NatA catalytic subunits Ptr Naa10p and Ptr Naa11p, and the NatB catalytic subunits Ptr Naa20p and Ptr Naa21p. Only the NatD catalytic subunit exists as a single protein, Ptr Naa40p (Table 3). By comparison, no Nat catalytic subunit of yeast has paralogous isoforms, only the NatA catalytic subunit has paralogous isoforms in humans (Naa10p and Naa11p), and only the NatF catalytic subunit has paralogous isoforms in Arabidopsis (Ath Naa60p and Ath Naa61p) (Table 3 and File S4). This observation suggests that the genes encoding Nat catalytic subunits have expanded in poplar. Such expansion, which is common to a large number of Populus multi-gene families, could have arisen from multiple gene duplication events, including segmental and tandem duplications [20]. The presence of more Nat subunit genes in the Populus genome may reflect a greater requirement for protein acetylation. A detailed schematic view of the number of paralogous isoforms of each Nat catalytic subunit in the four organisms is provided in Figure S4.

Cytosolic Nat Isoforms Present in Poplar
Following Met cleavage by MAP, the exposed small side-chain residue (Ser, Ala, Thr, Val, Gly or Cys) is often further acetylated by NatA, a Nat enzyme present in either the cytoplasm [3,19,36] or the chloroplast of Arabidopsis [6,37,38]. These data indicate that both chloroplastic and cytosolic isoforms of NatA should be present in Arabidopsis. However, we identified only one Arabidopsis NatA complex, consisting of one catalytic subunit (AT5G13780, Ath Naa10p) and one auxiliary subunit (AT1G80410, Ath Naa15p) (Table 3). Surprisingly, TargetP prediction indicates that Ath Naa10p is secreted, whereas Ath Naa15p is targeted to the cytoplasm in Arabidopsis [18,22] (Table 3). Furthermore, cytosolic isoforms of NatA with Nat activity are composed of both Ath Naa10p and Ath Naa15p, whereas the chloroplastic isoforms of NatA consist only of Ath Naa10p [18,22]. Similarly, secreted catalytic subunits (650021, Ptr Naa10p and 641307, Ptr Naa11p) and cytosolic auxiliary subunits (548659, Ptr Naa15p and 553694, Ptr Naa16p) are also present in poplar (Table 3). Thus, we propose that either of the two catalytic subunits (Ptr Naa10p or Ptr Naa11p), in combination with the auxiliary subunits Ptr Naa15p and Ptr Naa16p, forms the cytosolic isoform of NatA in poplar.
According to the analysis of the substrate profile, NatA should be the major Nat, potentially responsible for acetylating 81% of the identified proteins (Figure 2B and Table 1). It is therefore important to identify which NatA isoform carries out the acetylation of these proteins. To address this, the subcellular location of the identified poplar proteins was obtained using TargetP [22] and by comparing their best hits in Arabidopsis with the latest plant plastid database (PPDB) [18]. As a result, all 47 proteins from group (ii) assigned to NatA were targeted to the cytoplasm, and none was targeted to the chloroplast (Table 1), suggesting that these acetylation events are carried out by the cytosolic NatA isoform and not by the chloroplastic NatA isoform. Chloroplastic and cytosolic isoforms of NatB have been found in Chlamydomonas reinhardtii [39] and Arabidopsis [18], respectively. It is noteworthy, however, that of the two catalytic subunits of NatB in poplar, one is secreted (Ptr Naa20p) and the other is cytosolic (Ptr Naa21p), whereas the only auxiliary subunit of NatB in poplar (Ptr Naa25p) is targeted to the cytoplasm (Table 3). As with NatA, NatB of poplar should therefore exist as either chloroplastic or cytosolic isoforms, with either Ptr Naa20p or Ptr Naa21p in combination with the auxiliary subunit (Ptr Naa25p) composing the cytosolic isoform of poplar NatB.
In summary, we conclude that the N-terminal Met residues of the proteins of the NME-independent Nα-acetylation group (i) should be directly Nα-acetylated by cytosolic NatB, NatC, NatE and NatF, and that the second N-terminal residue of the proteins of the NME-dependent Nα-acetylation group (ii) should be Nα-acetylated by cytosolic NatA after NME by the three cytosolic MAPs (Ptr MAP1A, Ptr MAP2A and Ptr MAP2B) in poplar (Figure 2A and 2B).

The Biological Significance of Nα-acetylation of these Proteins during the Dormancy of Poplar
For many years, it was thought that Nα-acetylation protected proteins from degradation [40][41]. On the contrary, N-terminally acetylated Met residues were recently found to be involved in creating degradation signals: a ubiquitin ligase, Doa10, recognizes Nα-acetylated proteins and ubiquitinates them, thereby marking them for degradation [9]. Although these two hypotheses predict opposite functional outcomes for Nα-acetylation and thus appear contradictory, both mechanisms may operate side by side in the cell, each acting on a specific subset of proteins under defined conditions [42]. Accordingly, we propose that the functional consequences of Nα-acetylation of the identified proteins during dormancy of poplar may depend on each specific protein and its cellular state. The major challenge now is to determine the specific function of each individual acetylated protein during the dormancy of poplar.
In conclusion, we have identified 58 Nα-acetylated proteins using a tandem MS method combined with TiO2 enrichment of acetylated peptides. Based on the analysis of the N-termini of these proteins, we conclude that poplar possesses an NPM, including the NME rule and the Nat system, analogous to that of other eukaryotes. Furthermore, we have assigned the acetylation reactions of the identified poplar proteins and the enzymes likely to be involved. Further experiments are now required to confirm that these specific MAP and Nat enzymes interact with the identified acetylated proteins in vivo. A promising way forward is to identify and characterize, on a broad scale, the dynamics of protein acetylation in response to environmental changes by applying specialized, targeted quantitative acetylation proteomics tools.

Preparation of Total Protein
The dormant terminal buds were homogenized into a fine powder in liquid nitrogen and resuspended at −20 °C with 10% (w/v) trichloroacetic acid (TCA) in cold acetone containing 0.07% (v/v) 2-mercaptoethanol for a minimum of 2 h. The mixture was centrifuged at 40,000 × g at 4 °C for 1 h and the precipitates were washed with cold acetone containing 0.07% (v/v) 2-mercaptoethanol. The pellets were dried by vacuum centrifugation and dissolved in 7 M urea, 2 M thiourea, 20 mM dithiothreitol, 1% (v/v) protease-inhibitor cocktail, 0.2 mM Na2VO3 and 1 mM NaF at room temperature for 2 h before centrifugation at 40,000 × g at 4 °C for 1 h. The resulting supernatant was collected and stored at −80 °C until further use. The total protein content of the samples was quantified using a 2-D Quant kit.

In-solution Protein Digestion
Total proteins were digested as previously described [43][44]. Briefly, after adjusting the total protein solution to pH 8.5 with 1 M ammonium bicarbonate, the sample was reduced for 45 min at 55 °C by adding DTT to a final concentration of 10 mM, and then carboxyamidomethylated in 55 mM iodoacetamide for 30 min at room temperature in the dark. CaCl2 was then added to a final concentration of 20 mM. Endoprotease Lys-C was added to a final substrate-to-enzyme ratio of 100:1 and the reaction was incubated at 37 °C for 12 h. The Lys-C digest was diluted to 1 M urea with 100 mM ammonium bicarbonate, and modified trypsin was added to a final substrate-to-enzyme ratio of 50:1. The trypsin digestion was incubated at 37 °C for 12 h. After digestion, the peptide mixture was acidified with formic acid for further MS analysis. Samples that were not immediately analyzed were stored at −80 °C.

Enrichment of Acetylated Peptides Using a TiO2 Microcolumn
The TiO2 microcolumns were packed as previously described [30]. A small plug of C8 material was stamped out of a 3M Empore C8 extraction disk using an HPLC syringe needle and placed at the small end of a GELoader tip. The C8 disk served only as a frit to retain the TiO2 beads within the GELoader tip. The TiO2 beads were suspended in 100% ACN and an aliquot of this suspension (depending on the size of the column) was loaded onto the GELoader tip. Gentle air pressure created by a plastic syringe was used to pack the column. The TiO2 microcolumn was equilibrated with loading buffer (40 µl; 80% ACN/5% TFA/saturated phthalic acid solution), and the trypsin-digested peptide mixture, diluted with loading buffer, was then loaded onto the column. The TiO2 microcolumn was washed once with loading buffer (40 µl) and three times with washing solution (40 µl; 80% ACN/2% TFA). The solvent used for washing and for loading the sample onto the TiO2 microcolumn contained organic solvent (80% ACN), which abrogates the adsorption of peptides to the C8 material [45]. The bound peptides were eluted twice with 40 µl of ammonium bicarbonate, pH >10.5, and then with 10 µl of 30% ACN. The eluted peptides were lyophilized and dissolved in 1% formic acid before MS analysis.

NanoUPLC-ESI-MS/MS
NanoUPLC-ESI-MS/MS was performed with a splitless nanoUPLC (10 kpsi nanoAcquity; Waters) coupled to a Synapt high-resolution mass spectrometer with a nanospray ion source (Waters). The program MassLynx (version 4.1; Waters) was used for data acquisition and instrument control. A Symmetry C18 5 µm, 180 µm × 20 mm pre-column and a BEH C18 1.7 µm, 75 µm × 250 mm analytical reversed-phase column (Waters) were used. The mobile phases were (A) 100% H2O/0.1% formic acid and (B) 100% ACN/0.1% formic acid. The samples were initially transferred in an aqueous 0.1% formic acid solution to the pre-column at a flow rate of 5 µl/min for 3 min. The peptides were separated by a gradient of 5-40% mobile phase B over 90 min at a flow rate of 200 nl/min, followed by a 10-min rinse with 90% mobile phase B. The column was re-equilibrated at the initial conditions for 20 min. The lock mass, (Glu1) fibrinopeptide B at a concentration of 100 fmol/µl, was delivered from the auxiliary pump of the nanoAcquity system at a constant flow rate of 400 nl/min to the reference sprayer of the NanoLockSpray source of the mass spectrometer. All samples were analyzed in triplicate. Data-dependent acquisition was performed in the positive ion mode. MS spectra were acquired for 1 s over a mass-to-charge ratio (m/z) range of 350 to 1990. The two most intense doubly or triply charged precursor ions were selected from m/z 350 to 1990. MS/MS spectra generated by collision-induced dissociation were acquired for 2 s from m/z 50 to 1990. The collision energy was calculated automatically based on the peptide charge and m/z; a dynamic exclusion window was applied that prevented the same m/z from being selected for 2 min after its acquisition. The mass tolerances in the MS and MS/MS modes were 15 and 50 ppm, respectively. The candidate acetylated peptides were initially assigned by ESI-MS/MS using a mass increment of 42 Da per acetyl moiety relative to the unmodified peptides.
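The 42-Da increment and the ppm tolerances quoted above translate into a short calculation. The sketch below is illustrative only: it uses the monoisotopic mass of an acetyl addition (C2H2O, 42.010565 Da) rather than the nominal 42 Da, the helper names are ours, and the m/z values in the example are made up.

```python
ACETYL_MONO_DA = 42.010565  # monoisotopic mass added by one acetyl group (C2H2O)
MS_TOL_PPM = 15.0           # MS tolerance stated in the text
MSMS_TOL_PPM = 50.0         # MS/MS tolerance stated in the text


def mz_shift(delta_mass_da: float, charge: int) -> float:
    """Shift in m/z caused by a mass increment on an ion of the given charge."""
    return delta_mass_da / charge


def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz: float, theoretical_mz: float, tol_ppm: float) -> bool:
    """True if the observed value lies within +/- tol_ppm of the theoretical value."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


# One acetyl group shifts a doubly charged precursor by about 21.005 m/z units.
print(f"{mz_shift(ACETYL_MONO_DA, 2):.4f}")

# Made-up values: a 10 ppm error passes the MS tolerance, a 20 ppm error does not.
print(within_tolerance(1000.010, 1000.000, MS_TOL_PPM))
print(within_tolerance(1000.020, 1000.000, MS_TOL_PPM))
```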

Data Analysis and Mascot Database Search
The MS/MS data were converted to the pkl file format using the ProteinLynx software (Waters), and the resulting pkl file was searched against the JGI Populus trichocarpa v1.1 (http://genome.jgi-psf.org/Poptr1_1/Poptr1_1.home.html) protein sequence database using an in-house Mascot server (version 1.8). Up to two missed cleavage sites were allowed. Acetylation of the protein N-terminus, carbamidomethylation, methionine oxidation and phosphorylation of serine/threonine/tyrosine were accepted as variable modifications. The false discovery rate (FDR) was 0.00% for peptide matches above the identity threshold and 0.36–0.85% for peptide matches above the homology or identity threshold.
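As a reading aid for the FDR values, the sketch below shows one common way such a rate is estimated from a target-decoy search: the number of decoy matches above a score threshold divided by the number of target matches above the same threshold. This is an assumption about the estimate, not a description of the exact Mascot settings used here, and the function name and input layout are hypothetical.

```python
# Minimal sketch of a target-decoy FDR estimate at a score threshold.
# `matches` is a hypothetical list of (score, is_decoy) pairs.

def fdr_above_threshold(matches, threshold):
    """Return decoy/target ratio among matches scoring at or above threshold."""
    matches = list(matches)  # allow any iterable, since it is scanned twice
    targets = sum(1 for score, is_decoy in matches
                  if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in matches
                 if score >= threshold and is_decoy)
    if targets == 0:
        return 0.0  # no target matches pass, so no FDR can be estimated
    return decoys / targets
```

Raising the threshold from the homology level to the identity level typically removes the lowest-scoring decoy matches first, which is consistent with a lower FDR at the stricter threshold.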

Bioinformatics
The complete protein sequence database of poplar was downloaded from Populus trichocarpa v1.1 (www.jgi.doe.gov/poplar). Using a custom Perl program, all the acetylated protein sequences were extracted from the protein database by their protein IDs. The P. trichocarpa protein sequences [20] were searched against the Conserved Domain Database (CDD) [21], and proteins containing the conserved MetAP1 (cd01086) or MetAP2 (cd01088) domain were considered members of the MAP1 and MAP2 families, respectively.
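The extraction step can be sketched as follows. This is an illustrative Python equivalent rather than the authors' Perl program, which is not reproduced here; it assumes a FASTA-formatted database in which the protein ID is the first whitespace-delimited token of each header line, and the function names are hypothetical.

```python
# Minimal sketch: pull selected protein sequences out of a FASTA database by ID.

def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def extract_by_id(lines, wanted_ids):
    """Return {protein_id: sequence} for every record whose ID is in wanted_ids."""
    wanted = set(wanted_ids)
    selected = {}
    for header, sequence in read_fasta(lines):
        tokens = header.split()
        protein_id = tokens[0] if tokens else ""
        if protein_id in wanted:
            selected[protein_id] = sequence
    return selected
```

In practice the function would be called with an open file handle, for example `extract_by_id(open(path), ids)`, where `path` and `ids` stand for the downloaded database and the list of acetylated protein IDs.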
Each subunit of the NatA-F orthologs in Populus was identified as follows. First, Hidden Markov Model (HMM) profile files of the two subunits Mdm20 (PF09797) and Mak10 (PF04112) were obtained from the Pfam database (http://pfam.sanger.ac.uk/). Second, known protein sequences representing each of the other nine Nat subunits from various organisms were extracted from the UniProt database (http://www.uniprot.org) and aligned using the ClustalW program [46]. Subsequently, an HMM profile file was built in-house for each of these subunits using the hmmbuild command of the HMMER (v 3.0) software [25]. Finally, the HMM profile file of each Nat subunit was searched against the poplar protein database [20] using the hmmsearch command of the HMMER (v 3.0) software [25].
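Once a profile has been built with hmmbuild and searched with hmmsearch, the candidate orthologs have to be read from the search output. The sketch below parses the per-target table that hmmsearch writes when run with the `--tblout` option and keeps hits below an E-value cutoff. The cutoff of 1e-5 is an assumption chosen for illustration and is not stated in this study; the function name is hypothetical.

```python
# Minimal sketch: filter hmmsearch --tblout results by full-sequence E-value.
# Column layout assumed: target name, target accession, query name,
# query accession, full-sequence E-value, full-sequence score, ...

def parse_tblout(lines, max_evalue=1e-5):
    """Return (target, query, evalue, score) tuples sorted by ascending E-value."""
    hits = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split()
        if len(fields) < 6:
            continue  # not a complete result row
        target, query = fields[0], fields[2]
        evalue, score = float(fields[4]), float(fields[5])
        if evalue <= max_evalue:
            hits.append((target, query, evalue, score))
    hits.sort(key=lambda hit: hit[2])
    return hits
```

Sorting by E-value places the most confident candidate for each subunit first, which is convenient when the hits are subsequently checked by alignment against the known subunits.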
Multiple sequence alignments of the full-length protein sequences were performed using the ClustalW program in the BioEdit software with default parameters. Based on these aligned sequences, unrooted phylogenetic trees were constructed in the MEGA 5.0 software [47] using the Minimum Evolution method with the p-distance model and complete deletion of gaps. The reliability of each phylogenetic tree was estimated by bootstrap analysis with 1000 replicates.
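The two distance settings named above can be made concrete with a small example. The sketch below is not MEGA and does not build a Minimum Evolution tree; it only illustrates, under the usual definitions, what complete deletion (dropping every alignment column that contains a gap in any sequence) and the p-distance (the proportion of differing sites between two sequences) compute. Function names and the toy alignment are hypothetical.

```python
# Minimal sketch: pairwise p-distances after complete deletion of gapped columns.

def complete_deletion(aligned):
    """Remove every column in which at least one aligned sequence has a gap ('-')."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")
    keep = [i for i in range(length) if all(seq[i] != "-" for seq in aligned)]
    return ["".join(seq[i] for i in keep) for seq in aligned]


def p_distance(a, b):
    """Proportion of sites at which two equal-length sequences differ."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_matrix(aligned):
    """Pairwise p-distance matrix computed on the gap-free columns only."""
    cleaned = complete_deletion(aligned)
    n = len(cleaned)
    return [[p_distance(cleaned[i], cleaned[j]) for j in range(n)]
            for i in range(n)]
```

With complete deletion, all pairwise distances are computed over the same set of columns, so a single gap-rich sequence can shorten the effective alignment for every pair.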
The best non-redundant (Nr) hits in Arabidopsis, i.e. those with the lowest expected values, were collected by searching the identified poplar protein sequences against the RefSeq Nr Arabidopsis protein database using the BLASTP program of NCBI (http://www.ncbi.nlm.nih.gov/). The N-terminal acetylated peptides from the 58 poplar proteins were aligned at the acetylated amino acid residues, and fourteen positions downstream of the acetylated site were included. The alignment of all the acetylated sites was generated with the WebLogo program [23], a web-based application designed to generate sequence logos.

Supporting Information
Figure S1 Alignment of the amino acid sequences of MAPs from Arabidopsis and poplar. Color shading represents 70% identical residues among the sequences. Gaps were introduced to ensure maximum identity. (a) Amino acid sequence alignment of Ptr MAP1E with MAP1A-D of Arabidopsis and poplar. Sequence conservation is highest in the region of the MetAP1 domain (unmarked), and these MAP1s had various N-terminal extension sequences (gray box above the sequence alignment). In particular, the N-terminal extension is absent in Ptr MAP1E. (b) Amino acid sequence alignment of MAP2s from poplar and Arabidopsis. MAP2s from poplar and Arabidopsis share near-identical amino acid sequences, indicating that these MAP2s have a conserved function. The identifiers of the proteins are shown in Table 2. (JPG)
Figure S2 Amino acid sequence alignment of all predicted Nat catalytic subunits from poplar. The consensus acetyl coenzyme A (AcCoA) binding motif sequence RxxGxG/A, where x can be any amino acid, is boxed (red). The identifiers of the proteins are shown in Table 3. (JPG)
Figure S3 Amino acid sequence alignments of each Nat catalytic subunit from several eukaryotes. The consensus acetyl coenzyme A (AcCoA) binding motif sequence RxxGxG/A, where x can be any amino acid, is indicated within the red boxes. Gaps were introduced to ensure maximum identity. Color shading represents 70% identical residues among the sequences. (a) Amino acid sequence alignment of the NatA catalytic subunits from poplar, Arabidopsis, human and yeast. (b) Amino acid sequence alignment of the NatB catalytic subunits from poplar, Arabidopsis, human and yeast. (c) Amino acid sequence alignment of the NatC catalytic subunits from poplar, Arabidopsis, human and yeast. (d) Amino acid sequence alignment of the NatD catalytic subunits from poplar, Arabidopsis, human and yeast. (e) Amino acid sequence alignment of the NatE catalytic subunits from poplar, Arabidopsis, human and yeast. (f) Amino acid sequence alignment of the NatF catalytic subunits from poplar, Arabidopsis and human. The identifiers of the proteins are shown in Supplemental Table 3 and Table 3. (JPG)
Figure S4 Schematic view of the number of paralogous isoforms of each Nat catalytic subunit from the four organisms. (JPG)
File S1 A file that contains all the original MS/MS spectra of the acetylpeptides identified in this study. (TAR)
File S2 Detailed information for the Nα-acetylated peptides identified in poplar.