Implications of polyploidy events on the phenotype, microstructure, and proteome of Paulownia australis

Polyploidy events are believed to be responsible for increasing the size of plant organs and enhancing tolerance to environmental stresses. Autotetraploid Paulownia australis plants exhibit superior traits compared with their diploid progenitors. Although some transcriptomics studies have been performed and some relevant genes have been revealed, the molecular and biological mechanisms regulating the predominant characteristics and the effects of polyploidy events on P. australis remain unknown. In this study, we compared the phenotypes, microstructures, and proteomes of autotetraploid and diploid P. australis plants. Compared with the diploid plant, the leaves of the autotetraploid plant were longer and wider, and the upper epidermis, lower epidermis, and palisade layer of the leaves were thicker, the leaf spongy parenchyma layer was thinner, the leaf cell size was bigger, and cell number was lower. In the proteome analysis, 3,010 proteins were identified and quantified, including 773 differentially abundant proteins. These results may help to characterize the P. australis proteome profile. Differentially abundant proteins related to cell division, glutathione metabolism, and the synthesis of cellulose, chlorophyll, and lignin were more abundant in the autotetraploid plants. These results will help to enhance the understanding of variations caused by polyploidy events in P. australis. The quantitative real-time PCR results provided details regarding the expression patterns of the proteins at mRNA level. We observed a limited correlation between transcript and protein levels. These observations may help to clarify the molecular basis for the predominant autotetraploid characteristics and be useful for plant breeding in the future.


Introduction
Paulownia tree species are native to China, but have been introduced to many other countries for afforestation and land reclamation because of their rapid growth and ability to adapt to poor environmental conditions [1][2][3]. Additionally, because of their good wood properties, a1111111111 a1111111111 a1111111111 a1111111111 a1111111111 central role in biological processes. Therefore, protein profile analyses may provide useful functional information that will help to supplement genomic and transcriptomic data.
Two-dimensional gel electrophoresis and isobaric tags for relative and absolute quantitation (iTRAQ) are the main techniques that have been used in proteomic studies. Two-dimensional gel electrophoresis is limited by the under-representation of low-abundance proteins. While, iTRAQ enables the simultaneous identification of many proteins and provides highly sensitive measurements of proteome-level changes. iTRAQ have been used to investigate polyploidy events, such as allopolyploid in Brassica species [24] and Tragopogon mirus [25], and autopolyploid in Arabidopsis thaliana [26]. A proteome study of autotetraploid Paulownia 'Yuza 1' plants using iTRAQ has been published recently [27]. However, very little is known about the differences in the protein profiles between diploid and autotetraploid P. australis plants.
The abundant transcript changes between diploid and autotetraploid P. australis plants have provided some information to characterize the mechanisms involved in plant polyploidization. Knowledge of proteome-level changes resulting from polyploidy events and the associated regulatory activities will help us to understand the mechanisms more throughly. In this study, the morphological, physiological, and microstructural features of P. australis leaves were analyzed, and a comprehensive iTRAQ-based comparative proteome-level investigation was conducted to study the effects of WGD events. The findings may help to improve our understanding of the molecular basis of polyploidy events.

Plant materials
Plant materials were obtained from the Laboratory of Forest Biotechnology, Henan Agricultural University, Zhengzhou, China. Autotetraploid P. australis (PA4) was induced from diploid P. australis (PA2) by colchicine treatment. Their ploidy have been identified by the flow cytometry analysis and chromosome counts [12]. The plants used in this study were tissue cultured seedlings. They were multiplied through in vitro plantlet regeneration using leaves from established tissue cultured seedling. The PA2 and PA4 seedlings prepared from the tissue cultures were grown for 30 days at 25 ± 2˚C under a 16-h light:8-h dark photoperiod (light intensity: 130 μmol m −2 s −1 ). Equal number of leaves from three individual plants were considered as one accession, and at least two biological replicates were used in the experiments. Harvested leaf samples were immediately frozen in liquid nitrogen and stored at −80˚C for iTRAQ and quantitative real-time PCR (qRT-PCR).

Determination of leaf traits
Mature and fully expanded leaves (i.e., second leaf from the apex) were used to measure leaf traits. One leaf from one seedling was considered as one accession. The length and width of at least 10 leaves were measured using a vernier caliper. Chlorophyll content was measured according to a previously described method [28], with three biological replicates for each measurement. Leaf cross-sections were evaluated using scanning electron microscopy. An area approximately 1-2 cm 2 from the center of each leaf was fixed in 2.5% glutaraldehyde buffered with 0.2 M sodium phosphate buffer (pH 7.2) for scanning electron microscopy. The samples were post-fixed in 1% (w/v) osmium tetroxide for 1 h and then washed three times in the buffer. The samples were dehydrated in a graded alcohol series and then examined in a Quanta 200 environmental scanning electron microscope (FEI, Hillsboro, TX, USA). Three biological replicates were used. The thicknesses of the upper epidermis, palisade layer, spongy parenchyma layer, lower epidermis, and leaf were measured according to the method described by Zhang et al [29]. The cell number and size were determined using Image J software (NIH, Bethesda, MD, USA).

Protein preparation
The harvested leave samples from PA2 and PA4 were ground to a fine powder in liquid nitrogen, and then treated with lysis buffer (7 M urea, 2 M thiourea, 4% CHAPS, and 40 mM Tris-HCl, pH 8.5) containing 1 mM PMSF and 2 mM EDTA (final concentration). Total protein was extracted and digested according to the method of Tian et al [30].
Proteins were identified and quantified using the Mascot 2.3.02 search engine (Matrix Science, Boston, MA, USA).The search parameters were as described in a previous study [30].
Proteins were identified using a transcriptome database consisting of 53,230 nonredundant sequences, from sequences that have been submitted to the Short Reads Archive under accession number SRP032321 [22]. To decrease the probability of incorrect protein identifications, only peptides with significance scores (! 20) in the 99% confidence interval (i.e., greater than "identity") of a Mascot probability analysis were considered accurately identified. The identification of each protein was based on at least one unique peptide. To quantify proteins, at least two unique peptides were needed for each protein. The calculated protein ratios were based on weighted averages, and were normalized against the median ratio in Mascot. Proteins with P values <0.05 and fold changes >1.2 were considered as differentially abundant proteins (DAPs).
For the functional analyses, the identified proteins were annotated by searching against the Gene Ontology (GO), Clusters of Orthologous Groups (COG), and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases.

Quantitative real-time PCR analysis of gene expression
Total RNA was extracted from PA2 and PA4 leaves using the TRIzol reagent (Sangon, Shanghai, China). Primers were designed using Beacon Designer (version 7.7) (Premier Biosoft International, Palo Alto, CA, USA) (S1 Table). PCR mixtures with a final volume of 20 μL contained 10 μL SYBR Green PCR mix, 1 μL cDNA, 0.4 μM forward primer, 0.4 μM reverse primer, and 7.0 μL sterile ddH2O. The qRT-PCR was completed using a CFX96 Real-Time System (Bio-Rad) and SYBR Premix Ex Taq II (Takara, Dalian, China). The PCR program was as follows: 95˚C for 1 min; 40 cycles of 95˚C for 10 s and 55˚C for 15 s. Three replicates were analyzed for each gene. Relative gene expression levels were calculated using the 2 −ΔΔCt method, and the 18S rRNA of Paulownia was chosen as a reference gene for normalization. Student's t test was used to detect differences between PA4 and PA2 at a significance level of p = 0.05.

Changes on leaf traits after whole genome duplications
The phenotypic differences among leaves are presented in Fig 1. Compared with PA2 plants, in PA4 plants, leaf length and width were greater; leaf size was bigger. Leaf structures were thicker, including the upper and lower epidermal layers and palisade tissues; but the spongy parenchyma layer was thinner (Fig 2). The cell number was decreased and cell size was increased in PA4 compared with PA2. Chlorophyll content was also higher in PA4 plants than in PA2 plants (Fig 1D), maybe because the thicker palisade tissue of PA4 leaves contained numerous chloroplasts, which are the primary sites for photosynthetic activities. Together, these changes may result in the greater net photosynthetic rate that has been observed in PA4 plants [32].
iTRAQ data analysis and protein identification Proteins extracted from PA2 and PA4 leaves were analyzed using an iTRAQ approach. To maximize the number of identified proteins, we limited the peptide matching error during database searches to less than 10 ppm. The distribution of errors between the actual and theoretical relative molecular weights of all matched peptides is presented in S1A Fig. A total of 366,435 spectra were obtained. Based on the analysis with the Mascot search engine, we acquired 21,982 spectra, of which 17,230 were unique spectra. We identified 8,665 peptides, including 7,575 unique peptides, and 3,010 proteins (S1B Fig and S2 Table). The distribution of numbers of peptides, mass and sequence coverage of proteins were shown in S1C, S1D and S1E Fig, respectively. The consistency of the results among the biological replicates implied the proteome-level analyses were reliable (S1F and S1G Fig). The MS/MS raw data generated in this study have been deposited in the ProteomeXchange Consortium database via the PRIDE partner repository (dataset identifier: PXD004237).
To characterize the functions of the 3,010 identified proteins, we first mapped them to the COG database. The proteins were mapped to 23 categories (S2 Fig). In the GO analysis, the identified proteins were classified into 54 groups (S3 Fig). To predict the main metabolic and signal transduction pathways in which the identified proteins involved, we mapped them to KEGG pathways. The proteins were mapped to 120 pathways (S3 Table).

Analysis of differentially abundant proteins
We detected 773 DAPs among the 3,010 identified proteins, including 410 DAPs with higher abundance and 363 DAPs with lower abundance in PA4 compared with PA2 (Fig 3 and S4  Table). We used GO enrichment analyses to determine the main biological functions of the DAPs. Under the biological process category, 122 GO terms were significantly enriched (P value<0.05), including "chlorophyll biosynthetic process", "oxidoreduction coenzyme metabolic process", and "photosynthesis". Under the cellular component category, 69 GO terms were significantly enriched, including "chloroplast part", "plastid part", and "thylakoid". Under the molecular function category, 49 GO terms were significantly enriched, including "copper ion binding", "ion binding", and "transition metal ion binding" (Fig 4 and S5 Table). The DAPs were also mapped to 101 KEGG metabolic pathways, including the highly enriched "photosynthesis", "glyoxylate and dicarboxylate metabolism", "metabolic pathways", "glutathione metabolism", and "carbon fixation in photosynthetic organisms" pathways (Table 1). Among the proteins whose abundances were altered by the WGD, the DAPs related to cell division, aerobic respiration, chlorophyll biosynthesis, carbon fixation, and lignin biosynthesis in particular, may provide relevant information regarding the changes induced by the WGD.
We comprehensively investigated the correlation between protein and mRNA profiles in PA2 and PA4 plants. Poor correlations (Pearson's r value = 0.14) were observed when all quantifiable proteins (1,792) that correlated with a cognate transcript were considered, regardless of the direction of the change (Fig 5A). Correlation between protein and transcript levels of the Group I and Group II members revealed positive (Pearson's r value = 0.77) and negative (Pearson's r value = −0.75) correlations, respectively (Fig 5B and 5C). The discrepancies between the protein and transcript levels may because the mRNA changes did not necessarily lead to similar changes in protein abundance. Alternatively, the proteome-level changes may have been underestimated in this study. Additionally, the identification of DAPs may have been restricted by limitations of the Paulownia sequence databases. As noted in other studies, incomplete databases may prevent the detection of all DAPs [24,33]. Although we used a transcriptome database in this study, it lacked sufficient information because the genomes of Paulownia species have not been fully sequenced. Thus, the proteins identified using the iTRAQ-based method likely represented only a small part of the proteome. Fully sequencing the genomes of Paulownia species to generate more comprehensive databases will likely considerably enhance future proteome-level studies.

Confirmation of differentially abundant proteins
To validate the expression change at mRNA level in 14 DAPs, we performed qRT-PCR analysis. The qRT-PCR results revealed that the expression of 13 DAPs at mRNA level were consistent with their protein expression (Fig 6). The discrepancy between the qRT-PCR and iTRAQ results for another DAP may be explained by post-transcriptional and/or post-translational regulatory processes.

Variations of leaf parameters
In this study, we determined that leaf length and width increased following WGDs, and speculated that this may be caused by the variations of cell division and expansion [38]. Leaf size is controlled by an organ-wide integration system with two main factors: cell number and size. Increases in cell size can result from a decrease in cell number and vice versa [39]. In our study, compared with PA2, PA4 has less cell number and bigger cell size. Sugiyama [18] found the bigger leaf in autotetraploid Lolium perenne and L. multifloru compared with diploid was a result of increased cell size, which is similar to the finding in this study. In mulberry, the cell size of autotetraploid was larger than in diploid [6]. Autotetraploid Salix viminalis has wider leaves and enlarged cells [19]. It appears that autopolyploidy can result in increased cell and leaf sizes. PA4 had thicker upper and lower epidermis than PA2. Among diploid, pentaploid, and hexaploid Betula papyrifera, the polyploids had thicker upper and lower epidermis [40], which is similar with our results. The amount of light absorbed by a leaf is related to its chlorophyll content [41]. In PA4, the chlorophyll content was higher than in PA2, suggesting that PA4 could absorb more light. Besides improvements in agricultural productivity and efficiency (biomass) through the phenotypic variations caused by polyploidization, stress tolerance will also become higher [20]. Leaf is highly responsive to environmental conditions, therefore, its microstructures may reflect the adaptability of plants to environmental conditions [42]. The olive Chemlali cultivar exhibited more tolerance to water stress than the Chétoui cultivar, and its palisade was thicker [43]. The diploid Betula papyrifera, with thinner upper and lower epidermis, was more sensitive to water deficit than the polyploids [40]. The leaf microstructure observed in the PA2 and PA4 plants may explain the higher tolerance to stresses of PA4 compared with PA2.

Differentially abundant proteins related to leaf trait variability
In this study, we determined that leaf length and width, and leaf size increased following WGDs. Cell division is one of the fundamental processes for plant organ growth and development [18]. Four DAPs (XP_002282146.1, AFJ42570.1, BAD45447.1, and NP_001183829.1) related to cell division were more abundant in PA4 plants than in PA2 plants. Larger cells may promote cell wall synthesis. Cellulose is a major component of plant cell walls, and two DAPs (CBI16310.3 and AFK34932.1) associated with cellulose biosynthesis were more abundant in PA4 than in PA2. Increases in leaf size may occur as a result of an increase in the cell elongation rate and/or by a prolonged cell elongation period, both of which require energy and materials. The F-type ATPases catalyze ATP synthesis, and the two F-type H + -transporting ATPase subunits (A29394 and AAQ84325.1) that were more abundant in PA4 plants than in PA2 plants might be responsible for supplying the energy required to increase leaf length. Although there was no direct relationship between leaf size and chlorophyll content, the thicker palisade tissues were associated with elevated chlorophyll content. We identified eight DAPs involved in chlorophyll biosynthesis that were more abundant in PA4 than in PA2 (Fig 7). The variability in abundance of these DAPs may help to clarify the molecular basis of elevated chlorophyll content in PA4. We also identified 10 DAPs related to carbon fixation that were more abundant in PA4 than in PA2 (Fig 8). This suggested increased chlorophyll content and photosynthetic activities, which is consistent with the results of a previous study [32]. Chlorophyll content may be an indicator of photosynthetic capacity.
Lignin is produced by the oxidative coupling of three monolignols which are synthesized from phenylalanine via the phenylpropanoid pathway. Chorismate originates from the shikimate pathway, and is a common intermediate in the biosynthetic pathways for phenylalanine and other aromatic amino acids. Chorismate is initially synthesized from D-erythrose 4-phosphate (E4P) and phosphoenolpyruvate (PEP), which are produced by the pentose phosphate pathway and glycolysis, respectively.
The immediate substrates of the shikimate pathway are E4P and PEP [46]. Additionally, TKTA catalyzes reactions in the Calvin cycle and oxidative pentose phosphate pathway to produce E4P. Henkes et al. observed that down-regulated TKTA activity results in decreased abundance of aromatic amino acids and lignin [47]. The carbon flux towards the shikimate pathway and phenylpropanoid metabolism is restricted by the supply of E4P. GPM catalyzes the conversion of 3-phosphoglycerate to 2-phosphoglycerate, and enolase catalyzes the production of PEP from 2-phosphoglycerate. TKTA, GPM, and enolase were more abundant in PA4 than in PA2, and may increase the carbon resources available for the shikimate pathway.
3-Deoxy-7-phosphoheptulonate (DHAP) synthase regulates the amount of carbon entering the pathway and catalyzes the conversion of E4P and PEP to DHAP [48]. AroK catalyzes the conversion of shikimate to shikimate-3-phosphate, with ATP as a co-substrate [49]. The two substrates for aroA are PEP and shikimate-3-phosphate, which are converted to phosphate and 5-enol-pyruvylshikimate-3-phosphate [50]. There are two routes for the post-chorismate branch of the pathway leading to phenylalanine synthesis (i.e., phenylpyruvate and arogenate routes). AST is part of the phenylpyruvate route, while PAT is involved in the arogenate route [51]. Both of these enzymes were more abundant in PA4 than in PA2, and may lead to increased phenylalanine levels.
Lignin is derived from phenylalanine. The monolignols produce the p-hydroxyphenyl, guaiacyl, and syringyl phenylpropanoid units of the lignin polymer. The HCT enzyme represents a key metabolic entry point for the synthesis of the most important lignin monomers. CAD catalyzes the reduction of cinnamyl aldehydes to the corresponding cinnamyl alcohols, which are the monomeric precursors of lignin [52]. The last step in lignin biosynthesis involves the polymerization of cinnamyl alcohols, which is affected by POD activities. In this study, we found that HCT, CAD, and POD were more abundant in PA4 plants than in PA2 plants. This may result in increased lignin production to strengthen cell walls and protect plants from environmental stresses. After cellulose, lignin is the most abundant terrestrial biopolymer, comprising approximately 30% of the organic carbon in the biosphere. Furthermore, the increased lignin content of PA4 plants may lead to enhanced carbon fixation, and ultimately increased biomass.
In plants, glutathione (GSH) is crucial for biotic and abiotic stress tolerance. We identified some DUPs related to GSH metabolism. Up-regulating GSH synthesis is likely an intrinsic plant response to stress [53]. Glutamate-cysteine ligase (gshA) may promote the synthesis of GSH [54]. GshA is an important component of the glutathione-ascorbate cycle that decreases hydrogen peroxide levels. In response to stress conditions, PA4 plants may increase gshA abundance to enhance their tolerance to environmental stresses. Abiotic stresses, such as drought and high salinity, damage plants via exposure to oxidative stress, which may induce the rapid generation of reactive oxygen species (ROS). Eliminating ROS improves plant growth. Glycine betaine (GB) is important for maintaining the enzymatic activities responsible for eliminating or detoxifying ROS [55,56]. Accumulating GB may enable plants to continue to grow under drought or salt stress conditions. Additionally, GB is synthesized from glycine, which is derived from serine. Glycine hydroxymethyltransferase (glyA) may promote the synthesis of glycine from serine. Increased abundance of glyA (XP_004253097.1, AFA36570.1, and CBI17302.3) may be conducive for the accumulation of GB. The adverse effects of ROS are also mitigated by superoxide dismutase. Previous studies revealed that the superoxide dismutase content was higher in PA4 leaves than in PA2 leaves [14,15]. Damage caused by ROS is amplified by the accumulation of toxic degradation products (e.g., aldehydes) arising from reactions between ROS and lipids or proteins. Aldehyde dehydrogenases eliminate toxic aldehydes [57], and in A. thaliana, they were shown to influence salinity tolerance [58]. In this study, we observed that aldehyde dehydrogenases (EMJ14942.1 and ACM89738.1) were accumulated in PA4 plants, possibly contributing to enhanced tolerance to environmental stresses.

Correlation between transcript and protein
The proteome and transcriptome reflect the situation of gene expression at two different levels. Combining these two omics data can help to reveal more complete expression information for an organism. Correlation between proteome and transcriptome data may reveal the relation between protein abundance and mRNA transcript levels. In yeast, the correlation between transcript expression and protein abundance was good [59], and in Tragopogon mirus [25], Brassica napus [60], and A. thaliana [26,61], limited correlations were found. Thus, it may be considered that low correlation between mRNA expression and protein abundance is common. In this study, we found non-significant correlation between proteomic and transcriptomic changes, suggesting the important role of post-transcriptional regulation and translational modification changes of proteins during the polyploidization process in P. australis. To comprehensively investigate the correlation, we clustered the quantified proteins into four groups. The information from the correlated DAPs and DEUs that had same change trends (Group I) may verify and explain some important processes; for example, the response to environmental stress in fission yeast [59]. The correlated DAPs and DEUs that had opposite change trends (Group II) may explain some phenomenon that could not explained by only proteome or transcriptome data; for example, the plant dormancy process [62]. Groups I and II together contained 129 proteins. Groups III and IV, which contained significantly changed mRNAs associated with unchanged proteins or significantly changed proteins correlated to unchanged mRNAs, contained 815 proteins. The differences in these numbers may result from the definition of DEUs and DAPs. Nonetheless, in many studies, in order to understand the effects of polyploidization, integrated analysis of transcriptome and proteome data is essential.

Conclusions
To characterize the molecular and biological mechanisms that regulate the effects of polyploidy events on P. australis, we compared the leaf parameters and proteomes of autotetraploid and diploid P. australis. The identification of DAPs related to cell division, GSH metabolism, and the synthesis of cellulose, chlorophyll, and lignin may help to characterize the differences between the diploid and autotetraploid P. australis plants, and clarify the mechanisms involved in regulating polyploidy events. Our results provide an overview of the changes caused by WGDs in P. australis and may be useful for breeding new Paulownia species and expanding the available germplasm resources.