Positively selected amino acid replacements within the RuBisCO enzyme of oak trees are associated with ecological adaptations

  • Carmen Hermida-Carrera,

    Roles Data curation, Formal analysis, Investigation, Methodology, Writing – original draft, Writing – review & editing

    Affiliation Research Group on Plant Biology under Mediterranean Conditions, Universitat de les Illes Balears-INAGEA, Palma, Balearic Islands, Spain

  • Mario A. Fares,

    Roles Formal analysis, Methodology, Writing – review & editing

    Affiliations Integrative and Systems Biology Group, Department of Abiotic Stress, Instituto de Biología Molecular y Celular de Plantas (CSIC–UPV), Valencia, Spain, Department of Genetics, University of Dublin, Trinity College Dublin, Dublin 2, Ireland

  • Ángel Fernández,

    Roles Investigation, Resources

    Affiliation Unidad de Recursos Forestales, C.I.T.A. de Aragón, Zaragoza, Spain

  • Eustaquio Gil-Pelegrín,

    Roles Supervision, Writing – review & editing

    Affiliation Unidad de Recursos Forestales, C.I.T.A. de Aragón, Zaragoza, Spain

  • Maxim V. Kapralov,

    Roles Formal analysis, Methodology, Writing – review & editing

    Affiliation School of Natural and Environmental Sciences, Newcastle University, Newcastle-Upon-Tyne, United Kingdom

  • Arnau Mir,

    Roles Formal analysis, Writing – review & editing

    Affiliation Computational Biology and Bioinformatics Research Group, Department of Mathematics and Computer Science, Universitat de les Illes Balears, Palma, Balearic Islands, Spain

  • Arántzazu Molins,

    Roles Investigation

    Affiliation Research Group on Plant Biology under Mediterranean Conditions, Universitat de les Illes Balears-INAGEA, Palma, Balearic Islands, Spain

  • José Javier Peguero-Pina,

    Roles Investigation, Resources, Writing – review & editing

    Affiliation Unidad de Recursos Forestales, C.I.T.A. de Aragón, Zaragoza, Spain

  • Jairo Rocha,

    Roles Formal analysis, Writing – review & editing

    Affiliation Computational Biology and Bioinformatics Research Group, Department of Mathematics and Computer Science, Universitat de les Illes Balears, Palma, Balearic Islands, Spain

  • Domingo Sancho-Knapik,

    Roles Investigation

    Affiliation Unidad de Recursos Forestales, C.I.T.A. de Aragón, Zaragoza, Spain

  • Jeroni Galmés

    Roles Conceptualization, Formal analysis, Funding acquisition, Project administration, Supervision, Writing – original draft, Writing – review & editing

    jeroni.galmes@uib.cat

    Affiliation Research Group on Plant Biology under Mediterranean Conditions, Universitat de les Illes Balears-INAGEA, Palma, Balearic Islands, Spain

Correction

11 Dec 2017: Hermida-Carrera C, Fares MA, Fernández Á, Gil-Pelegrín E, Kapralov MV, et al. (2017) Correction: Positively selected amino acid replacements within the RuBisCO enzyme of oak trees are associated with ecological adaptations. PLOS ONE 12(12): e0188984. https://doi.org/10.1371/journal.pone.0188984 View correction

Abstract

Phylogenetic analysis by maximum likelihood (PAML) has become the standard approach to study positive selection at the molecular level, but other methods may provide complementary ways to identify amino acid replacements associated with particular conditions. Here, we compare the results of the decision tree (DT) model method with those of PAML, using the key photosynthetic enzyme RuBisCO as a model system to study molecular adaptation to particular ecological conditions in oaks (Quercus). We sequenced the chloroplast rbcL gene encoding the RuBisCO large subunit in 158 Quercus species, covering about a third of the global genus diversity. It has been hypothesized that RuBisCO has evolved differentially depending on the environmental conditions and leaf traits governing internal gas diffusion patterns. Here, we show, using PAML, that amino acid replacements at residue positions 95, 145, 251, 262 and 328 of the RuBisCO large subunit have been subject to positive selection along particular Quercus lineages associated with leaf traits and climate characteristics. In parallel, the DT model identified amino acid replacements at sites 95, 219, 262 and 328 as being associated with leaf traits and climate characteristics, exhibiting partial overlap with the results obtained using PAML.

Introduction

RuBisCO is one of the best-studied enzymes and is often used as a model protein in evolutionary studies. During photosynthesis, RuBisCO binds CO2 to the Calvin cycle intermediate ribulose-1,5-bisphosphate (RuBP), thereby acting as the essential entry point for carbon into the biosphere. Due to its imperfect ability to distinguish between CO2 and O2, RuBisCO also catalyzes the oxygenation of RuBP, giving rise to the energy-dissipating process of photorespiration. Compared to other catalysts, RuBisCO is a sluggish enzyme, with a catalytic turnover rate (kcatc) of about 3 s−1 in terrestrial plants [1]. Alongside these catalytic imperfections and its large molecular weight, RuBisCO also represents a significant nitrogen investment, typically accounting for 25–30% of the leaf total nitrogen in C3 plants [2].

The photosynthetic process adapts to abiotic stress, such as high temperature or water deficit [3, 4, 5], by optimizing leaf conductance (stomatal and mesophyll) governing CO2 diffusion [6] and by adjustments in the activity and concentration of RuBisCO and other rate-limiting enzymes [7, 8, 5]. Temperature and CO2 concentration at RuBisCO active sites are the main driving forces of RuBisCO evolution and adaptation [9, 10, 11, 12, 13, 14, 15]. Computational analysis of carbon uptake at the leaf [16] and canopy level [17] also suggests that optimization of RuBisCO kinetics in modern C3 plants depends on the temperature regime and CO2 concentration. Therefore, plants from dry environments and plants with high leaf mass per area have the lowest CO2 diffusion, and tend to have higher RuBisCO affinity for CO2 [12, 18]. By contrast, plants possessing the C4 carbon concentration mechanism have faster, but less CO2-specific RuBisCO [19, 20, 21, 22, 23, 24]. High temperatures decrease the ratio of CO2/O2 dissolved in the leaf liquid media, and directly decrease the affinity of RuBisCO for CO2 [25]. Accordingly, adaptation to higher temperatures can be achieved by a greater specificity of RuBisCO for CO2 (Sc/o), thereby reducing the loss of carbon due to photorespiration. Selection pressure on RuBisCO with increased Sc/o in hot environments has been demonstrated in some thermophilic red algae [26] and in terrestrial plants [12]. Because of the trade-off between RuBisCO affinity for CO2 and maximum carboxylase activity (kcatc), the selection for increased affinity for CO2 would inevitably take place at the expense of decreased kcatc [13, 14]. Such fine-tuning of RuBisCO kinetic traits is attributed to environmentally driven changes at the molecular level, most likely amino acid replacements within the catalytic large subunit.

In higher plants and green algae, the structure of RuBisCO consists of eight chloroplast-encoded large (L, 50–55 kDa) and eight nucleus-encoded small (S, 12–18 kDa) subunits assembled into a hexadecamer [27]. Large subunits possess the active site and therefore primarily determine RuBisCO kinetic traits [28], although recent studies demonstrate that S-subunits can also influence catalysis [29, 30, 31]. Directed mutagenesis and a variety of recombinant RuBisCOs from plastome-transformed plants allowed identifying molecular changes in L-subunit that translate into changes in RuBisCO catalysis, as well as determining how they affected photosynthesis and plant growth [32, 33, 34, 35, 36, 37]. Recent studies have demonstrated the relationship between amino acid polymorphism in the L-subunit of RuBisCO and catalytic efficiency in natural vegetation and crops, by comparing both distant phylogenetic lineages [18, 38, 39] and closely related species [15, 40] of land plants.

Studies comparing the rates of non-synonymous and synonymous substitutions along phylogenies have demonstrated that positive Darwinian selection is acting on RuBisCO within most lineages of plants, but is restricted to a relatively small number of residues [41, 42, 43, 44, 45, 46, 47, 48, 49]. Results derived from analyses of RuBisCO molecular adaptation complement trends in RuBisCO kinetics and confirm the predominant role of some environmental and physiological factors driving RuBisCO evolution. For example, signatures of positive selection are associated with changes in intracellular concentrations of CO2 driven by carbon-concentrating mechanisms, both in algae and terrestrial C4 plants [43, 48, 49, 50].

Mapping positively selected residues within the protein structure helps to locate catalytically important regions of RuBisCO, and suggests candidate amino acid replacements which could be implemented to optimize RuBisCO performance in crops [42, 48, 49]. However, the effect of an amino acid replacement on protein properties could vary in the presence of other mutations, either individually or together, because of the molecular sign epistasis among mutations [51]. These epistatic interactions impose strong selective constraints on amino acid replacements and also may explain the failure of most attempts to improve RuBisCO catalysis by single point mutations [34, 52]. In agreement with this prediction, positive selection analysis must also account for co-adaptive amino acid replacements through the identification of coevolutionary signatures to find how key residue changes affect RuBisCO structure and function. Coevolutionary studies have been applied to various proteins [53, 54, 55], but only recently to RuBisCO [47, 56]. It has been shown that coevolution of residues is common in RuBisCO of land plants and there is an overlap between coevolving and positively selected residues [56].

Evolutionary analyses are needed to identify adaptive changes in the RuBisCO sequence, but the drivers of such evolution must also be investigated. In this paper, we used a predictive model called a decision tree (DT), which is able to statistically associate a combination of environmental variables with variation in the amino acid residues. A DT can be used for both classification (classification tree) and regression (regression tree) tasks. We used this model for classification tasks, which are frequently employed in applied fields such as engineering and medicine [57, 58, 59, 60, 61]. A DT implicitly performs feature selection and requires relatively little effort for data preparation. The analysis is straightforward, and the results are shown graphically and can be easily interpreted.

The objective of the study was to investigate molecular adaptation of Quercus RuBisCO to particular ecological conditions and to test if leaf morphological traits are associated with adaptive amino acid substitutions. To achieve this, we compared two different methodologies: the DT model and phylogenetic analysis by maximum likelihood. We selected oak (Quercus) species as a model group for this study because this genus contains a large number of species (ca. 500) inhabiting a wide range of environments. Both evergreen and deciduous oak species have contrasting leaf morphology [62], and therefore variable diffusive limitations to CO2 transfer from the atmosphere to the site of carboxylation [63]. Finally, oaks are often an ecosystem-defining species in most broad-leaved forests worldwide making them an ecologically important group.

Materials and methods

Taxon selection and sampling

A total of 174 species in Fagales were selected for the study (S1 Table). These species belong to the Fagaceae (n = 170) and Nothofagaceae (n = 4). Within Fagaceae, the majority of the species belong to Quercus (n = 158; ca. 30% of the total number of Quercus species).

Each species was classified according to its geographic distribution, prevalent climate and leaf habit (S1 Table). The geographic distribution area of each species was assigned according to Govaerts et al. (1998) [64] and information found in publicly available databases [65, 66, 67, 68]. The prevalent climate was obtained by overlapping the geographical distribution of each species in our study with the Köppen-Geiger world map of climate classification [69]. To simplify the analysis, fifteen Köppen-Geiger climate types were grouped into six: 1) tropical (including climates Af, Am and Aw according to the Köppen-Geiger classification); 2) arid steppe (Bsh and Bsk); 3) temperate with dry winter and hot or warm summer (Cwa and Cwb); 4) temperate with dry summer and hot or warm summer (Csa and Csb); 5) temperate or cold without dry season and hot or warm summer (Cfa, Cfb, Dfa and Dfb) and 6) cold with dry summer and hot or warm summer (Dsa and Dsb) (S2 Table). Regarding leaf habit, species were classified as evergreen (those species retaining their leaves during the whole year), deciduous (those losing all leaves during the unfavourable season) and semi-evergreen (those species that lose some leaves during the unfavourable season, depending on its length and severity).
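The grouping of climate types described above can be written as a small lookup table. The following is a minimal sketch, not the authors' code: the names `CLIMATE_GROUPS`, `CODE_TO_GROUP` and `climate_group` are hypothetical, and the climate codes are spelled as they appear in the text.

```python
# Grouping of the fifteen Köppen-Geiger climate types into the six classes
# listed in the text. All names here are illustrative assumptions.
CLIMATE_GROUPS = {
    1: ("Af", "Am", "Aw"),            # tropical
    2: ("Bsh", "Bsk"),                # arid steppe
    3: ("Cwa", "Cwb"),                # temperate, dry winter
    4: ("Csa", "Csb"),                # temperate, dry summer
    5: ("Cfa", "Cfb", "Dfa", "Dfb"),  # temperate or cold, no dry season
    6: ("Dsa", "Dsb"),                # cold, dry summer
}

# Invert the table so a class can be looked up from a Köppen-Geiger code.
CODE_TO_GROUP = {
    code: group for group, codes in CLIMATE_GROUPS.items() for code in codes
}


def climate_group(koppen_code: str) -> int:
    """Return the simplified climate class (1-6) for a Köppen-Geiger code."""
    return CODE_TO_GROUP[koppen_code]
```

For example, `climate_group("Csa")` returns class 4, and a code outside the fifteen listed types raises a `KeyError`.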

Leaves from all species were sampled from living collections of Jardín Botánico de Iturrarán (Parque Natural de Pagoeta, Aya, Guipúzcoa, Spain), with the exceptions of Q. palmeri, Q. baloot and Q. vaccinifolia, which were collected from the Chevithorne Barton collection (Devon, UK). For each species, leaf density was calculated from leaf thickness and leaf mass per area (LMA) measurements performed on fully expanded leaves that developed in the external part of the tree canopy (i.e., exposed to full solar irradiation). The leaf thickness of each species was measured on two discs (disc area = 0.33 cm²) per leaf from five fully hydrated leaves, collected from three to five different individuals. The leaf thickness was measured using a digital contact sensor GTH10L coupled to an amplifier GT-75AP (GT Series, Keyence Corporation, Japan) [70]. Afterwards, the LMA of each disc was calculated as the ratio between the dry weight and the area. The dry weight was obtained after drying the leaf discs in a ventilated oven at 60°C until constant weight (typically after 2 days).
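As an illustration of the leaf trait calculation, the sketch below computes LMA as dry weight over disc area and then leaf density as LMA divided by thickness. The text does not give the density formula explicitly, so the LMA/thickness relationship is an assumption here, and the function names and the example disc values are hypothetical.

```python
DISC_AREA_CM2 = 0.33  # disc area reported in the text


def lma_g_per_m2(dry_weight_mg: float, area_cm2: float = DISC_AREA_CM2) -> float:
    """Leaf mass per area: dry weight divided by disc area, in g m^-2."""
    grams = dry_weight_mg / 1000.0
    square_metres = area_cm2 / 10000.0
    return grams / square_metres


def leaf_density_g_per_cm3(lma: float, thickness_um: float) -> float:
    """Leaf density assumed as LMA / thickness.

    With LMA in g m^-2 and thickness in micrometres the unit factors cancel,
    so the quotient is already in g cm^-3.
    """
    return lma / thickness_um


# Hypothetical disc: 4.95 mg dry weight and 300 um thickness.
example_lma = lma_g_per_m2(4.95)
example_density = leaf_density_g_per_cm3(example_lma, 300.0)
```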

DNA sequencing

Total genomic DNA was extracted from leaf material using the DNeasy Plant Mini Kit (Qiagen Ltd., Crawley, UK) according to the manufacturer’s protocol.

We sequenced the chloroplast genes rbcL and matK [71, 72]. To obtain the full rbcL sequence (1428 nucleotides), the gene was amplified using primers esp2F (5´-AATTCATGAGTTGTAGGGAGGGACTT-3´) and 1494R (5´-GATTGGGCCGAGTTTAATTTAC-3´). The matK gene was amplified in 43 species using the primers X390_F (5´-CGATCTATTCATTCAATATTTC-3´) and Xmatk9_R (5´-CAATCATTCGTGATTGGCCAG-3´). For 42 species, we obtained the nuclear microsatellite loci (SSRs) from [73] (QmC00716, QmC01095, QmC01990, QmC02241) and from [74] (ssrQpZAG15, ssrQpZAG46, ssrQpZAG110, ssrQrZAG-7, ssrQrZAG-20).

All PCR reactions were performed using the BioMix Red reagent mix (Bioline Ltd., London, UK). The PCR program for the amplification of rbcL comprised an initial denaturation at 95°C for 2 min, 36 cycles of 93°C for 30 s, 53°C for 30 s and 72°C for 3.5 min, and a final extension at 72°C for 30 min. The PCR program for the amplification of the matK gene comprised an initial denaturation at 95°C for 2 min, followed by 35 cycles of 30 s at 94°C (denaturing), 45 s at the annealing temperature of 56°C and 2 min at 72°C (extension), and a final extension phase of 7 min at 72°C. The microsatellites were amplified using the following PCR conditions: 95°C for 2 min, 35 cycles of 95°C for 30 s, 50°C for 30 s and 72°C for 2 min, and a final extension at 72°C for 5 min. The rbcL and matK PCR products were separated on 2% agarose gels buffered with 1X TAE and purified using the Roche High Pure PCR Product Purification Kit (Roche Diagnostics Corporation P.O., Indiana, USA). Chloroplast gene sequencing was performed using an ABI 3130 Genetic analyzer with the ABI BigDye™ Terminator Cycle Sequencing Ready Reaction Kit (Applied Biosystems, Foster City, California, USA). For microsatellites, we used the ABI 3130 XL Genetic analyzer and fragment analysis was performed using the GeneMapper software v4.1 (Applied Biosystems). The DNA sequences from the chloroplast markers were aligned using Clustal X [75] and manually adjusted with Bioedit v.7.2.5 [76]. All variable sites were checked against the original sequence chromatograms, and doubtful regions were sequenced again. All newly generated sequences were submitted to GenBank (S3 Table).

Phylogenetic analyses

We inferred the phylogenetic relationships from the nucleotide data using Bayesian inference (BI). We constructed a phylogeny using rbcL sequences from the 158 Quercus species (denoted Quercus large dataset) (Fig 1). The tree topology was not fully resolved for this group when using only one gene. Because we require a robust phylogeny to detect adaptive evolution by maximum likelihood, we chose a subset of species to construct a multilocus tree with better-resolved topology. The tree was constructed with a concatenated alignment of 45 rbcL, 43 matK and 42 SSRs for Quercus species (denoted Quercus small dataset) (Fig 2). Tree topologies using rbcL were congruent with those based on the use of multiple genes, with both leading to similar lists of amino acid sites detected to have evolved under positive selection. Finally, we constructed the phylogeny for all 174 species containing the Fagaceae and Nothofagaceae species (denoted Fagales henceforward) (Fig 3).

Fig 1. Quercus large dataset Bayesian phylogram based on 158 rbcL sequences.

Numbers above branches correspond to Bayesian posterior probabilities. The figure was edited using FigTree Version 1.4.0 [77].

https://doi.org/10.1371/journal.pone.0183970.g001

Fig 2. Quercus small dataset Bayesian phylogeny based on 45 sequences of rbcL, 43 matK and 42 microsatellites.

Numbers above branches correspond to Bayesian posterior probabilities. The figure was edited using FigTree Version 1.4.0 [77].

https://doi.org/10.1371/journal.pone.0183970.g002

Fig 3. Fagales Bayesian phylogram based on 174 rbcL sequences.

Numbers above branches correspond to Bayesian posterior probabilities. The figure was edited using FigTree Version 1.4.0 [77].

https://doi.org/10.1371/journal.pone.0183970.g003

Nucleotide sequences were translated into amino acid sequences with MEGA 5 software [78] and aligned online using MAFFT [79]. The optimal DNA substitution model was determined with the Modeltest v.3.7 package [80, 81] by comparing available models using the Bayesian information criterion (BIC). BI was performed in MrBayes version 3.2 [82], allowing different models for each region (rbcL, matK and SSRs). Markov Chain Monte Carlo (MCMC) used two independent runs of 1 × 10⁶ generations. Trees for the Quercus small and Fagales datasets were sampled every 300 generations. For the Quercus large dataset the MCMC used independent runs of 5 × 10⁶ generations and trees were sampled every 100 generations. The first 25% of the runs were discarded as burn-in. The trees sampled before reaching a stable posterior probability (PP) were excluded from the consensus. A majority rule consensus of the remaining trees from the two runs was edited in FigTree v 1.4.0 [77] and used as the BI tree.

Maximum likelihood tests for positive selection

Six different codon-based models were applied using the codeml program of the PAML package version 4.7 to test for the presence of positive selection [83]. These models were compared for goodness-of-fit to the data and phylogenies using the Likelihood Ratio Test (LRT), and the best model was used to estimate the nonsynonymous-to-synonymous rate ratio (ω = dN/dS). This ratio represents the selective pressures acting on the protein-coding gene, with values of ω = 1, ω < 1 and ω > 1 being indicative of neutral evolution, purifying selection and positive selection, respectively. BI trees were used as the reference topologies for the PAML analyses (Figs 1–3).

Site models allow the ω ratio to vary among codons in the protein [84, 85]. Model M1a assumes the same selection pressures on all branches of the phylogenetic tree. In this model, codons can either evolve neutrally or under purifying selection, and thus the estimated values of ω < 1 and/or ω = 1. Model M2a allows for an extra category of codon site compared to M1a which can evolve under positive selection (ω > 1). Model M8a assumes a discrete beta distribution for ω, which is constrained between 0 and 1 including a class with ω = 1. Model M8 allows the same distribution as M8a with an extra class of codons under positive selection with ω > 1.

Branch-site models allow ω to vary both among sites in the protein and across branches on the tree with the aim to detect positive selection affecting a few sites along particular branches (known as foreground branches). The branch-site A model was applied for branches leading to species with high or low leaf density; deciduous, evergreen or semi-evergreen species and species living in climates 1, 2, 3, 4, 5 and 6. When the number of species inhabiting a particular climate represented less than 15% of the total species analysed, then this climate was discarded for the branch site test. Model A1 allows 0 < ω < 1 and ω = 1 for all branches and also two additional classes of codons with fixed ω = 1 along pre-specified foreground branches while restricted as 0 < ω < 1 and ω = 1 on background branches. The alternative model A allows 0 < ω < 1 and ω = 1 for all branches and also two alternative classes of codons under positive selection with ω > 1 along pre-specified foreground branches while restricted to values of 0 < ω < 1 and ω = 1 on background branches.

We performed three LRTs to compare the nested site models M1a-M2a, M8-M8a and branch-site models A-A1. The LRT statistic is calculated as twice the difference in the log-likelihood values of the models being compared, with the degrees of freedom (df) being the difference in the number of parameters estimated in each of the models. The LRT statistic can be approximated by a chi-square distribution. For the comparisons between M1a-M2a, M8-M8a and A-A1 the df was 2, 1 and 0.5, respectively.
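The LRT described above can be sketched in a few lines. This is an illustrative sketch rather than the PAML implementation: the function names are hypothetical, the log-likelihood values in the example are invented, and the chi-square tail probability is given only for the closed-form cases df = 1 and df = 2 (the site-model comparisons), so the branch-site comparison is not covered.

```python
import math


def lrt_statistic(lnl_null: float, lnl_alt: float) -> float:
    """Twice the log-likelihood difference between alternative and null model."""
    return 2.0 * (lnl_alt - lnl_null)


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail chi-square probability, closed forms for df = 1 and 2 only."""
    if x <= 0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch handles only df = 1 or df = 2")


# Hypothetical log-likelihoods for a null (e.g. M1a) and alternative (e.g. M2a) fit.
stat = lrt_statistic(lnl_null=-2050.0, lnl_alt=-2040.0)
p_value = chi2_sf(stat, df=2)
```

A small p-value indicates that the model allowing a class of sites with ω > 1 fits significantly better than the null model.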

Coevolution analyses

CAPS software [86] was used to test for dependencies among amino acids on the RuBisCO structure. We used the Bayesian trees of the Quercus large (158 species) and Fagales (174 species) datasets as topology references for the analyses. CAPS compares the correlated variance of the evolutionary rates at two sites. This variance is corrected by the amount of divergence between the sequences compared, using either the synonymous nucleotide substitutions or, alternatively, amino acid replacements as relative measures of time. For each protein alignment, the corresponding BLOSUM matrix was applied depending on the average sequence identity. The significance of the results was evaluated by randomization of pairs of amino acid sites in the alignment, calculation of their correlation values, and comparison of the real values with the distribution of 10,000 randomly sampled values. An alpha value of 0.01 was applied to minimize the number of false positives. The level of substitutions per synonymous site weighted the correlated variability among amino acid sites in order to normalize parameters by the time of sequence divergence. The method detects phylogeny-independent coevolution. We also assessed the statistical support for each of the coevolving amino acid pairs using a non-parametric bootstrap analysis. Briefly, for each pair exhibiting significant coevolution signatures we shuffled the sequences across the tree and then re-ran CAPS on the resulting alignment. We then identified coevolving pairs of amino acid sites and checked for the presence of the pair identified in the original non-random alignment. We repeated this procedure 1000 times and, for each pair of coevolving sites, determined its frequency as the number of times it was detected in the 1000 replicates divided by 1000. A pair of coevolving sites was considered to be significantly represented in the bootstrap procedure when its frequency was equal to or larger than 0.8.
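The two significance criteria in this paragraph reduce to simple proportions. The sketch below is not the CAPS implementation; it only illustrates, with hypothetical function names and invented inputs, how an observed correlation can be compared against randomly sampled values and how the bootstrap frequency of a pair is obtained.

```python
from typing import Sequence

ALPHA = 0.01               # significance level stated in the text
SUPPORT_THRESHOLD = 0.8    # bootstrap frequency threshold stated in the text


def empirical_p_value(observed: float, random_values: Sequence[float]) -> float:
    """Fraction of randomly sampled correlations at least as extreme as observed."""
    extreme = sum(1 for value in random_values if abs(value) >= abs(observed))
    return extreme / len(random_values)


def bootstrap_frequency(detected: Sequence[bool]) -> float:
    """Number of replicates in which the pair was detected, divided by the total."""
    return sum(detected) / len(detected)


def is_supported(observed: float, random_values: Sequence[float],
                 detected: Sequence[bool]) -> bool:
    """A pair passes if it is significant at ALPHA and reaches the support threshold."""
    return (empirical_p_value(observed, random_values) < ALPHA
            and bootstrap_frequency(detected) >= SUPPORT_THRESHOLD)
```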

Decision tree model

Decision tree (DT) model analysis (“rpart” package in R v3.1.1) [87] was used to relate the proportion of amino acids present in all variable positions of the L-subunit of RuBisCO to species-specific traits (geographic area, climate, leaf habit and density), denoted as external variables.

For each variable position, the program builds a DT as follows. First, a question is found based on the analysis of all external variables to split the species. Then, based on that question, the species are separated into two groups in which the variability of that site is as low as possible. The analysis is repeated for each subgroup using all external variables. The process continues until the lowest entropic error (xerror) for the entire DT is obtained [88]. The quality of the DT is categorized by its xerror as a function of the proportion of correct predictions and the complexity of the tree. The lower the xerror, the stronger the relationship between the external variables and the variable site. Only DTs with xerror < 1 were selected. The program also calculates the importance of each external variable within the predictive model.
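To make the splitting step concrete, the sketch below scores one split using the amino acid counts reported for site 95 in the Fig 5 legend (121 N and 37 S overall; 79/8 in evergreen and semi-evergreen species, 42/29 in deciduous species). Gini impurity is used as the measure of variability, which is an assumption, since the text does not state the splitting criterion; the function names are hypothetical and this is not the rpart code.

```python
def gini(counts):
    """Gini impurity of a group given the count of each amino acid."""
    total = sum(counts)
    return 1.0 - sum((count / total) ** 2 for count in counts)


def impurity_decrease(parent, children):
    """Parent impurity minus the size-weighted impurity of the child groups."""
    n = sum(parent)
    weighted = sum(sum(child) / n * gini(child) for child in children)
    return gini(parent) - weighted


# Counts of (N, S) at site 95 taken from the Fig 5 legend.
parent = (121, 37)                 # all 158 Quercus species
children = [(79, 8), (42, 29)]     # evergreen + semi-evergreen, deciduous
gain = impurity_decrease(parent, children)
```

A positive `gain` means the two groups produced by the leaf habit question are each more homogeneous at site 95 than the full set of species; a tree-building algorithm would compare this value across candidate questions and keep the best one.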

An advantage of DTs is that no statistical assumptions (about independence, distribution, variance, etc.) are needed. The main limitation of DTs is the difficulty of identifying the optimal tree under given criteria, so heuristic algorithms are employed to give an approximate solution.

Results

The rbcL variability

We obtained complete sequences of the rbcL gene (1428 nucleotides) for 158 Quercus species, 12 other Fagaceae species (6 Fagus, 3 Castanea, 1 Castanopsis, 2 Lithocarpus) and 4 Nothofagaceae species (4 Nothofagus). Within the Fagales dataset (Fagaceae and Nothofagaceae, all 174 species), 19 variable amino acid sites were observed, resulting in 30 haplotypes (i.e. group of species with identical L-subunit sequence) (S4 Table). Within Quercus (158 species), 9 variable sites defined the L-subunit and species were grouped into 21 haplotypes. In the two datasets, most of the species belonged to haplotype 1.
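The notions of variable site and haplotype used above can be illustrated with a toy alignment. The sketch below is hypothetical throughout: the species labels and four-residue sequences are invented, and the helper names are not taken from any software used in the study.

```python
from collections import defaultdict


def group_haplotypes(sequences):
    """Group species that share an identical amino acid sequence."""
    groups = defaultdict(list)
    for species, sequence in sequences.items():
        groups[sequence].append(species)
    return list(groups.values())


def variable_sites(sequences):
    """Return 1-based alignment positions where more than one residue occurs."""
    columns = zip(*sequences.values())
    return [i + 1 for i, column in enumerate(columns) if len(set(column)) > 1]


# Invented toy alignment of equal-length sequences.
toy = {
    "species_A": "MNAS",
    "species_B": "MSAS",
    "species_C": "MNAS",
    "species_D": "MNVS",
}
haplotypes = group_haplotypes(toy)
sites = variable_sites(toy)
```

In this toy case two positions vary and the four species fall into three haplotypes, with `species_A` and `species_C` sharing one.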

Complete sequences of the chloroplast matK gene and nuclear SSRs were obtained for 43 and 42 species, respectively (Quercus small dataset). The phylogenetic tree constructed for the Quercus small dataset with rbcL, matK and SSRs is well resolved with posterior probabilities > 50% (Fig 2). The tree topology was similar to that of Manos et al. (1999) [89] based on combined chloroplast DNA and nuclear internal transcribed spacers (ITS).

Positive selection in Quercus rbcL

LRTs for positive selection (Table 1) indicated that the free-ratio model, which allows ω to be estimated for each branch of the tree, was significantly better than the models that do not allow for differences in ω values among tree branches (p-value = 0.0001). Models M2a and M8 both pointed to positive selection on rbcL in Quercus (small and large datasets) and Fagales. The three datasets exhibited positive selection at amino acid sites 95, 219 and 328 (Table 1). In Quercus, asparagine (Asn) and serine (Ser) occurred at site 95, whereas threonine (Thr) was found in Nothofagus antarctica and N. procera (S4 Table). In the three groups, valine (Val) and leucine (Leu) occurred at site 219, and alanine (Ala) and Ser were found at position 328. In both the Quercus small and Fagales datasets, site 262 (Val or Ala) was positively selected. Sites 251 and 475 appeared as positively selected only in the Quercus large dataset. Isoleucine (Ile) and methionine (Met) occurred at residue 251, and Leu and Val occurred at residue 475. Site 145 in Fagus, Lithocarpus and Nothofagus was either Val or Ala, although all Quercus species, Castanopsis carlesii, Castanea pumila and Lithocarpus densiflorus shared Ser.

Table 1. RuBisCO L-subunit sites subject to positive selection.

https://doi.org/10.1371/journal.pone.0183970.t001

In the Fagales dataset, the LRT (Table 2) indicated that the branch-site model A (ω2 = estimated, in branches leading to deciduous or evergreen species or belonging to climate 5, see S1 and S2 Tables) was a significantly better fit to the data than the null model A1 (ω2 = 1, fixed) (p-value = 0.0001). However, no positively selected sites were identified under the branch-site model A in either the Quercus small or large datasets. A total of five sites (95, 145, 251, 262 and 328) appeared as positively selected in Fagales, each exhibiting a Bayesian posterior probability greater than 0.90 (Table 2). In branches leading to evergreen species, the Asn95Ser replacement was found at least two times (Q. germana, C. carlesii) (Table 3). In branches leading to deciduous species, i) the Ile251Met replacement was found at least six times (Q. aliena, Q. fabri, Q. griffithii, Q. muehlenbergii, Q. serrata var. brevipetiolata and Q. wutaishanica); ii) the Ala262Val replacement occurred on one branch leading to C. sativa; iii) the Ala328Ser replacement occurred in branches leading to Q. eugeniifolia and Q. seemani, although the Ser328Ala replacement occurred instead in branches leading to N. procera and N. antarctica. In branches leading to species in climate 5, i) the Ser145Ala replacement took place at least six times (F. crenata, F. japonica, F. lucida, F. sylvatica, N. procera and L. hancei), while the Ser145Val replacement occurred at least three times in branches leading to N. menziesii, N. moorei and N. antarctica; ii) the Ala262Val replacement occurred in the branch leading to C. sativa; iii) the Ala328Ser replacement was found in branches leading to Q. costaricensis, while the Ser328Ala replacement occurred in branches leading to C. sativa, C. mollissima, N. procera, N. menziesii, N. moorei and N. antarctica.

Table 2. Results of branch site model A in Fagales dataset (174 species).

https://doi.org/10.1371/journal.pone.0183970.t002

Table 3. RuBisCO L-subunit amino acid replacements in Fagales (174 species) identified under positive selection by the Bayes Empirical Bayes (BEB) analysis implemented in the PAML package [83, 90] along branches leading to species with particular leaf or habitat trait.

https://doi.org/10.1371/journal.pone.0183970.t003

Analysis of dependent evolution among amino acid sites in rbcL

Analysis of coevolution in rbcL identified 29 pairs of coevolving amino acids in the Fagales dataset, with a total of 14 non-redundant amino acids involved (Fig 4 and S5 Table). The largest group of coevolving sites was composed of sites with only one interaction, and included residues 95, 143, 225, 449, 472 and 475. The coevolving pairs with the highest correlations (1.0) (as a measure of coevolution) were 30–449, 270–340, 270–353, 270–470, 340–353, 340–470, 353–470 and 472–475. In the Quercus large dataset, no amino acids were detected as coevolving.

Fig 4. Coevolving sites in the RuBisCO L-subunit of Fagales dataset.

Location of amino acids implicated in co-evolutionary dependencies.

https://doi.org/10.1371/journal.pone.0183970.g004

Decision tree model

In the Quercus large dataset (158 species), the DT model pointed to a link between the external variables (geographic distribution, climate and leaf habit and density) and the RuBisCO L-subunit variable sites 95, 219, 262 and 328 (Table 4, Fig 5). The xerrors calculated for each variable site were 0.89, 0.39, 0.44 and 0.59, respectively. According to the xerror, the sites that were best explained by the external variables were 219 and 262 followed by 328 and 95. The leaf habit (evergreen, semi-evergreen and deciduous) was the external variable that best explained variability at site 95, followed by the geographic area, climate and leaf density (Table 4). The most important variable for sites 219 and 262 was geographic area, followed by climate and leaf density for site 219 and leaf habit, leaf density and climate for site 262. Site 328 was best explained by climate (2 or 5), followed by geographic area, leaf habit, and density (Table 4, Fig 5).

Table 4. Variable sites resolved with the DT model for the Quercus large dataset (158 species).

https://doi.org/10.1371/journal.pone.0183970.t004

Fig 5. Decision trees (DT) resolved for each RuBisCO L-subunit variable site in the Quercus large dataset (158 species) (see S1 Table for details on the external variables: geographic distribution, climate, leaf habit and leaf density).

Numbers above each tree correspond to the RuBisCO L-subunit variable site according to the spinach sequence (AJ400848.1). The first level presents the proportion of amino acids at each variable site (in brackets). The external variable that allows the best separation of species is shown over the line. The second level presents the distribution of amino acids (in brackets) after the first split. Subsequent divisions are performed until the lowest xerror for the entire DT is obtained (symbolized as squares). Taking the RuBisCO L-subunit variable site 95 as an example, the first level shows the separation of the 158 species into those that present N (121) and those that present S (37). Over the line, leaf habit is indicated as the external variable that gives the best split among the four external variables, with evergreen and semi-evergreen species having an N/S proportion of 79/8, whereas deciduous species have an N/S proportion of 42/29. The latter group is further split, using geographic area as the best external variable, into a group of species from North and Central America with an N/S proportion of 29/9 and a group of species from Eurasia and Asia with an N/S proportion of 13/20. The relative importance of each external variable is shown in Table 4.

https://doi.org/10.1371/journal.pone.0183970.g005
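The site 95 example in the Fig 5 legend can be reproduced numerically. The sketch below is a minimal illustration that assumes Gini impurity as the splitting criterion (the usual default in CART-style trees; the criterion is not stated in this section) and computes the impurity decrease for the two splits described in the legend. It also computes the resubstitution relative error of the resulting three-leaf tree, that is, the misclassified species divided by those misclassified at the root. All function and variable names are hypothetical.

```python
def gini(counts):
    """Gini impurity of a node given its class counts, e.g. (N, S)."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def impurity_decrease(parent, children):
    """Weighted Gini decrease obtained by splitting parent into children."""
    n_parent = sum(parent)
    weighted = sum(sum(child) / n_parent * gini(child) for child in children)
    return gini(parent) - weighted


# (N, S) counts at site 95, copied from the Fig 5 legend.
root = (121, 37)
evergreen = (79, 8)        # evergreen and semi-evergreen species
deciduous = (42, 29)
decid_america = (29, 9)    # North and Central America
decid_eurasia = (13, 20)   # Eurasia and Asia

# Sanity check: child nodes must add up to their parent node.
assert tuple(map(sum, zip(evergreen, deciduous))) == root
assert tuple(map(sum, zip(decid_america, decid_eurasia))) == deciduous

first_split = impurity_decrease(root, [evergreen, deciduous])
second_split = impurity_decrease(deciduous, [decid_america, decid_eurasia])

# Resubstitution error: each leaf predicts its majority amino acid,
# so the minority count in each leaf is misclassified.
leaves = [evergreen, decid_america, decid_eurasia]
tree_errors = sum(min(leaf) for leaf in leaves)
root_errors = min(root)
relative_error = tree_errors / root_errors

print(f"Gini impurity at root: {gini(root):.3f}")
print(f"Gini decrease, leaf habit split: {first_split:.3f}")
print(f"Gini decrease, geographic split: {second_split:.3f}")
print(f"Relative error: {tree_errors}/{root_errors} = {relative_error:.2f}")
```

The relative error obtained this way is computed on the same species used to build the tree. It is therefore not directly comparable with the xerror of 0.89 reported for site 95, assuming xerror denotes a cross-validated relative error (as in the rpart package for R) and noting that the final tree may contain further splits.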

Discussion

Both phylogenetic analyses by maximum likelihood and decision tree analyses highlighted the same amino acid substitutions

Methodologically different approaches have been used to study molecular adaptation of RuBisCO to particular ecological conditions in oak trees (Quercus). Phylogenetic analysis by maximum likelihood (PAML) [90] is a standard method to identify positive selection at the molecular level. In the present study, six RuBisCO L-subunit sites (95, 219, 251, 262, 328 and 475) were identified by models M2a and M8 as having evolved under positive selection in the Quercus large and small datasets (Table 1), all of which were previously reported in other groups of plants [41, 42, 43, 44]. For the same Quercus datasets, we compared these results with those of another method, the DT model. The DT model linked RuBisCO L-subunit sites 95, 219, 262 and 328 to distribution, climate, leaf habit and leaf density (Table 4). All four variable sites resolved by the DT model were positively selected according to the maximum likelihood analyses (Tables 1 and 4). The analytically simple DT method combined with PAML provided evidence of a link between amino acid replacements in RuBisCO and specific phenotypes [84, 85]. The combination of both methods constitutes a powerful tool to identify causal links between genetic variants and adaptation of the L-subunit of RuBisCO.

According to the DT model analysis, replacement at site 95 was linked to leaf habit as the most important external variable (Table 4). Since site 95 evolved under positive selection (Table 1), and evergreen and deciduous species typically display different mesophyll conductance (gm), which influences the CO2 concentration at the site of carboxylation [63], this result supports the idea that CO2 availability shapes RuBisCO evolution [14, 91].

The six RuBisCO sites under positive selection in the Quercus large and small datasets (95, 219, 251, 262, 328 and 475; Table 1) were located in functionally important subunit interfaces within the RuBisCO complex. Site 95 was hypothesized to be involved in interactions between RuBisCO and RuBisCO activase [92, 93]. Sites 219 and 262 were reported to be involved in interactions between large and small subunits [94]. Site 262 is located in loop 3 in a hydrophobic core in the C-terminal α-β barrel domain and could influence holoenzyme thermal stability and catalysis [95]. Site 251 seems to be involved in dimer-dimer interactions within the large subunits [96], and sites 328 and 475 are located close to the active site and in the C-terminus, respectively [95, 97].

Evidence for CO2 as a major factor driving RuBisCO evolution in Fagales

In Fagales, species with evergreen leaves had RuBisCO residues 95 and 262 under positive selection, species with deciduous leaves had residues 251, 262 and 328 under positive selection, and species inhabiting climate number 5 had residues 145, 262 and 328 under positive selection (Table 2). In species with evergreen leaves, amino acid replacements at position 95 (Table 3) may be linked to an increased affinity of RuBisCO for CO2 (i.e., low values of the RuBisCO Michaelis-Menten constant for CO2, Kc). The work by Galmés et al. [18] linked the Asp95Ser replacement to high affinity for CO2 (low Kc). The dataset of evergreen Fagales had an average LMA of 117.5 ± 0.4 g m-2. This high LMA, and specifically high leaf density, could be associated with high resistance to internal CO2 diffusion [63, 97]. Moreover, species with high LMA also tend to present lower values of stomatal conductance [98]. Hence, RuBisCO of evergreen Fagales probably works at relatively low CO2 partial pressures. Taken together, these results suggest that amino acid replacements at position 95 in evergreen Fagales may lead to RuBisCO with increased affinity for CO2. Unfortunately, our attempts to extract active RuBisCO from different Fagales failed due to the high content of polyphenols and other secondary metabolites. Future efforts will require testing different extraction buffers to obtain sufficient active enzyme and determine key RuBisCO kinetic parameters. In species with deciduous leaves or inhabiting areas lacking a dry season (climate number 5), replacements at site 328 (Table 3) could be related to decreased RuBisCO CO2 affinity (i.e., high values of Kc). Galmés et al. [15] reported a low specificity factor (Sc/o) and a high maximum carboxylase turnover rate (kcatc) and Kc in Limonium species having serine at site 328. Those kinetic values are associated with an increase in the chloroplast concentration of CO2.
Within Fagales, species with deciduous leaves or belonging to temperate climate (climate number 5) had average LMA values of 87.6 ± 0.2 and 95.4 ± 0.4 g m-2, respectively. Low LMA and low leaf density, together with the absence of a dry season, may favour high CO2 concentration in the chloroplast stroma, via indirect effects on leaf conductance to CO2 [98]. RuBisCO in deciduous species or in species inhabiting climate number 5 could have adapted towards a higher kcatc, lower CO2 affinity (high Kc) and lower Sc/o, although this remains to be confirmed by future kinetic analyses.
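The reasoning above rests on the trade-off between CO2 affinity (Kc) and turnover rate (kcatc). A minimal sketch of that trade-off, using simple Michaelis-Menten kinetics and ignoring competitive inhibition by O2, is given below. The kinetic values are hypothetical and chosen only for illustration; they are not measurements from Fagales or from any cited study, and the function names are likewise assumptions.

```python
def carboxylation_rate(co2, kcat, kc):
    """Michaelis-Menten rate per active site: kcat * C / (Kc + C)."""
    return kcat * co2 / (kc + co2)


def crossover_co2(kcat_a, kc_a, kcat_b, kc_b):
    """CO2 concentration at which two enzymes have the same rate.

    Setting kcat_a*C/(kc_a + C) equal to kcat_b*C/(kc_b + C) and
    solving for C > 0 gives
    C* = (kcat_a*kc_b - kcat_b*kc_a) / (kcat_b - kcat_a).
    """
    return (kcat_a * kc_b - kcat_b * kc_a) / (kcat_b - kcat_a)


# Hypothetical enzymes (kcat in s^-1, Kc and CO2 in micromolar).
high_affinity = {"kcat": 2.5, "kc": 8.0}    # low Kc, slower turnover
fast_turnover = {"kcat": 4.0, "kc": 16.0}   # high Kc, faster turnover

c_star = crossover_co2(high_affinity["kcat"], high_affinity["kc"],
                       fast_turnover["kcat"], fast_turnover["kc"])
print(f"rates are equal at {c_star:.2f} uM CO2")

for co2 in (3.0, 10.0):
    rate_a = carboxylation_rate(co2, **high_affinity)
    rate_b = carboxylation_rate(co2, **fast_turnover)
    faster = "high-affinity" if rate_a > rate_b else "fast-turnover"
    print(f"CO2 = {co2:4.1f} uM: {rate_a:.3f} vs {rate_b:.3f} s^-1 "
          f"-> {faster} enzyme is faster")
```

With these illustrative numbers, the high-affinity enzyme is faster below the crossover concentration and the fast-turnover enzyme is faster above it, which is the pattern the discussion proposes for evergreen versus deciduous species.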

In total, twenty-nine residue pairs in the Fagales RuBisCO L-subunit were linked through intramolecular coevolution, representing ca. 3% of L-subunit residues (14 out of 476) (Fig 4 and S5 Table). Many of the coevolving residues detected in the present study were already reported as coevolving in previous studies including different land plant lineages [47, 56]. Our results showed that both positive selection and coevolution affect some sites. For example, site 95 was positively selected within evergreen species (Tables 1 and 2) and appeared as coevolving with site 309 (S5 Table). By contrast, site 145 was positively selected within species of climate number 5 (Tables 1 and 2) and appeared as coevolving with seven sites (30, 142, 143, 270, 340, 353, 470) (S5 Table). Three of the coevolving pairs, 30–340, 142–470 and 270–309, coevolve in terms of their physicochemical properties, including molecular weight and hydrophobicity (S5 Table). Coevolving sites may be located within structurally and/or functionally important positions. For example, Kellogg and Juliano (1997) [99] reported the importance of sites 142 and 145 for dimer-dimer association, and sites 225 and 449 could be important for the interaction between large and small subunits [100]. Knowledge of the coevolution networks operating in the RuBisCO L-subunit of Fagales provides useful information on target substitutions to improve the catalytic performance of RuBisCO.

Concluding remarks

Based on our research, it is reasonable to postulate that finely tuned biochemical properties of Quercus RuBisCOs have evolved as a result of environmental pressures. Such evolution is manifested by positively selected amino acid replacements within the large subunits of Quercus RuBisCO, which are likely related to different physiological and environmental traits. These changes could have fine-tuned RuBisCO catalytic efficiency and may have facilitated Quercus adaptive radiation into diverse ecological niches. The DT model and phylogenetic analysis by maximum likelihood identified the same amino acid replacements associated with ecological adaptation and positive selection.

Supporting information

S1 Table. Studied species from Fagaceae and Nothofagaceae families from the order Fagales.

The genus, subgenus and section are indicated along with information on the species distribution, climate, leaf habit and leaf density. Data on the geographic distribution and leaf habit were obtained from Govaerts et al. (1998) [64] and publicly available databases [65, 66, 67, 68]. The climate types were obtained by overlapping the species geographical distribution and the Köppen-Geiger world map of climate classification [69]. Fifteen different Köppen-Geiger types of climates were grouped into six: 1 = tropical, 2 = arid steppe, 3 = temperate with dry winter and hot or warm summer, 4 = temperate with dry summer and hot or warm summer, 5 = temperate or cold without dry season and hot or warm summer, 6 = cold with dry summer and hot or warm summer. The species leaf density was calculated from leaf thickness and leaf mass per area (LMA) measurements. The three columns on the right correspond with [1] classification.

https://doi.org/10.1371/journal.pone.0183970.s001

(PDF)

S2 Table. Köppen-Geiger climate types assigned to each of the 174 species (S1 Table) based on the geographic area.

To simplify the analysis, 15 different Köppen-Geiger types of climates were grouped into six: 1) tropical (including climates Af, Am and Aw according to the Köppen-Geiger classification); 2) arid steppe (BSh, BSk); 3) temperate with dry winter and hot or warm summer (Cwa, Cwb); 4) temperate with dry summer and hot or warm summer (Csa, Csb); 5) temperate or cold without dry season and hot or warm summer (Cfa, Cfb, Dfa, Dfb); and 6) cold with dry summer and hot or warm summer (Dsa, Dsb).

https://doi.org/10.1371/journal.pone.0183970.s002

(PDF)

S3 Table. Fagaceae and Nothofagaceae GenBank accession numbers for rbcL and matK genes.

https://doi.org/10.1371/journal.pone.0183970.s003

(PDF)

S4 Table. RuBisCO L-subunit 19 variable sites identified in 174 Fagales species, grouped in 30 haplotypes (i.e., groups of species with identical L-subunit sequence).

Variable sites identified when the Quercus large dataset (158 species) was analyzed separately are marked in grey (9 variable sites and 21 haplotypes). Species marked with an asterisk were used to construct the Quercus small dataset (45 species) phylogeny based on rbcL, matK and SSRs.

https://doi.org/10.1371/journal.pone.0183970.s004

(PDF)

S5 Table. Coevolution pairs within the L-subunit of RuBisCO.

Significant correlation coefficients are shown (p < 0.001). Mean D1 and Mean D2 values correspond to the mean distance calculated for each coevolving position based on BLOSUM, as calculated in the method of Fares and Travers (2006) [1]. Bootstrap values are based on 100 resamplings and give the confidence level associated with each coevolving pair (values greater than 75% are significant after a non-parametric resampling over the tree). Atomic distances were calculated from the 3D crystal structure wherever available by measuring the average Euclidean distance between atoms of two amino acids (Å). Atomic distances are not used here as evidence of coevolution but rather as additional supporting information in the identification of functional and structural coevolution. A test for variability in hydrophobicity and molecular weight was also conducted, which identified the pairs 30–340, 142–470 and 270–309. n.a., not available.

https://doi.org/10.1371/journal.pone.0183970.s005

(PDF)

Acknowledgments

The authors acknowledge Jardín Botánico de Iturrarán (Parque Nacional de Pagoeta, Aia, Guipúzcoa), especially Francisco Garín, for providing the plant material used in this experiment and for help during its collection. The authors are grateful to Trinidad Garcia for her technical support at the Serveis Científico-Tècnics of UIB and to Miquel Truyols at the UIB Experimental Field and Greenhouses. Dr. Miquel Ribas-Carbó (UIB) and Prof. Andrew Young (LJMU) are acknowledged for improving a previous version of the manuscript.

References

  1. Whitney SM, Houtz RL, Alonso H. Advancing our understanding and capacity to engineer nature’s CO2-sequestering enzyme, Rubisco. Plant Physiol. 2011; 155:27–35. pmid:20974895
  2. Ellis RJ. The most abundant protein on earth. Trends Biochem Sci. 1979; 4:241–244.
  3. Berry J, Bjorkman O. Photosynthetic response and adaptation to temperature in higher plants. Ann Rev Plant Physio. 1980; 31:491–543.
  4. Ehleringer J, Mooney HA. Productivity of desert and Mediterranean-climate plants. In: Lange OL, Nobel PS, Osmond CB, Ziegler H, editors. Physiological plant ecology IV. Springer Berlin Heidelberg, 1983; pp 205–231.
  5. Yamori W, Takahashi S, Makino A, Price GD, Badger MR, von Caemmerer S. The roles of ATP synthase and the cytochrome b6/f complexes in limiting chloroplast electron transport and determining photosynthetic capacity. Plant Physiol. 2011; 155:956–962. pmid:21177473
  6. Flexas J, Diaz-Espejo A, Gago J, Gallé A, Galmés J, Gulías J, et al. Photosynthetic limitations in Mediterranean plants: a review. Environ Exp Bot. 2014; 103:12–23.
  7. Hozain MDI, Salvucci ME, Fokar M, Holaday AS. The differential response of photosynthesis to high temperature for a boreal and temperate Populus species relates to differences in Rubisco activation and Rubisco activase properties. Tree Physiol. 2010; 30:32. pmid:19864261
  8. Galmés J, Ribas-Carbó M, Medrano H, Flexas J. Rubisco activity in Mediterranean species is regulated by the chloroplastic CO2 concentration under water stress. J Exp Bot. 2011; 62:653–665. pmid:21115663
  9. Delgado E, Medrano H, Keys AJ, Parry MAJ. Species variation in Rubisco specificity factor. J Exp Bot. 1995; 46:1775–1777.
  10. Raven JA. Land plant biochemistry. Philos T R Soc B. 2000; 355:833–846.
  11. Sage RF. Variation in the kcat of Rubisco in C3 and C4 plants and some implications for photosynthetic performance at high and low temperature. J Exp Bot. 2002; 53:609–620. pmid:11886880
  12. Galmés J, Flexas J, Keys AJ, Cifre J, Mitchell RAC, Madgwick PJ, et al. Rubisco specificity factor tends to be larger in plant species from drier habitats and in species with persistent leaves. Plant Cell Environ. 2005; 28:571–579.
  13. Tcherkez GG, Farquhar GD, Andrews TJ. Despite slow catalysis and confused substrate specificity, all ribulose bisphosphate carboxylases may be nearly perfectly optimized. P Natl Acad Sci. 2006; 103:7246–7251.
  14. Savir Y, Noor E, Milo R, Tlusty T. Cross-species analysis traces adaptation of Rubisco toward optimality in a low-dimensional landscape. P Natl Acad Sci USA. 2010; 107:3475–3480.
  15. Galmes J, Andralojc PJ, Kapralov MV, Flexas J, Keys AJ, Molins A, et al. Environmentally driven evolution of Rubisco and improved photosynthesis and growth within the C3 genus Limonium (Plumbaginaceae). New Phytol. 2014; 203:989–999. pmid:24861241
  16. Galmés J, Conesa MÀ, Díaz-Espejo A, Mir A, Perdomo JA, Niinemets Ü, et al. Rubisco catalytic properties optimized for present and future climatic conditions. Plant Sci. 2014; 226:61–70. pmid:25113451
  17. Zhu XG, Portis AR, Long SP. Would transformation of C3 crop plants with foreign Rubisco increase productivity? A computational analysis extrapolating from kinetic properties to canopy photosynthesis. Plant Cell Environ. 2004; 27:155–165.
  18. Galmes J, Kapralov MV, Andralojc P, Conesa MÀ, Keys AJ, Parry MA, et al. Expanding knowledge of the Rubisco kinetics variability in plant species: environmental and evolutionary trends. Plant Cell Environ. 2014; 37:1989–2001. pmid:24689692
  19. Yeoh HH, Badger MR, Watson L. Variations in Km (CO2) of ribulose-1,5-bisphosphate carboxylase among grasses. Plant Physiol. 1980; 66:1110–1112. pmid:16661586
  20. Yeoh HH, Badger MR, Watson L. Variations in kinetic properties of ribulose-1,5-bisphosphate carboxylase among plants. Plant Physiol. 1981; 67:1151–1155. pmid:16661826
  21. Seemann JR, Badger MR, Berry JA. Variations in the specific activity of ribulose-1,5-bisphosphate carboxylase between species utilizing differing photosynthetic pathways. Plant Physiol. 1984; 74:791–794. pmid:16663511
  22. Ghannoum O, Evans JR, Chow WS, Andrews TJ, Conroy JP, Von Caemmerer S. Faster Rubisco is the key to superior nitrogen-use efficiency in NADP-malic enzyme relative to NAD-malic enzyme C4 grasses. Plant Physiol. 2005; 137:638–650. pmid:15665246
  23. Kubien DS, Whitney SM, Moore PV, Jesson LK. The biochemistry of Rubisco in Flaveria. J Exp Bot. 2008; 59:1767–1777. pmid:18227079
  24. Carmo-Silva AE, Keys AJ, Andralojc PJ, Powers SJ, Arrabaça MC, Parry MA. Rubisco activities, properties, and regulation in three different C4 grasses under drought. J Exp Bot. 2010; erq071.
  25. Brooks A, Farquhar GD. Effect of temperature on the CO2/O2 specificity of ribulose-1,5-bisphosphate carboxylase/oxygenase and the rate of respiration in the light. Planta. 1985; 165:397–406. pmid:24241146
  26. Uemura K, Miyachi S, Yokota A. Ribulose-1,5-bisphosphate carboxylase/oxygenase from thermophilic red algae with a strong specificity for CO2 fixation. Biochem Bioph Res Co. 1997; 233:568–571.
  27. Andersson I, Backlund A. Structure and function of Rubisco. Plant Physiol Bioch. 2008; 46:275–291.
  28. Spreitzer RJ, Salvucci ME. Rubisco: structure, regulatory interactions, and possibilities for a better enzyme. Annu Rev Plant Biol. 2002; 53:449–475. pmid:12221984
  29. Genkov T, Spreitzer RJ. Highly conserved small subunit residues influence rubisco large subunit catalysis. J Biol Chem. 2009; 284:30105–30112. pmid:19734149
  30. Ishikawa C, Hatanaka T, Misoo S, Miyake C, Fukayama H. Functional incorporation of sorghum small subunit increases the catalytic turnover rate of Rubisco in transgenic rice. Plant Physiol. 2011; 156:1603–1611. pmid:21562335
  31. Morita K, Hatanaka T, Misoo S, Fukayama H. Unusual small subunit that is not expressed in photosynthetic cells alters the catalytic properties of Rubisco in rice. Plant Physiol. 2014; 164:69–79. pmid:24254313
  32. Kanevski I, Maliga P, Rhoades DF, Gutteridge S. Plastome engineering of ribulose-1,5-bisphosphate carboxylase/oxygenase in tobacco to form a sunflower large subunit and tobacco small subunit hybrid. Plant Physiol. 1999; 119:133–141. pmid:9880354
  33. Sharwood RE, von Caemmerer S, Maliga P, Whitney SM. The catalytic properties of hybrid Rubisco comprising tobacco small and sunflower large subunits mirror the kinetically equivalent source Rubiscos and can support tobacco growth. Plant Physiol. 2008; 146:83–96. pmid:17993544
  34. Whitney SM, Sharwood RE, Orr D, White SJ, Alonso H, Galmés J. Isoleucine 309 acts as a C4 catalytic switch that increases ribulose-1,5-bisphosphate carboxylase/oxygenase (rubisco) carboxylation rate in Flaveria. P Natl Acad Sci. 2011; 108:14688–14693.
  35. Zhang XH, Webb J, Huang YH, Lin L, Tang RS, Liu A. Hybrid Rubisco of tomato large subunits and tobacco small subunits is functional in tobacco plants. Plant Sci. 2011; 180:480–488. pmid:21421395
  36. Galmés J, Aranjuelo I, Medrano H, Flexas J. Variation in Rubisco content and activity under variable climatic factor. Photosynth Res. 2013; 117:73–90. pmid:23748840
  37. Occhialini A, Lin MT, Andralojc PJ, Hanson MR, Parry MA. Transgenic tobacco plants with improved cyanobacterial Rubisco expression but no extra assembly factors grow at near wild-type rates if provided with elevated CO2. Plant J. 2016; 85:148–160. pmid:26662726
  38. Hermida-Carrera C, Kapralov MV, Galmés J. Rubisco catalytic properties and temperature response in crops. Plant Physiol. 2016; 171:2549–2561. pmid:27329223
  39. Orr DJ, Alcântara A, Kapralov MV, Andralojc PJ, Carmo-Silva E, Parry MA. Surveying Rubisco diversity and temperature response to improve crop photosynthetic efficiency. Plant Physiol. 2016; 172:707–717. pmid:27342312
  40. Sharwood RE, Ghannoum O, Kapralov MV, Gunn LH, Whitney S. Temperature responses of Rubisco from Paniceae grasses provide opportunities for improving C3 photosynthesis. Nat Plants. 2016. pmid:27892943
  41. Kapralov MV, Filatov DA. Molecular adaptation during adaptive radiation in the Hawaiian endemic genus Schiedea. PLoS One. 2006; 1:e8. pmid:17183712
  42. Kapralov MV, Filatov DA. Widespread positive selection in the photosynthetic Rubisco enzyme. BMC Evol Biol. 2007; 7:73. pmid:17498284
  43. Christin PA, Salamin N, Muasya AM, Roalson EH, Russier F, Besnard G. Evolutionary switch and genetic convergence on rbcL following the evolution of C4 photosynthesis. Mol Biol Evol. 2008; 25:2361–2368. pmid:18695049
  44. Iida S, Miyagi A, Aoki S, Ito M, Kadono Y, Kosuge K. Molecular adaptation of rbcL in the heterophyllous aquatic plant Potamogeton. PLoS One. 2009; 4:e4633. pmid:19247501
  45. Kato S, Misawa K, Takahashi F, Sakayama H, Sano S, Kosuge K, et al. Aquatic plant speciation affected by diversifying selection of organelle DNA regions. J Phycol. 2011; 47:999–1008. pmid:27020181
  46. Miwa H, Odrzykoski IJ, Matsui A, Hasegawa M, Akiyama H, Jia Y, Murakami N. Adaptive evolution of rbcL in Conocephalum (Hepaticae, bryophytes). Gene. 2009; 441:169–175. pmid:19100313
  47. Sen L, Fares MA, Liang B, Gao L, Wang B, Wang T, Su YJ. Molecular evolution of rbcL in three gymnosperm families: identifying adaptive and coevolutionary patterns. Biol Direct. 2011; 6:29. pmid:21639885
  48. Kapralov MV, Kubien DS, Andersson I, Filatov DA. Changes in Rubisco kinetics during the evolution of C4 photosynthesis in Flaveria (Asteraceae) are associated with positive selection on genes encoding the enzyme. Mol Biol Evol. 2011; 28:1491–1503. pmid:21172830
  49. Kapralov MV, Smith JAC, Filatov DA. Rubisco evolution in C4 eudicots: an analysis of Amaranthaceae sensu lato. PLoS One. 2012; 7:e52974. pmid:23285238
  50. Young JN, Rickaby REM, Kapralov MV, Filatov DA. Adaptive signals in algal Rubisco reveal a history of ancient atmospheric carbon dioxide. Philos T R Soc B. 2012; 367:483–492.
  51. Weinreich DM, Watson RA, Chao L. Perspective: sign epistasis and genetic constraint on evolutionary trajectories. Evolution. 2005; 59:1165–1174. pmid:16050094
  52. Raines CA. Transgenic approaches to manipulate the environmental responses of the C3 carbon fixation cycle. Plant Cell Environ. 2006; 29:331–339. pmid:17080589
  53. Dutheil J. Evolution and Structure of Biomolecules. Evolution. 2008; (1/23).
  54. Jiang X, Fares MA. Identifying coevolutionary patterns in human leukocyte antigen (Hla) molecules. Evolution. 2010; 64:1429–1445. pmid:19930454
  55. Codoñer FM, Fares MA. Why should we care about molecular coevolution? Evol Bioinform. 2008; 4:29.
  56. Wang M, Kapralov MV, Anisimova M. Coevolution of amino acid residues in the key photosynthetic enzyme Rubisco. BMC Evol Biol. 2011; 11:266. pmid:21942934
  57. Schwabacher M, Aguilar R, Figueroa F. Using decision trees to detect and isolate simulated leaks in the J-2X rocket engine. IEEE Aeros Conf. 2009; pp 1–7.
  58. Simard M, Saatchi SS, De Grandi G. The use of decision tree and multiscale texture for classification of JERS-1 SAR data over tropical forest. IEEE T Geosci Remote. 2000; 38:2310–2321.
  59. Yang CC, Prasher SO, Enright P, Madramootoo C, Burgess M, Goel PK, Callum I. Application of decision tree technology for image classification using remote sensing data. Agr Syst. 2003; 76:1101–1117.
  60. Soleimanian F, Mohammadi P, Hakimi P. Application of Decision Tree Algorithm for Data Mining in Healthcare Operations: A Case Study. Int J Comput Appl. 2012; 52:21–26.
  61. James G, Witten D, Hastie T, Tibshirani R. An Introduction to Statistical Learning. 2015. New York: Springer p. 315.
  62. Corcuera L, Camarero JJ, Gil-Pelegrín E. Functional groups in Quercus species derived from the analysis of pressure-volume curves. Trees-Struct Funct. 2002; 16:465–472.
  63. Flexas J, Ribas-Carbó M, Diaz-Espejo A, Galmes J, Medrano H. Mesophyll conductance to CO2: current knowledge and future prospects. Plant Cell Environ. 2008; 31:602–621. pmid:17996013
  64. Govaerts R, Frodin DG. World checklist and bibliography of Fagales (Betulaceae, Corylaceae, Fagaceae and Ticodendraceae). 1st ed. Kew, UK: The Royal Botanic Gardens; 1998.
  65. http://www.efloras.org/
  66. http://esp.cr.usgs.gov/data/little/
  67. http://www.kew.org/
  68. http://www.luomus.fi/en/atlas-florae-europaeae-afe-distribution-vascular-plants-europe
  69. Peel MC, Finlayson BL, McMahon TA. Updated world map of the Köppen-Geiger climate classification. Hydrol Earth Syst Sc. 2007; 4:439–473.
  70. Sancho-Knapik D, Álvarez-Arenas TG, Peguero-Pina JJ, Fernández V, Gil-Pelegrín E. Relationship between ultrasonic properties and structural changes in the mesophyll during leaf dehydration. J Exp Bot. 2011; 62:3637–3645. pmid:21414961
  71. Manos P, Steele K. Phylogenetic analyses of "higher" Hamamelididae based on plastid sequence data. Am J Bot. 1997; 84:1407–1407. pmid:21708548
  72. Piredda R, Simeone MC, Attimonelli M, Bellarosa R, Schirone B. Prospects of barcoding the Italian wild dendroflora: oaks reveal severe limitations to tracking species identity. Mol Ecol Resour. 2011; 11:72–83. pmid:21429102
  73. Ueno S, Taguchi Y, Tsumura Y. Microsatellite markers derived from Quercus mongolica var. crispula (Fagaceae) inner bark expressed sequence tags. Genes Genet Sys. 2008; 83:179–187.
  74. Steinkellner H, Lexer C, Turetschek E, Glössl J. Conservation of (GA)n microsatellite loci between Quercus species. Mol Ecol. 1997; 6:1189–1194.
  75. Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG. The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 1997; 25:4876–4882. pmid:9396791
  76. Hall TA. BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucl acid S. 1999; 41:95–98.
  77. Rambaut A. Figtree 1.4.0. 2012. http://tree.bio.ed.ac.uk/software/figtree/
  78. Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S. MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol. 2011; 28:2731–2739. pmid:21546353
  79. http://www.ebi.ac.uk/Tools/msa/mafft/
  80. Posada D, Crandall KA. Modeltest: testing the model of DNA substitution. Bioinformatics. 1998; 14:817–818. pmid:9918953
  81. Posada D, Buckley TR. Model selection and model averaging in phylogenetics: advantages of Akaike information criterion and Bayesian approaches over likelihood ratio tests. Syst Biol. 2004; 53:793–808. pmid:15545256
  82. Ronquist F, Huelsenbeck JP. MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics. 2003; 19:1572–1574. pmid:12912839
  83. Yang Z, Wong WS, Nielsen R. Bayes empirical Bayes inference of amino acid sites under positive selection. Mol Biol Evol. 2005; 22:1107–1118. pmid:15689528
  84. Nielsen R, Yang Z. Likelihood models for detecting positively selected amino acid sites and applications to the HIV-1 envelope gene. Genetics. 1998; 148:929–936. pmid:9539414
  85. Yang Z, Nielsen R, Goldman N, Pedersen AMK. Codon-substitution models for heterogeneous selection pressure at amino acid sites. Genetics. 2000; 155:431–449. pmid:10790415
  86. Fares MA, McNally D. CAPS: coevolution analysis using protein sequences. Bioinformatics. 2006; 22:2821–2822. pmid:17005535
  87. R Development Core Team (2008). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL http://www.R-project.org.
  88. Breiman L, Friedman JH, Olshen RA, Stone CJ. Classification and Regression Trees. Chapman and Hall; 1984.
  89. Manos PS, Doyle JJ, Nixon KC. Phylogeny, biogeography, and processes of molecular differentiation in Quercus subgenus Quercus (Fagaceae). Mol Phylogenet Evol. 1999; 12:333–349. pmid:10413627
  90. Yang Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol. 2007; 24:1586–1591. pmid:17483113
  91. Jordan DB, Ogren WL. Species variation in the specificity of ribulose biphosphate carboxylase/oxygenase. Nature. 1981; 291:513–515.
  92. Ott CM, Smith BD, Portis AR, Spreitzer RJ. Activase region on chloroplast ribulose-1,5-bisphosphate carboxylase/oxygenase: nonconservative substitution in the large subunit alters species specificity of protein interaction. J Biol Chem. 2000; 275:26241–26244. pmid:10858441
  93. Portis AR Jr. Rubisco activase – Rubisco's catalytic chaperone. Photosynth Res. 2003; 75:11–27. pmid:16245090
  94. Du YC, Spreitzer RJ. Suppressor mutations in the chloroplast-encoded large subunit improve the thermal stability of wild-type ribulose-1,5-bisphosphate carboxylase/oxygenase. J Biol Chem. 2000; 275:19844–19847. pmid:10779514
  95. Spreitzer RJ, Salvucci ME. Rubisco: structure, regulatory interactions, and possibilities for a better enzyme. Annu Rev Plant Biol. 2002; 53:449–475. pmid:12221984
  96. Knight S, Andersson I, Brändén CI. Crystallographic analysis of ribulose 1,5-bisphosphate carboxylase from spinach at 2.4 Å resolution: Subunit interactions and active site. J Mol Biol. 1990; 215:113–160. pmid:2118958
  97. Zhu XG, Jensen RG, Bohnert HJ, Wildner GF, Schlitter J. Dependence of catalysis and CO2/O2 specificity of Rubisco on the carboxy-terminus of the large subunit at different temperatures. Photosynth Res. 1998; 57:71–79.
  98. Niinemets Ü, Sack L. Structural determinants of leaf light-harvesting capacity and photosynthesis potentials. In: Esser K, Lüttge UE, Beyschlag W, Murata J, editors. Progress in botany. Springer Verlag, Berlin; 2006. pp. 385–419.
  99. Kellogg E, Juliano N. The structure and function of Rubisco and their implications for systematic studies. Am J Bot. 1997; 84:413–413. pmid:21708595
  100. Makowski M, Sobolewski E, Czaplewski C, Oldziej S, Liwo A, Scheraga HA. Simple physics-based analytical formulas for the potentials of mean force for the interaction of amino acid side chains in water. IV. Pairs of different hydrophobic side chains. J Phys Chem B. 2008; 112:11385–11395. pmid:18700740