Fig 1.
Phylogenetic orthogonal tree depicting major divergence points in the evolutionary history of modern PP2C sequences.
A rooted phylogenetic tree was inferred using BEAST analysis, as detailed in “Materials and Methods.” Two independent chains were run from the same input file for 50 million cycles, each collecting 10,000 trees from the posterior distribution. A “burn-in” of 1,000 trees was discarded from each sample and the remaining trees were pooled manually. The log files for the two runs were combined and are available as supporting information (S4 File). In this figure the root is located in the upper left of the image. Depicted is the maximum clade credibility tree from the posterior distribution tree sample. Nodes display the 95% highest posterior density (HPD) interval in blue. Each branch is labeled with the posterior probability (max = 1.0). Point 1, Point 2 and Point 3 are discussed in the text. This tree is based on the amino acid sequence alignment presented in Fig A in S2 File, Panel 1.
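The burn-in, pooling, and interval steps described in this legend can be sketched as follows. This is a minimal illustration and not the BEAST workflow itself: only the counts (10,000 trees per run, 1,000 discarded as burn-in) come from the legend, while the function names, the stand-in sample lists, and the toy values passed to `hpd_interval` are hypothetical.

```python
import math


def discard_burn_in(samples, burn_in):
    """Drop the first `burn_in` samples of a single run."""
    return samples[burn_in:]


def hpd_interval(values, mass=0.95):
    """Return the narrowest interval holding `mass` of the samples.

    This is a generic sample-based estimate of a highest posterior
    density interval; it assumes a unimodal distribution.
    """
    xs = sorted(values)
    n = len(xs)
    k = max(1, math.ceil(mass * n))  # number of samples inside the interval
    # Slide a window of k consecutive sorted samples and keep the narrowest.
    best = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[best], xs[best + k - 1]


# Stand-ins for the two posterior tree samples (hypothetical placeholders).
run_a = list(range(10_000))
run_b = list(range(10_000))

# Discard 1,000 trees from each run, then pool the remainder.
pooled = discard_burn_in(run_a, 1_000) + discard_burn_in(run_b, 1_000)
print(len(pooled))  # 18000

# Toy example: the narrowest window holding half of four samples.
print(hpd_interval([0, 10, 11, 30], mass=0.5))  # (10, 11)
```

Under these assumptions the pooled sample contains 18,000 trees. In practice the interval would be computed on a per-node quantity across the pooled trees rather than on toy numbers.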
Fig 2.
Topological uncertainty in the phylogenetic tree summarizing PP2C sequence evolutionary history.
This tree is an alternate display of the same BEAST analysis data used to generate Fig 1. Green lines represent traces of individual trees from the posterior distribution tree sample. In blue is the consensus tree with the highest clade support (“root canal”). Points 1, 2, and 3 are discussed in the text. Each marks a node, with a black bar indicating the 95% highest posterior density (HPD) interval.
Fig 3.
Phylogenetic orthogonal tree depicting interrelationships between representative PP2C7 sequences from plants, green algae, and fungi.
Inference of unrooted phylogenetic trees was performed as outlined in “Materials and Methods.” A typical example is shown. The most crucial nodes are labeled. Node support values with the four inference methods (PhyML [aBayes], RAxML [RBS], MrBayes [PP], PhyloBayes_MPI [PP]) are tabulated in the Figure, separated by slashes (“/”). Support values for all trees are summarized in Table N in S1 File. Predicted in silico subcellular localizations are represented as follows: Ch, chloroplast; Cy, cytosol; M, mitochondria; S, signal peptide; Unk (unknown), sequence fragment lacking native amino terminus. Sequences used in phylogenetic tree generation are listed in Table A in S1 File, while compiled in silico subcellular localization data can be found in Table B in S1 File (non-photosynthetic organisms) and Table F in S1 File (photosynthetic organisms). * = Three algal sequences included in this cluster.
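A slash-separated node label of the kind described in this legend could be unpacked as sketched below. The sketch assumes that the values appear in the order in which the methods are listed in the legend and that every value is numeric; the function name and the example label are hypothetical and are not taken from the figure.

```python
# Method order as listed in the legend (the order within a label is an assumption).
METHODS = ("PhyML aBayes", "RAxML RBS", "MrBayes PP", "PhyloBayes_MPI PP")


def parse_support(label):
    """Split a 'a/b/c/d' node label into one support value per method."""
    parts = [p.strip() for p in label.split("/")]
    if len(parts) != len(METHODS):
        raise ValueError(f"expected {len(METHODS)} values, got {len(parts)}")
    return dict(zip(METHODS, (float(p) for p in parts)))


# Hypothetical label, for illustration only.
example = parse_support("0.99/87/1.00/0.98")
print(example["RAxML RBS"])  # 87.0
```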
Fig 4.
Phylogenetic radial tree depicting interrelationships between PP2C7 sequences and bacterial Group II PP2C sequences.
The PP2C7 set is large and diverse (238 sequences); the bacterial Group II sequences are of the “GN” type, from the “More RsbX-Like” assemblage (144 sequences) (sequence varieties are described in the text). For this analysis the sequences of the Myxococcales group have been removed (A9GSF9, A6FYN9, A9GWA1, E3FWN8, L7U8R0) (see text for rationale). Inference of unrooted phylogenetic trees was performed as outlined in “Materials and Methods.” The most crucial nodes are labeled. Node support values with the four inference methods (PhyML [aBayes], RAxML [RBS], MrBayes [PP], and PhyloBayes_MPI [PP]) are tabulated in the Figure, separated by slashes (“/”). Support values for all trees are summarized in Table N in S1 File. Preliminary analyses showed that obtaining consistent tree topologies across the different inference methods required removal of the following sequences: Q2RIF7, H1Z3D3, C9R9C1, B1I2G2, G2MXY4, Q8RAY2, E4Q3X2. The cluster of sequences from α-Proteobacteria is indicated. The approximate location in the tree of the reference sequence BsP17906 (RsbX_BACSU) is indicated. This tree is based on the amino acid sequence alignment presented in S1(B) Fig. * = α-Proteobacteria cluster separated into adjacent fragments in this tree.
Fig 5.
Phylogenetic radial tree depicting interrelationships between eukaryotic PP2C sequences and bacterial Group II PP2C sequences.
The eukaryotic PP2C set consists of the combined sequences (excluding PP2C7s) from Arabidopsis and human (96 sequences total—HsTAB1 excluded). The bacterial Group II sequences are of the “GN” type, including the “More RsbX-Like” and “Less RsbX-Like” assemblages (328 sequences total) (see text for explanation of sequence varieties). Inference of unrooted phylogenetic trees was performed as outlined in “Materials and Methods.” The most crucial nodes are labeled. Node support values with the four inference methods (PhyML [aBayes], RAxML [RBS], MrBayes [PP], and PhyloBayes_MPI [PP]) are tabulated in the Figure, separated by slashes (“/”). Support values for all trees are summarized in Table N in S1 File. The cluster of sequences from αβγ-Proteobacteria is indicated. “Myxo” designates sequences from the Myxococcales (δ-Proteobacteria). The approximate location in the tree of the reference sequence BsP17906 (RsbX_BACSU) is indicated. This tree is based on the amino acid sequence alignment presented in Fig A in S2 File, Panel 3. * = Myxococcales unresolved from other sequences in this tree.
Fig 6.
Phylogenetic radial tree depicting a large scale comparison between PP2C7 sequences, bacterial Group II sequences, bacterial Group I sequences and eukaryotic PP2C sequences.
There are 102 representative PP2C7 sequences, and 49 representative bacterial Group I sequences. The eukaryotic PP2C set consists of the combined sequences (excluding PP2C7s) from Arabidopsis and human (96 sequences total—HsTAB1 excluded). The bacterial Group II sequences include representatives from both “Bulk” (50 sequences) and “GN” types (38 sequences) (see text for explanation of sequence varieties). There are also 9 “Eukaryotic-Like” bacterial PP2C sequences (see text for explanation). Inference of unrooted phylogenetic trees was performed as outlined in “Materials and Methods.” The most crucial nodes are labeled. Node support values with the four inference methods (PhyML [aBayes], RAxML [RBS], MrBayes [PP], and PhyloBayes_MPI [PP]) are tabulated in the Figure, separated by slashes (“/”). Support values for all trees are summarized in Table N in S1 File. The cluster of “Eukaryotic-Like” bacterial PP2C sequences within the eukaryotic PP2C clade is indicated by colored lines. This tree is based on the amino acid sequence alignment presented in Fig A in S2 File, Panel 4.
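As a quick check on the composition of this comparison, the per-group counts given in the legend can be tallied as below. The counts are taken from the legend; the dictionary keys are informal labels chosen here, and the legend does not state whether the sum equals the final number of sequences in the alignment.

```python
# Sequence counts per group, as stated in the legend.
counts = {
    "PP2C7 (representative)": 102,
    "Bacterial Group I (representative)": 49,
    "Eukaryotic PP2C (Arabidopsis + human, no PP2C7s)": 96,
    "Bacterial Group II, Bulk type": 50,
    "Bacterial Group II, GN type": 38,
    "Eukaryotic-Like bacterial PP2C": 9,
}

total = sum(counts.values())
print(total)  # 344
```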
Fig 7.
Structure-guided alignment of bacterial Group II, PP2C7, bacterial Group I, and eukaryotic PP2C sequences.
Information from solved structures of bacterial Group II, bacterial Group I, and eukaryotic PP2Cs (indicated by their four-character PDB codes) was used to guide this alignment, as detailed in “Materials and Methods.” Conserved β-strand and α-helical secondary structure elements are shown above the sequences. “Box” refers to a more variable region in multiple solved structures. For a secondary structure diagram, including element numbering, see Fig F in S2 File. Sequence motifs are as given in [84]. Universally conserved aspartates involved in metal coordination are given in red. Aspartates conserved in some but not all sequences are given in purple and orange (see text for discussion). The inset shows a simplified phylogenetic tree, with the proposed evolutionary advent of critical aspartate residues indicated. See Table A in S1 File for a listing of PP2C7 sequences. Bacterial Group II sequences without solved structures are from UniProt. Species for sequences are as follows: Bs (Bacillus subtilis); Ssp (Synechocystis sp.); Pa (Pseudomonas aeruginosa); Mt (Moorella thermoacetica); Tb (Trypanosoma brucei); Lm (Leishmania major); Tt (Tetrahymena thermophila); Ppa (Physcomitrella patens); At (Arabidopsis thaliana); Cr (Chlamydomonas reinhardtii); Vc (Volvox carteri); Ps (Phytophthora sojae); Ng (Naegleria gruberi); Xl (Xenopus laevis); Hs (Homo sapiens); Dm (Drosophila melanogaster); Fg (Fusarium graminearum); An (Aspergillus niger); Sc (Saccharomyces cerevisiae); Mtu (Mycobacterium tuberculosis); Sa (Streptococcus agalactiae); Ms (Mycobacterium smegmatis); Te (Thermosynechococcus elongatus); Ag (Anopheles gambiae).
Table 1.
Summary of plant PP2C7 gene expression data.