Figure 1.
Maximum likelihood phylogenies of the ppGpp synthetase and hydrolase domains.
Trees were generated from RAxML analyses of alignments of A) the ppGpp hydrolase (HD) domain-containing dataset (168 amino acid positions, 1535 sequences), and B) the ppGpp synthetase (SYNTH) domain-containing dataset (670 amino acid positions, 1706 sequences). In both trees, subgroups are labeled and shading behind the branches shows the most common domain structure observed for those groups, as per the legend in the inset box. Symbols on branches indicate bootstrap support, as per the inset box. In cases where the whole subgroup carries both the HD and SYNTH domains (Rel, SpoT, RelA, Rsh1-4, RshA-D), bootstrap support comes from the full-length long RSH tree (supplementary file SI2). Branch length is proportional to the number of substitutions per site (see scale bar).
Table 1.
The 30 subgroups of RSHs, their taxonomic distributions and additional descriptions.
Table 2.
Typical combinations of RSHs in bacteria.
Figure 2.
Consensus alignment of long RSHs, with RelSeq NTD structure colored according to conservation patterns in the alignment.
A) Domain structure of the long RSHs, with domain lengths drawn to scale based on S. equisimilis Rel[HS]. B) Alignment of long RSH sequences at the 70% level. Secondary structure is shown below the alignment, with “)” characters indicating helices and “>” characters indicating sheets. Secondary structure is taken from the structure of Rel [40] up to position 362, after which it was predicted with Psipred. Disordered regions in the structure are underlaid with a pale gray box, and disordered regions predicted with Disopred are underlaid with a darker gray box. Highlighting of residue columns indicates conservation patterns (also see inset box). Blue highlighting indicates sites that are conserved across all long RSHs. Green highlighting shows those sites that are distinctive in RelA[hS] (strongly differentially conserved or conserved only in RelA[hS]). Yellow highlighted sites are well conserved in Rel[HS]+SpoT[HS] but less so in RelA[hS], while purple highlighted sites are well conserved in Rel[HS]+RelA[hS] and less so in SpoT[HS]. Lines beneath the alignment indicate domains with the following colors: dark blue – HD, red – SYNTH, light blue – TGS, green – helical, turquoise – CC, magenta – ACT. Blue and red boxes show sites of the HD and SYNTH nucleotide binding pockets, respectively. Colored boxes in the TGS and ACT domains surround the blocks of these domains that are usually most conserved, as per sequence logos in the Pfam database. The turquoise box in the CC domain indicates the most conserved block of this domain, which also contains the conserved cysteines of [59]. The orange bar above the alignment shows the location of the differentially conserved motif of [44]. Black boxes around RelA[hS] and SpoT[HS] residues show sites that have experienced shifts in substitution rate, as predicted with Diverge. C) Structure of the Rel[HS] protein from S. equisimilis (RelSeq) [40], colored according to the conservation patterns of the alignment in B.
Figure 3.
Schematic diagram for the evolution of long RSHs in bacteria.
Thick gray branches indicate the divergence of bacterial groups, while the inner line shows the divergence of long RSH proteins and their functionality, as per the inset box.
Figure 4.
Bayesian inference phylogeny of plant RSHs.
The tree was generated from a MrBayes analysis of 470 amino acid positions from 66 sequences. Colored sequence names indicate subgroups as follows: red – Rsh1, green – Rsh2, orange – Rsh3, blue – Rsh4, and black – bacterial Rel. Numbers on branches show support in the following format: BIPP/MLBP. Support is only shown for branches with BIPP>0.8. Branch length is proportional to the number of substitutions per site (see scale bar).
Figure 5.
Bayesian inference phylogeny of the SAHs Mesh1 and Mesh1L.
The tree was generated from a MrBayes analysis of 179 amino acid positions from 99 sequences. Branch support and length are shown as described in Fig. 4. Sequence names are colored by taxonomic groups.
Figure 6.
Consensus alignment of long RSH and small RSH subgroups across the ppGpp synthetase and hydrolase domains, with RelSeq NTD structure colored according to conservation patterns in the alignment.
A) Alignment of RSH NTD sequences at the 70% level. Yellow highlighting shows those residues that are conserved only in long RSHs. Blue and red boxes show sites of the HD and SYNTH nucleotide binding pockets, respectively. Bright turquoise and orange boxes show the location of surface residues in the SYNTH and HD domains, respectively, that are likely to be involved in intermolecular interactions or in interactions with the CTD in long RSHs. The box is dotted where the region is disordered in the structure. The pale marine box shows those regions that appear to be involved in HD-SYNTH interactions. Arrows show especially interesting sites (see inset box in B). The orange bar above the alignment shows the location of the differentially conserved motif of [44]. B) Structure of the Rel[HS] protein from S. equisimilis (RelSeq) [40], colored according to the conservation patterns of the alignment in A. The inset box shows a subset of particularly interesting sites (labeled with arrows in A). Residue numbering is as in RelSeq, followed by alignment coordinates from A in parentheses.
Table 3.
Long RSH-specific sites that may be involved in intra- or inter-molecular interactions in long RSHs.