Flexibility and constraint: Evolutionary remodeling of the sporulation initiation pathway in Firmicutes

The evolution of signal transduction pathways is constrained by the requirements of signal fidelity, yet flexibility is necessary to allow pathway remodeling in response to environmental challenges. A detailed understanding of how flexibility and constraint shape bacterial two-component signaling systems is emerging, but how new signal transduction architectures arise remains unclear. Here, we investigate pathway remodeling using the Firmicute sporulation initiation (Spo0) pathway as a model. The present-day Spo0 pathways in Bacilli and Clostridia share common ancestry, but possess different architectures. In Clostridium acetobutylicum, sensor kinases directly phosphorylate Spo0A, the master regulator of sporulation. In Bacillus subtilis, Spo0A is activated via a four-protein phosphorelay. The current view favors an ancestral direct phosphorylation architecture, with the phosphorelay emerging in the Bacillar lineage. Our results reject this hypothesis. Our analysis of 84 broadly distributed Firmicute genomes predicts phosphorelays in numerous Clostridia, contrary to the expectation that the Spo0 phosphorelay is unique to Bacilli. Our experimental verification of a functional Spo0 phosphorelay encoded by Desulfotomaculum acetoxidans (Class Clostridia) further supports functional phosphorelays in Clostridia, which strongly suggests that the ancestral Spo0 pathway was a phosphorelay. Cross-complementation assays between Bacillar and Clostridial phosphorelays demonstrate conservation of interaction specificity since their divergence over 2.7 BYA. Further, the distribution of direct phosphorylation Spo0 pathways is patchy, suggesting multiple, independent instances of remodeling from phosphorelay to direct phosphorylation. We provide evidence that these transitions are likely the result of changes in sporulation kinase specificity or acquisition of a sensor kinase with specificity for Spo0A, which is remarkably conserved in both architectures.
We conclude that flexible encoding of interaction specificity, a phenotype that is only intermittently essential, and the recruitment of kinases to recognize novel environmental signals resulted in a consistent and repeated pattern of remodeling of the Spo0 pathway.


S1.1 Halanaerobiales and Natranaerobiales
The Halanaerobiales and Natranaerobiales are anaerobic, halophilic extremophiles [Mesbah et al., 2007;Roush et al., 2014]. The Antunes data set includes all currently available, fully sequenced genomes from these taxa; i.e., the genomes of Natranaerobius thermophilus and five members of the Halanaerobiales. The Yutin data set includes N. thermophilus and one Halanaerobiales genome. No genomes from these taxa were included in our data set.
Analysis of these genomes for Spo0 components revealed putative Spo0B and Spo0F orthologs in the only available Natranaerobiales genome (Natranaerobius thermophilus), which is consistent with a phosphorelay, although this species is a reported non-spore former [Mesbah et al., 2007]. One spore-forming member of the Natranaerobiales order has been reported, Natranaerobaculum magdiense [Zavarzina et al., 2013], but a whole genome sequence for this species has not been published.
All Halanaerobiales analyzed encode Spo0A and at least one PAS-containing histidine kinase, but neither Spo0F nor Spo0B. Again, all of the Halanaerobiales species included in either the Antunes or Yutin tree are considered to be asporogenous [Mavromatis et al., 2009;Oren et al., 1991;Vos et al., 2009;Zhilina et al., 1996]. However, several species within the order Halanaerobiales have been reported to form spores, including Halonatronum saccharophilum [Zhilina et al., 2012], Fuchsiella ferrireducens [Zhilina et al., 2015], and Natroniella acetigena [Zhilina et al., 1996]. The genus Sporohalobacter was initially reported to form spores [Oren et al., 1991], but subsequent characterization called this result into question as no growth was obtained following heat treatment [Ben Abdallah et al., 2015]. Swollen end cells were characterized as pre-spore-like structures, but no spores were observed via microscopy. These experimental differences could be due to the conditions tested or to differing assessments of what constitutes a spore. Of the spore-forming Halanaerobiales, the only available genome is that of Halonatronum saccharophilum, which does not encode Spo0F or Spo0B orthologs according to our genome neighborhood conservation method.
Genomes from Halanaerobiales and Natranaerobiales form a clade in both recently published trees, but the location of that clade in relation to other Firmicutes differs. In the Antunes tree, these taxa are basal to the divergence of Classes Bacilli and Clostridia. Since this placement makes them the earliest branching clade within the Firmicutes, the presence of an apparent phosphorelay architecture in N. thermophilus supports the hypothesis that the emergence of the Spo0 phosphorelay predates the divergence of the Clostridia and Bacilli.
In the Yutin tree, the clade (which includes one representative of each order, Natranaerobius thermophilus and Halothermothrix orenii) is one of several descendants of a polytomy at the base of Class Clostridia. Since this node is unresolved, this placement neither supports nor refutes our prediction that the common ancestor of the Clostridia and Bacilli likely encoded a phosphorelay Spo0 pathway.
Regardless of their phylogenetic placement in the context of other species, the presence of a phosphorelay in the Natranaerobiales and a direct phosphorylation architecture in the Halanaerobiales adds to the patchy distribution of Spo0 architectures observed throughout the phylum, requiring an additional remodeling event to explain the present-day phylogenetic distribution. The placement of these sister taxa in the two recently published trees does not contradict the hypothesis that the phosphorelay architecture was present in the common ancestor of the Bacilli and Clostridia; moreover, the evidence from one of those studies suggests that it predates that common ancestor.
All trees agree on the relationships between these taxa: Alkaliphilus is a sister taxon to Clostridioides difficile and other Peptostreptococcaceae, when Gottschalkia is not present [Antunes et al., 2016], and vice versa [Yutin and Galperin, 2013]. When both are present (as in our tree), Gottschalkia is basal to a clade that includes both Alkaliphilus and Clostridioides. Notably, there are asporogenous species interleaved between these taxa in all three trees.
These species are particularly interesting because, although closely related, they have different predicted Spo0 architectures. A. metalliredigens and G. acidurici were found to encode homologs of Spo0F and Spo0B and are therefore likely to initiate sporulation via a phosphorelay. No phosphorelay homologs were observed in either A. oremlandii or C. difficile, suggesting that these sporulating species have a direct phosphorylation Spo0 pathway. The presence of the phosphorelay homologs in G. acidurici and A. metalliredigens suggests that the phosphorelay has persisted despite repeated losses of Spo0F and Spo0B and/or sporulation within closely related taxa. This mixed distribution implies multiple transitions from phosphorelay to direct phosphorylation architecture within the Clostridiales, one at the base of each divergent group. This inference is supported by all three trees.
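The inference rule used throughout this section (Spo0F and Spo0B both present implies a phosphorelay; both absent alongside Spo0A implies direct phosphorylation) can be written as a minimal sketch. The class and function names are hypothetical and are not part of the authors' pipeline. The Spo0F and Spo0B calls follow the text above, whereas the presence of Spo0A in these four sporulating species is assumed here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Spo0Calls:
    """Presence/absence calls for one genome (illustrative only)."""
    spo0a: bool
    spo0f: bool
    spo0b: bool


def predict_architecture(calls: Spo0Calls) -> str:
    """Map gene presence/absence to a predicted Spo0 architecture."""
    if not calls.spo0a:
        return "no Spo0 pathway predicted"
    if calls.spo0f and calls.spo0b:
        return "phosphorelay"
    if not calls.spo0f and not calls.spo0b:
        return "direct phosphorylation"
    # Only one of Spo0F/Spo0B detected: leave the call open.
    return "ambiguous"


# Spo0F/Spo0B calls as described in the text; spo0a=True is an assumption.
genomes = {
    "A. metalliredigens": Spo0Calls(spo0a=True, spo0f=True, spo0b=True),
    "G. acidurici": Spo0Calls(spo0a=True, spo0f=True, spo0b=True),
    "A. oremlandii": Spo0Calls(spo0a=True, spo0f=False, spo0b=False),
    "C. difficile": Spo0Calls(spo0a=True, spo0f=False, spo0b=False),
}

predictions = {name: predict_architecture(c) for name, c in genomes.items()}
for name, arch in predictions.items():
    print(f"{name}: {arch}")
```

The "ambiguous" branch covers genomes in which only one of the two phosphorelay proteins is detected; such cases need manual inspection rather than an automatic call.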

S1.3 Predicted architectures within Class Bacilli
Each tree has a different set of species from Class Bacilli, but the results of these differences are not at variance with the observations made here. Homologs of Spo0F and Spo0B were detected by our methods in all genomes of spore-forming Bacilli represented in the three data sets, with the exception of two Erysipelatoclostridium genomes, two Paenibacillus genomes and the genome of Sporolactobacillus inulinus. Each of these exceptions is treated below.

Erysipelatoclostridium: The genomes of Erysipelatoclostridium ramosum DSM 1402 and Erysipelatoclostridium spiroforme DSM 1552 both encode Spo0A; E. spiroforme additionally encodes an orphan kinase. Spore formation in these species has been described as "rare or absent" [Kaneuchi et al., 1979;Lavigne et al., 2003;Yutin and Galperin, 2013]. No Erysipelotrichaceae species were included in the Antunes data set. In the Yutin tree, these genomes are basal to all other Bacilli. In our tree, these genomes are located within the Bacillaceae clade. However, since the Yutin data set does not include any early branching genomes in class Bacilli (e.g. Paenibacillaceae), the branching order in the two trees is consistent.
Paenibacillus: Both Paenibacillaceae genomes, Paenibacillus polymyxa and Brevibacillus brevis, included in our representative set possess all phosphorelay components. In our tree, these genomes form a clade that is a sister taxon to the Bacillaceae. These taxa are not represented in the Yutin tree. The Antunes tree includes the genomes of six members of the Paenibacillaceae, including genomes from the genera Desmospora, Brevibacillus, Paenibacillus, and Thermobacillus. These species are phylogenetically placed basal to the Bacillaceae, though paraphyletically.
Spo0B was not identified in two of these species, Paenibacillus mucilaginosus and Paenibacillus sp. Y412MC10. In both cases, inspection of the genome neighborhood reveals a hypothetical protein that is similar in sequence and domain content to Spo0B, though it appears to be missing a stop codon. This could be a loss-of-function mutation or the result of an error in sequencing or assembly. This hypothetical protein was accepted as sufficient evidence for the presence of Spo0B in our analysis.
If Spo0B is truly a pseudogene in these species, this could indicate either the loss of sporulation or gain of the ability to sporulate via direct phosphorylation of Spo0A in these species. Interestingly, the Paenibacillus polymyxa kinase PP 1077 has been reported to directly phosphorylate P. polymyxa Spo0A when heterologously expressed in a B. subtilis mutant that lacks Spo0B but retains all of the B. subtilis sporulation kinases [Park et al., 2012]. Spo0 architectures in these species warrant further investigation.
Sporolactobacillus: S. inulinus is present in the Antunes data set, but was not included in either our tree or the Yutin tree. The Antunes tree places this species basal to the Bacillaceae, diverging after the Paenibacillaceae. Spo0F was not identified by conserved genome neighborhood in S. inulinus, although S. inulinus is reported to produce endospores [Kitahara and Lai, 1967]. However, inspection of predicted RRs lacking an output domain in that species did reveal a possible candidate (SINU 10335). This protein aligns well to known and predicted Spo0F sequences and has specificity residues typical of Spo0F (QGILEVD), although it is not encoded in the proximity of any of the Spo0F neighborhood markers. All other single-domain RRs in this species had less sequence similarity to Spo0Fs and specificity residues that did not reflect the Spo0F signature (Fig 5). This was accepted as sufficient evidence for the presence of Spo0F in our analysis.
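As a rough illustration of how single-domain RR candidates might be ranked against the Spo0F specificity signature, the sketch below scores positional identity to the string QGILEVD quoted above. This is not the authors' method: the candidate names and residue strings are invented for illustration, and the sketch assumes the specificity residues have already been extracted from an alignment in a fixed order.

```python
SPO0F_SIGNATURE = "QGILEVD"  # specificity residues quoted in the text


def signature_identity(candidate: str, signature: str = SPO0F_SIGNATURE) -> float:
    """Fraction of specificity positions at which candidate matches signature."""
    if len(candidate) != len(signature):
        raise ValueError("candidate and signature must have equal length")
    matches = sum(a == b for a, b in zip(candidate, signature))
    return matches / len(signature)


# Hypothetical candidates; these residue strings are NOT from the paper.
candidates = {
    "candidate_1": "QGIMEVD",
    "candidate_2": "ASTLKRN",
}

# Rank candidates from most to least Spo0F-like.
ranked = sorted(
    candidates.items(),
    key=lambda item: signature_identity(item[1]),
    reverse=True,
)
for name, residues in ranked:
    print(f"{name}\t{residues}\t{signature_identity(residues):.2f}")
```

Positional identity is the simplest possible score; a substitution-matrix score would be a natural refinement if conservative replacements should count as partial matches.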

S1.4 Ruminococcaceae and Lachnospiraceae
Species from the Ruminococcaceae and Lachnospiraceae are well-sampled in all three trees; however, the relationship of these two families with respect to the Clostridiaceae varies slightly. In our tree and the Antunes tree, they are sister taxa to the Clostridiaceae, while the relationship between these three clades is not resolved in the Yutin tree. All sporogenous members of the Ruminococcaceae and Lachnospiraceae families are predicted to encode a direct phosphorylation architecture. If these two clades represent distinct lineages that are not sister taxa, as is possible in the Yutin tree, then an additional transition between phosphorelay and direct phosphorylation architecture is required to explain the phylogenetic distribution of Spo0 pathway architectures in these species.
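The dependence of the transition count on tree topology can be illustrated with a toy calculation. The sketch assumes an ancestral phosphorelay and allows only phosphorelay-to-direct transitions, with no regain. Under those assumptions, the minimum number of transitions equals the number of maximal clades whose tips are all direct. The topologies and the phosphorelay-encoding tips (relay_1, relay_2) are hypothetical placeholders, not the published trees.

```python
def count_transitions(tree, arch):
    """Return (clade_is_all_direct, minimum_transitions) for a nested-tuple tree.

    Leaves are tip names (str); internal nodes are tuples of subtrees.
    Assumes a phosphorelay ancestor and one-way change to 'direct'.
    """
    if isinstance(tree, str):
        is_direct = arch[tree] == "direct"
        return is_direct, (1 if is_direct else 0)
    results = [count_transitions(child, arch) for child in tree]
    if all(is_direct for is_direct, _ in results):
        # A single transition on the stem of this clade explains every tip.
        return True, 1
    return False, sum(n for _, n in results)


arch = {
    "Ruminococcaceae": "direct",
    "Lachnospiraceae": "direct",
    "relay_1": "phosphorelay",  # hypothetical phosphorelay lineage
    "relay_2": "phosphorelay",  # hypothetical phosphorelay lineage
}

# Topology A: the two families are sister taxa.
sister = (("Ruminococcaceae", "Lachnospiraceae"), "relay_1")
# Topology B: the two families are separated by phosphorelay lineages.
not_sister = (("Ruminococcaceae", "relay_1"), ("Lachnospiraceae", "relay_2"))

n_sister = count_transitions(sister, arch)[1]
n_not_sister = count_transitions(not_sister, arch)[1]
print(n_sister, n_not_sister)
```

With the sister topology a single transition on the shared stem suffices, whereas separating the two families forces one transition per family, which is the extra event referred to in the text.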

S1.5 Spore-formers with symbiotic or syntrophic life styles
Several spore-forming species in the Antunes tree have symbiotic or syntrophic lifestyles, including members of the genera Symbiobacterium [Ohno et al., 2000], Thermaerobacter, Tepidanaerobacter [Westerholm et al., 2011], and Thermosediminibacter. Only Symbiobacterium thermophilum was included in the Yutin tree and none were included in our tree.
Candidate Spo0F and Spo0B orthologs were not found in the Symbiobacterium and Thermaerobacter genomes, but were found in the closely related species, Sulfobacillus acidophilus DSM 10332, which has a typical free-living lifestyle [Norris et al., 1996]. The sister taxa Tepidanaerobacter and Thermosediminibacter encode orthologs of Spo0B, but not Spo0F, and lack orphan kinases. Of these species, all but Thermosediminibacter oceani have been observed to produce spores.
Interpreting the Spo0 pathway architectures in these species is complicated by their symbiotic nature. For example, Symbiobacterium thermophilum displays marked growth dependence on microbial commensalism with Bacillus sp. strain S [Ueda et al., 2004]. Similarly, the sporulation frequency of S. thermophilum increases from 0.1% to 20% when it is cultured in a dialysis flask with a constant influx of media used by Bacillus sp. strain S [Ueda et al., 2004], suggesting a reliance on external factors to initiate sporulation. These external factors could be small signaling molecules or, potentially, proteins encoded by Bacillus sp. strain S that facilitate the phosphorylation of Spo0A in S. thermophilum. Further work on the factors on which these syntrophic and symbiotic bacteria rely may reveal the mechanism of initiation of sporulation in these species. Taking a conservative stance, we do not interpret the absence of genes encoding Spo0 pathway proteins to be evidence of an alternative Spo0 pathway architecture in symbiotic or syntrophic strains.