Evolution of a family of molecular Rube Goldberg contraptions

Case studies of the evolution of molecular machines remain scarce. One of the most diverse and widespread homologous families of machines is the type IV filament (TFF) superfamily, comprised of type IV pili, type II secretion systems (T2SSs), archaella, and other less-well-characterized families. These families have functions including twitching motility, effector export, rotary propulsion, nutrient uptake, DNA uptake, and even electrical conductance, but it is unclear how such diversity evolved from a common ancestor. In this issue, Denise and colleagues take a significant step toward understanding evolution of the TFF superfamily by determining a global phylogeny and using it to infer an evolutionary pathway. Results reveal that the superfamily predates the divergence of Bacteria and Archaea, and show how duplications, acquisitions, and losses coincide with changes in function. Surprises include that tight adherence (Tad) pili were horizontally acquired from Archaea and that T2SSs were relatively recently repurposed from type IV pili. Results also enable better understanding of the function of the ATPase family that powers the superfamily. The study highlights the role of tinkering by exaptation—the repurposing of pre-existing functions for new roles—in the diversification of molecular machines.

How life innovates has long fascinated humankind. Central to evolutionary innovation is exaptation, in which existing features are co-opted for new functions [1]. For example, avian flight evolved by exaptation of feathers for lift (feathers originated as insulators) [2], flexible wrists for wing folding (such wrists preceded flight) [3], and hollow bones for reduced weight (air pockets evolved with other functions and did not originally enter bones) [4], leading to an integrated flight system. In turn, such exaptations push the range of subsequent innovations into the "adjacent possible" [5]; in the case of flight, feathers were further exapted to become steering devices [6]. Thus, even with "nothing new under the sun," evolution repurposes or combines previously disparate elements to create new functions.
Exaptation is also responsible for evolutionary innovations in the molecular machines of life. For example, injectisomes (nanoscale syringes used by Bacteria to pump toxins into our cells, causing diseases including food poisoning, chlamydia, and plague) have exapted the assembly process of bacterial flagella to secrete toxins [7,8]. Another particularly rich family of inter-related molecular machines is the type IV filament (TFF) superfamily, whose members function as grappling hooks for motility, lassos to capture DNA, straps to harness Bacteria together, propellers for swimming, pumps to export toxins, and siphons to import nutrients [9]. Such astonishing functional diversity in a homologous family of machines highlights the role of molecular exaptation, but how this diversity arose from a core machinery has been unclear, mainly due to the lack of an evolutionary framework to contextualize what is known.
The TFF superfamily is composed of a portfolio of molecular machines with diverse functions. First to be discovered were the namesake type IV pili, which extrude long, flexible filaments from the cell surface [9]. Type IV pili were historically subclassified based on differences in sequence lengths [10] into type IVa pili (T4aP), which can be sticky along their length for cell adhesion, retract as grappling hooks for jerky twitching motility [9], and bind DNA for uptake; less widespread type IVb pili (T4bP), associated with adhesion and microcolony formation during enteropathogenesis, including the toxin-coregulated pilus (Tcp) in Vibrio cholerae [11] and bundle-forming pilus in enteropathogenic Escherichia coli [12]; and, distinct from T4aP and T4bP, the tight adherence (Tad) pili, for cell adhesion [13]. More exotic roles include electrical conduction [14] and phage reception [15] by T4aP. Subsequently, type II secretion systems (T2SSs) were discovered to be TFF superfamily members. T2SSs push folded proteins through an outer membrane portal using a short dynamic pseudopilus related to other TFF pili (the distinction is functional: overexpression of pseudopilus components results in hyperpseudopili that resemble other TFF pili [16]). T2SS substrates can be enzymes or pathogenesis-associated effectors. Other curious TFF variants include archaella, which rotate helical pili as propellers despite being unrelated to bacterial flagella or eukaryotic cilia [17]; archaeal bindosomes for sugar uptake that may make pseudopili [18]; competence (Com) pili in monoderm Bacteria that bind DNA and retract for DNA uptake [9]; archaeal UV-induced pili (Ups) for DNA exchange and repair [9]; archaeal EppA dependent (Epd) pili [19]; and mannose-sensitive hemagglutinin (MSH) pili for surface attachment by pathogens.
Perhaps unsurprisingly for such a diverse family, nomenclature may confuse newcomers. Type IV pili are not to be confused with other unrelated, convergently evolved pili [20], or similarly named type IV secretion systems (T4SSs) that transfer DNA and deliver effectors into eukaryotic cells, which is particularly confusing given that the T4SS has a pilus called the T4SS-pilus. Similarly, T2SSs are not to be confused with type II pili. TFF pilins and pseudopilins contain a Class III signal peptide, not to be confused with the type III secretion systems at the core of the export apparatuses of injectisomes and flagella; rather, Class III signal peptides distinguish the N-terminal signal peptide in pilins from Class I signal peptides, which direct proteins for general secretion pathway (Sec) transport, and Class II signal peptides, which are lipoprotein sorting sequences. Furthermore, archaellum genes have recently been renamed from fla to arl [21] (here, I append the old names in parentheses), and homologs between different TFF families do not necessarily have corresponding naming systems; indeed, even homologs from the same family may have different names in different organisms.
Despite their functional differences, all TFFs have a conserved four-protein core that extrudes pili (or pseudopili) from the cell surface, supplemented by additional, lineage-specific proteins. The pilus is made of proteins called pilins, whose N-terminal Class III signal peptide is composed of a short hydrophilic sequence ending with a conserved glycine, followed by a 21-amino acid hydrophobic helix [9,19]. Major pilins are named PilA or PilE in T4aP, GspG in T2SSs, and ArlB (FlaB) in archaella. Pilins are translated into the membrane, where the second conserved component, a prepilin peptidase, cleaves the signal peptide after the glycine. This leaves the pilin as a transmembrane protein with an extracellular domain but no cytoplasmic residues [22]. The third conserved TFF component is a membrane platform protein made of three transmembrane domains that incorporates or removes pilins at the base of the pilus. Different techniques have all suggested that this protein dimerizes [23-26]; it is called PilC or PilG in T4aP, GspF in T2SSs [27], and ArlJ (FlaJ) in archaella. The fourth conserved component is a family of paralogous ATPases from the AAA+ ATPase superfamily that bind the cytoplasmic face of the membrane platform. Three variants are found: a universal "extension" ATPase that powers insertion of pilins from the membrane into extending pili (PilB or PilF in T4aP, GspE in T2SSs, ArlI [FlaI] in archaella) and two ATPases (PilT and less widespread PilU) that power removal of pilins from retracting pili, returning them to the membrane pool. Different conformations in the extension and retraction ATPases from T4aP suggest that the extension ATPase pushes against the membrane platform to incorporate pilins, while the retraction ATPase conversely pulls pilins out of the pilus [28]. The functions of these ATPases may, however, be more complex than first thought, as discussed below.
How, though, did the TFF superfamily evolve? This week in PLOS Biology, Denise, Abby, and Rocha [29] describe a global study of the phylogeny of the TFF superfamily to understand just this. Their study builds on earlier work that analyzed type IV pili and T2SSs [30], reviewed and surveyed homologies across species [9], and determined a phylogeny of archaeal TFF superfamily members [31]. The authors employed an elegant semiautomated approach to identify TFF systems that handles the complexity of duplicated, absent, substituted, or distantly diverged components using their previously developed Macromolecular System Finder, or MacSyFinder [32]. MacSyFinder facilitates context-based insights not possible from sequence alone, building on previous studies on other microbial systems [29,33]. The approach identifies mandatory or accessory proteins with hidden Markov models, then applies rules based on patterns of occurrence and genomic co-localization to identify TFF families.
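The two-stage logic described above (first assign proteins to families, then apply occurrence and co-localization rules) can be illustrated with a minimal Python sketch. This is not MacSyFinder's code or API: the `Hit` record, the `find_systems` function, the family labels, and the `max_gap` and `min_mandatory` thresholds are all hypothetical, and the profile-search step is assumed to have already produced the hits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    position: int  # rank of the gene along the replicon (hypothetical input)
    family: str    # protein family assigned by a profile search, e.g. an HMM


def find_systems(hits, mandatory, accessory, max_gap=5, min_mandatory=3):
    """Cluster co-localized hits, then keep clusters with enough mandatory families."""
    known = mandatory | accessory
    relevant = sorted((h for h in hits if h.family in known),
                      key=lambda h: h.position)

    # Co-localization rule: start a new cluster when the gap to the
    # previous relevant gene exceeds max_gap positions.
    clusters, current = [], []
    for hit in relevant:
        if current and hit.position - current[-1].position > max_gap:
            clusters.append(current)
            current = []
        current.append(hit)
    if current:
        clusters.append(current)

    # Occurrence rule: require a minimum number of distinct mandatory families.
    return [c for c in clusters
            if len({h.family for h in c} & mandatory) >= min_mandatory]


# Toy replicon: one compact locus plus two isolated homologs.
hits = [Hit(10, "ATPase"), Hit(11, "platform"), Hit(12, "pilin"),
        Hit(14, "peptidase"), Hit(200, "ATPase"), Hit(450, "pilin")]
mandatory = {"ATPase", "platform", "pilin", "peptidase"}
accessory = {"secretin"}

systems = find_systems(hits, mandatory, accessory)
print([[h.position for h in s] for s in systems])  # [[10, 11, 12, 14]]
```

In this toy case the isolated ATPase and pilin homologs are discarded because neither sits in a cluster with enough mandatory families, which is the kind of context-based decision that sequence similarity alone cannot make.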
The authors used this approach to identify TFF protein components across Bacteria and Archaea. The authors first consolidated protein families by identifying difficult-to-detect interfamily relationships. Using their global data set, they found that the ATPase and membrane platform together produced robust phylogenies of the core system, while the major pilin and the prepilin peptidases were less useful. They then used the related non-TFF AAA+ ATPase FtsK to determine the root of the ATPase phylogeny. FtsK is particularly suitable as it is only ever present in a single copy and is never horizontally transferred. Finally, the authors produced a rooted phylogeny of the entire TFF superfamily by transferring the ATPase root to a phylogeny estimated from a concatenation of the ATPase and membrane platform protein sequences.
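The concatenation step can be pictured with a short Python sketch. It assumes each marker has already been aligned separately and that a system missing one marker is padded with gap characters; the function name `concatenate_alignments`, the system identifiers, and the toy sequences are hypothetical and are not taken from the study.

```python
def concatenate_alignments(alignments):
    """Join per-marker alignments into one supermatrix keyed by system id.

    alignments: list of dicts mapping system id -> aligned sequence.
    Systems absent from a marker are padded with '-' (an assumption).
    """
    system_ids = sorted(set().union(*alignments))
    supermatrix = {sid: "" for sid in system_ids}
    for alignment in alignments:
        widths = {len(seq) for seq in alignment.values()}
        if len(widths) != 1:
            raise ValueError("all sequences in an alignment must have equal length")
        width = widths.pop()
        for sid in system_ids:
            supermatrix[sid] += alignment.get(sid, "-" * width)
    return supermatrix


# Toy aligned fragments for two markers (hypothetical data).
atpase = {"sysA": "MK-L", "sysB": "MKAL", "sysC": "MRAL"}
platform = {"sysA": "GIV", "sysB": "G-V"}

supermatrix = concatenate_alignments([atpase, platform])
for sid, seq in supermatrix.items():
    print(sid, seq)
# sysA MK-LGIV
# sysB MKALG-V
# sysC MRAL---
```

A tree estimated from such a supermatrix has no root of its own, which is why the root position is borrowed from the ATPase tree, where the FtsK outgroup is available.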
The results are a fascinating bird's-eye view of the entire TFF superfamily, enabling the authors to propose how its diversity evolved (Fig 1). One conclusion is immediately clear: the superfamily splits into two broad clades representing Bacteria and Archaea, indicating that TFF are ancient and were present in the last universal common ancestor (LUCA). This suggests that the progenitor TFF superfamily member function was to take up DNA, consistent with horizontal gene transfer being commonplace at the dawn of cellular life [34].
The archaeal TFF branch agrees with a recent study by Makarova and colleagues [31], although it unearths a surprising new discovery (Fig 1). After small basal clades of unknown function, the archaeal branch divides into two major clades: one featuring rotary archaella, adhesive Halobacteria pili, archaeal adhesive pilus (Aap) pili, sugar-uptake bindosomes, UV-inducible Ups pili, and as-yet-uncharacterized TFF systems; and a second clade clustering Epd pili [19] with, unexpectedly, bacterial Tad pili. This unanticipated finding is supported by congruency of phylogenies of individual components; shared fission of their membrane platform protein into two fragments; the same accessory components; short pilins; and similar genetic organization, making it clear that Tad pili, until now thought to be bacterial, were horizontally transferred from the Archaea. The transfer of a proto-Tad system was likely to a diderm Bacterium, as it coincided with recruitment of a secretin as a portal through the outer membrane; secretins are not required by Archaea because they lack an outer membrane. The secretin came from a pre-existing bacterial T4aP, suggesting that most bacterial TFF families were already present at the time of this transfer. It is satisfying to find that Tad pili are found almost exclusively in a single locus, facilitating their horizontal transfer as a single unit multiple times, explaining their widespread occurrence across Bacteria.
Fig 1. Schematic of the TFF superfamily phylogeny from [29], capturing salient branch topologies and selected protein acquisition and loss events anticipated by parsimony. The upper major clade represents archaeal TFF superfamily members; the lower major clade represents Bacteria. The phylogeny suggests that the TFF superfamily was present in the LUCA. Archaea and Bacteria evolved distinct TFF families based on nuanced changes in core and accessory components, and Tad pili were a horizontal transfer from Archaea to Bacteria. Black hexagons represent archaeal S-layer; green mesh represents bacterial peptidoglycan. Aap, archaeal adhesive pilus; Com, competence; Epd, EppA dependent; LUCA, last universal common ancestor; MSH, mannose-sensitive hemagglutinin; Tad, tight adherence; TFF, type IV filament; T2SS, type II secretion system; T4aP, type IVa pilus; T4bP, type IVb pilus; Ups, UV-inducible pilus. https://doi.org/10.1371/journal.pbio.3000405.g001
The bacterial TFF branch shows a ladder of branching TFF families (Fig 1). Unlike the archaeal branch, the ancestral bacterial ATPase duplicated to produce a retraction ATPase at the root of the bacterial tree, indicating that the original bacterial TFFs were retractile. A secretin was also acquired at this stage, consistent with a diderm last bacterial common ancestor [35]. The T4bP branched first, confirming that T4aP and T4bP are genuinely distinct families and that the "type IV pili" grouping is paraphyletic. The monoderm-associated Com pili diverged next, with corresponding loss of their secretin. This left the T4aP, from which diverged the MSH pili and the T2SSs. Intriguingly, a so-called T2SS from Cytophaga branched from the T4aP and not from other T2SSs. Furthermore, the so-called T2SS from Chlamydia is chimaeric. These results distinguish these two "T2SSs" from true T2SSs.
These unexpected results suggest that secretion systems evolved within the TFF superfamily multiple times, and are relatively modern innovations. Naively, it has been thought that type IV pili evolved from "simpler" T2SSs. Instead, T2SSs branch relatively recently from the T4aP. Indeed, that the Cytophaga and Chlamydia "T2SSs" are not bona fide T2SSs suggests that T4aP were exapted to form T2SS-esque systems multiple times. Whether there is something particularly suitable about T4aP for such exaptation is unclear, and discovery of other independent T2SS-esque systems in the future will help better understand this. The finding echoes the independent emergence of secretion systems from bacterial flagella at least five times, with injectisomes, a secretion system in Buchnera aphidicola, and secreting flagella in Bacillus, Campylobacter, and Yersinia. Indeed, it was also first thought that flagella evolved from "simpler" injectisomes, when the opposite is actually true. That T2SSs and injectisomes evolved relatively recently from T4aP and flagella, respectively, may be due to host-associated lifestyles emerging only relatively recently in Bacteria as Eukaryotes became potential niches.
One of the most significant contributions of the study is that it helps make sense of the function of the ATPases. Complements of ATPases vary in different bacterial TFF families. Although it was thought that adhesive pili lack retraction ATPases because they have no need to retract, sufficient evidence has accumulated to take seriously the idea that "extension" ATPases can also drive retraction in these pili. Denise and colleagues' study makes it clear that these reports are distributed across the breadth of the TFF superfamily, implying that both extension and retraction by a single ATPase was an ancestral trait: Tad pili lacking a retraction ATPase can retract powered by their sole ATPase, albeit with lower force, approximately one sixth that exerted by T4aP [9,36,37]; T4bP are involved in twitching motility despite lacking a retraction ATPase [38]; Ups and Com pili facilitate DNA transfer and retract despite lacking a retraction ATPase [9,39]; and T4aP retract even after deletion of their retraction ATPase, albeit with lower force and speed [40]. This suggests that a bifunctional ancestral ATPase duplicated to form two ATPases, which sub-functionalized into dedicated extension and retraction ATPases. Intriguingly, TFF ATPases may form heterohexamers [41], which would eliminate the need to swap extension and retraction homohexamers. Sub-functionalization would then enable dedicated ATPases to optimize their individual functions. Duplication of an ancestral bifunctional ATPase protomer would by default lead to assembly of heterohexamers until mutations abolished heteromer binding interfaces. This could also explain how the secondary retraction ATPase, PilU, evolved. PilU has been implicated in producing higher retraction forces [42,43], which may correspond to the high force of T4aP retraction. Indeed, PilU cannot retract without PilT, further supporting that mixed-ability heterohexamers form [44]. Nevertheless, because the ATPases are capable of both extension and retraction, loss of the dedicated retraction ATPase could be mitigated by regain of the retraction function by the extension ATPase, as seen in various branches of the TFF superfamily that have lost their retraction ATPases. What controls switching between extension or retraction activity in a heterohexamer remains to be established.
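The intuition that a freshly duplicated protomer would "by default" end up in heterohexamers can be made concrete with a small worked calculation in Python. It rests on simplifying assumptions that are not from the study: the two paralogs are interchangeable at every binding interface, each of the six ring positions is filled independently, and `p` is the fraction of the protomer pool contributed by one paralog. The function name `homohexamer_fraction` is hypothetical.

```python
from fractions import Fraction


def homohexamer_fraction(p, size=6):
    """Chance that a randomly assembled ring contains only one of two paralogs.

    p: fraction of the protomer pool made up of the first paralog (assumed).
    size: number of protomers per ring (six for a hexamer).
    """
    return p**size + (1 - p)**size


# Equal expression of the two copies immediately after duplication (assumed).
equal_mix = homohexamer_fraction(Fraction(1, 2))
print(equal_mix)      # 1/32
print(1 - equal_mix)  # 31/32
```

Under these assumptions roughly 97% of rings are mixed, so heterohexamers are the expected starting state, and dedicated homohexamers would require later changes such as mutations at the heteromer interfaces.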
The field is now poised to probe how molecular exaptations were implemented. Denise and colleagues' study necessarily focuses on core components, and it will be interesting to now focus on changes in noncore components and how they contributed to exaptations of core protein functions. For example, what alterations enabled the sole archaellar ATPase, ArlI (FlaI), to power both assembly and rotation of archaellar filaments? ArlI drives both 36° and 60° rotations of the archaellum [45], suggesting a link to its hexameric structure, which may have been exapted to drive rotation. Was this putative exaptation intrinsic to changes in ArlI, or did ArlI's rotary function simply require extrinsic changes in the noncore components, such as recruitment of coupling proteins ArlH (FlaH), ArlX (FlaX), or ArlCDE (FlaCDE); or anchoring by proposed stator complex ArlF (FlaF) [46]? Similarly, how did T2SSs develop the ability to secrete effectors using an exapted pseudopilus [47]? More comparative genomics, more structures, and more biophysical studies will be needed to better understand such transitions.
The study also lays groundwork to understand how noncore components are recruited during evolution. What were the origins of newly recruited proteins and what, if anything, poised them for recruitment? Future studies on the origins of secretins and factors that make them predisposed to co-option may be particularly insightful here. Secretins are modular and easily horizontally acquired. Bacterial TFFs first recruited a secretin and later donated it to the incoming Tad system from the Archaea, injectisomes have recruited secretins multiple times, and secretins are also found in filamentous phages [48].
Another ambitious future goal will be to understand the origins of the TFFs. The only core component with clear non-TFF homologs is the AAA+ ATPase family. The AAA+ ATPases are an incredibly diverse family [49] that tend to form hexamers that undergo cyclical conformational changes with ATP hydrolysis. These conformational changes hint that a rotary function was always present for exaptation for extension, retraction, or rotation. Whether speculative homologies of the membrane platform protein to ABC transporters or ATP synthase components are correct will require structural verification [29].
Denise and colleagues highlight that at the heart of evolution of the TFF superfamily has been evolutionary tinkering using the tools at hand: no radically new mechanisms or components have been introduced to enable the functional diversity of the TFF. Rather, pre-existing functions of pre-existing components have been exapted to shift the function of existing machinery. Perhaps unsurprisingly, evolution the tinkerer has built Rube Goldberg (or, east of the Atlantic, Heath Robinson) contraptions [50]: the piston-pumping T2SS, the whirling archaellum, the sugar-siphoning bindosome, and other devices, using only the material at hand, namely an ancestral, DNA-lassoing type IV pilus.