Long-Term Persistence of Bi-functionality Contributes to the Robustness of Microbial Life through Exaptation

Modern enzymes are highly optimized biocatalysts that process their substrates with extreme efficiency. Many enzymes catalyze more than one reaction; however, the persistence of such ambiguities, their consequences and evolutionary causes are largely unknown. As a paradigmatic case, we study the history of bi-functionality for a time span of approximately two billion years for the sugar isomerase HisA from histidine biosynthesis. To look back in time, we computationally reconstructed and experimentally characterized three HisA predecessors. We show that these ancient enzymes catalyze not only the HisA reaction but also the isomerization of a similar substrate, which is commonly processed by the isomerase TrpF in tryptophan biosynthesis. Moreover, we found that three modern-day HisA enzymes from Proteobacteria and Thermotogae also possess low TrpF activity. We conclude that this bi-functionality was conserved for at least two billion years, most likely without any evolutionary pressure. Although not actively selected for, this trait can become advantageous in the case of a gene loss. Such exaptation is exemplified by the Actinobacteria that have lost the trpF gene but possess the bi-functional HisA homolog PriA, which adopts the roles of both HisA and TrpF. Our findings demonstrate that bi-functionality can perpetuate in the absence of selection for very long time-spans.


Introduction
Enzymes are remarkably specific catalysts, and this characteristic led to the traditional view of "one enzyme, one substrate, one reaction", which assumes an evolutionary preference for mono-functionality. However, it is now clear that enzymes can catalyze reactions other than those for which they evolved; see [1] and references therein. Prominent examples of multi-functional enzymes are glutathione S-transferases and cytochrome P450s, which can process several different substrates [1]. However, multi-functional enzymes may cause metabolic conflicts if they are involved in different, possibly independent, metabolic pathways [2]. Along these lines, multi-functionality seems to be of no immediate advantage for an organism, which argues against a positive selection of this trait. Presumably, neutral drift causes the broadening or narrowing of reaction specificity (see [1] and references therein); however, it is unclear whether multi-functionality is a short-term or a long-term trait.
Some evolutionary innovations originate non-adaptively as exaptations [3], i.e. as by-products of other, positively selected traits. These features were not built by natural selection for their current role. For example, feathers evolved for temperature regulation prior to their function in flight [3], and the light-refracting lens crystallins stem from enzymes [4]. In silico analyses suggest that exaptation is an important means of evolutionary innovation for metabolic systems [5]. The contribution of exaptation to evolutionary processes would be of even greater importance if such traits existed over a long evolutionary time-span. In order to address this issue, we traced the bi-functionality of a key metabolic enzyme over two billion years.
A detailed tracing of HisA bi-functionality required an analysis in two dimensions: a survey of PriA-like characteristics in modern HisA homologs, and a retrospective analysis of ancestors related to bacterial speciation. To begin with, we used in silico analyses and in vitro characterization of extant HisA enzymes and found that PriA-like bi-functionality is not strictly limited to Actinobacteria. Furthermore, we reconstructed in silico the sequences of the HisA/PriA ancestors of all Actinobacteria, all Proteobacteria, and all Bacteria, and tested the resulting precursor proteins for their ProFAR and PRA isomerase activities. Our results show that all three reconstructed ancestral enzymes are bi-functional in vitro and in vivo. Thus, our findings provide an example of an enzyme whose bi-functionality persisted for two billion years of evolution, most likely without obvious, immediate benefit, except for exaptation.

Occurrence and functional characterization of extant HisA and PriA enzymes
The existence of the bi-functional PriA enzyme was originally described for two actinobacterial species, namely Streptomyces coelicolor and Mycobacterium tuberculosis [8]. In order to determine the distribution of PriA-like enzymes within all bacterial phyla, we computed a sequence similarity network (SSN) of the HisA/PriA superfamily (Fig 2). In an SSN, nodes represent individual sequences and edges correspond to statistically significant similarities deduced from pairwise alignments, calculated by BLAST [11]. Our analysis showed that hisA genes are present in all major phylogenetic groups (Fig 2A) and that the occurrence of annotated priA genes is indeed restricted to the Actinobacteria (Fig 2B, top right cluster). The mean sequence identity in the Actinobacteria cluster is 52±9%; it can thus be assumed that all these sequences correspond to PriA enzymes.
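The SSN idea described above can be illustrated with a minimal sketch. It is not the Enzyme Function Initiative pipeline used in this work; the function names, sequence identifiers, and E-values below are hypothetical and serve only to show how an E-value cut-off turns pairwise hits into edges and how clusters emerge as connected components.

```python
def build_ssn(hits, evalue_cutoff=1e-54):
    """Build an undirected network from pairwise hits.

    hits: iterable of (seq_a, seq_b, evalue) tuples (hypothetical toy input).
    An edge is kept only if its E-value is at or below the cut-off.
    """
    graph = {}
    for seq_a, seq_b, evalue in hits:
        # Every sequence becomes a node, even if none of its hits pass.
        graph.setdefault(seq_a, set())
        graph.setdefault(seq_b, set())
        if seq_a != seq_b and evalue <= evalue_cutoff:
            graph[seq_a].add(seq_b)
            graph[seq_b].add(seq_a)
    return graph


def clusters(graph):
    """Return the connected components of the network, largest first."""
    seen, result = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        result.append(component)
    return sorted(result, key=len, reverse=True)


# Toy data: identifiers and E-values are invented for illustration.
hits = [
    ("hisA_1", "hisA_2", 1e-80),
    ("hisA_2", "hisA_3", 1e-60),
    ("priA_1", "priA_2", 1e-90),
    ("hisA_3", "priA_1", 1e-30),  # too weak to form an edge at 1e-54
]
groups = clusters(build_ssn(hits))
```

With a more permissive cut-off the two toy clusters merge, which mirrors how the choice of threshold controls the separation of clusters in a network.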
The ability of PriA to catalyze both the HisA and the TrpF reaction requires that its active site can bind the two respective substrates in a productive conformation. As is evident from the crystal structure of PriA from M. tuberculosis (mtPriA) [9], both substrates are bound in the same active site pocket (Fig 3). In contrast, the majority of HisA sequences deviate from the PriA-typical motif D-R-E-D-R-G-W-D (Fig 3C, Actinobacteria sequence logo) in 2-3 residues, mainly at positions 109 and 143 (Fig 3C, remaining sequence logos). Surprisingly, however, the PriA-typical motif is present in some HisA enzymes from Bacteroidetes (1 representative corresponding to 0.4% of all Bacteroidetes sequences), Euryarchaeota (6 / 5.1%), Firmicutes (25 / 8.9%), and Proteobacteria (43 / 4.9%). Moreover, the PriA-typical motif is also found in HisA from Thermotoga maritima (tmHisA), except that Lys is present at position 143 instead of the PriA-typical Arg.
In order to test whether the presence of the PriA-typical active site sequence motif in HisA enzymes leads to TrpF activity, tmHisA and two HisA enzymes from Proteobacteria (Pelobacter carbinolicus, pcHisA; Desulfovibrio desulfuricans, ddHisA) were produced by heterologous gene expression in Escherichia coli. The recombinant proteins were purified and characterized by steady-state kinetics with respect to their ProFAR and PRA isomerization activities. Compared to PriA from S. coelicolor (scPriA) and M. tuberculosis (mtPriA), the catalytic efficiencies k_cat/K_M^ProFAR of tmHisA, ddHisA, and pcHisA are about tenfold higher (Table 1, HisA reaction). They are comparable to the catalytic efficiency k_cat/K_M^ProFAR of HisA from Salmonella enterica (seHisA), which is considered to be an archetypical representative of the HisA family [12]. Strikingly, tmHisA, ddHisA, and pcHisA also displayed TrpF activity, which has not been shown before. However, their catalytic efficiencies k_cat/K_M^PRA are lower by about 10^5- to 10^6-fold compared to scPriA and mtPriA (Table 1, TrpF reaction).
In vivo complementation experiments showed that tmHisA, ddHisA, and pcHisA were able to rescue the growth deficiency of an E. coli ΔhisA strain. Moreover, despite their weak in vitro TrpF activity, they were also able to complement a ΔtrpF strain (Table 2). The enzymes were further able to complement a ΔhisAΔtrpF double deletion strain (Table 2), whereby the time required for complementation is clearly limited by their TrpF activity.

Reconstruction of ancient sequences
We next asked whether the bi-functionality of HisA is an ancient feature that has been conserved in certain extant enzymes. To this end, we computationally reconstructed three HisA precursors as described in the following. It has been shown that concatenating related sequences increases the strength of the phylogenetic signal available for tree construction [14]. Thus, we concatenated species-wise HisA with HisH and HisF sequences. The respective genes were most likely part of the LUCA genome [7] and have remained elements of the histidine

Table 1. Steady-state kinetic parameters of extant PriA and HisA enzymes, and reconstructed HisA ancestors.
Columns: Enzyme; HisA reaction; TrpF reaction.
2 Data taken from [9].
3 Unlike in previous work [13], tmHisA (as well as ddHisA and pcHisA) showed measurable albeit very low TrpF activity. Although the exact reasons for this discrepancy are unknown, it may be due to differences in enzyme purification and handling.
4 Data taken from [12]; n. d.: values were not determined.

The resulting MSA_HisFAH comprised 103 concatenations (species names listed in S1 Table). After preprocessing this input, a phylogenetic tree was determined and assessed by means of PhyloBayes v3.3 [15]. Four independent MCMC samplings of length 50,000 were computed using pb and compared to ensure convergence. Several parameters confirmed the validity of our approach: Convergence and mixing were checked by means of the discrepancy index maxdiff; for the pairwise comparison of all chains, the maxdiff value was at most 0.06. The effective size was at least 100, as determined by means of tracecomp. A consensus tree was deduced from the concatenation of these four chains (S1 Fig). The posterior probability of edges interlinking ancestors of phyla or classes was at least 0.87, which testifies to the high quality of the tree.
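The meaning of the discrepancy index can be illustrated with a short sketch. It assumes that maxdiff is the largest difference, over all bipartitions, between the posterior frequencies observed in the compared chains; it is not the PhyloBayes implementation, and the chains, taxon labels, and frequencies below are invented.

```python
from itertools import combinations


def maxdiff(chain_frequencies):
    """Largest spread in bipartition frequency across the given chains.

    chain_frequencies: list of dicts, one per chain, mapping a bipartition
    (a frozenset of taxon names on one side of the split) to the fraction
    of sampled trees that contain it. A missing bipartition counts as 0.
    """
    all_splits = set().union(*chain_frequencies)
    worst = 0.0
    for split in all_splits:
        freqs = [chain.get(split, 0.0) for chain in chain_frequencies]
        worst = max(worst, max(freqs) - min(freqs))
    return worst


# Hypothetical bipartitions and frequencies for three toy chains.
ab = frozenset({"taxonA", "taxonB"})
abc = frozenset({"taxonA", "taxonB", "taxonC"})
cd = frozenset({"taxonC", "taxonD"})
chains = [
    {ab: 1.00, abc: 0.90},
    {ab: 0.98, abc: 0.86},
    {ab: 1.00, abc: 0.88, cd: 0.03},
]

# Pairwise comparison of all chains, as described in the text.
pairwise = {
    (i, j): maxdiff([chains[i], chains[j]])
    for i, j in combinations(range(len(chains)), 2)
}
worst_pair = max(pairwise.values())
```

In this toy example the largest pairwise value is 0.04, which would lie below the maximum of 0.06 reported above for the real chains.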
This tree and the corresponding MSA_HisFAH were used to deduce a predecessor of the actinobacterial enzymes (CA-Act-HisA) by means of FASTML [16]. In order to exclude any effect of the 22 actinobacterial sequences (and especially their active site motif) on the reconstruction of more ancient predecessors of HisA, these sequences were removed from MSA_HisFAH. The resulting MSA_HisFAH-Act, which contained the remaining 81 non-actinobacterial sequences, was used to calculate a second tree (S2 Fig). Applying FASTML, the sequences of the common ancestors of Proteobacteria (CA-Prot-HisA) and of Bacteria (CA-Bact-HisA) were determined. A schematic representation of the two trees is given in Fig 4. The archaeal sequences served as an outgroup in both reconstructions.

Experimental assessment of HisA precursors
The genes coding for the three precursors were synthesized and heterologously expressed in E. coli. The recombinant proteins were soluble and stable, and could be purified. Steady-state kinetic analysis yielded k_cat/K_M^ProFAR values in the order of 10^2-10^5 M^-1 s^-1 for the HisA reaction, and k_cat/K_M^PRA values in the order of 10^2-10^3 M^-1 s^-1 for the TrpF reaction (Table 1). Compared to scPriA and mtPriA, the catalytic efficiency of the ancestral proteins for the TrpF reaction is therefore only two to three orders of magnitude lower. For all three proteins this is the result of a lower k_cat value; the K_M^PRA is practically identical to that of scPriA. Furthermore, all three precursors were able to complement the growth deficiencies of ΔhisA and ΔtrpF strains (Table 2). The time required for in vivo complementation agrees well with the k_cat/K_M values determined from in vitro measurements. For example, CA-Bact-HisA and CA-Prot-HisA have the highest k_cat/K_M^ProFAR values and required the least time to complement the ΔhisA strain. CA-Act-HisA has the highest k_cat/K_M^PRA value and required the least time to complement the ΔtrpF strain. All three HisA ancestors were further able to complement the growth deficiency of a ΔhisAΔtrpF double deletion strain (Table 2). The observed complementation times agree well with those determined from the single deletion strains. The complementation by CA-Act-HisA is limited by its ability to compensate for the missing HisA reaction, whereas complementation by CA-Prot-HisA and CA-Bact-HisA is limited by their ability to catalyze the missing TrpF reaction.
The active site sequence motif of CA-Act-HisA is identical to that of modern PriA enzymes. The motifs of CA-Prot-HisA and CA-Bact-HisA match in six of the eight residues. The first non-matching position is 109, which contains a Lys instead of a Glu. At the second non-matching position, 143, both precursors contain a Lys instead of an Arg. It is therefore plausible to assume that a basic residue at position 143 is crucial for bi-functionality. In contrast, the recently published SGG sequence motif of PriA [17] seems not to be required for bi-functionality. Only the immediate actinobacterial precursor CA-Act-HisA contains the SGG motif, whereas both other precursors displayed significant bi-functionality despite containing a GGG motif.

Discussion
In contrast to previous results [18], the reconstructed CA-Prot-HisA and CA-Bact-HisA are to our knowledge the first examples of ancestral metabolic enzymes from approximately 2.5 to 2.0 billion years ago [19] that were shown to be bi-functional. This trait is even more interesting when one considers that only extant HisA sequences but no extant PriA sequences were selected to reconstruct the CA-Prot-HisA and CA-Bact-HisA predecessors.
Strikingly, we also detected bi-functionality in the modern tmHisA, pcHisA, and ddHisA and thus provide the first examples of HisA/TrpF bi-functionality in extant HisA enzymes. It is worth noting that these three species all contain a trpF gene, which suggests that no selective pressure exists for these species to maintain the bi-functionality in HisA. Moreover, the in vivo complementation experiments show that tmTrpF is functional and is able to rescue an E. coli ΔtrpF strain (Table 2). Also, the bi-functionality of these modern HisA enzymes does not force their hosts to face functional trade-offs because K_M^PRA values are 10- to 170-fold higher than K_M^ProFAR values. Thus the obligate HisA activity of these enzymes is most likely not impaired by the binding of PRA or CdRP. Moreover, the catalytic efficiencies k_cat/K_M^PRA are in a physiologically irrelevant range below 14 M^-1 s^-1, thus making TrpF side-activity tolerable. Along these lines, the CA-Bact-HisA predecessor most likely evolved in a similar way, such that the remaining TrpF side-activity was physiologically not harmful.
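The trade-off argument can be made explicit with the textbook rate laws for one enzyme acting on two alternative substrates that compete for the same active site. This is a sketch under the assumption that both reactions follow simple Michaelis-Menten kinetics; the symbols A (ProFAR), B (PRA), and the total enzyme concentration [E]_0 are introduced here for illustration and do not appear in the original analysis.

```latex
% Rate of the HisA reaction (substrate A) in the presence of the
% alternative substrate B, which acts like a competitive inhibitor
% with an inhibition constant equal to its own K_M:
v_A = \frac{k_{cat}^{A}\,[E]_0\,[A]}
           {K_M^{A}\left(1 + \dfrac{[B]}{K_M^{B}}\right) + [A]}

% Ratio of the two fluxes through the shared active site:
\frac{v_A}{v_B} =
  \frac{\left(k_{cat}^{A}/K_M^{A}\right)[A]}
       {\left(k_{cat}^{B}/K_M^{B}\right)[B]}
```

Under these assumptions, a large K_M^B keeps the term [B]/K_M^B small, so the apparent K_M for ProFAR is barely raised, and a very low k_cat/K_M for PRA means that almost all flux goes through the HisA reaction.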
Our results do not allow us to decide whether all modern HisA enzymes are bi-functional: We have performed in vivo complementation experiments with four additional HisA enzymes from Bacteroidetes, Firmicutes, Proteobacteria, and Euryarchaeota lacking the PriA-typical sequence motif. These enzymes were unable to rescue ΔtrpF or ΔhisAΔtrpF deletion strains within eight days. Nevertheless, extremely slow growing colonies were observed occasionally. This growth may be due to residual TrpF activity of inherent E. coli enzymes like PurF [20] and may therefore indicate the existence of additional routes of exaptation. The active site motifs (Fig 3) suggest that bi-functionality is determined by Glu 109 and Arg 143. HisA homologs that retained bi-functionality have conserved the PriA-typical residues at these two positions, despite a relatively low overall sequence identity. As this bi-functionality seems to be neither beneficial nor harmful for an organism, we assume that its presence is simply a matter of historical contingency. This conclusion is in agreement with the finding that a few mutations acquired within no more than several thousand generations were sufficient to transform a bi-functional HisA variant from S. enterica into a specialized HisA enzyme lacking TrpF activity or vice versa [21]. Along these lines, the bi-functional PriA became a mono-functional HisA enzyme in the Corynebacteria, a distinct genus within the Actinobacteria. This re-narrowing of substrate specificity in the so-called subHisA occurred after the horizontal acquisition of a whole-pathway tryptophan operon (including a trpF gene) from a member of the γ-Proteobacteria [22]. Again, this transition from a bi-functional PriA to a mono-functional HisA enzyme required only subtle sequence alterations [17]. Noteworthy is a change from Arg 143 to an Asn, which supports the important role of Arg 143 for bi-functionality. Thus, mono-functionality of HisA is easily accessible under evolutionary constraints. For Corynebacteria, this evolutionary pressure is most likely due to a metabolic conflict between histidine and tryptophan biosynthesis.
This bi-functionality provided a means for compensating the loss of the trpF gene within the Actinobacteria. Importantly, such exaptations are not rare: A screening of 104 single-gene knockout strains made clear that approximately 20% of these auxotrophs were rescued by the overexpression of at least one noncognate E. coli gene [23]. Thus, the functional diversity of gene products contributes to metabolic robustness and evolvability. These evolutionary advantages are further increased if a bi-functionality that confers no cost or benefit to organismal fitness can be conserved throughout long evolutionary time-spans. The characteristics of ancient and extant HisA and PriA enzymes confirm that this is feasible, even for enzymes of the primary metabolism.

Generation of sequence similarity networks
The SSN of the HisA/PriA superfamily (7824 sequences, IPR023016 from InterPro release 47.0 [24]) was created using standard methods [25] provided by the Enzyme Function Initiative [26]. In order to eliminate sequence fragments, the length of the sequences that were included in the all-by-all BLAST comparison was restricted to 230-260 amino acids. From the remaining 7428 sequences, a representative network with an E-value cut-off of 1E-54 was generated in which sequences that share >95% identity were grouped into single nodes by CD-HIT [27]. Detailed phylogenetic information (superkingdom, phylum, class, order, family, genus) was added for each node using a modified version of Key2Ann [28]. Networks were visualized with the yFiles organic layout in Cytoscape 3.2.0 [29,30]. Phylum-specific sequence sets were compiled from the SSN and used to compute sequence logos of the active site residues, essentially as described [31].

Reconstruction of ancestral sequences
BLAST [11] and the nr database of the NCBI were used to search for the sequences of HisA homologs in completely sequenced genomes. Species were chosen in which hisA and the hisF and hisH genes were neighbors in the genome; the respective sequences were concatenated. We selected species from the archaeal phyla Euryarchaeota and Crenarchaeota, and from the bacterial phyla Bacteroidetes, Firmicutes, Spirochaetes, Actinobacteria, and Proteobacteria. A multiple sequence alignment (MSA) was deduced by means of MAFFT [32]. Positions containing more than 50% gaps were removed by using GBlocks [33]. The resulting MSA contained 430 meaningful positions. The program pb (version 3.3 of PhyloBayes, [15]) with the options -cat -gtr was used to compute four independent Markov chain Monte Carlo (MCMC) chains of 50,000 samples each. The options -cat -gtr induce an infinite mixture model, whose components differ by their equilibrium frequencies. The quality of mixing was assessed by computing the discrepancy index (maxdiff) by means of bpcomp and the minimum effective size with tracecomp. A consensus tree was determined by means of readpb with a burn-in of 5000.
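The gap-filtering step can be sketched in a few lines. This is a simplified stand-in that only applies the 50% gap criterion named above; it does not reproduce the full GBlocks procedure, and the function name and toy alignment are hypothetical.

```python
def drop_gappy_columns(alignment, max_gap_fraction=0.5, gap="-"):
    """Remove alignment columns whose gap fraction exceeds max_gap_fraction.

    alignment: list of equal-length aligned sequences (strings).
    Returns the trimmed sequences in the same order.
    """
    if not alignment:
        return []
    n_rows = len(alignment)
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    # Keep a column if at most max_gap_fraction of its entries are gaps.
    keep = [
        col for col in range(length)
        if sum(row[col] == gap for row in alignment) / n_rows <= max_gap_fraction
    ]
    return ["".join(row[col] for col in keep) for row in alignment]


# Toy alignment (invented): columns 3 and 6 consist of 75% gaps.
msa = [
    "MK-LV-",
    "MKALV-",
    "M--LVT",
    "MK-LV-",
]
trimmed = drop_gappy_columns(msa)
```

Columns with a gap fraction of exactly 50% are kept in this sketch, because the text specifies removal only for positions with more than 50% gaps.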
An MSA and a rooted tree determined as described were the input for FASTML [16]. The JTT substitution model and the maximum likelihood method were used for indel reconstruction. As a representative predecessor, we chose the most likely sequence related to the respective node of the phylogenetic tree. Nucleotide and amino acid sequences of synthesized genes for ancestral proteins are given in S2 Table.
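Choosing the most likely sequence for a node can be pictured as follows. The sketch assumes that per-site posterior probabilities of residues are available for the node of interest and that the representative sequence takes the most probable residue at each site; whether this matches the exact FASTML output used here is an assumption, and all names and probabilities below are invented.

```python
def most_likely_sequence(site_posteriors):
    """Pick, at every site, the residue with the highest posterior probability.

    site_posteriors: list of dicts (one per alignment position) mapping a
    one-letter amino acid code to its posterior probability at the node.
    """
    return "".join(max(site, key=site.get) for site in site_posteriors)


def mean_confidence(site_posteriors):
    """Average posterior probability of the chosen residues."""
    return sum(max(site.values()) for site in site_posteriors) / len(site_posteriors)


# Hypothetical posteriors for three sites of one ancestral node.
node_posteriors = [
    {"D": 0.97, "E": 0.03},
    {"K": 0.55, "R": 0.45},
    {"G": 1.00},
]
ancestor = most_likely_sequence(node_posteriors)
confidence = mean_confidence(node_posteriors)
```

The second toy site shows why ambiguous positions deserve attention: the chosen Lys is only marginally more probable than Arg, so a single most likely sequence hides some uncertainty.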

Site directed mutagenesis and cloning
A list of all oligonucleotides used for cloning and site-directed mutagenesis is provided in S3 Table. The scpriA gene from S. coelicolor, which served as a positive control in the in vivo complementation assays, was amplified from scPriA-pTYB4 (a gift of Dr. Matthias Wilmanns) by standard PCR, using the oligonucleotides 5ʹscpriA_SphI/3ʹscpriA_Stop_HindIII, and cloned into the pTNA vector [6] via the introduced restriction sites for SphI and HindIII. The tmtrpF gene from T. maritima, which served as a negative control in the in vivo complementation assays, was available in a pTNA vector from previous work [34].
The hisA gene from T. maritima (tmhisA) was amplified using the template pDS56/RBSII_hisA [35] with the oligonucleotides 5ʹtmhisA_NdeI/3ʹtmhisA_NotI (pET21a) and 5ʹtmhisA_SphI/3ʹtmhisA_Stopp_HindIII (pTNA) and subsequently cloned into pET21a (Stratagene) and pTNA vectors using the respective terminal restriction sites. Genomic DNA of D. desulfuricans ssp. desulfuricans and P. carbinolicus was ordered from DSMZ (DSM2380 and DSM6949, respectively). The respective hisA genes (ddhisA and pchisA) were amplified in a standard PCR using the oligonucleotides 5ʹddhisA_NdeI/3ʹddhisA_XhoI and 5ʹpchisA_NdeI/3ʹpchisA_XhoI, respectively, and subsequently cloned into the pET24a vector (Stratagene) via the introduced restriction sites for NdeI and XhoI. For in vivo complementation assays both hisA genes were cloned into the pTNA vector via the restriction sites for SphI and HindIII. To this end, pchisA was amplified with the oligonucleotides 5ʹpchisA_SphI and 3ʹpchisA_Stopp_HindIII, whereas in the case of ddhisA an overlap extension PCR [36] was necessary to remove an intrinsic SphI restriction site. This reaction was performed with the oligonucleotides 5ʹddhisA_SphI, 3ʹddhisA_C516T, 5ʹddhisA_C516T, and 3ʹddhisA_Stopp_HindIII.
The genes coding for the reconstructed ancestors were optimized for their expression in E. coli, synthesized (LifeTechnologies), and cloned into the pTNA and pET24a vectors using the terminal restriction sites for SphI and HindIII. In order to render pET24a compatible for cloning with SphI, two QuikChange mutagenesis steps were performed: the NdeI restriction site of pET24a was replaced by a SphI restriction site using the oligonucleotides 5ʹpET24a_NdeI_to_SphI and 3ʹpET24a_NdeI_to_SphI, whereas a SphI restriction site remote from the multiple cloning site was removed using the oligonucleotides 5ʹpET24a_A536T and 3ʹpET24a_A536T. All gene constructs were entirely sequenced to exclude inadvertent mutations.

Heterologous expression and purification of recombinant proteins
Gene expression, harvesting of cells, and cell lysis were performed essentially as described [18]. The genes pchisA and ddhisA were expressed in E. coli T7 Express cells (New England Biolabs) containing the pRARE helper plasmid [34]. The gene tmhisA was expressed in E. coli BL21-CodonPlus-(DE3)-RIPL cells (Agilent Technologies). The genes for the reconstructed proteins were expressed in E. coli BL21-Gold (DE3) cells (Agilent Technologies). For purification of tmHisA, heat denaturation (70°C, 15 min) was performed to remove most of the host proteins. Soluble cell extracts were loaded onto a HisTrapFF crude column (5 mL; GE Healthcare), which had been equilibrated with 50 mM potassium phosphate, pH 7.5, 300 mM sodium chloride, and 10 mM imidazole. After washing with equilibration buffer, the bound protein was eluted by applying a linear gradient of 10-375 mM imidazole. Subsequently, fractions with pure protein were pooled and dialyzed twice against 50 mM Tris·HCl, pH 7.5. Before dialyzing the reconstructed proteins CA-Bact-HisA, CA-Prot-HisA, and CA-Act-HisA in the same manner, fractions containing the respective protein were loaded onto a Superdex75 column (HiLoad 26/60, 320 mL, GE Healthcare) operated with 50 mM Tris·HCl, pH 7.5, and 50 mM sodium chloride at 4°C. In all cases, at least 1 mg protein was obtained per liter of culture. All proteins were more than 95% pure, as judged by SDS-PAGE.

Steady-state enzyme kinetics
The HisA reaction was measured spectrophotometrically at 300 nm and 25°C as described [6]. The TrpF reaction was followed at 25°C by a fluorimetric assay (excitation at 350 nm, emission at 400 nm) [37]. The substrate PRA was generated in situ from anthranilate and phosphoribosylpyrophosphate (PRPP) using 1 μM yeast anthranilate phosphoribosyl transferase. To assure a constant concentration of the unstable PRA during the individual TrpF activity measurements, a 30-fold molar excess of PRPP over anthranilate was used. The k_cat and K_M values for both reactions were determined by fitting the hyperbolic saturation curves with the Michaelis-Menten equation. For unknown reasons, the CA-Prot-HisA and CA-Bact-HisA proteins exhibited a strong hysteresis, both in the HisA and TrpF reaction. Therefore, entire progress curves were recorded starting with as many as five different initial substrate concentrations. The curves were analyzed with COSY [38] using the integrated Michaelis-Menten equation for progress curves of the HisA reaction and a Michaelis-Menten equation that includes product inhibition for progress curves of the TrpF reaction.
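How k_cat, K_M, and the catalytic efficiency follow from a saturation curve can be sketched as below. The text does not state which fitting procedure was used for the hyperbolic curves, so this example swaps in the Hanes-Woolf linearization as a simple, dependency-free stand-in for a non-linear least-squares fit; all concentrations, rates, and function names are hypothetical.

```python
def michaelis_menten(s, vmax, km):
    """Initial rate v for substrate concentration s."""
    return vmax * s / (km + s)


def hanes_woolf_fit(substrate, rate):
    """Estimate (vmax, km) by linear regression of s/v against s.

    Rearranging v = vmax*s/(km + s) gives s/v = s/vmax + km/vmax,
    so the slope equals 1/vmax and the intercept equals km/vmax.
    """
    xs = list(substrate)
    ys = [s / v for s, v in zip(substrate, rate)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / slope
    km = intercept * vmax
    return vmax, km


def catalytic_efficiency(vmax, km, enzyme_conc):
    """k_cat/K_M in units of 1/(concentration unit * time unit)."""
    kcat = vmax / enzyme_conc
    return kcat / km


# Noise-free toy data in micromolar and micromolar per second (invented).
substrate_um = [1.0, 2.0, 5.0, 10.0, 20.0, 50.0]
rates = [michaelis_menten(s, vmax=0.8, km=4.0) for s in substrate_um]

vmax_fit, km_fit = hanes_woolf_fit(substrate_um, rates)
# Enzyme at 0.1 micromolar; convert from per micromolar to per molar.
efficiency_per_molar = catalytic_efficiency(vmax_fit, km_fit, 0.1) * 1e6
```

With real, noisy measurements a direct non-linear fit of the Michaelis-Menten equation is generally preferable, because linearizations distort the error structure of the data.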

E. coli knockout strains
The E. coli ΔhisA strain was generated according to a classical protocol [39]. In brief, an ampicillin resistance gene was integrated into an E. coli DY329 helper strain to replace the genomic hisA gene with the aid of this strain's genetically encoded bacteriophage λ Red recombination system [40]. The resistance gene was then transferred to E. coli BW25113 via P1 phage transduction and replaced the genomic hisA gene. The complete deletion of the hisA gene was verified by sequencing. The E. coli ΔhisAΔtrpF double deletion strain was generated from the ΔhisA strain in the same manner, with the genomic trpF gene being replaced by a chloramphenicol resistance gene. The E. coli ΔtrpF single deletion strain (E. coli JMB9r-m+ΔtrpF) was available from previous work [41].

In vivo complementation assays
Complementation assays with pTNA_scpriA, pTNA_tmhisA, pTNA_tmtrpF, pTNA_ddhisA, and pTNA_pchisA, as well as with the pTNA constructs of the reconstructed ancestors CA-Act-HisA, CA-Prot-HisA, and CA-Bact-HisA, were performed on M9 minimal medium agar plates. An identical experimental procedure was followed in all cases: First, the respective plasmid was used to transform chemically competent ΔhisA, ΔtrpF, or ΔhisAΔtrpF E. coli cells. Next, single colonies were picked in order to inoculate 5 mL of LB medium supplemented with 150 μg/mL ampicillin only (ΔhisA cells) or with 150 μg/mL ampicillin and 30 μg/mL chloramphenicol (ΔtrpF and ΔhisAΔtrpF cells). After incubation at 37°C overnight, 5 mL of LB medium containing the respective resistance markers were inoculated (optical density of 0.1 at 600 nm) and incubated at 37°C until an optical density of about 1 at 600 nm was reached (corresponding to approximately 10^8 cells). Subsequently, the cells in 1 mL suspension were collected by centrifugation (4°C, 4000 g, 10 min) and washed three times with 1% NaCl. Finally, 1:10^5 and 1:10^4 dilutions were streaked out on M9 minimal medium agar plates containing 150 μg/mL ampicillin and incubated at 37°C.

The most notable difference between the HisA state (Fig 3A) and the TrpF state (Fig 3B) of the PriA active site is a twist of loop 5 and a concomitant swap of the localization of R143 and W145. This goes along with rearranged hydrogen bond networks at positions 19 and 109. Despite that, however, the same eight residues are involved in forming the active site in both states. We thus analyzed and compared their conservation in HisA and PriA sequences from the major SSN clusters. The actinobacterial PriA active site is characterized by a strong residue conservation resulting in the motif D-R-E-D-R-G-W-D (Fig 3C, Actinobacteria sequence logo).

Fig 1. Isomerization of the aminoaldoses ProFAR and PRA to the aminoketoses PRFAR and CdRP. In most prokaryotes the two reactions are catalyzed by the enzymes HisA and TrpF, respectively. In Actinobacteria, however, the bi-functional PriA catalyzes both isomerizations. doi:10.1371/journal.pgen.1005836.g001

Fig 2. Sequence similarity network of the HisA/PriA superfamily. Nodes are colored by either the five main phyla contributing to this superfamily (A) or by annotation as HisA or PriA (B). The network was generated from all-by-all BLAST comparisons of 7428 HisA and PriA sequences. Each node represents a single sequence or a group of sequences with more than 95% identical residues; experimentally characterized HisA or PriA proteins are marked by diamonds. Each edge in the network represents a bi-directional BLAST hit with an E-value of at most 1E−54 (corresponding to a median sequence identity of 44%). At this cutoff, the PriA cluster is clearly separated from, but still connected to, the central HisA cluster. Lengths of edges are not meaningful except that sequences in tightly clustered groups are relatively more similar to each other than sequences with few connections. doi:10.1371/journal.pgen.1005836.g002

Fig 3. Two states of the PriA active site from M. tuberculosis. (a) Schematic view of the site in the HisA-state (bound product PRFAR, PDB ID 3zs4). (b) The same active site in the TrpF-state (bound product analogue reduced-CdRP, PDB ID 2y85). Residues of the active site are shown as stick models. Residue numbering is based on PDB ID 3zs4. (c) Sequence logos showing the conservation of the motif as deduced from SSN clusters of the HisA/PriA superfamily. Basic and acidic residues are colored blue and red, respectively. doi:10.1371/journal.pgen.1005836.g003

Fig 4. Phylogenetic tree depicting the position of extant HisA and PriA enzymes (diamonds) and their relationship to the reconstructed ancestral HisA enzymes (circles). The topology of the tree was inferred from the phylogenetic trees used for sequence reconstruction (S1 and S2 Figs). CA-Act-HisA, CA-Prot-HisA, and CA-Bact-HisA are the predecessors of HisA enzymes from Actinobacteria, Proteobacteria and Bacteria, respectively. Note that actinobacterial sequences were omitted for reconstruction of CA-Prot-HisA and CA-Bact-HisA (indicated by grey shading of the Actinobacteria branch). ddHisA and tmHisA were not used for sequence reconstruction and are only listed because they were characterized experimentally. The vertical bar indicates the branch length that corresponds to 0.5 mutations per site. The catalytic efficiencies k_cat/K_M of the enzymes for processing ProFAR and PRA are given in red and blue, respectively. Abbreviations: sc, S. coelicolor; dd, D. desulfuricans; pc, P. carbinolicus; tm, T. maritima; Sp., Spirochaetes; Bact., Bacteroidetes. doi:10.1371/journal.pgen.1005836.g004

Table 2. In vivo complementation of auxotrophic E. coli strains by PriA, HisA, HisA ancestors, and TrpF. For all experiments, the mean time is given after which visible colonies appeared on minimal medium agar plates. All experiments were repeated independently at least three times. A growth time of 16 hours indicates that colonies appeared overnight. Growth times below 120 hours could be reproduced with a maximum error of 25%, growth times above 120 hours with a maximum error of 40%. "No growth" indicates that no colonies were observed after 14 days. A negative control with empty pTNA plasmid did not lead to growth within 14 days, either. doi:10.1371/journal.pgen.1005836.t002