Figure 1.

Overview of Approach

An overview of our phylogenetic inference procedure is given. We look only at histidine kinase domains from HPKs, and compare the distribution of these to the species tree. When homologs in distant outgroups are more distantly related, we infer simple vertical descent. Paralogs and distances that contradict species phylogeny result in our inference of gene duplication or horizontal transfer. Only events (such as duplication or transfer) that occurred more recently than the cutoff as described in the Methods are considered. Four hypothetical cases are shown, and each is labeled as present/absent (“1” or “0”) from each outgroup according to the procedure described in the Methods.

Figure 2.

HPK Content versus Genome Size

The percentage of each genome that encodes HPKs is plotted as a function of genome size. As reported in previous studies, the relationship is roughly linear. Several groups of genomes described in the text are highlighted with colored symbols: genomes encoding a high (≥1.5%) fraction of HPKs (red squares); the model organisms E. coli and B. subtilis (green circles); genomes with a high number of HGT events (blue triangles); Bradyrhizobium japonicum, which has a high number of both HGT and LSE genes (pink diamond); and Streptomyces coelicolor, which has a high percentage of LSE genes (turquoise diamond).

Figure 3.

Summary of Evolutionary Events

The number of events inferred for different bacteria is summarized in this figure.

(A) Average numbers for the major taxonomic groups used in this study.

(B) Specific numbers for targeted genomes (those with colored symbols in Figure 2).

Figure 4.

LSE Events versus HGT Events

The numbers of LSE and HGT events for each genome are shown. Colored symbols correspond to the genomes identified in Figure 2. Note the position of the red squares well above the x-axis.

Figure 5.

Proximity of Different Classes of HPKs to RRs

Shown is the cumulative percentage of HPKs of each type that have RRs within the distance on the chromosome specified by the x-axis. The different gene types shown are: old HPKs, black squares (Old); HGT genes without recent paralogs, blue triangles (HGT); and HPKs with recent paralogs, red diamonds (LSE). In the bottom right panel, an average over all genomes is shown. In general, and for most specific cases (excepting Streptomyces), horizontally transferred genes are observed to have a much higher fraction of RRs in close genomic proximity. “Hybrid” HPKs, which have RRs in the same ORF as the HPK, were excluded from this analysis. Only genes that are not believed to have undergone duplication within a lineage are used in the HGT group. Lines stop when cumulative percentage equals 100%.
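
The cumulative curves described in this caption can be illustrated with a minimal sketch. It is not the authors' code: the function names, the use of simple linear coordinates (no wrap-around for circular chromosomes), and the example positions are all assumptions made for illustration.

```python
from bisect import bisect_right

def nearest_rr_distance(hpk_pos, rr_positions):
    """Distance from one HPK to its closest RR.

    Assumption: positions are plain linear coordinates in any consistent
    unit; circular-chromosome wrap-around is ignored.
    """
    return min(abs(hpk_pos - rr) for rr in rr_positions)

def cumulative_percentage(distances, thresholds):
    """Percent of HPKs whose nearest RR lies within each threshold."""
    ordered = sorted(distances)
    n = len(ordered)
    # bisect_right counts every distance <= threshold.
    return [100.0 * bisect_right(ordered, t) / n for t in thresholds]

# Hypothetical positions, for illustration only.
hpk_positions = [10, 50, 200]
rr_positions = [12, 120]
distances = [nearest_rr_distance(h, rr_positions) for h in hpk_positions]
curve = cumulative_percentage(distances, [5, 40, 100])
# One, then two, then all three HPKs fall within the thresholds,
# so the curve rises to 100% and would stop there.
```

Computing one such curve per gene class (Old, HGT, LSE) would give the three lines compared in each panel.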

Figure 6.

Coevolution of Orphan HPKs and RRs

The number of “orphan” HPKs is plotted versus the number of “orphan” RRs. A moderate but highly significant linear correlation is observed (ordinary least squares linear regression: r = 0.57, p < 10^−15).
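
As a rough illustration of the statistics quoted in this caption, the sketch below computes an ordinary least squares fit, the Pearson correlation coefficient r, and the t statistic from which a p-value would be obtained (using a t distribution with n − 2 degrees of freedom). The function names and the example counts are hypothetical and are not taken from the study.

```python
import math

def ols_and_r(xs, ys):
    """Return (slope, intercept, r) for a simple linear regression of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r

def t_statistic(r, n):
    """t statistic for testing r != 0; undefined when |r| == 1."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# Hypothetical orphan counts per genome, for illustration only.
orphan_rrs = [0, 1, 2, 3]
orphan_hpks = [0, 1, 1, 2]
slope, intercept, r = ols_and_r(orphan_rrs, orphan_hpks)
t = t_statistic(r, len(orphan_rrs))
```

With many genomes, even a moderate r yields a large t statistic, which is how a correlation of 0.57 can be highly significant.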

Figure 7.

Extent of Domain Shuffling in Different Classes of HPKs

Shown is the fraction of HPKs whose upstream domains are identical to those of either their inferred HGT partners (red bars) or, in the case of LSE, their closest paralog (blue bars). Only genes that are not believed to have undergone duplication within a lineage are used in the HGT group. B. japonicum, which has a significant number of genes classified as HGT and LSE, is shown twice.

Figure 8.

Domain Shuffling in a Desulfovibrio spp. Expansion

The domain structure of genes in a large LSE in the Desulfovibrio genus is shown. In addition, three similar proteins identified in Pseudomonas species are shown, which are likely the result of HGT. Genes are identified by their species name and their accession number in the MicrobesOnline database (http://microbesonline.org) for easy reference (DV refers to genes present in D. vulgaris and DA refers to genes present in D. alaskensis G20). Each domain corresponds to a branch of the TREE-PUZZLE phylogenetic tree (of only the HPK domains) shown at left. Each PAS domain is colored according to sequence homology (as inferred by BLASTp), and domains with the same color comprise subfamilies of closely related domains. While upstream domains are generally shuffled, each gene shown contains a PAS domain immediately preceding the conserved HPK domain. Moreover, this PAS domain is largely conserved among paralogs at the sequence level, while more N-terminal domains are not. Interestingly, the Pseudomonas gene, which we infer to be involved in a horizontal transfer event, has a set of signaling domains identical to one of the Desulfovibrio copies, suggesting a likely donor–acceptor pair, and highlighting the qualitative difference in genes acquired by HGT and LSE.

Figure 9.

Gene Expression of D. vulgaris Expansions

Gene expression profiles across a compendium of experimental stress response conditions (NaCl, heat shock, cold shock, nitrite, and oxygen) were monitored using DNA microarrays and are shown next to a neighbor-joining (NJ) phylogenetic tree (with 1,000 bootstraps, generated using the MEGA3 software package [28]) of all HPK domains in D. vulgaris. Blue colors indicate down-regulated genes (relative to unperturbed cells), and red colors indicate up-regulated genes. No significant excess correlation in gene expression was observed for genes within each cluster (compared with randomly chosen pairs of genes) using a Student's t-test to compare mean correlations or the Kolmogorov-Smirnov test to compare distributions. The domain structure of each gene in the three LSEs is shown to the right. Gene names are provided for all genes, and MicrobesOnline accession numbers are provided in parentheses for genes in each of the major clusters for comparison with Figure 8. Bootstrap values are provided for each of the major clusters, and the amino acid sequence of the “H-Box” motif for genes in each cluster is shown. A more detailed description of the experiments performed is given in Methods.

Figure 10.

Genomic Distribution of HPKs and RRs

The position of signaling proteins in several genomes is shown. In the outer ring, HPKs of different classes are shown: Old (gray), LSE (purple), HGT without duplication (blue), and genes that recently underwent a “Birth” event (green). The middle ring shows the position of response regulators, with blue colors indicating hybrid response regulators (containing HPK domains). The inner ring shows the location of all genes in each genome annotated as signaling proteins according to the MicrobesOnline database [37].

Figure 11.

HPK Evolution over Time

The number of HPKs entering a lineage is shown as a function of the time each HPK entered that lineage (i.e., the distance of that species to the last ancestor predicted to contain that HPK). Red lines/diamonds indicate LSE events, and blue lines/triangles indicate HGT events. Plots are cumulative showing all events dating more recently than the distance shown on the x-axis. HGT lines do not extend as far to the right as LSE lines, since they require genes to be lost from two consecutive outgroups. The vertical dashed line shows the phylogenetic cutoff distance, and the horizontal dashed line shows the total number of HPKs in each genome. Symbols indicate the evolutionary distance (arbitrary units; see Methods) of outgroups used in the analysis. Some clades with greater taxon sampling have better resolved timings. The last panel shows total numbers for all genomes.
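
The cumulative plots described in this caption can be sketched as follows. This is an illustrative assumption about the bookkeeping, not the authors' implementation: the `(kind, distance)` event representation, the function name, and the inclusive treatment of events dated exactly at a grid distance are all hypothetical choices.

```python
from bisect import bisect_right
from collections import defaultdict

def cumulative_events(events, grid):
    """Count events of each kind dated at or within each distance in grid.

    events: iterable of (kind, distance) pairs, e.g. ("LSE", 0.3).
    grid:   distances at which to evaluate the cumulative count.
    Assumption: an event dated exactly at a grid distance is included.
    """
    by_kind = defaultdict(list)
    for kind, distance in events:
        by_kind[kind].append(distance)
    result = {}
    for kind, dists in by_kind.items():
        dists.sort()
        result[kind] = [bisect_right(dists, d) for d in grid]
    return result

# Hypothetical events in arbitrary distance units, for illustration only.
events = [("LSE", 0.1), ("LSE", 0.3), ("HGT", 0.1), ("LSE", 0.6), ("HGT", 0.2)]
counts = cumulative_events(events, [0.1, 0.2, 0.5, 1.0])
```

Each list in `counts` is non-decreasing, matching the cumulative lines in the figure; the grid would correspond to the outgroup distances marked by symbols.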
