Figure 1.

Test queries: lac operon and Lsr system.

A. The lac operon is composed of beta-galactosidase (LacZ), the lactose importer (LacY), and a beta-galactoside transacetylase (LacA). Upstream of the operon, the repressor (LacI) is expressed in the same orientation. The lac system functions primarily as a regulated import and processing unit. Lactose brought in through the permease LacY is converted into allolactose or hydrolyzed into glucose and beta-galactose. Both reactions are catalyzed by LacZ. Allolactose then relieves repression of the system by LacI. B. The Lsr system is composed of two divergent operons. One operon consists of an AI-2 kinase and a system repressor. The other operon consists of an AI-2 transporter and phospho-AI-2 processing genes. Contextual system behavior is partly governed by distinctly regulated parts, including an alternative importer [23], an exporter [33], and the AI-2 synthase gene. Relative to the canonical lac system, the Lsr system is complicated by the fact that the cell synthesizes, exports, and imports AI-2, and by the negative regulation associated with the divergently arranged structure. AI-2 exported by a mechanism involving YdgG traverses the outer membrane through a porin and enters the periplasmic space. AI-2 is then transported back into the cytosol through the ABC-type importer LsrACDB. Once there, AI-2 is phosphorylated by LsrK. This phosphorylated form (AI-2-P) derepresses the Lsr system and is catabolized by LsrF and LsrG into separate downstream products.

Figure 2.

LMNAST heuristic.

LMNAST operates in a BLAST-like manner, using the results of BLAST searches themselves as a curated database. 1. For each member of the query, a homolog in any nucleotide record is assigned to a character type when it scores below a specified BLAST E-value threshold. Genes assigned to characters are highlighted in blue. Genes without sufficient homology to any character are represented by dashed boxes. 2. Sufficiently long stretches of adjacent characters are identified as seeds (red). 3. Sufficiently proximal characters are connected to seeds, or seeds are connected to each other, when separated by a base pair distance < d. 4. Rearrangements, losses, and deletions are scored according to a standard similarity heuristic. Noncontinuous elements are dropped iteratively until a maximum score is achieved, arriving at… 5. An LMNAST hit.
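
Steps 1 to 3 of the heuristic can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the tuple layout for genes, the default seed length, and the rule of absorbing only the nearest flanking character are all assumptions made for illustration, and the scoring and pruning of step 4 are deliberately left out.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical gene record: (start_bp, end_bp, character or None).
Gene = Tuple[int, int, Optional[str]]


def assign_character(evalues: Dict[str, float], threshold: float) -> Optional[str]:
    """Step 1: pick the query gene with the lowest E-value, if below threshold."""
    best = min(evalues.items(), key=lambda kv: kv[1], default=None)
    if best is None or best[1] >= threshold:
        return None
    return best[0]


def find_seeds(genes: List[Gene], min_len: int = 2) -> List[Tuple[int, int]]:
    """Step 2: index ranges of at least min_len adjacent genes that carry characters."""
    seeds, run_start = [], None
    for i, (_, _, character) in enumerate(genes):
        if character is not None:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_len:
                seeds.append((run_start, i - 1))
            run_start = None
    if run_start is not None and len(genes) - run_start >= min_len:
        seeds.append((run_start, len(genes) - 1))
    return seeds


def extend_seed(genes: List[Gene], seed: Tuple[int, int], d: int) -> Tuple[int, int]:
    """Step 3: grow a seed by absorbing the nearest flanking character closer than d bp."""
    lo, hi = seed
    changed = True
    while changed:
        changed = False
        # Look right for the next gene that carries a character.
        for j in range(hi + 1, len(genes)):
            if genes[j][2] is None:
                continue
            if genes[j][0] - genes[hi][1] < d:
                hi, changed = j, True
            break
        # Look left for the previous gene that carries a character.
        for j in range(lo - 1, -1, -1):
            if genes[j][2] is None:
                continue
            if genes[lo][0] - genes[j][1] < d:
                lo, changed = j, True
            break
    return lo, hi


# Toy record: A and B are adjacent, C is nearby, D is far away.
record = [
    (0, 100, "A"), (150, 250, "B"), (300, 400, None),
    (450, 550, "C"), (600, 700, None), (5000, 5100, "D"),
]
seeds = find_seeds(record)
extended = extend_seed(record, seeds[0], d=300)
```

In this toy record the A-B run forms the only seed, the nearby C character is connected to it, and the distant D character is left out. A candidate like this would then be scored and pruned as described in step 4.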

Figure 3.

Lac operon LMNAST hits overlaid onto phylogenetic distributions of different scopes.

The larger, bolded leaves represent species that contain lac operon homologs, whereas the grayed, italicized leaves represent species that lack them. A. An E. coli-specific phylogenetic tree adapted from [4], in which all genes from the core E. coli genome were used to construct a consensus tree. Among the species/strains represented here, lac system homologs were absent from certain Shigella. Additionally, BW2952, SE11, IAI1, HS, E243227A, CFT073, K-12 DH10B, and 11 of 18 O:H serotyped strains contained truncated systems. Uniquely, CFT073 retains lacA while missing lacI in an otherwise preserved lac system structure. B. The phylogenetic dispersion of the lac system is mostly limited to Escherichia and proximal species, as seen in the 16S-based tree adapted from [5]. Bolded leaves indicate the presence of the lac system in at least one strain.

Figure 4.

Coincidence heat map for Lac operon LMNAST stringent search hits.

Each shaded cell gives a normalized frequency: the number of hits containing both the row gene and the column gene, divided by the total number of hits containing the row gene, which is denoted by (#). The matrix on the left represents an unbiased set of evenly distributed homologs (AB, BC, CD, ABC, BCD, and ABCD). The middle matrix is the actual coincidence data. The matrix on the right is a heat map of the difference between the two. LacI is heavily over-represented according to the difference matrix. LacY occurred less often than would be expected from a random distribution.
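
The normalization described above can be sketched in Python as follows. This is an illustrative reconstruction rather than the authors' code: the function names are hypothetical, hits are modeled simply as sets of gene labels, and the sign convention of the difference (observed minus reference, so that positive values mean over-representation) is an assumption.

```python
from typing import Dict, Iterable, List, Set

Matrix = Dict[str, Dict[str, float]]


def coincidence_matrix(hits: Iterable[Set[str]], genes: List[str]) -> Matrix:
    """Entry [row][col] = fraction of hits containing `row` that also contain `col`."""
    hits = list(hits)
    matrix: Matrix = {}
    for row in genes:
        row_hits = [h for h in hits if row in h]
        n = len(row_hits)  # the "(#)" total for this row
        matrix[row] = {
            col: (sum(col in h for h in row_hits) / n if n else 0.0)
            for col in genes
        }
    return matrix


def difference(observed: Matrix, reference: Matrix) -> Matrix:
    """Assumed convention: observed minus reference, cell by cell."""
    return {
        row: {col: observed[row][col] - reference[row][col] for col in cols}
        for row, cols in observed.items()
    }


genes = ["A", "B", "C", "D"]
# The evenly distributed reference set named in the caption.
reference_hits = [set(s) for s in ("AB", "BC", "CD", "ABC", "BCD", "ABCD")]
reference = coincidence_matrix(reference_hits, genes)
```

For the reference set, three hits contain A and all three also contain B, so `reference["A"]["B"]` is 1.0, whereas only three of the five hits containing B also contain A, giving `reference["B"]["A"]` a value of 0.6. The matrix is therefore not symmetric, which is why rows and columns must be read in the order the caption specifies.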

Figure 5.

2D similarity plot of Lac operon LMNAST stringent search hits overlaid with attributed annotation.

Each gray dot represents the homology coordinate of a hit. The size of the dot scales directly with the number of hits at the same coordinate. The dashed line is a 1∶1 line along which hits have the same degree of homology by both BLAST and LMNAST measures. Vertical displacements from this line may imply horizontal gene transfer (HGT), while horizontal displacements may imply gene loss or rearrangement within the same or proximal species. Ovals indicate a clustering of similarly annotated hits. Dashed ovals denote cases where only the majority of hits therein share the labeled GenBank annotation. Here, the dominant features are systems mirroring the original structure and the evolved beta-galactosidase system. Very little HGT is apparent, while gene loss and rearrangement are ostensibly more common.
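
The dot sizing and the 1∶1 reference line can be expressed as a small Python sketch. The function names, the rounding used to decide when two hits share a coordinate, and the axis assignment (x as the BLAST-based measure, y as the LMNAST-based measure) are assumptions made for illustration; the caption does not specify them.

```python
from collections import Counter
from typing import Iterable, Tuple

Coord = Tuple[float, float]  # assumed order: (BLAST-based, LMNAST-based)


def dot_sizes(coords: Iterable[Coord], ndigits: int = 2) -> Counter:
    """Count hits per coordinate; the plotted dot size scales with this count."""
    return Counter((round(x, ndigits), round(y, ndigits)) for x, y in coords)


def offset_from_diagonal(coord: Coord) -> float:
    """Signed distance along y from the 1:1 line; zero means equal homology by both measures."""
    x, y = coord
    return y - x


hits = [(0.9, 0.9), (0.9, 0.9), (0.5, 1.0)]
sizes = dot_sizes(hits)
offsets = {coord: offset_from_diagonal(coord) for coord in sizes}
```

Here the two identical hits collapse into one dot of size 2 lying on the 1∶1 line, while the third hit sits 0.5 away from it. How such a displacement should be interpreted depends on the actual axis assignment in the figure.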

Figure 6.

Lsr system LMNAST hits overlaid onto phylogenetic distributions of different scopes.

A. E. coli Lsr system LMNAST hits overlaid onto an E. coli-specific phylogenetic distribution as developed in and adapted from [4], [27]. The larger, bolded leaves contain Lsr system homologs, whereas the grayed, italicized leaves do not. Lsr system loss is evident in B2 strains. B. E. coli Lsr system LMNAST hits overlaid onto an Enterobacteriales and Pasteurellales phylogenetic distribution adapted from [34]. The larger, bolded leaves contain Lsr system homologs, whereas the grayed, italicized leaves do not. Loss and gain events, inferred by parsimony, are denoted by – and +, respectively. Compared to the distribution of the lac operon, Lsr is phylogenetically more dispersed but also shallower.

Figure 7.

Annotated 2D similarity plot for Lsr system LMNAST weak search hits.

A. HGT of homologous systems is evident among hits with perfect organizational homology but diminished mean element homology. A great number of hits have low similarity along both axes. As seen in B., these hits are mostly involved in the metabolism of five-carbon carbohydrates according to their annotation. This likely reflects the structural similarity of AI-2 to five-carbon sugars.

Figure 8.

Coincidence matrix for E. coli Lsr system LMNAST stringent search hits.

This coincidence matrix depicts the subset of hits that have a mean element homology >0.3 and contain 4–6 gene characters. This subset was chosen for its intermediate degree of homology to the query Lsr system. Among these hits, lsrF and lsrG characters were found to be overrepresented in hits that also contain lsrR and lsrK characters.

Figure 9.

Coincidence matrix for E. coli Lsr system LMNAST extended window stringent search hits.

Letters represent the respective Lsr genes. Numbers 1–5 represent the five genes preceding the Lsr system: ydeT, yneL, hipA, hipB, and ydeK-lipoprotein. Numbers 14–17 represent the four genes after the Lsr system: tam (transaconitate methyltransferase), yneE, uxaB, and a predicted diguanylate cyclase. The figure suggests a) strong conservation of the association among Lsr system genes relative to their neighbors, b) hits in which the Lsr system has been excised entirely from its gene neighbors, and c) a weak coincidence between yneH-glutaminase characters and the Lsr system. The matrix also indicates that the overall prevalence of lsrF and lsrG characters is lower than that of other canonical Lsr characters, although the presence of lsrF and lsrG characters is a good predictor of the presence of other Lsr genes.

Figure 10.

Phylogenetic distribution of Lsr at different phylogenetic scales using reconciled LMNAST results.

E. coli Lsr system LMNAST hits overlaid onto a bacterial phylogenetic distribution as developed in and adapted from [27]. Each leaf bears a representative member from a larger unseen collapsed branch. The larger, bolded leaves contain Lsr system homologs within the collapsed branch, whereas the grayed italicized leaves do not. Parenthesized numbers indicate the number of strains and species with Lsr system LMNAST hits contained within the collapsed branch.
