Figure 1.
Average ΔG ° Values in the TIRs for 18 Organisms
For all 18 genomes in our study, we calculated the average ΔG ° value for each RS position. Zero on the x-axis corresponds to the 5′ A residue in the rRNA sequence 5′–ACCUCC–3′ being positioned over the first base in the initiation codon. The dramatic drops in ΔG ° prior to RS 0 show the presence of SD sequences. The sudden drop in ΔG ° immediately after the first base in the initiation codon (at RS+1) shows that there is a significant binding potential between the 16S rRNA and the mRNA close to the initiation codon, an unexpected location. (A) was drawn from data generated by free_scan and (B) from data generated by RNAhybrid [34]. Differences between the two graphs are discussed in the text.
Figure 2.
Normalized Histogram Plots Showing the RS for the Lowest ΔG ° Values in the TIRs
The x-axis shows the RS, or distance between the 5′ A residue in the rRNA sequence 5′–ACCUCC–3′ from the 3′ tail and the first base in the start codon. Negative numbers indicate that the 5′ A is upstream from the start codon, while positive numbers indicate that it is downstream. The y-axis is the fraction of genes in a genome where the lowest ΔG ° value is at a particular RS.
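The normalization described for the y-axis can be sketched as follows. This is a minimal illustration, not the code used in the study: the function name `lowest_dg_histogram`, the input layout (one dictionary of RS to ΔG ° per gene), and the sample values are all assumptions made for the example, and ties are resolved by taking the first RS encountered.

```python
from collections import Counter

def lowest_dg_histogram(profiles):
    """Return {RS: fraction of genes whose lowest ΔG° falls at that RS}.

    profiles: dict mapping gene name -> dict mapping RS (int) -> ΔG° (float).
    This layout is an assumption made for illustration.
    """
    counts = Counter()
    for dg_by_rs in profiles.values():
        # RS with the most negative ΔG°; ties go to the first RS seen.
        best_rs = min(dg_by_rs, key=dg_by_rs.get)
        counts[best_rs] += 1
    total = sum(counts.values())
    return {rs: n / total for rs, n in sorted(counts.items())}

# Hypothetical ΔG° values for three genes at two RS positions.
example = {
    "geneA": {-5: -6.0, 1: -2.0},
    "geneB": {-5: -1.0, 1: -3.0},
    "geneC": {-5: -4.0, 1: 0.0},
}
histogram = lowest_dg_histogram(example)
```

Because each gene contributes exactly one count, the fractions for a genome sum to one, which is what allows genomes with different gene counts to be compared on the same axis.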
Table 1.
Usage Statistics for the Three Most Common Initiation Codons: AUG, GUG, and UUG
Figure 3.
Examples from E. coli Showing How RS Is Calculated
The complementary bases, plus G/U mismatches, that are predicted to bind together are capitalized. The predicted SD sequence consists of the capitalized letters in the mRNA. The location of the start codon is indicated with the hat character, ^, and the location of the 5′ A residue in the rRNA sequence 5′–ACCUCC–3′ is indicated with a v. The RS is the distance between the 5′ A and the first base in the start codon. If the SD is upstream from the start codon, then the RS is given as a negative number. If the SD is downstream, it is given as a positive number. Both SD sequences for wecF and argD come before the start codons (in these cases, the start codon is AUG). The RS for wecF is −4 and for argD it is −10. radC's SD sequence includes the start codon, GUG, and the RS is +1.
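The sign convention for RS can be expressed in a few lines. The sketch below assumes both positions are given as indices into the same mRNA sequence; the function name `relative_spacing` and the index values are hypothetical and chosen only to reproduce the RS values quoted above.

```python
def relative_spacing(a_index, start_index):
    """RS: offset of the mRNA base under the rRNA 5' A from the start codon.

    a_index: mRNA index over which the 5' A of 5'-ACCUCC-3' is positioned.
    start_index: mRNA index of the first base of the start codon.
    Negative means the 5' A is upstream of the start codon; positive
    means it is downstream. Both indices are illustrative assumptions.
    """
    return a_index - start_index

# Hypothetical indices chosen to match the RS values in the caption.
rs_wecF = relative_spacing(6, 10)   # upstream of the start codon
rs_argD = relative_spacing(0, 10)   # further upstream
rs_radC = relative_spacing(11, 10)  # one base downstream
```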
Figure 4.
mRNA bases at positions −7 to 5 would need to bind to the rRNA tail for RS+1. For each position, the sequence logo displays the amount of information content and the frequency of each nucleotide. Positions that have no information content are blank, whereas those with information content contain a stack of nucleotide characters. The size of each nucleotide character in the stack is proportional to its frequency at that position.
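Per-position information content for a logo of this kind is commonly computed as 2 bits minus the Shannon entropy of the nucleotide frequencies. The sketch below uses that standard definition without a small-sample correction; whether the figure was produced this way is an assumption, and the function name `information_content` and the toy sequences are hypothetical.

```python
import math
from collections import Counter

def information_content(sequences):
    """Return information content in bits for each aligned position.

    Uses 2 - H, where H is the Shannon entropy of the observed
    nucleotide frequencies (four-letter alphabet, no small-sample
    correction). Sequences must be aligned and of equal length.
    """
    result = []
    for i in range(len(sequences[0])):
        column = Counter(seq[i] for seq in sequences)
        total = sum(column.values())
        entropy = -sum((n / total) * math.log2(n / total)
                       for n in column.values())
        result.append(2.0 - entropy)
    return result

# Toy alignment: position 0 is fully conserved, position 1 is uniform.
bits = information_content(["AA", "AC", "AG", "AU"])
```

A fully conserved position carries 2 bits and a position with all four nucleotides at equal frequency carries none, which corresponds to the blank positions described in the caption.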
Table 2.
A Summary of the Annotation Programs Used for the Genomes in This Study
Table 3.
Downstream Start Codons
Table 4.
Binding at the Start Codon for Strong +1 Genes Compared with Upstream Binding
Table 5.
A Summary of Predicted rRNA–mRNA Binding
Table 6.
Model Comparisons
Figure 5.
Average ΔG ° Values in the TIR for Synechocystis
The trough prior to RS 0 clearly shows the presence of an SD motif in many genes.
Table 7.
A Summary of the Data and Its Sources Used in This Study
Figure 6.
An Overview of How ΔG ° Values Are Calculated in Each TIR
For each base in each initiation region, we simulated the change in free energy required for the 3′ 16S rRNA tail to hybridize with the mRNA. At least two consecutive bases must pair, and for binding to occur spontaneously the pairing requires a change more negative than −4.08 kcal/mol [13], the value for ΔGinit °. In this example, which uses the initiation region from the E. coli gene hcaF, alignment 1 is set to zero because the change in free energy required to bring together a single complementary doublet is not favorable. Alignments 2 and 71 are set to zero because there are no complementary doublets. Alignment 6 is set to −16.5 because it requires −16.5 kcal/mol less than −4.08 kcal/mol to hybridize.
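The zeroing rule in this caption can be illustrated with a short sketch. It assumes a net ΔG ° has already been computed for each alignment (with the initiation term included) and that alignments with no complementary doublets are marked as `None`. The function name `clamp_alignment_scores` and the input values other than −16.5 are hypothetical, and this is not the free_scan implementation.

```python
def clamp_alignment_scores(net_dg):
    """Record a ΔG° per alignment, with unfavorable alignments set to zero.

    net_dg: list of net free-energy changes in kcal/mol, one per
    alignment, or None where no two consecutive bases can pair.
    A value is kept only if it is negative (favorable).
    """
    return [dg if dg is not None and dg < 0.0 else 0.0 for dg in net_dg]

# Hypothetical inputs: an unfavorable single doublet, an alignment with
# no complementary doublets, and a favorable alignment.
scores = clamp_alignment_scores([0.9, None, -16.5])
```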