Rare Codons Cluster
For each codon, three E. coli absolute codon frequencies are tabulated using codon usage data from KazUSA: (i) the frequency with which this codon is used in the entire E. coli genome (Actual), (ii) the usage frequency of the most common codon encoding the same amino acid (Max), and (iii) the usage frequency of the least common codon encoding the same amino acid (Min). An average usage frequency (Avg) is also calculated for each residue by summing the usage frequencies of all codons that encode it and dividing by the number of such codons. The resulting values are typically averaged over an 18-codon window (a window of 5 is used here); window sizes from 5 to 30 codons produced similar distributions of rare codon clusters, although smaller windows gave noisier profiles. These four codon usage frequencies are used to calculate %Max and %Min using the equations shown; note that only positive values are reported (i.e., each window yields a value for either %Min or %Max, not both). A %Min value of 51, for example, means that the sequence in that window is approximately halfway between the average sequence and the rarest possible sequence; it is plotted as −51.
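The windowed calculation described above can be sketched in Python. The equations themselves are not reproduced in this text, so the form used here is an assumption inferred from the description: %Max = 100 × Σ(Actual − Avg) / Σ(Max − Avg) when the window is more common than average, and %Min = 100 × Σ(Avg − Actual) / Σ(Avg − Min) otherwise, with %Min reported as a negative number for plotting. The names `TOY_USAGE`, `residue_stats`, and `percent_min_max` are hypothetical, and the frequencies in `TOY_USAGE` are made-up illustrative numbers, not real KazUSA values.

```python
from typing import Dict, List, Tuple

# Hypothetical toy table: codon -> (amino acid, usage frequency).
# These numbers are invented for illustration and are NOT KazUSA data.
TOY_USAGE: Dict[str, Tuple[str, float]] = {
    "AAA": ("K", 30.0),
    "AAG": ("K", 10.0),
    "GAA": ("E", 40.0),
    "GAG": ("E", 20.0),
}


def residue_stats(
    usage: Dict[str, Tuple[str, float]]
) -> Dict[str, Tuple[float, float, float]]:
    """Return (Max, Min, Avg) usage frequency for each amino acid."""
    by_aa: Dict[str, List[float]] = {}
    for aa, freq in usage.values():
        by_aa.setdefault(aa, []).append(freq)
    return {
        aa: (max(freqs), min(freqs), sum(freqs) / len(freqs))
        for aa, freqs in by_aa.items()
    }


def percent_min_max(
    codons: List[str],
    usage: Dict[str, Tuple[str, float]],
    window: int = 18,
) -> List[float]:
    """Sliding-window profile: positive values are %Max, negative are %Min.

    Window sums are used instead of window averages; the ratio is the
    same because the division by the window size cancels.
    """
    stats = residue_stats(usage)
    profile: List[float] = []
    for start in range(len(codons) - window + 1):
        actual = w_max = w_min = w_avg = 0.0
        for codon in codons[start:start + window]:
            aa, freq = usage[codon]
            c_max, c_min, c_avg = stats[aa]
            actual += freq
            w_max += c_max
            w_min += c_min
            w_avg += c_avg
        if actual >= w_avg:
            # More common than average: report %Max (positive).
            denom = w_max - w_avg
            value = 100.0 * (actual - w_avg) / denom if denom else 0.0
        else:
            # Rarer than average: report %Min, plotted as a negative value.
            denom = w_avg - w_min
            value = -100.0 * (w_avg - actual) / denom if denom else 0.0
        profile.append(value)
    return profile


if __name__ == "__main__":
    # One window made of the rarest codon for each residue.
    print(percent_min_max(["AAG", "GAG"], TOY_USAGE, window=2))
```

Returning 0 when the denominator is zero (a window containing only residues with a single codon, such as Met or Trp) is a choice made for this sketch, not something stated in the text. With a real usage table, a window of 5 or 18 would be passed as `window`.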