Rare Codons Cluster
(A) %MinMax was applied to the P22 tailspike gene, using a sliding window of 18 codons and the E. coli codon bias (essentially identical to the codon bias of S. enterica serovar Typhimurium, the endogenous host of P22). Dark %Max bars correspond to clusters of common codons; lighter %Min (negative) bars correspond to clusters of rare codons. In contrast, the average of 200 random reverse translations of tailspike, biased to E. coli codon usage frequencies, yields a %MinMax profile that is entirely %Max (grey line). The white arrow marks the location of the deepest %Min peak, at codon 406. Silent mutagenesis of P22 tailspike to replace this rare codon cluster with synonymous common codons alters the %MinMax plot (black line); these mutations affect only the indicated %Min peak. (B) The %MinMax value for every window of the entire E. coli ORFeome was calculated using a sliding window of 18 codons and used to construct a histogram of %MinMax values at intervals of 1 %MinMax. Negative bin numbers represent %Min values. The effects of codon clustering are seen when the E. coli ORFeome (black line) is compared to the +1 and −1 out-of-frame sequences of the E. coli genome (dotted lines) or to the average of 200 codon-biased random reverse translations analyzed using the same statistical conditions as the entire ORFeome (grey line). (C) The deviation of the distribution of %MinMax bins throughout the E. coli ORFeome from the average of 200 codon-biased random reverse translations of the entire ORFeome is greatest in high %Max regions (30 standard deviations from the mean) and at −31 %Min (28 standard deviations from the mean). (D) Tailspike was expressed in vivo on E. coli ribosomes.
After lysis, the N-terminal His-tag of tailspike was detected using an anti-His-tag antibody, revealing two major bands: full-length tailspike (asterisk), which dwells on the ribosome post-translationally, and a 49 kDa band corresponding to the size of a nascent chain produced during pausing at approximately codon 406, the location of the deepest %Min peak (white arrow). Silent mutagenesis to eliminate the large rare codon cluster centered at codon 406 (SYN) eliminates the 49 kDa band.
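
The sliding-window calculation behind panels A and B can be illustrated with a minimal Python sketch. The caption does not give the %MinMax formula, so the sketch assumes the definition commonly used for this measure: within each window, the summed usage frequency of the actual codons is compared with the summed average frequency of their synonymous codons, then scaled towards the summed maximum (%Max, positive) or the summed minimum (%Min, negative). The function name `percent_min_max` and the tables `TOY_USAGE` and `TOY_SYNONYMS` are hypothetical, and the frequencies are invented toy values, not E. coli data.

```python
from typing import Dict, List

# Hypothetical toy tables (NOT real E. coli frequencies): two amino acids,
# each encoded by one "common" and one "rare" codon.
TOY_USAGE: Dict[str, float] = {"AAA": 0.75, "AAG": 0.25, "GAA": 0.70, "GAG": 0.30}
TOY_SYNONYMS: Dict[str, List[str]] = {
    "AAA": ["AAA", "AAG"], "AAG": ["AAA", "AAG"],
    "GAA": ["GAA", "GAG"], "GAG": ["GAA", "GAG"],
}


def percent_min_max(codons: List[str],
                    usage: Dict[str, float],
                    synonyms: Dict[str, List[str]],
                    window: int = 18) -> List[float]:
    """Return one %MinMax value per sliding window (assumed definition).

    Positive values are %Max (common-codon clusters); negative values are
    %Min (rare-codon clusters).
    """
    values: List[float] = []
    for start in range(len(codons) - window + 1):
        actual = maximum = minimum = average = 0.0
        for codon in codons[start:start + window]:
            freqs = [usage[c] for c in synonyms[codon]]
            actual += usage[codon]             # frequency of the codon used
            maximum += max(freqs)              # most common synonymous codon
            minimum += min(freqs)              # rarest synonymous codon
            average += sum(freqs) / len(freqs)
        # Window sums are used directly: dividing by the window length
        # would cancel in the ratios below.
        if actual >= average:
            span = maximum - average
            values.append(100.0 * (actual - average) / span if span else 0.0)
        else:
            span = average - minimum
            values.append(-100.0 * (average - actual) / span if span else 0.0)
    return values


# Toy usage with a 2-codon window; the caption's analysis uses window=18.
profile = percent_min_max(["AAA", "GAA", "AAG", "GAG"],
                          TOY_USAGE, TOY_SYNONYMS, window=2)
```

In this toy run, a window made only of the most common codons scores 100 (%Max) and a window made only of the rarest codons scores -100 (%Min). Averaging such profiles over many codon-biased random reverse translations would give a baseline analogous to the grey lines in panels A and B.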