Rare Codons Cluster
(A) The %MinMax distribution for every gene of the Arabidopsis thaliana genome annotation database was calculated using a window size of 18 codons and compared to 200 random reverse translations, as described for Figure 2B. A. thaliana shows an enrichment of rare codon clusters and very common codon clusters similar to that seen for the E. coli ORFeome (Figure 2B). (B) A wide variety of organisms are enriched for rare and very common codon clusters. Regions of enrichment (≥8σ from the mean, thick grey bars) were observed for the ORFeomes of the eukaryotes A. thaliana, H. sapiens, and C. neoformans, as well as the prokaryotes E. coli, Nostoc, P. fluorescens, and S. meliloti. The low %Max regions, which represent a more random distribution of rare and common codons (less clustering), were typically either significantly under-represented (open bars) or not significantly different from the random reverse translations (black bars). In some extreme regions, the random reverse translations could not provide sufficient coverage to ensure a normal distribution of the data (light grey bars); see Methods for more details.
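The comparison against random reverse translations described in the caption can be illustrated with a minimal Python sketch. It is not the authors' implementation. The codon usage table, the `count_rare` statistic (a simple stand-in for the number of windows falling in a given %MinMax bin), the symmetric threshold for under-representation, and the use of a zero standard deviation to flag insufficient coverage are all assumptions made for illustration; only the 200 random reverse translations and the 8σ enrichment threshold come from the caption.

```python
import random
import statistics

# Hypothetical toy codon usage (fraction of each codon per amino acid).
# These numbers are illustrative, not taken from any real genome.
CODON_USAGE = {
    "K": {"AAA": 0.75, "AAG": 0.25},
    "F": {"TTT": 0.60, "TTC": 0.40},
    "L": {"CTG": 0.50, "TTA": 0.15, "TTG": 0.15, "CTT": 0.10, "CTC": 0.10},
}


def reverse_translate(protein, rng):
    """Draw one codon per amino acid, weighted by its usage frequency."""
    codons = []
    for aa in protein:
        options = CODON_USAGE[aa]
        codon = rng.choices(list(options), weights=list(options.values()))[0]
        codons.append(codon)
    return codons


def count_rare(codons, protein, cutoff=0.2):
    """Toy statistic (assumption): number of codons with usage below cutoff."""
    return sum(CODON_USAGE[aa][c] < cutoff for aa, c in zip(protein, codons))


def sigma_from_mean(observed, random_values):
    """Distance of the observed value from the random mean, in units of sigma.

    Returns None when the random values do not vary at all (assumed here to
    indicate insufficient coverage).
    """
    sd = statistics.stdev(random_values)
    if sd == 0:
        return None
    return (observed - statistics.mean(random_values)) / sd


def classify(z, threshold=8.0):
    """Label a sigma distance; the symmetric lower threshold is an assumption."""
    if z is None:
        return "insufficient coverage"
    if z >= threshold:
        return "enriched"
    if z <= -threshold:
        return "under-represented"
    return "not significant"


if __name__ == "__main__":
    rng = random.Random(0)
    protein = "KFLLKLFKLL" * 5
    # A made-up "observed" gene that always uses the rarest leucine codon.
    observed = ["CTC" if aa == "L" else max(CODON_USAGE[aa], key=CODON_USAGE[aa].get)
                for aa in protein]
    # 200 random reverse translations of the same protein, as in the caption.
    randoms = [count_rare(reverse_translate(protein, rng), protein)
               for _ in range(200)]
    z = sigma_from_mean(count_rare(observed, protein), randoms)
    print(classify(z))
```

Because each random reverse translation encodes the same protein, any difference between the observed value and the random values reflects codon choice alone rather than amino acid composition.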