Conflict between Translation Initiation and Elongation in Vertebrate Mitochondrial Genomes

The strand-biased mutation spectrum in vertebrate mitochondrial genomes results in an AC-rich L-strand and a GT-rich H-strand. Because the L-strand is the sense strand of 12 of the 13 protein-coding genes, the third codon position is overall strongly AC-biased. The wobble site of the anticodon of the 22 mitochondrial tRNAs is either U or G to pair with the most abundant synonymous codon, with only one exception. The wobble site of tRNA-Met is C instead of U, forming a Watson-Crick match with AUG instead of AUA, the latter being much more frequent than the former. This has been attributed to a compromise between translation initiation and elongation; i.e., AUG is not only a methionine codon, but also an initiation codon, and an anticodon matching AUG will increase the initiation rate. However, such an anticodon would impose selection against the use of AUA codons because AUA needs to be wobble-translated. According to this translation conflict hypothesis, AUA should be used relatively less frequently in the AUR codon family than UUA is in the UUR codon family. A comprehensive analysis of mitochondrial genomes from a variety of vertebrate species revealed a general deficiency of AUA codons relative to UUA codons. In contrast, urochordate mitochondrial genomes, which have two tRNA-Met genes with CAU and UAU anticodons, exhibit increased AUA codon usage. Furthermore, six bivalve mitochondrial genomes in which both tRNA-Met genes have a CAU anticodon show reduced AUA usage relative to three other bivalve mitochondrial genomes in which one of the two tRNA-Met genes has a CAU anticodon and the other a UAU anticodon. We conclude that the translation conflict hypothesis is empirically supported, and our results highlight the fine details of selection in shaping molecular evolution.


INTRODUCTION
Vertebrate mitochondrial DNA has two strands of different buoyant densities, i.e., the H-strand and the L-strand. The H-strand is the sense strand for 1 protein-coding gene (ND6) and 8 tRNA genes, and the L-strand is the sense strand for 12 protein-coding genes, 2 rRNA genes and 14 tRNA genes. The two strands have different nucleotide frequencies, with the H-strand being GT-rich and the L-strand AC-rich [1,2]. This asymmetrical distribution of nucleotides has been explained [3][4][5][6] in terms of the strand-displacement model of mitochondrial DNA (mtDNA) replication [7][8][9][10]. In short, the H-strand is left single-stranded for an extended period and is subject to spontaneous deamination of A and C [11,12], which effectively converts them to G and U, respectively. In particular, the C→U mutation mediated by spontaneous deamination is known to occur in single-stranded DNA about 100 times as frequently as in double-stranded DNA [13]. Therefore, the H-strand tends to accumulate A→G and C→U mutations and become GT-rich, while the L-strand tends to become AC-rich. This pattern is similar to the strand bias observed in eubacterial genomes [14][15][16].
The strand-biased mutation spectrum has profound consequences for codon usage in mitochondrial protein-coding sequences (CDSs) and for the anticodons of tRNA genes [17]. First, the codons of the 12 CDSs that are collinear with the AC-rich L-strand end mainly with A or C, whereas the codon bias in the ND6 gene, which is collinear with the opposite strand, is reversed. Second, the 8 tRNA sequences collinear with the GT-rich H-strand are more GT-rich than the 14 tRNA sequences collinear with the AC-rich L-strand. Third, because the overall codon usage is mainly determined by the 12 CDSs collinear with the AC-rich L-strand, the A-ending and C-ending codons are almost always the most frequently used codons. In 21 of the 22 tRNA genes, regardless of the strand on which they are located, the anticodon wobble site forms a Watson-Crick base pair with the most abundant codon in each codon family, i.e., the wobble site is either a U to pair with the abundant A-ending codons or a G to pair with the abundant C-ending codons [17].
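The codon-anticodon matching described above can be made concrete with a minimal Python sketch. It derives the anticodon that forms a perfect Watson-Crick match with a given codon and reports its wobble base. The function names are illustrative assumptions, and base modifications and wobble pairing are deliberately ignored.

```python
# Watson-Crick complements for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def watson_crick_anticodon(codon: str) -> str:
    """Return the anticodon (written 5'->3') that pairs perfectly with `codon`.

    Codon and anticodon are antiparallel, so the anticodon is the reverse
    complement of the codon. Hypothetical helper for illustration only.
    """
    return "".join(COMPLEMENT[base] for base in reversed(codon.upper()))


def wobble_base(codon: str) -> str:
    """Return the anticodon wobble base, i.e. the first anticodon position
    (5' end), which pairs with the third codon position."""
    return watson_crick_anticodon(codon)[0]


if __name__ == "__main__":
    for codon in ("UUA", "UUC", "AUA", "AUG"):
        print(codon, watson_crick_anticodon(codon), wobble_base(codon))
```

Under this simplification, an A-ending codon calls for a U at the wobble site and a C-ending codon for a G, which is the pattern reported for 21 of the 22 tRNA genes.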
The codon-anticodon adaptation has long been known [18][19][20][21][22][23], and the pattern described above would have been nice but boring had there not been an interesting and singular exception to the general pattern of the tRNA anticodon matching the most abundant codon. The tRNA-Met anticodon is 3′-UAC-5′ (or CAU for short), with the wobble site being C instead of U, and forms a Watson-Crick match with the AUG codon instead of the AUA codon, in spite of the fact that the latter is used much more frequently than the former. The ability of the CAU anticodon to pair with the AUA codon is achieved by modifying the C in the anticodon CAU to 5-formylcytidine [24,25]. A similar case involves the methylation of guanine in starfish tRNA-Ser to translate all four AGN codons [25].
The use of the CAU anticodon instead of a UAU anticodon in vertebrate mitochondrial tRNA-Met is unexpected under two existing hypotheses of anticodon usage. The codon-anticodon adaptation hypothesis [17][18][19][20][23,26] predicts that the anticodon should match the most abundant codon. Because AUA is much more frequent than AUG, the hypothesis predicts that the anticodon of the tRNA-Met gene should be UAU instead of the observed CAU. The hypothesis of selection on anticodon wobble versatility [17], which was implicitly proposed before [27,28] and may be more appropriate for vertebrate mitochondrial genomes because each codon family is translated by a single tRNA species, states that the anticodon should maximize its wobble versatility in pairing with synonymous codons. Because U in general is more versatile than C in wobble pairing with both A and G [29,30], the hypothesis of selection on anticodon versatility also predicts a UAU anticodon to maximize its pairing versatility with the AUA and AUG codons. The fact that the observed tRNA-Met anticodon is CAU instead of the predicted UAU is intriguing.
This unexpected tRNA-Met anticodon has been attributed to a compromise between translation initiation and elongation [17] as follows. AUG is not only the most frequently used initiation codon, but also the most efficient initiation codon in Escherichia coli [31] and Saccharomyces cerevisiae [32]. In E. coli, the most efficient non-AUG initiation codon is AUA and its rate of initiation is only 7.5% of that of AUG [31]. In yeast mitochondria, a mutation of the initiation AUG to AUA in the COX2 gene caused at least a fivefold decrease in translation [33], and a similar finding was reported for another yeast mitochondrial gene, COX3 [34]. Assuming the generality of these findings, an anticodon matching AUG will increase the initiation rate and would be favored by natural selection because translation initiation is often the limiting step in protein production [22,35]. This presents a conflict between translation initiation and translation elongation. An AUG-matching anticodon would increase the translation initiation rate but decrease the translation elongation rate because an overwhelming majority of methionine codons are AUA in vertebrate mitochondrial genomes. The fact that all known vertebrate tRNA-Met genes feature an AUG-matching anticodon implies that nature has chosen to maximize the translation initiation rate [17]. This hypothesis, which invokes a conflict between translation initiation and translation elongation to explain the usage of the CAU anticodon in tRNA-Met, will be referred to hereafter as the translation conflict hypothesis.
Two consequences can be derived from the translation conflict hypothesis. First, we should expect a relative reduction of AUA usage because the AUG-matching anticodon imposes selection against the use of AUA codons, as AUA would need to be wobble-translated. To fix ideas, let us focus only on the AUR (methionine) and UUR (leucine) codon families. We chose UUR rather than any other R-ending codon family because the other R-ending codon families do not have a middle U, and the middle nucleotide in a codon is known to affect the nucleotide at the third codon position (P. Higgs, pers. comm.).
For the 12 CDSs that are collinear with the AC-rich L-strand, the mutation bias favors A-ending codons [3,4,17]. For UUR codons, because the anticodon wobble site is U and forms a Watson-Crick base pair with A, we also expect UUA codons to be preferred over UUG codons. Thus, both mutation and the tRNA-mediated selection favor the use of UUA over UUG codons. However, for the methionine codons, the AUG-matching tRNA-Met anticodon would favor the AUG codon over the AUA codon. Thus, the tRNA-mediated selection and the mutation bias act in opposite directions. If we define

$$P_{XUA} = \frac{100\,N_{XUA}}{N_{XUA} + N_{XUG}} \qquad (1)$$

for each of these two codon families, where X is either A or U, and $N_{XUA}$ and $N_{XUG}$ are the numbers of XUA and XUG codons, respectively, we should find $P_{AUA}$ in the AUR codon family to be smaller than $P_{UUA}$ in the UUR codon family. An argument against using Eq. (1) directly is that the result would be biased in favor of the prediction $P_{AUA} < P_{UUA}$ if the initiation codon, which is AUG in most cases, were not excluded. A more convincing comparison should compute $P_{AUA}$ after excluding initiation codons entirely. This is the approach used in this study.
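The quantity compared above can be sketched in a few lines of Python. The sketch assumes that $P_{XUA}$ is the percentage of XUA codons among XUA and XUG codons (the percentage scale is an assumption). It also assumes that each CDS is given in frame from its initiation codon, and that excluding the initiation codon means dropping the first codon. The function names and the toy sequence are hypothetical and are not taken from DAMBE.

```python
from collections import Counter
from typing import Optional


def codon_counts(cds: str, skip_initiation: bool = True) -> Counter:
    """Count codons in an in-frame CDS (DNA or RNA alphabet).

    The first codon is skipped when `skip_initiation` is True, and any
    trailing partial codon is ignored. Hypothetical helper for illustration.
    """
    seq = cds.upper().replace("T", "U")
    start = 3 if skip_initiation else 0
    return Counter(seq[i:i + 3] for i in range(start, len(seq) - 2, 3))


def p_xua(counts: Counter, x: str) -> Optional[float]:
    """Percentage of XUA among XUA + XUG codons; None if the family is absent."""
    n_xua = counts[f"{x}UA"]
    n_xug = counts[f"{x}UG"]
    total = n_xua + n_xug
    if total == 0:
        return None
    return 100.0 * n_xua / total


if __name__ == "__main__":
    toy_cds = "ATGATAATATTATTGATG"  # made-up sequence, not real data
    counts = codon_counts(toy_cds)
    print("P_AUA =", p_xua(counts, "A"))
    print("P_UUA =", p_xua(counts, "U"))
```

To pool several genes, the per-gene `Counter` objects can be summed before calling `p_xua`, which keeps the initiation codon of every gene out of the totals.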
For the ND6 gene collinear with the GT-rich H-strand, the strand-biased mutation spectrum favors G-ending codons in the two XUR codon families. For the methionine codon family, the AUG-matching anticodon also favors the AUG codon over the AUA codon, so the AUA codon will be depressed by both the strand-biased mutation and the tRNA-imposed selection. No such tRNA-imposed selection acts against the UUA codon in the UUR codon family because its tRNA anticodon matches the A-ending codon [17]. Thus, for the ND6 gene, we also expect $P_{AUA}$ in the AUR codon family to be smaller than $P_{UUA}$ in the UUR codon family.
The expected $P_{AUA} < P_{UUA}$, if confirmed, can arise in two ways. If the total number of methionine codons remains constant across mitochondrial genomes, then a deficiency of AUA codons in one genome implies an equal surplus of AUG codons. In contrast, if there is no selection maintaining a constant number of methionine codons but there is selection against AUA codons because they require the inefficient wobble translation, then a genome with a deficiency of AUA codons would also exhibit a deficiency of methionine codons.
In this paper, we use mitochondrial genomes from representative vertebrates, urochordates and bivalves to test these predictions (with the relevance of the bivalve mitochondrial genomes pointed out to us by an anonymous reviewer). While vertebrate mitochondrial genomes all have just one tRNA-Met gene, urochordates have two tRNA-Met genes, one with a CAU anticodon and the other with a UAU anticodon. The presence of the UAU anticodon in a tRNA-Met gene in urochordate mitochondrial genomes implies that the selection against the AUA codon should be weaker in urochordates than in vertebrates. The bivalve mitochondrial genomes are particularly interesting because some of them have a CAU anticodon in both of their tRNA-Met genes, whereas others have a CAU anticodon in one tRNA-Met gene and a UAU anticodon in the other. We expect the selection against the AUA codon to be weaker in the latter than in the former.

MATERIALS AND METHODS
To test the predictions derived from the translation conflict hypothesis, we retrieved all 498 vertebrate mitochondrial genomes available from NCBI Entrez by Sept. 29, 2005. The CDSs from each mitochondrial genome were extracted, their codon usage was quantified using DAMBE [36,37], and they were separated into two groups, one containing the 12 CDSs collinear with the L-strand and the other containing the ND6 gene collinear with the H-strand. We tabulated the frequencies of the A-ending and G-ending codons for the AUR and UUR codon families for each of the two groups. The CDS-derived amino acid usage was also computed with DAMBE.
$P_{XUA}$ values as defined in Eq. (1) were computed for each of the six codon families, with the initiation codon excluded. Many of these mitochondrial genomes are very similar to each other, so detailed analyses were carried out for 30 species, with six species each from teleosts, amphibians, non-avian sauropsids, birds, and mammals. We settled on 30 species because repeated random selections of 30 species, with six from each of the five groups, always led to the same conclusion. The 30 species reported here are just one of many such samples.
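The sampling scheme just described (six species drawn at random from each of five groups, repeated several times) can be sketched as follows. This is only an illustration of stratified random sampling. The function name, the seed handling, and the placeholder species labels are assumptions, not the procedure actually used in the study.

```python
import random
from typing import Dict, List, Optional


def sample_species(groups: Dict[str, List[str]],
                   per_group: int = 6,
                   seed: Optional[int] = None) -> List[str]:
    """Draw `per_group` species without replacement from every group.

    Hypothetical helper: `groups` maps a group name to its species list.
    """
    rng = random.Random(seed)
    chosen: List[str] = []
    for name in sorted(groups):  # fixed order keeps seeded runs reproducible
        members = groups[name]
        if len(members) < per_group:
            raise ValueError(f"group {name!r} has fewer than {per_group} species")
        chosen.extend(rng.sample(members, per_group))
    return chosen


if __name__ == "__main__":
    # Placeholder labels only; real species names would come from the genome list.
    group_names = ("teleosts", "amphibians", "non_avian_sauropsids",
                   "birds", "mammals")
    groups = {g: [f"{g}_{i:02d}" for i in range(1, 21)] for g in group_names}
    for replicate in range(3):  # repeated draws to check robustness
        picked = sample_species(groups, per_group=6, seed=replicate)
        print(replicate, len(picked), picked[:3])
```

Rerunning the downstream comparison on each replicate is one way to check that a conclusion does not hinge on a particular draw.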
It is important to keep in mind that the 30 species above do not represent independent data points. For example, their common ancestor could have somehow evolved a reduced $P_{AUA}$ relative to $P_{UUA}$, and this character could have been inherited by all its descendants. This means that all 30 species could be equivalent to just a single data point. For this reason, corroborative evidence needs to be sought in other species.
The mitochondrial genomes of four urochordates (Halocynthia roretzi, Ciona intestinalis, C. savignyi, and Doliolum nationalis) deposited in GenBank are particularly relevant to this study for several reasons. First, all 13 protein-coding genes are located on one strand, which eliminates the effect of differential strand-biased mutation on codon usage, i.e., all genes are subject to the same strand-specific mutation bias, if any. Second, they all have two tRNA-Met genes, one with a CAU anticodon and the other with a UAU anticodon [38][39][40][41][42][43]. This would eliminate, or at least reduce, the hypothesized selection against AUA codon usage. We can therefore predict that $P_{AUA}$ should be increased relative to $P_{UUA}$ in these urochordate mitochondrial genomes compared to vertebrate mitochondrial genomes.
One anonymous reviewer pointed out to us that another contrast can be made within mitochondrial genomes of bivalve mollusks. Ten complete bivalve mitochondrial genomes are available in GenBank. Aside from Lampsilis ornata, which has undergone a great deal of genome rearrangement with protein-coding genes distributed on both strands and has only one tRNA-Met gene [44], the other nine bivalve species all have two tRNA-Met genes in their mitochondrial genomes and all have their protein-coding genes on the same strand. Among these nine species, six have two CAU-tRNA-Met genes (matching the AUG codon), whereas the other three have one CAU-tRNA-Met gene and one UAU-tRNA-Met gene (matching the AUA codon). We predict that tRNA-mediated selection against the AUA codon should be stronger in the former than in the latter according to the translation conflict hypothesis.

RESULTS AND DISCUSSION
For the 30 vertebrate mitochondrial genomes, the mean $P_{AUA}$ value is consistently smaller than the mean $P_{UUA}$ value (Table 1), consistent with the prediction from the translation conflict hypothesis. The prediction is strongly supported by data from both the 12 CDSs collinear with the L-strand and the ND6 gene collinear with the H-strand (Table 1).
The observation of a relative deficiency of AUA codons can be interpreted in two ways. If methionine usage remains constant among vertebrate mitochondrial genomes, then a deficiency of AUA in a genome implies an equal amount of surplus in AUG.
On the other hand, if the number of methionine codons ($N_{Met}$) is weakly constrained, then the selection against AUA codons may result in a net loss of methionine codons. This would lead to a positive association between $P_{AUA}$ and $N_{Met}$, i.e., a small $P_{AUA}$ is associated with a small $N_{Met}$. The empirical data support the latter inference, i.e., a reduction of AUA codons leads to a reduction in methionine usage in the genome (Fig. 1).
It is important to recognize that the results presented above, while all consistent with the translation conflict hypothesis, do not exclude the possibility that AUA codon usage may be reduced for reasons unrelated to the CAU anticodon in tRNA-Met. It would be nice to have a mitochondrial genome in which the tRNA-Met anticodon is not CAU but UAU. If such a genome also has a reduced AUA usage relative to UUA codons, then we cannot interpret the reduced AUA usage in the vertebrate mitochondrial genomes as a response to the selection mediated by the CAU anticodon in tRNA-Met. On the other hand, if such a genome does not exhibit a deficiency of AUA codons relative to UUA codons, but instead exhibits an increased AUA codon usage favored by the UAU anticodon, then the translation conflict hypothesis is strengthened.
In this context, the mitochondrial genomes of four urochordates (Halocynthia roretzi, Ciona intestinalis, C. savignyi, and Doliolum nationalis) deposited in GenBank are particularly useful in providing corroborative evidence. All four genomes have two tRNA-Met genes, one with a CAU anticodon and the other with a UAU anticodon [38][39][40][41][42][43]. This would eliminate, or at least reduce, the hypothesized selection against AUA codon usage. We can therefore predict that $P_{AUA}$ should be increased relative to $P_{UUA}$ for these urochordate genomes, in contrast to the vertebrate mitochondrial genomes in which the only tRNA-Met has a CAU anticodon that would favor a decreased usage of AUA codons. In other words, we should expect $(P_{UUA} - P_{AUA})$ to be smaller in the urochordate mitochondrial genomes than in the vertebrate mitochondrial genomes. This prediction is confirmed (Table 2).
The mean $(P_{UUA} - P_{AUA})$ is only $-3.225$ in the urochordate mitochondrial genomes, in contrast to the vertebrate mitochondrial genomes, where the mean $(P_{UUA} - P_{AUA})$ values are significantly greater than 0 (p = 0.0001 for the 12 CDSs collinear with the L-strand and p = 0.0004 for ND6 collinear with the H-strand).
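The text does not state which test produced these p-values. One plausible choice, sketched here purely as an assumption, is a one-sample t-test of the per-genome differences $(P_{UUA} - P_{AUA})$ against zero. The function name is hypothetical and the numbers in the usage example are made up for illustration. The sketch returns only the t statistic and its degrees of freedom; a p-value would come from a t distribution with those degrees of freedom.

```python
import math
import statistics
from typing import Sequence, Tuple


def one_sample_t(differences: Sequence[float], mu0: float = 0.0) -> Tuple[float, int]:
    """Return (t statistic, degrees of freedom) for H0: mean difference == mu0.

    Hypothetical helper; it does not correct for the phylogenetic
    non-independence of species discussed in the Methods.
    """
    n = len(differences)
    if n < 2:
        raise ValueError("need at least two differences")
    mean = statistics.mean(differences)
    sd = statistics.stdev(differences)  # sample standard deviation (n - 1)
    if sd == 0:
        raise ValueError("zero variance: t statistic is undefined")
    return (mean - mu0) / (sd / math.sqrt(n)), n - 1


if __name__ == "__main__":
    # Made-up values for illustration only, not results from the study.
    p_uua = [80.0, 75.0, 90.0]
    p_aua = [70.0, 72.0, 78.0]
    diffs = [u - a for u, a in zip(p_uua, p_aua)]
    t_stat, df = one_sample_t(diffs)
    print(f"t = {t_stat:.3f}, df = {df}")
```

Because the species share ancestry, such a test treats correlated observations as independent, which is why corroboration from the urochordate and bivalve contrasts matters.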
Another independent contrast can be made within bivalve mollusks. Ten bivalve mitochondrial genomes are publicly available in GenBank. Aside from Lampsilis ornata, which differs from the other species in that (1) it has undergone a great deal of genome rearrangement with protein-coding genes distributed on both strands and (2) it has only one tRNA gene for methionine [44], the other nine species all have two tRNA-Met genes and all have their protein-coding genes on the same strand. Among these nine species, six have two CAU-tRNA-Met genes (matching the AUG codon), whereas the other three have one CAU-tRNA-Met gene and one UAU-tRNA-Met gene (matching the AUA codon). We have predicted that tRNA-mediated selection against the AUA codon should be stronger in the former than in the latter according to the translation conflict hypothesis. The prediction is strongly supported (Table 3). The six genomes with two CAU-tRNA-Met genes, labeled CAU/CAU, have $P_{UUA}$ greater than $P_{AUA}$, but the three genomes with one CAU-tRNA-Met gene and one UAU-tRNA-Met gene, labeled CAU/UAU, have $P_{UUA}$ smaller than $P_{AUA}$ (Table 3). We did not perform a significance test between the two groups of species because the three species with CAU/UAU anticodons are all in the genus Mytilus, so the comparison should be counted as only a single contrast.
In conclusion, the translation conflict hypothesis is empirically supported. The presence of a CAU anticodon matching the AUG methionine codon represents a significant selective force against AUA codon usage in vertebrate mitochondrial genomes, resulting in $P_{AUA}$ being smaller than $P_{UUA}$. The reduced AUA codon usage is associated with a reduced methionine usage in the vertebrate mitochondrial genomes. When such selection is weakened in the urochordate mitochondrial genomes containing both CAU-tRNA-Met and UAU-tRNA-Met genes, the AUA codon is no longer strongly selected against, and $P_{AUA}$ becomes similar to $P_{UUA}$. In bivalve mollusks, mitochondrial genomes with only CAU-tRNA-Met genes have lower AUA usage than those with both CAU-tRNA-Met and UAU-tRNA-Met genes.

ACKNOWLEDGMENTS
We thank S. Aris-Brosou, R. Debry, P. Higgs, D. Hickey, H. C. Wang and members of the Xia Lab for comments and discussion. Suggestions from an anonymous reviewer substantially strengthened the conclusions and improved the readability of the manuscript.