The Effect of Codon Mismatch on the Protein Translation System

Incorrect protein translation caused by codon mismatch is an important problem for living cells. In this work, a computational model was introduced to quantify the effects of codon mismatch, and the model was used to study protein translation in Saccharomyces cerevisiae. According to the simulation results, the probability of codon mismatch increases when the supply of amino acids is unbalanced, and the longer the codon sequence, the larger the probability that incorrect translation occurs, which makes the synthesis of long peptide chains difficult. Compared with simulation results that do not take codon mismatch into account, the fraction of mRNAs with bound ribosomes decreases faster along the mRNAs, making the 5' ramp phenomenon more obvious. It was also found that the premature termination mechanism resulting from codon mismatch can reduce the proportion of incorrect translation when the amino acid supply is extremely unbalanced, which is one possible source of high-fidelity protein synthesis after peptidyl transfer.


Introduction
Understanding the gene translation process is important for human health [1][2][3], biotechnology [4][5][6] and evolution [3,4,7,8]. In recent years a number of technologies have been developed to characterize different features of gene translation, and multiple roles of the coding sequence have been proposed. Recent studies suggested that the order of codons along the mRNA plays an important role in determining translation efficiency [4,9,10,11]. It was suggested that mRNA folding is weak in the region surrounding the start codon [9][10][11][12][13][14][15][16], and that endogenous genes tend to show strong mRNA folding in the region after the start codon [10,17,18], which can improve the fidelity of translation initiation [10,17,19,20]. It was also suggested that the first 30-50 codons of the open reading frame (ORF) tend to be recognized by tRNA species with lower intracellular abundance [6,21,22], resulting in slower ribosomal elongation in this region [6,23,24]. Fast initiation of short genes also causes a 5' ribosomal ramp [14]. However, this process is still enigmatic, with contradicting conclusions in different studies. Although many publications show that the codon sequence is one cause of the 5' ramp, other research suggested that the ramp is caused primarily by faster initiation in short genes rather than by the ordering of codons within each gene [14]. In this paper, we propose that premature termination of translation by the ribosome is another cause of the 5' ramp.
Translation of mRNA has been studied with a variety of computational models based on the totally asymmetric simple exclusion process, which support the role of codon ordering in determining the spatial patterns of ribosomes along mRNAs [18,25]. Such models were built on constant, inexhaustible supplies of amino acids, free ribosomes and free tRNAs in the cell. A more realistic alternative, the "Whole cell" model [14], was developed to investigate genome-scale gene translation properties. This model tracks all ribosomes and tRNAs in a cell, each of which is either freely diffusing or bound to a specific mRNA molecule at a specific codon position at any time point. Transition rates among states are parameterized in seconds so that the model describes the dynamics of translation in real time. Unlike many other models of translation, which treat each mRNA molecule in isolation and assume an inexhaustible supply of free ribosomes that initiate the message at a constant rate, the "Whole cell" model keeps track of every tRNA, mRNA and ribosome molecule in the cell simultaneously. However, codon mismatch effects, which lead to premature termination, especially under unbalanced starvation conditions [26,27], were still ignored in the "Whole cell" model. Our model is based on the "Whole cell" model and focuses on this previously ignored problem: the effect of codon mismatch and premature termination of translation. According to the simulation results of our model, premature termination is one cause of the 5' ramp and cannot be ignored. Although the chance of codon mismatch at each codon is low [28], mismatches accumulate along a long mRNA sequence. In order to investigate codon mismatch effects in translation processes [29][30][31][32], we built a model to study translation performance under two different conditions: balanced and unbalanced amino acid supplies.

Model description
Our model takes the premature termination event into account (Fig 1). Consider a set of mRNA sequences, where sequence i has N_i sites, each of which can either be occupied by a ribosome or be empty. Ribosomes transfer between sites according to the following rule: a movable ribosome is selected at random, and if its current position j (1 ≤ j ≤ N_i) is between 1 and N_i − 1, the ribosome on site j moves to site j + 1.
In this process, codon mismatch can occur with a predefined probability. For mRNA i and codon j, the codon mismatch probability p_j_mis is calculated from the number of available aminoacyl tRNAs (Eqs 1-5). Here p_mis_based is the average mismatch probability under normal conditions, which is available from experimental results [28]; p_relative_mis_aa is the average probability of mismatch for the current codon under normal amino acid supply conditions; p_relative_match_aa is the average match probability for the current codon under normal amino acid supply conditions; p_mis_abort is the total premature termination probability for a ribosome when a codon mismatch occurs. If a non-cognate aminoacyl tRNA is incorporated, the peptide chain will lose specificity in the A site of the ribosome, and the propagation of this error may result in premature termination of the peptide [28,33]. From Fig 2 [28], we can see that when a codon mismatch occurs, the premature termination probability for a ribosome is rather high. Because the probability of continuing an incorrect translation is significantly reduced, premature termination is assumed in this model to occur immediately after the first mismatch, with the defined probability.
$$p_{relative\_mis\_aa} = \frac{p_{mis\_based}}{61} \tag{1}$$
For the current codon j, the index of the aminoacyl tRNA type is t; x is the set of matched aminoacyl tRNAs; p_j_total_relative_mis is the probability of mismatch and p_j_total_relative_match is the probability of match.
$$p_{j\_total\_relative\_mis} = \sum_{t \notin x} T^{f}_{t} \, p_{relative\_mis\_aa} \tag{3}$$
$$p_{j\_mis} = \frac{p_{j\_total\_relative\_mis}}{p_{j\_total\_relative\_match} + p_{j\_total\_relative\_mis}} \tag{5}$$
$$p_{j\_match} = \frac{p_{j\_total\_relative\_match}}{p_{j\_total\_relative\_match} + p_{j\_total\_relative\_mis}} \tag{6}$$
Here T^f_t is the number of free aminoacyl tRNAs of type t. A movable ribosome on site j has three possible moves.
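As an illustration of how Eqs 1, 3, 5 and 6 combine, the following Python sketch computes the mismatch and match probabilities of one codon from the free aminoacyl tRNA counts. It is a minimal sketch, not the published C++ implementation. The function name, the dictionary layout and the example counts are hypothetical. The weight of the matched tRNAs (a sum of T^f_t × p_relative_match_aa over the matched types) and the default p_relative_match_aa = 1 − p_mis_based are assumptions made by symmetry with Eq 3, because those two definitions are not shown above.

```python
def codon_probabilities(free_trna, matched, p_mis_based=0.001,
                        p_relative_match_aa=None):
    """Return (p_j_mis, p_j_match) for one codon.

    free_trna: dict mapping tRNA type -> number of free aminoacyl tRNAs (T^f_t)
    matched:   set of tRNA types that match the codon (the set x)
    """
    # Eq 1: relative mismatch weight per tRNA type.
    p_relative_mis_aa = p_mis_based / 61
    # ASSUMPTION: the relative match weight is not given in the text.
    if p_relative_match_aa is None:
        p_relative_match_aa = 1.0 - p_mis_based

    # Eq 3: sum over the tRNA types that do NOT match the codon.
    total_mis = sum(n * p_relative_mis_aa
                    for t, n in free_trna.items() if t not in matched)
    # ASSUMPTION (by symmetry with Eq 3): sum over the matching types.
    total_match = sum(n * p_relative_match_aa
                      for t, n in free_trna.items() if t in matched)

    denom = total_match + total_mis
    if denom == 0:
        raise ValueError("no free aminoacyl tRNA available")
    # Eqs 5 and 6: normalise the two weights so that they sum to one.
    return total_mis / denom, total_match / denom


# Hypothetical counts: a scarce cognate tRNA raises the mismatch probability.
normal = {"Arg": 1000, "Leu": 1000, "Gly": 1000}
starved = {"Arg": 10, "Leu": 1000, "Gly": 1000}
p_mis_normal, _ = codon_probabilities(normal, {"Arg"})
p_mis_starved, _ = codon_probabilities(starved, {"Arg"})
```

In this toy example the mismatch probability of the Arg codon grows when the Arg-charged tRNA becomes scarce, which is the qualitative behaviour the model relies on under unbalanced supply.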
The probability that the ribosome moves to the next site correctly is p_j_match; the probability that it moves to the next position but performs an incorrect translation is p_j_mis × (1 − p_mis_abort); and the probability of premature termination is p_j_mis × p_mis_abort.
Fig 1. When a codon mismatch occurs, the ribosome will either abort or continue to translate the remaining codons. When the ribosome aborts, all the work that has been done becomes invalid and the ribosome returns to the free pool, while some ribosomes can still reach the ending codons with incorrect peptides, which decreases translation efficiency. doi:10.1371/journal.pone.0148302.g001
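The three possible moves of a ribosome on site j can be sketched as a single random draw. This is a minimal illustration in Python, not the authors' code; the function names and the outcome labels are hypothetical, and p_j_match is taken as 1 − p_j_mis, which follows from Eqs 5 and 6 summing to one.

```python
import random


def move_probabilities(p_j_mis, p_mis_abort=0.5):
    """Probabilities of the three moves for a ribosome on site j."""
    return {
        "advance_correct": 1.0 - p_j_mis,                    # p_j_match
        "advance_with_error": p_j_mis * (1.0 - p_mis_abort),
        "abort": p_j_mis * p_mis_abort,
    }


def ribosome_move(p_j_mis, p_mis_abort=0.5, rng=random):
    """Sample one move; rng is any object with a random() method."""
    p_j_match = 1.0 - p_j_mis
    u = rng.random()  # uniform in [0, 1)
    if u < p_j_match:
        return "advance_correct"
    if u < p_j_match + p_j_mis * (1.0 - p_mis_abort):
        return "advance_with_error"
    return "abort"  # premature termination, ribosome returns to the free pool
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible.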
A reasonable estimation of the initiation rate under various conditions is a key problem. Here we assume that the initiation rate is the bottleneck of translation rate because the average number of ribosomes on mRNA sequences is small, indicating that the transfer between different sites should be much faster than the initiation [18,24,34]. Here we infer that the initiation rate is determined by the supply of free Methionyl-tRNAs and ribosomes (Eqs 9 and 10).
The initiation rate for one mRNA of gene i and the total initiation rate are expressed in terms of the following quantities: f_i is the fraction of mRNAs of gene i that can be initiated; A_i is the number of mRNAs of type i; p_i is the gene-specific initiation probability [14]; D_r is the diffusion coefficient of ribosomes; D_t is the diffusion coefficient of tRNAs; τ_r and τ_t are the characteristic times of ribosomes and tRNAs [14]; R^f is the number of free ribosomes; and M is the number of free Methionyl-tRNAs. Because mismatched aminoacyl tRNAs also have a certain chance of being accepted during translation, especially with an unbalanced amino acid supply, a transfer rate is introduced.
The transfer rate for one ribosome on a codon and the total transfer rate distinguish two cases: anticodon k does not match codon j, or anticodon k matches codon j. In these expressions, ω_j is the wobble parameter, s is the tRNA competition coefficient, and T^f_k is the number of free tRNAs of type k. The model can also simulate translation without codon mismatch by setting p_mis_based = 0.

Simulation Setup Details
To investigate how protein production is affected by stress, we simulated translation under balanced and unbalanced amino acid supply conditions. We modeled the stress on a particular amino acid by changing the abundance of its (charged) cognate tRNAs by a factor of 2^x [14], where 2^x is the supply coefficient. The package used for the simulations can be downloaded from ftp://159.226.238.166/pub/. It was written in C++ and can be compiled under CentOS. It is based on the "Whole cell" package [14], to which the codon mismatch and premature termination features were added.
Condition 1: a balanced amino acid supply condition in which the amount of each aminoacyl tRNA is multiplied by the same supply coefficient 2^x. When x = 0, the system is in the normal amino acid supply condition [14].
Condition 2: an unbalanced amino acid supply condition. Here we take Arg imbalance as an example to investigate the performance of the translation system, so only the amount of Arginyl-tRNA is multiplied by a supply coefficient 2^x.
Condition 3: a random starvation condition in which all amino acid supplies are randomly modified, with different amino acids having different coefficients less than one. This condition is also an unbalanced condition.
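The three supply conditions amount to different ways of scaling the charged tRNA abundances. The Python sketch below illustrates this under stated assumptions: the function name, the dictionary keys and the base counts are hypothetical, and for Condition 3 the coefficients are drawn uniformly from [0, 1), a distribution that the text does not specify.

```python
import random


def apply_supply(base, condition, x=0.0, target="Arg", rng=None):
    """Return charged-tRNA abundances scaled for one supply condition.

    base: dict mapping amino acid -> charged tRNA abundance at normal supply
    x:    exponent of the supply coefficient 2**x (Conditions 1 and 2)
    """
    if condition == 1:
        # Balanced: every aminoacyl tRNA is scaled by the same 2**x.
        return {aa: n * 2.0 ** x for aa, n in base.items()}
    if condition == 2:
        # Unbalanced: only the target (here Arg) is scaled by 2**x.
        return {aa: n * (2.0 ** x if aa == target else 1.0)
                for aa, n in base.items()}
    if condition == 3:
        # Random starvation: one coefficient below one per amino acid.
        # ASSUMPTION: uniform draw; the source only says "less than one".
        rng = rng or random.Random()
        return {aa: n * rng.random() for aa, n in base.items()}
    raise ValueError("condition must be 1, 2 or 3")


# Hypothetical base abundances, for illustration only.
base = {"Arg": 1000.0, "Leu": 2000.0, "Gly": 1500.0}
balanced = apply_supply(base, 1, x=1)       # all doubled
arg_starved = apply_supply(base, 2, x=-3)   # Arg reduced to one eighth
random_starved = apply_supply(base, 3, rng=random.Random(0))
```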
Normal amino acid supply condition: the ratio of amino acid supply is defined with the relative copy numbers of charged aminoacyl tRNAs (Table 1) [14].
In our simulation, p_mis_abort was set to 0.5 and p_mis_based was set to 0.001 [28]. 3,795 genes and 60,000 mRNAs were adopted from S. cerevisiae [14]. To investigate the effects of mistranslation, we studied translation under the three conditions described above: Condition 1, Condition 2 and Condition 3. Equilibration of all simulation systems was achieved in the first 2 × 10^9 steps, followed by another 100 seconds of simulation for data collection.
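A back-of-the-envelope calculation with these parameter values shows how a small per-codon error accumulates over a long message. The Python sketch below is an idealisation, not the simulation itself: it assumes that every codon has the same mismatch probability (equal to p_mis_based) and that codons are independent, and the function name and outcome labels are hypothetical.

```python
def outcome_probabilities(n_codons, p_mis=0.001, p_mis_abort=0.5):
    """Probabilities of the three fates of one translation attempt.

    ASSUMPTION: constant, independent per-codon mismatch probability p_mis.
    """
    # No mismatch at any of the n_codons positions.
    p_correct = (1.0 - p_mis) ** n_codons
    # No abort at any position (mismatches without abort are allowed).
    p_complete = (1.0 - p_mis * p_mis_abort) ** n_codons
    return {
        "correct": p_correct,
        "complete_with_error": p_complete - p_correct,
        "premature": 1.0 - p_complete,
    }


short = outcome_probabilities(100)
long_ = outcome_probabilities(500)
```

Under these assumptions only about 61% of the attempts on a 500-codon message finish without any mismatch, although the per-codon error is only 0.001.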

Results and Discussions

Incorrect translation proportion is high under unbalanced conditions
There are two types of products: one is the correct translation product, which is the correct translation of the whole codon sequence; the other is the incorrect translation product, in which the peptide is synthesized with a codon mismatch or is shorter than the correct peptide. Three fractions were obtained from the simulation results; the length fraction of premature chains is defined as follows:
$$fraction_{premature\_len} = \frac{\text{total length of released premature chains}}{\text{total length of released chains}} \tag{16}$$
Compared to the simulation under Condition 1, the fraction of incorrect translation under unbalanced conditions is relatively high (Fig 3). The fraction of incorrect translation is minor and does not fluctuate strongly under Condition 1, while under Condition 2 the fraction of incorrect translation increases with the stress of unbalanced amino acid supply. Under Condition 3, all 10 fractions of incorrect translation are higher than under normal conditions. This is evidence of the premature termination mechanism, which can contribute to reducing incorrect products when the amino acid supply is extremely unbalanced.
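The length fraction of Eq 16 can be computed from a list of released chains as in the following Python sketch. The data layout, the function name and the category labels are hypothetical. The fraction for mature chains with error and the combined incorrect fraction are written by analogy with Eq 16 and are assumptions, since only the premature fraction is defined explicitly above.

```python
def length_fractions(released):
    """Length fractions of released peptide chains.

    released: iterable of (category, length) pairs, where category is one of
              "correct", "mature_with_error" or "premature" (hypothetical labels)
    """
    totals = {"correct": 0, "mature_with_error": 0, "premature": 0}
    for category, length in released:
        totals[category] += length
    total_length = sum(totals.values())
    if total_length == 0:
        raise ValueError("no released chains")

    premature = totals["premature"] / total_length            # Eq 16
    with_error = totals["mature_with_error"] / total_length   # ASSUMED analogue
    return {
        "premature": premature,
        "mature_with_error": with_error,
        "incorrect": premature + with_error,                  # ASSUMED: sum of both
    }


# Toy data, for illustration only.
chains = [("correct", 300), ("premature", 50),
          ("mature_with_error", 100), ("premature", 50)]
fractions = length_fractions(chains)
```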
Although the chance of codon mismatch is low under balanced conditions, the accumulation of codon mismatches can lead to a high proportion of incorrect translation for long mRNAs (Fig 4a, 4b and 4c), and the chance is increased under Condition 2 (Fig 4b) and Condition 3 (Fig 4c). In Fig 4a, the fraction of incorrect peptide length does not change much when more charged tRNAs are available, because under normal conditions the proportion of charged tRNAs can almost meet the demand of protein synthesis, and the same holds for balanced conditions with more charged tRNAs available. The fraction of mismatch on a codon therefore remains almost unchanged (Fig 4d and 4f). In Fig 4b, however, the fraction of incorrect peptide length increases quickly as the supply becomes more unbalanced. The reason is that when the number of Arginyl-tRNAs decreases or increases relative to the normal condition, the proportions of charged tRNAs can no longer meet the demand of protein synthesis. When the number of Arginyl-tRNAs decreases, the fraction of mismatch on Arg related codons increases significantly, while the fraction of mismatch on other codons decreases. When the number of Arginyl-tRNAs increases, the fraction of mismatch on Arg related codons decreases, while the fraction of mismatch on other codons increases (Fig 4e and 4g). Thus, the study of codon mismatch effects is essential for building a model of the translation process.
Fig 3. In Fig 3a, 3b and 3c, the vertical axis shows the total length fraction of the three kinds of released peptide chains: the premature chains, the mature chains with error, and the incorrect chains.
Fig 4. In Fig 4a, 4b and 4c, some mRNAs were not translated enough times, which could have a significant effect on the final result. For each simulation result, the mRNAs were therefore sorted by the number of correct translations, and the top 100 were selected. The vertical axis shows the total length fraction of the incorrect chains. The horizontal axis shows the codon sequence length. In Fig 4a and 4b, the incorrect translation fraction is plotted against the codon sequence length for 5 different supply coefficients. In Fig 4c, there are 5 curves, one for each simulation, in which the charged aminoacyl tRNA coefficients were randomly set to values less than 1. Fig 4d and 4e show the fraction of mismatch events on a codon, which is equal to the number of mismatch events on the codon divided by the number of all events on the same codon. Fig 4f and 4g show the fraction of mismatch events of a codon, which is equal to the number of mismatch events on the codon divided by the number of mismatch events on all codons.

Mismatch is one cause of the 5'-to-3' ramp
Codon mismatch effects can be used to explain some well-known phenomena in the translation process. In order to remove the effects of faster initiation of short mRNAs, 10410 mRNAs with codon sequences longer than 500 codons were selected. To study the effect of mismatch on the 5' ramp, simulations with and without mismatch were run, and the fraction of mRNAs with a bound ribosome at each site was averaged over all the mRNAs. For each case (Fig 5a, 5b, 5c and 5d), a faster decline of the fraction of mRNAs with a bound ribosome is found along the elongation process when mismatch events are considered (Fig 5b and 5d), which suggests that premature termination events caused by mismatch play an important role in forming the 5' ramp. Tests under Condition 3 with and without codon mismatch effects were also performed (Fig 5e). There are 200 different random supply configurations, and for each configuration simulations with and without codon mismatch effects were run. Finally, the fraction of mRNAs with a bound ribosome at each site was averaged over all 200 simulation results. The result also shows that the curve declines faster when codon mismatch effects are counted.
Since incorrect but complete translation does not affect the fraction of mRNAs with bound ribosomes, it can be inferred that the decrease of this fraction along the mRNA is caused, at least partly, by the premature termination mechanism. If premature termination occurs, the ribosome cannot translate the remaining codons, which contributes to a higher ribosome density in the 5' region and a lower ribosome density in the 3' region.
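The link between premature termination and the ramp can be illustrated with a simple survival profile. The Python sketch below is an idealisation and not the simulation used in this work: it assumes a constant per-codon mismatch probability, a uniform elongation speed and no interference between ribosomes, so that the relative occupancy at site j is just the probability that a ribosome has not aborted before reaching it. The function name is hypothetical.

```python
def relative_occupancy(n_codons, p_mis=0.001, p_mis_abort=0.5):
    """Probability that an initiated ribosome is still bound at sites 1..n_codons.

    ASSUMPTIONS: constant p_mis at every codon, uniform elongation speed,
    no ribosome-ribosome interference.
    """
    p_abort = p_mis * p_mis_abort  # premature termination per codon
    # Reaching site j requires surviving the j - 1 preceding steps.
    return [(1.0 - p_abort) ** (j - 1) for j in range(1, n_codons + 1)]


with_mismatch = relative_occupancy(500)
without_mismatch = relative_occupancy(500, p_mis=0.0)  # flat profile
```

With p_mis = 0 the profile is flat, whereas with mismatch it declines from the 5' end towards the 3' end, in line with the faster decline described above.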
Another observation in Fig 5 is the lower fraction of mRNAs with bound ribosomes when amino acids are more available: when more charged aminoacyl tRNAs are available, the ribosome moves faster from one codon to the next. However, initiation is the bottleneck of the whole translation process, and the initiation rate cannot be increased as quickly as the codon translation rate. The ribosome flux therefore cannot increase as fast as the supply of charged aminoacyl tRNAs, which leads to a lower fraction of mRNAs with bound ribosomes. Moreover, this effect is alleviated under Condition 2, in which only the number of Arginyl-tRNAs fluctuates. When the number of Arginyl-tRNAs increases, only the Arginyl-tRNA related codons are affected significantly. That is, only ribosomes at Arginyl-tRNA related codon sites translate faster than before, which does not affect the fraction of mRNAs with bound ribosomes very much. But when the number of Arginyl-tRNAs drops sharply, all codons are affected, because ribosomes on Arginyl-tRNA related codon sites move more slowly, which leads to more stalled ribosomes, and the fraction of mRNAs with bound ribosomes then increases significantly.

Codon mismatch affects the rate of correct translation
Incorrect translation products waste cellular energy, decrease the rate of correct translation and can even harm the living cell. The rate of amino acid synthesis is about 5 aa/s under the normal amino acid supply condition, which agrees with the results of empirical measurements [35,36]. In this work, the rates of correct translation under a variety of unbalanced amino acid supply conditions were studied (Fig 6). It is interesting to see that an increase of Arg affects the translation rate in different ways. With a supply coefficient smaller than 2^0, the translation rate increases as more Arg is supplied, due to the increase in reagent concentration. However, the trend is then reversed, which suggests that the degree of amino acid imbalance plays a more important role, causing a higher chance of codon mismatch and a higher probability of incorrect translation. Thus, codon mismatch is key to determining the outcome of translation.

Discussion
All the above simulations are based on the assumption that amino acids are supplied continuously, but the actual environment is more complex. For example, under extreme conditions the supply of some amino acids may cease. As far as the model is concerned, without the contribution of premature termination there would be deadlock states at the corresponding codon positions, and translation would pause without releasing the ribosome and the nascent peptide. From our results, it can be inferred that premature termination can help avoid entering the deadlock states. If cells release some incomplete nascent peptides through the premature termination mechanism and recover some scarce amino acids through the recycling mechanism [37][38][39], some of the ribosomes stalled by the deficient amino acids will continue to translate the remaining codons. Thus, although codon mismatch does reduce the rate of valid work when the amino acid supply is balanced, it can also be helpful when the supply is unbalanced. Mismatch may be a way for living organisms to adapt to stressful environments.
Due to the limitations of computational power and simulation algorithms, a number of simplifications were made in our model. Initiation, translation and termination should each be separated into several steps [40][41][42], which is not accounted for in this work. Additionally, in real biological environments the numbers of free tRNAs, ribosomes and mRNAs change continuously [25,43,44,45], while in our model these numbers are fixed. Some regulatory mechanisms [39,46,47] involved in the translation process are also ignored in the simulation, as are other kinds of mismatch error that exist in the translation process [29,48]. We mainly considered two supply conditions, one balanced and the other unbalanced. Considering that unbalanced supplies of different amino acids have similar effects, we took Arg as an example.
In this model we investigated a starvation condition in which one amino acid is very deficient. In reality, such a condition is severe: almost no cell could stay alive, or cells would already have adjusted their metabolic pathways to suit it. Based on the assumption that cells can survive and that the amino acid metabolism pathway is unchanged, we focus only on the behaviour and properties of the translation system. Unusual behaviour is more easily observed under such severe conditions. Hopefully more parameters can be obtained from further experimental studies on various transition systems to improve this model.