The importance of nonsense errors: Estimating the rates and implications of ribosome drop-off during protein synthesis

Alexander L. Cope; Denizhan Pak; Michael A. Gilchrist

doi:10.1371/journal.pgen.1012162

Abstract

The process of translation is both energetically costly and relatively error-prone compared to transcription and replication. Nonsense errors during translation occur when a ribosome drops off a transcript before reaching a stop codon, resulting in energetic investment in an incomplete and likely non-functional protein. Nonsense errors impose a potentially significant energy burden on the cell, making it critical to quantify their frequency and energetic cost. Here, we present a model of ribosome movement for estimating protein production, elongation, and nonsense error rates from high-throughput ribosome profiling data. Applying this model to an exemplary ribosome profiling dataset in S. cerevisiae, we find that nonsense error rates vary substantially between codons and that these types of errors place an energetic burden on cells comparable to ribosome pausing. Overall, we present multiple lines of evidence that selection against nonsense errors is a prominent force shaping protein-coding sequence evolution and codon usage bias, in particular.

Author summary

The process of translating mRNA into a protein is both energetically expensive and relatively error-prone. As such, natural selection is thought to shape the evolution of protein-coding genes to reduce the cost of these errors when they occur. Nonsense errors (NSEs) occur when a ribosome stops translation before completing a functional protein, resulting in wasted energy on a non-functional product. Despite their functional consequences, NSEs and their effects on coding sequence evolution are generally understudied compared to other types of translation errors. This is in part due to the challenge of quantifying these errors from omics-scale data. We present a model for quantifying codon-specific estimates of elongation and NSE rates from ribosome profiling data, which gives a snapshot of the actively translating ribosomes in a cell. Although it is well-established that sense codons vary in their elongation rates, we find evidence that codons also vary in their NSE rates. Using our parameter estimates, we find multiple lines of evidence for selection against NSEs shaping patterns of codon usage bias. Our results suggest the cost of NSEs is comparable to the cost of ribosome pausing, and thus may play a greater role in coding sequence evolution than previously appreciated.

Citation: Cope AL, Pak D, Gilchrist MA (2026) The importance of nonsense errors: Estimating the rates and implications of ribosome drop-off during protein synthesis. PLoS Genet 22(6): e1012162. https://doi.org/10.1371/journal.pgen.1012162

Editor: Jianzhi Zhang, University of Michigan, UNITED STATES OF AMERICA

Received: July 31, 2025; Accepted: May 11, 2026; Published: June 9, 2026

Copyright: © 2026 Cope et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: Raw sequencing reads were obtained from the SRA (SRR1049521, SRR7241903, SRR23242245, SRR23242246, SRR5766382, SRR5766388). Processed data and scripts for performing model fits and subsequent analyses are available at https://github.com/acope3/Yeast_Nonsense_Error_Analysis.

Funding: This work was supported by the NIH-funded Rutgers INSPIRE IRACDA Postdoctoral Program (grant #GM093854 to ALC). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Protein synthesis, i.e., the transcription and translation of mRNAs, is one of the most energetically costly cellular processes [1]. For example, 66% of ATP usage is dedicated to protein synthesis in E. coli [2]. While both transcription and translation are critical to protein synthesis, the cost of translation is estimated to be 125 times the cost of transcription [3]. Translation involves both direct (GTPs used to charge tRNAs and drive each elongation cycle) and indirect costs (overhead cost of the translation infrastructure, such as the ribosome and tRNAs) [4–6]. Furthermore, mRNAs generally undergo multiple rounds of translation. Given the energetic cost of translation and the finite energy budget of a cell, genotypes reducing the cost of protein production are expected to have a selective advantage that scales with a gene’s average protein production rate. A classic example of this is the bias of highly expressed genes toward synonymous codons recognized by more abundant tRNA, which is hypothesized to be an adaptation due to natural selection to reduce the indirect cost of ribosome pausing [5–8].

In addition to being energetically costly, translation is relatively error-prone compared to DNA replication and transcription [9]. Translation errors potentially result in non-functional proteins. Much of the literature has focused on the impact of ribosomes misreading codons, resulting in the incorrect amino acid being inserted into the growing peptide chain (i.e., a missense error) [10–13]. The incorrect amino acid sequence may result in a misfolded protein that is potentially corrected via interactions with chaperone proteins [14,15]. In contrast, a nonsense error (a.k.a. premature termination errors or processivity errors) results in an incomplete protein [16,17]. As an incomplete protein is expected to have little, if any, functional capacity and is unlikely to be rescued via chaperones, nonsense errors may contribute substantially to the overall cost of translation [18]. Based on work in E. coli and S. cerevisiae, nonsense errors are estimated to occur at an approximate frequency of 1 every 10,000 codons [19–23].

Each ribosome elongation step can be thought of as having two possible outcomes: a nonsense error or successful elongation. Assuming the amount of time a ribosome spends waiting before elongation or an error occurs is exponentially distributed, the probability of a nonsense error at a given codon depends on both its nonsense error rate (NSE rate, i.e., the background rate at which a nonsense error event is triggered) and its elongation rate. It is well-established that ribosome elongation rates generally vary across codons, often due to differences in the tRNA availability and codon-anticodon base-pairing effects (e.g., wobble) [24–29]. As a result, slower codons are expected to be more prone to nonsense errors, even if NSE rates are uniform across codons [18,30–32].

Many theoretical studies of translation dynamics assume the NSE rate is uniform across codons. Theoretical studies making this assumption, while still allowing variable ribosome elongation rates across codons, predict that the probability of a ribosome completing translation varies substantially across transcripts as a function of amino acid sequence, protein length, and codon usage bias [30,33,34]. Conflicting with the assumption of uniformity, studies in both E. coli and S. cerevisiae indicate sense codons differ in their capacity to pair with release factors [35–37], often by mimicking the structural mechanism of stop codon recognition upon release factor binding [38]. Additionally, other mechanisms that lead to nonsense errors, such as peptidyl-tRNA drop-off [39–41] and ribosomal frameshifting [42,43], are expected to vary across codons. This limited evidence suggests NSE rates differ across codons, which could amplify or dampen the differences in codon-specific elongation rates. If codons vary in their NSE probabilities, then it follows that genes can evolve to reduce the cost of nonsense errors via codon usage. As before, the selective advantage for synonymous genotypes reducing the cost of nonsense errors should increase with a gene’s average protein production rate. Thus, we expect to see increasing evidence of adaptation in codon usage bias to reduce the cost of protein production with a gene’s average protein production rate.

Ribosome profiling is a powerful technique for quantifying steady-state translation across the transcriptome [44]. Compared to mass spectrometry-based approaches (including those specifically intended to quantify translation), ribosome profiling provides better sequence coverage, a broader dynamic range, and reveals actively translating ribosomes at codon-level resolution [45]. Due to these advantages and the decreasing monetary cost of next-generation sequencing, ribosome profiling is a powerful approach for studying translation on an omics scale [46]. However, ribosome profiling does not directly track ribosome movement along a transcript, requiring mathematical models to extract biological information from these empirical measurements. Previous attempts to quantify nonsense errors from ribosome profiling data have typically averaged over codons and genes, ignoring variation in elongation, nonsense error, and protein production rates [21,23]. To explicitly account for this variation, we extend our previous model of nonsense errors and ribosome movement along a transcript [18,30] to estimate codon-specific elongation and NSE rates using a high-quality ribosome profiling dataset in S. cerevisiae [27].

Using our estimates, we evaluate and contextualize the estimated distribution and costs of nonsense errors across the S. cerevisiae transcriptome. We find compelling evidence that NSE rates and probabilities differ across codons and amino acids. As genes differ in amino acid usage, protein length, and codon usage, our results indicate the probability that a ribosome completes translation of a transcript varies across genes (interquartile range 0.87 – 0.95). We identify multiple lines of evidence that the S. cerevisiae genome is extensively adapted to reduce the cost of nonsense errors during translation. Our evidence includes an overall bias towards nonsense error-prone codons near the 5’-ends of the coding DNA sequences (CDS) and an avoidance of nonsense error-prone codons in highly expressed genes. Consistent with previous theoretical studies relying upon tRNA-based proxies of elongation rates [30,33], we find approximately 60% of genes exhibit signals of adaptation to reduce the cost of nonsense errors in S. cerevisiae. Despite these adaptations to reduce their cost, we estimate nonsense errors impose an energetic burden comparable to, if not greater than, ribosome pausing. Although nonsense errors are believed to be less frequent than missense errors, because a preponderance of nonsense errors (but only a fraction of missense errors) are likely to disrupt a protein’s functionality, our work suggests natural selection against nonsense errors plays a substantial, important, and underappreciated role in protein-coding sequence evolution.

Materials and methods

We are interested in calculating the probability of observing a ribosome footprint (RFP, i.e., a mapped sequencing read) at a codon within an mRNA transcript as measured via ribosome profiling. We assume there is a pool of RFP generated from the transcriptome and that the mRNAs in this pool are close to steady-state in terms of ribosome initiation and completion of translating a transcript. Below we give an overview of our model formulation and usage, with additional details found in S1 Text. Definitions of all model parameters can be found in Table 1.


Table 1. Table of model parameters.

https://doi.org/10.1371/journal.pgen.1012162.t001

Ribosome pausing and nonsense error model

Below, we outline the assumptions behind our ribosome PAusing and NonSense Error (PANSE) model and the formal likelihood function they lead to.

PANSE model assumptions.

For a given gene g, m_g represents its equilibrium mRNA transcript concentration within a cell, and each of its mRNAs has an average ribosome initiation rate. Gene g consists of n_g codons, and the elongation and nonsense error rates at codon position i are c_g,i and b_g,i, respectively. We assume the elongation rate c_g,i varies between sites according to an InverseGamma distribution whose shape and scale parameters can vary between the 61 sense codons. The across-site heterogeneity of a given sense codon’s elongation rates reflects a variety of factors that alter ribosome elongation [22,47]. Although the nonsense error rate b_g,i likely varies with codon and position, for simplicity we focus solely on the variation between codons and, thus, treat the codon-specific nonsense error rate b as a constant, having the same value across all occurrences of a given codon type. Based on these assumptions, it is possible to model the distribution of footprints as coming from a negative binomial distribution. The composite parameter of this distribution is a rescaling of a codon’s specific rate parameter by the ratio of the partition function Z for the RFP distribution from which the footprints are sampled and the total number of observed footprints in a dataset, Y. Thus, Y/Z is a measure of the sampling efficiency of the experiment.

Given Y_g,i represents the observed number of ribosome footprints (RFP) at codon i of gene g, it follows that the likelihood of observing Y_g,i is,

(1)

where the rescaled initiation rate is the mRNA_g-specific ribosome initiation rate scaled by the density of mRNA_g transcripts within the cell, m_g. Assuming independence between elongation steps, it follows that the probability of a ribosome successfully elongating at codons 1 through i − 1 is,

(2)

where the density term is the PDF of the InverseGamma distribution for the ribosome elongation rate at codon i.

Because the nonsense error rate (NSE rate) b of a sense codon is likely to be several orders of magnitude less than a given c_g,i and we will be working with our likelihood function on the log scale, we can greatly speed up our evaluation of the log of Eqn. (1) by using a second-order Taylor series approximation around a mean NSE rate for all codons. Doing so gives,

Analysis of ribosome profiling data

Raw ribosome profiling reads for S. cerevisiae were downloaded from the Sequence Read Archive (SRA Run Accession: SRR1049521) [27]. These data were generated using a flash-freeze protocol to halt ribosome elongation, avoiding many of the technical artifacts caused by cycloheximide [27,48]. The raw reads were processed using riboviz2 [49] and a transcriptome FASTA file containing annotated protein-coding sequences from the Saccharomyces Genome Database (Release R64-2-1), as well as 250 nucleotides upstream and downstream of the start and stop codons, respectively. Aligned reads of length 28–30 were extracted and assigned to codons in the A-site, assuming a 15-nucleotide offset relative to the 5’-end of the read [44] (S1 Fig). We note that using different A-site offsets based on the riboWaltz R package had little impact on parameter estimation (S2 Fig).
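The A-site assignment described above can be illustrated with a minimal sketch. It is not the riboviz2 implementation: the function names, the 0-based transcript coordinates, and the example reads are assumptions made for illustration. Only the 15-nucleotide offset, the 250-nucleotide upstream flank, and the 28–30 read-length window come from the text.

```python
# Illustrative sketch only; names and coordinate conventions are assumptions.
FLANK = 250          # nucleotides upstream of the start codon in the FASTA
A_SITE_OFFSET = 15   # offset from the 5'-end of the read to the A-site
READ_LENGTHS = range(28, 31)  # keep reads of length 28, 29, or 30


def a_site_codon_index(read_start, flank=FLANK, offset=A_SITE_OFFSET):
    """Map a read's 5'-end (0-based transcript coordinate) to a 0-based
    codon index in the CDS, or None if the A-site lies upstream of it."""
    a_site_nt = read_start + offset - flank  # relative to first CDS nucleotide
    if a_site_nt < 0:
        return None
    return a_site_nt // 3


def count_footprints(reads, n_codons):
    """Tally RFP counts per codon from (read_start, read_length) pairs."""
    counts = [0] * n_codons
    for read_start, read_length in reads:
        if read_length not in READ_LENGTHS:
            continue
        codon = a_site_codon_index(read_start)
        if codon is not None and codon < n_codons:
            counts[codon] += 1
    return counts


# Hypothetical reads: (5'-end position, length)
example_reads = [(235, 28), (236, 29), (238, 30), (238, 25), (100, 28)]
example_counts = count_footprints(example_reads, n_codons=4)
```

With these hypothetical reads, the two reads starting at positions 235 and 236 fall on the first codon, the 30-nucleotide read at 238 falls on the second, and the remaining two are discarded (wrong length, or A-site upstream of the CDS).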

A flat file was created that includes the number of ribosome footprints (i.e., counts) assigned to each codon within a gene. This file was used as the primary input into our implementation of PANSE. We employed many filtering criteria to remove genes that violate the assumptions of the model (see S1 Text and S3 Fig). This led to a final dataset consisting of RFP counts for 3,112 S. cerevisiae protein-coding genes, representing approximately 50% of genes found in the genome.

Fitting the pausing and nonsense error model

The PANSE model was fit via a Markov chain Monte Carlo (MCMC) algorithm to the codon-level RFP counts for the genes included in the final data set. The MCMC was run for 50,0000 iterations, keeping every 5th iteration to reduce autocorrelation. Convergence was assessed by comparing the results of two separate MCMCs that were started at random points in parameter space. Posterior means and 95% highest density intervals (HDIs) were calculated for each parameter of PANSE based on the MCMC traces. Consistent with our previous work [7], gene-specific initiation rates are assumed to follow a lognormal distribution with a mean initiation rate of 1. This is accomplished by fixing the log-scale mean of the lognormal distribution to be −σ²/2, where σ is the log-scale standard deviation of the lognormal distribution. The shape and scale parameters for each codon-specific elongation rate c, and the codon-specific NSE rates b, were assumed to have flat priors with ranges (0, 100), (0, 100), and (1E-100, 1E-1), respectively, on the natural scale. We note these are particularly broad distributions, but we wanted to ensure that adequate parameter space could be explored. We fit PANSE (1) assuming no nonsense errors occur, (2) assuming uniform NSE rates across codons, and (3) allowing NSE rates to vary across codons. These three model fits were compared using the Deviance Information Criterion (DIC) [50] to determine statistical support for variation in NSE rates across codons.
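As a rough illustration of how a DIC comparison can be computed from a thinned MCMC trace, the sketch below uses the standard definition DIC = D̄ + p_D, with p_D = D̄ − D(θ̄) and D = −2 log L. The one-parameter Poisson likelihood, the counts, and the trace values are hypothetical stand-ins and not the PANSE likelihood.

```python
import math


def thin(trace, step=5):
    """Keep every `step`-th sample to reduce autocorrelation."""
    return trace[::step]


def poisson_loglik(rate, counts):
    """Toy log-likelihood (an assumption): i.i.d. Poisson counts."""
    return sum(k * math.log(rate) - rate - math.lgamma(k + 1) for k in counts)


def dic(samples, loglik_fn):
    """Return (DIC, p_D) for a one-parameter posterior sample."""
    deviances = [-2.0 * loglik_fn(theta) for theta in samples]
    mean_deviance = sum(deviances) / len(deviances)
    posterior_mean = sum(samples) / len(samples)
    deviance_at_mean = -2.0 * loglik_fn(posterior_mean)
    p_d = mean_deviance - deviance_at_mean  # effective number of parameters
    return mean_deviance + p_d, p_d


# Hypothetical footprint counts and a hypothetical posterior trace
counts = [3, 2, 4, 3, 5, 2]
trace = [2.8, 3.0, 3.1, 2.9, 3.2, 3.3, 3.0, 3.4, 2.7, 3.1,
         3.2, 3.0, 2.9, 3.5, 3.1, 3.0, 3.3, 2.8, 3.2, 3.1]
samples = thin(trace, step=5)
dic_value, p_d = dic(samples, lambda rate: poisson_loglik(rate, counts))
```

Competing fits are then ranked by their DIC values, with lower values indicating better support after penalizing model complexity.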

The ribosome profiling data we used were prepared with a flash-freeze protocol to halt elongation, but there is still an observable increase in ribosome density in the first 200 codons. Although this increased density was less than observed in ribosome profiling measurements using cycloheximide to halt elongation, it is unclear if this reflects true biology or a technical artifact [27]. As it is plausible that ribosome counts for the first 200 codons are impacted by unknown technical biases, we fit PANSE masking the ribosome counts for this region in the likelihood calculation for each gene. However, we did include these codons in our calculation of the expected probability of elongation up to codon i.

We compared parameter estimates of PANSE to (1) independent empirical data such as mRNA abundances and tRNA-based proxies for elongation rates [27], and (2) parameter estimates from the Ribosomal Overhead Cost version of the Stochastic Evolutionary Model of Protein Production Rates (ROC-SEMPPR), which estimates protein production rates and codon-specific selection coefficients from codon usage bias patterns [7]. See previous work regarding parameter estimation with ROC-SEMPPR for S. cerevisiae [7,51].

Analyzing variation in NSE rates b across codons

To better understand the variation in NSE rates b across codons, we performed a linear regression of the posterior mean NSE rates b on properties of the codon. These properties included the number of stop codons 1 nucleotide mutation away from the codon (0, 1, or 2), the missense error probability of the codon [52], and whether or not the codon is prone to frameshifts [42]. We weighted each codon in the regression by the standard deviations estimated from the MCMC traces.
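One of the regression covariates, the number of stop codons a single nucleotide substitution away from a sense codon, follows directly from the standard genetic code. A minimal sketch (the function and variable names are illustrative, not taken from the authors' scripts):

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
BASES = "ACGT"


def stop_neighbors(codon):
    """Count the stop codons reachable from `codon` by one substitution."""
    count = 0
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            if mutant in STOP_CODONS:
                count += 1
    return count


# The 61 sense codons of the standard genetic code
SENSE_CODONS = [a + b + c for a in BASES for b in BASES for c in BASES
                if a + b + c not in STOP_CODONS]
neighbor_counts = {codon: stop_neighbors(codon) for codon in SENSE_CODONS}
```

For example, TAT is one substitution from both TAA and TAG, CAA is one substitution from TAA only, and GGG has no stop codon within one substitution.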

Quantifying variation in translation completion probabilities across genes

We used our estimates of elongation and NSE rates, c and b, for each of the 61 sense codons to calculate the translation completion probability of each gene. This allowed us to apply this formula to all genes in the S. cerevisiae genome, regardless of whether or not they were included in the model fit. We assessed how translation completion probabilities varied across genes as a function of length and gene expression, measured as mRNA abundances (in units of RPKM) taken from previous work [27]. Additionally, we compared our inferences of translation completion probabilities, made directly from empirical ribosome profiling data, to theoretical estimates based on simulations from a Totally Asymmetric Exclusion Process (TASEP) model of translation [34].
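The calculation can be sketched as follows, assuming elongation and nonsense errors are competing exponential processes, so that a ribosome elongates at a codon with probability c/(c + b) and completes translation with the product of these probabilities over all codons. The rate values below are hypothetical and are not the fitted PANSE estimates.

```python
import math


def elongation_probability(c, b):
    """Probability of elongating rather than dropping off at one codon."""
    return c / (c + b)


def completion_probability(codons, rates):
    """Probability a ribosome elongates through every codon in `codons`.

    `rates` maps codon -> (elongation rate c, NSE rate b). The product is
    accumulated on the log scale to stay stable for long genes.
    """
    log_prob = sum(math.log(elongation_probability(*rates[codon]))
                   for codon in codons)
    return math.exp(log_prob)


# Hypothetical codon-specific rates (c, b), for illustration only
example_rates = {"AAA": (10.0, 1e-3), "AAG": (12.0, 1e-4)}
example_gene = ["AAA", "AAG", "AAG", "AAA"]
p_complete = completion_probability(example_gene, example_rates)
```

Because only codon-specific rates are needed, the same function applies to any coding sequence, including genes excluded from the model fit.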

Quantifying elongation probability variation within and across genes

As before, c_i and b_i represent the expected elongation and NSE rates, respectively, of the codon at position i. Given that there are only two possible outcomes in our model (elongation or a nonsense error), it follows that the probability a ribosome elongates the growing peptide chain at codon i is,

$$\Pr(\text{elongation at codon } i) = \frac{c_i}{c_i + b_i}$$

The probability a ribosome reaches codon i by successfully elongating at the previous codons is,

$$\Pr(\text{ribosome reaches codon } i) = \prod_{j=1}^{i-1} \frac{c_j}{c_j + b_j}$$

To evaluate how elongation probabilities vary with codon position i, we calculated the average probability a ribosome elongates at a given position as the geometric mean of the elongation probability across genes. We then regressed the average elongation probability on log(Position i) to quantify how the elongation probability changes as a function of position. To test if the observed slope was greater than expected under the null model of no selection against nonsense errors, we generated 1000 different permuted sets of genes for each of 3 different possible nulls: (1) the synonymous codons of an amino acid were permuted within a gene (randomized by amino acid), (2) amino acids and codons were permuted within a gene’s coding region (randomized by CDS), and (3) codons were permuted across genes (randomized across CDS). The slope estimated from the real set of genes was then compared to each of the 3 null distributions, with the p-value = (k + 1)/(1000 + 1), where k is the number of occurrences in which a slope from a permuted set of genes was greater than the slope from the real set of genes. A similar analysis was performed on the real data after binning genes based on mRNA abundances into the lower quartile (low expression), interquartile (moderate expression), and upper quartile (high expression) on the log scale.
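The permutation test can be sketched as below. This is a simplified illustration of a single null (shuffling codon positions within each gene, analogous to "randomized by CDS") that operates directly on per-codon elongation probabilities. The helper names and the toy input are assumptions, not the authors' code.

```python
import math
import random


def geometric_mean(values):
    return math.exp(sum(math.log(v) for v in values) / len(values))


def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def position_slope(genes):
    """Slope of the across-gene geometric-mean elongation probability
    regressed on log(position), with positions numbered from 1."""
    max_len = max(len(gene) for gene in genes)
    xs, ys = [], []
    for i in range(max_len):
        probs = [gene[i] for gene in genes if len(gene) > i]
        xs.append(math.log(i + 1))
        ys.append(geometric_mean(probs))
    return ols_slope(xs, ys)


def permutation_p_value(genes, n_perm=1000, seed=0):
    """One-sided p-value (k + 1) / (n_perm + 1), where k counts permuted
    slopes that exceed the observed slope."""
    rng = random.Random(seed)
    observed = position_slope(genes)
    k = 0
    for _ in range(n_perm):
        permuted = [rng.sample(gene, len(gene)) for gene in genes]
        if position_slope(permuted) > observed:
            k += 1
    return observed, (k + 1) / (n_perm + 1)


# Toy genes: elongation probabilities rise with position (hypothetical)
toy_genes = [[0.990, 0.995, 0.998, 0.999],
             [0.991, 0.994, 0.999, 0.9995],
             [0.989, 0.996, 0.997]]
observed_slope, p_value = permutation_p_value(toy_genes, n_perm=100)
```

Adding 1 to both the numerator and denominator keeps the p-value strictly positive, so its smallest attainable value is 1/(n_perm + 1).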

Identifying codons enriched in the 5’-end

To identify codons enriched in the 5’-end and 3’-end of the CDSs (i.e., near the start and stop codon, respectively), we used both an absolute and a relative definition of the termini. The absolute definition considered the termini to be the first and last 100 codons of a coding sequence. In this case, we restricted our analysis to coding sequences with a minimum of 250 codons. The relative definition considered the termini to be the first and last 25% of codons along a coding sequence. In this case, no minimum length cutoff was used.

To identify codons enriched in the 5’-end, an empirical expectation was determined by calculating the frequency of each codon, relative to its synonyms, in the “middle” (i.e., neither of the termini) of the CDS across all genes. For each codon, we used a one-sided binomial test to determine if the observed frequency in the 5’-ends was greater than expected by chance based on its observed frequency in the middle of the CDS. A similar enrichment test was used for the 3’-ends.
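A one-sided binomial enrichment test of this kind can be sketched with the standard library alone. The counts below are hypothetical, and the exact-tail function is a simple illustration that is adequate for modest sample sizes; for genome-scale counts a log-space implementation or a statistics library (e.g., `scipy.stats.binomtest` with `alternative="greater"`) would be more appropriate.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed exactly."""
    return sum(comb(n, j) * p ** j * (1 - p) ** (n - j)
               for j in range(k, n + 1))


def enrichment_p_value(codon_5p, synonyms_5p, codon_mid, synonyms_mid):
    """Test whether a codon's share among its synonyms in the 5'-end
    exceeds its share in the middle of the CDS (the empirical expectation).
    """
    expected_freq = codon_mid / synonyms_mid
    return binomial_upper_tail(codon_5p, synonyms_5p, expected_freq)


# Hypothetical counts: the codon makes up 30% of its synonym family in the
# middle of CDSs, but 45 of 100 occurrences of that family in the 5'-ends.
p_enriched = enrichment_p_value(codon_5p=45, synonyms_5p=100,
                                codon_mid=300, synonyms_mid=1000)
```

The same function applies to the 3'-end test by substituting the 3'-terminal counts.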

Calculating the cost of nonsense errors

To better understand the potential importance of nonsense errors in translation, we calculated the expected cost of producing a complete, functional protein (in terms of the number of hydrolyzed NTPs) from a given transcript that includes the impact of nonsense errors. For simplicity, we excluded the cost of amino acid synthesis from our calculations.

Conceptually, we break the cost of protein synthesis into two overlapping sets of categories: direct vs. indirect costs and fixed vs. variable costs. Direct costs include the NTP used in assembling the small and large ribosome subunits on the mRNA and each elongation step by the ribosome, whereas indirect costs are based on the synthesis cost of the translation infrastructure. In terms of fixed and variable costs, fixed costs are the direct and indirect costs of translating a protein in the absence of a nonsense error, i.e., costs incurred every round of successful translation. Variable costs are the expected direct and indirect costs of translation that are wasted when a nonsense error occurs – these costs are variable because they are only incurred if a nonsense error occurs. As a result, the total cost of producing a particular protein is the sum of fixed and variable costs, each of which is comprised of direct and indirect costs.

From our model, it follows that

(3)

where a₁ = 4 NTP and a₂ = 4 NTP are the direct costs of translation initiation and elongation, respectively, in terms of hydrolyzed phosphate bonds [18].

Combining the contribution of fixed and variable direct costs of protein synthesis yields,

(4)

Similarly, combining the contribution of the fixed and variable indirect costs of protein synthesis yields,

(5)

The pausing term represents the indirect cost of ribosome pausing up to codon i. Because RFP data lack information on the absolute rate of ribosome elongation, we rescaled our estimates of pausing times such that the average elongation rate across the 61 sense codons was ∼9.3 codons/sec or, equivalently, ∼0.11 sec/codon. Assuming the average mRNA has a CDS of 400 codons, the same average elongation rate as above, and that at any given time 80% of the ribosome population is engaged in translating mRNAs [27], the expected initiation rate c₀ can be derived (see S1 Text for more details).

The parameter C converts the indirect cost of ribosome pausing (in units of seconds) into its equivalent cost in terms of NTP and has units of NTP/sec. Although we are unaware of any empirical estimates of C, we used two different approaches to estimate C that vary by less than one order of magnitude. One estimate of C is based on selection coefficients estimated from a ROC-SEMPPR analysis of the S. cerevisiae genome and yields C_ROC = 0.71 NTPs/sec. The second estimate of C is based on the assembly cost (in NTPs) and average lifespan (in seconds) of the ribosome and yields 5.5 NTPs/sec. See S1 Text for detailed descriptions of these calculations. Given these two independent estimates of C are the only ones we have, we treat C_ROC and the ribosome-based estimate as lower and upper bounds on C, respectively. Thus, the average cost of ribosome pausing is approximately 0.076 or 0.59 NTP/codon for C_ROC and the ribosome-based estimate, respectively, suggesting indirect costs are a fraction of the direct cost of 4 NTP/codon.
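The per-codon pausing costs follow from multiplying each estimate of C by the mean waiting time per codon, i.e., the inverse of the 9.3 codons/sec average elongation rate. A short worked check (the variable names are illustrative):

```python
MEAN_ELONGATION_RATE = 9.3                    # codons/sec, from the text
MEAN_WAITING_TIME = 1 / MEAN_ELONGATION_RATE  # sec/codon
DIRECT_ELONGATION_COST = 4.0                  # NTP/codon, from the text


def pausing_cost_per_codon(ntp_per_sec):
    """Average indirect cost of pausing at one codon, in NTP."""
    return ntp_per_sec * MEAN_WAITING_TIME


lower = pausing_cost_per_codon(0.71)  # ROC-SEMPPR-based estimate of C
upper = pausing_cost_per_codon(5.5)   # ribosome assembly/lifespan estimate

print(f"waiting time: {MEAN_WAITING_TIME:.3f} sec/codon")
print(f"pausing cost: {lower:.3f} to {upper:.2f} NTP/codon")
print(f"share of direct cost: {lower / DIRECT_ELONGATION_COST:.1%} "
      f"to {upper / DIRECT_ELONGATION_COST:.1%}")
```

Under either bound, the average indirect cost of pausing remains well below the direct cost of 4 NTP per elongation step.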

In addition to using our cost estimates directly, we also used them to test the hypothesis that the order of synonymous codons along a gene shows evidence of adaptation to reduce the cost of nonsense errors. To do so, we generated a null distribution of the expected cost of translation for each gene when there is no adaptation to reduce the cost of nonsense errors by permuting the order of synonymous codons. These permuted sequences had the exact same amino acid sequence and codon usage as the sequence observed in the genome, but differed in the order of synonyms. For each permutation, we calculated its expected protein production cost using Eqn. (3). We then compared the mean protein production cost of our population of permuted sequences to that of the protein production cost of the observed sequence. We then scored each observed sequence as having a protein production cost either greater than (0) or less than (1) the expected cost under the null hypothesis. To test whether the observed sequences are biased towards lower costs than expected under the null hypothesis, we performed a binomial test (H₀: p = 0.5). This allowed us to test if the number of sequences with an expected cost less than the mean expected cost across the permuted sequences was greater than expected by chance. Because the gross cost of nonsense errors scales with the gene’s protein production rate, we performed a logistic regression to test if the probability that a gene has a lower cost than expected increases with gene expression.
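The synonym-order permutation can be sketched as follows. The shuffle preserves both the amino acid sequence and the codon usage of a gene, as described above. The `toy_cost` function is a hypothetical placeholder and not Eqn. (3): it simply weights each codon's assumed NSE probability by its position, so that errors occurring later waste more elongation steps.

```python
import random
from collections import Counter, defaultdict


def shuffle_synonyms(codons, codon_to_aa, rng):
    """Permute the order of synonymous codons within a gene, keeping the
    amino acid sequence and overall codon counts unchanged."""
    positions = defaultdict(list)
    for i, codon in enumerate(codons):
        positions[codon_to_aa[codon]].append(i)
    shuffled = list(codons)
    for indices in positions.values():
        pool = [codons[i] for i in indices]
        rng.shuffle(pool)
        for i, codon in zip(indices, pool):
            shuffled[i] = codon
    return shuffled


def score_gene(codons, codon_to_aa, cost_fn, n_perm=100, seed=0):
    """Return 1 if the observed cost is below the mean permuted cost,
    otherwise 0."""
    rng = random.Random(seed)
    null_costs = [cost_fn(shuffle_synonyms(codons, codon_to_aa, rng))
                  for _ in range(n_perm)]
    return 1 if cost_fn(codons) < sum(null_costs) / n_perm else 0


# Hypothetical inputs for illustration
CODON_TO_AA = {"AAA": "K", "AAG": "K", "GAA": "E", "GAG": "E"}
NSE_PROB = {"AAA": 1e-3, "AAG": 1e-5, "GAA": 5e-4, "GAG": 2e-5}


def toy_cost(codons):
    """Placeholder cost: position-weighted NSE probability (not Eqn. 3)."""
    return sum((i + 1) * NSE_PROB[codon] for i, codon in enumerate(codons))


gene = ["AAA", "GAA", "AAA", "GAG", "AAG", "GAG", "AAG"]
permuted = shuffle_synonyms(gene, CODON_TO_AA, random.Random(1))
score = score_gene(gene, CODON_TO_AA, toy_cost)
```

Per-gene scores of this kind are what the binomial test (H₀: p = 0.5) and the logistic regression on gene expression would then take as input.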

Analysis of additional S. cerevisiae ribosome profiling data

To complement our analysis of the Weinberg et al. data, which is generally considered to be high quality, we also analyzed data from two more recent ribosome profiling experiments for comparison: Wu et al. 2019 (SRA: SRR7241903) [29] and Ferguson et al. 2023 (SRA: SRR23242245, SRR23242246) [53]. We also analyzed a wild-type replicate (SRA: SRR5766382) and an elp1 deletion strain replicate (SRA: SRR5766388) from Chou et al. 2017 [28] to confirm that changes to tRNA function (in this case, deletion of a tRNA modification enzyme) primarily impact elongation rates c and not NSE rates b. These datasets were processed using the riboviz2 pipeline, with minor differences in the read lengths selected and in defining the A-site (see S1 Text). Finally, to estimate stop codon “NSE rates,” we introduced a slight modification to the Weinberg et al. data, allowing up to 15 codons in empirically determined 3’-UTRs (obtained from [54]).

Results

After PANSE was fit to S. cerevisiae ribosome profiling data from Weinberg et al. [27], we compared parameter estimates to appropriate empirical and theoretical proxies, finding them in good agreement (Fig 1, S1 Table, S2 Table). mRNA abundances and tRNA gene copy numbers (tGCN) are common empirical proxies for translation initiation rates and elongation rates, respectively [24,27]. We expected PANSE estimates of gene-specific translation initiation rates and codon-specific elongation rates c to be well-correlated with these empirical proxies. Indeed, translation initiation rates and independent RNA-Seq-based estimates of mRNA abundances are strongly correlated [27] (Fig 1A, Spearman rank correlation). PANSE estimates of elongation rates c and estimates based on tGCNs (including effects of wobble base pairing) are reasonably well-correlated (Fig 1B, Spearman rank correlation).


Fig 1. Comparing PANSE gene-specific initiation rates and codon-specific elongation rates to empirical and codon-based estimates.

Histograms on the x- and y-axes represent the distributions of the relevant variables. (A) RNA-seq estimates of mRNA abundances (RPKM) (a common proxy for translation initiation rates) and PANSE estimates of initiation rates. (B) tRNA gene copy number (tGCN) based estimates of elongation rates and PANSE estimates of elongation rates. (C) ROC-SEMPPR estimates of protein production rates (based on differences in codon usage) and PANSE estimates of initiation rates. (D) ROC-SEMPPR estimates of selection coefficients and relative ribosome waiting times. Waiting times w are defined as the inverse of the elongation rates, w = 1/c. Selection coefficients and waiting times were set relative to the alphabetically last codon for each amino acid.

https://doi.org/10.1371/journal.pgen.1012162.g001

Theoretical biophysical models rooted in population genetics principles effectively estimate parameters relevant to the evolution of codon usage bias [6,7,18,55]. One such model, ROC-SEMPPR, estimates protein production rates and selection coefficients solely from genome-wide patterns of codon usage bias. ROC-SEMPPR assumes differences in natural selection on codon usage are due to differences in elongation rates c between synonyms [6,7]. We obtained parameter estimates from a previous application of ROC-SEMPPR to the S. cerevisiae CDSs [51]. We note protein production rates depend on both initiation rates and translation completion probabilities, but ROC-SEMPPR ignores translation errors such that its protein production rates correspond to initiation rates. As expected, initiation rates and protein production rates are strongly correlated (Fig 1C, Spearman rank correlation). To make elongation rates c comparable to the ROC-SEMPPR selection coefficients (which reflect selection against slow codons), we converted our elongation rates to pausing times and made them relative to ROC-SEMPPR’s pre-defined reference codon for each synonymous codon family (i.e., the waiting time of codon i was expressed relative to that of the reference synonym, c_ref). These estimated relative waiting times and selection coefficients are strongly correlated (Fig 1D, Spearman rank correlation).

To the best of our knowledge, empirical measurements of codon-specific NSE rates b are missing from the literature; however, the frequencies at which ribosomes mistakenly read through stop codons have been quantified using a variety of approaches [54,56], with TAA and TGA being the least and most prone to readthrough, respectively. Stop codon readthrough results from a missense error at a ribosome awaiting a release factor, allowing it to elongate into the 3’-UTR of a transcript. We do not explicitly model the process of translation termination and rare missense error events at stop codons, but we expect stop codon “NSE” rates b to be anti-correlated with their readthrough efficiencies. By slightly modifying our Weinberg dataset to allow for up to 15 codons in the 3’-UTRs, we find that TAA had the highest “NSE” rate b, while TGA had the lowest, consistent with independent estimates of stop codon readthrough efficiency (S4A Fig).

The three stop codons have “NSE” rates b greater (by ≥2 orders of magnitude) than the NSE rates of the sense codons.

As an additional test of PANSE’s robustness, we tested if the deletion of the gene encoding a non-essential tRNA modification enzyme elp1 impacted primarily codon-specific elongation rates c or NSE rates b. The deletion of elp1 is known to decrease the elongation rates of codons AAA, CAA, and GAA [28,57]; thus, we expect this effect to be primarily absorbed into the elongation rates c parameter. By comparing model parameter estimates between the deletion strain and a wild-type strain [28], we observe a clear decrease in elongation rates for the specified codons, but observe no significant change in the NSE rates (S4B, S4C Fig).

NSE rates vary across codons

Theoretical and computational studies often ignore nonsense errors or assume a uniform (background) NSE rate b across all codons [21,30,33,34]. However, empirical studies indicate nonsense errors occur at an appreciable frequency [9,32], with targeted studies suggesting background NSE rates b could vary across codons [35,36]. This naturally leads to two questions: (1) do we see evidence of nonsense errors in ribosome profiling data, and (2) do we see evidence that NSE rates b (and not just NSE probabilities) vary by codon? To answer question (1), we compared the most complex model (NSE rates b vary across codons) to the simplest model (no nonsense errors) using a standard model comparison approach based on the Deviance Information Criterion (DIC) [50,58]. We find that the variable NSE rate b model better fits the ribosome profiling data compared to the no nonsense error model by 2381 DIC units. This indicates that the effects of nonsense errors are detectable in ribosome profiling data and that changes to ribosome density along transcripts are not solely due to differences in codon waiting times.

To answer question (2), we compared the model allowing NSE rates b to vary across codons to a model that assumed all codons had a uniform NSE rate. The uniform NSE rate model still allows for differences in NSE probabilities Pr(NSE) through differences in codon elongation rates. We find that the model allowing for variation in NSE rates b across codons is 187 DIC units better than the model assuming uniform NSE rates, indicating strong support for differences in the NSE rates across codons. This suggests that codon properties other than elongation rate alter the probability of nonsense errors Pr(NSE). NSE rates b varied over multiple orders of magnitude, ranging from 8.87 × 10⁻⁶ to 1.98 × 10⁻³ (Fig 2A). When accounting for differences in elongation rates c, the mean NSE probabilities Pr(NSE) across codons were on the order of 10⁻⁴, consistent with previous estimates in E. coli and S. cerevisiae [19–23].
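The DIC comparisons described above can be sketched as follows. This is a minimal illustration, assuming the common definition DIC = mean deviance + p_D, with p_D equal to the mean deviance minus the deviance at the posterior mean; this section does not state which DIC variant the fitting software uses. The function name `dic` and all numbers are invented for illustration and are not the paper's values.

```python
from statistics import mean

def dic(deviance_samples, deviance_at_posterior_mean):
    """Return (DIC, p_D) from posterior deviance draws.

    deviance_samples: D(theta) = -2 * log-likelihood at each MCMC draw.
    deviance_at_posterior_mean: D evaluated at the posterior mean of theta.
    """
    mean_deviance = mean(deviance_samples)
    # Effective number of parameters (assumed definition, see lead-in)
    p_d = mean_deviance - deviance_at_posterior_mean
    return mean_deviance + p_d, p_d

# Toy deviances, invented for illustration only
dic_variable, pd_variable = dic([1000.0, 1004.0, 1002.0], 990.0)
dic_uniform, pd_uniform = dic([1100.0, 1104.0, 1102.0], 1098.0)

# A positive difference favors the variable-rate model
delta_dic = dic_uniform - dic_variable
print(dic_variable, dic_uniform, delta_dic)
```

Because a lower DIC indicates a better trade-off between fit and complexity, a richer model is preferred only when its improvement in mean deviance outweighs its larger p_D.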


Fig 2. The background NSE rates b vary across codons.

NSE rates b are reported on a log₁₀ scale. (A) Posterior means and 95% highest density intervals (HDIs) of NSE rates b estimated for each codon. Colors indicate the number of nucleotide mutations away a codon is from a stop codon. Solid and dashed black lines indicate the NSE rate b posterior mean and 95% HDI for a model fit sharing information across codons (i.e., assuming no variation in NSE rates). To be consistent with our previous work, we separated the amino acid serine into two blocks denoted S and Z. Bolded x-axis labels indicate codons that are frameshift competent [42]. (B) Comparison of NSE rates b with empirical missense error probabilities estimated from mass spectrometry data [52]. The Spearman rank correlation is reported. (C) Comparison of NSE rates b between 11 frameshift competent codons (as defined in [42]) and the 50 other sense codons. A Welch’s two-sample t-test is reported. (D) Regression coefficient estimates from a weighted multiple regression of codon properties and the NSE rates b. Variables include the number of stop codon neighbors at each position, whether or not the codon was prone to frameshifts (frameshift competent), and the missense error probability of the codon. The intercept reflects the mean NSE rate b of the reference class of codon. Error bars represent 95% confidence intervals. An * indicates statistical significance (p < 0.05).

https://doi.org/10.1371/journal.pgen.1012162.g002

Importantly, our model is agnostic to the specific mechanisms associated with nonsense errors. Notable mechanisms that can lead to nonsense errors are release factor binding of sense codons, peptidyl-tRNA drop-off, and frameshift errors [59]. Mismatches between the codon and anticodon resulting from near-cognate or even non-cognate tRNA binding (i.e., missense errors) can increase the chances of peptidyl-tRNA drop-off and frameshift errors [40,43,60]. Consistent with this, we find that our NSE rate b estimates from ribosome profiling are positively correlated with codon-specific missense error probabilities estimated from a large number of mass spectrometry measurements in S. cerevisiae (Spearman rank correlation , , Fig 2B) [52]. Previous work in S. cerevisiae identified 11 sense codons that were particularly prone to causing frameshifts when located in the P-site of a ribosome, particularly when followed by a slow codon [42]. Based on our model estimates, these 11 “frameshift competent” codons had generally higher NSE rates b than the other 50 sense codons (on the log₁₀-scale, mean of -3.58 vs. -4.44, Welch’s two-sample t-test, p = 0.0015, Fig 2C).

To better understand how these factors associated with higher NSE rates b contribute to the observed variation, we regressed our estimates against the number of stop codon neighbors (the number of stop codons that are a single nucleotide mutation away from a codon), the average missense error probability, and whether or not the codon is frameshift competent (Fig 2D, S3 Table). We note that we treated the number of stop codon neighbors as a categorical variable, as it is always either 0, 1, or 2. As this is regressing multiple categorical variables against the log₁₀(NSE rate b), each of these coefficients represents a change relative to a reference class defined by the intercept.

One might expect that having more stop codon neighbors increases the NSE rate b of a codon, as there would seem to be a greater chance of being mistakenly recognized by a release factor. Indeed, we find that codons with a single stop codon neighbor at the 3rd position of the codon have an NSE rate b almost 1 order of magnitude greater than codons with no stop codon neighbors at the third position (, p = 0.0012). We did not observe such a relationship for codons with 2 stop codon neighbors at the third position (, p = 0.884). In contrast, codons with a stop codon neighbor at either the 1st or 2nd nucleotide position showed no significant difference in NSE rates b compared to codons with no stop codon neighbors at these positions. This likely reflects that the ribosome more closely monitors codon-anticodon base pairing at the first 2 positions [61]. Consistent with the independent tests (Fig 2B,2C) and in line with the suspected role of missense errors in contributing to peptidyl-tRNA drop-off and frameshifts, we find a positive effect of a codon’s missense error probability (, ) and frameshift competency (, ) on NSE rates b.

Nonsense errors are an unlikely explanation for the “5’-ramp”

Ribosome profiling measurements often exhibit increased ribosome densities at the 5’-end of the CDS relative to the remainder of the mRNA [27]. We simulated ribosome profiling data based on the PANSE model using the parameters estimated from the real data. We observe a moderate correlation between the real and simulated ribosome counts (Spearman rank correlation , ). On a position-by-position basis (ignoring the first 200 codons), the average log-fold difference between the real and simulated data is approximately 0.015 (One-sample t-test , S5 Fig), suggesting our PANSE model slightly underestimates the number of counts by about 1.5%, on average.

Based on the metagene profile of the ribosome densities for the real and simulated data, we observe good agreement in the region beyond the 200th codon (Fig 3A, unshaded region). This is expected because this was the region used for calculating the likelihood of the data during model fitting. In contrast, the model poorly predicts ribosome densities near the 5’-end (a.k.a. the 5’-ramp region), which were excluded during the model fitting. The gradual decrease in ribosome densities along the first 200 codons is far more drastic in the real data than expected based on the model parameters estimated using the remainder of the CDSs (Fig 3A, shaded region). This suggests other factors affect ribosome densities in the 5’-ramp, including the possibility that the ramp is a technical artifact [27].


Fig 3. The effects of increased ribosome density at the 5’-end of coding sequences.

(A) Comparing metagene profiles for real ribosome profiling data (purple) and simulated ribosome profiling data based on the PANSE model (yellow). The first 200 codons are shaded to emphasize these codons were excluded in the likelihood calculation during model fitting. (B) Comparison of NSE rate b estimates for each of the 61 sense codons based on either the first 200 codons or the remainder of the gene.

https://doi.org/10.1371/journal.pgen.1012162.g003

The discrepancy between real and simulated data raises the question of what NSE rates b would be necessary to generate the dramatic drop in ribosome density at the 5’-end. We fit PANSE to only the first 200 codons to determine if the parameters estimated were biologically plausible. Although initiation rates and elongation rates c are in good agreement with estimates from the post-200 codon fits (S6 Fig), the NSE rates b are significantly greater, with a mean NSE rate of approximately 0.003 across codons (Fig 3B). These higher NSE rates b translate to a higher average NSE probability Pr(NSE) of approximately 0.004 across the 61 sense codons. An NSE probability Pr(NSE) of 0.004 per codon means only 45% of initiated ribosomes are expected to make it to the 200th codon, which is unrealistically low. Taken together, our results suggest the 5’-ramp in ribosome profiling data is, at best, only partially due to nonsense errors.
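The 45% figure follows from compounding a per-codon NSE probability over 200 codons. The sketch below assumes a uniform, independent per-codon probability, which simplifies the codon-specific model. The function name `prob_reach` is hypothetical, and 1e-4 is used only as an illustrative value on the order of the post-200-codon estimates.

```python
def prob_reach(position, nse_prob_per_codon):
    """Probability a ribosome elongates past `position` codons without an NSE,
    assuming the same independent NSE probability at every codon."""
    return (1.0 - nse_prob_per_codon) ** position

# Per-codon probability implied by fitting only the first 200 codons
ramp_fit = prob_reach(200, 0.004)
# Illustrative per-codon probability on the order of 10^-4
body_fit = prob_reach(200, 1e-4)

print(f"Pr(NSE)=0.004: {ramp_fit:.3f} of ribosomes reach codon 200")
print(f"Pr(NSE)=1e-4:  {body_fit:.3f} of ribosomes reach codon 200")
```

Under these assumptions, roughly 45% of ribosomes survive to codon 200 at a per-codon probability of 0.004, compared with about 98% at 1e-4.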

The probability that translation is completed varies greatly between transcripts

Based on our best model fit, codons differ in their elongation rates c and NSE rates b, meaning they also differ in their NSE probability Pr(NSE). As such, variation in codon usage across genes is expected to lead to variation in the probability of a ribosome completing translation. Across all genes, the median probability of a ribosome completing translation is approximately 0.92 (interquartile range 0.87 – 0.95). A key factor determining the probability of a ribosome completing translation is the number of codons within a transcript’s CDS. Even if the probability of a nonsense error at any given codon is low, longer CDSs afford more opportunities for a nonsense error. The length of a CDS and its probability of experiencing a nonsense error are highly correlated, as expected (Fig 4A, Spearman rank correlation , ). Unsurprisingly, the S. cerevisiae genes YHR099W, YKR054C, and YLR106C are the only ones considered with a greater than 50% chance of a ribosome experiencing a nonsense error. This likely has little to do with codon usage, as their respective transcripts have > 3700 codons, such that there are many opportunities for any given ribosome to drop off during translation.
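To illustrate how codon-specific NSE probabilities and CDS length combine, the sketch below multiplies per-codon survival probabilities along a sequence. It assumes NSEs at different codons are independent; the function name `completion_probability`, the two-codon table, and the toy sequences are hypothetical and are not estimates from the paper.

```python
import math

def completion_probability(codons, nse_prob):
    """Probability a ribosome traverses every codon without a nonsense error.

    Sums log(1 - p) rather than multiplying, for numerical stability on long CDSs.
    """
    log_p = sum(math.log1p(-nse_prob[c]) for c in codons)
    return math.exp(log_p)

# Hypothetical per-codon NSE probabilities for two synonymous lysine codons
nse_prob = {"AAA": 2e-4, "AAG": 5e-5}

short_cds = ["AAA", "AAG"] * 150    # 300 codons
long_cds = ["AAA", "AAG"] * 1850    # 3700 codons
recoded_long_cds = ["AAG"] * 3700   # same length, only the less error-prone synonym

p_short = completion_probability(short_cds, nse_prob)
p_long = completion_probability(long_cds, nse_prob)
p_recoded = completion_probability(recoded_long_cds, nse_prob)
print(p_short, p_long, p_recoded)
```

With these toy values, length dominates the completion probability, while synonymous recoding shifts it among sequences of the same length.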


Fig 4. Variation in the probability of a single ribosome completing translation.

Colors represent protein production rates estimated by ROC-SEMPPR (i.e., ) for each gene. (A) Relationship between CDS length (in number of codons) and the probability of a nonsense error occurring. (B) Comparison of the probability of no nonsense error occurring with simulation-based estimates of drop-off resilience [34]. The size of the point represents the length (in number of codons) of the gene’s coding region.

https://doi.org/10.1371/journal.pgen.1012162.g004

A previous study estimated the probability that a ribosome completes translation using a Totally Asymmetric Simple Exclusion Process (TASEP) simulation parameterized by polysome profiling data (translation initiation rates), tRNA counts (codon-specific elongation rates), and a single NSE rate estimate from E. coli [34]. We compared our estimates of translation completion probability based on a model fit to empirical data to theoretical expectations based on this TASEP simulation (referred to as “drop-off resilience”), finding them to be highly correlated (Fig 4B, Spearman rank correlation , ). We still observe a high correlation between the empirical and theoretical estimates when conditioning on CDS length (partial Spearman rank correlation, , ), indicating length is not the only cause of the similarity between the two estimates of translation completion probabilities. Estimates of translation completion probability from PANSE are generally greater than expected based on these TASEP simulations, which assumed a uniform NSE rate b across all sense codons. The NSE rate b used for the TASEP simulations was taken from a study that implicitly assumed the drop in ribosome density along transcripts was solely due to nonsense errors [21]. As we and others [22] have shown, the 5’-ramp region inflates estimates of NSE rates. Regardless, the high correlation between our estimates and theoretical expectations suggests our PANSE model generally captures the across-transcript variability in the probability a ribosome completes translation based solely on empirical data.

Evidence supports adaptation to reduce nonsense errors

As protein synthesis is energetically costly, natural selection is expected to result in genome-level patterns consistent with adaptations to reduce the costs of nonsense errors. We specifically examine two key adaptations: (1) the reduction in frequency of codons with higher NSE probabilities Pr(NSE) along a CDS (5’ to 3’ direction), and (2) an anti-correlation between the frequencies of codons with higher NSE probabilities and gene expression.

Evidence that adaptation increases with position.

As the energetic investment in producing a protein increases as amino acids are added to the peptide chain, natural selection against nonsense errors is expected to be weakest near the start of translation, resulting in position-specific patterns of codon usage in which nonsense error-prone codons are biased toward the 5’-end of a gene’s CDS [17,18,30,33,62]. As such, codons with higher NSE probabilities Pr(NSE) are expected to be enriched in the 5’-ends relative to the remainder of the CDS. To control for the different lengths of CDSs, we assigned codons to the 5’-end based on their relative positions (i.e., their actual position divided by the number of codons in a given CDS) using a relative position cutoff of 0.25. The “middle” of a gene’s CDS was defined as codons falling between 0.25 and 0.75. By comparing synonymous codon frequencies in the 5’-ends to the middle region, we identified codons enriched at the 5’-end (one-sided binomial test, Benjamini-Hochberg adjusted p < 0.05). As expected, codons enriched in the 5’-ends have higher Pr(NSE) than codons showing no difference between the 5’-end and the middle regions (Wilcoxon rank sum test, , Fig 5A). We observe no such pattern if comparing the Pr(NSE) of codons enriched in the 3’-ends to those not enriched relative to the middle of CDSs (Wilcoxon rank sum test p = 0.67, Fig 5B). We obtain a similar result if we define the termini as the first and last 100 codons of each gene’s CDS (S7 Fig). We emphasize that the first 200 codons of each CDS were not considered in the actual parameter estimation (i.e., were not considered in the likelihood calculation).
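The enrichment test above can be sketched with standard-library Python. This sketch assumes the null proportion for each codon is its frequency among synonyms in the middle region, and that region boundaries are inclusive at the cutoffs; neither detail is specified in this section. All function names and counts are hypothetical.

```python
from math import comb

def relative_region(position, cds_length, cutoff=0.25):
    """Label a 1-based codon position by its relative position in the CDS."""
    rel = position / cds_length
    if rel <= cutoff:
        return "five_prime"
    if rel <= 1.0 - cutoff:
        return "middle"
    return "three_prime"

def binom_upper_tail(k, n, p):
    """One-sided binomial p-value: P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for offset, i in enumerate(order):
        rank = m - offset  # 1-based rank when sorted ascending
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Toy example: a codon is 60 of 100 synonymous codons in the 5'-end,
# while its frequency among synonyms in the middle region is 0.5.
p_enriched = binom_upper_tail(60, 100, 0.5)
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(p_enriched, adjusted)
```

For genome-scale counts, the explicit sum in `binom_upper_tail` can overflow floating-point conversion, so a statistical library routine would be preferable in practice.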


Fig 5. Selection against nonsense errors is weakest at the 5’-ends.

(A) Comparison of codon-specific Pr(NSE) (on the log scale) of codons enriched in the 5’-end (the first 25% of codons for each sequence) vs. those not significantly enriched relative to the middle regions of the CDSs. Wilcoxon rank sum test p-value reported. (B) Same as in (A), but using the last 25% of codons. (C) Geometric mean of the probability of successful elongation by the ribosome up to the 500th codon across all CDSs (purple). Solid lines and equations represent the linear regressions relating the log(Position) of a codon to the change in probability of elongation for the real sequences and the various nulls. For the null regression lines, the mean slope and the mean intercept across the 1000 permuted sequences were used.

https://doi.org/10.1371/journal.pgen.1012162.g005

Natural selection against nonsense errors is expected to strengthen along a CDS, thereby increasing the probability of successful elongation. Except for the first 10 codons, the probability of elongation increases along the CDS before appearing to plateau (Fig 5C). This increase in the probability of elongation as a function of codon position is consistent with adaptations to reduce the energetic cost of nonsense errors. Regarding the apparent decrease in the probability of elongation for the first 10 codons, we note that these results are qualitatively consistent with previous findings that observed a decrease in codon bias immediately following the start codon, followed by a gradual increase [62]. The change in the probability of elongation observed in the true sequences is much greater than observed for the null expectations generated by permuting codons (Figs 5C, S8). This is consistent with natural selection against nonsense errors being generally weaker at the 5’-end.

Evidence that adaptation increases with gene expression.

As highly expressed genes generally undergo more rounds of translation (i.e., take up a larger portion of the cell’s energy budget), a lower probability of completing translation in a highly expressed gene will lead to more wasted NTP. Thus, selection against nonsense errors should increase with gene expression, such that highly expressed genes generally have higher probabilities of completing translation . In our comparisons of CDS length and the probability a nonsense error occurs, we observed a clear gradient in this relationship: highly-expressed genes tend to have lower probabilities of experiencing a nonsense error compared to CDSs of similar length but lower expression (Fig 4A). Using mRNA abundances [27] as an independent measure of gene expression, we find a positive correlation between the probability of completing translation and gene expression (Spearman rank correlation , , Fig 6A). This analysis does not control for CDS length, indicating highly expressed genes are more likely to be successfully translated regardless of length. By using a partial Spearman correlation to control for length, we find that the correlation between gene expression and the probability of completing translation increased (partial Spearman rank correlation , ). As expected, our results are consistent with selection against nonsense errors being generally stronger in high-expression genes compared to moderate- or low-expression genes of similar length.
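A partial Spearman correlation of the kind used above can be sketched as a first-order partial correlation computed on ranks. This is a minimal standard-library illustration, assuming a single control variable (CDS length); the authors' actual implementation is not described in this section, and the toy data are invented.

```python
import math

def ranks(values):
    """Average 1-based ranks, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman(x, y):
    return pearson(ranks(x), ranks(y))

def partial_spearman(x, y, z):
    """Spearman correlation of x and y after controlling for z."""
    rxy, rxz, ryz = spearman(x, y), spearman(x, z), spearman(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz**2) * (1 - ryz**2))

# Toy data, invented: x = expression, y = completion probability, z = CDS length
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
z = [1, 3, 2, 5, 4]
raw = spearman(x, y)
controlled = partial_spearman(x, y, z)
print(raw, controlled)
```

In this toy example the correlation rises after controlling for the third variable, which shows that a partial correlation can exceed the raw correlation, as reported above for gene expression and completion probability.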


Fig 6. Highly-expressed genes are better adapted to avoid nonsense errors.

(A) Comparing gene expression (mRNA abundance RPKM) with the probability of a ribosome completing translation. Histograms on the x and y-axes represent the distributions of the corresponding variables. Colors indicate the length of the CDS. (B) Comparing relative NSE probabilities Pr(NSE) among synonymous codons to selection coefficients from ROC-SEMPPR. A higher value of indicates a codon that is disfavored by selection relative to a reference codon (the alphabetically last codon for each amino acid). Error bars represent the 95% HDIs. (C) Same as in Fig 5C but separating genes into bins based on mRNA abundances (RPKMs). Solid lines represent the linear regression relating codon log(position) to the probability of elongation. Regressions (all coefficients with ) and Spearman rank correlations are reported.

https://doi.org/10.1371/journal.pgen.1012162.g006

ROC-SEMPPR assumes selection on codon usage is uniform along a CDS, but codons with higher NSE probabilities are expected to be avoided in high-expression genes due to the increased energetic costs of experiencing frequent translation errors. Under selection against nonsense errors, we expect ROC-SEMPPR’s selection coefficients to be correlated with relative NSE probabilities (i.e., between synonymous codons). We find that the selection coefficients and differences in NSE probabilities Pr(NSE) are well-correlated (Spearman rank correlation , Fig 6B), indicating that synonymous codons with higher NSE probabilities Pr(NSE) are avoided in high-expression genes.

Consistent with the avoidance of synonymous codons with higher NSE probabilities Pr(NSE) in high-expression genes, we observed that the probability of successful elongation by position is greater in high-expression genes (Fig 6C), suggesting these genes are better adapted to reduce the cost of nonsense errors. Independent of gene expression, we observed that the probability of successful elongation increases along the transcripts.

The energetic costs of nonsense errors are likely substantial

Although nonsense errors are rarer on average than missense errors [9,59], they may be more costly due to the high probability that the truncated protein is non-functional. We calculated the expected cost of translation (in terms of NTP) (see Materials and Methods) that accounts for the direct ATP cost of translation initiation and peptide elongation, the indirect overhead cost due to ribosome pausing, and the direct and indirect costs of nonsense errors. The direct and indirect costs associated with translation initiation and elongation are the fixed costs, as these will be incurred every time a functional protein is produced. In contrast, direct and indirect costs associated with nonsense errors are variable costs as these will only be incurred if a nonsense error occurs (see Eqn. 3). For the cost of ribosome pausing during translation C, we focus on a rough estimate based on ribosome production costs and half-lives as this estimate is more directly tied to ribosome assembly (see S1 Text). Unsurprisingly, the majority of NTP used during translation is associated with fixed direct costs of translation initiation and ribosome elongation (Fig 7A); however, this is not expected to impact the evolution of codon usage because fixed direct costs do not vary across codons.


Fig 7. Expected cost of translation in terms of NTP across genes.

(A) Relative costs of translation in terms of direct costs (translation initiation and peptide elongation), indirect costs (overhead costs of ribosome pausing), and variable costs (cost of nonsense errors). Genes are ordered based on total energetic flux: the product of the total costs and protein production rates (as estimated by ROC-SEMPPR, i.e., ). (B) The log fold-difference between the variable costs (i.e., the direct and indirect costs of nonsense errors) and the fixed indirect cost of elongation (i.e., ribosome pausing). The mean log fold-difference is -0.56 (One-sample t-test ). (C) Relationship between gene expression (mRNA RPKM) and the expected cost of translation per codon. The Spearman rank correlation and the associated p-value are reported. (D) Relationship between gene expression and the probability of whether or not a gene has a lower observed energetic cost than expected based on the permuted nulls. Histograms represent the distribution of gene expression for genes with and without evidence of adaptation against nonsense errors. The solid line and the equation represent the result of a logistic regression. The logistic regression slope is statistically significant (p = 0.00024), but the intercept is not (p = 0.19). (E) Same as in (D), but with length as the predictor variable. Both the logistic regression slope () and intercept () are statistically significant.

https://doi.org/10.1371/journal.pgen.1012162.g007

Studies on the evolution of codon usage have primarily focused on the cost of pausing, either explicitly or implicitly. Based on NTP/s, the fixed indirect cost (i.e., the cost of ribosome pausing) is generally greater than variable costs (both variable direct and variable indirect costs) (Fig 7A,7B). However, variable costs (i.e., the direct and indirect costs associated with a nonsense error) are usually within an order of magnitude of the fixed indirect costs (i.e., the total cost of ribosome pausing for producing a single protein) for the majority of genes under consideration (≈86%, Fig 7B). We note this is likely a conservative estimate of variable costs. Using NTP/s, which is based on comparing ROC-SEMPPR’s selection coefficients to ribosome elongation rates c, we find the variable costs to be generally greater than the fixed indirect costs (S9 Fig). Consistent with adaptation to reduce the cost of translation, indirect costs (both fixed and variable) generally decrease as a function of total energetic flux (S10 Fig). We suspect our estimates of C_ROC and reflect reasonable bounds on the true energetic cost of ribosome pausing. Combined, our results suggest the costs of nonsense errors are comparable to the cost of ribosomal pausing.

As natural selection to reduce energetic costs is expected to be stronger in high-expression genes, we expect the cost of a gene to decrease as gene expression increases. Consistent with this expectation, we observe the expected cost per codon (i.e., , where n is the number of codons) is negatively correlated with gene expression (Fig 7C). To test for evidence of adaptation to reduce the cost of nonsense errors, we generated a null distribution for the expected translation cost for each gene based on 1000 permuted sequences of synonymous codons. These permutations could be viewed as nulls reflecting the absence of natural selection against NSEs, as permutations do not alter the codon-order invariant fixed translation costs. As codon usage for S. cerevisiae is at selection-mutation-drift equilibrium [6], we are effectively testing against a null that assumes adaptive codon usage via natural selection that varies with gene expression but not with position within a gene (e.g., natural selection against ribosome pausing). We find the true expected cost of a sequence is less than the mean expected cost of the permuted sequence for 59% of genes, which is greater than the 50% expected by random chance (two-sided binomial test, ). The average difference between the true cost and the mean cost of the permuted sequences is approximately 4 NTPs, roughly the same cost as initiating another ribosome or the direct cost of translation elongation.
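The permutation test above can be sketched as follows. This is a toy illustration, not the paper's cost model: it assumes the only variable cost is a hypothetical 4 NTP per codon already elongated when a nonsense error occurs, it ignores indirect costs, and it permutes codons only among positions encoding the same amino acid within a gene. The function names, codon table, and sequence are invented.

```python
import random

def expected_nse_cost(codons, nse_prob, cost_per_codon=4.0):
    """Expected NTP wasted on truncated peptides per initiation (toy cost model)."""
    survive = 1.0  # probability the ribosome has reached the current codon
    cost = 0.0
    for position, codon in enumerate(codons, start=1):
        p = nse_prob[codon]
        cost += survive * p * cost_per_codon * position
        survive *= 1.0 - p
    return cost

def permute_synonyms(codons, amino_acid_of, rng):
    """Shuffle codons among positions encoding the same amino acid."""
    positions = {}
    for i, codon in enumerate(codons):
        positions.setdefault(amino_acid_of[codon], []).append(i)
    shuffled = list(codons)
    for idx in positions.values():
        pool = [codons[i] for i in idx]
        rng.shuffle(pool)
        for i, codon in zip(idx, pool):
            shuffled[i] = codon
    return shuffled

rng = random.Random(0)
nse_prob = {"AAA": 1e-3, "AAG": 1e-4}       # hypothetical Pr(NSE) values
amino_acid_of = {"AAA": "K", "AAG": "K"}
observed_seq = ["AAA"] * 50 + ["AAG"] * 50  # error-prone codon confined to the 5'-end

observed = expected_nse_cost(observed_seq, nse_prob)
null_costs = [
    expected_nse_cost(permute_synonyms(observed_seq, amino_acid_of, rng), nse_prob)
    for _ in range(1000)
]
null_mean = sum(null_costs) / len(null_costs)
better_adapted = observed < null_mean  # 1/0 classification of the gene
print(observed, null_mean, better_adapted)
```

Because permutation preserves codon counts and the amino acid sequence, any difference between the observed cost and the null mean reflects codon order alone.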

We have already seen evidence that highly expressed genes are better adapted to reduce the cost of nonsense errors. Furthermore, as nonsense errors are more likely to occur in longer genes, gene length may be another predictor of adaptation to reduce the cost of nonsense errors. By classifying genes according to whether they were better adapted relative to the permuted sequences (i.e., 1 if the true expected cost is lower than the mean expected cost of the permuted sequences, and 0 otherwise), we find highly expressed and longer genes are more likely to be adapted to reduce the cost of nonsense errors (Fig 7D,7E).

Parameter estimates across S. cerevisiae ribosome profiling datasets are consistent

Ribosome profiling, as with any high-throughput experiment, can be subject to technical and biological variation [55]. There are many protocols for performing ribosome profiling, each with possible biases [27,48,63]. To ensure that our model parameter estimates were generally consistent across measurements, we fit PANSE to data from two additional independent ribosome profiling measurements: Wu et al. [29] and Ferguson et al. [53]. Although these measurements were also performed in S. cerevisiae, they used different protocols.

Comparing the metagene plots for the transcripts considered in each of the datasets, Weinberg et al. and Wu et al. exhibit the 5’-ramp; however, this effect is nearly absent from the Ferguson et al. data, with a slight depletion in ribosomes in the region between the 75th and 100th codon (S11A Fig). We emphasize that none of these reads were included in the actual likelihood estimation. The Ferguson et al. data had the fewest transcripts under consideration (1,918 as compared to 2,785 for Wu et al., and 3,112 for Weinberg et al.). The Ferguson et al. data were also much sparser (S11B Fig), even if only considering the same sequences across all datasets (S11C Fig). Despite the reduced number of reads, we see overall good agreement between the NSE rate b estimates of these two datasets and the Weinberg et al. estimates (Spearman rank correlation and for Wu et al. and Ferguson et al., respectively, S11D, S11E Fig).

Notably, the Ferguson et al. dataset has much noisier estimates (S11E Fig), but this is unsurprising given that it is a much smaller dataset. When assuming a uniform NSE rate b across sense codons, the NSE rate b estimated from the Ferguson et al. data is lower than, but comparable to, the estimate from the same analysis of the Weinberg et al. data: 9.29 × 10⁻⁵ (95% HDI: 8.27 × 10⁻⁵ − 1.03 × 10⁻⁴) vs. 1.41 × 10⁻⁴ (95% HDI: 1.34 × 10⁻⁴ − 1.47 × 10⁻⁴). Although estimates of ribosome waiting times w = 1/c (inverse of the elongation rates) are correlated with a proxy based on the tRNA gene copy number, this effect is weakest for the Ferguson et al. data (S11F Fig). Additionally, the distribution of ribosome waiting times w appears narrower than in the Weinberg et al. and the Wu et al. data (S11F Fig). Overall, our results suggest we are detecting a consistent signal of nonsense errors across these independent datasets, but the reduced number of reads in the Ferguson et al. data significantly weakens this signal.

Discussion

The impact of translation errors on coding-sequence evolution has been a major focus for the last 3 decades, with most of this work focused on the impact of missense errors [10,12,16,64]. Recent advances in mass spectrometry technology and proteome bioinformatics make it possible to detect missense errors on a proteome-scale [13,52]. Due to the robustness of the genetic code, missense errors may not necessarily lead to a non-functional protein. In contrast, nonsense errors are likely to almost always result in a non-functional protein. There is no current high-throughput technology to directly identify the location of a nonsense error at a particular codon in a transcript. However, high-throughput ribosome profiling allows for a steady-state measurement of the “translatome,” including the ability to assign ribosomes to particular codons [44]. We developed a model of ribosome movement along an mRNA during translation to quantify codon-specific nonsense error rates and probabilities from ribosome profiling data.

Applying our PANSE model to an exemplary measurement for S. cerevisiae [27], we find evidence that NSE rates b (sometimes referred to as the “background NSE rate”) vary across codons. Prior work generally assumed the probability of a nonsense error occurring at any given codon was solely due to variation in the elongation rates of the codons, with the NSE rate b assumed uniform [18,30,34]. In contrast, we observed that NSE rates b vary over multiple orders of magnitude (10⁻⁶ to 10⁻³), suggesting other properties of codons contribute to their propensity to trigger a nonsense error aside from differences in elongation rates c.

Importantly, our model is agnostic to the specific mechanisms that cause nonsense errors; however, our parameter NSE rates b reflect three key properties thought to contribute to nonsense errors. One potential cause of variation in NSE rates b among codons is the propensity for them to be mistakenly bound by release factors [32]. Indeed, codons a single nucleotide away from a stop codon (i.e., have a stop codon neighbor) at the 3rd nucleotide position (cysteine codons TGC/T and tryptophan codon TGG) have a higher NSE rate b, on average, compared to codons that do not have a 3rd position stop codon neighbor. The high NSE rates of the tryptophan codon TGG and arginine codon CGA are particularly notable in the context of more direct studies that indicate eRF1 can bind these sense codons [36,37]. Surprisingly, this was not the case for the amino acid tyrosine, for which both of its codons (TAC/T) each have 2 stop codon neighbors (TAA/G) – although the effect of having 2 stop codon neighbors at the 3rd nucleotide is positive compared to a codon with no stop neighbors, it was not statistically significant. Consistent with our results, previous work found eRF1 binds TAC only weakly [36]. Taken together, our results indicate that being a single nucleotide away from a stop codon at the wobble position can increase the chances of a nonsense error, but it is not in and of itself sufficient to increase the NSE rates b.

We find codons with higher missense error probabilities generally have higher NSE rates b. Previous work suggested mismatches between the codon and anticodon in the P-site can increase the chances of peptidyl-tRNA drop-off and ribosomal frameshifting [39,40,43,60]. Consistent with the latter, codons particularly prone to frameshifts also had higher NSE rates b. Recent work in E. coli found that peptidyl-tRNA drop-off driven by missense errors generally happens in the subsequent round of elongation [60]. Similar to the incorrect binding of release factors to sense codons [32], slower elongation in the A-site due to, e.g., low tRNA availability would seem to increase the probability of peptidyl-tRNA drop-off and ribosomal frameshifting [42].

Other mechanisms may also trigger premature translation termination and be absorbed into the model’s estimates of NSE rates b [65–68]. For example, previous work in E. coli found that a codon-anticodon mismatch for the tRNA in the ribosome P-site was detrimental to the accurate decoding of the codon in the A-site, increasing the probability of a release factor binding to a sense codon [65,69] (although we note this was not observed in yeasts using a similar experimental setup [70]). In this case, the probability of a nonsense error at any given codon is not independent, but a function of the missense error rate of its immediate upstream neighbor, among other things. This likely also affects estimates of NSE rates b as they pertain to ribosomal frameshifts and peptidyl-tRNA drop-off. The fact that we find codons with higher NSE rates b tend to be those with higher missense error probabilities and higher ribosome frameshift capacities suggests our model detects these effects, but these estimates may be conservative given that we do not consider the surrounding sequence context.

We observe that simulated ribosome counts across codons and genes generated from PANSE were generally in good agreement with the real data beyond the 5’-ramp region (first 200 codons), indicating PANSE adequately models the underlying processes shaping ribosome profiling data. We emphasize that while the PANSE model does treat elongation rates as a random variable, it does not explicitly account for various potential factors hypothesized to impact local elongation rates, from upstream basic amino acids in the ribosome tunnel to downstream mRNA secondary structure [22,47,71]. We note the correlation between the real and simulated data may be inflated due to the latter being generated from parameters estimated from the former, but also attenuated (i.e., biased toward 0) due to the inherent noise in any sequencing data [72]. Regarding the impact of noise, previous work found generally poor to moderate agreement between the ribosome footprint counts at individual codons across independent ribosome profiling experiments [73]. Similarly, for the 3 ribosome profiling experiments we considered, the Spearman rank correlation between codon counts on a position-by-position basis ranged from 0.45 (Wu vs. Ferguson) to 0.58 (Weinberg vs. Wu). Future extensions of the model will benefit from explicitly accounting for noise in ribosome profiling data by combining information across independent measurements.
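The position-by-position comparison between experiments described above can be illustrated with a minimal sketch of a Spearman rank correlation. This is not the authors' pipeline: the function names and the footprint counts below are hypothetical, and the implementation simply computes the Pearson correlation of tie-averaged ranks using the Python standard library.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        tied_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = tied_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical footprint counts at the same codon positions in two experiments.
counts_a = [12, 0, 5, 5, 30, 2, 9, 1]
counts_b = [8, 1, 7, 3, 22, 0, 4, 6]
rho = spearman(counts_a, counts_b)
```

Because sequencing noise affects each experiment independently, a correlation computed this way between two real datasets is expected to understate the agreement of the underlying ribosome densities.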

Significant efforts have been made to understand the increased ribosome density at the 5’-end of coding regions frequently observed in ribosome profiling data. In combination with the increased frequency of slow codons at the 5’-end, previous work hypothesized slow translation was favored at the 5’-end to prevent downstream ribosome queuing (“the 5’-ramp hypothesis”) [74]. The 5’-ramp hypothesis remains controversial [27,75–80], although recent evidence suggests a very short ramp within the first 5 codons can impact translation efficiency [81]. Our work here is insufficient to directly address the 5’-ramp hypothesis. However, we can say that nonsense errors do not entirely explain the dramatic drop observed in ribosome density at the 5’-end. Given that the expected NSE rates b estimated from this region are unrealistically high, we suspect (but cannot confirm) that the increased ribosome density at the 5’-ends results from an experimental artifact [27] and/or a gradual increase in the rate of ribosome elongation over the first 200 codons.

We find numerous lines of evidence consistent with adaptation to reduce the cost of nonsense errors in intragenic codon usage patterns. Generally, selection against nonsense errors is expected to be weakest at the 5’-end of a CDS, as a nonsense error occurring early in translation will waste fewer NTPs. As a result, the codon usage of 5’-ends is expected to be less adapted than regions further along the CDS. Indeed, codons with higher NSE probabilities Pr(NSE) tend to be enriched in the 5’-ends. Consistent with this enrichment of higher NSE probability Pr(NSE) codons at the 5’-end, we find that the per-codon probability of successful elongation increases along a CDS.
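The relationship between codon-specific rates, per-codon NSE probabilities, and the probability of completing translation can be sketched as follows. The sketch assumes that elongation (rate c) and a nonsense error (rate b) compete as independent exponential processes at each codon, so that Pr(NSE) = b / (b + c); this form and all rate values below are illustrative assumptions, not estimates from the PANSE fit.

```python
from itertools import accumulate
from operator import mul

# Hypothetical codon-specific rates in arbitrary units (NOT estimates from the paper).
# b = NSE rate, c = elongation rate.
RATES = {
    "GCT": {"b": 1e-5, "c": 10.0},
    "GCG": {"b": 1e-4, "c": 2.0},
    "TGG": {"b": 1e-3, "c": 5.0},
    "AAA": {"b": 2e-5, "c": 8.0},
}


def pr_nse(codon):
    """Assumed probability that a ribosome at this codon drops off before elongating."""
    r = RATES[codon]
    return r["b"] / (r["b"] + r["c"])


def elongation_profile(codons):
    """Per-codon probability of successful elongation, 1 - Pr(NSE)."""
    return [1.0 - pr_nse(codon) for codon in codons]


def survival_profile(codons):
    """Cumulative probability that an initiated ribosome elongates past each position."""
    return list(accumulate(elongation_profile(codons), mul))


# Toy CDS with the error-prone codons placed at the 5'-end.
cds = ["GCG", "TGG", "GCT", "AAA", "GCT"]
per_codon = elongation_profile(cds)
survival = survival_profile(cds)
completion_probability = survival[-1]
```

Note the distinction between the two profiles: the per-codon elongation probability can rise along a CDS when error-prone codons are concentrated near the 5’-end, whereas the cumulative survival can only stay flat or decline with position.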

In general, high-expression genes tend to avoid codons more prone to nonsense errors. As a result, high-expression genes generally have a higher probability of completing translation and a lower expected total energetic cost of protein production. Furthermore, selection coefficients estimated via the population genetics model ROC-SEMPPR (generally assumed to reflect differences in ribosome pausing times) [7] are well-correlated with relative differences in NSE probabilities Pr(NSE), consistent with high-expression genes avoiding nonsense error-prone codons. Selection coefficients estimated via ROC-SEMPPR will average over any form of selection pressure that correlates with gene expression [82]. We cannot say to what extent the selection coefficients reflect selection for a reduction in ribosome pausing as opposed to selection against nonsense errors, as both will lead to an increase in translation efficiency.

Surprisingly, moderately expressed genes showed the greatest change along sequences in the probability of a ribosome successfully elongating compared to high and low-expression genes. Assuming selection on codon usage is at least partially due to selection against nonsense errors, this may seem counterintuitive at first glance. If selection against nonsense errors is strongest in high-expression genes, then selection may be shaping codon usage even at the 5’-end. As a result, high-expression genes would be well-adapted along their entire length. In contrast, low-expression genes are expected to be under the weakest selection against nonsense errors, resulting in a lower probability of completion overall that changes very little as a function of position. Moderate-expression genes may have high enough protein production rates such that selection against nonsense errors is expected to be more effective than in low-expression genes, but not so much that the 5’-end is well-adapted. We speculate this explains the greater increase in the probability of elongation along sequences in moderate-expression genes compared to high and low-expression genes.

Although studies on codon usage bias have predominantly focused on selection against ribosome pausing, recent theoretical work concluded the indirect cost of elongation due to ribosome pausing may be less of a factor shaping codon usage bias than other factors, including nonsense errors [71]. Based on a simple model for approximating the energetic costs (in NTPs) for each gene, we find the costs of nonsense errors to be comparable (i.e., within an order of magnitude) to the fixed indirect costs of ribosome pausing across the majority of the transcriptome. The majority (59%) of genes exhibit signals consistent with adaptations to reduce the cost of nonsense errors. However, this is likely a conservative estimate because our permutation test does not represent the expected codon frequencies sans natural selection. More accurately, our test contrasts adaptation to reduce the cost of nonsense errors, in which the position of a codon is relevant, with adaptation to reduce a cost invariant with position, e.g., ribosome pausing. Although a logistic regression revealed the probability of a gene being adapted increased with gene expression, this was not the case for many high-expression genes (Fig 7D). As discussed previously, high-expression genes have a relatively low probability of experiencing a nonsense error even at the 5’-end (Fig 6C), such that permuting synonymous codons likely has a small impact on the energetic cost for a significant portion of high-expression genes. The expected energetic cost savings of adaptation against NSEs is approximately 4 NTPs – roughly the same as the cost of initiating a new ribosome for translation or a single round of elongation. Noting that translation is generally initiation-limited, adaptation against nonsense errors is expected to result in 1 fewer initiation event to produce a functional protein.
These estimates were based on many assumptions and parameter estimates from previous model fits, and do not include the potential cost of certain quality-control mechanisms that may be triggered by nonsense errors (e.g., nonsense-mediated decay). As such, we suspect our estimates of the energetic cost of nonsense errors are conservative.
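A position-aware permutation test of the kind discussed above can be sketched as follows. This is a simplified illustration, not the authors' implementation: the per-codon NSE probabilities, the two-amino-acid codon table, the linear cost of a nonsense error at position i (`A1 + A2 * i` NTPs), and the function names are all assumptions made for this example. Synonymous codons are shuffled among the positions of their amino acid, so the protein sequence and the gene's codon frequencies are preserved and only codon positions change.

```python
import random

# Hypothetical per-codon NSE probabilities and synonymous-codon table (illustration only).
PR_NSE = {"GCT": 1e-6, "GCG": 5e-5, "AAA": 2e-6, "AAG": 4e-5}
AMINO_ACID = {"GCT": "Ala", "GCG": "Ala", "AAA": "Lys", "AAG": "Lys"}
A1, A2 = 4.0, 4.0  # assumed NTP cost of initiation and of each elongation step


def expected_nse_cost(codons):
    """Expected NTPs wasted on nonsense errors per initiation event."""
    reach, cost = 1.0, 0.0
    for i, codon in enumerate(codons, start=1):
        p = PR_NSE[codon]
        cost += reach * p * (A1 + A2 * i)  # drop-off at position i wastes A1 + A2*i NTPs
        reach *= 1.0 - p                   # probability of reaching the next codon
    return cost


def permute_synonymous(codons, rng):
    """Shuffle codons among the positions of each amino acid."""
    positions = {}
    for i, codon in enumerate(codons):
        positions.setdefault(AMINO_ACID[codon], []).append(i)
    shuffled = list(codons)
    for idx in positions.values():
        pool = [codons[i] for i in idx]
        rng.shuffle(pool)
        for i, codon in zip(idx, pool):
            shuffled[i] = codon
    return shuffled


def permutation_p_value(codons, n_perm=1000, seed=0):
    """Fraction of permuted sequences whose expected cost is at most the observed cost."""
    rng = random.Random(seed)
    observed = expected_nse_cost(codons)
    hits = sum(
        expected_nse_cost(permute_synonymous(codons, rng)) <= observed
        for _ in range(n_perm)
    )
    return (hits + 1) / (n_perm + 1)


# Toy gene with error-prone codons placed early, where a nonsense error wastes the fewest NTPs.
gene = ["GCG", "AAG", "GCG", "AAG", "GCT", "AAA", "GCT", "AAA", "GCT", "AAA"]
p_value = permutation_p_value(gene)
```

Because the permutation keeps codon frequencies fixed, a small p-value here indicates adaptation in codon position only; any cost that does not depend on position, such as ribosome pausing, is identical in the observed and permuted sequences.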

We emphasize that our results are consistent with natural selection against nonsense errors, but our parameters do not reflect evolutionary parameters of interest in population genetics studies (e.g., selection coefficients). The most notable pattern of natural selection against nonsense errors is the enrichment of codons with higher NSE probabilities Pr(NSE) in the 5’-end, presumably due to weaker selection shortly after translation initiation. Natural selection against nonsense errors and natural selection for efficient translation are correlated across genes, but the latter is not expected to lead to the enrichment of nonsense error-prone codons in the 5’-end [83].

Other selective pressures are hypothesized to lead to unique patterns of codon usage at the 5’-end of CDSs. A key challenge in understanding selection on synonymous codon usage is the potential for correlated selective effects. First, as previously discussed, is the “5’-ramp hypothesis.” In this case, the enrichment of slow codons, which generally have higher NSE probabilities Pr(NSE), could lead to a similar pattern observed for selection against nonsense errors. However, our previous work using a population genetics model to quantify natural selection shaping codon usage in S. cerevisiae and E. coli showed that the enrichment in slow codons in the 5’-end is more consistent with a reduction in the strength of natural selection for fast codons, rather than selection generally favoring slow codons in this region [82,83]. Second, codon usage at the 5’-end is hypothesized to be shaped by selection against mRNA secondary structure to promote efficient translation initiation [75,84]. However, evidence in S. cerevisiae and other species suggests this effect is primarily relevant to the first 10–15 codons [85], whereas we see a gradual decrease in the probability of nonsense errors over a far broader region in the 5’-end. Third, previous work hypothesized natural selection against ribosomal frameshifting would be stronger at the 5’-end [86]. Ribosomal frameshifts often result in premature translation termination due to encountering an out-of-frame stop codon soon after the frameshift [87–89] and can result from the presence of slow codons in the ribosome A-site in combination with a near- or non-cognate codon-anticodon match in the P-site [42,43,86]. We expect selection against ribosomal frameshifts to be strongly correlated with selection against nonsense errors. Along these lines, codons prone to causing frameshifts when located in the P-site [42] had higher NSE rates b. 
Future work will focus on quantifying the evolution of codon usage bias as it relates to selection against nonsense errors.

Our results are based on extracting biological information from empirical data using a computational model. As is obligatory, “all models are wrong, but some are useful” [90]. Using the parameter estimates from our model, we were able to test many predictions about the coding sequence patterns expected under selection against nonsense errors. Many of these predictions were confirmed, illustrating the model’s utility. Our PANSE model of ribosome movement is applicable to any ribosome profiling measurement. This allows for investigations into the impact of different growth conditions on nonsense error probabilities [21,23]. With the number of species with ribosome profiling measurements growing, it is possible to perform comparative analyses of nonsense error rates and probabilities. Furthermore, using the approaches outlined here, it is possible to quantify the impact of natural selection against nonsense errors on coding-sequence evolution across species and how this may change with the effective population size N_e [91].

Conclusion

By applying a model of translation to an exemplary ribosome profiling dataset in S. cerevisiae, we find multiple lines of evidence that nonsense errors play a significant and largely underappreciated role in protein-coding sequence evolution. Overall, 59% of protein-coding genes exhibit signals of adaptation to reduce the cost of nonsense errors in S. cerevisiae. Natural selection to reduce the cost of ribosome pausing has been the predominant hypothesis to explain codon usage bias, but if the cost of nonsense errors is comparable to, if not greater than, the cost of pausing, then this hypothesis must be revised. Further consideration of the impact and consequences of nonsense errors is critical for understanding the evolution of codon usage bias, which has been observed to varying degrees across all taxa.

Supporting information

S1 Text. Supplemental materials and methods.

https://doi.org/10.1371/journal.pgen.1012162.s001

(PDF)

S1 Fig. Ribogrid from analysis for Weinberg et al. data using the riboviz2 pipeline [49].

This illustrates the number of ribosome footprints assigned to each nucleotide based on the 5’-end of the read. Darker colors indicate more ribosome footprints assigned to a nucleotide. The nucleotide at position 0 indicates the first nucleotide of the start codon.

https://doi.org/10.1371/journal.pgen.1012162.s002

(PDF)

S2 Fig. Impact of A-site assignment rules on parameter estimates.

Comparison of NSE rate estimates from Weinberg et al. data using either the “standard” A-site 15 nt offsets vs. the offsets estimated by riboWaltz. Spearman rank correlation coefficient is reported.

https://doi.org/10.1371/journal.pgen.1012162.s003

(PDF)

S3 Fig. Factors related to filtering genes from the final analyzed dataset.

(A) Distribution of correlations between position within a gene and ribosome density. (B-C) Genes exhibiting a sudden increase in ribosome densities. Dashed lines indicate ATG codons. (D-E) Genes exhibiting a sudden decrease in ribosome densities.

https://doi.org/10.1371/journal.pgen.1012162.s004

(PDF)

S4 Fig. Confirmation of model’s capacity to estimate NSE rates b and elongation rates c.

(A) Comparison of NSE rates b for stop codons. Data from Weinberg et al. was used, slightly modified to allow for up to 15 codons from empirically determined 3’-UTRs [54]. (B) Log fold-changes of mean-centered waiting times between the elp1 deletion strain and the reference wild-type strain obtained from [28]. Top panel indicates the codons known to be impacted by this tRNA modification enzyme deletion, while the bottom indicates the other 58 sense codons. (C) Same as in (B), but with the NSE rates b.

https://doi.org/10.1371/journal.pgen.1012162.s005

(PDF)

S5 Fig. Deviations between real and simulated ribosome counts across all genes and all positions within the dataset.

Deviations are calculated as the log fold-difference.

https://doi.org/10.1371/journal.pgen.1012162.s006

(PDF)

S6 Fig. Impact of 5’-ramp region on parameter estimates.

Comparison of (A) elongation waiting times and (B) total initiation rates when considering only the first 200 codons (i.e., the 5’-ramp region) vs. the remainder of the genes. Spearman rank correlations are reported. Error bars represent 95% posterior probability intervals.

https://doi.org/10.1371/journal.pgen.1012162.s007

(PDF)

S7 Fig. First 100 codons are enriched in codons with higher NSE probabilities Pr(NSE).

Difference in NSE probabilities between codons enriched in the (A) 5’-ends and (B) 3’-ends of coding sequences (first and last 100 codons, respectively). Wilcoxon rank sum test p-values are reported.

https://doi.org/10.1371/journal.pgen.1012162.s008

(PDF)

S8 Fig. Null distributions of slope estimates for regression lines relating codon position to the across-gene average in the probability of a nonsense error per position.

The slope for the real sequences is represented by the dashed line.

https://doi.org/10.1371/journal.pgen.1012162.s009

(PDF)

S9 Fig. Impact of the cost of ribosome pausing C on total cost estimates.

Comparison of proportion of total cost per gene as a function of total energetic flux (cost times the protein production rate) based on (Assembly) and C_ROC.

https://doi.org/10.1371/journal.pgen.1012162.s010

(PDF)

S10 Fig. Breakdown of energetic costs.

Comparison of proportion of total fixed or variable cost per gene as a function of total energetic flux (cost times the protein production rate) based on (Assembly) and C_ROC.

https://doi.org/10.1371/journal.pgen.1012162.s011

(PDF)

S11 Fig. Comparison of datasets used as input for PANSE analysis.

(A) Comparison of metagene ribosome densities from Weinberg et al. [27], Wu et al. [29], and Ferguson et al. [53]. (B) Total number of ribosome footprints included in the final PANSE analysis for each dataset (excluding the 5’-ends), expressed as a percentage relative to the Ferguson et al. data. (C) Same as in (B), but only considering the CDSs included in the Ferguson et al. analysis. (D) Comparison of the NSE rate b estimates (on log₁₀ scale) from the Weinberg et al. and Wu et al. datasets. Error bars represent the 95% HDIs. (E) Same as in (D), but using the Ferguson et al. dataset. (F) Comparison of PANSE estimated ribosome waiting times w_c and the inverse of codon weights estimated via the tRNA adaptation index (tAI).

https://doi.org/10.1371/journal.pgen.1012162.s012

(PDF)

S1 Table. Gene-specific parameters estimated from PANSE and other models.

Contains parameter estimates relevant to gene-specific values from PANSE (initiation rate, probability of successful protein production) and ROC-SEMPPR (protein production rate), theoretical estimates of ribosome drop-off resilience [34], and gene expression estimates from high-throughput sequencing data (RNA-seq, Ribo-seq) obtained from previous studies [27].

https://doi.org/10.1371/journal.pgen.1012162.s013

(TSV)

S2 Table. Codon-specific parameters estimated from PANSE and other models.

Contains parameter estimates relevant to codon-specific values from PANSE (NSE rate b, ribosome waiting times 1/c) and ROC-SEMPPR (selection coefficients), tRNA gene copy numbers and wobble efficiency, and frameshift competency [42].

https://doi.org/10.1371/journal.pgen.1012162.s014

(TSV)

S3 Table. Regression analysis comparing NSE rates to codon properties.

Properties include missense error rates [52], frameshift competency [42], and number of stop codons one mutation away from the sense codon.

https://doi.org/10.1371/journal.pgen.1012162.s015

(TSV)

Acknowledgments

We thank Premal Shah, Matt Pennell, Edward Wallace, Daohan Jiang, Josh Schraiber, and Antonis Rokas for helpful discussions throughout the course of this project and the writing of this manuscript. Grammarly.com (Free subscription plan) was used to assist the authors in proofreading this manuscript. All suggested changes to spelling and grammar were manually validated by the authors. Grammarly.com was not used to generate any new ideas or sentences for this manuscript.

References

1. Wagner A. Energy constraints on the evolution of gene expression. Mol Biol Evol. 2005;22(6):1365–74. pmid:15758206
2. Russell JB, Cook GM. Energetics of bacterial growth: balance of anabolic and catabolic reactions. Microbiol Rev. 1995;59(1):48–62. pmid:7708012
3. Lynch M, Marinov GK. The bioenergetic costs of a gene. Proc Natl Acad Sci U S A. 2015;112(51):15690–5. pmid:26575626
4. Li WH. Models of nearly neutral mutations with particular implications for nonrandom usage of synonymous codons. J Mol Evol. 1987;24(4):337–45. pmid:3110426
5. Bulmer M. The selection-mutation-drift theory of synonymous codon usage. Genetics. 1991;129(3):897–907. pmid:1752426
6. Shah P, Gilchrist MA. Explaining complex codon usage patterns with selection for translational efficiency, mutation bias, and genetic drift. Proc Natl Acad Sci U S A. 2011;108(25):10231–6. pmid:21646514
7. Gilchrist MA, Chen WC, Shah P, Landerer CL, Zaretzki R. Estimating gene expression and codon-specific translational efficiencies, mutation biases, and selection coefficients from genomic data alone. Genome Biol Evol. 2015;7:1559–79.
8. Frumkin I, Lajoie MJ, Gregg CJ, Hornung G, Church GM, Pilpel Y. Codon usage of highly expressed genes affects proteome-wide translation efficiency. Proc Natl Acad Sci U S A. 2018;115(21):E4940–9. pmid:29735666
9. Drummond DA, Wilke CO. The evolutionary consequences of erroneous protein synthesis. Nat Rev Genet. 2009;10(10):715–24. pmid:19763154
10. Akashi H. Synonymous codon usage in Drosophila melanogaster: natural selection and translational accuracy. Genetics. 1994;136(3):927–35. pmid:8005445
11. Drummond DA, Raval A, Wilke CO. A single determinant dominates the rate of yeast protein evolution. Mol Biol Evol. 2006;23(2):327–37. pmid:16237209
12. Drummond DA, Wilke CO. Mistranslation-induced protein misfolding as a dominant constraint on coding-sequence evolution. Cell. 2008;134(2):341–52. pmid:18662548
13. Mordret E, Dahan O, Asraf O, Rak R, Yehonadav A, Barnabas GD, et al. Systematic detection of amino acid substitutions in proteomes reveals mechanistic basis of ribosome errors and selection for translation fidelity. Mol Cell. 2019;75(3):427–441.e5. pmid:31353208
14. Warnecke T, Hurst LD. GroEL dependency affects codon usage--support for a critical role of misfolding in gene evolution. Mol Syst Biol. 2010;6:340. pmid:20087338
15. Warnecke T, Hurst LD. Error prevention and mitigation as forces in the evolution of genes and genomes. Nat Rev Genet. 2011;12(12):875–81. pmid:22094950
16. Kurland CG. Translational accuracy and the fitness of bacteria. Annu Rev Genet. 1992;26:29–50. pmid:1482115
17. Eyre-Walker A. Synonymous codon bias is related to gene length in Escherichia coli: selection for translational accuracy? Mol Biol Evol. 1996;13(6):864–72. pmid:8754221
18. Gilchrist MA. Combining models of protein translation and population genetics to predict protein production rates from codon usage patterns. Mol Biol Evol. 2007;24(11):2362–72. pmid:17703051
19. Tsung K, Inouye S, Inouye M. Factors affecting the efficiency of protein synthesis in Escherichia coli. Production of a polypeptide of more than 6000 amino acid residues. J Biol Chem. 1989;264(8):4428–33. pmid:2538444
20. Jørgensen F, Adamski FM, Tate WP, Kurland CG. Release factor-dependent false stops are infrequent in Escherichia coli. J Mol Biol. 1993;230(1):41–50. pmid:8450549
21. Sin C, Chiarugi D, Valleriani A. Quantitative assessment of ribosome drop-off in E. coli. Nucleic Acids Res. 2016;44(6):2528–37. pmid:26935582
22. Dao Duc K, Song YS. The impact of ribosomal interference, codon usage, and exit tunnel interactions on translation elongation rate variation. PLoS Genet. 2018;14(1):e1007166. pmid:29337993
23. Awad S, Valleriani A, Chiarugi D. A data-driven estimation of the ribosome drop-off rate in S. cerevisiae reveals a correlation with the genes length. NAR Genom Bioinform. 2024;6(2):lqae036. pmid:38638702
24. dos Reis M, Savva R, Wernisch L. Solving the riddle of codon usage preferences: a test for translational selection. Nucleic Acids Res. 2004;32(17):5036–44. pmid:15448185
25. Kramer EB, Vallabhaneni H, Mayer LM, Farabaugh PJ. A comprehensive analysis of translational missense errors in the yeast Saccharomyces cerevisiae. RNA. 2010;16(9):1797–808. pmid:20651030
26. Joshi K, Bhatt MJ, Farabaugh PJ. Codon-specific effects of tRNA anticodon loop modifications on translational misreading errors in the yeast Saccharomyces cerevisiae. Nucleic Acids Res. 2018;46(19):10331–9. pmid:30060218
27. Weinberg DE, Shah P, Eichhorn SW, Hussmann JA, Plotkin JB, Bartel DP. Improved ribosome-footprint and mRNA measurements provide insights into dynamics and regulation of yeast translation. Cell Rep. 2016;14(7):1787–99. pmid:26876183
28. Chou H-J, Donnard E, Gustafsson HT, Garber M, Rando OJ. Transcriptome-wide analysis of roles for tRNA modifications in translational regulation. Mol Cell. 2017;68(5):978–992.e4. pmid:29198561
29. Wu CC-C, Zinshteyn B, Wehner KA, Green R. High-resolution ribosome profiling defines discrete ribosome elongation states and translational regulation during cellular stress. Mol Cell. 2019;73(5):959–970.e5. pmid:30686592
30. Gilchrist MA, Wagner A. A model of protein translation including codon bias, nonsense errors, and ribosome recycling. J Theor Biol. 2006;239(4):417–34. pmid:16171830
31. Shah P, Gilchrist MA. Effect of correlated tRNA abundances on translation errors and evolution of codon usage bias. PLoS Genet. 2010;6(9):e1001128. pmid:20862306
32. Yang Q, Yu C-H, Zhao F, Dang Y, Wu C, Xie P, et al. eRF1 mediates codon usage effects on mRNA translation efficiency through premature termination at rare codons. Nucleic Acids Res. 2019;47(17):9243–58. pmid:31410471
33. Gilchrist MA, Shah P, Zaretzki R. Measuring and detecting molecular adaptation in codon usage against nonsense errors during protein translation. Genetics. 2009;183(4):1493–505. pmid:19822731
34. Bonnin P, Kern N, Young NT, Stansfield I, Romano MC. Novel mRNA-specific effects of ribosome drop-off on translation rate and polysome profile. PLoS Comput Biol. 2017;13(5):e1005555. pmid:28558053
35. Freistroffer DV, Kwiatkowski M, Buckingham RH, Ehrenberg M. The accuracy of codon recognition by polypeptide release factors. Proc Natl Acad Sci U S A. 2000;97(5):2046–51. pmid:10681447
36. Chavatte L, Frolova L, Laugâa P, Kisselev L, Favre A. Stop codons and UGG promote efficient binding of the polypeptide release factor eRF1 to the ribosomal A site. J Mol Biol. 2003;331(4):745–58. pmid:12909007
37. Wada M, Ito K. Misdecoding of rare CGA codon by translation termination factors, eRF1/eRF3, suggests novel class of ribosome rescue pathway in S. cerevisiae. FEBS J. 2019;286.
38. Svidritskiy E, Demo G, Korostelev AA. Mechanism of premature translation termination on a sense codon. J Biol Chem. 2018;293(32):12472–9. pmid:29941456
39. Menninger JR. Peptidyl transfer RNA dissociates during protein synthesis from ribosomes of Escherichia coli. J Biol Chem. 1976;251(11):3392–8. pmid:776968
40. Caplan AB, Menninger JR. Tests of the ribosomal editing hypothesis: amino acid starvation differentially enhances the dissociation of peptidyl-tRNA from the ribosome. J Mol Biol. 1979;134(3):621–37. pmid:395319
41. Farabaugh PJ, Björk GR. How translational accuracy influences reading frame maintenance. EMBO J. 1999;18(6):1427–34. pmid:10075915
42. Vimaladithan A, Farabaugh PJ. Special peptidyl-tRNA molecules can promote translational frameshifting without slippage. Mol Cell Biol. 1994;14(12):8107–16. pmid:7969148
43. Sundararajan A, Michaud WA, Qian Q, Stahl G, Farabaugh PJ. Near-cognate peptidyl-tRNAs promote +1 programmed translational frameshifting in yeast. Mol Cell. 1999;4(6):1005–15. pmid:10635325
44. Ingolia NT, Ghaemmaghami S, Newman JRS, Weissman JS. Genome-wide analysis in vivo of translation with nucleotide resolution using ribosome profiling. Science. 2009;324(5924):218–23. pmid:19213877
45. Aviner R, Geiger T, Elroy-Stein O. Genome-wide identification and quantification of protein synthesis in cultured cells and whole tissues by puromycin-associated nascent chain proteomics (PUNCH-P). Nat Protoc. 2014;9(4):751–60. pmid:24603934
46. Brar GA, Weissman JS. Ribosome profiling reveals the what, when, where and how of protein synthesis. Nat Rev Mol Cell Biol. 2015;16(11):651–64. pmid:26465719
47. Tunney R, McGlincy NJ, Graham ME, Naddaf N, Pachter L, Lareau LF. Accurate design of translational output by a neural network model of ribosome distribution. Nat Struct Mol Biol. 2018;25(7):577–82. pmid:29967537
48. Lareau LF, Hite DH, Hogan GJ, Brown PO. Distinct stages of the translation elongation cycle revealed by sequencing ribosome-protected mRNA fragments. eLife. 2014.
49. Cope AL, Anderson F, Favate J, Jackson M, Mok A, Kurowska A, et al. riboviz 2: a flexible and robust ribosome profiling data analysis and visualization workflow. Bioinformatics. 2022;38(8):2358–60. pmid:35157051
50. Spiegelhalter DJ, Best NG, Carlin BP, Van Der Linde A. Bayesian measures of model complexity and fit. J R Stat Soc Series B: Stat Methodol. 2002;64(4):583–639.
51. Cope AL, Shah P. Intragenomic variation in non-adaptive nucleotide biases causes underestimation of selection on synonymous codon usage. PLoS Genet. 2022;18(6):e1010256. pmid:35714134
52. Landerer C, Poehls J, Toth-Petroczy A. Fitness effects of phenotypic mutations at proteome-scale reveal optimality of translation machinery. Mol Biol Evol. 2024;41(3):msae048. pmid:38421032
53. Ferguson L, Upton HE, Pimentel SC, Mok A, Lareau LF, Collins K, et al. Streamlined and sensitive mono- and di-ribosome profiling in yeast and human cells. Nat Methods. 2023;20(11):1704–15. pmid:37783882
54. Mangkalaphiban K, He F, Ganesan R, Wu C, Baker R, Jacobson A. Transcriptome-wide investigation of stop codon readthrough in Saccharomyces cerevisiae. PLoS Genet. 2021;17(4):e1009538. pmid:33878104
55. Wallace EWJ, Airoldi EM, Drummond DA. Estimating selection on synonymous codon usage from noisy experimental data. Mol Biol Evol. 2013;30(6):1438–53. pmid:23493257
56. Beznosková P, Pavlíková Z, Zeman J, Echeverría Aitken C, Valášek LS. Yeast applied readthrough inducing system (YARIS): an in vivo assay for the comprehensive study of translational readthrough. Nucleic Acids Res. 2019;47(12):6339–50. pmid:31069379
57. Nedialkova DD, Leidel SA. Optimization of codon translation rates via tRNA modifications maintains proteome integrity. Cell. 2015;161(7):1606–18. pmid:26052047
58. Burnham KP, Anderson DR. Multimodel inference: understanding AIC and BIC in model selection. Sociol Methods Res. 2004;33:261–304.
59. Jørgensen F, Kurland CG. Processivity errors of gene expression in Escherichia coli. J Mol Biol. 1990;215(4):511–21. pmid:2121997
60. Nagao A, Nakanishi Y, Yamaguchi Y, Mishina Y, Karoji M, Toya T, et al. Quality control of protein synthesis in the early elongation stage. Nat Commun. 2023;14(1):2704. pmid:37198183
61. Ogle JM, Brodersen DE, Clemons WM Jr, Tarry MJ, Carter AP, Ramakrishnan V. Recognition of cognate transfer RNA by the 30S ribosomal subunit. Science. 2001;292(5518):897–902. pmid:11340196
62. Qin H, Wu WB, Comeron JM, Kreitman M, Li W-H. Intragenic spatial patterns of codon usage bias in prokaryotic and eukaryotic genomes. Genetics. 2004;168(4):2245–60. pmid:15611189
63. Hussmann JA, Patchett S, Johnson A, Sawyer S, Press WH. Understanding biases in ribosome profiling experiments reveals signatures of translation dynamics in yeast. PLoS Genet. 2015;11(12):e1005732. pmid:26656907
64. Yang J-R, Liao B-Y, Zhuang S-M, Zhang J. Protein misinteraction avoidance causes highly expressed proteins to evolve slowly. Proc Natl Acad Sci U S A. 2012;109(14):E831–40. pmid:22416125
65. Zaher HS, Green R. A primary role for release factor 3 in quality control during translation elongation in Escherichia coli. Cell. 2011;147(2):396–408. pmid:22000017
66. Chiabudini M, Tais A, Zhang Y, Hayashi S, Wölfle T, Fitzke E, et al. Release factor eRF3 mediates premature translation termination on polylysine-stalled ribosomes in Saccharomyces cerevisiae. Mol Cell Biol. 2014;34(21):4062–76. pmid:25154418
67. Presnyak V, Alhusaini N, Chen Y-H, Martin S, Morris N, Kline N, et al. Codon optimality is a major determinant of mRNA stability. Cell. 2015;160(6):1111–24. pmid:25768907
68. Chadani Y, Niwa T, Izumi T, Sugata N, Nagao A, Suzuki T, et al. Intrinsic ribosome destabilization underlies translation and provides an organism with a strategy of environmental sensing. Mol Cell. 2017;68(3):528–539.e5. pmid:29100053
69. Nguyen HA, Hoffer ED, Fagan CE, Maehigashi T, Dunham CM. Structural basis for reduced ribosomal A-site fidelity in response to P-site codon-anticodon mismatches. J Biol Chem. 2023;299(4):104608. pmid:36924943
70. Eyler DE, Green R. Distinct response of yeast ribosomes to a miscoding event during translation. RNA. 2011;17(5):925–32. pmid:21415142
71. Erdmann-Pham DD, Dao Duc K, Song YS. The key parameters that govern translation efficiency. Cell Syst. 2020;10(2):183–192.e6. pmid:31954660
72. Sokal RR, Rohlf FJ. Biometry - the principles and practices of statistics in biological research. 3rd ed. W.H. Freeman; 1995.
73. Wright G, Rodriguez A, Li J, Clark PL, Milenković T, Emrich SJ. Analysis of computational codon usage models and their association with translationally slow codons. PLoS One. 2020;15(4):e0232003. pmid:32352987
74. Tuller T, Carmi A, Vestsigian K, Navon S, Dorfan Y, Zaborske J, et al. An evolutionarily conserved mechanism for controlling the efficiency of protein translation. Cell. 2010;141(2):344–54. pmid:20403328
75. Goodman DB, Church GM, Kosuri S. Causes and effects of N-terminal codon bias in bacterial genes. Science. 2013;342(6157):475–9. pmid:24072823
76. Bentele K, Saffert P, Rauscher R, Ignatova Z, Blüthgen N. Efficient translation initiation dictates codon usage at gene start. Mol Syst Biol. 2013;9:675. pmid:23774758
77. Osterman IA, Chervontseva ZS, Evfratov SA, Sorokina AV, Rodin VA, Rubtsova MP, et al. Translation at first sight: the influence of leading codons. Nucleic Acids Res. 2020;48(12):6931–42. pmid:32427319
78. Zhao T, Chen Y-M, Li Y, Wang J, Chen S, Gao N, et al. Disome-seq reveals widespread ribosome collisions that promote cotranslational protein folding. Genome Biol. 2021;22(1):16. pmid:33402206
79. Sejour R, Leatherwood J, Yurovsky A, Futcher B. Enrichment of rare codons at 5’ ends of genes is a spandrel caused by evolutionary sequence turnover and does not improve translation. eLife. 2024;12:RP89656. pmid:39008347
80. Zhang J, Qian W. Functional synonymous mutations and their evolutionary consequences. Nat Rev Genet. 2025;26(11):789–804. pmid:40394196
81. Verma M, Choi J, Cottrell KA, Lavagnino Z, Thomas EN, Pavlovic-Djuranovic S, et al. A short translational ramp determines the efficiency of protein synthesis. Nat Commun. 2019;10(1):5774. pmid:31852903
82. Cope AL, Gilchrist MA. Quantifying shifts in natural selection on codon usage between protein regions: a population genetics approach. BMC Genomics. 2022;23(1):408. pmid:35637464
83. Cope AL, Hettich RL, Gilchrist MA. Quantifying codon usage in signal peptides: Gene expression and amino acid usage explain apparent selection for inefficient codons. Biochim Biophys Acta Biomembr. 2018;1860(12):2479–85. pmid:30279149
84. Kudla G, Murray AW, Tollervey D, Plotkin JB. Coding-sequence determinants of gene expression in Escherichia coli. Science. 2009;324(5924):255–8. pmid:19359587
85. Gu W, Zhou T, Wilke CO. A universal trend of reduced mRNA stability near the translation-initiation site in prokaryotes and eukaryotes. PLoS Comput Biol. 2010;6(2):e1000664. pmid:20140241
86. Huang Y, Koonin EV, Lipman DJ, Przytycka TM. Selection for minimization of translational frameshifting errors as a factor in the evolution of codon usage. Nucleic Acids Res. 2009;37(20):6799–810. pmid:19745054
87. Clarke CH, Miller PG. Consequences of frameshift mutations in the trp A, trp B and lac I genes of Escherichia coli and in Salmonella typhimurium. J Theor Biol. 1982;96(3):367–79. pmid:6181349
88. Seligmann H, Pollock DD. The ambush hypothesis: hidden stop codons prevent off-frame gene reading. DNA Cell Biol. 2004;23(10):701–5. pmid:15585128
89. Itzkovitz S, Alon U. The genetic code is nearly optimal for allowing additional information within protein-coding sequences. Genome Res. 2007;17(4):405–12. pmid:17293451
90. Box GEP. Robustness in the strategy of scientific model building. In: Launer RL, Wilkinson GN, editors. Robustness in statistics. Academic Press; 1979. pp. 201–36.
91. Charlesworth B. Effective population size and patterns of molecular evolution and variation. Nat Rev Genet. 2009;10(3):195–205.

[ref45] 45. Aviner R, Geiger T, Elroy-Stein O. Genome-wide identification and quantification of protein synthesis in cultured cells and whole tissues by puromycin-associated nascent chain proteomics (PUNCH-P). Nat Protoc. 2014;9(4):751–60. pmid:24603934
View Article
PubMed/NCBI
Google Scholar

[176] View Article

[177] PubMed/NCBI

[178] Google Scholar

[ref46] 46. Brar GA, Weissman JS. Ribosome profiling reveals the what, when, where and how of protein synthesis. Nat Rev Mol Cell Biol. 2015;16(11):651–64. pmid:26465719
View Article
PubMed/NCBI
Google Scholar

[180] View Article

[181] PubMed/NCBI

[182] Google Scholar

[ref47] 47. Tunney R, McGlincy NJ, Graham ME, Naddaf N, Pachter L, Lareau LF. Accurate design of translational output by a neural network model of ribosome distribution. Nat Struct Mol Biol. 2018;25(7):577–82. pmid:29967537
View Article
PubMed/NCBI
Google Scholar

[184] View Article

[185] PubMed/NCBI

[186] Google Scholar

[ref48] 48. Lareau LF, Hite DH, Hogan GJ, Brown PO. Distinct stages of the translation elongation cycle revealed by sequencing ribosome-protected mRNA fragments. eLife. 2014.
View Article
Google Scholar

[188] View Article

[189] Google Scholar

[ref49] 49. Cope AL, Anderson F, Favate J, Jackson M, Mok A, Kurowska A, et al. riboviz 2: a flexible and robust ribosome profiling data analysis and visualization workflow. Bioinformatics. 2022;38(8):2358–60. pmid:35157051
View Article
PubMed/NCBI
Google Scholar

[191] View Article

[192] PubMed/NCBI

[193] Google Scholar

[ref50] 50. Spiegelhalter DJ, Best NG, Carlin BP, Van Der Linde A. Bayesian measures of model complexity and fit. J R Stat Soc Series B: Stati Methodol. 2002;64(4):583–639.
View Article
Google Scholar

[195] View Article

[196] Google Scholar

[ref51] 51. Cope AL, Shah P. Intragenomic variation in non-adaptive nucleotide biases causes underestimation of selection on synonymous codon usage. PLoS Genet. 2022;18(6):e1010256. pmid:35714134
View Article
PubMed/NCBI
Google Scholar

[198] View Article

[199] PubMed/NCBI

[200] Google Scholar

[ref52] 52. Landerer C, Poehls J, Toth-Petroczy A. Fitness effects of phenotypic mutations at proteome-scale reveal optimality of translation machinery. Mol Biol Evol. 2024;41(3):msae048. pmid:38421032
View Article
PubMed/NCBI
Google Scholar

[202] View Article

[203] PubMed/NCBI

[204] Google Scholar

[ref53] 53. Ferguson L, Upton HE, Pimentel SC, Mok A, Lareau LF, Collins K, et al. Streamlined and sensitive mono- and di-ribosome profiling in yeast and human cells. Nat Methods. 2023;20(11):1704–15. pmid:37783882
View Article
PubMed/NCBI
Google Scholar

[206] View Article

[207] PubMed/NCBI

[208] Google Scholar

[ref54] 54. Mangkalaphiban K, He F, Ganesan R, Wu C, Baker R, Jacobson A. Transcriptome-wide investigation of stop codon readthrough in Saccharomyces cerevisiae. PLoS Genet. 2021;17(4):e1009538. pmid:33878104
View Article
PubMed/NCBI
Google Scholar

[210] View Article

[211] PubMed/NCBI

[212] Google Scholar

[ref55] 55. Wallace EWJ, Airoldi EM, Drummond DA. Estimating selection on synonymous codon usage from noisy experimental data. Mol Biol Evol. 2013;30(6):1438–53. pmid:23493257
View Article
PubMed/NCBI
Google Scholar

[214] View Article

[215] PubMed/NCBI

[216] Google Scholar

[ref56] 56. Beznosková P, Pavlíková Z, Zeman J, Echeverría Aitken C, Valášek LS. Yeast applied readthrough inducing system (YARIS): an invivo assay for the comprehensive study of translational readthrough. Nucleic Acids Res. 2019;47(12):6339–50. pmid:31069379
View Article
PubMed/NCBI
Google Scholar

[218] View Article

[219] PubMed/NCBI

[220] Google Scholar

[ref57] 57. Nedialkova DD, Leidel SA. Optimization of codon translation rates via tRNA modifications maintains proteome integrity. Cell. 2015;161(7):1606–18. pmid:26052047
View Article
PubMed/NCBI
Google Scholar

[222] View Article

[223] PubMed/NCBI

[224] Google Scholar

[ref58] 58. Burnham KP, Anderson DR. Multimodel inference: understanding AIC and BIC in model selection. Sociol Methods Res. 2004;33:261–304.
View Article
Google Scholar

[226] View Article

[227] Google Scholar

[ref59] 59. Jørgensen F, Kurland CG. Processivity errors of gene expression in Escherichia coli. J Mol Biol. 1990;215(4):511–21. pmid:2121997
View Article
PubMed/NCBI
Google Scholar

[229] View Article

[230] PubMed/NCBI

[231] Google Scholar

[ref60] 60. Nagao A, Nakanishi Y, Yamaguchi Y, Mishina Y, Karoji M, Toya T, et al. Quality control of protein synthesis in the early elongation stage. Nat Commun. 2023;14(1):2704. pmid:37198183
View Article
PubMed/NCBI
Google Scholar

[233] View Article

[234] PubMed/NCBI

[235] Google Scholar

[ref61] 61. Ogle JM, Brodersen DE, Clemons WM Jr, Tarry MJ, Carter AP, Ramakrishnan V. Recognition of cognate transfer RNA by the 30S ribosomal subunit. Science. 2001;292(5518):897–902. pmid:11340196
View Article
PubMed/NCBI
Google Scholar

[237] View Article

[238] PubMed/NCBI

[239] Google Scholar

[ref62] 62. Qin H, Wu WB, Comeron JM, Kreitman M, Li W-H. Intragenic spatial patterns of codon usage bias in prokaryotic and eukaryotic genomes. Genetics. 2004;168(4):2245–60. pmid:15611189
View Article
PubMed/NCBI
Google Scholar

[241] View Article

[242] PubMed/NCBI

[243] Google Scholar

[ref63] 63. Hussmann JA, Patchett S, Johnson A, Sawyer S, Press WH. Understanding biases in ribosome profiling experiments reveals signatures of translation dynamics in yeast. PLoS Genet. 2015;11(12):e1005732. pmid:26656907
View Article
PubMed/NCBI
Google Scholar

[245] View Article

[246] PubMed/NCBI

[247] Google Scholar

[ref64] 64. Yang J-R, Liao B-Y, Zhuang S-M, Zhang J. Protein misinteraction avoidance causes highly expressed proteins to evolve slowly. Proc Natl Acad Sci U S A. 2012;109(14):E831-40. pmid:22416125
View Article
PubMed/NCBI
Google Scholar

[249] View Article

[250] PubMed/NCBI

[251] Google Scholar

[ref65] 65. Zaher HS, Green R. A primary role for release factor 3 in quality control during translation elongation in Escherichia coli. Cell. 2011;147(2):396–408. pmid:22000017
View Article
PubMed/NCBI
Google Scholar

[253] View Article

[254] PubMed/NCBI

[255] Google Scholar

[ref66] 66. Chiabudini M, Tais A, Zhang Y, Hayashi S, Wölfle T, Fitzke E, et al. Release factor eRF3 mediates premature translation termination on polylysine-stalled ribosomes in Saccharomyces cerevisiae. Mol Cell Biol. 2014;34(21):4062–76. pmid:25154418
View Article
PubMed/NCBI
Google Scholar

[257] View Article

[258] PubMed/NCBI

[259] Google Scholar

[ref67] 67. Presnyak V, Alhusaini N, Chen Y-H, Martin S, Morris N, Kline N, et al. Codon optimality is a major determinant of mRNA stability. Cell. 2015;160(6):1111–24. pmid:25768907
View Article
PubMed/NCBI
Google Scholar

[261] View Article

[262] PubMed/NCBI

[263] Google Scholar

[ref68] 68. Chadani Y, Niwa T, Izumi T, Sugata N, Nagao A, Suzuki T, et al. Intrinsic ribosome destabilization underlies translation and provides an organism with a strategy of environmental sensing. Mol Cell. 2017;68(3):528-539.e5. pmid:29100053
View Article
PubMed/NCBI
Google Scholar

[265] View Article

[266] PubMed/NCBI

[267] Google Scholar

[ref69] 69. Nguyen HA, Hoffer ED, Fagan CE, Maehigashi T, Dunham CM. Structural basis for reduced ribosomal A-site fidelity in response to P-site codon-anticodon mismatches. J Biol Chem. 2023;299(4):104608. pmid:36924943
View Article
PubMed/NCBI
Google Scholar

[269] View Article

[270] PubMed/NCBI

[271] Google Scholar

[ref70] 70. Eyler DE, Green R. Distinct response of yeast ribosomes to a miscoding event during translation. RNA. 2011;17(5):925–32. pmid:21415142
View Article
PubMed/NCBI
Google Scholar

[273] View Article

[274] PubMed/NCBI

[275] Google Scholar

[ref71] 71. Erdmann-Pham DD, Dao Duc K, Song YS. The key parameters that govern translation efficiency. Cell Syst. 2020;10(2):183–192.e6. pmid:31954660
View Article
PubMed/NCBI
Google Scholar

[277] View Article

[278] PubMed/NCBI

[279] Google Scholar

[ref72] 72. Sokal RR, Rohlf FJ. Biometry - the principles and practices of statistics in biological research. 3rd ed. W.H. Freeman; 1995.

[ref73] 73. Wright G, Rodriguez A, Li J, Clark PL, Milenković T, Emrich SJ. Analysis of computational codon usage models and their association with translationally slow codons. PLoS One. 2020;15(4):e0232003. pmid:32352987
View Article
PubMed/NCBI
Google Scholar

[282] View Article

[283] PubMed/NCBI

[284] Google Scholar

[ref74] 74. Tuller T, Carmi A, Vestsigian K, Navon S, Dorfan Y, Zaborske J, et al. An evolutionarily conserved mechanism for controlling the efficiency of protein translation. Cell. 2010;141(2):344–54. pmid:20403328
View Article
PubMed/NCBI
Google Scholar

[286] View Article

[287] PubMed/NCBI

[288] Google Scholar

[ref75] 75. Goodman DB, Church GM, Kosuri S. Causes and effects of N-terminal codon bias in bacterial genes. Science. 2013;342(6157):475–9. pmid:24072823
View Article
PubMed/NCBI
Google Scholar

[290] View Article

[291] PubMed/NCBI

[292] Google Scholar

[ref76] 76. Bentele K, Saffert P, Rauscher R, Ignatova Z, Blüthgen N. Efficient translation initiation dictates codon usage at gene start. Mol Syst Biol. 2013;9:675. pmid:23774758
View Article
PubMed/NCBI
Google Scholar

[294] View Article

[295] PubMed/NCBI

[296] Google Scholar

[ref77] 77. Osterman IA, Chervontseva ZS, Evfratov SA, Sorokina AV, Rodin VA, Rubtsova MP, et al. Translation at first sight: the influence of leading codons. Nucleic Acids Res. 2020;48(12):6931–42. pmid:32427319
View Article
PubMed/NCBI
Google Scholar

[298] View Article

[299] PubMed/NCBI

[300] Google Scholar

[ref78] 78. Zhao T, Chen Y-M, Li Y, Wang J, Chen S, Gao N, et al. Disome-seq reveals widespread ribosome collisions that promote cotranslational protein folding. Genome Biol. 2021;22(1):16. pmid:33402206
View Article
PubMed/NCBI
Google Scholar

[302] View Article

[303] PubMed/NCBI

[304] Google Scholar

[ref79] 79. Sejour R, Leatherwood J, Yurovsky A, Futcher B. Enrichment of rare codons at 5’ ends of genes is a spandrel caused by evolutionary sequence turnover and does not improve translation. Elife. 2024;12:RP89656. pmid:39008347
View Article
PubMed/NCBI
Google Scholar

[306] View Article

[307] PubMed/NCBI

[308] Google Scholar

[ref80] 80. Zhang J, Qian W. Functional synonymous mutations and their evolutionary consequences. Nat Rev Genet. 2025;26(11):789–804. pmid:40394196
View Article
PubMed/NCBI
Google Scholar

[310] View Article

[311] PubMed/NCBI

[312] Google Scholar

[ref81] 81. Verma M, Choi J, Cottrell KA, Lavagnino Z, Thomas EN, Pavlovic-Djuranovic S, et al. A short translational ramp determines the efficiency of protein synthesis. Nat Commun. 2019;10(1):5774. pmid:31852903

[ref82] 82. Cope AL, Gilchrist MA. Quantifying shifts in natural selection on codon usage between protein regions: a population genetics approach. BMC Genomics. 2022;23(1):408. pmid:35637464

[ref83] 83. Cope AL, Hettich RL, Gilchrist MA. Quantifying codon usage in signal peptides: Gene expression and amino acid usage explain apparent selection for inefficient codons. Biochim Biophys Acta Biomembr. 2018;1860(12):2479–85. pmid:30279149

[ref84] 84. Kudla G, Murray AW, Tollervey D, Plotkin JB. Coding-sequence determinants of gene expression in Escherichia coli. Science. 2009;324(5924):255–8. pmid:19359587

[ref85] 85. Gu W, Zhou T, Wilke CO. A universal trend of reduced mRNA stability near the translation-initiation site in prokaryotes and eukaryotes. PLoS Comput Biol. 2010;6(2):e1000664. pmid:20140241

[ref86] 86. Huang Y, Koonin EV, Lipman DJ, Przytycka TM. Selection for minimization of translational frameshifting errors as a factor in the evolution of codon usage. Nucleic Acids Res. 2009;37(20):6799–810. pmid:19745054

[ref87] 87. Clarke CH, Miller PG. Consequences of frameshift mutations in the trp A, trp B and lac I genes of Escherichia coli and in Salmonella typhimurium. J Theor Biol. 1982;96(3):367–79. pmid:6181349

[ref88] 88. Seligmann H, Pollock DD. The ambush hypothesis: hidden stop codons prevent off-frame gene reading. DNA Cell Biol. 2004;23(10):701–5. pmid:15585128

[ref89] 89. Itzkovitz S, Alon U. The genetic code is nearly optimal for allowing additional information within protein-coding sequences. Genome Res. 2007;17(4):405–12. pmid:17293451

[ref90] 90. Box GEP. Robustness in the strategy of scientific model building. In: Launer RL, Wilkinson GN, editors. Robustness in Statistics. Academic Press; 1979. pp. 201–36.

[ref91] 91. Charlesworth B. Effective population size and patterns of molecular evolution and variation. Nat Rev Genet. 2009;10(3):195–205.

This is an uncorrected proof.

Supporting information

S1 Text. Supplemental materials and methods.

S1 Fig. Ribogrid from analysis for Weinberg et al. data using the riboviz2 pipeline [49].

S2 Fig. Impact of A-site assignment rules on parameter estimates.

S3 Fig. Factors related to filtering genes from the final analyzed dataset.

S4 Fig. Confirmation of model’s capacity to estimate NSE rates b and elongation rates c.

S5 Fig. Deviations between real and simulated ribosome counts across all genes and all positions within the dataset.

S6 Fig. Impact of 5’-ramp region on parameter estimates.

S7 Fig. First 100 codons are enriched in codons with higher NSE probabilities Pr(NSE).

S8 Fig. Null distributions of slope estimates for regression lines relating codon position to the across-gene average in the probability of a nonsense error per position.

S9 Fig. Impact of cost on ribosome pausing C on total cost estimates.

S10 Fig. Breakdown of energetic costs.

S11 Fig. Comparison of datasets used as input for PANSE analysis.

S1 Table. Gene-specific parameters estimated from PANSE and other models.

S2 Table. Codon-specific parameters estimated from PANSE and other models.

S3 Table. Regression analysis comparing NSE rates to codon properties.
