Novel mRNA-specific effects of ribosome drop-off on translation rate and polysome profile

The well-established phenomenon of ribosome drop-off plays crucial roles in translational accuracy and nutrient starvation responses during protein translation. When cells are under stress conditions, such as amino acid starvation or aminoacyl-tRNA depletion due to a high level of recombinant protein expression, ribosome drop-off can substantially affect the efficiency of protein expression. Here we introduce a mathematical model that describes the effects of ribosome drop-off on the ribosome density along the mRNA and on the concomitant protein synthesis rate. Our results show that premature ribosome termination may lead to non-intuitive ribosome density profiles, such as a ribosome density which increases from the 5’ to the 3’ end. Importantly, the model predicts that the effects of ribosome drop-off on the translation rate are mRNA-specific, and we quantify the resilience of individual mRNAs to drop-off, showing that mRNAs which present ribosome queues are much less affected by ribosome drop-off than those which do not. Moreover, among those mRNAs that do not present ribosome queues, resilience to drop-off correlates positively with the elongation rate, so that sequences using fast codons are expected to be less affected by ribosome drop-off. This result is consistent with a genome-wide analysis of S. cerevisiae, which reveals that under favourable growth conditions mRNAs coding for proteins involved in the translation machinery, known to be highly codon biased and to use preferentially fast codons, are highly resilient to ribosome drop-off. Moreover, in physiological conditions, the translation rate of mRNAs coding for regulatory, stress-related proteins is less resilient to ribosome drop-off. This model therefore allows analysis of variations in the translational efficiency of individual mRNAs by accounting for the full range of known ribosome behaviours, as well as explaining mRNA-specific variations in ribosome density emerging from ribosome profiling studies.


Discretisation of the mRNA Lattice
We now address how the choice of discretisation for the lattice affects the simulation results, and how these compare with the analytical results derived in the main body of the manuscript. To this end, we calculate the translation rate (the current at the right boundary, J(L)) of an mRNA with homogeneous elongation rate and ribosome footprint 1 for each of the LD, HD and MC phases, as a function of the discretisation ε = L/N. To compare lattices with different discretisations, we must keep the rescaled drop-off rate Γ = γ/ε constant, as explained in the section 'Density and Current Profiles' of the main manuscript. The analytical result for J(L) is then independent of ε (or, equivalently, of N), since the expressions for the density and current profiles depend on Γ only, rather than on γ and ε separately. The results of simulations of the discrete system, however, can depend on the chosen ε. The numerical results coincide very well with the analytical expressions in the LD and HD phases for all values of N. The values of J_MC(N), in contrast, deviate slightly from the analytical result for low values of N (coarse-grained discretisation) and converge to the analytical value as N increases, i.e., for refined discretisations. This result therefore validates the continuous approximation adopted in the paper, showing that even for quite a coarse-grained discretisation of the mRNA lattice, the results obtained for the discrete system agree well with the results derived for the continuous system.
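The rescaling used in this comparison can be made concrete with a short sketch. It relies only on the two relations stated above, ε = L/N and Γ = γ/ε; the function name and the numerical values are illustrative assumptions and are not taken from the manuscript.

```python
def per_site_dropoff_rate(Gamma, L, N):
    """Per-site drop-off rate gamma that keeps the rescaled rate Gamma fixed."""
    if L <= 0 or N <= 0:
        raise ValueError("L and N must be positive")
    epsilon = L / N          # lattice spacing, epsilon = L / N
    return Gamma * epsilon   # inverted from Gamma = gamma / epsilon


# Illustrative values only: refine the lattice while holding Gamma fixed.
Gamma, L = 0.05, 1.0
for N in (50, 100, 500):
    gamma = per_site_dropoff_rate(Gamma, L, N)
    print(f"N={N:4d}  epsilon={L / N:.4f}  gamma={gamma:.2e}")
```

Refining the lattice (larger N) therefore lowers the per-site rate γ in proportion, so that the total drop-off propensity along the mRNA is unchanged.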

Ribosome Drop-Off Effects Depending on mRNA Length
One clearly expects the translation of longer mRNAs to be more strongly affected by ribosome drop-off than that of shorter sequences. To analyse how the impact of ribosome drop-off scales with mRNA length, we compute the probability P(L) for a ribosome to finish translation as a function of the sequence length L. This probability is the ratio of the ribosome current at position L of the lattice to the ribosome current at position 0, i.e.,
\[ P(L) = \frac{J(L)}{J(0)}. \tag{1} \]
It can be determined using
\[ J(x) = k\,\rho(x)\,[1 - \rho(x)] \tag{2} \]
within the framework of the continuous approximation. We thus expect this probability P(L) to depend on the phase sustained by the mRNA, given that the expressions for J(x) differ between these phases. Indeed, by using the expressions for the ribosome density as a function of x for each of the phases, ρ_LD(x), ρ_HD(x) and ρ_MC(x) from the main manuscript, and substituting them into Eq (2), we compute the probability P(L) for an initiating ribosome to finish translation in the different phases (Fig 2). The solid lines indicate the analytical results, and the dots show the results from numerical simulations (LD (blue), HD (red), MC (black)). The correspondence between the continuous and discrete system is made by keeping both Γ and γ constant and increasing N proportionally to L, according to N = (Γ/γ) L. As expected, the probability to finish translation decays monotonically with the length L of the sequence, whatever the transport phase. Interestingly, the HD case shows the strongest decay. This is due to the higher dwell time of ribosomes on the mRNA caused by the queue: ribosomes are more likely to undergo drop-off, and a smaller fraction of the initiated ribosomes make it to the stop codon, compared to the LD and MC cases. Notice that the analytical curve for P(L) in the HD case ends at a critical value L_c of approximately 1.3 for the parameters chosen.
Beyond this length L_c the HD phase can no longer be sustained on the mRNA at the given values of Γ, k, α and β. This is in good agreement with the numerical results, which show a plateau in P(L) versus L starting at L_c. Ultimately the curve merges with the one corresponding to the MC phase. This indicates that, as we keep increasing L, the lattice undergoes a (discontinuous) transition from HD to MC, which passes through a coexistence region (the shock phase, SP), the latter corresponding to the plateau. The effect of increasing L is thus equivalent to the effect of increasing the drop-off rate Γ, and it can therefore cause phase transitions. Notice also that the plateau in P(L) in the SP is a direct consequence of the fact that ρ(0) = α/k and ρ(L) = 1 − β/k in the SP, independently of L. It is important to note that this is not in contradiction with the results obtained in the section 'Ribosome Drop-Off Effects Depend Strongly on Codon Configuration' of the main manuscript. There, HD sequences have been shown to possess maximal resilience to drop-off when compared to other phases. Here, they are seen to have a lower probability to finish translation than other phases. The former arises because the translation rate in HD remains unaffected by drop-off: new ribosomes are initiated in sufficient numbers to compensate for drop-off and thus sustain the queue. The latter simply reflects this fact and accounts for the reduced chance of any given ribosome successfully terminating translation. Moreover, the decay of P(L) with L is slightly stronger in the MC phase than in the LD phase. This is again coherent with the observation that the dwell times of ribosomes are slightly higher in the MC phase than in the LD phase, due to more frequent steric interactions which hinder their motion. A further interesting question is whether the translation of the longest mRNA of the S. cerevisiae genome remains sustainable.
This longest mRNA sequence corresponds to the gene YLR106C (MDN1), which has 4911 codons. Using the physiological value of the drop-off rate (γ = 1.4 × 10⁻³ s⁻¹) and the corresponding physiological value of the initiation rate (α_φ = 0.015 s⁻¹), one can estimate the probability P(L) to finish translation under physiological conditions: it is found to be slightly above 10%, where we have used the more realistic scenario explained in the main text, incorporating the inhomogeneous hopping rates corresponding to this particular mRNA, as well as the extended footprint of the ribosome.
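As an illustration of how the completion probability can be evaluated in the LD phase, the following minimal sketch integrates a mean-field steady-state balance numerically. It assumes the standard TASEP mean-field current J = kρ(1 − ρ), the balance dJ/dx = −Γρ (drop-off as the only loss term), homogeneous rates, footprint 1, and an LD profile fixed by the left boundary ρ(0) = α/k < 1/2. These are assumptions made here for illustration; the exact profiles ρ_LD(x), ρ_HD(x) and ρ_MC(x) of the main manuscript are not reproduced, and all function names are hypothetical.

```python
import math


def ld_density_profile(alpha, k, Gamma, L, steps=10_000):
    """RK4 integration of (1 - 2*rho) * drho/dx = -Gamma * rho / k from rho(0) = alpha / k."""
    rho = alpha / k
    if not 0.0 < rho < 0.5:
        raise ValueError("this LD sketch assumes 0 < alpha / k < 1/2")

    def slope(r):
        # Obtained from dJ/dx = -Gamma * rho with J = k * rho * (1 - rho).
        return -Gamma * r / (k * (1.0 - 2.0 * r))

    h = L / steps
    profile = [rho]
    for _ in range(steps):
        k1 = slope(rho)
        k2 = slope(rho + 0.5 * h * k1)
        k3 = slope(rho + 0.5 * h * k2)
        k4 = slope(rho + h * k3)
        rho += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        profile.append(rho)
    return profile


def completion_probability(alpha, k, Gamma, L):
    """P(L) = J(L) / J(0), with the mean-field current J = k * rho * (1 - rho)."""
    profile = ld_density_profile(alpha, k, Gamma, L)

    def current(r):
        return k * r * (1.0 - r)

    return current(profile[-1]) / current(profile[0])


# Illustrative parameters only (not the values behind Fig 2).
for length in (0.5, 1.0, 2.0):
    print(length, completion_probability(alpha=0.2, k=1.0, Gamma=0.05, L=length))
```

In the dilute limit α/k → 0 this sketch reduces to P(L) ≈ exp(−ΓL/k), i.e. each ribosome is lost independently at a constant rate while it traverses the mRNA.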

Ribosome Drop-Off Effects Depending on Initiation Rate
Here we analyse the effect of the initiation rate α on the ribosome drop-off resilience χ_Γ. To this end, we compute the resilience χ_Γ versus the initiation rate α for a fixed value of Γ = 0.05 in two different cases: (i) β = 0.35 and (ii) β = 0.6. The remaining parameters are kept constant: L = 1, N = 500. In this way, we expect the sequence to undergo an LD-SP-HD transition in case (i), and an LD-MC transition in case (ii). Figures 3 and 4 show the results. The red dots indicate stochastic simulations, whereas the black lines represent the analytical results. The agreement between stochastic simulations and analytics is good in general, with the MC phase showing a slight deviation, consistent with the results presented in the section 'mRNAs with High Elongation Rates Are more Resilient to Ribosome Drop-Off' of the main manuscript.
Within the LD phase, the resilience decreases, although very slightly, as the initiation rate α is increased. However, it is constant in both the HD and MC phases, since the ribosomal current does not depend on α in those phases. Moreover, in the HD phase we recover the maximal possible resilience, equal to 1, and therefore the resilience increases with α as we go through the SP (Fig. 3).

Estimation of Elongation Rates
We estimate the elongation rates for S. cerevisiae for the ribosome drop-off model following the approach taken in [1]. We assume a spatially homogeneous pool of tRNAs in the cell. We first assign synonymous codons the same elongation rate, proportional to the concentration of their cognate tRNA. Since not all tRNA abundances are available for S. cerevisiae, the Gene Copy Number (GCN) is taken as a proxy that indicates the proportion of different tRNAs in the cell [2]. Therefore, we start from
\[ k_i = r\,\frac{\mathrm{GCN}_i}{\sum_{j=1}^{41} \mathrm{GCN}_j}, \tag{3} \]
where r is a constant of proportionality. Note that the sum in Eq (3) goes only up to 41, corresponding to the number of different tRNAs in S. cerevisiae.
In the following step we account for the reduction in elongation rates due to wobble base-pairing [3]: the translation rates of codons using the G-U wobble are reduced by 39% compared to their G-C counterparts. Analogously, the rates of codons using the I-C wobble and of codons using the I-A wobble are reduced by 36% relative to their I-U counterparts [3]. We then reduce the initially set rates accordingly. Finally, to calculate the factor of proportionality r, we impose ⟨k_i⟩ = 10 codons/s, where ⟨·⟩ denotes the average over all 61 codons, calculated as
\[ \langle k_i \rangle = \frac{1}{n} \sum_{i=1}^{61} n_i k_i, \tag{4} \]
where n_i denotes the total number of codons of type i in the cell, and n = Σ_{i=1}^{61} n_i. By equating Eq (4) to 10 codons/s we can then obtain the value of r numerically. The table of the resulting elongation rates for each codon is provided in the Supplementary Tables. In this procedure the termination rate is assumed to be fast and comparable to the highest elongation rate, and we therefore take it to be β = 18.03 s⁻¹.
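The procedure described above can be sketched as follows. The tRNA names, codon labels, gene copy numbers and codon counts below are hypothetical toy values chosen only to make the example runnable; they are not S. cerevisiae data. The wobble factors 0.61 and 0.64 correspond to the 39% and 36% reductions quoted above. Because the rates in this sketch are linear in r, the normalisation has a closed form here, whereas the text obtains r numerically.

```python
def estimate_elongation_rates(trna_gcn, codon_decoding, codon_counts, target_mean=10.0):
    """Scale GCN-proportional codon rates so their usage-weighted mean equals target_mean.

    trna_gcn:       tRNA name -> gene copy number
    codon_decoding: codon -> (decoding tRNA name, wobble factor in (0, 1])
    codon_counts:   codon -> total number n_i of that codon in the cell
    """
    total_gcn = sum(trna_gcn.values())
    # Unscaled rates (r = 1): relative gene copy number times the wobble penalty.
    unscaled = {
        codon: wobble * trna_gcn[trna] / total_gcn
        for codon, (trna, wobble) in codon_decoding.items()
    }
    n = sum(codon_counts[codon] for codon in unscaled)
    mean_unscaled = sum(codon_counts[c] * unscaled[c] for c in unscaled) / n
    r = target_mean / mean_unscaled  # rates are linear in r, so r follows directly
    return {codon: r * u for codon, u in unscaled.items()}, r


# Hypothetical toy input: two tRNAs and three codons, one read via a G-U wobble.
toy_gcn = {"tRNA_A": 10, "tRNA_B": 5}
toy_decoding = {
    "codon_1": ("tRNA_A", 1.00),  # Watson-Crick pairing, no reduction
    "codon_2": ("tRNA_A", 0.61),  # G-U wobble: rate reduced by 39%
    "codon_3": ("tRNA_B", 1.00),
}
toy_counts = {"codon_1": 100, "codon_2": 50, "codon_3": 50}

rates, r = estimate_elongation_rates(toy_gcn, toy_decoding, toy_counts)
for codon, rate in rates.items():
    print(f"{codon}: {rate:.2f} codons/s")
```

With real data, the dictionaries would hold the 41 tRNA species and the 61 sense codons, with an I-C or I-A wobble encoded by a factor of 0.64.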

Genome-Wide Estimation of Physiological Values of Initiation Rates
We estimate the physiological value of the initiation rate for each mRNA sequence by using genome-wide experimental data for the ribosome density ρ_φ from [4], obtained for yeast grown under physiological conditions. For each mRNA we identify the physiological value α_φ of the initiation rate using our ribosome drop-off model, by requiring ρ(α_φ) = ρ_φ, i.e., we take the initiation rate to be the one which replicates the experimentally measured ribosome density, as illustrated in Fig 5 for the mRNA of YAL033W (POP5) [1]. Figure 6 shows the histogram of the genome-wide estimated values of α_φ. Furthermore, we have divided this histogram into four regions, from small to high α_φ, and performed a Gene Ontology (GO) enrichment analysis, using the software Gene Ontology Term Finder (http://go.princeton.edu/cgi-bin/GOTermFinder). P-values are computed using the hypergeometric distribution, and the Bonferroni correction is applied to take into account multiple hypothesis tests [5]. Significantly enriched GO annotations can be identified in each region of the histogram. Thus, mRNAs with α_φ < 0.1 s⁻¹ encode a highly disproportionate number of proteins involved in biological regulation, the regulation of metabolic processes, regulation of transcription and response to stimulus, as well as proteins involved in the cell cycle, mainly located in the nucleus, protein complexes, chromosome and the membrane. In the region corresponding to initiation rates 0.1 s⁻¹ < α_φ < 0.2 s⁻¹ we find significant over-representation of proteins involved in different aspects of metabolism, such as carboxylic acid metabolic process, oxoacid metabolic process and organic acid metabolic process. In the next region, for 0.2 s⁻¹ < α_φ < 0.3 s⁻¹, GO annotations corresponding to cytoplasmic translation, metabolism, glycolysis and ATP generation from ADP are significantly over-represented, even though a large proportion of those genes are unannotated.
Finally, in the last region of the histogram, for α_φ > 0.3 s⁻¹, most genes are unannotated. Table 1 summarises the results from the GO enrichment analysis.
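For readers unfamiliar with the statistics behind such an enrichment analysis, the following minimal sketch shows a one-sided hypergeometric p-value with Bonferroni correction. It is a generic illustration and not the implementation used by Gene Ontology Term Finder; the function names and the numbers in the example are hypothetical.

```python
from math import comb


def hypergeom_enrichment_pvalue(population, annotated, sample, hits):
    """P(X >= hits) when `sample` genes are drawn without replacement from a
    population of `population` genes, `annotated` of which carry the GO term."""
    upper = min(annotated, sample)
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, sample - i)
        for i in range(hits, upper + 1)
    )
    return favourable / comb(population, sample)


def bonferroni(p_value, n_tests):
    """Bonferroni-corrected p-value, capped at 1."""
    return min(1.0, p_value * n_tests)


# Hypothetical example: 6000 genes, 300 annotated with a term,
# 500 genes in one histogram region, 60 of them annotated, 1000 terms tested.
p = hypergeom_enrichment_pvalue(6000, 300, 500, 60)
print(p, bonferroni(p, 1000))
```

A term is reported as significantly enriched in a region when its corrected p-value falls below the chosen significance threshold.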
These results are consistent with the results presented in the main text on the GO enrichment analysis of resilience to drop-off and its correlation with the abundance of encoded proteins. From the GO analysis for the initiation rate values we see that mRNAs encoding proteins which are present at high levels in the cell (such as the ones involved in translation and glycolysis) have predominantly high initiation rates. In contrast, mRNAs encoding stress-related proteins, typically present in the cell at low levels under physiological conditions, have mainly low initiation rates.

Figure 6. Histogram of physiological values of the initiation rate α_φ for the S. cerevisiae genome (horizontal axis: initiation rate (1/s); vertical axis: frequency). The colours of the different regions correspond to the following α_φ intervals: green: α_φ < 0.1 s⁻¹; yellow: 0.1 s⁻¹ < α_φ < 0.2 s⁻¹; orange: 0.2 s⁻¹ < α_φ < 0.3 s⁻¹; and red: α_φ > 0.3 s⁻¹.
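The matching condition ρ(α_φ) = ρ_φ used in this section to estimate initiation rates amounts to a one-dimensional root-finding problem. The sketch below solves it by bisection, assuming only that the model density increases with α on the search interval. The stand-in density function (LD phase without drop-off, ρ = α/k) is a deliberately simplified assumption used to make the example runnable; it is not the drop-off model of the main text, and all names and values are hypothetical.

```python
def fit_initiation_rate(density_model, rho_measured, alpha_lo, alpha_hi,
                        tol=1e-10, max_iter=200):
    """Bisection for alpha such that density_model(alpha) == rho_measured.

    Assumes density_model is increasing in alpha on [alpha_lo, alpha_hi].
    """
    if density_model(alpha_lo) > rho_measured or density_model(alpha_hi) < rho_measured:
        raise ValueError("measured density is not bracketed by the search interval")
    for _ in range(max_iter):
        mid = 0.5 * (alpha_lo + alpha_hi)
        if density_model(mid) < rho_measured:
            alpha_lo = mid   # model density too low: increase alpha
        else:
            alpha_hi = mid   # model density too high (or equal): decrease alpha
        if alpha_hi - alpha_lo < tol:
            break
    return 0.5 * (alpha_lo + alpha_hi)


def toy_density(alpha, k=10.0):
    """Stand-in model: LD density without drop-off, rho = alpha / k."""
    return alpha / k


# Hypothetical measured density of 0.02 ribosomes per codon.
alpha_phi = fit_initiation_rate(toy_density, 0.02, alpha_lo=0.0, alpha_hi=1.0)
print(alpha_phi)
```

In practice, `toy_density` would be replaced by a function that evaluates the mean ribosome density of the full model for a given mRNA sequence, either analytically or by simulation.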