An optimal growth law for RNA composition and its partial implementation through ribosomal and tRNA gene locations in bacterial genomes

The distribution of cellular resources across bacterial proteins has been quantified through phenomenological growth laws. Here, we describe a complementary bacterial growth law for RNA composition, emerging from optimal cellular resource allocation into ribosomes and ternary complexes. The predicted decline of the tRNA/rRNA ratio with growth rate agrees quantitatively with experimental data. Its regulation appears to be implemented in part through chromosomal localization, as rRNA genes are typically closer to the origin of replication than tRNA genes and thus have increasingly higher gene dosage at faster growth. At the highest growth rates in E. coli, the tRNA/rRNA gene dosage ratio based on chromosomal positions is almost identical to the observed and theoretically optimal tRNA/rRNA expression ratio, indicating that the chromosomal arrangement has evolved to favor maximal transcription of both types of genes at this condition.


Supporting Information for
An optimal growth law for RNA composition and its partial implementation through ribosomal and tRNA gene locations in bacterial genomes
Xiao-Pan Hu, Martin J. Lercher

(1), which ignores protein degradation, whereas the dashed red curve shows the RNA composition growth law with protein degradation included. Protein degradation rates typically range from 0.02 h⁻¹ to 0.04 h⁻¹ [1,2] (see estimation below) and are higher at low growth rates than at high growth rates [1-3]. We here assumed a constant protein degradation rate (kdeg = 0.04 h⁻¹); its inclusion is equivalent to shifting the optimal TC/ribosome ratio curve by 0.04 h⁻¹ to the left along the growth rate axis. As the degradation rate is small compared to the maximal growth rate of E. coli [1,2], protein degradation affects the optimal TC/ribosome ratio only at very low growth rates, where the degradation rate becomes comparable to the growth rate.

to the data reported in [2] and to range from 0.025 h⁻¹ to 0.03 h⁻¹ according to the data reported in [1]. To be conservative, we here chose kdeg = 0.04 h⁻¹, the largest reported degradation rate.

Fig B.
Comparison of the optimal TC/ribosome expression ratio with a previously reported optimal TC/ribosome ratio for E. coli [4]. Klumpp et al. [4] predicted the optimal TC/ribosome expression ratio by identifying the proteome fractions of ribosome and TC that maximize growth rate in a coarse-grained, phenomenological model of cellular growth. This optimal proteome allocation pattern results in a predicted TC/ribosome ratio (gray line, extracted from Fig 4C in [4]) that is substantially lower than the experimentally observed ratios. Here, we considered the optimal resource allocation into cellular dry mass (RNA and protein combined; red curve), motivated by the near-constant cellular dry mass density across growth conditions. The optimal dry mass allocation explains the experimentally observed TC/ribosome expression ratios much better than the optimal proteome allocation considered in Ref. [4]. Protein accounts for roughly 1/3 of the ribosome mass, whereas it accounts for roughly 2/3 of the ternary complex mass. If only the protein cost is considered, the ternary complex appears much more expensive to the cell, resulting in a lower predicted ternary complex/ribosome ratio.

The effective population sizes (Ne) for 46 out of the 170 species were obtained from Ref. [5]. As shown in panels 1-3, we found no statistically significant Spearman rank correlations (P > 0.1; see colored text in panels 1-3) between Ne and the different positions considered in our study (position of rRNA genes, position of tRNA genes, and the relative position, position_tRNA − position_rRNA) when analyzing fast- and slow-growing species. In contrast, we found statistically significant correlations between Ne and genome size (panel 4) and, for fast-growing species, between Ne and maximal growth rate μmax (panel 5).

Fig E.
The numbers of genomically encoded rRNA genes and tRNA genes are positively correlated with μmax and negatively correlated with tRNA/ribosome ratios. With increasing maximal growth rate μmax, bacterial genomes harbor more rRNA genes (panel 1) and tRNA genes (panel 2). Panels 3-6 show the relationship between the tRNA/ribosome dosage and genomic ratios and the numbers of ribosome and tRNA genes in the genome. While a negative correlation between the tRNA/ribosome ratios and the number of rRNA genes would trivially occur also if the numbers of tRNA and rRNA genes were independent (panels 3 and 4), a negative correlation of the tRNA/ribosome ratios with the number of tRNA genes would not be expected in this case (panels 5 and 6); the latter observation thus provides strong support for our hypothesis that the relative numbers of tRNA and rRNA genes are constrained to optimize translation efficiency.

[6]. The DNA replication rate became slow at low temperatures, and multiple replication rounds were observed even at low growth rates. This effect made the rRNA/tRNA dosage ratio almost independent of growth rate. As we could not find experimental data for the tRNA/ribosome expression ratio under temperature stress, it is unclear whether the tRNA/ribosome expression ratio is still optimal under these conditions.
Text A. Assessment of the tRNA/rRNA gene dosage ratio calculated with a constant DNA replication rate (krep = 1000 bp s⁻¹)
In the main text, we used a constant DNA replication rate (krep = 1000 bp s⁻¹) for all species to calculate the C period and the tRNA/rRNA gene dosage ratio (Fig 1B and Fig 3D). This is an approximation, as the DNA replication rate (1) is growth rate-dependent in a given species [10,11] and (2) is species-specific and depends on the maximal growth rate (μmax) across species. Here, we assess how the assumption of a constant krep = 1000 bp s⁻¹ affects the tRNA/rRNA dosage ratio in the results shown in Fig 1B and Fig 3D.

The effect of a growth rate-dependent C period on the tRNA/rRNA dosage ratio in E. coli
In E. coli, the C period is almost constant at growth rates above 0.7 h⁻¹ but increases with decreasing growth rate below 0.7 h⁻¹ (Fig G1; data from Refs. [10,11]). However, as shown in Fig G2, when the tRNA/rRNA dosage ratio is calculated using the experimentally observed growth rate-dependent C period, the results are very similar to those calculated with equation (21) under the assumption of a constant C period and DNA replication rate (krep = 1000 bp s⁻¹). Thus, setting krep = 1000 bp s⁻¹ appears to be an acceptable approximation for the tRNA/rRNA dosage ratio calculation.
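The dependence of gene dosage on chromosomal position and C period can be illustrated with a short sketch. It is not a transcription of equation (21), which is not reproduced in this document; instead it assumes the standard Cooper-Helmstetter scaling, bidirectional replication at a constant fork speed, and a dosage ratio defined as summed tRNA gene dosage over summed rRNA gene dosage. All function names, gene positions, and the genome size are hypothetical choices made for illustration.

```python
import math

def c_period_hours(genome_size_bp, k_rep_bp_per_s=1000.0):
    """Time to replicate the chromosome when two forks each copy one half."""
    return (genome_size_bp / 2.0) / k_rep_bp_per_s / 3600.0

def relative_dosage(position, growth_rate_per_h, c_hours):
    """Copy number of a gene relative to a gene at the terminus.

    position is the relative distance from oriC (0 = origin, 1 = terminus).
    Assumed Cooper-Helmstetter scaling: 2 ** (C * (1 - position) / doubling_time).
    """
    doubling_time_h = math.log(2) / growth_rate_per_h
    return 2.0 ** (c_hours * (1.0 - position) / doubling_time_h)

def dosage_ratio(trna_positions, rrna_positions, growth_rate_per_h, c_hours):
    """Summed tRNA gene dosage divided by summed rRNA gene dosage (assumed definition)."""
    trna = sum(relative_dosage(p, growth_rate_per_h, c_hours) for p in trna_positions)
    rrna = sum(relative_dosage(p, growth_rate_per_h, c_hours) for p in rrna_positions)
    return trna / rrna

# Hypothetical relative positions, not taken from any genome annotation.
trna_positions = [0.10, 0.30, 0.45, 0.60]
rrna_positions = [0.05, 0.15]
c_hours = c_period_hours(4_600_000)  # assumed genome size of about 4.6 Mb

for mu in (0.2, 1.0, 2.0):
    print(mu, round(dosage_ratio(trna_positions, rrna_positions, mu, c_hours), 3))
```

Because the tRNA genes in this toy example lie farther from oriC than the rRNA genes, the ratio declines as the growth rate increases; with identical positions it would stay at the ratio of gene counts.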

The effect of a species-specific replication rate on the tRNA/rRNA gene dosage ratio across species
In a literature search, we found estimates of the replication rate or C period for 5 species in our dataset, including one slow-growing species and four fast-growing species (Fig H1 and S5 Table). We fitted a linear model for the dependence of the replication rate on μmax (both on log scale; Fig H1) and used this linear model to estimate the μmax-dependent DNA replication rate for all species in our dataset. The tRNA/ribosome gene dosage ratio estimated using a μmax-dependent replication rate (S4 Table) is very similar to the dosage ratio estimated using a constant replication rate krep = 1000 bp s⁻¹ (Fig H2; R² = 0.973). Further, the dependence of the tRNA/ribosome gene dosage ratio on the maximal growth rate μmax when considering a growth rate-dependent replication rate (Fig H3; Spearman's rank correlation ρ = −0.47, P = 7.0 × 10⁻⁴ for fast-growing species; ρ = −0.47, P = 5.5 × 10⁻⁸ for slow-growing species) is very similar to that shown in Fig 3D. Thus, the dependence of the DNA replication rate on μmax does not appear to affect our conclusions.

Relationship between the tRNA/rRNA gene dosage ratio calculated using a μmax-dependent replication rate and μmax.
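The log-log fit described in Text A can be sketched as an ordinary least-squares regression of log10(krep) on log10(μmax). The sketch below is only an illustration of that procedure: the function names are hypothetical, and the five data points are placeholders generated from an arbitrary power law, not the estimates listed in S5 Table.

```python
import math

def fit_loglog(mu_max, k_rep):
    """Least-squares fit of log10(k_rep) = intercept + slope * log10(mu_max)."""
    xs = [math.log10(v) for v in mu_max]
    ys = [math.log10(v) for v in k_rep]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict_k_rep(mu_max, slope, intercept):
    """Replication rate predicted by the fitted power law."""
    return 10.0 ** (intercept + slope * math.log10(mu_max))

# Placeholder inputs: an arbitrary power law, NOT measured replication rates.
placeholder_mu_max = [0.1, 0.5, 1.0, 2.0, 4.0]                 # h^-1
placeholder_k_rep = [500.0 * m ** 0.5 for m in placeholder_mu_max]  # bp s^-1

slope, intercept = fit_loglog(placeholder_mu_max, placeholder_k_rep)
print(slope, intercept, predict_k_rep(1.5, slope, intercept))
```

The fitted slope and intercept can then be used to assign a μmax-dependent replication rate to every species before recomputing the gene dosage ratio.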

Text B. Genome size affects gene position
At the same DNA replication rate, bacteria with smaller genomes need less time to replicate their DNA than bacteria with larger genomes; at the same growth rate, they hence have fewer simultaneous replication rounds in the cell. For example, Streptococcus pneumoniae (with a 2.1 Mb genome) does not need multiple replication rounds at its fast growth rate [7]. Thus, replication-associated gene dosage effects will be less important in bacteria with small genomes, and the positions of tRNA and rRNA genes might play less important roles in these species.
Text C. The positions of tRNA genes are not correlated with the corresponding codon frequencies
In the main text, we showed that the genomic averages of tRNA gene positions are affected by the maximal growth rate of a species (Fig 3B) and constrained by the optimal RNA growth law, which posits that tRNA genes are on average farther from oriC than rRNA genes (equation (1) and Fig 3C). When calculating the tRNA/rRNA gene dosage ratio (Fig 3D), we treated all tRNAs equally. However, tRNAs decode codons with different frequencies. While the optimal scaling of the tRNA/rRNA expression ratio (equation (2)) with growth rate is independent of codon frequencies, it is still conceivable that selection pressure toward specific genomic positions is stronger for tRNA genes whose products decode more abundant codons. We thus asked if there is a systematic dependence of genomic positions on the frequencies of the cognate codons.
As the anticodons of tRNAs are not annotated in RefSeq for some species [8], we identified the anticodons using tRNAscan-SE 2.0 [9] and used the wobble pairing rules to find the cognate codon(s) of a given tRNA. A tRNA's cognate codon frequency was then calculated as the summed frequency of all its cognate codons.
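The mapping from anticodon to cognate codons can be sketched as follows. The sketch assumes Crick's simplified wobble rules for the first anticodon base (G pairs with C or U, U with A or G, C with G, A with U, and inosine with U, C, or A); the exact rule set used in the analysis is not specified here, and base modifications are ignored. The function names and the codon frequencies in the usage example are hypothetical.

```python
# Watson-Crick complements for the second and third anticodon bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Assumed simplified wobble rules: first anticodon base (5' end)
# mapped to the codon third bases it can read.
WOBBLE = {"G": "CU", "U": "AG", "C": "G", "A": "U", "I": "UCA"}

def cognate_codons(anticodon):
    """Return the codons (5'->3', RNA alphabet) read by an anticodon given 5'->3'."""
    a = anticodon.upper().replace("T", "U")
    first = COMPLEMENT[a[2]]   # codon base 1 pairs with anticodon base 3
    second = COMPLEMENT[a[1]]  # codon base 2 pairs with anticodon base 2
    return [first + second + third for third in WOBBLE[a[0]]]

def cognate_codon_frequency(anticodon, codon_freq):
    """Sum the frequencies of all codons read by the given anticodon."""
    return sum(codon_freq.get(codon, 0.0) for codon in cognate_codons(anticodon))

# Hypothetical codon frequencies, for illustration only.
example_freq = {"UUC": 0.016, "UUU": 0.022, "AUG": 0.028}
print(cognate_codons("GAA"), cognate_codon_frequency("GAA", example_freq))
```

Anticodons written in the DNA alphabet, as in many annotation files, are converted to RNA before the lookup.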
We tested if there is a correlation between a tRNA's position and its cognate codon frequency in a given species. Please note that the analysis here is different from Fig 3B in the main text. In the main text, we tested if the average position of tRNA genes tends to be close to oriC in fast-growing species. Here, we test if individual tRNAs with higher codon frequencies tend to be located closer to oriC in a given genome.
Statistically significant correlations (P < 0.05) between tRNA genomic position and the cognate codon frequency were found in 21 out of 170 species (Fig J1). Six of these 21 species show positive correlations, while the remaining 15 species show negative correlations. Thus, in only 15 species do tRNAs that decode more frequent codons tend to be located closer to the origin of replication.
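The per-genome test above relies on a rank correlation between tRNA position and cognate codon frequency. A minimal sketch of Spearman's ρ, computed as the Pearson correlation of average ranks, is given below; it does not compute P values (library routines such as scipy.stats.spearmanr return both ρ and a P value). The function names and the example values are hypothetical.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical example: relative distance from oriC and cognate codon frequency.
positions = [0.10, 0.40, 0.80]
codon_frequencies = [0.05, 0.03, 0.01]
print(spearman_rho(positions, codon_frequencies))
```

A negative ρ in this setup corresponds to tRNAs with more frequent cognate codons lying closer to oriC.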
We also tested for a correlation between tRNA copy number (the number of tRNA genes with the same anticodon) and cognate codon frequency. We found that tRNAs with higher copy numbers tend to have higher cognate codon frequencies in most fast-growing species (Fig J2; P < 0.05 in 45 out of 48 species).
We thus conclude that in fast-growing species, the copy number of a tRNA gene but not its genomic position is strongly affected by its cognate codon frequency; this finding supports our equal treatment of tRNAs when calculating tRNA gene dosage.

Text D. The effect of assuming a constant TC/ribosome expression ratio instead of an optimal ratio
Several recent modeling studies have assumed that the TC/ribosome expression ratio is constant [4,12-16]. In contrast, we found that the optimal TC/ribosome expression ratio is growth rate-dependent (equation (1)), a relationship consistent with experimental data across species (Fig 1B and Fig 2A). In this section, we estimate by how much growth rate predictions are expected to change when accounting for this growth rate dependence. To be independent of any specific model, we estimated the associated change in the cellular cost for translation, where, consistent with the assumptions of our optimality estimate (equation (1)), we used the cytosol density as a proxy for cost; detailed predictions derived from this cost measure have been found to be in good agreement with experimental data for the E. coli translation machinery [17].
We first calculated the optimal dry mass per volume occupied by ribosomes and TCs as a function of growth rate according to our model. Assuming a constant protein concentration across conditions in E. coli, we calculated the molar concentrations of ribosomes and TCs based on equations (9) and (10). Using the respective molecular weights, we converted both estimates for the ribosome and TC concentrations into a combined cytosolic mass density of the core translation components. Comparing these combined mass densities, we found that the total dry mass allocated to translation is very similar between the two calculations at high growth rates; substantial differences are only found at low growth rates (Fig K).
Thus, the assumption of a constant TC/ribosome expression ratio will lead to very similar growth rate predictions at moderate to fast growth rates. However, translation is a very expensive process for fast-growing cells, and even these small differences will have a substantial effect on evolution in natural populations. The efficiency of natural selection in large bacterial populations likely explains why the optimal expression of the ribosome and TC (Fig 1B and Fig 2) and the differential genomic position of rRNA and tRNA genes (Fig 3) are consistent with experimental data across species.