
Correlation between gene expression levels under drought stress and synonymous codon usage in rice plant by in-silico study

  • Fatemeh Chamani Mohasses,

    Roles Data curation, Formal analysis, Investigation, Methodology, Software, Validation, Visualization, Writing – original draft

    Affiliation Department of Plant Breeding and Biotechnology (PBB), Faculty of Agriculture, University of Zabol, Zabol, Iran

  • Mahmood Solouki,

    Roles Writing – review & editing

    Affiliation Department of Plant Breeding and Biotechnology (PBB), Faculty of Agriculture, University of Zabol, Zabol, Iran

  • Behzad Ghareyazie,

    Roles Conceptualization, Project administration, Supervision, Validation, Writing – review & editing

    Affiliation Agricultural Biotechnology Research Institute of Iran (ABRII), Agricultural Research Education and Extension Organization (AREEO), Karaj, Iran

  • Leila Fahmideh,

    Roles Writing – review & editing

    Affiliation Department of Plant Breeding and Biotechnology (PBB), Faculty of Agriculture, University of Zabol, Zabol, Iran

  • Motahhareh Mohsenpour

    Roles Conceptualization, Formal analysis, Methodology, Project administration, Supervision, Validation, Visualization, Writing – review & editing

    Affiliation Agricultural Biotechnology Research Institute of Iran (ABRII), Agricultural Research Education and Extension Organization (AREEO), Karaj, Iran


Abstract

We studied the correlation between synonymous codon usage (SCU) and gene expression levels under drought stress in rice. Sixty genes related to drought stress (with high, intermediate and low expression) were selected from rice meta-analysis data, and various codon usage indices such as the effective number of codons (ENC), codon adaptation index (CAI) and relative synonymous codon usage (RSCU) were calculated. We found that in genes highly expressed under drought 1) GC content was higher, 2) ENC value was lower, 3) the preferred codons of some amino acids changed and 4) the RSCU ratio of GC-end codons relative to AT-end codons for 18 amino acids increased significantly compared with those in other genes. We introduce ARSCU, the average ratio of RSCUs of GC-end codons to AT-end codons in each gene, which could significantly separate high-expression genes under drought from low-expression genes. ARSCU is calculated using the program ARSCU-Calculator, developed by our group to help predict the expression level of rice genes under drought. A value above the ARSCU threshold is expected to indicate that the gene under study may belong to the “high expression group under drought”. This information may be applied to codon optimization of genes for rice genetic engineering. To validate these findings, we used 60 other genes (a randomly selected subset of 43233 genes studied for their response to drought stress). The ARSCU value predicted the level of expression in 88.33% of the cases. In a third set of 60 genes selected from highly expressed genes not related to drought, only 31.65% of the genes showed an ARSCU value higher than the set threshold. This indicates that the phenomenon described in this report may be unique to drought-related genes. To explain the observed correlation between codon usage bias (CUB) and highly expressed genes under drought, we hypothesize a possible role of tRNA post-transcriptional modification and tRFs as the underlying biological mechanism.


Introduction

Abiotic stresses such as drought, salinity, low or high temperatures and other environmental extremes negatively affect crop growth and reduce crop yield on a global scale [1]. Among these stresses, drought is the most important factor limiting yield in agricultural systems in arid and semi-arid regions [2]. Rice (Oryza sativa), a well-known cost-effective cereal, is very sensitive to drought stress because of its limited adaptation to water-deficit conditions. Although rice germplasm has a large amount of functional genetic diversity for drought tolerance-related traits and mechanisms [3], breeding for drought tolerance remains a challenge because of its complex genetic nature, high environmental plasticity and the multiple metabolic pathways involved. There are hopes that transgenic technology can improve stress tolerance by introducing novel exogenous genes or altering the expression levels of endogenous genes [4]. Researchers have endeavored during recent decades to generate transgenic crops with improved tolerance against abiotic stresses. The first step of transgenic research is the identification of genes serving as key regulators of different metabolic pathways, including osmolyte synthesis, ion homeostasis through selective ion uptake, antioxidant defense system and other frontline defense pathways [5]. Rice is currently considered a model crop for genetic engineering with well-developed tissue culture and gene transfer protocols. There are numerous reports on the production of genetically modified (GM) rice for such traits as enhanced resistance against insect pests and/or herbicide tolerance [6–10]. In some cases, approvals have even been issued for such GM rice plants in Japan, Canada, Colombia, Mexico, Honduras, New Zealand, Philippines and the United States [11].
However, unlike in the case of drought tolerant maize (DroughtGard) [12], to date there is no verifiable report of production and/or approval of any GM drought-tolerant rice. Despite the ever-increasing discovery and introduction of genes affecting different traits, the challenge remains to discover and explore genes that effectively confer abiotic stress tolerance [13]. Plant species are reported to show wide diversity in their gene expression, physiology and stress response under different environmental conditions [14]. A key to progress towards breeding better crops for stressed conditions is to understand the alterations in cellular, biochemical and molecular machinery that occur in response to stress [15]. A particular stress changes the expression level of specific genes in a species-dependent fashion. It causes differences in the efficiency of signal perception and subsequent transcriptional changes, leading to elicitation of a specific response, adaptation and finally enhanced stress tolerance [16]. In addition, gene expression levels correlate with several features of the underlying genes and encoded proteins, including synonymous codon (codons encoding the same amino acid) usage, amino acid composition, rates of protein evolution, and the length of coding sequence [17,18]. Synonymous codons are not used equally in all genes and/or species. This phenomenon, called codon usage bias (CUB), is ubiquitous in all organisms. Mutational pressure and natural selection are considered the two major factors contributing to CUB [19]. The correlation of CUB with gene expression levels has been reported in organisms from all domains of life [20–29]. Codon usage reflects the specific pattern of gene expression, and it has been noted that the same codons are preferred in genes expressed under the same physiological state [30].
It has been shown that the overall preference of amino acid usage and codon usage in the proteins of a given organism is significantly affected by major environmental factors [31]. That study notes the presence of a correlation between cellular function and codon usage profiles of the genes in the studied pairs. In addition, some previous studies showed a correlation between gene expression levels and CUB [18,32,33]. Furthermore, several published studies support the idea that translation of specific codon-biased transcripts in stress response genes can be regulated by tRNA modifications [34–36]. Wobble-base tRNA modification levels have the potential to work in concert with codon usage patterns in specific transcripts to regulate translation of response proteins. Dedon et al. [37] designated these as modification tunable transcripts (MoTTs). Modified tRNAs participate directly in translation. However, unmodified mature tRNAs or pre-tRNAs are processed into various types of transfer RNA-derived fragments known as tRFs [38–41]. Production of tRFs is promoted by abiotic stress conditions and eventually controls gene expression through transcriptional, post-transcriptional and translational regulation [42].

Hence, knowledge of the codon usage and codon-pair context patterns of plants and underlying evolutionary forces will be useful to understand the molecular mechanism of environmental adaptation and biological diversity of each species [43]. In addition, this information can be applied for codon optimizations of diverse sets of genes to be used in plant transformation programs. Here, we use genome sequence and genome-wide expression data of drought responsive genes to investigate the relationship between gene expression levels and CUB in rice.

Material and methods

Selection of genes with different expression levels under drought stress

Sixty genes associated with drought stress in rice were identified through a structured literature survey of meta-analysis data [44–46]. The genes were then divided into three categories based on their expression levels under drought conditions: high-, intermediate- and low-expression genes. Complete coding DNA sequences of these genes were accessed from the NCBI nucleotide database [47]. In the present study, high-expression genes included NAC transcription factors, zinc finger proteins, AP2/ERF transcription factors, LEA proteins, phosphatidylethanolamine binding proteins, peptide transporters, dioxygenases, actin-binding proteins, protein phosphatases, dehydrins and membrane protein families. These genes have been reported to show high expression under drought stress in different experiments (Table 1). In addition to their high expression, some of these genes conferred enhanced drought tolerance in reported genetic engineering experiments [48–57] (Table 2).

Table 1. Genes, their level of expression under drought condition (high, intermediate and low), their encoded proteins, log2 of the fold change (gene expression level), the nucleotide composition, ENC value and CAI value.

(Additional information about the genes used in genetic transformation programs (shown with asterisk) is provided in Table 2).

Table 2. High expression genes that have been used in genetic engineering programs for enhanced stress tolerance.

Analysis of the nucleotide composition

The program CAIcal [58] was used to calculate the AT content and GC content at first, second and the third nucleotide positions, (AT1, AT2, AT3) and (GC1, GC2, GC3), respectively.
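The position-specific GC values described above can be illustrated with a short Python sketch. This is not CAIcal's implementation; the function name `gc_by_position` is hypothetical, and the handling of a trailing partial codon (it is dropped) is an assumption. AT content at each position is simply 100 minus the corresponding GC value.

```python
def gc_by_position(cds: str) -> dict:
    """Return overall GC% and GC% at codon positions 1, 2 and 3 of a coding sequence."""
    seq = cds.upper()
    # Assumption: drop a trailing partial codon so positions stay in frame.
    seq = seq[: len(seq) - len(seq) % 3]
    if not seq:
        raise ValueError("sequence shorter than one codon")

    def gc_percent(bases: str) -> float:
        return 100.0 * sum(b in "GC" for b in bases) / len(bases)

    return {
        "GC": gc_percent(seq),
        "GC1": gc_percent(seq[0::3]),  # first codon position
        "GC2": gc_percent(seq[1::3]),  # second codon position
        "GC3": gc_percent(seq[2::3]),  # third codon position
    }


if __name__ == "__main__":
    # Toy sequence (ATG GCC GCG), not one of the study genes.
    print(gc_by_position("ATGGCCGCG"))
```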

Relative Synonymous Codon Usage (RSCU) analysis

RSCU was calculated following Hastings and Emerson [59], as the ratio of the observed frequency of a codon to the frequency expected for that codon if all codons within its synonymous group were used equally (no bias) in the entire coding sequence of the gene concerned. RSCU equals one when there is no bias, is greater than one when the codon is preferred and is less than one when the codon is underused [60]. RSCU values range from zero, where the codon is not used at all, to 2, 3, 4 or 6 when only one codon is used to encode an amino acid with 2, 3, 4 or 6 synonymous codons, respectively. A higher RSCU value indicates stronger CUB. The RSCU values of the 60 genes associated with drought stress in rice were calculated using ACUA software [61] and the program CAIcal [58], excluding the stop codons and the two amino acids that are encoded by a single codon (Trp and Met).
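As an illustration of the definition above, RSCU for a single coding sequence can be sketched in Python as follows. This is not the ACUA or CAIcal code; the function name `rscu` is hypothetical, the standard genetic code is assumed, and assigning 0.0 to codons of an amino acid that does not occur in the gene is a choice made here for simplicity.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds: str) -> dict:
    """RSCU per codon, excluding stop codons, Met and Trp (59 codons)."""
    seq = cds.upper()
    codons = [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group codons into synonymous families.
    families = {}
    for codon, aa in CODON_TABLE.items():
        if aa not in "*MW":
            families.setdefault(aa, []).append(codon)

    values = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        for codon in family:
            # observed / expected under equal usage = count * family size / family total
            # Assumption: 0.0 when the amino acid is absent from the gene.
            values[codon] = counts[codon] * len(family) / total if total else 0.0
    return values


if __name__ == "__main__":
    toy = rscu("GCGGCGGCC")  # three alanine codons: GCG, GCG, GCC
    print(round(toy["GCG"], 3), round(toy["GCC"], 3), toy["GCT"])
```

Within each synonymous family that occurs in the gene, the RSCU values sum to the family size, which is why a single exclusively used codon reaches 2, 3, 4 or 6.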

ENC analysis

The effective number of codons (ENC) measures how many of the available codons are effectively used in a gene. Its value ranges from 20 (extreme bias, with only one codon used for each amino acid) to 61 (all synonymous codons used equally). A value well below this upper limit suggests that CUB may be present (bearing in mind that Met and Trp each have only one codon). We used CAIcal to calculate ENC.
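The formula behind ENC is not spelled out in this section. The standard formulation is Wright's (1990); that CAIcal follows it exactly is an assumption here. For an amino acid with k synonymous codons observed n times in the gene, with codon i used at frequency p_i, the homozygosity F is computed and then averaged within each degeneracy class:

```latex
F = \frac{n \sum_{i=1}^{k} p_i^{2} - 1}{n - 1},
\qquad
\mathrm{ENC} = 2 + \frac{9}{\bar{F}_{2}} + \frac{1}{\bar{F}_{3}} + \frac{5}{\bar{F}_{4}} + \frac{3}{\bar{F}_{6}}
```

Here the constant 2 accounts for Met and Trp, and the numerators 9, 1, 5 and 3 are the numbers of amino acids with two, three, four and six synonymous codons. When every amino acid uses a single codon, each F equals 1 and ENC = 2 + 9 + 1 + 5 + 3 = 20; when all synonymous codons are used equally, F approaches 1/k and ENC = 2 + 18 + 3 + 20 + 18 = 61, matching the range given above.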

Codon Adaptation Index (CAI) analysis

CAI is another widely used method for evaluation of CUB. It measures the similarity between the frequency of synonymous codons used by a gene and that of a reference set [58]. The range of the CAI value is between 0 and 1. The rice codon usage from the Kazusa database [62] was used as a reference set. The program CAIcal [58] was used for CAI calculation.
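A minimal Python sketch of the CAI calculation is given below, following the usual definition as the geometric mean of relative adaptiveness values derived from a reference set. It is not CAIcal's code: the function names are hypothetical, the reference counts in the usage example are made-up toy numbers rather than the Kazusa rice table, and skipping codons that have no positive weight is a simplification.

```python
import math
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def relative_adaptiveness(ref_counts: dict) -> dict:
    """w = reference count of a codon / count of the most used synonymous codon."""
    best = {}
    for codon, aa in CODON_TABLE.items():
        best[aa] = max(best.get(aa, 0), ref_counts.get(codon, 0))
    return {
        codon: ref_counts.get(codon, 0) / best[aa]
        for codon, aa in CODON_TABLE.items()
        if aa not in "*MW" and best[aa] > 0
    }


def cai(cds: str, weights: dict) -> float:
    """Geometric mean of the weights of the codons used in the gene."""
    seq = cds.upper()
    codons = [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]
    # Simplification: codons without a positive weight are skipped.
    logs = [math.log(weights[c]) for c in codons if weights.get(c, 0) > 0]
    if not logs:
        raise ValueError("no scorable codons")
    return math.exp(sum(logs) / len(logs))


if __name__ == "__main__":
    toy_reference = {"GCC": 100, "GCG": 50, "GCT": 25, "GCA": 25}  # made-up counts
    w = relative_adaptiveness(toy_reference)
    print(cai("GCCGCG", w))
```

A gene that uses only the most frequent reference codon of each amino acid scores 1, and the score falls as rarer reference codons are used.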

Searching for tRFs data under drought stress

Data on tRFs under drought stress in rice were obtained from the PtRFdb [63] and their frequency was compared between the control and drought stress conditions at different stages of plant growth. Codons related to each tRF were also compared with the rice preferred codons under drought stress.


Results

Base composition analysis

In the present study, GC content was higher than AT content in all three gene-expression categories (high-, intermediate- and low-expression genes), as is expected for the rice genome. The mean GC content in categories 1, 2 and 3 was 70.9%, 56.95% and 54.2%, respectively. GC content differed noticeably among the three codon positions. In most of the studied genes, G or C content was highest at codon position three (GC3), lower at position one (GC1) and lowest at position two (GC2). The mean GC3 content in categories 1, 2 and 3 was 94%, 69.3% and 56.2%, respectively (Table 1, Fig 1).

Fig 1. The nucleotide composition of three gene expression categories under drought stress (high, intermediate and low expression) in 60 selected rice genes.

ENC analysis

The CUB for each gene was also calculated using ENC. The ENC value of the genes of the first group was between 27 and 34, the second group was between 38 and 56 and the third group was between 46 and 56 (Table 1).

Analysis of CAI

The CAI value ranged between 0.859 and 0.946 in the first category, between 0.733 and 0.867 in the second category and between 0.729 and 0.894 in the third category (Table 1). High expression genes showed higher CAI values than the other categories.

RSCU analysis

Comparing the RSCU values of the studied genes in the three expression categories under drought stress with the general codon usage pattern of rice genes reveals the existence of CUB. In other words, when genes with different expression levels under drought in rice are compared with the standard rice codon usage table, a change in the preferred codon of some amino acids is observed in high expression genes and low expression genes under stress. For example, in the high expression genes category, alanine and serine are most often encoded by GCG and TCG instead of rice’s most widely used codons GCC and TCC, respectively. The third codon position changed to the nucleotide G in this category. In the intermediate expression genes category, aspartic acid and proline are most often encoded by GAT and CCA instead of the rice standard codons for these two amino acids, GAC and CCG, respectively. The third codon position changed to the nucleotides T and A in this category (Fig 2).

Fig 2. RSCU values of each codon/amino acid across 20 genes in each of the high, intermediate and low gene expression categories in rice.

The heat-plot of different gene categories based on the RSCU index is shown in Fig 3. This plot separates distinctly the high expression genes under drought conditions from other categories so that the colors were divided into two distinct groups (Fig 3). These data may indicate that genes with higher expression under drought stress have acquired their own preferred codons in the course of evolution. Also, the heat-plots showed that G or C end codons are highly preferred in this set of genes for most amino acid codons, but the TTG codon was not preferred for leucine in high expression genes under drought stress.

Fig 3. The heat-plots of codon usage profiles for high, intermediate and low gene expression categories under drought stress in rice based on RSCU index.

The green color represents high propensity for using the specified codon, whereas the red color shows lower propensity and the black color shows intermediate tendency. TTG codon (marked with red) was not preferred for leucine in high expression genes under drought stress.

Average RSCU (GC/AT): A new index for estimation of CUB

After calculating the RSCU index for each codon per gene, we observed that the ratio of the RSCU of GC-end codons to that of AT-end codons for each amino acid was significantly higher in high expression genes than in the other categories. In some high expression genes, the RSCU of AT-end codons was even zero. Hence, an RSCU-based index was defined and called ARSCU. It measures the average ratio of the RSCUs of GC-end to AT-end codons over all amino acids in one gene, providing a single value for each gene. Our findings indicate that ARSCU could significantly separate highly expressed genes under drought from the others. In this study, the ARSCU (GC/AT) index ranged from 9.5 to 18.5, 0.7 to 8.5, and 0.7 to 4.5 for high, intermediate and low expression genes, respectively (Fig 4A).

Fig 4.

(A) The ARSCU GC/AT value of high, intermediate and low expression gene categories in rice; (B) Validation of ARSCU-Calculator using 60 rice drought stress related genes, randomly selected and calculated by the program. (C) The ARSCU value of 60 high expressing non drought related genes. Genes with ARSCU above 13 are estimated as high expression genes under drought conditions (green area). Genes with ARSCU between 9 and 13 are estimated as genes with high or intermediate expression (yellow area). Genes with ARSCU less than 9 are estimated as genes with low or intermediate expression (pink area). Blue and red shapes: correct and wrong prediction, respectively. High, intermediate and low-expression genes are indicated with solid circle, triangle and square, respectively.

ARSCU was calculated as:

Where aa is amino acid, a is RSCU of GC end codons and b is RSCU of AT end codons (any a and b with a value of zero is arbitrarily assigned a value of 0.1).
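Read together with the description of ARSCU as the average, over amino acids, of the ratio of GC-end to AT-end RSCU, these definitions are consistent with an expression of the following form. This is a sketch under assumptions: N = 18 is taken from the 18 amino acids with synonymous codons mentioned in the abstract, and the subscripts on a and b are notation introduced here.

```latex
\mathrm{ARSCU} = \frac{1}{N} \sum_{aa=1}^{N} \frac{a_{aa}}{b_{aa}}, \qquad N = 18
```

Here a_{aa} and b_{aa} are the RSCU of the GC-end and AT-end codons of amino acid aa, with zero values replaced by 0.1 as stated above. How the RSCU values of several GC-end (or AT-end) codons of one amino acid are combined into a single a or b is not specified in this text.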

ARSCU-Calculator

A program was developed in an Excel file for calculating the ARSCU index (S1 Table). Input data for this program is the RSCU index of all amino acid codons in each gene. This program calculates only one ARSCU (GC/AT) for each gene (whilst every gene has 59 RSCU values). Since RSCU may be zero for some codons, any RSCU with a value of zero is arbitrarily assigned a value of 0.1.

Based on our findings, ARSCU values can be used for discrimination of genes with different expressions under drought conditions, as follows:

  • Genes with ARSCU above 13 (arbitrary threshold) are predicted to be high expression genes under drought conditions;
  • Genes with ARSCU ranging from 9 to 13 are predicted to be genes with high or intermediate expression;
  • Genes with ARSCU numbers less than 9 are predicted to be genes with low or intermediate expression.
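The calculation and the thresholds listed above can be sketched in Python as follows. This is not the authors' Excel program. The function names are hypothetical, and two choices are assumptions made for illustration: the RSCU values of the GC-end and AT-end codons of each amino acid are summed to give a and b, and the 0.1 replacement is applied to each zero RSCU value before summing.

```python
def arscu(rscu_by_aa: dict, floor: float = 0.1) -> float:
    """Average over amino acids of (RSCU of GC-end codons) / (RSCU of AT-end codons).

    rscu_by_aa maps an amino acid name to a dict {codon: RSCU} (DNA alphabet).
    """
    ratios = []
    for aa, codon_rscu in rscu_by_aa.items():
        # Zero RSCU values are replaced by 0.1, as the text specifies.
        adjusted = {c: (v if v > 0 else floor) for c, v in codon_rscu.items()}
        # Assumption: a and b are sums over the GC-end and AT-end codons.
        a = sum(v for c, v in adjusted.items() if c[-1] in "GC")
        b = sum(v for c, v in adjusted.items() if c[-1] in "AT")
        if b == 0:
            raise ValueError(f"no AT-end codons supplied for {aa}")
        ratios.append(a / b)
    if not ratios:
        raise ValueError("no amino acids supplied")
    return sum(ratios) / len(ratios)


def classify(value: float) -> str:
    """Apply the thresholds 13 and 9 given in the text."""
    if value > 13:
        return "high"
    if value >= 9:
        return "high or intermediate"
    return "low or intermediate"


if __name__ == "__main__":
    # Toy input with two amino acids only; a real gene would supply all 18.
    toy = {
        "Ala": {"GCG": 2.0, "GCC": 2.0, "GCT": 0.0, "GCA": 0.0},
        "Lys": {"AAG": 1.0, "AAA": 1.0},
    }
    value = arscu(toy)
    print(value, classify(value))
```

Because the thresholds are described as arbitrary, they are best treated as parameters to revisit if the index is applied to other gene sets or species.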

Validation of discriminating power of ARSCU

For validation of the discriminating power and accuracy of ARSCU, 60 other rice genes were randomly selected from 43233 genes studied for their response to drought stress [44] and their ARSCU values were calculated by the program (S2 Table). The results were compared with the expression data from the published meta-analysis [44]. The ARSCU calculator predicted the expression category of genes under drought stress in rice with an accuracy of 88.3% (Fig 4B).

To compare the codon usage pattern of highly expressed genes unrelated to drought with that of drought-responsive high expression genes, a third set of 60 highly expressed genes not related to drought [64,65] was used. The ARSCU value for these genes was also calculated by the program (S3 Table). Only 31.65% of the genes (18.33% plus 13.33%, categorized as high and intermediate expressing genes, respectively) gave ARSCU values similar to those obtained for drought responsive genes (Fig 4C). ARSCU as an index thus worked for identification of drought related genes (with 88% accuracy) compared to only 32% accuracy on non-drought related genes.

tRFs under drought stress

Results of our search for tRFs under drought stress and control condition in rice, showed that:

  1. In most cases, no tRF has been reported for the preferred codon under drought stress, which suggests that the tRNA persists for a longer time and participates in translation.
  2. In a few cases, a tRF has been reported for the tRNA related to the preferred codon under drought stress. However, even in these cases the frequency of this tRF decreased under drought stress compared with its frequency under normal conditions.
  3. In only one case (Ala), the tRF related to the preferred codon increased under drought stress. tRNA expression data are required to explain this observation. One possible mechanism is overexpression of the preferred tRNA to such an extent that even its breakdown into its tRF does not affect the overexpression of the gene concerned (Table 3).

Table 3. Frequency of tRFs under control and drought stress conditions in rice.


Discussion

Drought tolerance in rice is governed by many genes with strong environmental interaction and low heritability, and is thus difficult to study [66]. Development of GM plants with enhanced tolerance to drought is an important challenge in rice biotechnology research. For this purpose, further investigation of the mechanisms, pathways and genes involved in response to abiotic stress is essential. Expression of genes belonging to diverse functional and regulatory groups, such as transcription factors, protein kinases, and phosphatases, is influenced by abiotic stress conditions [67]. In addition to environmental conditions, gene expression levels correlate with multiple aspects of gene sequence and structure, including SCU [18]. Here, we report a strong correlation between SCU and gene expression levels under drought stress in rice, which has not previously been reported.

Correlation between GC content and higher expression of rice genes under drought

Our analysis of codon usage patterns of the studied genes in different expression categories under drought stress shows that most of the genes in the three categories are GC-rich, as expected for rice genes. In this study, GC percent analysis was able to separate the 60 selected rice genes into two major classes: 1) the high expression genes, which were also high in GC, and 2) the other genes, with lower GC content. Higher GC content in the rice genome has been reported, but without particular attention to differences among genes with different levels of expression under drought [68]. Wang and Hickey [69] showed that the mean GC content of 14005 rice coding sequences was 57.8%. They also separated these 14005 rice genes into two classes: high GC genes (67.4%) and low GC genes (50.1%). The mean GC3 of the high GC and low GC rice genes was 80.4% and 52.7%, respectively, in their study. Also, Yi et al. [33] showed that higher expression is associated with higher GC3 in three closely related species of the genus Misgurnus. Our results further confirm these previous reports. The novelty of our findings is the correlation between the level of gene expression under drought conditions and GC content. We showed that genes with high expression under drought conditions had an overall GC content of 70.9%. Furthermore, we showed that on average 94% of the codons of the high expression genes contained a G or C at the third position. Our results therefore indicate that high expression genes under drought are associated with high GC percentages, especially at the third nucleotide position.

Correlation between SCU and higher expression of rice genes under drought

As we expected, the range of the CAI value was higher in the high expression genes compared with other categories. In addition, the ENC value for the high expression genes was less than 36, which is an indication of the presence of a relatively strong CUB. This value for the second and third categories of genes was above 38 and 46, respectively. Sharp et al. [60] reported that highly expressed genes generally tend to use a limited number of codons preferentially. Also, Liu et al. [70] showed that ENC value correlates with gene expression level in the rice plant. In their report, genes with ENC ≤30 and ENC ≥55 corresponded to high and low expression genes, respectively.

Alanine and serine in the high expression genes under drought stress preferred the GCG and TCG codons, respectively, whereas according to the common rice codon usage table, the preferred codons for alanine and serine are GCC and TCC, respectively. Our results for the amino acids encoded by six codons (leucine, arginine and serine) showed that among the C- or G-ending codons, those with G or C in the first and second positions are also more preferred. For example, in the case of leucine, the TTG codon, although ending with a G, is not preferred, and the CTC and CTG codons are more preferred. Therefore, using TTG is not recommended in codon optimization for enhanced expression of a gene under drought. Intermediate and low expression genes under drought stress prefer A- or T-ending codons in some cases. This may further indicate the effect of more efficient codons on higher expression of genes under drought stress. For example, aspartic acid and proline prefer the GAT and CCA codons in the intermediate expression genes under drought stress, whereas the preferred rice codons for these two amino acids are GAC and CCG, which end with a G or C. The use of A- and T-end codons for these amino acids may therefore have reduced the expression of these genes under drought stress. A number of studies of eukaryotes and prokaryotes have shown that groups or families of genes involved in stress responses systematically overuse and underuse specific “non-optimal” synonymous codons. Chionh et al. [71] showed that the 48 genes in the DosR regulon in Mycobacterium bovis BCG, which are essential for survival under hypoxic stress, are enriched in G- and C-ending codons. Others showed that hypoxic stress increased the translation of proteins from genes enriched with non-optimal codons ACG (Thr), CTA (Leu), GCG (Ala) and GGA (Gly), whereas their synonymous partners ACC (Thr), CTT (Leu), GCT (Ala) and GGT (Gly) were all overrepresented in downregulated proteins in hypoxia [72].
High-expression genes under drought in rice use G/C-end codons more frequently. Therefore, we created a different codon usage table for these genes compared to the standard rice codon usage table (S4 Table).

Correlation of CUB with gene expression levels for different species including: Caenorhabditis, Drosophila, Arabidopsis [20], Tribolium castaneum [18], arthropods [32,72], Paeonia lactiflora [73] and Populus tremula [74] has been shown by several authors. Selection favors specific codons promoting the efficient and accurate translation of genes that are expressed at high levels.

ARSCU as a single value for prediction of rice genes expression under drought

Previous reports have used RSCU for different purposes, but RSCU is calculated separately for each codon of each amino acid in the gene. In this report, we introduced the ARSCU GC/AT as a single value/index for each gene with the potential to predict gene expression under drought stress. In addition, we developed a simple program in an Excel file for calculating the ARSCU index (ARSCU calculator). It is expected that this program will make it possible to predict high expression genes in high throughput data with relatively high confidence. This program was developed based on calculations made on a set of genes expressed under drought stress conditions in the rice genome. We validated 1) the discriminating power of ARSCU and 2) its specificity for highly expressed genes under drought. This index was not able to predict highly expressed non-drought related genes. These findings may reflect the different codon usage pattern of the genes under drought stress compared with other highly expressed genes in rice. In other words, the observed phenomenon (strong correlation between ARSCU and high expression) is restricted to genes highly expressed under drought stress and does not extend to all highly expressed genes in rice.

It remains unclear whether ARSCU will also be useful for prediction of gene expression in other plant species under drought or under other stress conditions. We are currently attempting validation of this program for prediction of gene expression under drought in other cereals. This may require a threshold different from the one reported for ARSCU in this article. There is room for improvement of this program to make it suitable for the codon usage features of other plant species. The prediction made by this program potentially indicates the extent to which a given codon composition can be expressed under drought. It is, however, clear that in addition to the ORF, regulatory elements, in particular the promoter and transcription factors, affect gene expression, and these variables were not considered in this program. It is therefore expected that even in rice, genes with high ARSCUs may exhibit low expression under drought. The high ARSCUs in such genes only indicate their codon potential for high expression under stressed conditions. In the case of such genes, alteration of regulatory elements may help realize their expression potential.

Possible role of tRNA post transcriptional modification and tRFs justifying the correlation between CUB and high expressing genes under drought

We observed a distinct codon usage pattern in high expression genes under drought stress. A comparable phenomenon, termed modification tunable transcripts (MoTTs), has been described in a model that combines genome-wide CUB analytics with gene expression studies. This model implies that modification of tRNA drives the translational regulation of critical response proteins whose transcripts display a distinct codon bias [75].

Average transcripts that do not exhibit CUB do not need specific tRNA modifications, as they are efficiently translated under all conditions. In contrast, the translation of MoTTs in cells lacking specific tRNA modifications is severely reduced [37].

Genes that encode tRNA modification enzymes are far more numerous than tRNA coding genes, which further highlights the importance of tRNA modification [76,77]. It has long been postulated that modified nucleosides on tRNA molecules may function as a “biosensor” for environmental and physiological changes [78,79], acting as a fast module to regulate gene expression at the translational level. In agreement with this hypothesis, the abundance of modified tRNA nucleosides does change in response to various stresses [34]. Stress-induced degradation of tRNA has been reported by several investigators [80,81].

We speculate that transcripts exhibiting CUB can be used more efficiently by the translational machinery during stress because of the presence of specific tRNA modifications. However, more direct evidence is required to draw a final conclusion.

Methylation-sensitive degradation of tRNA also merits consideration as another potential underlying mechanism for these observations. Wang et al. [82] investigated the mechanism of changes in the composition and abundance of tRNA-modified nucleosides in response to drought, salt and cold stress, as well as in different tissues during the whole growth season, in two model plants, O. sativa and Arabidopsis thaliana. They identified 22 and 20 candidate genes for methyltransferases in rice and Arabidopsis, respectively. Based on bioinformatics analysis, nucleoside abundance assessments and gene expression profiling, they found four methylated nucleosides (Am, Cm, m1A and m7G) that are critical for stress response in rice and Arabidopsis.

However, the relationship of these modifications with CUB is not yet known. On the other hand, several articles have discussed tRNA modification in non-plant species and its association with codon usage. Chan et al. [34] presented a model of a translational control mechanism for survival under oxidative stress in yeast. Exposure to H2O2 leads to an elevation in the level of m5C at the wobble position of the leucine tRNA that translates the codon UUG on mRNA. In their study, this tRNA modification enhanced the translation of the UUG-enriched RPL22A mRNA relative to its paralog RPL22B and led to changes in ribosome composition. This reprogramming of tRNA and ribosomes ultimately causes selective translation of proteins from genes enriched with the codon TTG. More comprehensive studies are needed to better relate the expression of genes with specific codon preferences under drought stress in rice to tRNA modifications.

The fact that, in most cases, no tRF has been reported for the preferred codons under drought stress may support our hypothesis that, during the course of evolution, plants with preferred codons whose tRNAs do not break down into tRFs gained a selective advantage under drought conditions. This may result from preservation of an already existing preferred codon or even from a mutation creating the preferred codon. This hypothesis may be considered in further studies aimed at identifying the underlying biological mechanism of the CUB observed for high-expression genes under drought. Based on these observations, we propose an illustrated model based on codon characteristics, tRNA modification and tRFs under drought stress in rice (Fig 5).

Fig 5. Proposed model justifying codon optimization in genetic engineering programs based on the observed CUB, codon characteristics, tRNA modification and tRFs under drought stress in rice.

a) The difference in codon usage under normal and drought conditions for the amino acid valine. b) A segment of mRNA, symbolizing a gene, is shown under normal conditions before (left) and after (right) codon optimization. Under normal conditions, both genes (before and after codon optimization) are expressed. However, the codon-optimized gene is expressed at a higher level [83]. c) Under stress, native genes (left) are minimally expressed or not expressed at all compared with the optimized genes (right). Under stress conditions, tRNA-modifying enzymes mark a series of tRNAs that decode preferred codons, leading to increased translation of the codon-optimized gene (right). In the case of the amino acid valine, the tRNA associated with the GTC codon is broken down into tRFs under stress, which limits translation of that codon.

In summary, the results of this paper suggest that codon optimization may play a more significant role in improving gene expression under drought. We therefore suggest that it be considered in genetic engineering programs. We expect that drought-tolerant rice will become available from our ongoing projects on engineering rice for enhanced drought tolerance using several different strategies. The genes used in our projects have been optimized based on the findings reported in the current study.

Supporting information

S2 Table. Validation of ARSCU-calculator using 60 rice drought stress related genes.


S3 Table. The ARSCU value of 60 high expression genes in different stages of rice growth.


S4 Table. Rice codon usage table compared with rice high expression genes under drought stress.



  1. Acquaah G. Principles of plant genetics and breeding. John Wiley & Sons; 2009.
  2. Mollasadeghi V, Valizadeh M, Shahryari R, Imani AA. Evaluation of end drought tolerance of 12 wheat genotypes by stress indices. World Appl Sci J. IDOSI Publications; 2011;13: 545–551.
  3. Nachimuthu V V, Sabariappan R, Muthurajan R, Kumar A. Breeding rice varieties for abiotic stress tolerance: Challenges and opportunities. Abiotic stress management for resilient agriculture. Springer; 2017. pp. 339–361.
  4. Yamaguchi T, Blumwald E. Developing salt-tolerant crop plants: challenges and opportunities. Trends Plant Sci. Elsevier; 2005;10: 615–620. pmid:16280254
  5. Ahmad P, Ashraf M, Younis M, Hu X, Kumar A, Akram NA, et al. Role of transgenic plants in agriculture and biopharming. Biotechnol Adv. Elsevier; 2012;30: 524–540.
  6. Bennett J, Cohen MB, Katiyar SK, Ghareyazie B, Khush GS. Enhancing insect resistance in rice through biotechnology. Adv insect Control role transgenic plants. Taylor and Francis, London; 1997; 75–93.
  7. Ghareyazie B, Alinia F, Menguito CA, Rubia LG, de Palma JM, Liwanag EA, et al. Enhanced resistance to two stem borers in an aromatic rice containing a synthetic cryIA (b) gene. Mol Breed. Springer; 1997;3: 401–414.
  8. Ye R, Huang H, Yang Z, Chen T, Liu L, Li X, et al. Development of insect‐resistant transgenic rice with Cry1C*‐free endosperm. Pest Manag Sci Former Pestic Sci. Wiley Online Library; 2009;65: 1015–1020.
  9. Goto S, Sasakura‐Shimoda F, Suetsugu M, Selvaraj MG, Hayashi N, Yamazaki M, et al. Development of disease‐resistant rice by optimized expression of WRKY45. Plant Biotechnol J. Wiley Online Library; 2015;13: 753–765. pmid:25487714
  10. Fartyal D, Agarwal A, James D, Borphukan B, Ram B, Sheri V, et al. Developing dual herbicide tolerant transgenic rice plants for sustainable weed management. Sci Rep. Nature Publishing Group; 2018;8: 1–12.
  11. Eggers B, Mackenzie R. The Cartagena protocol on biosafety. J Int Econ Law. Oxford University Press; 2000;3: 525–543.
  12. James C. Global Status of Commercialized Biotech/GM Crops in 2017: Biotech crop adoption surges as economic benefits accumulate in 22 years. ISAAA Br No 53. 2017;ISAAA.
  13. Wani SH, Sah SK. Biotechnology and abiotic stress tolerance in rice. J Rice Res. 2014;2: e105.
  14. De La Torre AR, Lin Y-C, Van de Peer Y, Ingvarsson PK. Genome-wide analysis reveals diverged patterns of codon bias, gene expression, and rates of sequence evolution in picea gene families. Genome Biol Evol. Oxford University Press; 2015;7: 1002–1015. pmid:25747252
  15. Bhatnagar-Mathur P, Vadez V, Sharma KK. Transgenic approaches for abiotic stress tolerance in plants: retrospect and prospects. Plant Cell Rep. Springer; 2008;27: 411–424. pmid:18026957
  16. Ahanger MA, Akram NA, Ashraf M, Alyemeni MN, Wijaya L, Ahmad P. Plant responses to environmental stresses—from gene to biotechnology. AoB Plants. Oxford University Press; 2017;9.
  17. Gout J-F, Kahn D, Duret L, Consortium PP-G. The relationship among gene expression, the evolution of gene dosage, and the rate of protein evolution. PLoS Genet. Public Library of Science; 2010;6.
  18. Williford A, Demuth JP. Gene expression levels are correlated with synonymous codon usage, amino acid composition, and gene architecture in the red flour beetle, Tribolium castaneum. Mol Biol Evol. Oxford University Press; 2012;29: 3755–3766. pmid:22826459
  19. Hershberg R, Petrov DA. Selection on codon bias. Annu Rev Genet. Annual Reviews; 2008;42: 287–299. pmid:18983258
  20. Duret L, Mouchiroud D. Expression pattern and, surprisingly, gene length shape codon usage in Caenorhabditis, Drosophila, and Arabidopsis. Proc Natl Acad Sci. National Acad Sciences; 1999;96: 4482–4487. pmid:10200288
  21. Coghlan A, Wolfe KH. Relationship of codon bias to mRNA concentration and protein length in Saccharomyces cerevisiae. Yeast. Wiley Online Library; 2000;16: 1131–1145. pmid:10953085
  22. Ghaemmaghami S, Huh W-K, Bower K, Howson RW, Belle A, Dephoure N, et al. Global analysis of protein expression in yeast. Nature. Nature Publishing Group; 2003;425: 737–741. pmid:14562106
  23. Urrutia AO, Hurst LD. The signature of selection mediated by expression on human genes. Genome Res. Cold Spring Harbor Lab; 2003;13: 2260–2264. pmid:12975314
  24. Comeron JM. Selective and mutational patterns associated with gene expression in humans: influences on synonymous composition and intron presence. Genetics. Genetics Soc America; 2004;167: 1293–1304. pmid:15280243
  25. Cutter AD, Wasmuth JD, Blaxter ML. The evolution of biased codon and amino acid usage in nematode genomes. Mol Biol Evol. Oxford University Press; 2006;23: 2303–2315. pmid:16936139
  26. Drummond DA, Wilke CO. Mistranslation-induced protein misfolding as a dominant constraint on coding-sequence evolution. Cell. Elsevier; 2008;134: 341–352. pmid:18662548
  27. Ingvarsson PK. Molecular evolution of synonymous codon usage in Populus. BMC Evol Biol. Springer; 2008;8: 307. pmid:18983655
  28. Qin F, Shinozaki K, Yamaguchi-Shinozaki K. Achievements and challenges in understanding plant abiotic stress responses and tolerance. Plant Cell Physiol. Oxford University Press; 2011;52: 1569–1582. pmid:21828105
  29. Ikemura T. Codon usage and tRNA content in unicellular and multicellular organisms. Mol Biol Evol. 1985;2: 13–34. pmid:3916708
  30. Chiapello H, Lisacek F, Caboche M, Hénaut A. Codon usage and gene function are related in sequences of Arabidopsis thaliana. Gene. Elsevier; 1998;209: GC1–GC38.
  31. Goodarzi H, Torabi N, Najafabadi HS, Archetti M. Amino acid and codon usage profiles: adaptive changes in the frequency of amino acids and codons. Gene. Elsevier; 2008;407: 30–41. pmid:17977670
  32. Whittle CA, Extavour CG. Codon and amino acid usage are shaped by selection across divergent model organisms of the Pancrustacea. G3 Genes, Genomes, Genet. G3: Genes, Genomes, Genetics; 2015;5: 2307–2321.
  33. Yi S, Li Y, Wang W. Selection shapes the patterns of codon usage in three closely related species of genus Misgurnus. Genomics. Elsevier; 2018;110: 134–142. pmid:28911975
  34. Chan CTY, Pang YLJ, Deng W, Babu IR, Dyavaiah M, Begley TJ, et al. Reprogramming of tRNA modifications controls the oxidative stress response by codon-biased translation of proteins. Nat Commun. Nature Publishing Group; 2012;3: 1–9.
  35. Begley U, Dyavaiah M, Patil A, Rooney JP, DiRenzo D, Young CM, et al. Trm9-catalyzed tRNA modifications link translation to the DNA damage response. Mol Cell. Elsevier; 2007;28: 860–870. pmid:18082610
  36. Hatfield DL, Gladyshev VN. How selenium has altered our understanding of the genetic code. Mol Cell Biol. Am Soc Microbiol; 2002;22: 3565–3576. pmid:11997494
  37. Dedon PC, Begley TJ. A system of RNA modifications and biased codon use controls cellular stress response at the level of translation. Chem Res Toxicol. ACS Publications; 2014;27: 330–337. pmid:24422464
  38. Thompson DM, Parker R. Stressing out over tRNA cleavage. Cell. Elsevier; 2009;138: 215–219. pmid:19632169
  39. Kumar P, Kuscu C, Dutta A. Biogenesis and function of transfer RNA-related fragments (tRFs). Trends Biochem Sci. Elsevier; 2016;41: 679–689. pmid:27263052
  40. Schaefer M, Pollex T, Hanna K, Tuorto F, Meusburger M, Helm M, et al. RNA methylation by Dnmt2 protects transfer RNAs against stress-induced cleavage. Genes Dev. Cold Spring Harbor Lab; 2010;24: 1590–1595. pmid:20679393
  41. Blanco S, Dietmann S, Flores J V, Hussain S, Kutter C, Humphreys P, et al. Aberrant methylation of tRNAs links cellular stress to neuro‐developmental disorders. EMBO J. 2014;33: 2020–2039. pmid:25063673
  42. Park EJ, Kim T-H. Fine-tuning of gene expression by tRNA-derived fragments during abiotic stress signal transduction. Int J Mol Sci. Multidisciplinary Digital Publishing Institute; 2018;19: 518.
  43. Mazumdar P, Binti Othman R, Mebus K, Ramakrishnan N, Ann Harikrishna J. Codon usage and codon pair patterns in non-grass monocot genomes. Ann Bot. Oxford University Press US; 2017;120: 893–909. pmid:29155926
  44. de Abreu Neto JB, Frei M. Microarray meta-analysis focused on the response of genes involved in redox homeostasis to diverse abiotic stresses in rice. Front Plant Sci. Frontiers; 2016;6: 1260. pmid:26793229
  45. Frei M, Wang Y, Ismail AM, Wissuwa M. Biochemical factors conferring shoot tolerance to oxidative stress in rice grown in low zinc soil. Funct Plant Biol. CSIRO; 2010;37: 74–84.
  46. Shaik R, Ramakrishna W. Machine learning approaches distinguish multiple stress conditions using stress-responsive genes and identify candidate genes for broad resistance in rice. Plant Physiol. Am Soc Plant Biol; 2014;164: 481–495.
  47. Wheeler DL, Chappey C, Lash AE, Leipe DD, Madden TL, Schuler GD, et al. Database resources of the national center for biotechnology information. Nucleic Acids Res. Oxford University Press; 2000;28: 10–14. pmid:10592169
  48. Duan J, Cai W. OsLEA3-2, an abiotic stress induced gene of rice plays a key role in salt and drought tolerance. PLoS One. Public Library of Science; 2012;7.
  49. Xiao B, Huang Y, Tang N, Xiong L. Over-expression of a LEA gene in rice improves drought resistance under the field conditions. Theor Appl Genet. Springer; 2007;115: 35–46. pmid:17426956
  50. Hu H, You J, Fang Y, Zhu X, Qi Z, Xiong L. Characterization of transcription factor gene SNAC2 conferring cold and salt tolerance in rice. Plant Mol Biol. Springer; 2008;67: 169–181. pmid:18273684
  51. Redillas MCFR, Jeong JS, Kim YS, Jung H, Bang SW, Choi YD, et al. The overexpression of OsNAC9 alters the root architecture of rice plants enhancing drought resistance and grain yield under field conditions. Plant Biotechnol J. Wiley Online Library; 2012;10: 792–805. pmid:22551450
  52. Huang J, Sun S, Xu D, Lan H, Sun H, Wang Z, et al. A TFIIIA-type zinc finger protein confers multiple abiotic stress tolerances in transgenic rice (Oryza sativa L.). Plant Mol Biol. Springer; 2012;80: 337–350. pmid:22930448
  53. Jan A, Maruyama K, Todaka D, Kidokoro S, Abo M, Yoshimura E, et al. OsTZF1, a CCCH-tandem zinc finger protein, confers delayed senescence and stress tolerance in rice by regulating stress-related genes. Plant Physiol. Am Soc Plant Biol; 2013;161: 1202–1216.
  54. Huang L, Wang Y, Wang W, Zhao X, Qin Q, Sun F, et al. Characterization of transcription factor gene OsDRAP1 conferring drought tolerance in rice. Front Plant Sci. Frontiers; 2018;9: 94. pmid:29449862
  55. Huang X-Y, Chao D-Y, Gao J-P, Zhu M-Z, Shi M, Lin H-X. A previously unknown zinc finger protein, DST, regulates drought and salt tolerance in rice via stomatal aperture control. Genes Dev. Cold Spring Harbor Lab; 2009;23: 1805–1817. pmid:19651988
  56. Gao F, Xiong A, Peng R, Jin X, Xu J, Zhu B, et al. OsNAC52, a rice NAC transcription factor, potentially responds to ABA and confers drought tolerance in transgenic plants. Plant Cell, Tissue Organ Cult. Springer; 2010;100: 255–262.
  57. Xu D-Q, Huang J, Guo S-Q, Yang X, Bao Y-M, Tang H-J, et al. Overexpression of a TFIIIA‐type zinc finger protein gene ZFP252 enhances drought and salt tolerance in rice (Oryza sativa L.). FEBS Lett. Wiley Online Library; 2008;582: 1037–1043. pmid:18325341
  58. Puigbò P, Bravo IG, Garcia-Vallve S. CAIcal: a combined set of tools to assess codon usage adaptation. Biol Direct. BioMed Central; 2008;3: 38.
  59. Hastings KEM, Emerson CP. Codon usage in muscle genes and liver genes. J Mol Evol. Springer; 1983;19: 214–218. pmid:6887263
  60. Sharp PM, Li W-H. The codon adaptation index-a measure of directional synonymous codon usage bias, and its potential applications. Nucleic Acids Res. Oxford University Press; 1987;15: 1281–1295. pmid:3547335
  61. Vetrivel U, Arunkumar V, Dorairaj S. ACUA: a software tool for automated codon usage analysis. Bioinformation. Citeseer; 2007;2: 62.
  62. Nakamura Y, Gojobori T, Ikemura T. Codon usage tabulated from international DNA sequence databases: status for the year 2000. Nucleic Acids Res. Oxford University Press; 2000;28: 292. pmid:10592250
  63. Gupta N, Singh A, Zahra S, Kumar S. PtRFdb: a database for plant transfer RNA-derived fragments. Database. Narnia; 2018;2018.
  64. Jung K-H, Kim S-R, Giong H-K, Nguyen MX, Koh H-J, An G. Genome-wide identification and functional analysis of genes expressed ubiquitously in rice. Mol Plant. Elsevier; 2015;8: 276–289. pmid:25624149
  65. Xu H, Gao Y, Wang J. Transcriptomic analysis of rice (Oryza sativa) developing embryos using the RNA-Seq technique. PLoS One. Public Library of Science; 2012;7.
  66. Verulkar SB, Verma SK. Screening protocols in breeding for drought tolerance in rice. Agric Res. Springer; 2014;3: 32–40.
  67. Bashir K, Matsui A, Rasheed S, Seki M. Recent advances in the characterization of plant transcriptomes in response to drought, salinity, heat, and cold stress. F1000Research. Faculty of 1000 Ltd; 2019;8.
  68. Guo X, Bao J, Fan L. Evidence of selectively driven codon usage in rice: implications for GC content evolution of Gramineae genes. FEBS Lett. Elsevier; 2007;581: 1015–1021. pmid:17306258
  69. Wang H-C, Hickey DA. Rapid divergence of codon usage patterns within the rice genome. BMC Evol Biol. BioMed Central; 2007;7: S6.
  70. Liu Q, Dou S, Ji Z, Xue Q. Synonymous codon usage and gene function are strongly related in Oryza sativa. Biosystems. Elsevier; 2005;80: 123–131. pmid:15823411
  71. Chionh YH, McBee M, Babu IR, Hia F, Lin W, Zhao W, et al. tRNA-mediated codon-biased translation in mycobacterial hypoxic persistence. Nat Commun. Nature Publishing Group; 2016;7: 1–12.
  72. Whittle CA, Extavour CG. Expression-linked patterns of codon usage, amino acid frequency, and protein length in the basally branching arthropod Parasteatoda tepidariorum. Genome Biol Evol. Oxford University Press; 2016;8: 2722–2736. pmid:27017527
  73. Wu Y, Zhao D, Tao J. Analysis of codon usage patterns in herbaceous peony (Paeonia lactiflora Pall.) based on transcriptome data. Genes (Basel). Multidisciplinary Digital Publishing Institute; 2015;6: 1125–1139. pmid:26506393
  74. Ingvarsson PK. Gene expression and protein length influence codon usage and rates of sequence evolution in Populus tremula. Mol Biol Evol. Oxford University Press; 2007;24: 836–844. pmid:17204548
  75. Endres L, Dedon PC, Begley TJ. Codon-biased translation can be regulated by wobble-base tRNA modification systems during cellular stress responses. RNA Biol. Taylor & Francis; 2015;12: 603–614. pmid:25892531
  76. Björk GR, Durand JMB, Hagervall TG, Leipuvien R, Lundgren HK, Nilsson K, et al. Transfer RNA modification: influence on translational frameshifting and metabolism. FEBS Lett. Wiley Online Library; 1999;452: 47–51. pmid:10376676
  77. El Yacoubi B, Bailly M, de Crécy-Lagard V. Biosynthesis and function of posttranscriptional modifications of transfer RNAs. Annu Rev Genet. Annual Reviews; 2012;46: 69–95. pmid:22905870
  78. Björk GR, Ericson JU, Gustafsson CED, Hagervall TG, Jönsson YH, Wikström PM. Transfer RNA modification. Annu Rev Biochem. Annual Reviews; 1987;56: 263–285. pmid:3304135
  79. Gustilo EM, Vendeix FAP, Agris PF. tRNA’s modifications bring order to gene expression. Curr Opin Microbiol. Elsevier; 2008;11: 134–140. pmid:18378185
  80. Motorin Y, Helm M. tRNA stabilization by modified nucleotides. Biochemistry. ACS Publications; 2010;49: 4934–4944. pmid:20459084
  81. Alexandrov A, Chernyakov I, Gu W, Hiley SL, Hughes TR, Grayhack EJ, et al. Rapid tRNA decay can result from lack of nonessential modifications. Mol Cell. Elsevier; 2006;21: 87–96.
  82. Wang Y, Pang C, Li X, Hu Z, Lv Z, Zheng B, et al. Identification of tRNA nucleoside modification genes critical for stress response and development in rice and Arabidopsis. BMC Plant Biol. BioMed Central; 2017;17: 261. pmid:29268705
  83. Hanson G, Coller J. Codon optimality, bias and usage in translation and mRNA decay. Nat Rev Mol cell Biol. Nature Publishing Group; 2018;19: 20–30. pmid:29018283