Conceived and designed the experiments: CF LB JPR. Performed the experiments: CF LB. Analyzed the data: CF IH JPR LB. Wrote the paper: CF IH JPR LB.
The authors have declared that no competing interests exist.
The efficiency of translation termination depends on the nature of the stop codon and the surrounding nucleotides. Some molecules, such as aminoglycoside antibiotics (gentamicin), decrease termination efficiency and are currently being evaluated for diseases caused by premature termination codons. However, the readthrough response to treatment is highly variable and little is known about the rules governing readthrough level and response to aminoglycosides. In this study, we carried out an in-depth statistical analysis on a very large set of nonsense mutations to decipher the elements of nucleotide context responsible for modulating readthrough levels and gentamicin response. We quantified readthrough for 66 sequences containing a stop codon, in the presence and absence of gentamicin, in cultured mammalian cells. We demonstrated that the efficiency of readthrough after treatment is determined by the complex interplay between the stop codon and a larger sequence context. There was a strong positive correlation between basal and induced readthrough levels, and a weak negative correlation between basal readthrough level and gentamicin response (i.e. the factor of increase from basal to induced readthrough levels). The identity of the stop codon did not affect the response to gentamicin treatment. In agreement with a previous report, we confirm that the presence of a cytosine at the +4 position promotes higher basal and gentamicin-induced readthrough than other nucleotides. We highlight for the first time that the presence of a uracil residue immediately upstream from the stop codon is a major determinant of the response to gentamicin. Moreover, this effect was mediated by the nucleotide itself, rather than by the amino acid or tRNA corresponding to the −1 codon. Finally, we point out that a uracil at this position combined with a cytosine at +4 results in optimal gentamicin-induced readthrough, which is the therapeutically relevant variable.
Nonsense mutations are single-nucleotide variations within the coding sequence of a gene that result in a premature termination codon. The presence of such mutations leads to the synthesis of a truncated protein unable to fulfill its normal function. Over the last ten years, treatment strategies have emerged based on the use of molecules, such as aminoglycoside antibiotics (gentamicin), that facilitate the readthrough of premature termination codons, thus restoring the synthesis of a full-length protein. Such strategies have been tested for various genetic diseases, including Duchenne muscular dystrophy and cystic fibrosis. The readthrough level depends on the nature of the stop codon and the surrounding nucleotide context, but little was known of the rules governing readthrough level and response to aminoglycosides. In this study, we use a large set of nonsense mutations for an in-depth statistical analysis designed to decipher the elements of the nucleotide context responsible for modulating readthrough levels. We analyse the impact of the six nucleotides upstream and downstream from the stop codon. We demonstrate that the presence of a uracil residue immediately upstream from the stop codon is associated with a stronger response to gentamicin treatment than the presence of any of the other three nucleotides.
Translation is terminated by a stop codon entering the A site of the ribosome, inducing the release of the polypeptide chain from the peptidyl-tRNA.
Readthrough can be stimulated by aminoglycoside antibiotics, such as gentamicin, making it possible to generate a full-length protein from genes carrying a nonsense mutation.
We used a set of 66 sequences, each containing a stop codon (mostly nonsense mutations implicated in various human diseases) inserted into the same reporter vector, for an extensive statistical analysis of the determinants of readthrough levels and gentamicin response. We found a strong correlation between basal readthrough level and antibiotic-induced readthrough level and a very weak negative correlation between basal readthrough level and gentamicin response. The nature of the stop codon did not affect the sensitivity of the nonsense mutation to gentamicin treatment. A comprehensive analysis of the surrounding nucleotides identified positions playing an important role in determining readthrough levels and gentamicin response. In particular, we demonstrated that the nucleotide immediately upstream from the stop codon was a major determinant of gentamicin response and that this effect was mediated by the nucleotide itself, rather than by the nature of the last amino acid or the tRNA present in the ribosomal P-site.
Based on these findings, we have developed the first rules for predicting the sensitivity of nonsense mutations to aminoglycoside treatments based on the surrounding nucleotide sequence.
We analyzed readthrough levels for 66 stop codons, including one natural termination codon and 65 nonsense mutations implicated in various diseases (
Readthrough efficiencies of 66 sequences (65 nonsense mutations and 1 natural stop codon) were measured in NIH3T3 cells with and without gentamicin (800 µg/ml) treatment for 24 h. Each value shown is the mean of at least three independent experiments. For each sequence, the factor of increase (I) is the ratio of the gentamicin-induced readthrough level (G) to the basal readthrough level (B). Sequences are ranked in descending order of gentamicin-induced readthrough level.
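The definition of the factor of increase and the ranking used in the table can be illustrated with a minimal sketch. The sequence names and readthrough values below are hypothetical placeholders chosen for illustration, not measurements from this study, and `increase_factor` is a helper name introduced here.

```python
def increase_factor(basal, induced):
    """Return I = G / B for one sequence (both values in the same unit, e.g. %)."""
    if basal <= 0:
        raise ValueError("basal readthrough must be positive")
    return induced / basal


# Hypothetical measurements (name -> (B, G) in %); not data from the study.
measurements = {
    "seq_a": (0.05, 0.40),
    "seq_b": (0.20, 0.60),
    "seq_c": (0.02, 0.10),
}

# Rank sequences in descending order of gentamicin-induced readthrough (G).
ranked = sorted(measurements, key=lambda name: measurements[name][1], reverse=True)

# Factor of increase for each sequence.
factors = {name: increase_factor(b, g) for name, (b, g) in measurements.items()}
```

Note that a sequence can rank high on G while having a modest I (as `seq_b` does here), which is why the two variables are analyzed separately.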
For each sequence, the stop codon and the surrounding nucleotide context, shown in
Readthrough rates ranged from 0.01% (DMD 2726, beta 43, APC 1131) to 0.52% (CF 122) for basal readthrough (B), and from 0.04% (p53 327) to 2.79% (p53 213) in the presence of 800 µg/ml gentamicin (G) (
We studied the distribution and characteristics of the variables B, G and I, by descriptive statistical analysis (
We also carried out a graphical analysis to visualize the distribution of each variable (
For basal readthrough (variable B), 83.3% of nonsense mutations displayed readthrough levels of no more than 0.10%. Most of the nucleotide contexts surrounding nonsense mutations promoted efficient termination. Readthrough rates above 0.10% accounted for 16.7% of all values and were defined as high basal readthrough levels (see
For gentamicin-induced readthrough (variable G), 80.3% of nonsense mutations displayed readthrough rates no higher than 0.50%. Therefore, even in the presence of gentamicin, very few nucleotide contexts promoted “high” levels of readthrough. Readthrough rates above 0.50% accounted for 19.7% of all values and were defined as “high” levels of induced readthrough.
For each parameter, values are classified into intervals of equal size. X-values indicate the limits of the intervals. Y-values indicate the number of values in each interval.
For the increase factor (variable I), we observed a kurtosis slightly higher (1.63) than expected for a Gaussian distribution. Its asymmetry coefficient (1.17) was similar to that for a Gaussian distribution (= 1). The values were homogeneously distributed and 78% of the values had ranks between 4 and 8. Values greater than 8 accounted for 19.7% of all values and were defined as a “high” factor of increase.
We then investigated whether these three distributions could be converted to Gaussian distributions using the Box-Cox transformation (λ = −0.217) (see
Previous observations have suggested that there is no correlation between the basal readthrough level for a nonsense mutation and its sensitivity to gentamicin treatment
Gentamicin-induced readthrough level is plotted against basal readthrough level (A); the factor of increase is plotted against basal readthrough level (B) and against gentamicin-induced readthrough level (C). Bravais-Pearson tests were performed to analyze the correlations between the three variables after Box-Cox transformation. Statistically significant results are indicated in
These results provide the first description of the relationship between basal readthrough, gentamicin-induced readthrough and gentamicin response. They indicate that nonsense mutations with a “high” basal readthrough level give “high” levels of gentamicin-induced readthrough. However, some nonsense mutations with a “low” basal readthrough level presented a “high” gentamicin-induced readthrough level, because they had high factors of increase. Thus, nonsense mutations were found to behave in different ways and could be classified into three distinct groups (
Response-type 1: High basal readthrough levels and high gentamicin-induced readthrough levels (for example, p53 213, CF 122 or DMD 931). These nonsense mutations did not have a particularly high factor of increase.
Response-type 2: A low or medium basal readthrough level associated with a high factor of increase, resulting in essentially high levels of gentamicin-induced readthrough (for example, APC 1114 or p53 192).
Response-type 3: A low or medium basal readthrough level and a weak or moderate gentamicin response. Most of the nonsense mutations studied were of this type.
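The three response types can be sketched as a simple decision rule. This is only an illustrative reading of the text: it assumes that "high" means above the thresholds defined earlier (B above 0.10%, G above 0.50%), and the function name `response_type` and the example values are hypothetical. The actual assignment of each mutation is the one given in the supplementary table.

```python
HIGH_B = 0.10  # %, basal readthrough above this value is called "high"
HIGH_G = 0.50  # %, induced readthrough above this value is called "high"


def response_type(basal, induced):
    """Assign a response type (1, 2 or 3) from B and G, both in %.

    Assumption: the thresholds above are what separates the groups.
    Returns None for a high-B / low-G combination, which the text
    does not describe.
    """
    high_b = basal > HIGH_B
    high_g = induced > HIGH_G
    if high_b and high_g:
        return 1  # high basal and high induced readthrough
    if high_g:
        return 2  # low/medium basal, high induced (large factor of increase)
    if not high_b:
        return 3  # low/medium basal, weak or moderate response
    return None
```

For example, hypothetical values `response_type(0.05, 0.60)` fall into type 2, since the induced level is high despite a low basal level.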
The first two groups include mutations for which gentamicin treatment can promote high levels of readthrough. For these mutations, we would expect gentamicin treatment to provide clinical benefit in diseases linked to the presence of a nonsense mutation. Indeed, similar levels of induced readthrough have already been shown to improve the clinical status of cystic fibrosis patients with CFTR mutations treated with gentamicin.
We hypothesized that the differences in the behavior of these nonsense mutations should depend on the nature of the stop codon and the nucleotide context. We therefore assessed the contribution of each of these factors.
The statistical approach used is described in the
Description of the statistical approach used for the study of the impact of identity of the stop codon or nucleotide context on readthrough levels and response to gentamicin.
Graphical representations of medians (before Box-Cox transformation) for the parameters B, G and I are shown for the three different stop codons: UAA, UAG and UGA. The results of the statistical analysis after Box-Cox transformation are shown in
After Box-Cox transformation, which allowed us to obtain a normal distribution for these three variables, an ANOVA test revealed that the three groups differed significantly in terms of their basal and induced readthrough levels. An LSD test yielded the following hierarchy: UGA > UAG > UAA (the sign > represents a statistically significant difference) for both basal and induced readthrough levels (
B  G  I  

C>U A>U  /  U>G 

/  A, C>G  / 

/  /  / 

/  / 


/  /  C>A 

/ 





/ 

C>A, G, U 

C>U 

U>A  / 


/  /  / 

/  /  / 

/  G>C  / 

/  / 

p≤0.005.
Conversely, an ANOVA test revealed that the factor of increase did not differ significantly between the three groups (
We then investigated the effects of nucleotide context, by the same statistical approach used for investigation of the effects of stop codon identity (see
Graphical representations of medians for the parameters B, G and I (before Box-Cox transformation) for the four different nucleotides (A, C, G or U) at each position in the sequence surrounding the stop codon (−6 to +9). Statistical analysis was performed after Box-Cox transformation and results are summarized in
After Box-Cox transformation, we were able to use parametric statistical tests (ANOVA and LSD) to define a hierarchy for some positions (
We first compared the effect of each nucleotide at a given position to the three others at the same position (
These findings contrast with results previously obtained in yeast, which pointed out the role of the two adenine residues immediately upstream from the stop codon in the absence of treatment
Two major determinants were identified in this study (
The +4 position, at which a cytosine residue is correlated with higher basal (p = 0.06) and gentamicin-induced readthrough (p = 0.001) than the three other nucleotides. A C residue in this position has been shown to promote high levels of readthrough in both yeast and mammalian cells, but only in a small number of nucleotide contexts
The −1 position, at which a uracil residue is associated with higher levels of gentamicin-induced readthrough (p = 0.02) and a stronger gentamicin response (p<0.005) than other nucleotides. This is the first time that the presence of a nucleotide at a specific position has been linked to a better response to gentamicin treatment.
However, among mutations with a U at the −1 position, 30% also have a C at +4 (compared with 8% for mutations with an A, 12% for mutations with a G and 29% for mutations with a C at −1). To verify that the effect of a U at −1 is not biased by the presence of a C at +4, we performed the same statistical analysis, restricting the pool of mutations to those without a C at +4. In this subset, the mean values of gentamicin-induced readthrough and increase factor are even higher when there is a U at the −1 position than with the three other nucleotides (
We checked that the determinants identified here were retrospectively consistent with published readthrough analyses. Keeling and Bedwell
We also compared the effect of each nucleotide at a given position to the four nucleotides at all positions on B, G and I. This procedure revealed a clear effect on the increase factor, p = 0.0002 (
We analyzed the effect of the nucleotide in position −1 independently of the influence of other nucleotides, by quantifying the readthrough levels of six nonsense mutations in which we changed only the nucleotide in position −1, keeping the rest of the sequence constant:
DMD 673 (UAG, response-type 1) and DMD 319 (UGA, response-type 1) have a U residue in position −1, which we replaced with each of the other three nucleotides (A, C, G) separately.
CF 122 (UAA, response-type 1), DMD 931 (UAG, response-type 1), beta 17 (UAG, response-type 3) and p53 146 (UGA, response-type 3) have a G, A, C or G residue, respectively, in position −1. We replaced each of these residues with a U residue.
Readthrough levels were quantified in NIH3T3 cells in the presence or absence of gentamicin (
For DMD 319 and DMD 673, a U residue immediately upstream from the stop codon (−1) was replaced by each of the other nucleotides, separately. For DMD 931, p53 146, beta 17 and CF 122, the original nucleotide in the −1 position was replaced by a U residue. Readthrough efficiencies were measured as described in
These results provide evidence that the nucleotide immediately upstream from the stop codon is a major determinant of gentamicin response, with a uracil residue in this position associated with stronger responses to gentamicin treatment. For therapeutic purposes, readthrough levels in the presence of gentamicin are the relevant variable. However, the capacity of a nonsense mutation to increase its readthrough level after antibiotic treatment could become crucial in the future, as new readthrough inducers become available. Indeed, several groups are currently developing new molecules derived from aminoglycosides that act in a similar way but with greater efficiency
We then examined how the nucleotide upstream from the stop codon exerted its effect on readthrough levels. In prokaryotes, the chemical properties of the ultimate amino acid in the nascent polypeptide chain have been reported to modulate translational readthrough
The nucleotide in the −1 position is the third base of the codon immediately upstream from the stop codon (codon −1). During translation termination, the stop codon is located in the ribosomal A-site and codon −1 is in the P-site. We therefore investigated whether having a hydrophilic or hydrophobic amino acid at the P site was correlated with higher levels of readthrough or stronger gentamicin responses. A two-tailed t-test comparing the two groups (hydrophobic or hydrophilic amino acid) for all the nonsense mutations studied showed that there is no relationship between the nature of the final amino acid and a high factor of increase (t = −0.91; p = 0.36) or high readthrough rates (t = 1.71; p = 0.09 for B and t = 1.28; p = 0.2 for G). Moreover, the amino acids encoded by codons ending in U do not belong to a particular chemical class.
These results strongly suggest that the nature of the amino acid at the ribosomal P-site is not a major determinant of readthrough levels.
We then investigated whether the effect of the nucleotide in the −1 position on the factor of increase was due to the nature of the tRNA at the P site. Nucleotides 1, 2 and 3 of the mRNA codon are recognized by nucleotides 36, 35 and 34, respectively of the tRNA anticodon (
We investigated whether the effect of the −1 nucleotide on the factor of increase was correlated with the identity of the tRNA, using four nonsense mutations: beta 17, DMD 319, (
The APC 1114 and APC 1131 nonsense mutations originally had a U residue in the −1 position, which we replaced with a C residue. For each sequence, the two different codons at the P-site are recognized by the same tRNA (
Nonsense mutation  Beta 17  Beta 17  DMD 319  DMD 319  APC 1114  APC 1114  APC 1131  APC 1131
Stop codon  UAG  UAG  UGA  UGA  UGA  UGA  UAA  UAA
Nucleotide in −1  C  U  C  U  C  U  C  U
Last amino acid  GLY  GLY  PRO  PRO  ASN  ASN  CYS  CYS
Codon (5′→3′) in −1  GGC  GGU  CCC  CCU  AAC  AAU  UGC  UGU
Anticodon (3′→5′)  CCG  CCG  GGA  GGA  UUG  UUG  ACG  ACG
Thus, the nucleotide in the −1 position is itself a major determinant of the gentamicin response and of the gentamicin-induced readthrough level.
We used the largest set of nonsense mutations ever analyzed for the first statistical analysis of the influence of nucleotide context on PTC readthrough and response to aminoglycoside treatment. We confirm the findings of previous studies concerning the importance of the nucleotide in the +4 position, at which the presence of a cytosine (C) residue is correlated with high basal and gentamicin-induced readthrough levels. We also show for the first time that the presence of a U residue in −1 is a key determinant of gentamicin-induced readthrough, which is the relevant parameter for clinical applications. Moreover, we note that a U in −1 is also correlated with a higher increase factor between basal and gentamicin-induced readthrough. This finding may have important implications for fundamental aspects of structural interactions between readthrough inducers and the translational apparatus. We show that the impact of the base in the −1 position is mediated neither by the last amino acid nor by the tRNA present at the ribosomal P site.
These data are consistent with previous reports excluding a role for the last residue of the polypeptide chain or the last incorporated tRNA in readthrough efficiency in eukaryotes
Finally, the consensus sequence U STOP C was systematically associated with induced readthrough levels greater than 0.5%. The combination of these two nucleotides before and after the stop codon may therefore provide an initial indicator of readthrough levels compatible with therapeutic benefit.
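As an illustration of how the U STOP C indicator could be checked on a sequence, here is a minimal sketch. It assumes that the stop codon occupies positions +1 to +3, so that −1 is the nucleotide immediately upstream and +4 the nucleotide immediately downstream. The function name `has_u_stop_c` and the example sequences are hypothetical and are not taken from the study.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def has_u_stop_c(sequence, stop_index):
    """Return True if a U precedes the stop codon (-1) and a C follows it (+4).

    sequence   -- RNA or DNA string (T is read as U)
    stop_index -- 0-based index of the first nucleotide of the stop codon
    """
    rna = sequence.upper().replace("T", "U")
    # One nucleotide is needed on each side of the stop codon.
    if stop_index < 1 or stop_index + 3 >= len(rna):
        raise ValueError("sequence must include positions -1 and +4")
    stop = rna[stop_index:stop_index + 3]
    if stop not in STOP_CODONS:
        raise ValueError(f"{stop!r} is not a stop codon")
    return rna[stop_index - 1] == "U" and rna[stop_index + 3] == "C"
```

A call such as `has_u_stop_c("GGUUGACAA", 3)` inspects the context `U UGA C` around the stop codon at index 3. The check only flags the two-nucleotide context; it does not predict a readthrough level.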
NIH3T3 cells (embryonic mouse fibroblasts kindly provided by Marc Sitbon) were cultured in DMEM plus GlutaMAX (Invitrogen). The medium was supplemented with 10% foetal calf serum (FCS, Invitrogen) and 100 U/ml penicillin/streptomycin. Cells were kept in a humidified atmosphere containing 5.5% CO_{2} at 37°C.
Complementary oligonucleotides corresponding to nonsense mutations embedded in their natural context (sequences in
Excel was used for statistical analysis: the Analysis ToolPak for descriptive statistics; XLSTAT for Bartlett, correlation (Bravais-Pearson) and t-tests; and the Analyse-it module for ANOVA and LSD tests.
Descriptive statistics (
Several parameters (Kurtosis coefficient, asymmetry coefficient etc.) and graphical analysis were used to determine whether the distribution of each variable followed a Gaussian distribution. A Gaussian distribution is characterized by a Kurtosis coefficient of 0 and an asymmetry coefficient of 1.
In order to perform a complete statistical analysis, we chose to use parametric tests rather than lower-power non-parametric tests. To this end, we applied a Box-Cox transformation to variables B, G and I, with Ψ_{λ}(x) = (x^{λ}−1)/λ, using the same lambda for all three: −0.217. After this transformation, a Shapiro-Wilk test allowed us to conclude that B, G and I follow a normal distribution.
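The transformation can be written in a few lines. The sketch below implements the formula given above with the stated λ = −0.217. The function names and the inverse transform are additions made here for illustration, and the Shapiro-Wilk test itself is not reproduced (it would normally come from a statistics package).

```python
import math

LAMBDA = -0.217  # common lambda used for B, G and I


def box_cox(x, lam=LAMBDA):
    """Box-Cox transform: (x**lam - 1) / lam, or log(x) when lam == 0."""
    if x <= 0:
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam


def inverse_box_cox(y, lam=LAMBDA):
    """Map a transformed value back to the original scale."""
    if lam == 0:
        return math.exp(y)
    return (lam * y + 1.0) ** (1.0 / lam)
```

The transform is monotonically increasing, so the ranking of sequences is preserved. Values below 1 (as is the case for B and G expressed in %) map to negative numbers, and values above 1 (such as most factors of increase) map to positive numbers.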
Correlations between variables were analyzed with the parametric Bravais-Pearson test. The null hypothesis (H_{0}) was “there is no correlation between the two variables studied (R = 0)”. The alternative hypothesis (H_{1}) was “there is a correlation between the two variables studied (R≠0)”. A perfect positive correlation gives an R value of +1, whereas a perfect negative correlation gives an R value of −1. The significance level was set at 0.05.
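For readers who want to reproduce this kind of test, a minimal sketch of the Bravais-Pearson coefficient and its associated t statistic follows. It uses the standard result that, under H_{0}, t = R·sqrt((n−2)/(1−R²)) follows a Student distribution with n−2 degrees of freedom. The function names are introduced here for illustration; the p-value would be obtained from a t table or a statistics package, which is not shown.

```python
import math


def pearson_r(xs, ys):
    """Bravais-Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two samples of equal length (n >= 3)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant sample has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


def correlation_t_statistic(r, n):
    """t statistic for H0: R = 0, with n - 2 degrees of freedom."""
    if abs(r) >= 1:
        raise ValueError("t is undefined for a perfect correlation")
    return r * math.sqrt((n - 2) / (1.0 - r * r))
```

In the setting described here, the inputs would be the Box-Cox-transformed values of two of the variables B, G and I for the 66 sequences.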
In order to analyze the effect of the nature of the stop codon or the nucleotide context on readthrough levels and gentamicin response, the 66 stop codons were divided into three groups for stop codon studies (UAA, UAG and UGA) and four groups for nucleotide context studies (U, C, A, G). For each group, we then used a Bartlett test to analyze heterogeneity of variance for each variable. If heterogeneity was not significant, we performed one of the most commonly used multiple comparison procedures, Fisher's Least Significant Difference (LSD) test. The LSD test is a two-step test. First, an ANOVA (Analysis Of Variance) test is performed. The null hypothesis for ANOVA is that the mean (average value of the dependent variable) is the same for all groups. The alternative hypothesis is that the mean is not the same for all groups. When the null hypothesis is rejected, it means that at least two groups differ from each other. In a second step, we determine which groups differ from which by performing all pairwise t-tests. This last procedure makes it possible to establish a hierarchy between stop codons or between nucleotides at each position. In
Two-tailed Student's t-tests (Excel) were used to study the influence of tRNA or the amino acid in the ribosomal P-site. For this test, we used a significance level α of 0.05.
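The two-step LSD procedure can be sketched as follows. This is an illustrative implementation under stated assumptions: the pairwise t statistics use the pooled within-group variance from the ANOVA, as in the usual formulation of Fisher's LSD; the Bartlett test and the conversion of F and t statistics into p-values are left to a statistics package; and the function names and the toy data are hypothetical.

```python
import math
from itertools import combinations


def one_way_anova(groups):
    """One-way ANOVA on a dict {group name: list of values}.

    Returns (F, df_between, df_within, mse), where mse is the
    within-group mean square reused by the LSD step.
    """
    k = len(groups)
    n_total = sum(len(values) for values in groups.values())
    grand_mean = sum(sum(values) for values in groups.values()) / n_total
    means = {name: sum(values) / len(values) for name, values in groups.items()}
    ss_between = sum(
        len(values) * (means[name] - grand_mean) ** 2
        for name, values in groups.items()
    )
    ss_within = sum(
        (x - means[name]) ** 2 for name, values in groups.items() for x in values
    )
    df_between, df_within = k - 1, n_total - k
    mse = ss_within / df_within
    return (ss_between / df_between) / mse, df_between, df_within, mse


def lsd_t_statistics(groups, mse):
    """Pairwise t statistics (df = N - k) using the pooled variance mse."""
    means = {name: sum(values) / len(values) for name, values in groups.items()}
    result = {}
    for a, b in combinations(groups, 2):
        se = math.sqrt(mse * (1.0 / len(groups[a]) + 1.0 / len(groups[b])))
        result[(a, b)] = (means[a] - means[b]) / se
    return result


# Toy transformed values per stop codon (hypothetical, not study data).
toy = {"UAA": [1.0, 2.0, 3.0], "UAG": [2.0, 3.0, 4.0], "UGA": [5.0, 6.0, 7.0]}
f_stat, df_b, df_w, mse = one_way_anova(toy)
pairwise = lsd_t_statistics(toy, mse)
```

The pairwise step is only interpreted when the ANOVA rejects the null hypothesis; the sign of each t statistic then indicates the direction of the difference used to build the hierarchy.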
Plots representing the normal distribution of B, G and I after Box-Cox transformation using a common lambda = −0.217.
(TIF)
Graphic representations of the correlations between the variables B, G and I before Box-Cox transformation. Gentamicin-induced readthrough level is plotted against basal readthrough level (A); the factor of increase is plotted against basal readthrough level (B) and against gentamicin-induced readthrough level (C). For each graph, the trend curve is shown.
(TIF)
Codon-anticodon recognition rules. A. The nucleotides in positions 1, 2 and 3 of the codon are recognized by the nucleotides in positions 36, 35 and 34, respectively, of the tRNA anticodon. B. In eukaryotes, the nucleotide in position 34 of the anticodon can recognize two different nucleotides in position 3 of the codon. An A, a U or a C residue in position 3 of the codon may be recognized by an A or a G residue in position 34 of the tRNA. Thus, some codons are recognized by wobble pairing with the anticodon.
(TIF)
List of the 66 sequences containing a stop codon, with basal readthrough (B), gentamicin-induced readthrough (G), increase factor between basal and induced readthrough (I) and the response-type group to which each was assigned. These sequences were inserted into the dual reporter vector in order to determine readthrough level. Nonsense mutations are named after the related gene or disease and their position (amino acid). p53 mutations are involved in cancers; DMD and CMD mutations are involved in muscular dystrophies; CF mutations are involved in cystic fibrosis and beta mutations are involved in beta-thalassemia (see Materials and Methods for references). Nonsense mutations are ranked according to their gentamicin-induced readthrough level.
(PDF)
Descriptive statistics for the 3 parameters B, G and I before Box-Cox transformation.
(PDF)
Shapiro test after Box-Cox transformation (λ = −0.217).
(PDF)
Bravais-Pearson statistical analysis of correlation between basal readthrough, induced readthrough and gentamicin response (Increase Factor) after Box-Cox transformation.
(PDF)
Statistical analysis of the effect of the type of stop codon on B, G and I after Box-Cox transformation using a common lambda: −0.217 (this transformation leads to negative values for B and G).
(PDF)
Statistical analysis of the effect of each nucleotide at each position on B, G and I after Box-Cox transformation using a common lambda: −0.217 (this transformation leads to negative values for B and G).
(DOC)
Statistical analysis of the effect of the nucleotide in the −1 position on B, G and I (after Box-Cox transformation), restricting the pool of mutations to those without a C in +4.
(DOC)
Statistical analysis of the effect of all nucleotides on B, G and I after Box-Cox transformation using a common lambda: −0.217. The significant differences between each nucleotide and all nucleotides at all positions are listed. In this list, inc stands for Increase Factor; A, G, C or U for the nucleotide; and − or + followed by a number for the position.
(DOC)
Statistical data (Standard deviation and standard error of the mean) of studied nonsense mutations (
(DOC)
We would like to thank Olivier Namy for his support and stimulating discussions and all the members of the laboratory, including Henri Grosjean, for helpful suggestions. We thank Laëtitia Joubert and Jules Deforges for technical assistance and Cécile Fairhead for reading this manuscript. We would also like to thank Jacques Rochette (INSERM U 925UPJV, Amiens) for providing beta-globin mutant sequences.