Whole cell biophysical modeling of codon-tRNA competition reveals novel insights related to translation dynamics

Fig 5

A multivariable regressor for predicting protein abundance (PA) and optical density (OD) of GFP variants.

(A) Pearson correlation between the model prediction and the measured PA values of the GFP variants, as a function of the number of features selected, for the training and test sets. This graph demonstrates the approach taken for feature selection: 100 times, a training set (~67%) and a test set (~33%) were randomly selected. Each time, the next best feature was selected to be the one that increases R² the most in the test set (for more details, see sub-section 'Regressor features selection' in the Methods section). This result suggests that a model with more than ~10 features will predict poorly because of over-fitting. (B) After choosing the best features and sorting them, this panel shows the correlation between the predicted PA (blue) and OD (red) values and the measured ones, for an increasing number of features. A feature set based on ESDR (continuous lines) was compared to a simpler metric of codon count (dashed lines). In both cases (PA and OD) the ESDR-based model performed better and reached impressive correlation with the empirical data, demonstrating the importance of our model. Data for this model were taken from Kudla et al. [34].
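The selection scheme described for panel (A) can be illustrated with a minimal Python sketch of greedy forward feature selection over repeated random train/test splits. It is not the authors' implementation: the ordinary least-squares regressor, the use of squared Pearson correlation as R², the averaging of the score over splits, the synthetic data, and all names (`forward_select`, `fit_predict`, `pearson_r`) are assumptions made for illustration only. The actual procedure is given in the 'Regressor features selection' sub-section of the Methods.

```python
import numpy as np


def pearson_r(a, b):
    """Pearson correlation; returns 0.0 when it is undefined (constant input)."""
    r = np.corrcoef(a, b)[0, 1]
    return 0.0 if np.isnan(r) else float(r)


def fit_predict(X_train, y_train, X_test):
    """Least-squares linear regression with an intercept (an assumed regressor)."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    B = np.column_stack([np.ones(len(X_test)), X_test])
    return B @ coef


def forward_select(X, y, max_features, n_splits=100, test_frac=1 / 3, seed=0):
    """Greedily add the feature that most increases the mean test-set R^2."""
    rng = np.random.default_rng(seed)
    n = len(y)
    n_test = int(round(n * test_frac))
    # Draw the random (~67% train, ~33% test) splits once and reuse them.
    splits = []
    for _ in range(n_splits):
        perm = rng.permutation(n)
        splits.append((perm[n_test:], perm[:n_test]))

    selected, scores = [], []
    remaining = list(range(X.shape[1]))
    while remaining and len(selected) < max_features:
        best_j, best_score = remaining[0], -np.inf
        for j in remaining:
            cols = selected + [j]
            r2 = []
            for train_idx, test_idx in splits:
                pred = fit_predict(X[train_idx][:, cols], y[train_idx],
                                   X[test_idx][:, cols])
                r2.append(pearson_r(pred, y[test_idx]) ** 2)
            score = float(np.mean(r2))
            if score > best_score:
                best_j, best_score = j, score
        selected.append(best_j)
        remaining.remove(best_j)
        scores.append(best_score)
    return selected, scores


# Synthetic demonstration data (hypothetical, not from Kudla et al.):
# only features 2 and 0 carry signal, feature 2 most strongly.
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(90, 5))
y_demo = 3.0 * X_demo[:, 2] + 1.0 * X_demo[:, 0] + 0.1 * rng.normal(size=90)
selected_demo, scores_demo = forward_select(X_demo, y_demo, max_features=3,
                                            n_splits=20)
```

Plotting the returned scores against the number of selected features, for both the training and the test sets, gives a curve of the kind shown in panel (A). A test-set curve that flattens or declines as features are added is the usual sign of over-fitting.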


doi: https://doi.org/10.1371/journal.pcbi.1008038.g005