Predicting Cellular Growth from Gene Expression Signatures

Figure 5

Assessment of accuracy and outlier detection during growth rate inference.

(A) We performed an out-of-sample cross-validation of our model by randomly sub-sampling 24 of the 36 training expression arrays 1,000 times. For each random sample, we refit our linear model, calculated bootstrapped null distributions for all gene parameters, and identified the set of most significant growth-specific genes. These genes were then used to infer growth rates for the 12 held-out conditions, providing an estimate of the accuracy of the model's growth rate predictions.

(B) When predicting growth rates for a new collection of expression data, our model excludes any calibration gene whose expression level falls outside the inner fence (more than 1.5 times the inter-quartile range below the first quartile or above the third quartile). This improves the accuracy of the predicted growth rate while also flagging genes that may be responding to specific non-growth stimuli under some biological condition. For example, in the mild heat shock time course of [6], two of the six outliers are known heat shock genes (HSP26 and HSP78). The other four (YLR327C, MOH1, YBL048W, and TMA10) are uncharacterized genes, suggesting potential roles in the response to heat shock.
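The two procedures in this caption can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `inner_fence_mask` and `split_train_heldout` are hypothetical, the example expression values are made up for illustration, and the sketch assumes the fence is computed across the calibration genes' expression values within a single condition, which the caption does not state explicitly. The model refitting and bootstrapping steps of panel A are omitted.

```python
import numpy as np


def inner_fence_mask(values, k=1.5):
    """Return True for values inside the inner fence, False for outliers.

    The inner fence spans from Q1 - k*IQR to Q3 + k*IQR, with k = 1.5
    as described in panel B. Quartiles use NumPy's default (linear)
    interpolation, which is an assumption of this sketch.
    """
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return (values >= lower) & (values <= upper)


def split_train_heldout(n_arrays=36, n_train=24, rng=None):
    """Randomly split array indices into training and held-out sets.

    Mirrors the 24-of-36 sub-sampling in panel A; calling this 1,000
    times would give the 1,000 random samples described there.
    """
    rng = np.random.default_rng(rng)
    perm = rng.permutation(n_arrays)
    return np.sort(perm[:n_train]), np.sort(perm[n_train:])


# Illustrative (invented) expression values for eight calibration genes.
expression = [0.1, 0.2, 0.0, -0.1, 0.15, 3.0, -0.2, 0.05]
keep = inner_fence_mask(expression)
print("genes kept:", np.flatnonzero(keep))
print("outlier genes:", np.flatnonzero(~keep))

train_idx, heldout_idx = split_train_heldout(rng=0)
print(len(train_idx), "training arrays,", len(heldout_idx), "held out")
```

In this sketch, genes flagged as outliers would be dropped before the growth rate is inferred from the remaining calibration genes, and they can be inspected separately as candidates for condition-specific responses.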

doi: https://doi.org/10.1371/journal.pcbi.1000257.g005