From noise to models to numbers: Evaluating negative binomial models and parameter estimations in single-cell RNA-seq
Fig 6
Heterogeneity in scRNA-seq capture efficiency across cells systematically alters the effective model-selection landscape, shifting the regions in which telegraph, NB and Poisson distributions are favoured.
(a) Schematic illustrating the binomial capture model for scRNA-seq, in which the probability of mRNA capture varies between cells according to some distribution. (b) We consider three distributions of the capture probability, all with the same mean but with different coefficients of variation (CV): (i) a Dirac delta distribution with CV = 0; (ii) a Beta distribution with CV = 0.11; (iii) a Beta distribution with CV = 0.21. (c) Phase diagrams showing the regions of parameter space where the telegraph, NB and Poisson distributions are selected as optimal by the aeBIC, given that the ground-truth mRNA distribution is that of the telegraph model with an effective transcription rate set by the capture probability, which is sampled from the three distributions in (b). The parameters shown are the sample size, the sum of the gene-state switching rates normalised by the mRNA degradation rate, and the fraction of time spent in the active state. The fraction of the total parameter space occupied by the region where the NB distribution is selected as optimal is shown on the plots. Note that the transcription rate is held fixed, which caps the mean number of transcripts attainable in the phase plots.
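The capture model in panels (a) and (b) can be illustrated with a minimal sketch, which is not the authors' code. It assumes a hypothetical mean capture probability of 0.3 and hypothetical true transcript counts, since the values used in the figure are not reproduced here. The helper names `beta_params_from_mean_cv`, `beta_cv` and `capture` are likewise invented for illustration. The sketch uses the standard Beta moments to find shape parameters matching a given mean and CV, then thins each cell's true count binomially with its own capture probability.

```python
import math
import random


def beta_params_from_mean_cv(mean, cv):
    """Return Beta shape parameters (a, b) with the requested mean and CV."""
    if not 0.0 < mean < 1.0:
        raise ValueError("mean must lie strictly between 0 and 1")
    if cv <= 0.0:
        raise ValueError("cv must be positive; cv = 0 is the Dirac case")
    var = (cv * mean) ** 2
    total = mean * (1.0 - mean) / var - 1.0  # this is a + b
    if total <= 0.0:
        raise ValueError("cv is too large for a Beta distribution with this mean")
    return mean * total, (1.0 - mean) * total


def beta_cv(a, b):
    """Coefficient of variation of Beta(a, b): sqrt(b / (a * (a + b + 1)))."""
    return math.sqrt(b / (a * (a + b + 1.0)))


def capture(true_counts, mean_p, cv, rng):
    """Binomially thin each cell's count with a cell-specific capture probability.

    cv == 0 corresponds to the Dirac case: every cell shares the same probability.
    """
    if cv > 0.0:
        a, b = beta_params_from_mean_cv(mean_p, cv)
    captured = []
    for n in true_counts:
        p = mean_p if cv == 0.0 else rng.betavariate(a, b)
        # Each of the n transcripts is captured independently with probability p.
        captured.append(sum(rng.random() < p for _ in range(n)))
    return captured


if __name__ == "__main__":
    rng = random.Random(0)
    true_counts = [12, 0, 30, 7, 18]  # hypothetical true counts, one per cell
    for cv in (0.0, 0.11, 0.21):      # the three CV values listed in panel (b)
        print(cv, capture(true_counts, 0.3, cv, rng))  # 0.3 is an assumed mean
```

Deriving the shape parameters from the mean and CV, rather than hard-coding them, keeps the three cases comparable: only the spread of the capture probability changes between them.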