Metabolic Variability in Micro-Populations

Biological cells in a population are variable in practically every property. Much is known about how variability of single cells is reflected in the statistical properties of infinitely large populations; however, many biologically relevant situations entail finite times and intermediate-sized populations. The statistical properties of an ensemble of finite populations then come into focus, raising questions concerning inter-population variability and dependence on initial conditions. Recent technologies of microfluidic and microdroplet-based population growth realize these situations and make them immediately relevant for experiments and biotechnological application. We here study the statistical properties, arising from metabolic variability of single cells, in an ensemble of micro-populations grown to saturation in a finite environment such as a micro-droplet. We develop a discrete stochastic model for this growth process, describing the possible histories as a random walk in a phenotypic space with an absorbing boundary. Using a mapping to Polya’s Urn, a classic problem of probability theory, we find that distributions approach a limiting inoculum-dependent form after a large number of divisions. Thus, population size and structure are random variables whose mean, variance and in general their distribution can reflect initial conditions after many generations of growth. Implications of our results to experiments and to biotechnology are discussed.

The moments of the distribution Beta-Binomial distribution are Leading the the main text result on the ratio between standard-deviation and mean.
Generalizing to a non-symmetric model, the growth rates can be considered as weights that factor the probability of the next division. In the language of Polya urns this can be represented by modifying the number of balls that corresponds to a cell. Denoting the number of draws of each kind by (W N , B N ), the initial state is taken as (w 0 , b 0 ) = (µ 1 n 1 (0), µ 2 n 2 (0)). Following a draw of a ball, we return it with µ 1 or µ 2 new balls according to its color. Effectively, now we draw balls from an urn that has a distribution of white and black balls that is biased according to the asymmetry of growth rates, therefore the probability of picking a white ball will be proportional to the probability of type 1 cell performing the next division. Consequently, n 1 -the number of cells with growth rate µ 1 , will have the same distribution as w0+W N µ1 . In this case the limit distribution is unbounded: The limiting distributions are more easily written in terms of the number of divisions n i = n i − n i (0) instead of the absolute number of cells. Since the two differ by a constant, the only difference will be in the mean of the distribution. Summarizing both cases the number of divisions of type 1 has a known limit distribution X N = N −ρ n 1 −→ X as appearing in the main text. Formally: we denote the cumulative distribution of n 1 by F 1,N , where the subscript 1 represents the ratio between the yields of the two types, one in the symmetric case under consideration. Then this function is related to the cumulative distribution of X N and through it approximated by the function of the random variable X: With the limit taken formally as

Final size distributions with yield variability
Here we calculate in the distribution of n 1 , number of divisions of type 1, on the stopping line n 2 +r n 1 = c(= S 0 Y 2 ), relating its cumulative distribution function F r, c with the the one for the symmetric case we calculated earlier F 1,N . Since the trajectories are monotone, all trajectories that stop on the stopping line with less then n 1 divisions of type 1, also stop on the symmetric line that passes through this point, with less then n 1 division of this type (see illustration in the Appendix of the main text). Therefore, the cumulative distribution for this point can be taken from the corresponding cumulative distribution of the symmetric problem, using N = n 1 + n 2 = n 1 + c − r n 1 = c + n 1 (1 − r) for the total number of divisions : Where the last approximation is valid when c max(n 1 (0), n 2 (0)), corresponding to a small initial condition relative to the growth potential of the medium (see definition of ε in the deterministic model). In this limit we can find the connection between the probability density functions: With the change of variables: Thus relating the moments: When n 1 (x) is the inverse function to (5).
If now we return to the absolute number of cells, instead of number of divisions, all the central moments, including the variance, are the same and the mean shifts by the initial number of cells. Overall, the mean and variance of the total population are: Final population size distribution for the two-state model with equal growth rates In the case of equal growth rates (ρ = 1) X follows the Beta distribution, which has a compact support and a PDF: And the scaling variable in (5) is x = n1 c+ n1 (1−r) . Inverting the scaling relation is straightforward in this case n 1 = cx 1−x(1−r) , and so the moments of n 1 can be calculated using the moments of X. Starting with the average of n 1 : We can now use the known moments of the Beta distribution: x k X = B(n 1 (0) + k, n 2 (0)) B(n 1 (0), n 2 (0)) And so obtain the expansion: Which is the expansion of the ordinary hypergeometric function: n 1 r,c = c n 1 (0) n 1 (0) + n 2 (0) 2 F 1 (1, n 1 (0) + 1, n 1 (0) + n 2 (0) + 1;1 − r) (11) When r is close enough to 1, we can truncate the expansion: n 1 r,c ≈ c n 1 (0) n 1 (0) + n 2 (0) 1 + n 1 (0) + 1 n 1 (0) + n 2 (0) + 1 (1 − r) Where in the last line we used n 1 (0) = qN 0 .
Using N = c + n 1 (1 − r), this result gives immediately the dependence of the mean number of total divisions on the inocculum size and composition: Now turning to the variance of the final population, we write a similar expansion for the second moment: Which can also be truncated: Therefore the variance of the number of cells is Leading to the variance of N : For small enough r: Meaning the standard-deviation over the mean decreases as η ∝ 1/ (N 0 + 1).
This is in agreement with the earlier calculation using the discrete case Beta-

Binomial distribution
Final population size distribution for the two-state model with variable special values of growth rates The model with variable growth rate can be solved for the special case of ρ = 0.5. As argued previously, we expect that the qualitative nature of the solution will not depend strongly on the exact values of the growth rates; therefore we solve this case as an illustratio. The scaling relation (5) is: Which can be inverted as Using (18) we can again calculate the mean and variance using the first few moments of X [3]: The first moment of the number of divisions of type 1 is then found Where we have defined g(x) = 1 + (1−r) 2 x 2

4c
. Using the fact that c is very large enables to write it as a combination of the moments of X: Leading to the result for the total number N : 0) − 0.5)Γ(n 2 (0) + 1) Γ(n 1 (0))Γ(n 2 (0)) + (1 − r) 2 2 Γ(n 1 (0) − 1)Γ(n 2 (0) + 2) Γ(n 1 (0))Γ(n 2 (0)) This approximation appears as a series in the extent of yield variability (1 − r). A direct expansion in ε = N0 Y2S0 , as we had in the deterministic case, cannot be formulated as a controlled approximation here since the effect of the initial size enters through the moments of X. Next we turn to the second moment: Again approximating for large values of initial substrate: