Maximum entropy and population heterogeneity in continuous cell cultures

Continuous cultures of mammalian cells are complex systems displaying hallmark phenomena of nonlinear dynamics, such as multi-stability, hysteresis, as well as sharp transitions between different metabolic states. In this context mathematical models may suggest control strategies to steer the system towards desired states. Although even clonal populations are known to exhibit cell-to-cell variability, most of the currently studied models assume that the population is homogeneous. To overcome this limitation, we use the maximum entropy principle to model the phenotypic distribution of cells in a chemostat as a function of the dilution rate. We consider the coupling between cell metabolism and extracellular variables describing the state of the bioreactor and take into account the impact of toxic byproduct accumulation on cell viability. We present a formal solution for the stationary state of the chemostat and show how to apply it in two examples. First, a simplified model of cell metabolism where the exact solution is tractable, and then a genome-scale metabolic network of the Chinese hamster ovary (CHO) cell line. Along the way we discuss several consequences of heterogeneity, such as: qualitative changes in the dynamical landscape of the system, increasing concentrations of byproducts that vanish in the homogeneous case, and larger population sizes.


The expectation propagation approximation
At β = 0, the probability distribution of reaction fluxes in the population of cells is the uniform measure in the polytope of feasible states P. We can represent this distribution by a truncated multivariate Gaussian [1], where ψ n (v n ) = 1 if lb n ≤ v n ≤ ub n and ψ n (v n ) = 0 otherwise, and for a large enough value of the parameter η. The truncation makes the marginals of this distribution hard to compute, so we instead use the approximate multivariate Gaussian, where φ n (v n ) ∝ e − 1 2 (vn−an)/dn are univariate Gaussians intended to approximate the truncation factors ψ n (v n ), and and D is a diagonal matrix with entries D nn = 1/d n . The 2N parameters a n , d n are obtained from the 2N moment-matching conditions [2], Upon convergence, the EP site distributions φ n (v n ) are an approximation of the real site distributions ψ n (v n ). To incorporate the exponential selection prior, we define: wherew =v + Σβ andw (n) =v (n) + Σ (n) β. The vector β contains the selection coefficients of each reaction flux in the network. In the main text, this vector has a non-zero component in the biomass synthesis reaction, while all other components are zero. Then we estimate the marginal distribution of flux v n under the exponential selection prior as a truncated univariate Gaussian:

Derivation of Equation 13 from Michaelis-Menten kinetics
An alternative uptake bound used in the literature is: where V i is the maximum uptake rate of metabolite i and K i is the Michaelis-Menten constant. Equation (13) in the main text can be derived as an approximation to equation (12) [3]. If a substrate is available in excess (s i K i ), this bound simplifies to u i ≤ V i . In rich media at low cell densities this is the relevant regime. At higher cell-densities, substrates reach low levels (s i K i ), and the bound is well approximated by u i ≤ s i V i /K i . Employing the steady state metabolite concentration from Eq. (12) in the main text, s i = c i − u i ξ, where u i is the average uptake rate of metabolite i in the population of cells. Making the mean-field approximation For high cell densities ξV i /K i 1, and the inequality simplifies to u i ≤ c i /ξ. Combining the bounds obtained in both regimes leads to Equation (13) in the main text. Note that, compared to Equation (12), this approximated bound contains one parameter (since it does not depend on K i ), and it simplifies mathematical derivations.

Stochastic model
To gain some intuition for the metabolic distribution described in the previous section, we conceived a simple stochastic model of a population of cells evolving in metabolic space. As in the main text, a given value of ξ determines a polytope P ξ of feasible metabolic states. We start the simulation with N cells placed randomly inside P ξ . Let v ∈ P ξ be the metabolic state of cell , for = 1, . . . , N . At each step, we select a random cell for reproduction. Cell has a probability proportional to µ of being selected for reproduction, where µ ( v ) is its growth rate. The selected cell replicates to generate a new cell. With a probability , the newborn has a random metabolic state; thus is a probability of "mutation" in this model. Otherwise, with a probability 1 − the newborn has the same growth rate as the mother cell, in which case its metabolic state is a random choice among all metabolic states consistent with this growth rate. Moreover, whenever a cell reproduces, we also select a random cell from the population and eliminate it. This way the total number of cells, N , is kept constant.
Assuming that N is sufficiently large, this simulation will converge after many steps to a stationary population where average statistics f ( v ) /N of arbitrary functions f ( v) of the flux vector will be approximately constant. In particular, we obtain u i , the uptake rates, and from these, the metabolite concentrations We simulated populations of 10000 cells, for different values of ξ and . In each case we recorded the steady state distribution of cells in the polytope P ξ . In particular the simulation resulted in a steady average flux of ATP production, v atp . We then compare the distribution of cells in the polytope from the simulation with the MaxEnt distribution obtained at the same value of ξ, where β is fixed by demanding that the average flux of ATP is the same as in the simulation. Due to the low dimensionality of this simple metabolic model, the resulting equation: can be solved numerically for β. The comparison between the population histograms in steady state and the coresponding MaxEnt distribution are shown in Fig. 1, for some representative values of the parameters ξ, . Here the plot represents the marginal flux distribution P ξ (v atp ) in the population. The corresponding root β of Eq. (14) is also shown.  Finally, the metabolite concentrations are found from s i = c i − ξ u i , where c i is the feed concentration of metabolite i and u i the average uptake in the population. We compute u i from the steady state distribution, and from the MaxEnt formalism. Note that v atp ME = v atp Sim. by definition, where ME denotes the MaxEnt estimate and Sim. the simulation result. On the other hand, v g ME and v g Sim. can be different in principle. Figure 2 shows the comparison between the inferred values (using the MaxEnt estimate, ME) and the simulation results.

Sensitivity to variations in crowding coefficients
To assess the sensitivity of our qualitative results in the CHO-K1 genome-scale model to variations in the crowding coefficients, we repeated simulations after randomly perturbing these coefficients by as much as 25%. Comparing the following figures to Figures 6 and 7 in the text reveals no significant changes.

Reduced CHO-K1 model
The reduced CHO-K1 model is provided in supplementary materials in .jld2 format (https://github.com/JuliaIO/JLD2.jl). The file contains three variables that represent the reduced model: nz S is the sotoichiometric matrix, nz mets the table of metabolites and nz rxns the table of reactions.