Metabolic enzyme cost explains variable trade-offs between microbial growth rate and yield

Microbes may maximize the number of daughter cells per time or per amount of nutrients consumed. These two strategies correspond, respectively, to the use of enzyme-efficient or substrate-efficient metabolic pathways. In reality, fast growth is often associated with wasteful, yield-inefficient metabolism, and a general thermodynamic trade-off between growth rate and biomass yield has been proposed to explain this. We studied growth rate/yield trade-offs by using a novel modeling framework, Enzyme-Flux Cost Minimization (EFCM) and by assuming that the growth rate depends directly on the enzyme investment per rate of biomass production. In a comprehensive mathematical model of core metabolism in E. coli, we screened all elementary flux modes leading to cell synthesis, characterized them by the growth rates and yields they provide, and studied the shape of the resulting rate/yield Pareto front. By varying the model parameters, we found that the rate/yield trade-off is not universal, but depends on metabolic kinetics and environmental conditions. A prominent trade-off emerges under oxygen-limited growth, where yield-inefficient pathways support a 2-to-3 times higher growth rate than yield-efficient pathways. EFCM can be widely used to predict optimal metabolic states and growth rates under varying nutrient levels, perturbations of enzyme parameters, and single or multiple gene knockouts.


Converting enzyme investments into cell growth rates
Being able to compute enzyme-specific biomass production rates, we next translate these rates into cell growth rates. The growth rate of a cell is given by µ = v BM /c BM , where c BM is the biomass concentration (i.e. the amount of biomass per cell volume) and v BM is the rate of biomass production (i.e. the amount of biomass produced per cell volume and per unit time). The cell's growth rate can be approximated based on the enzyme cost of biomass production, where higher enzyme-specific biomass production rates entail higher growth rates. Therefore, an assessment of enzyme-specific biomass production rates (in optimization, or in the rate/yield scatter plots) is equivalent to an assessment of growth rates. Here we derive two conversion formulae: a linear formula, which assumes a constant amount of metabolic enzyme within the biomass, and a nonlinear one, which takes into account the growth-rate-dependent investment in ribosomes. As shown in Figure S1, predicted growth rates yield a overall picture similar to the direct assessment of enzyme-specific biomass production.

Linear growth rate formula based on a fixed proteomic enzyme fraction
To estimate µ = v BM /c BM from the enzyme-specific biomass production r BM = v BM /E met , we need to know the ratio E met /c BM , i.e. the fraction of biomass formed by metabolic enzymes. Empirically, metabolic enzymes (or more precisely: the metabolic enzymes considered in our model) occupy about one eighth of the biomass (in mass units). The mass fraction of protein within biomass, f prot = P tot /c BM ≈ 0.5 (BioNumber 101955 [4]), is relatively constant across cell types. The mass fraction of metabolic enzyme within the proteome, in E. coli, varies around f ccm = E met /P tot ≈ 25% (from proteomics data [5]). If these numbers were constant, the growth rate would be proportional to the enzyme-specific biomass production r BM , with a prefactor of f ccm ·f prot ≈ 0.125:

Nonlinear growth rate formula based on growth-dependent enzyme fraction
In reality, the amount of metabolic enzyme within the proteome is subject to change. As shown by experiments and as explained by resource allocation models [6], the fraction of metabolic enzymes decreases with the growth rate, at least if the growth rate changes are caused by varying metabolic efficiency (e.g. due to different carbon sources). We can account for this changing fraction by using a modified formula. In experiments where cell growth is controlled by nutrient quality or by dilution in chemostats, we can assume a linearly decreasing fraction [6] with positive coefficients a and b. These coefficients can be estimated from proteomics data [5]: the protein fraction devoted to core carbon metabolism decreases from ≈ 25% during slow growth (µ = 0.11/h) to ≈ 18% during faster growth (µ = 0.48/h), leading to estimates a ≈ 27% and b ≈ 20% h. Inserting (3) into (4.6) and solving for µ, we obtain the formula By inserting the numerical values, we obtain µ = 0.27 · 0.5 · r BM 1 + 0.2[h] · 0.5 r BM = 0.135 r BM 1 + 0. 10[h] · r BM .
In our approximation formulae (2) and (5), µ increases with r BM . This is why we claim that maximizing the growth rate µ is equivalent to maximizing the ratio r BM = v BM /E met , or minimizing E met at given v BM .
As shown in Figure S1, the nonlinearity introduced by the second formulae has no effects on the qualitative rate/yield trade-offs. The derivation of Eqs (4) and (4.6) is illustrated in Figure S1(c). Finally, we can rewrite formulae (2)- (5) in terms of doubling time. For that, we define the metabolic enzyme doubling time as and the cell doubling time (in hours) will thus be Nonlinear growth rate formula derived from an enzyme-ribosome trade-off In the previous section, we derived the nonlinear efficiency/growth relationship from an empirical observation: the fraction of metabolic enzymes within the proteome decreases linearly with the growth rate. Here we show an alternative derivation that directly refers to the resource allocation models by Scott al. [6]. To derive the nonlinear relationship (4) between enzyme-specific biomass production and cell growth rate, we assumed that the fraction of metabolic enzymes within the proteome decreases with the growth rate, and  Figure S1: Biomass production rates and growth rates. (a) Cost/growth conversion formulae, derived by comparing the biomass production rate allowed by an EFM (blue line) to the empirically observed metabolic enzyme investment Emet/Ptot at a given growth rate µ (red line). More precisely, blue lines show the scaled biomass rate vBM/cBM as a function of scaled enzyme investment Emet/Ptot. Top left: We assume that the observed enzyme investment is constant (vertical red line). Since growth rate µ and scaled biomass production rate vBM/cBM are identical (y-axis), we can match available enzyme budget (red line) and enzyme investment (blue line) for each EFM by taking the intersection point.
Top right: For each EFM, we plot the achievable growth rate (y-axis coordinate of intersection point in left plot) against the biomass production rate per enzyme investment (slope of blue EFM line), scaled by the constant factor Ptot/cBM (assuming that Ptot is a fixed fraction of cBM). The dots for all EFMs fall on a straight line, our linear conversion function Eq. (4.6). Bottom: Alternatively, we assume that the observed enzyme investment decreases linearly with the growth rate (diagonal red line, schematic drawing). Applying the same procedure, we obtain the nonlinear conversion function Eq. (4). (b) Alternative derivation of the conversion functions, assuming an optimal partitioning of protein resources into ribosomes and metabolic enzymes. Top: We assume that metabolic enzymes occupy a fixed fraction of the proteome. Each EFM defines a biomass rate per enzyme investment (slope of blue triangle, called ax in Eq. (8); compare slopes in (a)). An EFM with a higher slope (drawing on the right) can support a proportionally larger growth rate, giving rise to our linear conversion function Eq. (4.6). Bottom: Now we assume a variable partitioning between ribosomes and metabolic enzymes, and that the growth rate is limited both by ribosomal and metabolic activity. The ribosomal efficiency (slope of brown traingle) is called aY in Eq. (8). Again, an EFM with a higher slope (right) supports a higher growth rate, but the effect is now less pronounced because of the reallocation of protein resources (arrow). This scenario leads, again, to the same nonlinear conversion function Eq. (4) as in (a). (c) Predicted enzyme-specific biomass production versus biomass yield computed for all EFMs. (d) Predicted cell growth rate vs. biomass yield. The nonlinear scaling (i.e. using equation (5)) of the growth rate has only minor effects on the Pareto plot. The horizontal grey lines are guides for the eye. On the left, the lines are evenly spaced, every 2 [gr dw h −1 / gr enz]. On the right, the same lines are converted to growth rates (in [h −1 ]) and do not appear evenly spaced since the conversion is nonlinear (see Equation 4).
we described this dependence by a simple linear function (with offset) which we extracted from proteome data. This argument agrees with the resource allocation model by Scott al. [6], in which the proteomic fraction of metabolic enzymes is given by a constant baseline amount plus a variable amount proportional to the growth rate. To clarify this, we derive our formula again using the terminology of resource allocation models (see Figure S1(d)). We assume that the proteome can be split into three mass fractions: a constant fraction (about half of the proteome), a variable fraction x (consisting of metabolic enzymes), and a variable fraction y (consisting of ribosomal proteins). We further assume that each of the fractions x and y is proportional to the cell growth rate µ with different coefficients, i.e. µ = a x x = a y y.
Here a x is given by (or proportional to) the enzyme-specific biomass production rate (which we can compute from our model for each EFM) and that a y is constant (because we do not consider the effects of translational inhibitors). With these assumptions, the growth rate is given by where c vp is the (constant) sum of variable protein fractions. We now set a x = v bm /c bm , where v bm is the biomass production rate (in carbon molar biomass / time) per metabolic enzyme (in carbon molar) and c bm is the biomass concentration in cells (in carbon molar). The value of a y can be obtained from proteomics data by a linear regression between growth rate and ribosome fraction. Eventually, we obtain This is a hyperbolic function just like the one we derived before. By adjusting this formula to proteomics data, we would obtain the same parameters as above.

EFCM provides advantages over common kinetic or constraint-based modeling methods
By combining full information about kinetics and enzyme costs with a flux optimization approach, EFCM closes the gap between kinetic and stoichiometry-based models. It provides a clear theoretical link between kinetic models and existing network-based approaches which have incorporated some of the kinetic information. In contrast to these existing methods, EFCM allows for systematic studies of parameter sensitivities and uncertainties. Due to the screening of EFMs, its numerical effort is much higher. However, after one single calculation run, gene knockouts can be easily studied without any additional numerical effort. The main advantages over existing kinetic or constraint-based methods are as follows.
• Advantage over a direct optimization of enzyme levels in kinetic models Our optimization procedure is equivalent to a direct optimization of enzyme levels, which would often be numerically impossible: imagine that we treat the enzyme levels E l in a kinetic model as free variables to be optimized for minimizing the ratio v BM (E)/E met (E). This would be computationally costly: the objective function v BM (E)/E met (E) may be difficult to evaluate (because this entails solving a kinetic model for its steady state), but is also likely to be non-convex and non-concave, with potentially many local minima. In EFCM, in contrast, all calculation steps are computationally tractable for medium-sized models. Another advantage of EFCM is that fluxes and metabolite constraints (e.g. stationarity, kinetics, thermodynamics, and physiological bounds) can be imposed easily. Moreover, it is instructive to consider the set of EFMs and to compute the growth rates even for the non-optimal ones. The rate/yield scatter plots show directly which EFMs become growth-optimal, under what conditions, and how the optimum flux mode can switch following parameter Figure S2: EFCM can be used to simulate a wide range of enzyme perturbations and their effects on growth and metabolic strategies (a) From an enzymatic rate law (in this schematic example, a simplified hypothetical Michaelis-Menten rate law with oxygen as a substrate), we obtain a formula for the enzyme cost as a function of flux, k cat value, enzyme burden h, and oxygen level [O 2 ]. Changing the latter parameters will affect the enzyme cost (and therefore the growth rate) of each EFM differently. For example, EFMs that do not use the perturbed reaction will not be affected and their growth rates remain unchanged (in this case, these are the anaerobic EFMs). Panels (b-f) show, schematically, the changes in the rate/yield diagram resulting from different parameter perturbations. In each panel, the EFM with the highest growth rate is marked by a black frame. The blue and red polygons highlight the Pareto front before and after the change, respectively. If we lower the oxygen level (b), the growth-maximizing EFM shows a lower growth rate, but remains optimal. Only at much lower oxygen levels (c), another (oxygen-independent) EFM shows the highest growth rate. A similar change can be obtained by decreasing the k cat value (d) or increasing the enzyme cost weight (e). (f) When an enzyme is knocked out, all EFMs that use this reaction become infeasible and disappear from the plot. The same effect could be reached by an extreme decrease in oxygen or k cat value, or by an extreme increase in the enzyme cost weight.  Figure S3: Two estimates of the total enzyme cost. The total enzyme cost, as computed by EFCM, can be approximated by assuming that all enzyme molecules work at their full capacity (i.e. full substrate saturation, no reverse fluxes, no product saturation). This simplification leads to the "ideal", capacity-based enzyme cost q cat = i (|v i | · w i )/k cat,i [7]. Since q cat yields a lower bound on the enzyme demand as calculated by EFCM ("actual cost", on the y-axis), all points in the plot must lie above the y = x line (black). Actually, we find that all points lie even above the y = 1.4x line (red), so the actual cost for our model in standard conditions (also for non-elementary flows) is at least 1.4 times higher than the ideal cost. Interestingly, the actual cost is also bounded from above by the y = 4.7x line (blue). The two minimal-cost EFMs (i.e. the EFM with the lowest ideal cost and the EFM with the lowest actual cost) are indicated with arrows. Since the ideal cost can be calculated easily as a linear function of the flux, it can be used as an approximation of the actual cost in certain applications, e.g. to quickly discard EFMs that would lead to very low growth rates. changes, including variable external conditions, changing enzyme parameters or cost weights, and enzyme knockouts (see Figure S2).
• Advantage over constraint-based methods with linear flux cost functions EFCM predicts metabolic fluxes and enzyme profiles ab initio, i.e. without requiring any conditions-specific measurements such as proteome or gene transcription data. Nevertheless, our calculations of growth rates rely on a large number of kinetic constants, and uncertainties in these parameters will introduce uncertainties into all our predictions. However, methods with fewer unknown parameters all have their own drawbacks. Stoichiometrybased methods (e.g. FBA without any additional flux constraints) do not require such parameters, but would not be able to address rate/yield trade-offs because their model assumptions force growth rates and yields to be proportional (see Figure S14). Our method can be compared to variants of Flux Balance Analysis that employ flux cost functions to mimic enzyme cost. For example, the total enzyme demand E tot = l E l of a pathway or network can be approximated by E tot ≈ E lb = l v l k cat,l , which puts a lower bound 1 on the true value of E tot . In EFCM, we could use this linear function E lb (v) instead of the true enzyme cost derived from ECM 2 . However, the approximation would not only yield unrealistically low 1 A similar idea underlies FBA with molecular crowding. In this method, the enzyme demand is not used as an objective to be minimized, but as a variable to be constrained during optimization. Fluxes are constrained by a bound v l ≤ v max l = E l k cat,l on every reaction flux. An upper limit on the sum of enzyme levels Etot,max ≥ Etot = l E l represents the limited space available for enzymes. 2 The resulting method would be equivalent to a flux balance analysis with fixed biomass production rate, no other flux constraints, predictions of enzyme costs, but would do so to different extents across EFMs (see Figure S3). This would distort the prediction of growth-optimizing metabolic strategies.

A kinetic description of enzyme capacity utilization
Recent FBA methods, such as FBA with flux minimization [8], molecular crowding [9], or constrained allocations [10] address rate/yield trade-offs by bounding or minimizing the presumable enzyme demand.
Such methods can predict low-yield flux modes, but they ignore the complex kinetic dependencies between reactions due to shared metabolites. As suggested by our EFCM results, enzymes may work below their maximal rates not because they are deliberately left unused, but because of the fact that enzymes require high substrate and low product levels inorder to be thermodynamically and kinetically efficient. This causes contradicting requirements in different reactions. Even in the best possible compromise, many enzymes will not be efficiently used. The assumption that some (or all) enzymes always operate at their maximal capacity would lead to an overestimation of growth rates. The difference between actual and predicted enzyme levels reflects, among other things, the fact that in each given moment, some enzyme molecules will not be processing a substrate molecule. In [11], these enzymes were called "unused enzyme fraction". As pointed out in the main text, we think that this term is misleading.
Nevertheless, to simplify EFCM, one could assume that enzymes work at a constant capacity utilization: in this case, fluxes and enzyme levels would not be related by rate laws, but by simple proportionality factors. For example, if we assumingd that all enzymes work at their maximal speed (as given by their k cat values), the optimization of metabolite concentrations with respect to the total enyzme cost would become obsolete: using the enzyme molecular masses and k cat values, we could directly translate any flux mode into a total required amount of enzyme by a simple linear formula, which does not depend on metabolite concentrations. This simplified version is used by satFBA [12], with the addition that the cost weights of the transport reactions can be varied to reflect differential saturation of the transporter enzymes, which allows for the investigation of changing external conditions (similar to our Figures S14(a-b) and S15).
In a previous article [7], we showed that the assumption of fully efficient enzymes yields inferior predictions for enzyme concentrations, and as expected, the growth rate prediction is harmed as well. The growth rate would be overestimated by a factor of about 2.4 (see Figure S3) and, more severely, the growth differences between EFMs would be distorted. This overestimation is purely an artifact and has no biological interpretation. Given the overestimation of the growth rate, it seems quite surprising that these methods can predict metabolic states quite effectively (e.g. [13]). In the case of Resource Balance Analysis (RBA) [14], the overestimation of growth rates is avoided by using experimentally measured apparent k cat values, which are lower than the actual k cat values and capture the fact that enzymes work below their full efficiency. In RBA simulations, where growth rate is a simulation parameter, different apparent k cat values are chosen for different growth rates, reflecting the fact that enzyme efficiencies depend on metabolite levels [7], which vary between growth rates.

Extending EFCM to larger models and multiple enzyme cost functions
When extending EFCM to larger networks, one needs to address two main challenges: the numerical effort and data availability. Since the ECM problem for each EFM is convex, it may remain solvable even for large networks (as noted by e.g. [15]), but the EFMs of large networks would greatly increase in number. A subsampling of EFMs [16] could be problematic because, depending on model conditions, the high-growth EFMs may easily be missed (see section 3.1). A promising avenue is to subdivide large networks by defining and a minimization of a weighted sum of fluxes, namely E lb (v).
some key metabolites to be external [17] and controlling their concentrations. The resulting subnetworks can be analyzed independently [18], and their EFMs can be combined to yield favorable flux distributions.
In any case, the second problem remains: when extending the network to more peripheral pathways, these pathways will be poorly covered with kinetic data, and the uncertainties about model parameters will increase. Such uncertainties may have a strong effect on the predictions, suggesting that simpler models that require fewer parameters would be preferable. Indeed, focusing only on core metabolism and lumping together some enzymatic complexes into single steps (such as the phosphotransferase system and the electron transport chain) was a deliberate decision that we took to reduce this parameter space. Possibly, our model could be further simplified without greatly affecting the results.
As another extension of EFCM, one could account for multiple cost functions. In our calculations, enzyme profiles were scored by total enzyme mass as a single cost. In a previous publication, EFMs had been scored by multiple cost terms describing different investments in enzyme production [19]. For each EFM, the required enzymes were scored by five different cost values: the investments in atomic carbon, nitrogen, and sulfur, the total protein chain length, and the total length of the DNA coding for these enzymes, and Pareto optimality was used to assess trade-offs between these costs. However, this previous work assessed only the qualitative proteome (i.e. the set of enzymes required for a given EFM), and not the quantitative enzyme levels. Building on [19], EFCM could be easily extended to incorporate multiple enzyme cost functions to be studied by Pareto optimality.

Model description
This section provides details about our E. coli model: its network structure, the choice of kinetic equations and enzyme parameters, the usage of physical units and the practical calculation of growth rates, the choice of external conditions, and some statistics about elementary flux modes.

Network structure
The network structure of our E. coli model (see Figure S4 and Table S2) is based on the model by Carlson (2004) [20]. For a kinetic model, we decided to split some linear chains of reactions, since each reaction in the chain might have different kinetics. Therefore, we changed several details of the model: we split the lumped reactions R7r, R10, R54r and R55r into separate reaction steps, we added the Entner-Doudoroff pathway (reactions R60 and R61r) and merged some reactions (the new reactions R27 and R27b) (see Table   S3). We kept the ethanol (R90) and CO 2 (R97r) export in the model, but did not consider it in the kinetic calculation of enzyme cost because this is a passive, non-catalyzed process. Finally, we set the internal concentrations of CO 2 and ethanol to 1 mM, which we assumed to be sufficient to allow for an export flux through diffusion. The stoichiometric coefficients for the biomass reaction (R70) were taken from the original model (Table II Figure S4: The reaction network of core carbon metabolism in E. coli. and pyruvate water dikinase (pps)) wastes ATP and can be expected to be suppressed by strict enzyme regulation [22]. Such EFMs were consistently predicted to cause low growth rates and had no effect on the outcomes of our study. Some statistics about the biomass-producing EFMs can be found in Figure S7.

Kinetic equations
We used the same type of rate equations for all reactions [3]. Reversible reactions are marked by an "r" after the reaction number (see Table S2). Reversible (v r ) and irreversible (v i ) reactions are modeled as follows: default k cat max-gr pareto max-yield ana-lac aero-ace exp Figure S5: Growth rates of the focal EFMs, depending on the k cat value of the biomass reaction (R70). In all cases, growth rates increase with the maximal velocity of the biomass reaction. Since kinetics and enyzme demands of the biomass reaction are hard to determine, we chose a high k cat value (dotted line) as a standard parameter. With this value, the enzyme cost of the biomass reaction is negligible and the maximal growth rates are determined by metabolism (and different between EFMs) rather than by biomass production itself. The negligible enzyme cost agrees with our definition of "metabolic enzymes within the proteome" which excludes all proteins involved in macromolecule production.
v r = e r · k cat,r with enzyme concentration e i , substrate concentrations s j , product concentrations p k , and (absolute value of the) stoichiometric coefficient n. The K M values are the Michaelis-Menten constants, the k cat values the turnover numbers, and the K eq values the equilibrium constants. In our model, macromolecule production is quantified by a single biomass production rate v BM . The biomass reaction is a lumped reaction of all processes involved in biomass production, and its rate law represents the action of many (not explicitly modeled) cellular processes operating in steady state [23]: v biom = e biom · k biom cat, Equation (12) corresponds to the equation given in [23] with a single unit in the template (n = 1) and e biom as the template concentration. The s j are the substrate levels of the biomass reaction and K M,sj ,biom their Michaelis-Menten constants. Since this reaction is a lumped reaction that summarizes a wide range of biosynthetic reactions, its catalytic constant does not have a direct biochemical meaning. We thus opted not to give too much cost weight to the biomass production and chose a k cat value that does not affect the rate very much (see Figure S5 and compare with Figure S20). With the catalytic constant chosen, the kinetics of the biomass reaction does not limit growth, i.e. growth control is only exerted by the other, metabolic reactions.

Choice of consistent model parameters by parameter balancing
Parameter balancing [1] was used to translate measured kinetic constants from the literature into a complete and consistent set of model parameters (k cat , K eq , and K M values). Unknown k cat values, for example, were substituted by values around 100 s −1 , adjusted to satisfy Haldane relationships, i.e. the thermodynamicsbased laws that link them to K M values and equilibrium constants. Plausible parameter ranges were used to define prior distributions (e.g. a mean value and a standard deviation for logarithmic k cat values). The median value of 100 s −1 was chosen because k cat values in core metabolism tend to be higher than in metabolism in general (typical value around 10 s −1 ) [24]. and S7 (K eq values).

Calculation of specific growth rate and yield
To calculate the cell growth rate µ (using formulae from Equations (4.6)- (7)), we first translate the biomass flux to actual mass units (e.g. grams) by summing up the molecular masses of the biomass reactants, multiplied by their stoichiometric coefficient. The ATP/ADP and NADH/NAD + cofactor pairs are ignored in this calculation because they produce a negligible amount of biomass. That means that one mole of biomass weighs about 20.7 kg. Details of the mass calculation of the biomass are given in Table S4. Then, we convert the biomass flux v R70 in the model to the biomass production rate v BM in the growth equations: Our model calculates the abundance of each enzyme (e i , in mM) required for realizing this biomass flux. To obtain the total enzyme mass required, we multiply each e i by the enzyme's molecular mass per active site . Therefore, the total enzyme cost will be E met = i e i w i [mg l −1 ]. Finally, the enzyme doubling time is given by According to Equation (7), the doubling time of the cell reads From Equations (4.6), (4) and (13), the growth rate µ can be calculated from total enzyme cost E met and from biomass flux v R70 with the following formula: The biomass yield is expressed in milligram biomass per millimole of carbon atoms uptake. Therefore, we need to convert biomass production to grams and substrate uptake rates to mole carbons. Since glucose molecules contain six carbon atoms, and a mole of biomass weighs 20666 grams (see Table S4), the biomass yield is given by where v pts is the flux in the PTS glucose uptake system (reaction R1 in the model).

Absolute values for enzyme concentrations
Our model predicts enzyme concentrations abundances in mM, but initially these values are not properly scaled. To make all EFMs comparable, we fix the biomass production rate v R70 at a standard value of 1 [mM s −1 ]. Thus, before comparing the predicted enzyme levels with data, we need to compute our hidden scaling factor, the absolute value of v BM . An average bacterial cell consists of about 30% dry matter [25], which translates to a dry density of ρ ≈ 3 × 10 5 [mg l −1 ]. A typical value for the growth rate would be µ ≈ 1 [h −1 ], which would be enough to for a rough estimate of the biomass rate, given by v BM = ρ · µ. Using equation (13) we now obtain This result means that our standard value for v R70 is ∼250 times too high, and so are the nominal estimates for enzyme concentrations. To obtain realistic estimated values, one must therefore divide each e i by 250.
It is important to note that a wrong scaling of v R70 and e i does not affect the predictions about growth rate and yield. One can see that in formulae (2.4) and (2.4) this scaling factor affects both the numerators and the denominators and therefore cancels out. This is not a coincidence, but a feature of the way we calculate growth rate, namely by dividing the rate of biomass production by the required enzyme amount for that specific rate. As an outcome of this independence, one can also use the predicted growth rate directly in equation (18) instead of the unit value (1 [h −1 ]), even though the value of v R70 and the enzyme concentrations were used to calculate the growth rate in the first place.

Choice of standard external conditions
As a standard condition for our simulated cells, we chose a high external glucose concentration of 100 mM.
For oxygen we chose the concentration from the same paper as we used for the kinetics of the oxygen-using reactions [26], namely 0.21 mM. The concentration of ammonia (NH 3 ), another external compound taken up, was set to 10 times more than the highest K M to ensure saturation (1.0 mM). The levels of excreted metabolites were assumed to be low and were set to 0.01 mM, except for ethanol and CO 2 which were set to 1 mM (which are actually internal metabolites, since we treat the export reactions as non-enzymatic).

Details on the elementary flux modes
Some biological and statistical properties of the EFMs are shown in Figures S6, S7, S8, and S9.  [27] represents EFMs by points on a twodimensional plane. It tries to preserve their original distances (i.e. the Euclidean distances in flux space) while spreading the points evenly over the plane. The biomass yield (b) and specific growth rate (c) are represented by a color scale. As expected, similar EFMs have very similar yields (since the yield is one of the dimensions in the flux space), but growth rates can significantly change between neighboring EFMs. In each of the panels (d-i) we show a single feature in color coding, all on the same EFM map. The t-SNE algorithm produces clusters of EFMs related to major uptake and secretion fluxes. Interestingly, all high-growth EFMs are contained in a single cluster (see (c)), even though t-SNE uses no information about enzyme kinetics or our standard growth conditions.  (c) Similarity between EFMs and the measured flux distribution. The color of each EFM in this rate/yield plot corresponds to the Spearman correlation between the fluxes in that EFM and the experimentally measured fluxes from [28]. (d) Similarly, we plot the Spearman correlation between the estimated enzyme levels for each EFM and the measured enzyme levels from [29]. Note that even the point corresponding to exp does not have a correlation of 1, since, even though the fluxes are taken from experiments, the estimated enzyme abundances are still given by the ECM algorithm. Nevertheless, exp is among the EFMs with the highest correlations.  The ED pathway is consistently used by a group of low yield/high growth rate EFMs, but also in other parts of the spectrum. (d) The pentose phosphate pathway is also used throughout, but more in the higher-growth-rate EFMs. (e) The flux in upper glycolysis is reversible and therefore depicted using a red-blue colormap. It is zero for the seven EFMs with the highest growth rate and otherwise usually positive. Only a few EFMs with medium-low growth rates use the reverse direction. (f) The flux through pyruvate dehydrogenase, a large (and therefore costly) enzyme complex, is relatively low for the fastest EFMs.  (a) Acetate secretion, scaled by glucose uptake. Among the acetate-secreting EFMs, higher acetate secretion tends to imply lower biomass yields.
(b) An oxygen uptake around 0.4 [mol O 2 per mol C glucose] maximizes the yield: a lower O 2 uptake implies higher by-product secretion (to make up for the ATP requirements), while at higher uptake rates, more carbon is oxidized and released as CO 2 . (c) The number of active reactions, a simple measure of enzyme demand, shows little correlation with biomass yield. (d) The same holds for the sum of fluxes, scaled by the glucose uptake rate.

Model results
This section provides additional results on the predicted growth rates for different EFMs, under different external conditions (variation of glucose and oxygen levels), different choice of kinetic constants, a restricted usage of pathways, and enzyme knockouts. It also provides results on the necessary enzyme investments (for different EFMs, and under varying external conditions) and on the optimal metabolic strategies in chemostats at different growth rates (as determined from predicted Monod curve parameters).

Growth rates achieved by elementary flux modes
As shown in Figure S10, the predicted cell growth rates do not only vary widely across EFMs, but also show statistical distribution that assume very different shapes depending on biochemical external conditions. At standard (high oxygen) conditions, growth rates are relatively evenly distributed, while under low-oxygen conditions, a large number of EFMs (the oxygen-dependent ones) show very low growth rates, and only a very small percentage comes close to the maximal growth rate. This has practical consequences for modeling: since the number of EFMs can be large, it might be tempting to sample EFMs instead of enumerating them exhaustively, in the hope to find at least some EFMs with high growth rates. However, this approach would have failed in the low-oxygen case, because almost all EFMs yield very unfavorable growth rates. In this specific example, we might try to sample EFMs in such a way that oxygen-dependent EFMs are automatically discarded. However, in general cases, such heuristics may be hard to find. We conclude that a sampling of EFMs may yield some well-performing EFMs (as here, in high-oxygen conditions), but it may also fail (as here, in low-oxygen conditions).

Effect of individual enzyme parameters on cell growth
If we decrease the catalytic constant of an enzyme, all EFMs using this enzyme will show lower growth rates because the lower catalytic efficiency must be compensated by higher enzyme levels to maintain the flux in the EFM (see proof in SI section 4.5); as a consequence, the biomass production per enzyme investment decreases, and so does the growth rate. Figure S20 shows the effects of a simulated change in the catalytic  constant of tpi, the triose-phosphate isomerase. Compared to standard conditions (light grey dots, same as in main text Figure 2(b)), the Pareto front now contains a few EFMs with similar yields, but very different growth rates. The growth rate of max-yield, which uses tpi, is greatly reduced, while the max-gr EFM is not affected since it has no flux through tpi at all. The sensitivities shown in Figure S20 (b) correspond to the curve slopes in Figure S20 (c) at the standard k cat value. The grey shading marks the range of a two-fold increase or decrease around the standard k cat . One can see that the linear approximation is not suitable for much larger changes in k cat , such as the change shown in panel (a) (i.e. a 1000-fold decrease to "low k cat "). The curves in Figure S20(c) are based on a global enzyme re-optimization after parameter changes. Therefore, if we focus on small parameter changes (which allow us to ignore effects higher than first order), the parameter sensitivities can be easily computed. The calculation of sensitivities for k cat values, std. glucose (100 mM) low glucose (0.1 mM) Figure S12: The effect of a lower glucose level on growth. A drop in glucose level (from 100 mM to 0.1 mM) decreases the growth rates of all EFMs, but to different extents. EFMs with low yields are more strongly affected (ana-lac, for example), since the relative glucose uptake rate, and therefore the enzyme burden of the PTS system, is larger. However, the Pareto front changes only slightly, and still consists of a few EFMs (Pareto optimal EFMs are marked by dark triangles). equilbrium constants, and K M values from previously computed optimal enzyme profile for an EFM under standard conditions is described in SI sections 4.2 and 4.3.

Monod curves and optimal EFMs under high-glucose or low-glucose conditions
In a glucose-limited chemostat, different dilution rates will lead to different cell densities and external glucose concentrations. Can we expect that cells use the same metabolic strategies in a wide range of growth rates, or with some strategies perform better at low growth rates and others perform better at high growth rates? Since our model uses glucose concentration as a parameter, EFCM can be used to answer such questions; whereas methods like classical FBA, in which all fluxes scale linearly with the glucose uptake flux, would not be applicable (see Figure S14). To study trade-offs between growth at low and high glucose levels, we first calculated a Monod curve for each individual EFM, assuming a wide range of external glucose concentrations. The Monod curve is typically characterized by the formula is the maximal growth rate, [S] is the concentration of the limiting nutrient (i.e. glucose) and K S is the substrate saturation constant (or "Monod coefficient"). The Monod coefficient denotes the concentration of glucose at which the half-maximal growth rate is reached (µ = 1 2 µ max ), and its reciprocal value can be seen as the cell's overall affinity for glucose. In a chemostat at high dilution rates, cells must grow fast because they can only survive if their maximal growth rate exceeds the dilution rate. At low dilution rates, in contrast, the higher cell density leads to very low glucose levels; in this case, there is a selection for cells that can grow fast at low glucose levels, typically cells with a low Monod constant.
Having computed the growth rates of all EFMs in a wide range of glucose concentrations, either in aerobic or anaerobic conditions, we determined the curve parameters for each EFM (Monod coefficient and maximal growth rate) by fitting the growth curve with a Hill function. Plotting µ max versus the affinity 1/K S and under aerobic conditions, we do find a trade-off ( Figure S19(a)), and under anaerobic conditions, the tradeoff develops becomes more pronounced ( Figure S19(e)). In fact, the growth at low glucose concentration depends both on the Monod constant and on the maximal growth rate; as shown in Figure S19(i), our    Figure S13: Cell growth rate as a function of external glucose and oxygen levels. For each combination of glucose and oxygen levels, maximal growth is realized by one optimal EFM (marked by colors). (a) 23 of the EFMs turn out to be optimal (i.e. have the highest growth rate) at least in some region of the glucose/oxygen plane. (b) Despite the complicated pattern of "winning" EFMs, the optimal growth rate changes smoothly with the glucose and oxygen levels. It is difficult to see any transitions except for one boundary that curves up around the lower right corner. In a chemostat experiment, an imposed growth rate could be realized by various combinations of external glucose and oxygen levels. Which of these combinations arises in the medium depends on the ratio of glucose and oxygen supplies to the chemostat. By plotting different EFM properties (c-h) of the optimal EFMs in this plane, we can see that only anaerobic EFMs are optimal in the low oxygen/high glucose region (d). Interestingly, acetate fermentation is only favorable in a narrow band around this anaerobic region and becomes unfavorable at higher O 2 levels (e).
simulations predict almost no trade-off as long as oxygen levels are high. In anaerobic conditions ( Figure   S19(m)), the trade-off becomes more pronounced, suggesting that the winning strategies depend on the dilution rate.  Figure S14: Optimal EFMs depending on glucose and oxygen concentrations or uptake rates. (a) The optimal EFMs across the glucose/oxygen plane (same figure as S13(a)). (b) The optimal EFM colors overlaid on a 3D surface plot, where height represents the optimal growth rate reached. (c) By changing the axes in panel (a) to the glucose and oxygen uptake rates, we obtain a very different picture. Due to our model assumptions, only EFMs can maximize the growth rate at a given condition, and since each EFM defines a constant ratio of glucose to oxygen uptake, its points are all positioned along a straight line (scaled by the growth rate). Therefore, this scatter plot is very sparse. (d) the same data as in (c), shown as a 3D plot where the z-axis is growth rate. Again, the points of each EFM are aligned since all rates scale linearly with the growth rate.

Growth of strains deficient in EMP glycolysis, ED glycolysis, or respiration
Our analysis of growth-yield trade-offs extends an earlier study of cost/yield trade-offs in ATP production.
Flamholz et al. had studied cost-yield trade-offs between two variants of glycolysis, the Embden-Meyerhof-Parnas (EMP) and the Entner-Doudoroff (ED) pathway [30]. Their model comprised only glycolysis, and ATP yield and enzyme cost were compared between the two pathway variants at a fixed glucose uptake rate. The EMP pathway provides a two-fold yield, but requires even higher amounts of protein per ATP production flux. This leads to a cost-yield trade-off. The authors concluded that, under a high demand for ATP produced in glycolysis, cells should use the EMP pathway, while cells that have other "cheap" sources of ATP (e.g. photosynthesis) should rather use the ED pathway.
With our model, we can now study the choice between pathways more generally. Compared to the glycolysis model in [30], our model has the following additional features: (i) the model includes the TCA cycle and respiration, so choices between fermentation and respiration and choices between EMP and ED pathways glycolysis, ED glycolysis, and respiration) to variants in which one of the three pathways was blocked by a simulated gene deletion (see Figure 6 in main article).
Blocking either the EMP or ED pathway had the most marked effects at low oxygen levels, where respiration is not used and cells rely entirely on glycolysis for their ATP production. Blocking respiration had almost the same effect as setting the external oxygen concentration to very low values (except for the case of extremly low glucose concentrations). This analysis does not require any additional optimization runs. We just need to analyse the existing simulation results while discarding some EFMs, i.e. those that contain a blocked pathway. Note that our three model variants can also be seen as simple models of bacterial species that lack the genes for EMP glycolysis, ED glycolysis, or respiration.
In [30], it was hypothesized that non-respiring cells, which completely depend on ATP generated in glycolysis, should employ the high-yield EMP pathway despite it higher enzyme cost. Here we find the opposite: at (low oxygen and) high glucose levels, EMP and ED pathways yield approximately the same growth rates, and at low-to-medium glucose levels, the ED pathway performs even better, possibly due to ATP cost/yield ratio, which is still better than the cost/yield ratio of the EMP pathway -a conclusion that is in line with the simulation results, but not with the verbal conclusions, in [30].

Epistatic effects between enzyme knockouts
Once growth rates and yields have been computed for all EFMs, simulating single, double, or multiple gene knockouts is really easy -we just need to exclude all EFMs that are affected by a knockout and redo the statistical analysis (e.g. finding the growth-maximising EFM, or determining the Pareto front). By comparing Figure S19: Growth rates under high-glucose or low-glucose conditions. The upper two rows (panels (a)-(h)) show scatter plots between the maximal growth rate at standard conditions (y-axis) and the inverse Monod coefficient. Likewise, the lower two rows (panels (i)-(p)) show scatter plots between the maximal growth rate at standard conditions (y-axis) and at very low glucose levels (1 µM). Odd rows show the distribution of all EFMs in aerobic conditions ([O 2 ] = 0.21 mM) and even rows show anaerobic conditions (only EFMs that do not consume O 2 are shown). The first column (left) displays all EFMs, and the three other columns show oxygen uptake, biomass yield and acetate secretion rate, respectively. In aerobic conditions there is almost no trade-off between the maximal growth rates at ligh or low glucose concentrations. This means that high-glucose conditions and low-glucose conditions often favor the same EFMs. Figure S22 shows predicted epistasis values for growth rate as the selection objective. We find the same lethal knockouts as the yield (compare Figure S23). However, for growth rate there are fewer cases of positive epistatic interactions. After a knockout in lower glycosis (R7-R8), a second knockout in the pentose phosphate pathway (PPP, R11-R15) gives an extra reduction in growth rate (no positive scaled epistasis), while it does not have a big effect on the yield. There is a positive epistasis in the growth rate for the two reactions in the ED pathway (R60 and R61r), while this pathway is not used in high-yield pathways (and therefore no epistasis there). Some of the epistatic interactions for yield reappear at low oxygen levels, because high-yield pathways tend to provide high growth rates at low oxygen levels (panel (d)). It is interesting that some epistatic interactions change signs between different conditions, such as the positive c std. k cat low k cat max-gr pareto max-yield ana-lac aero-ace exp Figure S20: Effects of a varying enzyme parameter on growth. (a) Change in the rate/yield spectrum. We decreased the catalytic constant of triose-phosphate isomerase (tpi) by a factor of 1000 (from its original value 7800 s −1 to 7.8 s −1 ). (b) Sensitivity of the growth rate to the k cat value of tpi for all different EFMs. Error bars depict the change in growth rate caused by a 50% increase or decrease in the k cat value. To compute the sensitivities, we used a formula that assumes a direct compensation of the affected enzyme (see SI section 4.3). (c) Effect of a variable k cat value of tpi (x-axis) on the growth rates of selected EFMs. The 50% range around the standard k cat value is marked by grey shading. The top two EFMs max-gr and pareto do not use tpi at all and are thus insensitive to changes in its k cat . epistasis between uptake (R1) with the ED pathway (R60 and R61r) (panels (b) and (f)) which turns negative at low oxygen levels (panel (d)).
The epistatic effects on yield are shown in Figure S23. Panel (a) shows the relative yields of double knockouts as a fraction of the maximal yield (i.e. the yield of the wild-type). Single knockouts are shown on the diagonal. Obviously, the uptake reaction (R1) and biomass reaction (R70) are essential. As can be seen, a knockout of reactions R21-R24 and R40 is lethal because there is no way to produce 2-oxogluterate, which is needed for the biomass reaction (although this is not directly obvious for reaction R20 and R40).
Other essential reactions are R93 (to obtain ammonia) and R12r (to obtain ribose-5-P). Single knockouts in oxidative phosphorylation (R80) or lower glycolysis (R7/R8) decrease the yield, because those are used by the high-yield EFMs. Some double knockouts are synthetically lethal-i.e. only the double knockout is lethal but each of the single knockouts is viable. These are mostly combinations of knockouts in lower glycolysis, the ED-pathway and the pentose phosphate pathway. A few double knockouts lower the yield dramatically, mostly combinations of the PPP and the TCA cycle. In panel (b), the scaled yield epistatis is calculated with the formula where Y i is the scaled yield of knockout i and Y 1, [31]). Here we can clearly see the synthetically lethal double knockouts and the double knockouts that have a dramatically lower yield in red. In blue (i.e. epistasis score = 1) we find combinations or knockouts where one is dominant, while the other one doesn't reduce the yield any further. For instance, this is the case for sequential reactions in the same pathway (e.g. R10a and R10b).

Computing a stationary flux distribution in a metabolic network from measured fluxes
Assume that absolute fluxes have been measured using 13 C metabolic flux analysis (MFA), but the list of fluxes does not cover all the 52 reactions in our CCM model. We now describe a method to estimate the missing reaction fluxes (and maybe adjust the measured ones) to obtain a consistent flux distribution, i.e. one that satisfies the constrains of the system, including mass balance. One option is to find the flux mode (v) that minimizes the l 1 distance to the measured exchange fluxes (v ± σ) and fulfills mass balance constraints: wherev is an auxiliary variable that represents the actual flux, constrained to be within the confidence interval of the absolute measured fluxes. This linear programming problem may have non-unique solutions.
To reduce the set of solutions, we add a secondary optimization goal, namely to maximize ATP production (i.e. in our model, to maximize the reaction flux v r82 ).
where 1 is the minimum value achieved in the first optimization.

Global cost sensitivities can be approximated by local cost sensitivities
We now compute the sensitivities between enzyme cost and model parameters such as k cat values. In analogy to the local and global flux sensitivities in Metabolic Control Analysis, called elasticities and flux response coefficients, we distinguish local and global sensitivities for enzyme cost at predefined fluxes. From these sensitivities, we can compute local and global sensitivities between parameter perturbations and the cell growth rate. To define local sensitivities, we perturb a parameter and adapt only the enzyme of the affected reaction (while all metabolite levels and fluxes must remain unchanged). To define the global sensitivities, we perturb the same parameter and adapt all enzymes (and, accordingly, all metabolite levels (where the flux must remain the same and the cost-optimal adaptation is chosen). Below we show that, for small parameter changes, local and global adaptation lead to the same first-order cost changes, i.e. that local and global sensitivities are identical.
To define the optimal enzyme cost E opt met and the optimal logarithmic metabolite profile s 0 for a given flux vector v, we solve the enzyme cost minimization problem with an enzymatic cost function E met (v, s, k) derived from a kinetic model. The fluxes v are fixed and given, the vector of logarithmic metabolite levels s can be varied within the metabolite polytope, and the vector k contains model parameters that affect individual reaction rates.
To show that local and global sensitivities are identical, we start from the unperturbed reference values k 0 and the resulting optimal metabolite vector s 0 ; then we expand the enzymatic cost function quadratically around this point with respect to s. To simplify the notation (and without loss of generality), we consider a one-dimensional problem (with scalar logarithmic concentration s): with expansion coefficients a, b, and c. Since we expand around an optimum point, the coefficient b vanishes.
Now we consider a small parameter change dk, which changes the cost landscape (and thereby the expansion coefficients): We compute the new optimum point s opt * by equating the derivative to zero: where we assume that a ≥ 0 (i.e. we assume that we started from a unique optimum state). To compute the resulting cost, we insert s opt * back into the perturbed cost function: second-order terms in dk This means: In other words: the first-order cost change after a parameter perturbation, and with an optimal adaptation of all enzyme levels, is given by the cost change that would ensue from adapting only the affected enzyme (and keeping all metabolite levels unchanged). Note that our first-order expansion holds only for small (relative) parameter changes.

Enzyme cost sensitivities of kinetic constants
The local cost sensitivity (with enzyme cost weight, i.e. molecular mass w) thus reads As we saw before, these (first-order) local cost sensitivities are identical to the (first-order) global cost sensitivities.
Cost sensitivities for K eq values To compute the cost sensitivity of an independent change in one of the K eq values, we first calculate the elasticity where Q(c) is the mass-action ratio, and η sat (c) represents the saturation efficiency (which does not depend on K eq ). Using the definition η thr = 1 − Q(c)K −1 eq , we can rewrite this result as Note that this formula refers to individual changes in K eq values; it does not account for dependencies between K eq values due to Wegscheider conditions.

Cost sensitivities for K M values (reaction substrates)
The expression for the cost sensitivity of a K M value is given by where j is the index for c for all substrates and k for all products. We will calculate the sensitivity to changes in the Michaelis-Menten constant of a substrate s, which we denote by K s ≡ K M,cs . For this cost function, the K M sensitivity would be If s is the only substrate for this reaction, we are left with Again, this formula refers to individual changes in K M values; it does not account for dependencies due to Haldane relationships.

Cost sensitivities for K M values (reaction products)
We start again with equation (30) and take the derivative with respect to K p ≡ K M,cp -the Michaelis-Menten constant to one of the products. In this case, the sensitivity is given by and in case there is only one substrate and one product, this simplifies to (34)

Growth-rate sensitivities of kinetic constants
Since we use a nonlinear function to convert enzyme cost (E met ) into growth rate (µ), the sensitivities of µ to local changes in a kinetic constant (k) require an additional prefactor, i.e. according to the chain rule: For the second term on the right, one can use the results from the previous section on cost sensitivities (SI section 4.3), replacing k with either k cat , K s , K p , or K eq . The first term on the right does not depend on the type of kinetic constant, and can be expressed as a function of µ. To compute it, we must first express µ as a direct function of E met : Therefore, the derivative would be: Quite often, it is useful to calculate the scaled sensitivity: If a kinetic parameter changes from a reference value k 0 to a value k, we can use the first-order Taylor expansion for µ in log-scale:

Increasing a k cat value cannot increase the enzyme demand
Proposition 1 A local increase in an enzyme's k cat value can never increase the optimal total enzyme cost.
Proof Let E opt met (v, k) ≡ min s E met (v, s, k) be the minimal enzyme cost required (optimized over all metabolite level profiles s) for a given flux vector v and kinetic parameter set k. Letk be another kinetic parameters set which is identical to k except for the k cat of a single enzyme i, specificallyk cat,i > k cat,i . Now, for any metabolite profile s, if we compare E met (v, s,k) to E met (v, s, k) the only change would be the cost associated with that one enzyme, so Therefore, s E met (v, s,k) ≤ E met (v, s, k) holds for all metabolute profiles s, and this inequality will also trivially apply to the minima of both functions.

Approximation formula for Monod curves and higher-dimensional Monod functions
A Monod function -the cellular growth rate as a function of extracellular metabolite concentrations -can have a complicated shape. However, we can obtain a simple formula with interpretable parameters based on the logic of EFCM and on some simplifying assumptions: 1. We assume that the cell is bound to use a single, predefined flux mode (e.g. an EFM), and compute the Monod surface for this flux mode.
2. We assume that the metabolic enzymes occupy a fixed fraction of the biomass.
3. As an approximation, we assume that all internal metabolite levels remain constant despite changes of the external nutrient levels; this implies that (given our flux mode) all internal enzyme levels are constant, too, and that only the transporter levels are changing -the same type of approximation that was made in satFBA [12].
The first assumption will later be dropped, and the second one can be avoided 3 . To derive our formula, we consider a kinetic model and split the total cost of metabolic enzymes into where w t and w i denote cost weights (e.g. protein masses) for transporters and metabolic enzymes, respectively. With the yields y t = v bm vt for the different substances taken up, the external substance concentrations x t , and the transporter rate laws v t = c trans where the expressions in round brackets are constant for the given flux mode. From now on we call these expressions α t and β for ease of notation. The α t depend directly on the shape of the flux mode, while β depends also on kinetics and needs to be computed by ECM. In a full EFCM analysis, β would also depend on the external concentrations x t , but based on Assumption 3, we treat it as constant, and compute it once for a typical choice of external concentrations. Now we can compute the cell growth rate: where γ (the fraction of metabolic enzyme within the biomass) is assumed to be constant 4 . In analogy to Eq. (), we can express it as γ = f ccm f prot . Now we choose rate laws for the transporters. We first assume an irreversible mass-action rate law f t (x t ) = k t x t with a rate constant k t . Inserting this into Eq. (44), we where α t = α t /k t . Alternatively, we may assume an irreversible Michaelis-Menten transporter kinetics f t (x t ) = kt xt Kt+xt with rate constant k t and Michaelis-Menten constant K t . Inserting this into Eq. (44), we that is, the same form as before, but with parameters α = αt Kt kt and β = β + t αt kt . By comparing this to the classical hyperbolic Monod curve In summary, this means: for a single external metabolite, we obtain a classical hyperbolic Monod curve of the form µ = µ max x k+K Monod , with the effective parameters µ max = γ/β and K Monod = α/β, where primes (in α and β) have been omitted for generality. For Monod functions with several external metabolites, we obtain a similar form; a simple hyperbolic Monod curve with an effective external concentration x eff = [ t αt xt ] −1 , given by a weighted harmonic mean of the actual external concentrations x t . For both types of rate laws, the parameters are directly obtained from a single ECM run with the model inquestion, and for a reference state with typical external concentrations.
The function parameters depend on the flux mode assumed, and so far, we assumed that this flux mode was fixed and given (Assumption 1). Now we drop this assumption and assume that the cell can choose its optimal flux mode for each external condition (combination of external concentrations). This is quite easy to model: we know that the optimal flux modes are EFMs. For each EFM, we obtain a Monod function of the above form, and to obtain the actual Monod function (assuming an optimal choice of fluxes), we just take the maximum among all these functions. With an index j for EFMs, the final Monod function reads: where primes (in α and β) have again been omitted. Note that, again, this Monod function holds both for linear and for Michaelis-Menten transporter kinetics (which will lead to different choices of the parameter values).
Let us now revisit our above assumptions. Assumtion 1 has already been dropped. As noted before, assumption 2 can also be dropped, leading to a simple nonlinear scaling of our formula. Regarding transporter kinetics, we considered two possible rate laws, and the extension of our formula to other rate laws is straightforward. Thus, we are only left with Assumption 3, the assumption that internal metabolite and enzyme levels are assumed to be constant and that changes in external concentrations are solely compensated by changing transporter levels. This is the main difference between our simple Monod formula and a full calculation by EFCM. How can this assumption be justified? The parameters in our simplified formula are based on a reference state with typical external concentrations and the ensuing optimal enzyme levels.
If the external substance concentrations are changing, an adaptation of the transporter level only is always possible, but more costly than an optimal global enzyme adjustment. Therefore, starting from our reference state (and computing the Monod parameters in this state), our Monod function will yield a lower bound on the actual Monod function (which is based on a global adjustment). Furtermore, as pointed out in the section on sensitivities, the difference between a local and a distributed, global enzyme adjustent (and thus, the difference between the simplified and the true Monod function) is a second-order effect: in the reference state, the two functions yield the same value (by construction), and close the reference state, the differences will be small. This also means: if we are interested in a specific region of the external concentration space, we can always improve our approximations by choosing a reference state in that region.       Table S9: Cofactor pairs. In the model, some cofactors are not newly synthesized by reactions, but only interconverted in pairs. We have fixed the total concentration for these metabolites. max-gr pareto max-yield Figure S24: Flux modes and EFMs near the Pareto front. To determine the shape of the Pareto front (containing any flux modes, not only EFMs), we first considered EFMs that are either Pareto-optimal EFMs or among the highest-ranking EFMs in terms of yield or growth rate. Then we sampled flux modes that are convex combinations of a few of these EFMs. Grey dots represent the samples, light red dots are EFMs, large red squares are Pareto-optimal EFMs and black squares are those sampled nonelementary flows that turned out to be Paretooptimal. Pareto-optimal EFMs must also be Pareto-optimal among EFMs, i.e. they must not be dominated by any other EFM; however, the converse need not hold in all cases: there can be EFMs that are Pareto-optimal among EFMs, but not among all flux modes. Note that the figure is zoomed to show the entire span of the Pareto front in detail.  S10: Details on selected EFMs representing different growth strategies. Metabolic fluxes are given in carbon moles (or O 2 moles) per carbon mole of glucose taken up. The flux fraction through the pentose phosphate pathway (PPP) is defined as the ratio R1/R10a (see Figure S4 for the reaction numbers in the network).  Figure S26: The flux distribution of pareto (EFM #1218).  Figure S30: The flux distribution of exp.