The Protein Cost of Metabolic Fluxes: Prediction from Enzymatic Rate Laws and Cost Minimization

doi:10.1371/journal.pcbi.1005167

Fig 1.

Enzyme cost in metabolism.

(a) Measured enzyme levels in E. coli central metabolism (molecule counts displayed as rectangle areas). Colors correspond to the network graphics in Fig 3. To predict such protein levels, and to explain the differences between enzymes, we start from known metabolic fluxes and assume that these fluxes are realized by a cost-optimal distribution of enzyme levels. (b) Enzyme-specific flux depends on a number of physical factors. Under ideal conditions, an enzyme molecule catalyzes its reaction at a maximal rate given by the enzyme’s forward catalytic constant (top left). The rate is reduced by microscopic reverse fluxes (center left) and by incomplete saturation with substrate (causing waiting times between reaction events) or by allosteric inhibition or incomplete activation (bottom left). With lower catalytic rates (center), realizing the same metabolic flux requires larger amounts of enzyme (right).

Fig 2.

Enzyme demand in a metabolic pathway.

(a) Pathway with reversible Michaelis-Menten kinetics. Equilibrium constants, catalytic constants, and K_M values are all set to 1; [A] and [B] denote the variable concentrations of intermediates A and B in mM. The external metabolite levels [X] and [Y] are fixed. Plots (b)-(d) show the enzyme demand of reactions 1, 2, and 3 at a given flux v = 1 according to Eq (2). Grey regions represent infeasible metabolite profiles. At the edges of the feasible region (where A and B are close to chemical equilibrium), the thermodynamic driving force goes to zero. Since small forces must be compensated by high enzyme levels, edges of the feasible region are always dark blue. For example, in reaction 1 (panel (b)), enzyme demand increases with the level of A (x-axis) and goes to infinity as the mass-action ratio [A]/[X] approaches the equilibrium constant (where the driving force vanishes). (e) Total enzyme demand, obtained by summing all enzyme levels. The metabolite polytope (the intersection of the feasible regions of all reactions) is a triangle, and enzyme demand is a convex function on this triangle. The point of minimum total enzyme demand defines the optimal metabolite levels and optimal enzyme levels. (f) As the k_cat value of the first reaction is lowered by a factor of 5, states close to the triangle edge of reaction 1 become more expensive and the optimum point is shifted away from that edge. (g) The same model with a physiological upper bound on the concentration [A]. The bound defines a new triangle edge. Since this edge is not caused by thermodynamics, it can contain an optimum point at which driving forces are far from zero and enzyme costs are kept low.
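
The behavior described in panels (e) and (f) can be reproduced with a small numerical sketch. It is only an illustration under stated assumptions: the external levels `X` and `Y` are hypothetical values (the caption does not give them), the reversible Michaelis-Menten rate with all constants equal to 1 is taken as v = E ⋅ k_cat ⋅ (s − p)/(1 + s + p), and a plain grid search stands in for a proper convex optimizer.

```python
import math

# Hypothetical fixed external concentrations in mM (not given in the caption).
X, Y = 10.0, 0.1
V = 1.0  # pathway flux, as in the caption


def demand(s, p, v=V, kcat=1.0):
    """Enzyme level needed to carry flux v through S <=> P.

    Assumes reversible Michaelis-Menten kinetics with K_eq = K_M = 1,
    i.e. v = E * kcat * (s - p) / (1 + s + p).
    """
    if s <= p:  # no positive driving force: infeasible metabolite profile
        return math.inf
    return v * (1.0 + s + p) / (kcat * (s - p))


def total_demand(a, b, kcat1=1.0):
    """Sum of the enzyme demands of reactions X->A, A->B, and B->Y."""
    return demand(X, a, kcat=kcat1) + demand(a, b) + demand(b, Y)


def grid_optimum(kcat1=1.0, n=200):
    """Grid search on log-concentrations; returns (cost, [A], [B])."""
    lo, hi = math.log(Y), math.log(X)
    grid = [math.exp(lo + (hi - lo) * i / n) for i in range(1, n)]
    return min((total_demand(a, b, kcat1), a, b) for a in grid for b in grid)


cost, a_opt, b_opt = grid_optimum()
cost_slow, a_slow, b_slow = grid_optimum(kcat1=0.2)  # k_cat of reaction 1 lowered 5-fold
```

Comparing `a_opt` with `a_slow` shows the shift described in panel (f): with a slower first enzyme, the optimal [A] moves to lower values, away from the edge where reaction 1 approaches equilibrium.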

Table 1.

Mathematical symbols used.

The fitness unit Darwin (D) is a proxy for the different fitness units used in cell models. Reactions must be oriented in such a way that all fluxes are positive. To define metabolite log-concentrations, we use the standard concentration c_σ = 1 mM. For a more comprehensive list of mathematical symbols used in ECM, see Table C in S1 Text.

Table 2.

Simplified enzyme cost functions.

By omitting some terms in Eq (5), we obtain a number of cost functions with simple dependencies on enzyme parameters and metabolite levels. Terms marked by ✓ appear explicitly in the rate and cost formulae, while other terms are omitted or set to constant values. The EMC0 function yields the sum of fluxes. EMC1 functions contain enzyme-specific flux burdens based on k_cat and h values (i.e., reaction rates are replaced by their maximal velocities). EMC2 functions depend on metabolite levels only via the driving forces. EMC3 functions are based on simplified rate laws, and EMC4 functions capture all rate laws, possibly including allosteric regulation. The rate law denominators D^S, D^SP, D^1S, and D^1SP, and the EMC functions themselves, are described in Table A in S1 Text.

Fig 3.

Predicted enzyme levels in E. coli central metabolism.

(a) Network model with pathways marked by colors. Flux magnitudes are represented by the arrows’ thickness. (b) The ratio flux/k_cat (EMC1) as a predictor for enzyme levels. Points on the dashed line would represent precise predictions. (c) Enzyme levels predicted by the reversibility-based EMC2(S) function. Vertical bars indicate tolerance ranges obtained from a relaxed optimality condition (allowing for a one percent increase in total enzyme cost). (d) Enzyme levels predicted with the EMC3 function representing fast substrate or product binding. (e) Enzyme levels predicted with the EMC4 function based on the common modular rate law [21]. In all sub-figures (b-e), RMSE is the root mean squared error (in log₁₀-scale) of our predictions compared to the measured enzyme levels, and r stands for the Pearson correlation coefficient. Predictions are based on fluxes from [52] and K_M values from BRENDA [40], and are compared to protein data from [53]. For metabolite predictions, see Figure E in S1 Text.
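
The two quality measures quoted in panels (b)-(e) can be computed as in the following sketch. The function names are illustrative, and applying the Pearson correlation to log₁₀-transformed values is an assumption here; the caption only states that the RMSE is taken on log₁₀-scale.

```python
import math


def log10_rmse(predicted, measured):
    """Root mean squared error between log10(predicted) and log10(measured)."""
    diffs = [math.log10(p) - math.log10(m) for p, m in zip(predicted, measured)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def log10_pearson_r(predicted, measured):
    """Pearson r on log10-transformed levels (assumed convention, see lead-in)."""
    return pearson_r([math.log10(p) for p in predicted],
                     [math.log10(m) for m in measured])
```

An RMSE of 1 on this scale means that predictions are off by a factor of 10 on average (in the root-mean-square sense), which is why the measure is convenient for enzyme levels spanning several orders of magnitude.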

Fig 4.

Prediction uncertainties and evidence for cost optimality.

(a) Uncertainty of predicted enzyme levels due to uncertain model parameters. A hundred sets of kinetic model parameters were generated by Monte Carlo sampling. Due to the multivariate distribution used for sampling, each parameter set satisfies the Haldane relationships. At the same time, fluxes were sampled according to their experimental error bars (typically around 15% of the measured flux), and the fixed metabolite concentrations were randomly varied in a ± 5% range. The resulting predicted enzyme levels, computed using the EMC4cm score, are shown by small gray dots. Solid blue circles show medians, and error bars show 25% and 75% quantiles; empty red circles show the original EMC4cm prediction, i.e. without sampling. (b) The enzyme levels in E. coli appear to be cost-optimized. We compared the ECM solution (with EMC4cm score) to enzyme profiles obtained from metabolite profiles randomly sampled in the metabolite polytope. The ECM solution (red) or metabolite profiles sampled in a close neighborhood (pink) yield significantly better enzyme predictions (quantified by RMSE, compare Fig 3) than metabolite profiles sampled in the entire polytope (light blue). The total enzyme cost (on x-axis) represents the sum of weighted enzyme concentrations (in mM); the weight of an enzyme is given by its amino acid chain length, divided by the median chain length of all enzymes considered.
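
The weighting used for the total enzyme cost in panel (b) amounts to a short calculation, sketched below. The function names and any numbers passed to them are hypothetical; only the rule itself (chain length divided by the median chain length, then a weighted sum of concentrations) is taken from the caption.

```python
from statistics import median


def enzyme_weights(chain_lengths):
    """Weight of each enzyme: its chain length over the median chain length."""
    med = median(chain_lengths)
    return [length / med for length in chain_lengths]


def total_enzyme_cost(concentrations_mM, chain_lengths):
    """Sum of enzyme concentrations (mM), each weighted by relative chain length."""
    weights = enzyme_weights(chain_lengths)
    return sum(w * c for w, c in zip(weights, concentrations_mM))
```

With this normalization an enzyme of median length has weight 1, so the total cost stays on the same mM scale as the concentrations themselves.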

Fig 5.

Enzyme demand in central metabolism.

(a) Measured fluxes for all reactions (black dots on top) lead to an enzyme demand (bottom). The enzyme demand, predicted by using the reversibility-based EMC2s cost function, can be split into factors representing enzyme capacity and thermodynamics (see Methods). Bars show predicted enzyme levels in mM for individual enzymes on logarithmic scale. Yellow dots denote measured enzyme levels (in μM). Note that the bars do not represent additive costs, but multiplicative cost terms on logarithmic scale; therefore, the relevant feature of the blue bars is not their absolute lengths, but their differences between enzymes. (b) The kinetics-based EMC4cm cost function includes saturation terms and yields more accurate predictions. Starting from the capacity cost (in blue), the reversibility (purple) and saturation (red) terms increase the enzyme demands and decrease the variability between enzymes (on log-scale). Note that flux data (circles) and protein data (yellow dots) are identical in both plots.

Fig 6.

Rate law and enzyme demand of reversible Michaelis-Menten reactions.

For a reaction S ⇌ P with reversible Michaelis-Menten kinetics, a driving force θ = −Δ_rG′/RT, and a prefactor for non-competitive allosteric inhibition, the rate law can be written as v = E ⋅ k_cat ⋅ η^rev ⋅ η^kin, where E is the enzyme level, k_cat is the forward catalytic constant, and the inhibitor concentration x enters through η^kin. In the example, with non-competitive allosteric inhibition, the kinetic factor η^kin can be split further into a product η^sat ⋅ η^reg. The first two terms in our example, E ⋅ k_cat, represent the maximal velocity (the rate at full substrate saturation, no backward flux, and full allosteric activation), while the following factors decrease this velocity for different reasons: the factor η^rev describes a decrease due to backward fluxes (see Figure A in S1 Text), and the factor η^kin describes a further decrease due to incomplete substrate saturation and allosteric regulation (see Fig 1b). The inverses of all these terms appear in the equation for enzyme demand, q, which is given by the enzyme level multiplied by the burden of that enzyme, h_E.
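
The factorization can be made concrete with a short sketch. The explicit expressions below are the standard textbook forms for a one-substrate, one-product reversible Michaelis-Menten reaction with non-competitive inhibition and are assumptions of this example, as are the parameter names `K_S`, `K_P`, `K_I`, and `K_eq`; the caption itself only names the factors.

```python
import math


def efficiency_factors(s, p, x, K_S, K_P, K_I, K_eq):
    """Return (eta_rev, eta_sat, eta_reg) for S <=> P with inhibitor level x."""
    theta = math.log(K_eq * s / p)       # driving force theta = -Delta_r G' / RT
    eta_rev = 1.0 - math.exp(-theta)     # loss due to backward flux
    eta_sat = (s / K_S) / (1.0 + s / K_S + p / K_P)  # incomplete saturation
    eta_reg = 1.0 / (1.0 + x / K_I)      # non-competitive inhibition prefactor
    return eta_rev, eta_sat, eta_reg


def rate(E, kcat_fwd, s, p, x, K_S, K_P, K_I, K_eq):
    """Rate v = E * kcat_fwd * eta_rev * eta_sat * eta_reg."""
    eta_rev, eta_sat, eta_reg = efficiency_factors(s, p, x, K_S, K_P, K_I, K_eq)
    return E * kcat_fwd * eta_rev * eta_sat * eta_reg


def enzyme_demand(v, kcat_fwd, h_E, s, p, x, K_S, K_P, K_I, K_eq):
    """Enzyme demand q = h_E * E, with E obtained by inverting the rate law.

    Only meaningful for a positive driving force (K_eq * s > p).
    """
    eta_rev, eta_sat, eta_reg = efficiency_factors(s, p, x, K_S, K_P, K_I, K_eq)
    E = v / (kcat_fwd * eta_rev * eta_sat * eta_reg)
    return h_E * E
```

Because each factor lies between 0 and 1 whenever the driving force is positive, every factor that is taken into account can only raise the predicted enzyme demand for a given flux.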

Fig 7.

The conversion between fluxes and enzyme levels, in both directions.

(a) Starting from the logarithmic enzyme level (dashed line on top), we add the terms log k_cat, log η^rev, and log η^kin, and obtain increasingly accurate approximations of the rate. In the example shown, k_cat has a numerical value smaller than 1. The more precise approximations (with more terms) yield smaller rates. The EMC4 arrows refer to other possible rate laws with additional terms in the denominator. (b) Enzyme demand is shaped by the same factors (see Eq (5)). Starting from a desired flux (bottom line), the predicted demand increases as more terms are considered.
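
On a logarithmic scale, the stepwise refinement in panel (a) is a running sum. The sketch below uses made-up numbers for the enzyme level and the three factors, chosen only so that the catalytic constant is below 1 as in the caption's example.

```python
import math
from itertools import accumulate


def successive_log_rates(E, kcat_fwd, eta_rev, eta_kin):
    """Running sums log E, + log k_cat, + log eta_rev, + log eta_kin."""
    terms = [math.log(E), math.log(kcat_fwd), math.log(eta_rev), math.log(eta_kin)]
    return list(accumulate(terms))


# Hypothetical values for illustration only.
approx = successive_log_rates(E=1.0, kcat_fwd=0.5, eta_rev=0.8, eta_kin=0.4)
```

Each entry of `approx` is the logarithm of a rate estimate that accounts for one more term. Since every added logarithm is negative here, the estimates decrease step by step, and the last entry is the logarithm of the full rate.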

Fig 8.

Data integration in the ECM-based modeling workflow.

After collecting all available kinetic and thermodynamic data and mapping them onto the network model, we use parameter balancing to obtain a consistent, complete set of kinetic constants. For a fully parameterized kinetic model, the metabolite and enzyme levels must be determined. We compute them by enzyme cost minimization with predefined metabolic fluxes (obtained from experiments or computationally). Finally, the predicted values are validated with measured metabolite and protein concentrations.
