Spontaneous Reaction Silencing in Metabolic Optimization

Metabolic reactions of single-cell organisms are routinely observed to become dispensable or even incapable of carrying activity under certain circumstances. Yet, the mechanisms as well as the range of conditions and phenotypes associated with this behavior remain very poorly understood. Here we predict computationally and analytically that any organism evolving to maximize growth rate, ATP production, or any other linear function of metabolic fluxes tends to significantly reduce the number of active metabolic reactions compared to typical nonoptimal states. The reduced number appears to be constant across the microbial species studied and just slightly larger than the minimum number required for the organism to grow at all. We show that this massive spontaneous reaction silencing is triggered by the irreversibility of a large fraction of the metabolic reactions and propagates through the network as a cascade of inactivity. Our results help explain existing experimental data on intracellular flux measurements and the usage of latent pathways, shedding new light on microbial evolution, robustness, and versatility for the execution of specific biochemical tasks. In particular, the identification of optimal reaction activity provides rigorous ground for an intriguing knockout-based method recently proposed for the synthetic recovery of metabolic function.


Number of active reactions in typical steady states
The mass balance constraints $Sv = 0$ define the linear subspace $\mathrm{Nul}\,S = \{v \in \mathbb{R}^N \mid Sv = 0\}$ (the null space of $S$), which contains the feasible solution space $M$. However, the set $M$ can be smaller than $\mathrm{Nul}\,S$ because of the additional constraints arising from the environmental conditions (the availability of substrates in the medium, reaction irreversibility, and cell maintenance requirements). Therefore, $M$ may have smaller dimension than $\mathrm{Nul}\,S$. If we denote the dimension of $M$ by $d$, there exists a unique $d$-dimensional linear submanifold of $\mathbb{R}^N$ that contains $M$, which we denote by $L_M$. We can then use the Lebesgue measure naturally defined on $L_M$ to make probabilistic statements, since we can define the probability of a subset $A \subseteq M$ as the Lebesgue measure of $A$ normalized by the Lebesgue measure of $M$. In particular, we say that $v_i \neq 0$ for almost all $v \in M$ if the set $\{v \in M \mid v_i = 0\}$ has Lebesgue measure zero on $L_M$. An interpretation of this is that $v_i \neq 0$ with probability one for an organism in a random state under given environmental conditions. Using this notion, we prove the following theorem on the reaction fluxes.
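These notions can be made concrete numerically. The sketch below (a hypothetical toy stoichiometric matrix, using scipy's null_space; not taken from the original text) samples a random steady state from $\mathrm{Nul}\,S$ and shows that a flux which can vanish only on a measure-zero subset is indeed nonzero for a randomly drawn state:

```python
import numpy as np
from scipy.linalg import null_space

# Toy stoichiometric matrix (hypothetical, for illustration): two metabolites,
# three fluxes. Mass balance alone forces v_2 = 0 in every steady state.
S = np.array([[1.0, -1.0, 0.0],
              [0.0,  0.0, 1.0]])

Z = null_space(S)                 # orthonormal basis of Nul S = {v | Sv = 0}
d = Z.shape[1]                    # here M = Nul S, so d = dim M
rng = np.random.default_rng(1)
v = Z @ rng.standard_normal(d)    # a "random" steady state v in Nul S

print(d)                          # dimension of the solution space
print(np.isclose(v, 0.0))         # v_2 vanishes identically; v_0, v_1 do not
```

In this toy example the always-inactive flux $v_2$ is zero for every sampled state, while the other components are nonzero with probability one, mirroring the two categories of Theorem 1.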
Theorem 1. For each flux $i$, either $v_i = 0$ for all $v \in M$, or $v_i \neq 0$ for almost all $v \in M$.

Sketch of proof. Suppose $v_i \neq 0$ for some $v \in M$. Then $L_M$ is not contained in the hyperplane $\{v \in \mathbb{R}^N \mid v_i = 0\}$, so the intersection of $L_M$ with this hyperplane is a proper linear submanifold of $L_M$ and has Lebesgue measure zero on $L_M$. The set $\{v \in M \mid v_i = 0\}$ is contained in this intersection and therefore also has Lebesgue measure zero. Therefore, we have $v_i \neq 0$ for almost all $v \in M$.
Theorem 1 implies that we can group the reactions and exchange fluxes into two categories:
1. Always inactive: $v_i = 0$ for all $v \in M$.
2. Almost always active: $v_i \neq 0$ for almost all $v \in M$.
Consequently, the number $n_+(v)$ of active reactions satisfies

$n_+(v) = N - n_0^m - n_0^e$ for almost all $v \in M$,

where $n_0^m$ is the number of inactive reactions due to the mass balance constraints (characterized by Theorem 2) and $n_0^e$ is the number of additional reactions in category 1 above, which are inactive due to the environmental conditions. Combining this result with the finding that optimal states have fewer active reactions (see the main text), it follows that a typical state $v \in M$ is non-optimal.

Inactive reactions due to mass balance constraints
Let us define the stoichiometric coefficient vector $s_i$ of reaction $i$ to be the $i$th column of the stoichiometric matrix $S$. We similarly define the stoichiometric coefficient vector of an exchange flux. If the stoichiometric vector of reaction $i$ can be written as a linear combination of the stoichiometric vectors of reactions/exchange fluxes $i_1, i_2, \ldots, i_k$, we say that $i$ is a linear combination of $i_1, i_2, \ldots, i_k$. We use this linear relationship to completely characterize the set of all reactions that are always inactive due to the mass balance constraints, regardless of any additionally imposed constraints, such as the availability of substrates in the medium, reaction irreversibility, cell maintenance requirements, and optimum growth condition.

Theorem 2. There exists a state $v$ satisfying $Sv = 0$ with $v_i \neq 0$ if and only if $s_i$ is a linear combination of $s_k$, $k \neq i$.

To prove the forward direction in this statement, suppose that $v_i \neq 0$ in a state $v$ satisfying $Sv = 0$. By writing out the components of the equation $Sv = 0$ and rearranging, we get $v_i s_i = -\sum_{k \neq i} v_k s_k$. Since $v_i \neq 0$, we can divide this equation by $v_i$ to see that $s_i$ is a linear combination of $s_k$, $k \neq i$. To prove the backward direction, suppose that $s_i = \sum_{k \neq i} c_k s_k$. If we choose $v$ so that $v_k = c_k$ for $k \neq i$ and $v_i = -1$, then for each $j$ we have $(Sv)_j = \sum_k S_{jk} v_k = \sum_{k \neq i} c_k S_{jk} - S_{ji} = 0$, so $Sv = 0$ while $v_i \neq 0$.
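The criterion of Theorem 2 can be tested numerically with a rank comparison: $s_i$ lies in the span of the other columns if and only if appending $s_i$ to them does not increase the rank. The following sketch (always_inactive_fluxes is a hypothetical helper name, and the toy matrix is illustrative) identifies the fluxes forced to zero by mass balance alone:

```python
import numpy as np

def always_inactive_fluxes(S, tol=1e-10):
    """Indices i whose column s_i is NOT a linear combination of the other
    columns of S; by Theorem 2, such fluxes vanish in every state Sv = 0."""
    S = np.asarray(S, dtype=float)
    inactive = []
    for i in range(S.shape[1]):
        others = np.delete(S, i, axis=1)
        # s_i lies in span(others) iff appending it leaves the rank unchanged
        if np.linalg.matrix_rank(np.column_stack([others, S[:, i]]), tol=tol) \
                > np.linalg.matrix_rank(others, tol=tol):
            inactive.append(i)
    return inactive

# Toy network: s_0 = -s_1 (so fluxes 0 and 1 can be active), while s_2 is
# linearly independent of the rest, forcing v_2 = 0 in every steady state.
S = np.array([[ 1.0, -1.0, 0.0],
              [ 0.0,  0.0, 1.0]])
print(always_inactive_fluxes(S))
```

A rank test of this kind scales poorly for genome-scale networks (one rank computation per column); it is meant only to make the linear-algebraic characterization explicit.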

Number of active reactions in optimal states
The linear programming problem for finding the flux distribution maximizing a linear objective function can be written in the matrix form

maximize $c^T v$ subject to $Sv = 0$ and $Av \leq b$,   (3)

where $A$ and $b$ are defined as follows. If the $i$th constraint is $v_j \leq \beta_j$, the $i$th row of $A$ consists of all zeros except for the $j$th entry, which is 1, and $b_i = \beta_j$. If the $i$th constraint is $\alpha_j \leq v_j$, the $i$th row of $A$ consists of all zeros except for the $j$th entry, which is $-1$, and $b_i = -\alpha_j$. A constraint of the type $\alpha_j \leq v_j \leq \beta_j$ is broken into two separate constraints and represented in $A$ and $b$ as above. The inequality between vectors is interpreted componentwise, so if the rows of $A$ are denoted by $a_1^T, a_2^T, \ldots, a_K^T$ (where $a_i^T$ denotes the transpose of $a_i$), $Av \leq b$ represents the set of $K$ constraints $a_i^T v \leq b_i$, $i = 1, \ldots, K$. By defining the feasible solution space $M = \{v \in \mathbb{R}^N \mid Sv = 0,\ Av \leq b\}$, the problem can be compactly expressed as maximizing $c^T v$ in $M$.
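As a concrete illustration, the following sketch (a hypothetical toy instance, not taken from the text) encodes the flux bounds into $A$ and $b$ exactly as described and solves problem (3) with scipy.optimize.linprog, which minimizes by convention, so $c$ is negated:

```python
import numpy as np
from scipy.optimize import linprog

# Toy instance (hypothetical numbers): one metabolite, three irreversible
# fluxes with v_0 = v_1 + v_2; maximize c^T v = v_1.
S = np.array([[1.0, -1.0, -1.0]])
alpha = np.array([0.0, 0.0, 0.0])       # lower bounds (irreversibility)
beta  = np.array([10.0, 5.0, 8.0])      # upper bounds
c = np.array([0.0, 1.0, 0.0])

# Each bound alpha_j <= v_j <= beta_j becomes two rows of A v <= b
N = len(c)
A = np.vstack([np.eye(N), -np.eye(N)])  # v <= beta and -v <= -alpha
b = np.concatenate([beta, -alpha])

# linprog minimizes, so minimize -c^T v; bounds live in A, b already
res = linprog(-c, A_ub=A, b_ub=b, A_eq=S, b_eq=np.zeros(S.shape[0]),
              bounds=(None, None), method="highs")
print(res.x)        # an optimal flux distribution
print(-res.fun)     # the optimal objective value c^T v
```

Here the optimum is limited by the bound $v_1 \leq 5$, so the maximal objective value is 5.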
The duality principle (Best & Ritter, 1985) states that any linear programming problem (the primal problem) is associated with a complementary linear programming problem (the dual problem), and that the solutions of the two problems are intimately related. The dual problem associated with problem (3) is

minimize $b^T u_1$ subject to $A^T u_1 + S^T u_2 = c$ and $u_1 \geq 0$,

where $\{u_1, u_2\}$ is the dual variable. A consequence of the Strong Duality Theorem (Best & Ritter, 1985) is that the primal and dual solutions are related via a well-known optimality condition: $v$ is optimal for problem (3) if and only if there exists $\{u_1, u_2\}$ such that

$A^T u_1 + S^T u_2 = c$, $u_1 \geq 0$,   (6)
$Sv = 0$, $Av \leq b$,   (7)
$u_1^T (b - Av) = 0$.   (8)

Note that each component of $u_1$ can be positive or zero, and we can use this information to find a set of reactions that are forced to be inactive under optimization, as follows. For any given optimal solution $v_0$, the complementary slackness condition (8) implies that $u_{1i} > 0$ forces the $i$th constraint to be binding, $a_i^T v_0 = b_i$. In particular, if an irreversible reaction ($v_i \geq 0$) is associated with a positive dual variable ($u_{1i} > 0$), then the irreversibility constraint is binding, and the reaction is inactive ($v_i = 0$) at $v_0$. In fact, we can say much more: we prove the following theorem stating that such a reaction is actually required to be inactive for all possible optimal solutions for a given objective function $c^T v$.
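Complementary slackness can be verified numerically. In the sketch below (same hypothetical toy instance as above), the dual variable $u_1$ is recovered from the solver's marginals; the sign flip is an assumption about the minimization convention used by scipy's HiGHS backend, where marginals of upper-bound inequality constraints are nonpositive:

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy instance: maximize v_1 subject to v_0 = v_1 + v_2,
# 0 <= v <= (10, 5, 8), with bounds encoded as rows of A v <= b.
S = np.array([[1.0, -1.0, -1.0]])
A = np.vstack([np.eye(3), -np.eye(3)])
b = np.concatenate([[10.0, 5.0, 8.0], np.zeros(3)])
c = np.array([0.0, 1.0, 0.0])

res = linprog(-c, A_ub=A, b_ub=b, A_eq=S, b_eq=[0.0],
              bounds=(None, None), method="highs")

# Dual of A v <= b; negated because the solver minimized -c^T v
u1 = -res.ineqlin.marginals
# Complementary slackness (8): positive dual component => binding constraint
for i, ui in enumerate(u1):
    if ui > 1e-9:
        assert abs(A[i] @ res.x - b[i]) < 1e-6

print(u1)
```

Any row of $A$ carrying a positive dual component is tight at the optimum, which is exactly the mechanism that forces irreversible reactions with $u_{1i} > 0$ to be inactive.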
Theorem 3. Suppose $\{u_1, u_2\}$ is a dual solution corresponding to an optimal solution of problem (3). Then, the set $M_{opt}$ of all optimal solutions of (3) can be written as

$M_{opt} = \{v \in \mathbb{R}^N \mid Sv = 0,\ Av \leq b,\ a_i^T v = b_i \text{ for all } i \text{ such that } u_{1i} > 0\}$,   (9)

and hence every constraint associated with a positive dual component is binding for all optimal solutions in $M_{opt}$.

Sketch of proof.
Let $v_0$ be the optimal solution associated with $\{u_1, u_2\}$ and let $Q$ denote the right-hand side of (9). Any $v \in Q$ is an optimal solution of (3), since straightforward verification shows that it satisfies (6-8) with the same dual solution $\{u_1, u_2\}$. Thus, we have $Q \subseteq M_{opt}$.
Conversely, suppose that $v$ is an optimal solution of (3). Then, $v$ can be shown to belong to $H$, which we define to be the hyperplane that is orthogonal to $c$ and contains $v_0$, i.e., $H = \{v \in \mathbb{R}^N \mid c^T v = c^T v_0\}$. This, together with the fact that $v$ satisfies $Sv = 0$ and $Av \leq b$, and with (6), can be used to show that $v \in Q$. Therefore, any optimal solution must belong to $Q$. Putting both directions together, $M_{opt} = Q$.

Thus, once we solve problem (3) numerically and obtain a single pair of primal and dual solutions ($v_0$ and $\{u_1, u_2\}$), we can use the characterization of $M_{opt}$ given in Eq. (9) to identify all reactions that are required to be inactive (or active) for all optimal solutions. To do this, we solve the following auxiliary linear optimization problems for each $i = 1, \ldots, N$: maximize $v_i$ over $v \in M_{opt}$ and minimize $v_i$ over $v \in M_{opt}$. If the maximum and minimum of $v_i$ are both zero, then the corresponding reaction is required to be inactive for all $v \in M_{opt}$. If the minimum is positive or the maximum is negative, then the reaction is required to be active. Otherwise, the reaction may be active or inactive, depending on the choice of an optimal solution. Thus, we obtain the numbers $n_+^{opt}$ and $n_0^{opt}$ of reactions that are required to be active and inactive, respectively, for all $v \in M_{opt}$. The number of active reactions for any $v \in M_{opt}$ is then bounded as

$n_+^{opt} \leq n_+(v) \leq N - n_0^{opt}$.   (12)

The distribution of $n_+(v)$ within these bounds is singular: the upper bound in Eq. (12) is attained for almost all $v \in M_{opt}$. To see this, we apply Theorem 1 with $M$ replaced by $M_{opt}$.
This is justified since we can obtain $M_{opt}$ from $M$ by simply imposing additional equality constraints. Therefore, if we set aside the $n_0^{opt}$ reactions that are required to be inactive (including the $n_0^m$ and $n_0^e$ reactions that are inactive for all $v \in M$), all the other reactions are active for almost all $v \in M_{opt}$. Consequently, $n_+(v) = N - n_0^{opt}$ for almost all $v \in M_{opt}$.

We can also use Theorem 3 to further classify those inactive reactions caused by the optimization as due to two specific mechanisms:
1. Irreversibility. The irreversibility constraint ($v_i \geq 0$) on a reaction can be binding ($v_i = 0$), which directly forces the reaction to be inactive for all optimal solutions. Such inactive reactions are identified by checking the positivity of the dual components $u_{1i}$.
2. Cascading. All other reactions that are required to be inactive for all $v \in M_{opt}$ are due to a cascade of inactivity triggered by the first mechanism, which propagates over the metabolic network via the stoichiometric and mass balance constraints.
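The auxiliary max/min problems over $M_{opt}$ can be sketched as follows (optimal_activity_classes is a hypothetical helper name; the toy instance is illustrative). Since every optimal solution satisfies $c^T v = c^T v_0$, $M_{opt}$ is imposed here by appending that single equality to the constraints:

```python
import numpy as np
from scipy.optimize import linprog

def optimal_activity_classes(S, A, b, c, tol=1e-9):
    """Classify each flux over M_opt by solving max/min v_i subject to
    Sv = 0, Av <= b, and c^T v = c^T v_0 (the optimal objective value)."""
    b_eq = np.zeros(S.shape[0])
    res = linprog(-c, A_ub=A, b_ub=b, A_eq=S, b_eq=b_eq,
                  bounds=(None, None), method="highs")
    opt_val = c @ res.x
    # Constrain to M_opt: add the equality c^T v = opt_val
    S_opt = np.vstack([S, c])
    b_opt = np.append(b_eq, opt_val)
    classes = []
    for i in range(len(c)):
        e = np.zeros(len(c)); e[i] = 1.0
        vmax = -linprog(-e, A_ub=A, b_ub=b, A_eq=S_opt, b_eq=b_opt,
                        bounds=(None, None), method="highs").fun
        vmin = linprog(e, A_ub=A, b_ub=b, A_eq=S_opt, b_eq=b_opt,
                       bounds=(None, None), method="highs").fun
        if abs(vmax) < tol and abs(vmin) < tol:
            classes.append("inactive")   # required inactive for all of M_opt
        elif vmin > tol or vmax < -tol:
            classes.append("active")     # required active for all of M_opt
        else:
            classes.append("either")
    return classes

# Toy instance: v_0 = v_1 + v_2, 0 <= v <= (10, 5, 8), maximize v_1
S = np.array([[1.0, -1.0, -1.0]])
A = np.vstack([np.eye(3), -np.eye(3)])
b = np.concatenate([[10.0, 5.0, 8.0], np.zeros(3)])
c = np.array([0.0, 1.0, 0.0])
print(optimal_activity_classes(S, A, b, c))
```

In this toy example $v_1 = 5$ on all of $M_{opt}$, $v_0 = 5 + v_2$ stays positive, and $v_2$ can range over $[0, 5]$, so the classification is active, active, either.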
In general, a given solution of problem (3)

Typical linear objective functions
Since the feasible solution space $M$ is convex, its "corner" can be mathematically formulated as an extreme point, defined as a point $v \in M$ that cannot be written as $v = ax + by$ with $a + b = 1$, $0 < a < 1$, and $x, y \in M$ such that $x \neq y$. Intuition from the two-dimensional case (Fig. S1) suggests that for a typical choice of the objective vector $c$ such that Eq. (3) has a solution, the solution is unique and located at an extreme point of $M$. We prove here that this is indeed true in general, as long as the objective function is bounded on $M$, and hence an optimal solution exists.

[Figure S1: Optimum is typically achieved at a single extreme point. The only exception is when the objective vector $c$ is in the direction perpendicular to an edge, in which case all points on the edge are optimal.]

Suppose that the optimal solution $v$ is unique but is not an extreme point, so that $v = ax + by$ with $a + b = 1$, $0 < a < 1$, and $x, y \in M$, $x \neq y$. Since $v$ is the unique optimal solution, both $x$ and $y$ must be suboptimal, and hence we have $c^T x < c^T v$ and $c^T y < c^T v$. Then $c^T v = a\,c^T x + b\,c^T y < (a + b)\,c^T v = c^T v$, a contradiction, so the unique optimum must be an extreme point.
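This typicality can be observed numerically. The sketch below (an illustrative demo, not from the text) draws random objective directions for the unit square in $\mathbb{R}^2$ and checks that the optimum always lands on one of the four corners, i.e., an extreme point:

```python
import numpy as np
from scipy.optimize import linprog

# Feasible region: the unit square {v in R^2 | 0 <= v <= 1},
# encoded as A v <= b as in the text.
A = np.vstack([np.eye(2), -np.eye(2)])
b = np.concatenate([np.ones(2), np.zeros(2)])
corners = {(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)}

rng = np.random.default_rng(0)
for _ in range(100):
    c = rng.standard_normal(2)   # a "typical" objective direction
    v = linprog(-c, A_ub=A, b_ub=b, bounds=(None, None), method="highs").x
    # The maximizer of c^T v is a corner for almost every direction c
    assert tuple(np.round(v, 9)) in corners
```

The measure-zero exceptions are directions perpendicular to an edge, which a continuous random draw avoids with probability one.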