Conceived and designed the experiments: ADM EM. Performed the experiments: DDM MF. Analyzed the data: DDM MF. Contributed reagents/materials/analysis tools: DDM MF ADM EM. Wrote the paper: DDM MF ADM EM.

The authors have declared that no competing interests exist.

The integration of various types of genomic data into predictive models of biological networks is one of the main challenges currently faced by computational biology. Constraint-based models in particular play a key role in the attempt to obtain a quantitative understanding of cellular metabolism at genome scale. In essence, their goal is to frame the metabolic capabilities of an organism based on minimal assumptions that describe the steady states of the underlying reaction network via suitable stoichiometric constraints, specifically

The operation of biological systems is constrained under all circumstances by the laws of physics. Thermodynamics, in particular, dictates preferential directions in which biochemical reactions should flow at stationarity. When applied to cellular reaction systems (like metabolic networks), it favors the emergence of some (thermodynamically feasible) ways to organize the flow of matter while prohibiting others. The development of detailed predictive models for the biochemical activity of a cell relies on the possibility of integrating the laws of thermodynamics into genome-scale reconstructions of cellular metabolic networks. In this work we have devised an efficient relaxation algorithm to implement thermodynamic constraints in genome-scale models. Besides allowing one to check the thermodynamic feasibility of reaction flow configurations, it can also provide information on other relevant physico-chemical quantities. We have applied it to two cellular metabolic networks of different complexity, namely that of human red blood cells and that of the bacterium

Constraint-based models of cellular metabolism are important tools to analyze and predict the chemical activity and response to perturbations of cells without relying on kinetic details that are often unavailable. In such frameworks, the metabolic capabilities of a cell are inferred from the overall configuration space compatible with minimal physico-chemical constraints describing the non-equilibrium steady state of the underlying reaction network. First, feasible reaction flux vectors need to satisfy mass-balance conditions. Then, according to the second law of thermodynamics, in an open chemical network at steady state and constant temperature and pressure the direction of each reaction should ensure a decrease in Gibbs energy. Thermodynamic consistency of flux configurations satisfying mass-balance alone is in general not guaranteed due to the presence of infeasible cycles

Much work has been concerned with implementing thermodynamic constraints in genome-scale models of metabolism. The removal of thermodynamic inconsistencies has proved useful in estimating concentrations and affinities besides fluxes in Flux-Balance-Analysis

The scalability of algorithms to solve mixed integer-linear (or non-linear) programming problems may become an issue when the underlying network size is large or when one is interested in sampling the solution space (for both free energies and fluxes) rather than focusing on a potentially small set of configurations (e.g. optima). Luckily, however, solutions to computationally hard problems can often be generated efficiently with the help of heuristic algorithms based on simple local rules. The use of message-passing algorithms to characterize the high-dimensional volume of the solution space of FBA models

Our goal in this paper is to obtain information about the landscape of Gibbs free energies compatible with a given vector of reaction directions, following a route that allows us to use all stoichiometric information via heuristics inspired by perceptron learning. In a nutshell, the method we propose exploits the network's structure to iteratively build up correlations between the chemical potentials of the reacting species, starting from a seed of empirical biochemical knowledge, until a thermodynamically consistent profile is achieved. The resulting algorithm is completely scalable and can be employed for different purposes, like checking the feasibility of flux configurations, identifying and removing infeasible cycles, estimating reaction affinities, and obtaining bounds for (log-)concentrations and free energies of formation.

In the following, we describe the method in detail, providing a mathematical proof of convergence as well as theoretical arguments highlighting the main idea behind the procedure. As applications, we focus on two metabolic networks of rather different complexity. First, we shall obtain a detailed reconstruction of the Gibbs energy landscape underlying metabolic activity in the human red blood cell (hRBC) starting from the flux maps obtained in

The cellular systems analyzed in this study are (i) the model of the hRBC metabolism developed in

According to the second law of thermodynamics, in an open system at constant temperature

Generate a chemical potential vector

Compute

If

If

Given a stoichiometric matrix

As is generally true in MinOver schemes, the reinforcement term in (4) drives the gradual adjustment of chemical potentials by ensuring that, at every iteration, the least satisfied constraint (labeled
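A MinOver-style update of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: we assume that constraint (3) requires the Gibbs energy change of each reaction to be negative in its assigned direction, i.e. that the overlap x_i = -s_i Σ_m S[m][i] μ_m is positive for every reaction i with direction s_i = ±1, and that the reinforcement step adds a fixed fraction of the least satisfied constraint's pattern to the chemical potentials. The function name `minover_potentials` and the parameters `step`, `margin` and `max_iter` are hypothetical.

```python
def minover_potentials(S, directions, mu0, step=0.1, margin=0.0, max_iter=10000):
    """MinOver-style search for chemical potentials (illustrative sketch).

    S          : stoichiometric matrix as a list of rows (metabolites x reactions)
    directions : +1 (forward) or -1 (backward) for each reaction
    mu0        : initial chemical potentials (the "prior" seed)
    Returns (mu, iterations) on success, (None, max_iter) otherwise.
    """
    n_met, n_rxn = len(S), len(directions)
    # Pattern of reaction i: minus the signed stoichiometric column, so that
    # a positive overlap with mu means a negative Gibbs energy change.
    xi = [[-directions[i] * S[m][i] for m in range(n_met)] for i in range(n_rxn)]
    mu = list(mu0)
    for it in range(max_iter):
        overlaps = [sum(x * u for x, u in zip(xi[i], mu)) for i in range(n_rxn)]
        # Least satisfied constraint (smallest overlap).
        worst = min(range(n_rxn), key=overlaps.__getitem__)
        if overlaps[worst] > margin:
            return mu, it  # every constraint holds with the required margin
        # Reinforce the least satisfied constraint only.
        mu = [u + step * x for u, x in zip(mu, xi[worst])]
    return None, max_iter


if __name__ == "__main__":
    # Toy chain A -> B -> C (rows: A, B, C; columns: the two reactions).
    S_chain = [[-1, 0], [1, -1], [0, 1]]
    mu, n_steps = minover_potentials(S_chain, [1, 1], [0.0, 0.0, 0.0])
    print(mu, n_steps)
```

On the toy chain the returned potentials decrease along the direction of flow. If the direction assignment contains an infeasible cycle, the least satisfied overlap can never become positive and the sketch returns `None` once `max_iter` is reached.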

In essence, the final outcome of multiple (random) initializations of the above algorithm is a set of

If there is no prior information on the direction of some reactions (e.g. because they are putatively reversible), the corresponding constraints (3) are formally absent, as if

Finally, some observations are in order about the solution space of (3), which in general has the form of an unbounded cone passing through the origin. If one is interested in uniformly sampling the space of

A number of interesting theoretical and computational questions arise at this stage, regarding e.g. the minimal amount of prior information on chemical potentials needed to bound the solution space of (3), or computationally efficient and scalable methods to obtain uniform sampling (besides Monte Carlo, which may be infeasible at high dimensions as suggested by the “curse of dimensionality”, see e.g.

The algorithm just discussed generates chemical potential vectors given a thermodynamically feasible vector of reaction directions. A generic assignment of reaction directions, however, could be such that the system (3) has no solutions apart from the trivial one. In accordance with the Farkas-Minkowski lemma
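A worked toy example may help to see why some direction assignments admit no solution. It assumes that (3) requires a negative Gibbs energy change for every reaction in its assigned direction; the three-reaction cycle and the symbols below are illustrative and not taken from the networks studied here. Consider the reactions A → B, B → C and C → A, all assigned the forward direction:

```latex
\begin{align}
  \mu_B - \mu_A &< 0 && (A \to B) \\
  \mu_C - \mu_B &< 0 && (B \to C) \\
  \mu_A - \mu_C &< 0 && (C \to A)
\end{align}
% Summing the three inequalities, all chemical potentials cancel:
\begin{equation}
  (\mu_B - \mu_A) + (\mu_C - \mu_B) + (\mu_A - \mu_C) = 0 < 0 ,
\end{equation}
% which is a contradiction, so no chemical potential vector satisfies all three.
```

The non-negative combination of reactions that produces the cancellation (here, each reaction taken once) is precisely a closed loop in the network. Reversing any one of the three reactions removes the contradiction.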

While running the algorithm to compute chemical potentials for a large number of iteration steps

Search, within such subset of reactions, for a loop of length

If a loop is found, change the direction of one of its reversible reactions chosen with uniform probability.
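The search-and-flip steps above can be illustrated with the following sketch. It rests on simplifying assumptions that are ours: a loop is taken to be a set of reactions whose signed stoichiometric columns sum to zero with unit coefficients, and the search is a plain exhaustive enumeration, which stands in for whatever search strategy is actually used. The names `find_unit_loop` and `flip_random_reversible` are hypothetical.

```python
import itertools
import random


def find_unit_loop(S, directions, candidates, length):
    """Return a tuple of `length` reactions from `candidates` whose signed
    columns sum to zero (unit coefficients assumed), or None if there is none.
    Exhaustive search: only sensible for small candidate sets."""
    n_met = len(S)
    for combo in itertools.combinations(candidates, length):
        if all(sum(directions[i] * S[m][i] for i in combo) == 0
               for m in range(n_met)):
            return combo
    return None


def flip_random_reversible(directions, loop, reversible, rng=random):
    """Flip, in place, the direction of one reversible reaction of `loop`,
    chosen with uniform probability. Returns its index, or None if the loop
    contains no reversible reaction."""
    choices = [i for i in loop if i in reversible]
    if not choices:
        return None
    i = rng.choice(choices)
    directions[i] = -directions[i]
    return i


if __name__ == "__main__":
    # Toy cycle A -> B -> C -> A (rows: A, B, C; columns: the three reactions).
    S = [[-1, 0, 1], [1, -1, 0], [0, 1, -1]]
    directions = [1, 1, 1]
    loop = find_unit_loop(S, directions, candidates=[0, 1, 2], length=3)
    flipped = flip_random_reversible(directions, loop, reversible={2})
    print(loop, flipped, directions)
```

In an actual run the candidate set would be the subset of least satisfied reactions singled out by the previous step, and the search would be repeated for increasing loop lengths.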

In the

A computer code implementing the algorithms to compute chemical potentials and identify and remove infeasible loops is downloadable

As a first application, we have employed the MinOver scheme outlined above to analyze the thermodynamic landscape of the hRBC metabolic network. As a starting point, we have considered the flux configurations obtained in

according to

according to

As a first step, we tested the thermodynamic feasibility of these direction assignments, solving (3) by starting from the vector

Results for the estimated concentrations and Gibbs energy changes (computed from (2) using the final chemical potential vectors) are showcased in

The input information used to initialize the algorithm (with error bars) is denoted by black markers (see text for details). Values obtained starting from direction assignments corresponding to a sample of

The input information used to initialize the algorithm (with error bars) is denoted by black markers; the values obtained starting from direction assignments corresponding to MBE and VNC solutions are shown respectively as red and green markers. Note that the input information is consistent with reactions operating in the reverse direction for GAPDH, PGK, PGM, LDH, G6PDH, TA and PNPase. The algorithm is able to correct these inconsistencies starting from both MBE- and VNC-compatible direction assignments.

We can now quantify the extent to which the solutions we generate are “close” to the prior. In

The results obtained starting from direction assignments corresponding to MBE solutions are shown here for three different methods: (a) MinOver with

The application that we have just discussed shows that the algorithm we present can provide information on the Gibbs energy landscape, even correcting inconsistent input knowledge. We shall now employ the procedure outlined in the

Since we are not focusing on the reconstruction of the Gibbs energy landscape but simply on the existence of solutions of (3), a detailed biochemical prior is not needed. Therefore, for the present purposes we have taken

Convergence times shown are for the identification and elimination of the thermodynamically infeasible loops and for the verification of thermodynamic feasibility of randomly generated flux configurations from the

We have furthermore studied the set of loops that were thus identified and corrected. Quite remarkably, this analysis revealed that thermodynamic infeasibility is related to the presence of a small set of cycles, 23 in total. These are reported in

Rectangles (resp. ellipses) denote metabolites (resp. reactions). The cycles depicted here are n. 8 (A, top left), 18 (B, top right) and 22 (C, bottom) from

| Cycle ID | Length | Formula |
| --- | --- | --- |
| 1 | 3 | SERt4pp |
| 2 | 3 | NAt3pp |
| 3 | 3 | NAt3pp |
| 4 | 3 | NAt3pp |
| 5 | 3 | PROt4pp |
| 6 | 3 | HPYRRx |
| 7 | 3 | CRNDt2rpp(R) |
| 8 | 3 | VPAMT |
| 9 | 3 | ABUTt2pp |
| 10 | 3 | NAt3pp |
| 11 | 3 | ADK3(R) |
| 12 | 3 | ACCOAL |
| 13 | 3 | PPAKr(R) |
| 14 | 4 | ACt4pp |
| 15 | 4 | CA2t3pp |
| 16 | 4 | SERt4pp |
| 17 | 4 | GLUt4pp |
| 18 | 4 | CA2t3pp |
| 19 | 4 | THRt4pp |
| 20 | 5 | ADK1(R) |
| 21 | 5 | R15BPK |
| 22 | 6 | R1PK |
| 23 | 6 | ADK3(R) |

We remark that the corrected flux configurations thus obtained, like the starting ones (which were drawn from a uniform product measure over reactions), are not guaranteed to be consistent with any steady state assumption. On the other hand, see Supporting

Ideally, constraint-based models of metabolic activity allow one to appraise the energetic potential of cells based on minimal constraints related to local mass-balance and thermodynamic feasibility rules, possibly complemented with optimization principles that can encode functional constraints. As a result, the flow of matter in non-equilibrium steady states could be characterized in terms of the Gibbs energy change of reactions, which specifies the directionality of interconversions, and of the average number of turnovers per time per volume, i.e. the flux, without the need for detailed information on enzyme kinetics or transport mechanisms. Thermodynamic constraints, strongly linked to overall intracellular conditions like ionic strength and pH

Many important steps have been taken recently to tackle it. At one level, thermodynamic feasibility can be translated into topological constraints (‘absence of loops’) for the flux configuration emerging from mass-balance constraints

The usefulness of the algorithm in the analysis of genome-scale networks has been tested in two different cases. For the metabolic network of the human red blood cell, our approach has proved capable of reconstructing the Gibbs energy landscape while correcting inconsistent prior information. In turn, this has led to predictions for intracellular metabolite levels. It is important to stress that the bounds on concentrations we have obtained (which vary rather heterogeneously across compounds) only reflect stoichiometric information. For the metabolic network of

The main advantage of our method lies, in our view, in its efficient implementation. On the critical side, we point to two aspects that deserve further study. In the first place, our tool requires flux configurations as inputs, i.e. it is still unable to produce thermodynamically feasible configurations of fluxes and chemical potentials without a prior reversibility hypothesis. However, it may provide the basis of a more general procedure for the analysis of genome-scale metabolic networks that couples flux- and thermodynamic profiling, a challenging open problem in computational biology. Secondly, our method relies on prior biochemical information, and it would be desirable for it to be effective even if many or most biochemical priors are unknown. As we pointed out, some information has to be injected into the problem for the sake of definiteness. The interesting question is therefore what minimal prior is needed to reconstruct the Gibbs energy landscape, and how predictions are affected by restricted priors. Such problems are mathematical in nature and could bear a particularly high significance for modeling purposes.

Stoichiometric matrix of the human Red Blood Cell metabolic network employed in this study.

(TXT)

Thermodynamic potentials for the human Red Blood Cell metabolic network employed in this study.

(XLS)

Detailed information about infeasible cycles (reactions and metabolites) identified in the

(XLS)

A toy example of the computation of chemical potentials by MinOver with loop correction.

(PDF)

The IIT Platform “Computation” is gratefully acknowledged.