Conceived and designed the experiments: FA RB. Performed the experiments: FA. Analyzed the data: FA EJK BMB MPB RB. Contributed reagents/materials/analysis tools: FA RB. Wrote the paper: FA EJK BMB MPB RB.

¶ Membership of The SilicoTryp Consortium is provided in the Acknowledgments.

The authors have declared that no competing interests exist.

Kinetic models of metabolism require detailed knowledge of kinetic parameters. However, due to measurement errors or lack of data this knowledge is often uncertain. The model of glycolysis in the parasitic protozoan

An increasing number of mathematical models are being built and analysed in order to obtain a better understanding of specific biological systems. These quantitative models contain parameters that need to be measured or estimated. Because of experimental errors or lack of data, our knowledge about these parameters is uncertain. Our work explores the effect of including these uncertainties in model analysis. To this end, we studied a particularly well curated model of the energy metabolism of the parasite

Kinetic models of metabolism require quantitative knowledge of detailed kinetic parameters (e.g. maximum reaction rates, enzyme affinities for substrates and regulators). However, our knowledge about these parameters is often uncertain. When the parameters are measured, various sources of error can affect the results: experimental noise at the technical and biological levels, systematic bias introduced by parameters being measured

Here we present an analysis of the effect of parameter uncertainties on a particularly well defined example of a quantitative metabolic model: the model of glycolysis in bloodstream form

Abbreviations: Metabolites: Glc-6-P = Glucose 6-phosphate, Fru-6-P = Fructose 6-phosphate, Fru-1,6-BP = Fructose 1,6-bisphosphate, DHAP = dihydroxyacetone phosphate, GA-3-P = glyceraldehyde 3-phosphate, Gly-3-P = glycerol 3-phosphate, 1,3-BPGA = 1,3-bisphosphoglycerate, 3-PGA = 3-phosphoglycerate, 2-PGA = 2-phosphoglycerate, PEP = phosphoenolpyruvate. Reactions: 1 = transport of glucose across the cytosolic membrane, 2 = transport of glucose across the glycosomal membrane, 3 = hexokinase, 4 = phosphoglucose isomerase, 5 = phosphofructokinase, 6 = aldolase, 7 = triosephosphate isomerase, 8 = glyceraldehyde 3-phosphate dehydrogenase, 9 = phosphoglycerate kinase, 10 = transport of 3-PGA across the glycosomal membrane, 11 = phosphoglycerate mutase, 12 = enolase, 13 = pyruvate kinase, 14 = transport of pyruvate across the cytosolic membrane, 15 = glycerol 3-phosphate dehydrogenase, 16 = glycerol kinase, 17 = DHAP-Gly-3-P antiporter, 18 = glycerol-3-phosphate oxidation, 19 = ATP utilisation, 20 = adenylate kinase.

Explicitly considering the uncertainties of parameters in the analysis of the model allowed us to gain interesting new insights into its behaviour. Most importantly, our analysis allowed us to quantify the degree of confidence concerning diverse properties of the system, including the hierarchy of control which is relevant for prioritizing potential drug targets. The resulting quantitative profile of model uncertainties, including the identification of major fragilities and areas in need of further examination, provides a solid basis for future model extensions. These will in turn introduce new uncertainties and should be dealt with using the same general framework established here.

In order to specify the uncertainty associated with each parameter, we gathered all available information relating to the sources of the values used in the model. Information included data on how kinetics were measured, the number of replicates and the standard error of mean values when available, additional calculations used to estimate the parameter from the observed values, and any “corrections” for additional factors such as temperature or pH. For this purpose, we created the “SilicoTryp” wiki, a MediaWiki-based (

Each reaction of the model has its own page. On this page, the rate equation is specified and a table includes all parameters with their detailed source and calculations when necessary.

From the information collected, probability distributions could be inferred for each parameter as described in

To model the effect of uncertainty, we sampled values for each parameter according to its probability distribution, generating an ensemble of alternative models. Together these alternative models accurately represent our degree of uncertainty about the correct parameters, assuming that our knowledge of each parameter value is independent of the other parameters (see

The first property of the models that we analysed is whether or not steady-state is reached in a reasonable time. Our simulation uses the steady-state of the model with the fixed set of parameters to set the initial concentrations of the metabolites. From this initial state, each model is simulated until steady-state is reached. Considering the generous threshold we set for these simulations, steady-state should be reached rapidly. Yet, only 33% of the 10,000 models reached steady-state within 50 simulated minutes, and only 36% within 300 simulated minutes. As shown in

The contour lines indicate when steady-state was reached (in minutes of simulated time). If steady-state had not been reached by then, simulations were stopped at 300 minutes (see

In bloodstream form

This property is well-conserved in all our models using the full range of plausible parameter values (see

The glycolytic flux is defined as the sum of the fluxes producing glycerol and pyruvate. The black lines represent the percentage of the glycolytic flux in the pyruvate branch (top) and the glycerol branch (bottom) in the fixed parameter model. The red line is the distribution of the percentage of the glycolytic flux in the collection of models generated from the parameter probability distributions. The division of the flux between the pyruvate branch and the glycerol branch is well conserved. The effect of the uncertainties of the parameters is almost non-existent in anaerobic conditions (simulated by setting the glycerol 3-phosphate oxidase

Indeed, anaerobically, the flux distribution is entirely determined by the topology and stoichiometry of the model: the 6-carbon product derived from glucose (fructose 1,6-bisphosphate) is split into two 3-carbon products by aldolase. Anaerobically, the NADH formed in the pyruvate branch can only be reoxidized to

The fixed-parameter version of the model (i.e. the model with the set of parameters defined to be as close as possible to the model described in

Using our collection of models, we are able to see the effect of parameter uncertainties on the steady-state concentration estimates. Considering only the models that reach steady-state within 300 simulated minutes, several cases can be distinguished (see

For many metabolites, steady-state concentrations are well-conserved in all plausible models and their distribution is approximately log-normal: glucose 6-phosphate, fructose 6-phosphate, glycosomal glyceraldehyde 3-phosphate, cytosolic and glycosomal dihydroxyacetone phosphate and glycerol 3-phosphate,

For several metabolites, steady-state concentrations do not follow approximate log-normal distributions, although they are distributed within a range of values consistent with physiological metabolite concentrations. These include glycosomal and cytosolic ATP, ADP, AMP, glycosomal and cytosolic glucose, fructose 1,6-bisphosphate and glycosomal 1,3-bisphosphoglycerate. For example, the concentrations of glycosomal ATP and AMP are predicted to be between 0 and 6 mM (see

For two metabolites, 3-phosphoglycerate and pyruvate, the steady-state concentration distribution has a long, heavy tail, indicating that some combinations of plausible parameter values can lead to extreme predicted concentrations (several hundreds to thousands of

The cytosolic 2-phosphoglycerate and glycosomal ATP steady-state concentrations are consistent with physiological metabolite concentration, whereas 3-phosphoglycerate and pyruvate sometimes reach hundreds of millimoles per liter. The value for the fixed parameter model is indicated by a vertical black line.

The accumulation of 3-phosphoglycerate (3-PGA) and/or pyruvate to unreasonable concentrations causes some models to reach steady-state at extremely high concentrations or to fail to reach steady-state within 300 minutes. This occurs when the maximum reaction rates (

(A) Percentage of models that reach steady-state within 300 minutes as a function of phosphoglycerate mutase

Vanderheyden et al. have measured the pyruvate efflux

Adding alanine aminotransferase into the model would require adding several other reactions as well: the production and recycling of 2-oxoglutarate and glutamate need to be incorporated, as well as the export of alanine

Control coefficients are one of the most important high-level properties of kinetic models of metabolism: they allow the quantification of how much influence each reaction has on the flux of the pathway. In the glycolytic model of

Using our collection of models, we calculated the control coefficients for every reaction and every model (see

(A) Percentage of models which have either the glucose transporter (GlcTc), phosphoglycerate mutase (PGAM) or glyceraldehyde 3-phosphate dehydrogenase (GAPDH) as the reaction with the highest control coefficient either over the glucose consumption flux (red) or the oxygen consumption flux (blue). (B) Percentage of models vs. the number of reactions that have a control coefficient higher than 0.001. The color inside the bars represents the proportion that has either the glucose transporter, PGAM or GAPDH as the reaction with the highest control coefficient over the glucose consumption flux within these subgroups.

In 1999, Bakker et al.

In the fixed-parameter model, shared control was only seen at glucose concentrations higher than 5 mM

Dynamic models of metabolism are powerful tools to infer interesting and often unexpected properties of cellular physiology. However, the data used to build these models, drawn from diverse sources, can lack accuracy and precision. Here we demonstrate how model output can vary when the uncertainties associated with incomplete and variable datasets are explicitly considered in studying a model. We took as an example the well characterised model of the compartmentalised glycolysis in the parasitic protozoan

The first property that we studied is the ability of the model to reach steady-state rapidly. Surprisingly, a significant proportion (more than 60%) of the models we generated by sampling the parameters did not reach steady-state within 300 minutes, due to the accumulation of either 3-phosphoglycerate or pyruvate in the cytosol. This phenomenon could be attributed to two individual parameters, the maximal reaction rates of phosphoglycerate mutase and pyruvate transport which, when operating below their mean value (but still very close to it), caused the accumulation of 3-phosphoglycerate and pyruvate respectively. For the pyruvate transporter, the analysis suggested a mechanism that could avoid this problem: alanine aminotransferase has been shown, unexpectedly, to be essential in bloodstream form

We then analysed the distribution of the steady-state fluxes between the pyruvate and glycerol producing branches of glycolysis both in aerobic and anaerobic conditions. In totally anaerobic conditions, the distribution was very well conserved. Indeed, this property is entirely constrained by the topology of the model and thus this result was expected. Our analysis shows that the distribution of the fluxes is more variable in aerobic conditions, consistent with previously unexplained variation in experimental observations (although changes in oxygen tension within different cells in measured populations would create the same effect).

Further analysis of the steady-state concentrations allowed us to distinguish the metabolites that are only moderately affected by the parameter uncertainties and follow an approximate log-normal distribution, such as

Finally, we analysed the control coefficients of each enzyme using our collection of models. These properties are especially important in the case of glycolysis in

The data derived from the work performed here point to several further studies, including analysis of the role of alanine aminotransferase in the regulation of pyruvate concentration and more exact quantification of pyruvate transport and phosphoglycerate mutase kinetics. The detailed description of parameter uncertainty will now form the basis for a comprehensive Bayesian analysis and extension of the model using alternative topologies

The model used in this paper is the last updated version

To allow straightforward sampling of parameters, the rate equations were rewritten to contain the equilibrium constant instead of the ratio of

The list of sources used to compute the values of the equilibrium constants is available in supplementary

In order to sample the model parameters, we needed to define a probability distribution for each parameter. These distributions can be defined empirically using arbitrary shapes, but for the sake of convenience it is usually appropriate to use standard shapes (e.g. normal or log-normal distributions) and then to estimate the parameters of these distributions (usually the mean and standard deviation).

These parameters represent concentrations, therefore they cannot be negative and our uncertainty about their values is best represented by a log-normal distribution.
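As a minimal sketch of how a strictly positive parameter might be drawn from a log-normal distribution, the snippet below converts a reported arithmetic mean and standard deviation into the parameters of the underlying normal distribution. The function names, the example mean and standard deviation, and the choice to match the arithmetic (rather than geometric) mean are assumptions made for illustration; the parametrisation actually used for the model may differ.

```python
import math
import random


def lognormal_params(mean, sd):
    """Return (mu, sigma) of the underlying normal distribution such that
    the log-normal variable has the given arithmetic mean and SD."""
    if mean <= 0 or sd <= 0:
        raise ValueError("mean and sd must be strictly positive")
    sigma2 = math.log(1.0 + (sd / mean) ** 2)
    mu = math.log(mean) - 0.5 * sigma2
    return mu, math.sqrt(sigma2)


def sample_positive_parameter(mean, sd, rng):
    """Draw one strictly positive value (hypothetical helper)."""
    mu, sigma = lognormal_params(mean, sd)
    return rng.lognormvariate(mu, sigma)


# Illustrative values only (not taken from the model): mean 0.1 mM, SD 0.02 mM.
rng = random.Random(42)
samples = [sample_positive_parameter(0.1, 0.02, rng) for _ in range(10000)]
sample_mean = sum(samples) / len(samples)
```

Because the logarithm of the sampled value is normally distributed, every draw is positive, which a plain normal distribution would not guarantee for parameters with a large relative uncertainty.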

For each

The parameter has been measured experimentally: a mean (

The parameter has been measured experimentally, but only a mean value is reported.

The parameter has not been measured, and no estimate of its value is available. When no other information is available, the parameter is calculated from the list of

The parameter has not been measured, but some indication of its mean is available, e.g. a value measured for a phylogenetically closely related species (

The

The equilibrium constants,

As the equilibrium constant does not depend on the organism (assuming constant temperature, pH and ionic strength), the mean and standard deviation of the distribution can be calculated from the various values reported in the literature (see supplementary text S3). When only one published value could be found, the standard deviation was calculated using the mean relative standard deviation of the other equilibrium constants in the model as described above.
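The rule described above can be sketched as follows. The reaction names and numerical values are made-up placeholders, not values from the literature or from supplementary text S3, and the helper names are hypothetical; the snippet only illustrates the fallback to the mean relative standard deviation when a single published value exists.

```python
import statistics


def relative_sd(values):
    """Sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)


def keq_distribution(reported, fallback_rsd):
    """Return (mean, sd) for one equilibrium constant.

    With two or more reported values the SD is computed directly;
    with a single value it is fallback_rsd times that value.
    """
    mean = statistics.mean(reported)
    if len(reported) >= 2:
        return mean, statistics.stdev(reported)
    return mean, fallback_rsd * mean


# Placeholder data for illustration only.
literature = {
    "reaction_A": [0.29, 0.33, 0.31],
    "reaction_B": [0.045, 0.055],
    "reaction_C": [3.0],  # only one published value
}

# Mean relative SD over the constants that have several reported values.
multi_valued = [v for v in literature.values() if len(v) >= 2]
fallback = statistics.mean([relative_sd(v) for v in multi_valued])

distributions = {
    name: keq_distribution(values, fallback)
    for name, values in literature.items()
}
```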

The maximum rate of the reactions (

For each

If the

The

The glucose transporter

The model includes several transport reactions. Among them, only the transport rates across the cytosolic membrane have been measured. The transport rates across the glycosomal membrane have not been characterised and are currently modelled using mass action kinetics (i.e., as non-saturable, non-enzymatic reactions) to maintain maximal compatibility with the published model

No information is available about the uncertainty of these parameters. As these parameters are strictly positive, they are sampled using a log-normal distribution as are

Bakker et al.

ATP utilization is modelled using mass action kinetics with a single rate constant. As this reaction represents all of the cytosolic reactions that consume ATP and are not explicitly included in the model, the rate constant of this reaction is unknown. As for glycosomal transport reactions, this parameter was sampled according to a log-normal distribution. The mean used is the value fitted by Bakker et al.

The glucose transport across the cytosolic membrane is assumed to be symmetric

All parameters were sampled using the MT19937 random number generator of Makoto Matsumoto and Takuji Nishimura
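A minimal sketch of generating an ensemble of parameter sets by independent sampling is given below. It relies on Python's standard `random` module, whose core generator is the Mersenne Twister (MT19937). The parameter names, their means and standard deviations, the seed, and the use of a log-normal distribution for every parameter are assumptions for illustration only and do not reproduce the actual sampling code.

```python
import math
import random

# Hypothetical parameters: name -> (mean, standard deviation).
PARAMETER_SPECS = {
    "Km_example": (0.1, 0.02),
    "Vmax_example": (100.0, 15.0),
    "k_example": (50.0, 25.0),
}


def sample_parameter_set(specs, rng):
    """Draw every parameter independently from its own log-normal
    distribution (assumed here for all parameters)."""
    params = {}
    for name, (mean, sd) in specs.items():
        sigma2 = math.log(1.0 + (sd / mean) ** 2)
        mu = math.log(mean) - 0.5 * sigma2
        params[name] = rng.lognormvariate(mu, math.sqrt(sigma2))
    return params


def generate_ensemble(specs, n_models, seed):
    """Return n_models independently sampled parameter sets.
    A fixed seed makes the ensemble reproducible."""
    rng = random.Random(seed)  # Mersenne Twister (MT19937) generator
    return [sample_parameter_set(specs, rng) for _ in range(n_models)]


ensemble = generate_ensemble(PARAMETER_SPECS, n_models=10000, seed=2013)
```

Seeding the generator explicitly means that the same collection of alternative models can be regenerated later, which is useful when individual models need to be re-examined.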

The steady states were calculated using the SOSlib library
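The idea of simulating each model until steady-state is reached or a time limit expires can be illustrated with a toy example. The sketch below does not use SOSlib: it integrates a single hypothetical metabolite with constant influx and first-order efflux using explicit Euler steps, and the tolerance, step size and rate constants are assumptions chosen only to show how a slow efflux step leads to accumulation and to a failure to settle within 300 simulated minutes.

```python
def time_to_steady_state(influx, k_out, x0, t_max=300.0, dt=0.01, tol=1e-6):
    """Integrate dx/dt = influx - k_out * x with explicit Euler steps.

    Returns the simulated time (minutes) at which |dx/dt| first drops
    below tol, or None if this does not happen before t_max.
    """
    x, t = x0, 0.0
    while t <= t_max:
        dxdt = influx - k_out * x
        if abs(dxdt) < tol:
            return t
        x += dt * dxdt
        t += dt
    return None


# Start from the steady state of a reference ("fixed-parameter") toy model.
reference_k_out = 2.0
x_start = 1.0 / reference_k_out

fast = time_to_steady_state(influx=1.0, k_out=1.0, x0=x_start)
slow = time_to_steady_state(influx=1.0, k_out=0.001, x0=x_start)
```

In this toy setting the first call settles within a few simulated minutes, whereas the second is still accumulating the metabolite when the 300-minute limit is reached and therefore returns `None`.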

The control coefficients were computed using the methodology described by Bakker et al.
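For readers unfamiliar with control coefficients, the sketch below estimates flux control coefficients numerically as the scaled sensitivity of the steady-state flux to each enzyme activity, C = d ln J / d ln e. It uses a hypothetical two-step pathway with reversible mass-action kinetics and a central finite difference; the pathway, rate constants and step size are assumptions for illustration, and this is not necessarily the procedure of Bakker et al.

```python
import math


def steady_state_flux(e1, e2, s=5.0, p=0.1, k1=1.0, k1r=0.5, k2=2.0, k2r=0.1):
    """Steady-state flux of the toy pathway S <-> X <-> P with fixed S and P.

    v1 = e1 * (k1 * S - k1r * X),  v2 = e2 * (k2 * X - k2r * P).
    """
    x = (e1 * k1 * s + e2 * k2r * p) / (e1 * k1r + e2 * k2)
    return e2 * (k2 * x - k2r * p)


def flux_control_coefficient(flux_fn, activities, index, rel_step=1e-4):
    """Central-difference estimate of d ln J / d ln e for one enzyme."""
    up = list(activities)
    down = list(activities)
    up[index] *= 1.0 + rel_step
    down[index] *= 1.0 - rel_step
    d_ln_j = math.log(flux_fn(*up)) - math.log(flux_fn(*down))
    d_ln_e = math.log(1.0 + rel_step) - math.log(1.0 - rel_step)
    return d_ln_j / d_ln_e


activities = [1.0, 1.0]
coefficients = [
    flux_control_coefficient(steady_state_flux, activities, i)
    for i in range(len(activities))
]
# Index of the step with the highest control over the flux.
most_controlling = max(range(len(coefficients)), key=coefficients.__getitem__)
```

For this toy pathway the two coefficients sum to one, as expected from the summation theorem of metabolic control analysis. Repeating the calculation for every sampled model gives a distribution of control coefficients rather than a single value.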

The authors would like to thank Rainer Machné for his help with SOSlib.