A Unique Transformation from Ordinary Differential Equations to Reaction Networks

Many models in Systems Biology are described as a system of Ordinary Differential Equations, which allows for transient, steady-state or bifurcation analysis when kinetic information is available. Complementary structure-related qualitative analysis techniques have become increasingly popular in recent years, like qualitative model checking or pathway analysis (elementary modes, invariants, flux balance analysis, graph-based analyses, chemical organization theory, etc.). They do not rely on kinetic information but require a well-defined structure as stochastic analysis techniques equally do. In this article, we look into the structure inference problem for a model described by a system of Ordinary Differential Equations and provide conditions for the uniqueness of its solution. We describe a method to extract a structured reaction network model, represented as a bipartite multigraph, for example, a continuous Petri net (CPN), from a system of Ordinary Differential Equations (ODEs). A CPN uniquely defines an ODE, and each ODE can be transformed into a CPN. However, it is not obvious under which conditions the transformation of an ODE into a CPN is unique, that is, when a given ODE defines exactly one CPN. We provide biochemically relevant sufficient conditions under which the derived structure is unique and counterexamples showing the necessity of each condition. Our method is implemented and available; we illustrate it on some signal transduction models from the BioModels database. A prototype implementation of the method is made available to modellers at http://contraintes.inria.fr/~soliman/ode2pn.html, and the data mentioned in the “Results” section at http://contraintes.inria.fr/~soliman/ode2pn_data/. Our results yield a new recommendation for the import/export feature of tools supporting the SBML exchange format.


Introduction
Many models in Systems Biology are described as a system of Ordinary Differential Equations (ODEs), which allows for transient and steady-state analysis (for instance using MATLABH), or bifurcation analysis with tools like XPPAUT [1], but only when kinetic information is available.
Complementary structure-related qualitative analysis techniques have become increasingly popular in recent years, such as qualitative model checking or pathway analysis. Qualitative analysis techniques do not rely on kinetic information, but require a precisely structured model with well-identified products, reactants and catalysts (and their stoichiometry, if any) for each reaction.
The fact that the Systems Biology Markup Language (SBML) [2] has become a standard for sharing and publishing of models has helped in making modelers clarify the structure of their models. Unfortunately, SBML does not enforce that the structure and underlying ODEs are coherent. Even if the system is specified by a list of reactions, as supported, e.g., by COPASI [3], modelers tend to specify their reaction kinetics differently when aiming at ODEs analysis. The troublemakers are reactions with complex kinetics. COPASI provides a list of predefined functions; some of them actually stand for whole building blocks. Thus, the structural interpretation of models specified in formalisms such as SBML may vary according to the source of the original model. Particularly, if the models were originally meant to be ODEoriented, a later discrete interpretation as a qualitative or stochastic model by a naive automatic translation may produce wrong results; see Figure 1 for an introductory example demonstrating the problem.
In [4], it is elaborated that structural information hidden in kinetic laws may affect the results obtained from structural analysis, such as elementary mode analysis [5], extreme pathway analysis [6], flux balance analysis [7], chemical organization theory [8], deficiency analysis or chemical reaction network theory (CRNT) [9,10]. This perfectly coincides with our own experience, and applies equally for place and transition invariant analysis to validate a model, see e.g. [11][12][13], or to derive automatically an hierarchically structured network representation [14].
Structural analysis may directly support ODEs-oriented dynamic analyses; e.g. [15] applies network decomposition for a modular parameter estimation approach, [16] introduces a structural persistency criterion, and transition invariants are used in [17] to identify fragile nodes and the core network responsible for the steady state behaviour, and in [18] to determine steady state solutions.
Likewise, the correct structure is mandatory when a reaction network is meant to be put into a stochastic setting, as it has been introduced in the Petri net context in the seminal paper [19], and exercised by applying various stochastic analysis techniques (standard Markovian transient and steady state analysis, analytical and simulative model checking) to a running case study in, e.g., [12,20].
In [4], the authors present an algorithm that uncovers hidden structural information for some models already given in SBML. On the contrary, in our article we discuss conditions for unique structure inference directly from a given system of ODEs. We derive from those conditions an algorithm, that has been implemented and made public. We illustrate the necessity of our conditions and the result of the inference on some simple examples. This allows for a correct and automatic translation from ODE models to structured models suitable for qualitative or stochastic analysis, which we demonstrate on the very examples of the BioModels database [21] that were incorrectly transcribed in SBML as shown by [4].
We model a reaction network by a continuous Petri net (CPN), see [22]. We define P, the set of places, with n~jPj, and T , the set of transitions, with m~jT j. F { and F z are n|m incidence matrices describing the weights of the transitions' input and output arcs, respectively. The matrix entries are denoted by f z ij and f { ij , respectively.
Each transition t[T has a rate function v t specifying the generally state-dependent continuous flow over its input and output arcs. v t can be an arbitrary function, but its variables are restricted to the pre-places of t to enforce a close relation between structure and dynamic behavior. A CPN uniquely defines a system of ODEs over the variables corresponding to the places p i [P: We are interested in mapping a system of ODEs onto a CPN, such that the reverse operation according to (1) gives an equivalent system (up to simple algebraic operations obviously ensuring behavioral equivalence, such as a|v{b|v~(a{b)|v). Thus, we will assume that the variables of the system of ODEs are: x i ,1ƒiƒn, i.e., each variable is mapped in a unique way to a place p i of the net, which is required by the reverse mapping.
Such mappings have already been used in the Systems Biology community, e.g. in the need for a stochastic view of models originally described by ODEs. For instance in STODE [23], which was supposed to be included in COPASI, in BlenX [24], and the Beta Workbench [25]. However, no precise algorithm is described, and program sources of implementations are not available. Most importantly, these computational platforms do not care about our main concern -the uniqueness of the revealed structure.
Please note that any ODEs can be represented by a CPN simply by considering the full expression of each dx=dt, i.e. the right-hand side of the equation, as the v x of a single transition t x with all variables used in v x (i.e., the domain of v x ) as pre-places, and exactly the same post-places (with the same arc weights), except for x itself, which should have as weight on t x ?x one more than the weight on x?t x ; compare Figure 2. This naive translation always works and produces a net having an equal number of places and transitions, with structural information typically hidden in the generally complex kinetics v x . However, it is not obvious under which conditions there is exactly one CPN corresponding to a system of ODEs (even if we assume minimal arc weights), and especially whether certain biologically reasonable conditions on the CPN enforce its uniqueness. In the following we discuss ODEs conditions ensuring that there exists only one CPN; but it will almost never be the one we get by the naive translation.

Methods
We will first present a restricted form of our results and then discuss its generalization to other types of kinetics. We will give examples where even quite simple kinetics leads to ambiguity, i.e., several nets can generate the same system of ODEs.

Mass Action Law
In order to obtain uniqueness of the net, we will first restrict ourselves to the case where our first condition holds. Condition 1. The CPN has pure mass action law kinetics, i.e.
where the parameters k j belong to a finite alphabet K of symbols.
Mass action is the basis of more elaborate rates used in biological models, like Michaelis-Menten or Hill kinetics, and the use of symbolic parameters is quite standard in ODEs models since it allows the modeler to ''play'' with a system of ODEs in a simple and coherent way. Mass action kinetics are also necessary for some stochastic simulation methods or analysis techniques like CRNT [9].
It is obvious that for arbitrary kinetics there is little hope to find a unique CPN. Moreover the following examples show that even quite simple kinetics can lead to ambiguity, i.e., several net structures can give the same system of ODEs (see Example 1), and that there is a need for symbolic parameters to ensure uniqueness (see Example 2). Example 1. Consider the following ODEs: If one allows general kinetic expressions, even restricted such that they have the same variables as they have pre-places, one could obtain the two nets given in Figure 3.
Note that the second net does not respect Condition 1, since the kinetics should have been k : A 2 .
Example 2. Consider the following ODEs: Symbolic parameters are required to avoid that (3) leads to the two nets given in Figure 4.
We obtain the following system of ODEs by combining Condition 1 with equation (1): If a system of ODEs can be put in such a form, thanks to basic algebraic transformations, we will try to extract from it a CPN. Otherwise, it does obviously not correspond to any model fulfilling Condition 1.
We thus restrict our study to ODE systems of the form: where J is a set of indices and for all j[J it holds s j [Z,l j [K, and all r ih [N; in other words, ODE systems where the right side is a polynomial over x i , with coefficients being integer linear combinations of parameters in K.
A reaction which has exactly the same multisets of pre-and post-places, i.e., reactants and products, will only lead to null members in any ODE. Thus, we also assume: Condition 2. The CPN does not contain any void reaction, i.e., Finally, we introduce a third purely syntactic condition to ensure uniqueness of the CPN.
Condition 3. In the CPN, the same parameter is never used for two different reactions with the same reactants, i.e., We illustrate Condition 3 by Example 3. Example 3. We consider again system (2). Complying with Condition 1, but allowing a single parameter to be used twice for the same reactants, i.e., violating Condition 3, one could obtain the net given in Figure 5.
Indeed, for the given system (2) and with the three introduced conditions, there are necessarily two places (A and B), one single transition (it has kinetics k : A), a single pre-place (A with weight 1), and a single post-place (B with weight 1); see the first CPN of Example 1 in Figure 3.
Before turning to our main result, we introduce two lemmata. Lemma 1. Under our three conditions, all kinetics v j appear at least once in the ODEs.
Proof. Let us suppose that v j0 does not appear in the system.   Thus there are necessarily some terms compensating for v j0 in some equations. These ODEs are precisely all the dx i =dt such that f z ij0 {f { ij0 6 ¼ 0. However, since parameters are symbolic, only monomials with the same value of k j and the same degree for all x h can compensate each other. But under Condition 3 there are no other j that share these features with j 0 . Lemma 2. Conversely, for each term s : l : P x r h h of the ODEs, there exists a transition with parameter l, and pre-places x h with the corresponding arc weights r h .
Proof. The existence is obtained directly from the mapping of CPNs to ODEs according to (1). Since parameters and variables are symbolic objects, no term of that form can be created otherwise.
There is only a single such transition in any net agreeing with Condition 3. Thus, if there are several terms with the same l and r h : s 1 : l : P x rh h , . . . ,s q : l : P x rh h , they correspond to the same transition and can be merged into one single term s : l : P x r h h with s~P q 1 s i . We can now proceed to our main result. Theorem 1. For any system S of ODEs defining dx i =dt,1ƒiƒn according to Conditions 1-3, there exists at most one CPN, such that the system S 0 obtained from it according to (1) is equivalent to S, up to basic arithmetic.
Proof. We have seen that the x i uniquely define P. From Lemma 1 and 2 we obtain the uniqueness of the definition of T and F { . Now, the post-places and corresponding weights are defined unambiguously by looking at dx i =dt and imposing the constraint s~f z ij {f { ij , i.e., f z ij~s zf { ij with f { ij already determined to be equal to some r h in the previous step. If the obtained f z ij is strictly negative, there is no CPN that would produce such system under the assumed conditions. The theorem states that there is at most one CPN. Indeed lots of ODEs are not amenable to (4) and thus do not comply with our first condition. However even for some systems that do comply with it there exists no model fulfilling our three conditions, as illustrated by Example 4.
Example 4. An ODE system that can be put in the form of equation (4), but does not correspond to any CPN fulfilling our three conditions is dx=dt~{2kx.
In this case, from the ODEs one would obtain a single place for x, a single transition with parameter k, an input weight of 1, but no possible output weight: f z~s zf {~{ 2z1~{1.

Beyond Mass Action Law
About 10% of the models of the BioModels database fulfill our three conditions. However it is quite common to use classical enzymatic kinetics like Michaelis-Menten or Hill type kinetics.
Actually, one can weaken Condition 1 in order to cope with Michaelian kinetics of the form: v j~V j : x j K j zx j in addition to the mass action law case.
Instead of polynomials, the right members of the ODEs will then be rational fractions. But thanks to the partial fraction decomposition theorem (see for instance [26]) they can be decomposed in a unique way into a sum of a polynomial and of rational fractions, with irreducible polynomials as denominator and a numerator of strictly smaller degree.
In our case, the simple rational fractions will have degree one denominator (K j zx j ) and degree zero numerator, otherwise there is no CPN corresponding to these ODEs without violating our new condition. These fractions can be easily and unequivocally transformed into the above form, the remaining polynomial will be handled as in the previous section.

Results
We built a prototype implementation of the method outlined above -the tool ode2pn, which converts XPPAUT files into SBML (Level 2, Version 1) or APNN (one of the standard Petri net formats [27]), respectively, by applying directly the constructive proof of Theorem 1. We built upon an already existing tool, Nicotine [28], for the output of the structured model and added to it an XPPAUT parser that uses Lemma 2 to introduce a new reaction for each corresponding term in the ODEs and Theorem 1 to complete the stoichiometry matrix.
The tool rejects the conversion when no structured model fulfilling our conditions can be obtained. It is available at http:// contraintes.inria.fr/,soliman/ode2pn.html.
Note that the partial fraction decomposition necessary for the Michaelian kinetics always exists, but is ''practical'' only with prior knowledge of the poles of the denominator's polynomials. These are the K j in the Michaelian case. Actually, our implementation supposes that the corresponding rational fractions are already in decomposed form.
In [4], five models from the BioModels database were identified as having been transcribed in SBML with some structural information missing: models 44, 93, 94, 143 and 151 (we adopt the convention to reduce the official model names to at most three digits). Model 44 involves Hill Kinetics and model 143 even more complex kinetic laws; so our approach cannot guarantee the uniqueness of the structure for these two cases. In the following we discuss our results for the remaining three models.
Contrary to [4], where SBML files are evaluated directly, we take the auto-generated XPP files (i.e. ODEs, generated from those SBML models), which we downloaded from the BioModels database in September 2009, and hand-curated in order to obtain exactly the ODEs as given in the original articles.
Models 93 and 94 are two models of the JAK/STAT pathway by [29]. In the original article they are described by a drawing (see Fig. 6) and a mixture of what the authors call ''chemical reactions'' and of ODEs (mostly for mRNAs). They are used as ODEs for simulation and were hand-transcribed to SBML for inclusion in BioModels database, but missing the ''reversibility'' of some reactions. We input the 34 differential equations (in each case) to our tool, with sometimes more than ten different terms in a single equation, and obtained the unique structure complying with our conditions (with the Michaelian extension) and correctly including reverse reactions when needed.
Model 151 is a model of the regulation of that same JAK/ STAT pathway by IL-6 in hepatocytes [30]. It includes 68 differential equations (see Fig. 7 for an extract) and once again leads to a unique structure (with mass action and Michaelian kinetics). The XPPAUT.ode file (BIOMD151.ode) and the resulting structured SBML file (BIOMD151_new.xml) can be found at http://contraintes.inria.fr/,soliman/ode2pn_data/ together with the biomodels version (BIOMD0000000151.xml), which actually contains more errors than found by [4], mostly concerning parameter names that are quite error-prone when hand-translated from ODEs to SBML. Note that the XPPAUT file which we provide corrects two typos from the original article, namely k r39 instead of k r30 in dx 8 =dt and x 15 instead of x 14 in dx 16 =dt. These typos still allow extraction of a unique structured model, but with obvious differences compared to that described in the article.
The converted models can be further processed by any tool complying with SBML or APNN, e.g. using Snoopy [31], which supports both formats and allows for graphical visualization of the translation results.

Discussion
We have discussed conditions for a unique structure inference out of a given system of ODEs. For reaction networks fulfilling the given three conditions, ODEs and a structured formalization by, e.g., a CPN, are equivalent representations, which can be transformed into each other without loss of information. Note that these networks are restricted to mass action or Michaelian kinetics, which are the most widely used kinetics for biochemical systems, and prohibit empty reactions which would not have any biochemical meaning. These conditions forbid models, which were mathematically correct, but contradict reasonable biochemical expectations.
We have shown that otherwise the structure is not uniquely defined by a system of ODEs. We have given examples where violating our conditions leads to several nets having possibly different discrete, and thus stochastic behavior, but generating the same system of ODEs. These counterexamples demonstrate the Figure 6. Figure 1 of [29] representing a schematic view of the JAK/STAT pathway. The incorrect structure of the corresponding SBML models (93 and 94) of the BioModels database can be automatically fixed by going back to the differential equations and extracting the unique structure fulfilling our three conditions. It then correctly includes the reversibility of reactions (1), (2), (3), (6), etc. highlighted in red, and absent from the BioModels database version. doi:10.1371/journal.pone.0014284.g006 Figure 7. Beginning of the Appendix II of [30] describing the full ODE model of that article. The 68 ODEs actually allow the extraction of a unique model fulfilling the three established conditions. It not only correctly reflects the structure described in the article, but also avoids the typos introduced in the hand-written model 151 of the BioModels database; hand-typing an SBML model for that many ODEs with numerous parameters is definitely error-prone. doi:10.1371/journal.pone.0014284.g007 necessity of each individual condition. We have given a constructive proof for the translation algorithm, which has been directly implemented, providing XPP to SBML conversion.
Our conditions are quite restrictive (only Mass-Action and Michaelian kinetics), but do cover a large part of mathematical biology models. This should allow, in the future, more and more modelers to benefit from structural analysis techniques for their systems, even if done as an afterthought. It also leads to more precise links between the different formalisms and launches a bridge betweens different communities of the Systems Biology field. In those cases where both the ODEs and a reaction diagram are given, our method allows the check if they are consistent.
Ideally, models are specified with our conditions in mind, be it as a list of reactions (as, e.g., in COPASI) or some graphical notation (e.g., continuous Petri nets). In both cases, kinetic functions should obey the three established conditions. Userfriendly tools might check these conditions while doing export to SBML files to prevent misleading results by later use. Sophisticated ODE tools will have no problems in applying adequate algebraic transformations to optimize the simulation algorithms' run-time behavior. Any import of SBML files should check these conditions if aiming at structure-related qualitative or stochastic analysis techniques.
We intend to continue in trying to find uniqueness conditions for more general kinetics, and to devise heuristics for structure inference when uniqueness cannot be obtained (unwinding algebraic conservation laws coming from rapid equilibria, for instance). We also plan to make our algorithm more widely usable, for instance through a CellDesigner [32] XPP-import plugin.