Elementary Flux Modes Analysis of Functional Domain Networks Allows a Better Metabolic Pathway Interpretation

Metabolic network analysis is an important step for the functional understanding of biological systems. In these networks, enzymes are made of one or more functional domains, often involved in different catalytic activities. Elementary flux mode (EFM) analysis is a method of choice for the topological study of these enzymatic networks. In this article, we propose to apply an EFM approach to networks that encompass available knowledge on structure-function relationships. We introduce a new method that represents metabolic networks as functional domain networks and applies the algorithm for computing elementary flux modes to analyse them. Any EFM that can be represented in the classical representation can also be represented in our functional domain network representation, but the fine-grained nature of functional domain networks highlights new connections in EFMs. This methodology is applied to the tricarboxylic acid cycle (TCA cycle) of Bacillus subtilis and compared to the classical analyses. This new analysis of the functional domain network reveals that a specific inhibition of the second domain of the lipoamide dehydrogenase (pdhD) component of the pyruvate dehydrogenase complex leads to the loss of all fluxes. Such a conclusion was not predictable with the classical approach.


Introduction
Metabolic pathway analysis is important for assessing network properties in (reconstructed) biochemical reaction networks [1,2]. Numerous biotechnological applications of this kind of analysis exist, mainly in the field of metabolic engineering. These include extending existing pathways to synthesize novel products, redirecting metabolite fluxes towards a desired product, and accelerating or bypassing steps that exert high flux control [3].
The notion of elementary flux mode (EFM) is a key concept derived from the analysis of metabolic networks from a pathway-oriented perspective. An EFM is defined as a minimal subnetwork (one from which no reaction can be removed) that enables the metabolic system to operate at steady state with all irreversible reactions proceeding in the appropriate direction [4,5]. In metabolic pathway analyses, metabolic networks are described in terms of biochemical reactions, enzymes and metabolites. However, it was previously shown that enzyme (protein) functions are composed of elementary actions supported by submolecular functional units: the protein domains [6,7]. In nature, most proteins are made of more than one domain. In previous work, Zhang [8] revealed that the proteins forming the network of the bacterium Thermotoga maritima are dominated by a small number of structural domains performing diverse but mostly related functions. This type of description provides insight into the evolution of metabolic networks [9]. Consequently, identifying the functional protein domains supported by specific folds in networks can provide insights into their function and may change the paradigm of network analysis.
An essential aspect of our work is to represent metabolic networks using a new paradigm that considers the functional units of enzymes, based on protein domains, instead of enzymes represented as whole entities. Protein domains are parts of a protein that carry specific sub-functions as functional units (and often form a distinct 3D substructure). Most proteins are made of several domains. For a given metabolism, such a change in the representation of enzymatic activity leads to a metabolic network made of functional units (domains with a new set of connectivities) that still represents the same metabolism. We based our approach on these newly represented functional networks and calculated the elementary flux modes to check whether the EFMs of the classical representation can be represented using the functional domain network, and to identify new ones. We illustrated our approach with the tricarboxylic acid cycle (TCA cycle) of the bacterium Bacillus subtilis. We observed that the domain-level EFM analysis, in the case of the TCA cycle, revealed the key role of the domain involved in NAD/FAD binding. This kind of observation has several implications for systems and synthetic biology and could help biologists to test not only the role of enzymes but also that of their domains in biological systems, and to create new pathways and biological functions.
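The three conditions in this definition (steady state, correct direction of irreversible reactions, minimality) can be illustrated with a minimal Python sketch. The toy network, the variable names and the helper functions below are assumptions made for illustration; they are not taken from the article, and the minimality test only compares a candidate against the other modes it is given, whereas dedicated tools enumerate all modes.

```python
def is_steady_state(S, v):
    """True if S @ v == 0, i.e. every internal metabolite is balanced."""
    return all(sum(s * x for s, x in zip(row, v)) == 0 for row in S)


def respects_irreversibility(v, reversible):
    """True if every irreversible reaction carries a non-negative flux."""
    return all(rev or x >= 0 for x, rev in zip(v, reversible))


def support(v):
    """Indices of the reactions that carry a non-zero flux."""
    return frozenset(i for i, x in enumerate(v) if x != 0)


def is_elementary(v, other_modes):
    """True if no other mode uses a strict subset of the reactions of v."""
    return not any(support(w) < support(v) for w in other_modes)


# Hypothetical toy network (not from the article):
# R1: -> A, R2: A -> B, R3: A -> C, R4: B ->, R5: C ->
S = [
    [1, -1, -1, 0, 0],   # A
    [0, 1, 0, -1, 0],    # B
    [0, 0, 1, 0, -1],    # C
]
reversible = [False] * 5

via_b = [1, 1, 0, 1, 0]
via_c = [1, 0, 1, 0, 1]
mixed = [2, 1, 1, 1, 1]   # steady state, but the sum of via_b and via_c

for mode in (via_b, via_c, mixed):
    assert is_steady_state(S, mode) and respects_irreversibility(mode, reversible)

print(is_elementary(via_b, [via_c, mixed]))  # True
print(is_elementary(mixed, [via_b, via_c]))  # False
```

The flux vector `mixed` satisfies the steady-state and direction constraints but is not elementary, because the route through B alone already uses a strict subset of its reactions.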

Domain Function Assignment
To build a metabolic network of functional units represented by enzyme domains, we have to identify, for each enzyme, its structural domains and the elementary actions they provide. To identify the structural domains, we perform a systematic molecular modelling of all enzymes of the network using a pipeline dedicated to protein structure modelling, built from Python and Perl routines and the homology modelling software Modeller version 9v4 [10,11].
Briefly, this routine recursively takes the list of target sequences as input (profile.py), performs a multiple (multalign.pl) or single (salign.py) alignment, and then generates the model (model.py). The best model of each enzyme is selected as the one with the lowest Modeller objective function (MOF) score. After this step, structural domains are assigned to each protein model using fastSCOP [12,13]. Besides fastSCOP, PFAM [14,15], Swiss-Prot [16] and a literature review were used for the functional domain assignment.
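The model selection step can be sketched as follows. This is only an illustration of the "lowest objective function score" rule: the function name, the file names and the scores are hypothetical, and the sketch does not call Modeller or reproduce the scripts named above.

```python
def select_best_models(scores):
    """Map each enzyme to the name of its lowest-scoring candidate model."""
    return {
        enzyme: min(candidates, key=candidates.get)
        for enzyme, candidates in scores.items()
    }


# Hypothetical scores: enzyme -> {model file name: objective function value}
scores = {
    "pdhA": {"pdhA_model1.pdb": 1874.2, "pdhA_model2.pdb": 1650.9},
    "pdhD": {"pdhD_model1.pdb": 2210.5, "pdhD_model2.pdb": 2304.1},
}

best = select_best_models(scores)
print(best)
```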
We illustrate our approach on the Bacillus subtilis pyruvate dehydrogenase multienzyme complex (PDH). This complex, encoded by the pdhABCD operon, is composed of three different enzymes: enzyme 1, or pyruvate decarboxylase, composed of alpha and beta subunits coded by pdhA and pdhB respectively; enzyme 2, or dihydrolipoyl transacetylase, coded by pdhC; and enzyme 3, or lipoamide dehydrogenase, coded by pdhD. After the molecular modelling and the structural and functional domain assignment, [...] (see Figure 1).
Figure 1. Functional domains network of the pyruvate dehydrogenase complex. The network was designed using CellDesigner [32] with SBGN [33] (see Table 5 for the abbreviations). The rectangular nodes represent the enzymes' domains and the oval nodes represent the metabolites. The external nodes are surrounded by a blue outline. The symbol -○ (a line ending in an open circle) represents the catalytic activities on the reactions. doi:10.1371/journal.pone.0076143.g001

Functional Domain Network Design and Analysis
Another objective of this work was to represent metabolic networks in a different way, by looking at the enzymes' functionalities supported by the enzyme domains instead of the enzyme as a whole entity. Once the activities of each domain have been obtained, the functional domain networks can be built. A functional domain network is defined as a set of biological activities which can be involved in enzymatic reactions or molecular interactions. It is represented by an oriented hypergraph where the nodes represent the molecular objects (metabolites, enzyme domains) and the arrows represent the reactions or the molecular interactions. The symbol -○ (a line ending in an open circle) represents the catalytic activities on the reactions. Domains having a binding activity are represented by nodes which connect the arrows (i.e. the reactions), and domains having an enzymatic activity are represented by nodes which are linked to the reaction with the symbol -○.
Our representation of metabolic networks as functional domain networks is a step towards including details on the structure and function of the molecules involved in a given metabolic network. The change of representation does not modify the role of the network, which is to metabolize biochemical compounds (small molecules). By including structural (3D domain organization) and functional details, our representation has to deal with elementary actions that are either binding-related (interaction) or chemical transformation-related (catalytic). We decided to analyze this new network representation with EFMs since it has already been shown that EFMs can deal with both interactions and chemical transformations [18].
Based on this formalism, we have built the new pyruvate dehydrogenase functional domain network (see Figure 1). The functional domain network representation, compared to the classical one, allows an enriched description of the functional paths of the network. To evaluate this detailed representation, we calculated the elementary flux modes with metatool [19,20]. Elementary flux mode analysis is a well-known method to enumerate all the possible pathways within a given network. For the functional domain network representing the PDH complex, we obtained one elementary flux mode, which contains all the reactions. The metatool input and the SBML files can be found in File S1. As expected, this single elementary flux mode has the same overall reaction as in the classical representation: Pyruvate + NAD+ + CoA → Acetyl-CoA + CO2 + NADH. This short example shows that the functional domain network representation is functionally consistent with the classical representation. Interestingly, it contains many more functional details, which can later be used for further in-depth interpretation of complex systems.
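One possible in-memory encoding of such an oriented hypergraph is sketched below in Python. The class, the function names and the two-step example (a binding domain that picks up a substrate, followed by a conversion catalysed by a second domain) are hypothetical and only meant to show how binding domains appear as ordinary nodes of the reactions while catalytic domains are attached to a reaction as modifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hyperedge:
    """One reaction or interaction of the oriented hypergraph."""
    name: str
    sources: frozenset                  # nodes consumed (metabolites, binding domains)
    targets: frozenset                  # nodes produced
    catalysts: frozenset = frozenset()  # domains with enzymatic activity


def nodes(edges):
    """All molecular objects that appear in the network."""
    found = set()
    for e in edges:
        found |= e.sources | e.targets | e.catalysts
    return found


def edges_involving(edges, node):
    """Names of the hyperedges in which a node takes part, in any role."""
    return [e.name for e in edges
            if node in e.sources | e.targets | e.catalysts]


# Hypothetical two-step example (names are illustrative only):
# a binding domain picks up S, then a catalytic domain converts it to P.
network = [
    Hyperedge("R1", frozenset({"S", "D_bind"}), frozenset({"D_bind:S"})),
    Hyperedge("R2", frozenset({"D_bind:S"}), frozenset({"P", "D_bind"}),
              catalysts=frozenset({"D_cat"})),
]

print(edges_involving(network, "D_bind"))  # ['R1', 'R2']
print(edges_involving(network, "D_cat"))   # ['R2']
```

Listing the hyperedges that involve a given domain is the basic query needed later to ask which reactions are lost when that domain is inhibited.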
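The consistency check described above, in which intermediates cancel and only the overall reaction remains, can be sketched in Python. The five steps below follow a simplified textbook mechanism of the pyruvate dehydrogenase complex and are an assumption made for illustration: the step names and intermediate names are not those of Figure 1 or of File S1, and protons are omitted as in the overall reaction quoted in the text.

```python
from collections import defaultdict


def overall_reaction(reactions, fluxes):
    """Sum the stoichiometries weighted by flux; drop species that cancel."""
    net = defaultdict(int)
    for name, flux in fluxes.items():
        for species, coeff in reactions[name].items():
            net[species] += flux * coeff
    return {s: c for s, c in net.items() if c != 0}


# Simplified textbook PDH steps (negative = consumed, positive = produced).
# Step and intermediate names are illustrative, not those of Figure 1.
reactions = {
    "decarboxylation": {"Pyruvate": -1, "E1-ThDP": -1,
                        "E1-ThDP-hydroxyethyl": 1, "CO2": 1},
    "acetyl_transfer_to_lipoyl": {"E1-ThDP-hydroxyethyl": -1, "E2-lipoyl(ox)": -1,
                                  "E1-ThDP": 1, "E2-lipoyl-acetyl": 1},
    "acetyl_transfer_to_CoA": {"E2-lipoyl-acetyl": -1, "CoA": -1,
                               "Acetyl-CoA": 1, "E2-lipoyl(red)": 1},
    "lipoyl_reoxidation": {"E2-lipoyl(red)": -1, "E3-FAD": -1,
                           "E2-lipoyl(ox)": 1, "E3-FADH2": 1},
    "NAD_reduction": {"E3-FADH2": -1, "NAD+": -1, "E3-FAD": 1, "NADH": 1},
}
fluxes = {name: 1 for name in reactions}  # every step runs once

net = overall_reaction(reactions, fluxes)
print(net)
```

All enzyme-bound intermediates cancel, so the remaining entries are the substrates (negative coefficients) and products (positive coefficients) of the overall reaction.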

Traditional description of the TCA cycle
The citric acid or tricarboxylic acid cycle (TCA cycle) is the metabolic pathway that oxidizes acetyl-CoA to carbon dioxide [21] and plays a central role in the metabolism of Bacillus species and most other cells (see Figure 2(a)). Recently, we showed the importance of the nature of the carbon source for enzyme expression and for the metabolic quantification of TCA cycle components [22]. Although this metabolic network is well known, our new approach reveals a reorganisation of fluxes in the network. This reorganisation is caused by the finer description of molecular functional units, which in turn induces a rearrangement of the connectivities in the hypergraph. This seems to be frequent in networks. To anticipate such hidden complexity, one needs a more detailed description of the fluxes in a network, such as the one offered by the functional domain network description. We computed elementary flux modes using metatool [19,20] on the classical description of the TCA network. The metatool input and the SBML files can be found in File S2. There are six elementary flux modes (see Table 2); only two of them, EFM 2 and EFM 5, produce ATP. They are represented in Figure 2(b) and Figure 2(c). We can note that if the PDH is inhibited, then EFM 2 is not functional. It is, however, still possible to produce ATP through EFM 5. The functional domain representation presented in the previous section shows that the PDH activity is in fact the result of several enzymatic activities involving reaction intermediates. The fluxes extracted from the classical TCA network description do not explicitly take into account these intermediate reactions, in which enzymatic domains and intermediate metabolites are involved.

TCA cycle network using domains as enzymatic units
To enrich the metabolic network description of the TCA cycle with functional enzyme domains, we applied the same methodology as the one used to describe the pyruvate dehydrogenase complex in the previous section. To perform domain identification and functional assignment, we used the sequences of all TCA enzymes from the SubtiList database [23] and the methodology presented above for the 3D modelling and domain assignments. Our functional domain description emphasizes that the PDH complex and the α-ketoglutarate dehydrogenase (ODH) complex have similar mechanisms and similar domain structures. Moreover, Hoch and Coukoulis [24] showed that α-ketoglutarate dehydrogenase contains the same subunit (pdhD) as the PDH complex of Bacillus subtilis. Based on the activities of each domain (see Table 3), we reconstructed the TCA cycle (see Figure 3) and calculated the EFMs. The metatool input and the SBML files can be found in File S3.
In this system, there are again six elementary flux modes performing the same overall reactions, but they are described in a finer way than in the classical TCA network representation (Table 4). Some of them additionally contain the domain SucD1 because it is now considered as an external node. Enzymatic activities supported by domains can be affected by their local environment independently from the rest of the molecule (inhibition, activation, kinetics, etc.). Figure 4 shows the two domain-based elementary flux modes, out of six, that produce ATP (EFM 2 and EFM 5). They involve the PDH complex and the ODH complex. The reactions linked to PdhD2 (R5, R6, R7) are involved in five EFMs, including the two producing ATP. If, for instance, the R5 activity is inhibited by a compound from the local environment, this could lead to the suppression of these five EFMs. None of the remaining EFMs of the system could then produce ATP. It is thus worth noting that using the functional domain network changes the topology of the hypergraph by adding connectivities. Hence, it links an input branch (the PDH complex) with the cycle through the lipoamide dehydrogenase (pdhD). Thus, influencing a domain of the lipoamide dehydrogenase affects EFMs of the network that were not affected in the classical representation.
On the other hand, if an intermediate metabolite described in our new representation is consumed by an enzymatic activity in the local environment, our functional domain representation makes this visible. This was not predictable in the classical system. Because our functional domain network description is more detailed, it allows a finer analysis of the metabolic reactions in a network and predicts the crucial role of the specific NAD/FAD domain (c.4.1) of lipoamide dehydrogenase (pdhD) in ATP production.
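The inhibition argument amounts to removing every EFM whose support contains the inhibited reaction. A minimal Python sketch of that filter is given below. The reaction supports are hypothetical: only two facts are taken from the text (R5 occurs in five of the six EFMs, and EFM 2 and EFM 5 are the ATP-producing ones); which EFM lacks R5, and all other reaction names, are arbitrary choices made for illustration and do not reproduce Table 4.

```python
def surviving_efms(efms, inhibited):
    """Names of the EFMs that use none of the inhibited reactions."""
    blocked = set(inhibited)
    return sorted(name for name, used in efms.items() if not used & blocked)


# Hypothetical reaction supports. Only two facts are taken from the text:
# R5 occurs in five of the six EFMs, and EFM 2 and EFM 5 produce ATP.
efms = {
    "EFM 1": {"R1", "R5", "R8"},
    "EFM 2": {"R1", "R2", "R5", "R6", "R7", "R9"},
    "EFM 3": {"R3", "R5", "R6"},
    "EFM 4": {"R4", "R5", "R7"},
    "EFM 5": {"R5", "R6", "R7", "R10"},
    "EFM 6": {"R11", "R12"},
}
atp_producing = {"EFM 2", "EFM 5"}

alive = surviving_efms(efms, ["R5"])
print(alive)                                # ['EFM 6']
print(sorted(atp_producing & set(alive)))   # []
```

With these assumed supports, inhibiting R5 leaves a single EFM and no ATP-producing one, which mirrors the qualitative conclusion drawn in the text.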

Discussion
The understanding and engineering of metabolic networks require powerful theoretical methods such as pathway analysis, in which the topology of metabolic networks is considered. Previous work already showed the value of looking at the detailed submolecular organization in metabolic networks [8]. However, such a detailed description of molecular function has never been used for global metabolic network analyses. Moreover, even though some efforts have been made to identify and annotate protein domains [25-28] and to predict domain-domain interactions [29], they did not use an EFM description to analyze functional domain networks. In this work we have provided a powerful method for network analysis, using not only the classical description of EFMs but also a detailed analysis of enzyme domains. This methodology combines structural domain and functional analysis of enzymes and produces a new functional domain network representation which allows a finer description of successive reactions. This detailed representation is suitable for EFM analysis. We showed that the functional domain network EFM analysis provides a detailed description of the same overall reaction. When this approach was applied to the TCA cycle of Bacillus subtilis, we showed that the analysis of EFMs of domain networks can lead to additional conclusions compared to the analysis of EFMs from classical metabolic networks. In particular, it is now possible to take into account interactions between the local environment and the newly described intermediate reactions and metabolites. In our analysis of the TCA cycle, we observed that the enzymes analysed are made of 1 to 5 domains that perform different roles in the system. This detailed knowledge about enzyme domain functions was used in the EFM analysis. Our results showed that we gain detailed information about the key role of a given domain in a functional domain network.
This is the case for the second domain of the enzyme PdhD, which is also found in two different enzymes and is responsible for NAD/FAD binding. Lipoamide dehydrogenase (pdhD), or Lpd, the third enzyme of the pyruvate dehydrogenase complex (PDH) of Mycobacterium tuberculosis, helps Mycobacterium tuberculosis to resist host reactive nitrogen intermediates. Recent work on this specific enzyme complex shows that Lpd is a potential target for anti-infectives against Mycobacterium tuberculosis. The authors clearly identify the need for a finer functional network description to drive their drug-design strategy as a complement to gene knockout studies [30].
One limitation of this kind of analysis is the availability of detailed information about enzyme activities and the annotation of their domain-supported functions. This requires strong interaction between structural and functional studies of enzymatic activities. Our method, combining the representation of metabolic networks at the level of functional molecular domains with EFM analysis, allows more relevant functional studies. In the context of synthetic biology, where biological networks are engineered, our method can provide a robust way to perform relevant functional analysis. This should strongly improve our capacity to anticipate the potential behaviour of a newly designed synthetic network [31]. When used on natural biological networks, our method can also help to discover hidden, non-obvious pathways.

Supporting Information
File S1. Metatool input and SBML files of the PDH domain model.