Conceived and designed the experiments: GVS SH KL. Performed the experiments: GVS SH. Analyzed the data: GVS SH KL. Contributed reagents/materials/analysis tools: GVS SH KL. Wrote the paper: GVS SH KL. Wrote the program code used in the analysis: GVS SH.

The authors have declared that no competing interests exist.

Modularity analysis offers a route to better understand the organization of cellular biochemical networks as well as to derive practically useful, simplified models of these complex systems. While there is general agreement regarding the qualitative properties of a biochemical module, there is no clear consensus on the quantitative criteria that may be used to systematically derive these modules. In this work, we investigate cyclical interactions as the defining characteristic of a biochemical module. We utilize a round trip distance metric, termed Shortest Retroactive Distance (ShReD), to characterize the retroactive connectivity between any two reactions in a biochemical network and to group together network components that mutually influence each other. We evaluate the metric on two types of networks that feature feedback interactions: (i) epidermal growth factor receptor (EGFR) signaling and (ii) liver metabolism supporting drug transformation. For both networks, the ShReD partitions found hierarchically arranged modules that confirm biological intuition. In addition, the partitions also revealed modules that are less intuitive. In particular, ShReD-based partition of the metabolic network identified a ‘redox’ module that couples reactions of glucose, pyruvate, lipid and drug metabolism through shared production and consumption of NADPH. Our results suggest that retroactive interactions arising from feedback loops and metabolic cycles significantly contribute to the modularity of biochemical networks. For metabolic networks, cofactors play an important role as allosteric effectors that mediate the retroactive interactions.

Mathematical models are powerful tools to understand and predict the behavior of complex systems. However, the complexity presents many challenges in developing such models. In the case of a biological cell, a fully detailed and comprehensive model of a major function such as signaling or metabolism remains out of reach, due to the very large number of interdependent biochemical reactions that are required to carry out the function. In this regard, one practical approach is to develop simplified models that nevertheless preserve the essential features of the cell as a complex system by better understanding the chemical organization of the cell, or the layout of the biochemical network. In this work, we describe a computational method to systematically identify closely interacting groups of biochemical reactions by recognizing the modular hierarchy inherent in biochemical networks. We focus on cyclical interactions based on the rationale that reactions that mutually influence each other belong in the same group. We demonstrate our method on a signaling network and a metabolic network and show that the results confirm biological intuition as well as provide new insights into the coordination of biochemical pathways. Prospectively, our modularization method could be used to systematically derive simplified and practically useful models of complex biological networks.

Hierarchical modularity has emerged as an organizational principle of biochemical networks, where larger less cohesive clusters of network components (e.g. metabolic enzymes or signaling molecules) comprise functionally distinct sub-clusters

In recent years, observations on modularity have prompted metabolic engineers and synthetic biologists to consider whole pathways, rather than individual genes, as modular building units for cellular design

While there is general agreement that a biochemical module should represent a group of connected network components, and that the arrangement of modules in the network is hierarchical, there is less consensus on the criteria that should be used to systematically extract biologically meaningful modules

In this paper, we extend the concept of retroactivity to account for cyclical interactions spanning distant parts of a biochemical network as exemplified by feedback loops of signaling and metabolic pathways. In earlier work

Each reaction in the network was Ca^{2+} signaling; GPCR: G-protein coupled receptor mediated signaling; ERBB: erythroblastic leukemia viral oncogene homolog receptor signaling).

Each reaction in the network was

To examine the effect of cyclical, i.e. retroactive, interactions on modularity, we compared the partitions of the EGFR signaling network obtained using Newman's connectivity (Ca^{2+} signaling (CAS) or small guanosine triphosphatase (SGTP) were identified. Quantitatively, both partitions reach a hierarchical depth of 6 and become more homogeneous closer to the terminal nodes of the partition tree. From the root to terminal nodes, the canonical group compositions of the modules (represented by the pie colors) trend toward a single, dominant group (

(A) Partitions obtained using Newman's connectivity metric. The GPCR dominated module (ID: 22202) has 36 reactions and 28 cycles. (B) Partitions obtained using the ShReD metric. The GPCR dominated module (ID: 22203) has 39 reactions and 167 cycles. The terminal node (ID: 22219) has 99 reactions, but only 10 cycles.

(A) Homogeneity of modules as a function of partition height (see text in

There are also notable differences between the two partitions. While both partitions extract modules predominantly consisting of G-protein coupled receptor (GPCR) activation reactions, the ShReD partition identifies greater hierarchy stemming from those modules. In the Newman partition, there are several terminal leaf nodes that predominantly comprise mitogen-activated protein kinase (MAPK) reactions. Analogous terminal nodes are not present in the ShReD partition. The ShReD partition yields a large terminal node consisting of 99 reactions (Supplementary

We next compared the Newman (

(A) Partitions obtained using Newman's connectivity metric. (B) Partitions obtained using the ShReD. Details of the reactions in the boxed modules are shown in

Homogeneity index is shown as a function of partition height. The height of the root node in the Newman partition tree is 2, whereas the height of the ShReD tree is 7. Error bars represent one standard deviation.

The impact of metabolic regulation on ShReD-based modularity was investigated by comparing the partitions for the metabolic network model with (

(A) Metabolic network with cofactors, but no regulatory edges. The boxed module (ID: 15982) contains pyruvate kinase. (B) Metabolic network with regulatory edges, but no cofactors. Note the absence of a redox module coupling detoxification reactions with lipid synthesis.

(A) Number of finite ShReDs in a module as a function of partition depth. (B) Average ShReD of a module as a function of partition depth. Error bars represent one standard deviation.

We next assessed the impact of cofactors such as ATP, NADH, and NADPH on ShReD-based modularity by comparing the partition generated for the complete metabolic model (

Detailed composition of modules boxed in _{2}. (B) Coupled reactions metabolizing NADPH.

(A) Number of finite ShReDs in a module as a function of partition depth. (B) Average ShReD of a module as a function of partition depth. Error bars represent one standard deviation.

For completeness, we compared the partitions based on ShReD with partitions based on local, or nearest neighbor, retroactivity. To obtain local retroactivity partitions, the size of cycles was restricted to two edges, effectively eliminating all retroactive paths involving non-neighboring vertices. Algorithmically, _{ij}_{ij}
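The restriction to two-edge cycles can be illustrated with a short sketch. It assumes that a pair of reactions retains a finite round trip distance (equal to 2) only when edges exist in both directions between them, and that every other pair is treated as having no retroactive path. The function name `local_shred` and the edge-list input format are illustrative choices, not taken from the original implementation.

```python
import math

def local_shred(n, edges):
    """Round trip distances restricted to two-edge cycles (illustrative sketch).

    n     : number of reaction vertices, indexed 0..n-1
    edges : iterable of directed edges (u, v)
    Returns an n x n matrix with 2 where both u->v and v->u exist, inf elsewhere.
    """
    edge_set = {(u, v) for u, v in edges if u != v}  # ignore self-loops
    local = [[math.inf] * n for _ in range(n)]
    for u, v in edge_set:
        if (v, u) in edge_set:
            # Mutual neighbors: the only cycles allowed are of length two.
            local[u][v] = 2
    return local

# Hypothetical 3-vertex example: 0 <-> 1 is a two-edge cycle, while
# 0 -> 1 -> 2 -> 0 is a three-edge cycle and is therefore excluded.
demo = local_shred(3, [(0, 1), (1, 0), (1, 2), (2, 0)])
```

In this example only the pair (0, 1) keeps a finite value, even though vertex 2 also lies on a cycle with both of them.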

In this paper, we introduce the use of ShReD as a round trip distance metric, which can be combined with a partition algorithm (adapted from Newman's earlier work on community detection) to systematically identify biochemical reaction modules that feature cyclical interactions. The notion of grouping together network components based on “retroactivity” was first proposed by Saez-Rodriguez and coworkers, who hypothesized that a strictly downstream component should have little impact on the activity of an upstream component unless there is a feedback or retroactive relationship

To evaluate the performance of ShReD as a module-detection metric, we performed two sets of comparisons. One set of comparisons involved the community detection algorithm presented by Newman, which also formed the basis for our partitioning algorithm. Newman's original algorithm partitioned based on connectivity, and favored the placement of a pair of network elements (vertices in the graph representation) into the same module if the number of connections between the two elements exceeded the expected (e.g. average) number of connections assuming an equivalent network with edges placed at random. The second set of comparisons involved the special case of local feedback loops or cycles arising from reversible reactions. The results of these comparisons were used to investigate how multi-step signaling loops or metabolic cycles, as opposed to conventional connectivity or reaction reversibility, contribute to the modular organization of biochemical networks.
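For reference, the connectivity-based score described above can be written in the standard form given by Newman for the division of an undirected network into two groups. The symbols are those of Newman's formulation and are not defined elsewhere in this text: A_{ij} is the adjacency matrix, k_i is the degree of vertex i, m is the total number of edges, and s_i is +1 or -1 depending on the group to which vertex i is assigned. The networks studied here are directed, so this undirected form is shown only to illustrate the principle of comparing actual and expected connections.

```latex
Q = \frac{1}{4m} \sum_{i,j} \left( A_{ij} - \frac{k_i k_j}{2m} \right) s_i s_j
  = \frac{1}{4m}\, \mathbf{s}^{\mathsf{T}} \mathbf{B}\, \mathbf{s},
\qquad
B_{ij} = A_{ij} - \frac{k_i k_j}{2m}
```

The term k_i k_j / 2m is the expected number of edges between vertices i and j in a network with the same degrees but randomly placed edges, so a positive B_{ij} favors placing i and j in the same module.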

Applied to a model network of EGFR signaling, the ShReD-based partitions generated modules with a greater number of cyclical interactions across all depths compared to Newman's connectivity-based partitions (

In the case of the liver metabolic network, which has a substantially greater number of cycles (arising from allosteric feedback loops) compared to the EGFR signaling network, the difference between ShReD and Newman partitions is more dramatic. ShReD partitions again lead to greater hierarchy, reaching a depth of 7, whereas Newman's partition only reaches a depth of 3. The greater hierarchy achieved using the ShReD metric is significant, because the partition algorithm is essentially identical to Newman's algorithm, i.e. the only difference is the metric used to calculate the modularity score

The retroactive interactions captured by ShReD include not only reaction reversibility (as in previous work

As many of the allosteric regulators were energy currency metabolites, we also examined the partitions for a partial metabolic model that lacks these cofactors. The resulting network contains fewer ShReDs, presumably reflecting an overall decrease in the total number of paths. Compared to the complete model, the corresponding ShReDs (connecting the same reaction vertices) of the partial model are ∼30% _{2} oxidation with different reactions in and around the TCA cycle (_{2} oxidation is placed in a module (ID: 15985) containing succinate dehydrogenase, which reduces FAD^{+} to FADH_{2}. The coupling between TCA cycle reactions and oxidative phosphorylation is intuitive. However, the TCA cycle reactions are also highly connected to reactions in glutamate metabolism and β-oxidation, associations that may be subjectively less intuitive. In this light, ShReD partitions reflect an emphasis on cyclical interactions mediated by the cofactors. A third example of an intuitive, yet non-canonical grouping involves the drug transformation reactions. In the present study, the metabolic model included reactions that are induced by troglitazone, a hydrophobic anti-diabetic compound withdrawn from the market due to severe hepatotoxicity. Module 15995 illustrates the cyclical interactions coordinating reactions of several different canonical pathways, including glutathione, lipid, glucose, and pyruvate metabolism (

To examine whether the influence of the cofactors reflected the relatively small size of the model network (comprising ca. 150 reactions), we also applied the ShReD-based modularity analysis to a larger model of the human liver (comprising ca. 2500 reactions)

In conclusion, this paper presents a novel methodology for modularity analysis that enables hierarchical partitions of biochemical networks by preserving feedback loops and other cyclical interactions. To the best of our knowledge, the present study is the first to build a module detection method that focuses on cycles or feedback loops as the key structural feature. The present study is also the first to account for cofactors in modularity analysis, further emphasizing the role of pathway regulation in network modularity. Previously, studies on modularity have generally ignored cofactors, citing methodological challenges arising from having to place these highly connected hub metabolites into particular modules

A common way to model a biochemical network using a graph is to represent the components as vertices and their interactions as edges. In this study, the focus is on understanding the hierarchical and modular relationship among reactions, treating metabolites as shared resources among modules. We therefore use a directed graph with vertices representing reactions and edges indicating a directional interaction between the connected reactions. Edges are drawn between two reactions (

(A) A reaction-centric representation of two different cases (B and C) where one reaction is upstream of another. (B) Reaction R_{1} produces a metabolite M_{2} that is consumed by reaction R_{2}. (C) Reaction R_{1} produces a metabolite M_{2} that is an allosteric effector of the enzyme catalyzing reaction R_{2}.
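The two cases in which one reaction is upstream of another (a product that is consumed by the downstream reaction, or a product that is an allosteric effector of the downstream enzyme) can be turned into a reaction-centric edge list as sketched below. This is a minimal illustration under stated assumptions: the input format (a dictionary of substrate and product sets per reaction, plus a dictionary mapping effector metabolites to regulated reactions), the function name `build_reaction_graph`, the exclusion of self-loops, and the toy data are all hypothetical and are not taken from the models used in this study.

```python
from collections import defaultdict

def build_reaction_graph(reactions, effectors=None):
    """Return the set of directed edges (upstream, downstream) between reactions.

    reactions : dict mapping reaction name -> (set of substrates, set of products)
    effectors : dict mapping metabolite -> set of reactions it allosterically regulates
    """
    effectors = effectors or {}
    # Index the reactions that consume each metabolite.
    consumers = defaultdict(set)
    for name, (substrates, _products) in reactions.items():
        for met in substrates:
            consumers[met].add(name)

    edges = set()
    for name, (_substrates, products) in reactions.items():
        for met in products:
            # Case 1: the product is consumed by a downstream reaction.
            for target in consumers.get(met, ()):
                if target != name:
                    edges.add((name, target))
            # Case 2: the product is an allosteric effector of a downstream enzyme.
            for target in effectors.get(met, ()):
                if target != name:
                    edges.add((name, target))
    return edges

# Hypothetical toy pathway: M1 -> M2 -> M3 -> M4, with M3 regulating R1.
toy_reactions = {
    "R1": ({"M1"}, {"M2"}),
    "R2": ({"M2"}, {"M3"}),
    "R3": ({"M3"}, {"M4"}),
}
toy_effectors = {"M3": {"R1"}}
toy_edges = build_reaction_graph(toy_reactions, toy_effectors)
```

In the toy pathway, the edge from R2 back to R1 arises solely from the regulatory interaction, which closes a two-reaction cycle that would be absent from a purely stoichiometric graph.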

We utilize round trip distance as a metric, which we call Shortest Retroactive Distance (ShReD). The ShReD between two vertices i and j, ShReD_{ij}, is the sum of the shortest path distance from i to j and the shortest path distance from j to i. In the example network, ShReD_{1,3} is 3, as there are two edges on the shortest path from R_{1} to R_{3} and there is one edge from R_{3} to R_{1}. There is another cycle connecting the two reaction vertices, which also involves R_{4}, R_{5} and R_{6}. This cycle, however, is not the ShReD, as its length of 6 exceeds the ShReD value of 3. For a given network (or sub-network) a ShReD value is computed for every pair of vertices in the network (or sub-network). To compute the ShReD values, we first calculated the shortest distances between all pairs of vertices using the Floyd-Warshall algorithm
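The computation of round trip distances from all-pairs shortest paths can be sketched as follows. The sketch assumes unit edge lengths and takes the round trip distance between two vertices as the shortest distance in one direction plus the shortest distance in the other; pairs with no return path receive an infinite value. The edge list is a hypothetical topology chosen only to be consistent with the distances quoted in the text (a three-edge cycle through R1, R2 and R3, and a six-edge cycle that also passes through R4, R5 and R6). It is not the exact example network of the figure, and the function names are illustrative.

```python
import math

def shortest_paths(n, edges):
    """All-pairs shortest path lengths (unit edge lengths) by Floyd-Warshall."""
    dist = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0
    for u, v in edges:
        if u != v:
            dist[u][v] = 1
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

def shred_matrix(n, edges):
    """Round trip distance for every vertex pair: d(i, j) + d(j, i)."""
    dist = shortest_paths(n, edges)
    return [[dist[i][j] + dist[j][i] for j in range(n)] for i in range(n)]

# Hypothetical 8-reaction topology; vertex index k stands for reaction R(k+1).
example_edges = [
    (0, 1), (1, 2), (2, 0),          # R1 -> R2 -> R3 -> R1
    (2, 3), (3, 4), (4, 5), (5, 0),  # R3 -> R4 -> R5 -> R6 -> R1
    (5, 3),                          # R6 -> R4 closes a cycle among R4, R5, R6
    (2, 6), (6, 7), (7, 6),          # R3 -> R7, and R7 <-> R8
]
shred = shred_matrix(8, example_edges)
```

With this topology, the value for R1 and R3 is 3 although a six-edge cycle also connects them, and the value for R1 and R7 is infinite because no path leads from R7 back to R1. Diagonal entries are zero by construction and would normally be ignored.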

(A) The example network comprises 8 reactions and 1 allosteric inhibition. (B) Graph representation of the reaction-to-reaction interactions in the example network.

Partitions were obtained by adapting Newman's community detection algorithm

In Newman's algorithm, the modularity score was computed as the difference between the actual and expected number of _{ij}_{i}_{j}_{ij}_{ij}_{1} and R_{2} are both 4.8. The expected ShReD between R_{1} and R_{2}, _{12}_{12}_{i}_{i}_{ij}_{i}_{j}_{i}_{j}_{ij}_{i}_{j}_{1}, R_{2}, R_{3}, R_{7} and R_{8} to one module, and R_{4}, R_{5} and R_{6} to the other module. The reactions in the first module are not fully connected, which gives rise to two disconnected components, one comprising R_{1}, R_{2} and R_{3} and the other comprising R_{7} and R_{8}. In this example, a single binary partition generated three separate modules, each consisting of a single cycle.

In Newman's original community detection algorithm, partitioning of a subnetwork continues if the modularity score _{1}, R_{2}, R_{3}, R_{7} and R_{8} is not fully connected, and two subnetworks are found, one comprising R_{1}, R_{2}, and R_{3} and the other comprising R_{7} and R_{8}. Neither subnetwork can be further partitioned, as every element in the leading eigenvector of the corresponding modularity matrix has the same sign. Similarly, the module comprising R_{4}, R_{5} and R_{6} cannot be further partitioned, as every element in the leading eigenvector of the corresponding modularity matrix has the same sign, and the algorithm terminates.
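The termination criterion based on the signs of the leading eigenvector can be illustrated with a small sketch of spectral bisection. For simplicity, the sketch uses Newman's standard connectivity-based modularity matrix for an undirected, unweighted graph (actual minus expected number of connections), not the ShReD-based matrix used in this work, and it finds the leading eigenvector by power iteration on a shifted matrix. The function names, the shift, the iteration count and the test graphs are illustrative assumptions.

```python
def modularity_matrix(n, edges):
    """B_ij = A_ij - k_i * k_j / (2m) for an undirected, unweighted graph."""
    adj = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        adj[u][v] = adj[v][u] = 1.0
    deg = [sum(row) for row in adj]
    two_m = sum(deg)
    return [[adj[i][j] - deg[i] * deg[j] / two_m for j in range(n)]
            for i in range(n)]

def leading_eigenvector(mat, iters=500):
    """Power iteration on mat + shift * I; the shift makes all eigenvalues >= 0."""
    n = len(mat)
    shift = max(sum(abs(x) for x in row) for row in mat)  # Gershgorin bound
    vec = [float(i + 1) for i in range(n)]  # deterministic, non-uniform start
    for _ in range(iters):
        nxt = [sum(mat[i][j] * vec[j] for j in range(n)) + shift * vec[i]
               for i in range(n)]
        norm = max(abs(x) for x in nxt)
        vec = [x / norm for x in nxt]
    return vec

def bisect(n, edges):
    """Split vertices by eigenvector sign; return None if all signs agree."""
    vec = leading_eigenvector(modularity_matrix(n, edges))
    group_a = [i for i in range(n) if vec[i] >= 0]
    group_b = [i for i in range(n) if vec[i] < 0]
    if not group_a or not group_b:
        return None  # indivisible: partitioning of this subnetwork stops
    return group_a, group_b

# Two triangles joined by a single edge split into the two triangles.
barbell = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = bisect(6, barbell)
# A single triangle cannot be divided further.
triangle_split = bisect(3, [(0, 1), (1, 2), (0, 2)])
```

Applying `bisect` recursively to each resulting group, until every group returns `None`, yields a hierarchical tree of the kind reported in this study.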

The partitioning results are reported in the form of a hierarchical tree annotated with several properties. Each module is represented as a pie chart, where the size of each slice is proportional to the fraction of reactions that belong to the corresponding, pre-assigned canonical (textbook) grouping. The

The

As case studies, we examined two types of biochemical networks that feature directed interactions and feedback loops.

The signaling network was reconstructed based on a published model of epidermal growth factor receptor (EGFR) signaling

A stoichiometric network model of human hepatocyte metabolism was reconstructed from the KEGG reaction database and further augmented by the addition of xenobiotic transformation reactions, as well as regulatory interactions mediated by allosteric effectors. The model comprised 159 reactions, 146 metabolites, and 61 regulatory interactions. The xenobiotic transformation reactions were added to describe the metabolism of the anti-diabetic compound troglitazone (TGZ), including steps needed to supply conjugation substrates such as glutathione (GSH). The regulatory interactions in the model reflect known allosteric effects of metabolites on reaction activity as described in a standard biochemistry textbook

To investigate the effect of scale, a more detailed graph model of human liver metabolism was constructed from a previously published model (HepatoNet1)

Partition report for the hepatocyte metabolic model. The report includes the reaction definitions, regulatory interactions, and stoichiometric matrix for the model.

(XLSX)

Partition report for the HepatoNet1 model. The report includes the reaction definitions and stoichiometric matrix for the model.

(XLSX)

Network of terminal modules from the partitioning of the EGFR signaling network using Newman's connectivity (A) and ShReD (B). The interactions between modules represent interactions between reactions in the respective modules. The size of a module is proportional to the number of reactions in the module. As the networks correspond to the terminal nodes of the respective partitioning trees, hierarchical information can be inferred from the presence of multiple modules assigned to the same canonical signaling pathway. For example, panel B shows multiple GPCR transactivation modules (dark blue) of varying sizes. In the same panel, MAPK cascade (light blue) is present as a component of a larger composite module with multiple canonical signaling pathways.

(TIF)

Network of terminal modules from the partitioning of the hepatocyte metabolic network based on both Newman's connectivity metric (A) and ShReD (B). The interactions between modules represent interactions between reactions in the respective modules. The size of a module is proportional to the number of reactions in the module.

(TIF)

ShReD-based partitioning of the HepatoNet1 model. Boxes highlight modules centered on NADH (module ID: 253956) and NADPH (module ID: 254789) consumption and production. The two modules share a number of reactions with identical main (carbon) reactants but different cofactors. For example, malate oxidation in the mitochondria (r0057) is in the NADH module, whereas malate oxidation in the cytosol (r0058) is in the NADPH module.

(TIF)

The authors thank Douglas Weaver for his lively discussions towards the development of ShReD as a metric, Sean Sullivan for his assistance in writing code for the hierarchical tree visuals, Michael Yi for his assistance in generating the HepatoNet1 partitions, and Richard Mondello and Marshall Moutenot for their help in implementing Newman's community detection algorithm.