These authors also contributed equally to this work.

Conceived and designed the experiments: GEEW KJ TCE. Performed the experiments: GEEW MK. Analyzed the data: GEEW MK KJ TCE. Wrote the paper: GEEW MK KJ TCE.

The authors have declared that no competing interests exist.

Previously, we introduced causal mapping (CMAP) as an easy-to-use systems biology tool for studying the behavior of biological processes that occur at the cellular and molecular level. CMAP is a coarse-grained graphical modeling approach in which the system of interest is modeled as an interaction map between its functional elements, in a manner similar to the portrayals of signaling pathways commonly used by molecular cell biologists. CMAP describes details of the interactions while retaining the simplicity of other qualitative methods (e.g., Boolean networks).

In this paper, we use the CMAP methodology as a tool for generating hypotheses about the mechanisms that regulate molecular and cellular systems. Our approach also allows competing hypotheses to be ranked according to a fitness index and suggests experimental tests to distinguish between competing high-fitness hypotheses. To motivate the use of CMAP as a hypothesis-generating tool and to demonstrate the methodology, we first apply the protocol to a simple test case, a three-element signaling module. We then apply the method to the more complex phenomenon of cortical oscillations observed in spreading cells. This analysis produces two high-fitness hypotheses for the mechanism that underlies this dynamic behavior and suggests experiments to distinguish between them. The method can be widely applied to other cellular systems to generate and compare alternative hypotheses based on experimentally observed data and computer simulations.

Recently much effort has been focused on gaining a systems-level understanding of processes that occur on the cellular and molecular level. Because the external and internal environments of cells are constantly changing, any design principle employed at this level must be robust to perturbations. In terms of computational models, this implies that some degree of uncertainty in key parameter values must be tolerated without significantly affecting system performance. This situation leads quite naturally to an increased role of coarse-grained descriptions of cellular systems such as Boolean networks

Previously we proposed a graphical systems biology approach, causal mapping (CMAP), to describe complex cellular and molecular systems

In this paper, we introduce the use of the CMAP as a hypothesis generating tool. Other similar approaches have been developed using different network techniques including Boolean networks

To illustrate the algorithm, we considered a simple CMAP consisting of three concepts (C_{1}–C_{3}) with the goal of determining which network architectures are capable of adaptation. That is, at least one of the concepts must return to near its basal level in the presence of a persistent stimulus. Such behavior is common in genetic networks and signaling pathways _{1}) of 1 concept 3 (C_{3}) was required to respond transiently and reach a maximum value higher than 0.2, and eventually return to a value below half of the maximum. Note that the fitness indices below correspond to calculations that satisfy these particular criteria; the optimum scheme in terms of highest fitness will, in general, depend on the selection of criteria.

In _{2} influenced C_{1} or/and C_{3} influenced C_{1} or C_{2} were not considered (^{3} = 125. The only feed-forward configuration satisfying the criteria is shown in

Depictions of the configurations, with the concepts in boxes and their influences represented by green arrows (positive) and bar-headed red connectors (negative), are shown on the left of each panel. All configurations have the requirement that concept 1 transiently activates concept 3. _{1}; blue, C_{2}; and dark green, C_{3}. A: only feed-forward reactions are allowed; B: a feedback from C_{2} to C_{1} is allowed; C: a case where no interactions between C_{1} and C_{2} are allowed; D: there are no limitations on the connections. (See Illustration of the

When a negative feedback loop from C_{2} to C_{1} (_{1}. If there are no interactions between C_{1} and C_{2}, the configuration depicted in _{1} diminishes with time to a steady-state level. If there are no restrictions on any of the connections, a configuration (_{1} (inhibiting) and C_{3} (activating), C_{2} evolves non-monotonically in time. Since none of the other hypotheses produced similar results, observation of this behavior would provide strong support for such a mechanism. This behavior is a general property of the model but, for some parameter sets, it is not so pronounced and its detection would require adequate signal to noise ratio in the experiment. In summary, in this section we showed how the hypotheses generation algorithm can be applied to a simple three-concept system to determine pathway architectures that respond transiently to a sustained external signal.

Pletjushkina et al.

In a previous study

The scheme is derived from the original CMAP paper _{i}_{ii}

We constructed eight different candidate configurations of the system, all of which include eight concepts (see

Configuration # | Hypothesis description | Total number of combinations, N_{total} (fixed) | Total number of combinations, N_{total} (random) | Number of valid combinations, N_{i} (fixed) | Number of valid combinations, N_{i} (random) | Fitness index, F (fixed) | Fitness index, F (random) | Set size, K |
1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
1 | Con to SAC; ‘Cytosol’ to SAC; Con to MLC-pho; ROCK to MLC-pho | 19683 | 4000000 | 0 | 6 | 0 | 1.50E-06 | 3 |
1 | | 1953125 | 4000000 | 292 | 97 | 1.50E-04 | 2.43E-05 | 5 |
1 | | 40353607 | 4000000 | 4548 | 222 | 1.13E-04 | 5.55E-05 | 7 |
2 | Con to SAC; ‘Cytosol’ to SAC; Con to MLC-pho; ROCK to MLC-pho | 19683 | 4000000 | 0 | 0 | 0 | 0 | 3 |
2 | | 1953125 | 4000000 | 0 | 0 | 0 | 0 | 5 |
2 | | 40353607 | 4000000 | 0 | 0 | 0 | 0 | 7 |
3 | Con to SAC; ‘Cytosol’ to SAC; Con to MLC-pho; ROCK to MLC-pho | 19683 | 4000000 | 0 | 0 | 0 | 0 | 3 |
3 | | 1953125 | 4000000 | 0 | 0 | 0 | 0 | 5 |
3 | | 40353607 | 4000000 | 0 | 0 | 0 | 0 | 7 |
4 | Con to SAC; ‘Cytosol’ to SAC; Con to MLC-pho; ROCK to MLC-pho | 19683 | 4000000 | 279 | 7916 | 0.0142 | 1.98E-03 | 3 |
4 | | 1953125 | 4000000 | 3429 | 4250 | 1.76E-03 | 1.06E-03 | 5 |
4 | | 40353607 | 4000000 | 91854 | 2945 | 2.28E-03 | 7.36E-04 | 7 |
5 | Con to SAC; ‘Cytosol’ to SAC; Con to MLC-pho; ROCK to MLC-pho | 2187 | 4000000 | 743 | 185617 | 0.3397 | 0.0464 | 3 |
5 | | 78125 | 4000000 | 8127 | 127856 | 0.104 | 0.032 | 5 |
5 | | 823543 | 4000000 | 71221 | 43949 | 0.0865 | 0.011 | 7 |
6 | Con to SAC; ‘Cytosol’ to SAC; Con to MLC-pho; ROCK to MLC-pho | 2187 | 4000000 | 0 | 0 | 0 | 0 | 3 |
6 | | 78125 | 4000000 | 0 | 0 | 0 | 0 | 5 |
6 | | 823543 | 4000000 | 0 | 0 | 0 | 0 | 7 |
7 | Con to SAC; ‘Cytosol’ to SAC; Con to MLC-pho; ROCK to MLC-pho | 6561 | 4000000 | 0 | 0 | 0 | 0 | 3 |
7 | | 390625 | 4000000 | 0 | 0 | 0 | 0 | 5 |
7 | | 5764801 | 4000000 | 0 | 0 | 0 | 0 | 7 |
8 | Con to SAC; ‘Cytosol’ to SAC; Con to MLC-pho; ROCK to MLC-pho | 2187 | 4000000 | 0 | 0 | 0 | 0 | 3 |
8 | | 78125 | 4000000 | 0 | 0 | 0 | 0 | 5 |
8 | | 823543 | 4000000 | 0 | 0 | 0 | 0 | 7 |

The weights of the following influences were fixed: MLCK-p-MLC, MLC-pho - p-MLC, calcium uptake and Ca-pump work (calcium - calcium), calcium release from Ca_{i}^{2+}-CaM (Ca-CaM – Ca_{i}^{2+}), Ca-CaM – MLCK, Ca-CaM dissociation (Ca-CaM - Ca-CaM), MLCK-Ca-CaM dissociation (MLCK - Ca-CaM).

The configurations were tested against three criteria formulated from experimental observations

The configurations were tested in a way similar to that described in the section Illustration of the

In the previous section we discussed the results of simulations with fixed values for several weights. based on the results of our previous work ^{11} possible combinations of parameter values. The value of each weight was chosen from a uniform random distribution. The results from this approach are presented in

We constructed histograms of the distribution of weights associated with each causal influence that give successful outcomes for hypotheses 4 and 5 based on the Monte-Carlo method for sampling parameter space. The results for four selected causal influences are shown in

Distribution of weights for hypotheses 4 and 5 for four causal influences. Top row: hypothesis 4; bottom row: hypothesis 5. All histograms are normalized by the total number of occurrences.

We tested different values of the set size (K) to verify that our conclusions were independent of this parameter. _{i}^{2+}; calcium→calmodulin; and p-MLC→contractility (see

Distribution of weights for hypothesis 5 for four causal influences as a function of set size. K = 3 (upper row), 5 (middle row), 7 (bottom row). All histograms are normalized by the total number of occurrences.

A key problem in modeling is to find parameter sets that describe system behaviour. We used the CMAP approach to find parameter sub-space whose values describe qualitative features of experimental data sets. This led to computation of the fitness index for various configurations (see

To further test competing hypotheses, we asked how weight sets included in valid hypotheses would respond to defined perturbations that correspond to feasible manipulations of the experimental system under investigation. To quantify the effects of such perturbations, we adopted the following procedure:

Identify an influence which can be experimentally manipulated, e.g. titration of an inhibitor;

For each set of weights that produced oscillations, shift the weight of interest by a chosen amount to simulate the experimental manipulation;

Simulate the CMAP with the new set of parameters and determine the period and amplitude of the resulting oscillations and the fraction of non-oscillators;

Compute the number of sets of weights in the ensemble that lead to an increase, decrease, or a loss of oscillations.
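The bookkeeping in this procedure can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the clipping of shifted weights to [−1, 1], the `tolerance` parameter and the extra "unchanged" category are all assumptions, and the CMAP simulation that would supply the oscillation periods is deliberately left out.

```python
from collections import Counter
from typing import Dict, Iterable, Optional, Sequence, Tuple


def shift_weight(weights: Sequence[float], index: int, delta: float) -> Tuple[float, ...]:
    """Return a copy of `weights` with one weight shifted by `delta`.

    Assumption: the shifted weight is clipped to the interval [-1, 1].
    """
    shifted = list(weights)
    shifted[index] = max(-1.0, min(1.0, shifted[index] + delta))
    return tuple(shifted)


def classify_period_shifts(
    pairs: Iterable[Tuple[float, Optional[float]]],
    tolerance: float = 0.0,
) -> Dict[str, float]:
    """Fraction of weight sets whose period increases, decreases or is lost.

    Each pair is (baseline_period, perturbed_period); a perturbed period of
    None means the perturbed system no longer oscillates.
    """
    counts: Counter = Counter()
    for baseline, perturbed in pairs:
        if perturbed is None:
            counts["ceased"] += 1
        elif perturbed > baseline + tolerance:
            counts["increase"] += 1
        elif perturbed < baseline - tolerance:
            counts["decrease"] += 1
        else:
            counts["unchanged"] += 1  # assumption: not a category in the text
    total = sum(counts.values())
    if total == 0:
        return {}
    return {key: counts[key] / total
            for key in ("increase", "decrease", "ceased", "unchanged")}


# Illustrative periods only; real values would come from CMAP simulations.
example = [(10.0, 12.0), (10.0, 8.0), (10.0, None), (10.0, 12.0)]
print(classify_period_shifts(example))
```

The same classification could be applied to oscillation amplitude by passing amplitude pairs instead of period pairs.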

This procedure is demonstrated in

The bar graphs represent the portion of weight sets that produced an increase in oscillation period (green) or a decrease in oscillation period (red) or cessation of oscillations (black) when the value of calcium self-inhibition weight (_{CaCa}

The causal mapping technique has the potential to be an effective tool for studying complex biological systems. On the one hand, CMAP is a semi-quantitative method similar to Boolean networks and its extensions

In this paper, we have added hypotheses generation to the CMAP toolbox. This methodology enables investigators to rank hypotheses according to a fitness index. Hypotheses with high fitness indices represent operating mechanisms that are robust to variations in parameter values, and, therefore, in theory represent good design principles for operating in the fluctuating environments found at the cellular and molecular levels. Thus, one interpretation of a high fitness index is that these systems represent architectures most likely to survive natural selection.

We applied the hypothesis generation tool to a simple test case of a three-element signaling module and to the more complex phenomenon of cortical oscillations

Coarse-grained approaches such as this one have limitations. As the complexity of biological networks increases, the number of possible configurations grows exponentially. In practice, however, this number is constrained by prior knowledge of the system derived from laboratory experiments and the biological literature. It could also be argued that restricting the weights to the interval [−1, 1] unduly limits their range of variation. In this regard, it should be noted that this is already an improvement in terms of modeling dynamics over the frequently employed Boolean networks, which are binary in nature. Moreover, this range of weights already produces a rich repertoire of parameter combinations that qualitatively reproduce the observed behavior.

As biologists continue to move toward studying cellular and molecular systems as a whole, there will be an increased need for mathematical approaches to interpret and codify experimental results. We believe the CMAP provides the appropriate level of description within an intuitive framework to make sense of these complex biological systems.

The equations describing how the concepts _{j}(t)_{ij}_{j}(t−1)_{j}(t−1)_{j}(t−1)_{j}

Each causal influence is assigned a weight (_{ij}_{ij}

In _{i}_{i}_{ij}

In the previous work _{j}_{max}_{i}

In the original version of the CMAP

Using this extension, the causal function for phosphorylated myosin light chain (MLC-pho→pMLC influence, see

Note the differences between MLCK→pMLC and MLC-pho→pMLC influences: in the first case pMLC is a

A given network configuration is defined by the number of concepts (nodes) and influences (edges) between the concepts and the nature of the influences (positive or negative). For a given network configuration, the weights of the influences and the initial values for the concepts can vary, but the weights cannot change sign. Network configurations can differ from each other by the number of concepts, the connectivity or the nature of the influences (positive or negative). For example,

Because we assume a discrete range of values for the weights, for any given CMAP configuration, there is a finite number of parameter sets that need to be investigated. This number, _{total}_{i}_{i}_{total}^{14} = 6,103,515,625), we randomly picked 4*10^{6} weight combinations, see column 4 in

A CMAP

Fitness indices were calculated with programs written in Fortran 95; all other simulations were performed in MATLAB (The MathWorks, Inc.).

For the three-node simulation, we examined all possible CMAP configurations with a full set of weight values. The configuration had a total of six possible influences between concepts, for which the weights could be positive, negative or zero. For K = 5, each positive or negative weight could take one of five absolute values [0.1, 0.3, 0.5, 0.7, 0.9]. Each simulation was run for 5000 “time” steps. The initial concept values were C_{1} = 0.5 and C_{2} = C_{3} = 0. We considered a particular set of weights to meet the criterion when, during the simulation, the value of C_{3} transiently rose above 0.2 and subsequently fell below half of the maximum value reached during the rising phase.
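The acceptance criterion for a simulated C_{3} trajectory can be expressed as a small check. This is a sketch under stated assumptions: the function name is hypothetical, the trajectory is taken to be a plain sequence of C_{3} values (one per time step), and the "subsequent decrease" is evaluated on the final value of the run.

```python
from typing import Sequence


def meets_adaptation_criterion(
    c3: Sequence[float],
    threshold: float = 0.2,
    recovery_fraction: float = 0.5,
) -> bool:
    """True if C3 rises above `threshold` and ends below a fraction of its peak.

    Assumption: the decrease is judged from the last value of the trajectory,
    i.e. the state reached at the end of the simulation.
    """
    if len(c3) == 0:
        return False
    peak = max(c3)
    if peak <= threshold:
        return False  # the response never exceeded the required maximum
    # If the peak is the final point, this comparison fails, as it should.
    return c3[-1] < recovery_fraction * peak


# Illustrative trajectories (not simulation output).
transient = [0.0, 0.1, 0.3, 0.25, 0.1]   # rises to 0.3, ends at 0.1
sustained = [0.0, 0.2, 0.4, 0.4, 0.4]    # rises but never recovers
weak = [0.0, 0.1, 0.15, 0.05]            # never exceeds 0.2

print(meets_adaptation_criterion(transient))  # True
print(meets_adaptation_criterion(sustained))  # False
print(meets_adaptation_criterion(weak))       # False
```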

During the simulation, when some of the influences were fixed (see

The algorithm for evaluating the fitness index and ranking hypotheses consists of the following steps:

Define the phenotype as a set of experimental observations that the CMAP configuration should reproduce. These observations form the criteria against which the CMAP configurations are tested.

Build candidate CMAP configurations. Start with the elements that are known to be involved in the processes under study. Next, use all available knowledge to place connections (influences) between these elements.

Specify the weights that will be varied.

For each set of parameter weights, run a simulation with K = 5. Count the number of parameter combinations, _{i}

The value of _{i}

Hypotheses with the highest fitness indices are selected for further studies.

Control:

Start over for set sizes K = 3 and 7.

Perform Monte-Carlo simulations (see section ‘Monte-Carlo method for sampling parameter space’ in text) in which all parameter values are allowed to change.

The controls are needed to ensure that the results neither depend on the set size nor reflect a special choice of values for the fixed parameters.
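The enumeration and counting steps of the algorithm above can be sketched as follows. This is a minimal illustration, not the Fortran 95 code used in the study. The names `weight_levels`, `fitness_index` and `toy_criterion` are hypothetical; the evenly spaced levels reproduce the K = 5 values given in the text, but using the same spacing rule for K = 3 and K = 7 is an assumption; and `toy_criterion` is a placeholder for "simulate the CMAP and test the criteria", which is not implemented here.

```python
from itertools import product
from typing import Callable, List, Sequence, Tuple


def weight_levels(k: int) -> List[float]:
    """K evenly spaced absolute weight values; K = 5 gives 0.1, 0.3, ..., 0.9.

    Assumption: the same spacing rule applies to other set sizes.
    """
    return [round((2 * n + 1) / (2 * k), 10) for n in range(k)]


def fitness_index(
    signs: Sequence[int],
    k: int,
    is_valid: Callable[[Tuple[float, ...]], bool],
) -> Tuple[int, int, float]:
    """Enumerate all weight sets for fixed influence signs and score them.

    `signs` holds +1 or -1 for each varied influence, so weights never
    change sign. Returns (valid count, total count, fitness index).
    """
    levels = weight_levels(k)
    n_valid = 0
    n_total = 0
    for magnitudes in product(levels, repeat=len(signs)):
        weights = tuple(s * m for s, m in zip(signs, magnitudes))
        n_total += 1
        if is_valid(weights):
            n_valid += 1
    return n_valid, n_total, n_valid / n_total


def toy_criterion(weights: Tuple[float, ...]) -> bool:
    # Hypothetical stand-in for running a CMAP simulation and checking
    # the experimental criteria; it has no biological meaning.
    return weights[0] + weights[1] > 0


# Three varied influences with K = 5 give 5**3 = 125 combinations.
n_valid, n_total, f = fitness_index((+1, -1, +1), 5, toy_criterion)
print(n_valid, n_total, f)  # prints: 50 125 0.4
```

Repeating the call with k = 3 and k = 7 corresponds to the set-size control, and candidate configurations can then be ranked by sorting on the returned fitness index.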

The authors are grateful to Nancy Costigliola for useful discussions and help in the manuscript preparation.