Computational discovery of dynamic cell line specific Boolean networks from multiplex time-course data

doi:10.1371/journal.pcbi.1006538

Fig 1.

Caspo-ts workflow.

Caspo-ts takes as input a prior knowledge network (PKN) and a discretized phosphoproteomic dataset. In this example, the phosphoproteomic data consist of two perturbations involving akt (inhibitor) and hgf (stimulus): 1) akt = 0, hgf = 1 and 2) akt = 1, hgf = 0. A perturbation colored black means the inhibitor or stimulus was applied (1), while white means it was not (0). Readouts are shown in blue and describe the time series under the given perturbations. From this input, caspo-ts performs two steps: answer set programming (ASP) solving and model checking. In the ASP solving step, (i) a set of Boolean networks (BNs) compatible with the PKN is generated; (ii) an over-approximation constraint is imposed on each candidate BN to filter out invalid BNs, i.e., those that do not result in an over-approximation of the reachability between the Boolean states given by the phosphoproteomic dataset; and (iii) the BNs are optimized using an objective function that minimizes the distance to the experimental measurements. The ASP step also repairs some data points of the time series, and each repair adds a penalty to the objective function. The corrected traces are then passed to the model checker. In the model checking step, the exact reachability of all the (binarized and corrected) time series traces is verified in the family of BNs.
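
The reachability check described in this caption can be illustrated with a minimal sketch. It is not the caspo-ts implementation: the node names `met` and `s6`, the update rules, and the asynchronous update scheme (one node changes per step) are all assumptions made for illustration. Stimuli and inhibited nodes are held fixed ("clamped"), and a binarized trace is accepted if its observations can be visited in order.

```python
from collections import deque

NODES = ("hgf", "met", "akt", "s6")

# Hypothetical update rules, for illustration only (not the paper's network).
RULES = {
    "met": lambda s: s["hgf"],
    "akt": lambda s: s["met"],
    "s6": lambda s: s["akt"],
}

def successors(state, clamped):
    """Asynchronous successors: exactly one unclamped node changes value."""
    for node, rule in RULES.items():
        if node in clamped:
            continue
        value = int(rule(state))
        if value != state[node]:
            yield {**state, node: value}

def reachable_from(starts, clamped):
    """All states reachable from `starts` (including them), by breadth-first search."""
    seen = {tuple(s[n] for n in NODES) for s in starts}
    queue = deque(starts)
    found = list(starts)
    while queue:
        state = queue.popleft()
        for nxt in successors(state, clamped):
            key = tuple(nxt[n] for n in NODES)
            if key not in seen:
                seen.add(key)
                queue.append(nxt)
                found.append(nxt)
    return found

def trace_reachable(initial, clamped, trace):
    """True if the partial observations in `trace` can be visited in order."""
    frontier = [{**initial, **clamped}]
    for observation in trace:
        frontier = [
            s for s in reachable_from(frontier, clamped)
            if all(s[n] == v for n, v in observation.items())
        ]
        if not frontier:
            return False
    return True

INITIAL = {"hgf": 0, "met": 0, "akt": 0, "s6": 0}

# Perturbation 1: no akt inhibitor, hgf stimulus on.
print(trace_reachable(INITIAL, {"hgf": 1}, [{"s6": 0}, {"s6": 1}]))            # True
# Perturbation 2: akt inhibited (clamped to 0), hgf stimulus off.
print(trace_reachable(INITIAL, {"hgf": 0, "akt": 0}, [{"s6": 0}, {"s6": 1}]))  # False
```

In this toy setting the state space is small enough to enumerate exhaustively; for networks of realistic size the caption indicates that a model checker is used instead.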

Fig 2.

Breast cancer signaling pathway.

Signaling network reconstructed from a combination of databases. An arrow denotes a positive regulatory relationship between two proteins, while a T-shaped arrow denotes inhibition. Green nodes are stimuli, blue nodes are readouts, white nodes are unmeasured or unobserved, and blue nodes with a red border are both inhibitors and readouts. Note that the node labels include the phosphorylation sites alongside the protein names so that they can be matched to the experimental measurements.

Fig 3.

Boolean Network of breast cancer cell lines.

Aggregated graph for all cell lines. Blue, red, green and orange edges correspond to cell lines BT20, BT549, MCF7 and UACC812, respectively. Nodes are connected to their direct predecessors by logic gates (AND or OR). Edges show influences (→ for positive and ⊣ for negative). An AND gate is depicted by a small black circle whose incoming edges are the inputs of the gate. An OR gate is depicted by multiple incoming edges to the node. Node colors indicate node types: green for stimuli, red for inhibitors, blue for readouts, and white for unobserved nodes. Black edges denote hyper-edges common across cell lines; the thickness of a black hyper-edge reflects the number of cell lines that share it.
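
As a rough illustration of the hyper-edge encoding in this caption, the sketch below treats each hyper-edge as an AND over signed inputs and each node as the OR of its incoming hyper-edges, then counts how many cell lines share each hyper-edge (the quantity the caption maps to edge thickness). The cell line names come from the caption; the node names (`pi3k`, `egfr`, `raf`, `mek`) and every hyper-edge are hypothetical and do not reproduce the networks in the figure.

```python
from collections import Counter

def hyperedge(target, *literals):
    """A hyper-edge: AND over (source, sign) literals; sign +1 activates, -1 inhibits."""
    return (target, frozenset(literals))

# Hypothetical per-cell-line networks, for illustration only.
NETWORKS = {
    "BT20": [hyperedge("akt", ("pi3k", 1)),
             hyperedge("mek", ("raf", 1), ("akt", -1))],
    "BT549": [hyperedge("akt", ("pi3k", 1)),
              hyperedge("mek", ("raf", 1))],
    "MCF7": [hyperedge("akt", ("pi3k", 1)),
             hyperedge("mek", ("raf", 1), ("akt", -1))],
    "UACC812": [hyperedge("akt", ("pi3k", 1)),
                hyperedge("akt", ("egfr", 1)),   # second hyper-edge on akt: an OR
                hyperedge("mek", ("raf", 1))],
}

def evaluate(target, hyperedges, state):
    """Value of `target`: OR over its hyper-edges, each an AND over its literals."""
    clauses = [literals for tgt, literals in hyperedges if tgt == target]
    return int(any(
        all(state[src] == (1 if sign > 0 else 0) for src, sign in literals)
        for literals in clauses
    ))

def sharing_counts(networks):
    """Number of cell lines in which each hyper-edge appears."""
    return Counter(edge for edges in networks.values() for edge in set(edges))

state = {"pi3k": 0, "egfr": 1, "raf": 1, "akt": 1}
print(evaluate("mek", NETWORKS["BT20"], state))     # 0: the AND gate needs akt = 0
print(evaluate("akt", NETWORKS["UACC812"], state))  # 1: egfr alone satisfies the OR

counts = sharing_counts(NETWORKS)
print(counts[hyperedge("akt", ("pi3k", 1))])        # 4: present in every cell line
```

Storing the literals in a `frozenset` makes a hyper-edge hashable and independent of input order, so identical gates from different cell lines compare equal when counted.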

ROC curve across all cell lines.

The x-axis shows the false positive rate and the y-axis denotes the true positive rate. These rates are calculated using Eqs (3) and (4). The average AUROC score is 0.77.
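
For readers unfamiliar with ROC analysis, the sketch below shows one common way to obtain such a curve and its area. Eqs (3) and (4) are not reproduced in this listing, so the code assumes the standard definitions TPR = TP / (TP + FN) and FPR = FP / (FP + TN); the labels and scores are made-up values, not data from the paper.

```python
def roc_points(labels, scores):
    """ROC points (FPR, TPR), sweeping the decision threshold from high to low."""
    positives = sum(labels)
    negatives = len(labels) - positives  # both classes must be present
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def auroc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Made-up example: 1 = positive, 0 = negative; a higher score means "more positive".
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(round(auroc(roc_points(labels, scores)), 3))  # 0.889
```

An AUROC of 0.5 corresponds to random guessing and 1.0 to perfect separation of positives from negatives.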
