Fig 1.
Caspo-ts takes as input a prior knowledge network (PKN) and a discretized phosphoproteomic dataset. In this example, the phosphoproteomic data consist of two perturbations involving akt (inhibitor) and hgf (stimulus): 1) akt = 0, hgf = 1 and 2) akt = 1, hgf = 0. A perturbation shown in black means that the inhibitor or stimulus was perturbed (1), while white means the opposite (0). Readouts are shown in blue and describe the time series under the given perturbations. Using this input, caspo-ts performs two steps: ASP solving and model checking. In the ASP solving step, (i) a set of BNs compatible with the PKN is generated; (ii) an over-approximation constraint is then imposed on each candidate BN to filter out invalid BNs, i.e., those that do not yield an over-approximation of the reachability between the Boolean states given by the phosphoproteomic dataset; and (iii) the BNs are optimized using an objective function that minimizes the distance to the experimental measurements. The ASP step also repairs some data points of the time series, and each repair adds a penalty to the objective function. The corrected traces are then passed to the model checker. In the model checking step, the exact reachability of all the (binarized and corrected) time series traces is verified in the family of BNs.
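The reachability question that the model checking step answers can be illustrated with a minimal sketch. This is not the caspo-ts implementation: it assumes a fully asynchronous update scheme (one node changes per transition), which the caption does not specify, and it uses a hypothetical three-node chain in which `mek` and `erk` are made-up readouts driven by the stimulus `hgf`. A plain breadth-first search stands in for the model checker.

```python
from collections import deque


def successors(state, rules):
    """Asynchronous successors: flip exactly one node whose rule
    disagrees with its current value (assumed update scheme)."""
    result = []
    for node, rule in rules.items():
        new_value = rule(state)
        if new_value != state[node]:
            nxt = dict(state)
            nxt[node] = new_value
            result.append(nxt)
    return result


def reachable(start, goal, rules):
    """Return True if some state matching the partial observation
    `goal` can be reached from `start` (breadth-first search)."""
    seen = {tuple(sorted(start.items()))}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if all(state[name] == value for name, value in goal.items()):
            return True
        for nxt in successors(state, rules):
            key = tuple(sorted(nxt.items()))
            if key not in seen:
                seen.add(key)
                queue.append(nxt)
    return False


# Hypothetical toy network: hgf is a fixed input (it has no rule),
# mek copies hgf, and erk copies mek.
rules = {
    "mek": lambda s: s["hgf"],
    "erk": lambda s: s["mek"],
}
start = {"hgf": 1, "mek": 0, "erk": 0}

# erk can switch on, but only after mek has switched on.
erk_on = reachable(start, {"erk": 1}, rules)
erk_on_without_mek = reachable(start, {"mek": 0, "erk": 1}, rules)
print(erk_on, erk_on_without_mek)
```

In this toy case an observation with `erk` on and `mek` off cannot be reached, so a BN with these rules would be rejected for a trace containing that observation.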
Fig 2.
Breast cancer signaling pathway.
This figure shows the signaling network reconstructed from a combination of databases. An arrow shows a positive regulatory relationship between two proteins, while a T-shaped arrow indicates inhibition. Green nodes are stimuli, blue nodes are readouts, white nodes are unmeasured or unobserved, and blue nodes with a red border are both inhibitors and readouts. Note that the node labels include the phosphorylation sites alongside the protein names so that they can be connected to the experimental measurements.
Fig 3.
Boolean Network of breast cancer cell lines.
The aggregated graph for all cell lines. Blue, red, green and orange edges correspond to the cell lines BT20, BT549, MCF7 and UACC812, respectively. The nodes are connected to their direct predecessors by logic gates (AND or OR). Edges show influences (→ for positive and ⊣ for negative). An AND gate is depicted by a small black circle whose incoming edges are the inputs of the gate. An OR gate is depicted by multiple edges entering the node. Node colors indicate node types: green for stimuli, red for inhibitors, blue for readouts, and white for unobserved nodes. Black edges denote hyper-edges common to several cell lines; the thickness of a black hyper-edge denotes the number of cell lines sharing it.
Table 1.
Similarity scores among breast cancer cell lines.
Fig 4.
Heterogeneous Boolean functions.
The Boolean functions are represented on the y-axis and the frequency of each Boolean function is shown on the x-axis. A Boolean function, or hyper-edge, is of the form node ← expr, where node is the receiver of the Boolean clause expr in the BN. In the Boolean clause, the NOT operator is represented by a “!” symbol and the AND operator by a “+” symbol. A disjunction of clauses is represented by multiple reactions with the same receiver node.
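The notation described in this caption can be made concrete with a short sketch. It is an illustration only, not code from caspo-ts: the node names `a`, `b`, `c` and `d` are hypothetical, and the ASCII arrow `<-` is accepted as a stand-in for ←. Each hyper-edge contributes one conjunctive clause, and a node is on when at least one of its clauses holds.

```python
from collections import defaultdict


def parse_clause(clause):
    """Split a clause such as 'a+!b' into (name, required_value) literals."""
    literals = []
    for token in clause.split("+"):
        token = token.strip()
        if token.startswith("!"):
            literals.append((token[1:], False))  # "!" is NOT
        else:
            literals.append((token, True))
    return literals


def build_network(hyper_edges):
    """Group clauses by receiver node; several hyper-edges with the
    same receiver form a disjunction (OR) of their clauses."""
    network = defaultdict(list)
    for edge in hyper_edges:
        target, clause = edge.replace("←", "<-").split("<-")
        network[target.strip()].append(parse_clause(clause))
    return dict(network)


def evaluate(network, node, state):
    """A node is on if any clause has all of its literals satisfied."""
    return any(
        all(state[name] == value for name, value in clause)
        for clause in network[node]
    )


# Hypothetical hyper-edges: c is on if (a AND NOT b) OR d.
edges = ["c <- a+!b", "c ← d"]
network = build_network(edges)

via_and = evaluate(network, "c", {"a": True, "b": False, "d": False})
via_or = evaluate(network, "c", {"a": False, "b": False, "d": True})
neither = evaluate(network, "c", {"a": True, "b": True, "d": False})
print(via_and, via_or, neither)
```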
Fig 5.
Common Boolean functions across all four cell lines.
The Boolean functions are represented on the x-axis and the frequency of each Boolean function is shown on the y-axis.
Table 2.
This table summarizes the RMSE results for each cell line. We calculated the discrete RMSE (the error due to the discretization of the data) and the model RMSE (the caspo-ts error). The Delta column shows the difference between the model RMSE and the discrete RMSE.
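As a rough illustration of how the three columns relate, the sketch below computes a discrete RMSE, a model RMSE and their difference on made-up numbers. The values, the 0.5 discretization threshold, and the sign convention (model minus discrete) are assumptions for this example and are not taken from the table.

```python
import math


def rmse(xs, ys):
    """Root mean squared error between two equally long sequences."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


# Hypothetical normalized measurements in [0, 1] for one readout.
measured = [0.1, 0.8, 0.6, 0.3]

# Assumed discretization: threshold at 0.5.
discretized = [1 if value >= 0.5 else 0 for value in measured]

# Hypothetical Boolean predictions from a learned BN.
predicted = [0, 1, 0, 0]

discrete_rmse = rmse(measured, discretized)  # error due to binarization alone
model_rmse = rmse(measured, predicted)       # error of the model's predictions
delta = model_rmse - discrete_rmse           # assumed sign convention

print(round(discrete_rmse, 3), round(model_rmse, 3), round(delta, 3))
```

Under these assumptions, a Delta of zero would mean that the model's error is entirely explained by the discretization.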
Fig 6.
Performance assessment with learning, testing and random datasets.
The x-axis shows the cell line and the y-axis shows the RMSE ratio (see Eq (2)) of the BNs inferred from the HPN-DREAM data for each cell line with respect to three datasets, which are distinguished by color. The RMSE ratios with respect to the HPN-DREAM learning and testing datasets are shown in blue and green, respectively. The distribution of the RMSE ratio for the random dataset is shown as red boxplots.
Fig 7.
ROC curve across all cell lines.
The x-axis shows the false positive rate and the y-axis denotes the true positive rate. These rates are calculated using Eqs (3) and (4). The average AUROC score is 0.77.
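Eqs (3) and (4) are not reproduced in this caption, so the sketch below assumes the standard definitions, TPR = TP / (TP + FN) and FPR = FP / (FP + TN), and estimates the area under the curve with the trapezoidal rule. The scores and labels are made-up values, and the function names are hypothetical.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) points obtained by sweeping a threshold
    over the distinct scores, from strictest to most permissive."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 1)
        fp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auroc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical prediction scores and true labels (1 = positive).
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]

points = roc_points(scores, labels)
area = auroc(points)
print(points[-1], round(area, 3))
```

An AUROC of 0.5 corresponds to random guessing and 1.0 to a perfect ranking.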