Performance assessment of sample-specific network control methods for bulk and single-cell biological data analysis

doi:10.1371/journal.pcbi.1008962

Table 1.

The summary of recommended methods in different biological application scenarios.

* denotes that the method is recommended in the SSC analysis.

Fig 1.

Overview of the Sample-Specific Control problem (SSC).

(A) The flowchart of SSC analysis. SSC analysis consists of two steps. The first step is to construct the sample-specific state transition network from the sample datasets. Several sample-specific network construction techniques have been proposed for this purpose, including the Single Pearson Correlation Coefficient (SPCC), Linear Interpolation to Obtain Network Estimates for Single Samples (LIONESS), Single-Sample Network (SSN), and Cell-Specific Network construction (CSN) methods. Among them, the SSN method requires reference samples to construct the single-sample differential co-expression network. To filter noise in the reconstructed sample-specific networks, directed protein interaction networks can be used to retain edge direction in the sample-specific state transition network. The second step is to apply network control principles. Several structural network control methods have been proposed for finding a minimum set of driver nodes that can control the state of the whole network, given adequate knowledge of the network structure; these include the directed-network-based methods (MMS and DFVS) and the undirected-network-based methods (MDS and NCUA). (B) Representative biological meaning of “network driver nodes” in the structural network control principles. Assuming that biological samples can be represented as sample-specific interaction networks, the sample-specific network driver nodes can provide an efficient resource of personalized driver genes and cell-specific markers that can be useful for understanding tumor or cell heterogeneity.
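
To make the first step more concrete, the following is a minimal sketch of how a LIONESS-style edge weight for one sample could be computed from Pearson correlation. It assumes the commonly stated LIONESS relation, in which the sample-specific weight equals N times the difference between the all-sample estimate and the leave-one-out estimate, plus the leave-one-out estimate. The function names `pearson` and `lioness_edge` are illustrative and are not taken from the paper or its software.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def lioness_edge(x, y, q):
    """LIONESS-style edge weight between two genes for sample index q.

    x and y hold the expression of the two genes across all N samples.
    Assumed relation: w_q = N * (w_all - w_without_q) + w_without_q.
    """
    n = len(x)
    w_all = pearson(x, y)
    # Leave sample q out and re-estimate the aggregate edge weight.
    w_rest = pearson(x[:q] + x[q + 1:], y[:q] + y[q + 1:])
    return n * (w_all - w_rest) + w_rest
```

In this sketch a sample that weakens the aggregate correlation receives a sample-specific weight below the aggregate value; at least four samples are needed so that the leave-one-out correlation is informative.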

Fig 2.

The evaluation of structural network control methods by simulation of biological networks.

The goal is to evaluate control efficiency by simulating control of the networked nonlinear Lorenz oscillator system towards desired attractors on two biological directed protein interaction networks from different resources (i.e., Network-1 and Network-2). (A-B) Desired attractor (8.484, 8.484, 27); (C-D) Desired attractor (-8.484, -8.484, 27); (E-F) Desired attractor (0, 0, 0). The efficiency for a given network is measured by the average value over different tolerance errors from 0.1 to 1.
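
The three desired attractors listed above are consistent with the equilibrium points of a single Lorenz oscillator under the standard parameters. The sketch below assumes sigma = 10, rho = 28, and beta = 8/3, which the caption does not state, and checks that the vector field vanishes at each point. With these values the nonzero coordinate is sqrt(72), approximately 8.485, close to the 8.484 quoted in the caption.

```python
from math import sqrt

# Assumed standard Lorenz parameters (not stated in the caption).
SIGMA, RHO, BETA = 10.0, 28.0, 8.0 / 3.0

def lorenz(state):
    """Right-hand side of the Lorenz equations for one oscillator."""
    x, y, z = state
    return (SIGMA * (y - x), x * (RHO - z) - y, x * y - BETA * z)

def equilibria():
    """The three equilibrium points of the Lorenz system for RHO > 1."""
    c = sqrt(BETA * (RHO - 1.0))
    return [(c, c, RHO - 1.0), (-c, -c, RHO - 1.0), (0.0, 0.0, 0.0)]

for point in equilibria():
    # Each derivative component should be (numerically) zero.
    print(point, lorenz(point))
```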

Fig 3.

Evaluation of SSC workflows for driver gene identification on nine TCGA bulk cancer gene expression datasets.

By using different combinations of sample-specific network construction method and reference network, different sample-specific state transition networks can be obtained, e.g., (A) CSN_Net1, (B) CSN_Net2, (C) SSN_Net1, (D) SSN_Net2, (E) SPCC_Net1, (F) SPCC_Net2, (G) LIONESS_Net1, and (H) LIONESS_Net2. The performance of four structural network control methods on these sample-specific state transition networks was then evaluated, representing the performance of different SSC analysis workflows.

Fig 4.

Evaluation of SSC workflows for drug target identification on LUSC and LUAD cancer datasets from TCGA.

To evaluate how useful sample-specific network drivers are for personalized drug target discovery, the number of sample-specific network drivers targeted by each drug combination is calculated, and anti-cancer drug target combinations are ranked for each patient. The drug combinations annotated in the Clinical Anti-cancer Combinational drugs (CAC) were applied to assess the AUC of the top-ranked/predicted anti-cancer drug combinations from different SSC workflows using (A) LUSC and (B) LUAD cancer datasets.
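
The per-patient ranking described above can be sketched as a simple overlap count. This is a hedged illustration only: the function name `rank_combinations`, the gene labels, and the combination names are hypothetical placeholders, and the tie-breaking rule (alphabetical) is an assumption rather than something the caption specifies.

```python
def rank_combinations(driver_genes, combo_targets):
    """Rank drug combinations by how many of a patient's driver genes they target.

    driver_genes: iterable of driver gene identifiers for one patient.
    combo_targets: dict mapping a combination name to its target genes.
    Returns (name, count) pairs, highest count first; ties sorted by name.
    """
    drivers = set(driver_genes)
    scores = {name: len(drivers & set(targets))
              for name, targets in combo_targets.items()}
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical example data (placeholders, not from the paper).
patient_drivers = {"G1", "G2", "G3"}
combos = {
    "combo_A": ["G1", "G4"],
    "combo_B": ["G1", "G2", "G3"],
    "combo_C": ["G5"],
}
print(rank_combinations(patient_drivers, combos))
```

The resulting ranked list is what would then be compared against annotated combinations to compute an AUC.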

Fig 5.

Evaluation of SSC workflows by key gene identification on the Chu-time single-cell gene expression dataset.

(A) Introduction of the Chu-time dataset. This dataset comes from a developmental biology study and contains 758 cells across six time points (0 h, 12 h, 24 h, 36 h, 72 h, and 96 h) along the differentiation process from human embryonic stem cells to definitive endoderm cells. (B)–(F) The usage of four sample-specific network construction methods in identifying the cell clusters, i.e., cell types/states. (G) Detection of a group of genes with significant expression changes across the multiple time points, as determined by the SCPattern method.

Fig 6.

Evaluation of SSC workflows by the identification of important genes involved in cell fate decisions.

The corresponding SSC analysis is based on different sample-specific state transition networks, including (A) CSN_Net1, (B) SSN_Net1, (C) SPCC_Net1, (D) LIONESS_Net1, (E) CSN_Net2, (F) SSN_Net2, (G) SPCC_Net2, and (H) LIONESS_Net2.

Fig 7.

Identification of “dark genes” by different SSC workflows on the Chu-time single-cell dataset.

The different state transition networks include (A) CSN_Net1, (B) CSN_Net2, (C) SSN_Net1, (D) SSN_Net2, (E) LIONESS_Net1, (F) LIONESS_Net2, (G) SPCC_Net1, and (H) SPCC_Net2.

Fig 8.

Identification of “dark-differential expression genes” by different SSC workflows on multiple TCGA cancer expression datasets.

The different state transition networks include (A) CSN_Net1, (B) CSN_Net2, (C) SSN_Net1, (D) SSN_Net2, (E) SPCC_Net1, (F) SPCC_Net2, (G) LIONESS_Net1, and (H) LIONESS_Net2.

Fig 9.

Evaluation of identification consensus among structural control methods.

Pairwise comparisons between control methods are carried out using (A) nine TCGA cancer datasets and (B) temporal single-cell datasets.

Fig 10.

Evaluation of robustness of network structural control methods.

MMS, MDS, DFVS, and NCUA were compared when two reference networks were used with CSN, SSN, SPCC, and LIONESS. The robustness of the structural network control methods is shown in (A) for nine TCGA cancer datasets and in (B) for temporal single-cell datasets.

Fig 11.

P-values of the network structure factors compared with the random selection method.

(A) Degree centrality; (B) Closeness centrality; (C) Betweenness centrality.

Fig 12.

The F-score of the driver nodes’ enrichments in the list of cancer-associated genes in the human signaling network.
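
As an illustration of the quantity named in this caption, the sketch below computes an F-score for the overlap between a predicted driver set and a reference gene list. It assumes the balanced F1 form (the harmonic mean of precision and recall); the caption does not state which variant was used, and the function name `f_score` and gene labels are hypothetical.

```python
def f_score(predicted, reference):
    """Balanced F-score (F1) of a predicted gene set against a reference set."""
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    if true_positives == 0:
        return 0.0  # also covers empty inputs
    precision = true_positives / len(predicted)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)

# Hypothetical example: two of four predicted drivers are in a
# three-gene reference list, so precision = 1/2 and recall = 2/3.
print(f_score({"a", "b", "c", "d"}, {"c", "d", "e"}))
```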
