
Fig 1.

Flow chart of MEGENA.

A) Fast planar filtered network (PFN) construction. Significant interactions are first identified and then embedded on a topological surface via a parallelized screening procedure described in the text. On the right, a toy example illustrates the construction of a PFN from a network thresholded by FDR (top left) and the gradual construction of the PFN, with the numbers of included links and screened pairs shown at the top of each step. B) Multi-scale clustering. Beginning with the connected components of the initial PFN as the parent clusters, clustering is performed for each parent cluster and the compactness of the sub-clusters is evaluated. These steps are described in the dotted box. The clustering is performed iteratively until no parent cluster remains that is meaningful to split. C) Downstream analyses. Multiscale Hub Analysis (MHA) is performed to detect significant hubs within individual clusters and across α, characterizing different scales of organization in the PFN. Clusters are then ranked by their associations with clinical traits, including enrichment of differentially expressed gene (DEG) signatures and correlations with survival end-points.


Fig 2.

Comparison of acceptance rates of correlation pairs into PFN links.

A,B) Results from PFN construction from TCGA lung squamous cell carcinoma (LUSC) data including 20523 genes. 57562 links out of a maximum possible 61563 are embedded. The left panel (A) shows the acceptance rates without PCP (denoted “serial”, in blue) and after performing PCP (denoted “PCP”, in red) as a function of the number of links already embedded on the PFN, normalized by the maximum possible number of embedded links. The right panel (B) shows the ratio of the acceptance rates after PCP to the acceptance rates without PCP as a function of the number of links already embedded on the PFN, with the same normalization. C,D) Results from TCGA thyroid carcinoma (THCA) data including 16639 genes. 44802 links out of a maximum possible 49911 are embedded. The left and right panels show the same plots as described for LUSC.
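The maximum link numbers quoted in this caption agree with the standard bound for a simple planar graph on N nodes, 3(N − 2). The short sketch below checks that arithmetic and computes the normalized x-axis value and the panel B/D ratio. The function names are illustrative and do not come from the MEGENA software.

```python
def max_planar_links(n_nodes: int) -> int:
    """Upper bound on links in a simple planar graph: 3 * (N - 2) for N >= 3."""
    if n_nodes < 3:
        # With fewer than 3 nodes every possible pair can be linked.
        return n_nodes * (n_nodes - 1) // 2
    return 3 * (n_nodes - 2)


def embedded_fraction(n_embedded: int, n_nodes: int) -> float:
    """Embedded links normalized by the maximum possible number (the x-axis)."""
    return n_embedded / max_planar_links(n_nodes)


def acceptance_ratio(rate_pcp: float, rate_serial: float) -> float:
    """Ratio of the acceptance rate after PCP to the serial acceptance rate."""
    if rate_serial <= 0:
        raise ValueError("serial acceptance rate must be positive")
    return rate_pcp / rate_serial


if __name__ == "__main__":
    for name, genes, links in [("LUSC", 20523, 57562), ("THCA", 16639, 44802)]:
        print(name, max_planar_links(genes), round(embedded_fraction(links, genes), 3))
```

For both data sets the bound reproduces the maxima given in the caption (61563 and 49911).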


Fig 3.

Validation of PFNs in comparison to various network inference methods.

A. Comparison of the AUC of ROC for weighted shortest path distances of networks inferred from data simulated from various gold standard networks (labeled at the top), in comparison to ARACNE and RF. Different combinations with Pearson’s correlation coefficient (Pearson), mutual information (MI) and Euclidean distance (Euclid) were tested. B-C. Comparison of BRCA TF knock-down signatures on BRCA PFN (red) and FDRN (green) neighborhoods of the target TFs, inferred from MI. The strips at the top of each plot show the expression fold changes (1.3 and 1.5, respectively) used to derive these signatures. B shows FDR corrected FET p-values against the number of significantly enriched signatures. C shows the enrichment fold change cut-off against the number of significantly enriched signatures. D-E. Comparison of BRCA TF knock-down signatures on networks inferred from PCC. D and E correspond to FDR corrected FET p-values and enrichment fold changes, as in B and C.


Table 1.

Table of best average AUC-ROC across various FDR thresholds.

Each column represents a combination of network inference method and similarity/dissimilarity measure tested, and each row represents a gold standard network from which time series were generated. The best performing methods are highlighted in bold.
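As a rough illustration of the AUC-ROC score reported in this table, the sketch below computes it through the rank (Mann-Whitney) formulation: the probability that a true gold-standard link receives a higher score than a non-link, with ties counted as one half. Using the negative shortest path distance as the score is an assumption made here for illustration, and the function name is hypothetical.

```python
def auc_roc(scores, labels):
    """AUC as P(score of a true link > score of a non-link), ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one true link and one non-link")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data: shortest path distances for four gene pairs and whether
    # each pair is a link in the gold standard network (1) or not (0).
    distances = [1.0, 2.0, 3.0, 4.0]
    is_true_link = [1, 1, 0, 1]
    # Shorter distance should mean "more likely linked", so negate it.
    scores = [-d for d in distances]
    print(auc_roc(scores, is_true_link))
```

The pairwise loop is quadratic in the number of pairs, which is fine for a toy example but would need a rank-based implementation for genome-scale networks.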


Fig 4.

The global BRCA PFN.

Different node colors represent different clusters identified at a scale of α = 1.3. Node size and label size are proportional to node degree.


Fig 5.

Degree distributions of the BRCA PFN (A) and the LUAD PFN (B).

The x-axis is the logarithm of degree k and the y-axis is the logarithm of the inverse cumulative degree distribution, P(k′ > k). The red straight line is the fitted distribution P(k′ > k) ~ k^(γ+1), where γ is the estimated exponent of the underlying degree distribution. The respective γ value is displayed at the top.
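A minimal sketch of how such a plot and exponent could be obtained is given below. It assumes an ordinary least-squares line on the log-log points, which is a common choice but is not stated in the caption. The slope of that line estimates γ + 1, so γ is the slope minus one. All names are illustrative.

```python
import math
from collections import Counter


def inverse_cumulative(degrees):
    """Return sorted (k, P(k' > k)) pairs, skipping points where P is zero."""
    n = len(degrees)
    counts = Counter(degrees)
    remaining = n
    points = []
    for k in sorted(counts):
        remaining -= counts[k]  # nodes with degree strictly greater than k
        if remaining > 0:
            points.append((k, remaining / n))
    return points


def fit_loglog_slope(points):
    """Least-squares slope of log P against log k (assumed fitting method)."""
    pts = [(math.log(k), math.log(p)) for k, p in points if k > 0 and p > 0]
    if len(pts) < 2:
        raise ValueError("need at least two points with k > 0")
    mx = sum(x for x, _ in pts) / len(pts)
    my = sum(y for _, y in pts) / len(pts)
    sxx = sum((x - mx) ** 2 for x, _ in pts)
    sxy = sum((x - mx) * (y - my) for x, y in pts)
    return sxy / sxx


def estimate_gamma(degrees):
    """P(k' > k) ~ k^(gamma + 1), so gamma is the fitted slope minus one."""
    return fit_loglog_slope(inverse_cumulative(degrees)) - 1.0
```

A call such as `estimate_gamma(degree_list)` on the node degrees of a network would then give a single γ to compare with the value displayed on the plot.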


Fig 6.

Comparison of MEGENA (as a combination of the multiscale clustering analysis and PFN) and various combinations of the established clustering techniques (eigenvector, infomap, walktrap, WGCNA) and the networks (PFN, FDRN, WGCN) using the TCGA BRCA gene expression data.

Two different similarity measures (MI and PCC) were used to compare the robustness of the analyses with respect to the measure used to evaluate interactions. A) Number of significantly enriched functional/pathway signatures (Bonferroni corrected FET p-values) from MSigDB at various p-value thresholds. B) Number of significantly enriched functional/pathway signatures from MSigDB at various odds ratio thresholds. C) Number of clusters predictive of patient survival (based on FDR corrected Cox p-values) at various significance levels. D) Number of clusters predictive of patient survival (based on FDR corrected Cox p-values) and associated with at least one significantly enriched signature with Bonferroni corrected FET p-value < 0.05.
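The enrichment statistics named in this caption can be sketched as follows: a one-sided Fisher's exact test (FET) for over-representation of a signature in a cluster, the corresponding odds ratio, and a Bonferroni correction. This is a generic illustration under those assumptions, not the authors' pipeline, and the function names are hypothetical.

```python
from math import comb


def fet_enrichment_p(overlap, cluster_size, signature_size, background):
    """One-sided FET p-value: P(X >= overlap) under the hypergeometric null."""
    max_k = min(cluster_size, signature_size)
    total = comb(background, cluster_size)
    tail = sum(
        comb(signature_size, k) * comb(background - signature_size, cluster_size - k)
        for k in range(overlap, max_k + 1)
    )
    return tail / total


def odds_ratio(overlap, cluster_size, signature_size, background):
    """Odds ratio of the 2x2 table; infinite when an off-diagonal cell is zero."""
    a = overlap
    b = cluster_size - overlap
    c = signature_size - overlap
    d = background - cluster_size - signature_size + overlap
    if b * c == 0:
        return float("inf")
    return (a * d) / (b * c)


def bonferroni(p_value, n_tests):
    """Bonferroni corrected p-value, capped at 1."""
    return min(1.0, p_value * n_tests)


if __name__ == "__main__":
    # Toy numbers: 20 of 100 cluster genes fall in a 200-gene signature,
    # out of a 10000-gene background, tested against 1000 signatures.
    p = fet_enrichment_p(20, 100, 200, 10000)
    print(bonferroni(p, 1000), odds_ratio(20, 100, 200, 10000))
```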


Fig 7.

Identification of the adipocytokine-enriched cluster, comp1_56, which was specifically identified by MEGENA.

A) The global BRCA PFN. The nodes in red represent the genes that are predictive of overall survival of LumB patients (Cox p-value < 0.05). The blue circle indicates the location of the cluster comp1_56. B) A magnified view of the cluster comp1_56. The nodes with labels are the hubs of the cluster.


Fig 8.

Kaplan-Meier plots of subgroups separated by median expressions of two hub genes AQP7 (A) and CIDEC (B), showing significant logrank p-values.

Blue curves showing lower risks correspond to lower expressions, and red curves showing higher risks correspond to higher expressions.
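The grouping and curves described here can be sketched with a plain Kaplan-Meier product-limit estimator applied to a median split of expression. The sketch assumes that samples at or below the median form the low-expression group, which the caption does not specify, and it omits the logrank test. Function names are illustrative.

```python
from statistics import median


def median_split(expression):
    """Return (low, high) sample indices; ties at the median go to 'low'."""
    cut = median(expression)
    low = [i for i, x in enumerate(expression) if x <= cut]
    high = [i for i, x in enumerate(expression) if x > cut]
    return low, high


def kaplan_meier(times, events):
    """Product-limit estimate: list of (time, survival) at each event time.

    events[i] is 1 if the end-point was observed at times[i], 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Process all samples sharing the same time together.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


if __name__ == "__main__":
    expression = [2.1, 5.3, 0.7, 4.4]   # toy expression values of one gene
    times = [30, 12, 45, 8]             # toy follow-up times
    events = [1, 1, 0, 1]               # 1 = event observed, 0 = censored
    for name, idx in zip(("low", "high"), median_split(expression)):
        print(name, kaplan_meier([times[i] for i in idx], [events[i] for i in idx]))
```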


Fig 9.

Hierarchical organization of functions and signaling pathways corresponding to the multiscale clusters identified by MEGENA.

A) Comparison of the number of significantly enriched function and pathway signatures across clusters identified at different scale groups. The scale groups identified from MHA are colored according to the legend, and “all” denotes the collection of clusters across the scale groups. B) Multiscale organization of clusters in the PFN. Each node is a cluster identified by multiscale clustering in the PFN, where the node size is proportional to the cluster size, the node color follows the cluster group color scheme in A, and the node label indicates the most enriched function/signaling pathway for the cluster. A directed link a→b indicates that b is a sub-cluster of a.


Fig 10.

Comparison of expression fold changes (FC) of the hub genes and non-hub genes between different cancer stages in BRCA, against lists of genes identified by multiscale hub analysis.

The numeric labels on the x-axis represent the ranges of α values defining the resolution levels of the hubs, “multiscale” represents the intersection of hub genes across different scales, and “non.hub” represents the rest of the genes.


Fig 11.

Kaplan-Meier plots of the subgroups defined by median expression of ROPN1 in A) all the patients, B) the ER+ patients and C) the PR+ patients.

Blue and red curves correspond to the lower and higher expression levels of ROPN1, respectively.


Fig 12.

Fast PFN construction.

A parallelized screening procedure is developed to extract a subset of gene pairs that are highly likely to be embedded. A) FPFNC begins with a rank-ordered list of association pairs. B) A subset of Nc pairs then undergoes parallelized quality control, testing each pair's embeddability on a single fixed network Go, to identify the pairs that are more likely to be embedded in the subsequent network construction steps. C) This screened set of pairs is then tested sequentially on the growing embedded network. D) The final updated network G is used as Go in the next cycle. The whole process is repeated until the defined termination criterion is met.


Fig 13.

Flow chart of the clustering analysis procedure for each value of compactness resolution parameter, α.

The upper panel illustrates the k-split procedure within each cluster to detect optimal sub-clusters. The lower panel describes the compactness evaluation procedure (CEP) after k-split. CEP compares the parent cluster prior to k-split with the sub-clusters after k-split by means of the compactness measure, νl, and updates the partition accordingly. On the left, each step is illustrated by a graphical toy example. From the top, the pictures correspond to: the initial network subject to clustering, correct classification of boundary nodes by BDP (Before: before BDP, After: correction after BDP), identification of the optimal k via modularity Qk, final clusters, and comparison between initial network and sub-clusters via compactness. These steps are iterated for all clusters from the newly updated partition until no further update can be made.
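To make the "optimal k via modularity" step concrete, the sketch below computes Newman's modularity Q for candidate partitions of an unweighted network and keeps the k with the largest Q. The unweighted form is an assumption made for brevity, the candidate partitions are supplied by hand rather than produced by the k-split procedure, and all names are hypothetical.

```python
def modularity(edges, membership):
    """Newman modularity Q = sum_c [ m_c / m - (d_c / (2 m))^2 ].

    edges: list of (u, v) pairs of an unweighted, undirected network.
    membership: dict mapping each node to its cluster label.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("network has no links")
    internal = {}    # links with both ends in cluster c (m_c)
    degree_sum = {}  # total degree of the nodes in cluster c (d_c)
    for u, v in edges:
        cu, cv = membership[u], membership[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2 for c, d in degree_sum.items()
    )


def best_split(edges, candidates):
    """Return the k whose candidate partition has the highest modularity."""
    return max(candidates, key=lambda k: modularity(edges, candidates[k]))


if __name__ == "__main__":
    # Toy network: two triangles joined by a single link.
    edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
    candidates = {
        1: {n: 0 for n in range(6)},
        2: {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1},
    }
    print(best_split(edges, candidates))
```

In the toy network the two-cluster partition separates the triangles and has Q = 5/14, while the single-cluster partition has Q = 0, so k = 2 is selected.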


Fig 14.

Identification of hubs at various scale groups (defined by α) in the breast cancer PFN.

A) Plots of various internal validity indices used for selecting the optimal number of clusters to group α values. B) Barplot showing summarized scores from the normalized ranks of the internal validity indices in A). C) A heatmap of the pairwise Euclidean distances between any two vectors of the within-cluster connectivity (determined by Cw(V,A)) of all the nodes at the corresponding scales. The color bar at the top of the heatmap represents the distinct scale clusters identified by MHA.
