Graph-theoretical model of global human interactome reveals enhanced long-range communicability in cancer networks

Evgeny Gladilin

doi:10.1371/journal.pone.0170953

Abstract

Malignant transformation is known to involve substantial rearrangement of the molecular genetic landscape of the cell. A common approach to analysis of these alterations is a reductionist one and consists of finding a compact set of differentially expressed genes or associated signaling pathways. However, due to intrinsic tumor heterogeneity and tissue specificity, biomarkers defined by a small number of genes/pathways exhibit substantial variability. As an alternative to compact differential signatures, global features of genetic cell machinery are conceivable. Global network descriptors suggested in previous works are, however, known to potentially be biased by overrepresentation of interactions between frequently studied genes-proteins. Here, we construct a cellular network of 74538 directional and differential gene expression weighted protein-protein and gene regulatory interactions, and perform graph-theoretical analysis of global human interactome using a novel, degree-independent feature—the normalized total communicability (NTC). We apply this framework to assess differences in total information flow between different cancer (BRCA/COAD/GBM) and non-cancer interactomes. Our experimental results reveal that different cancer interactomes are characterized by significant enhancement of long-range NTC, which arises from circulation of information flow within robustly organized gene subnetworks. Although enhancement of NTC emerges in different cancer types from different genomic profiles, we identified a subset of 90 common genes that are related to elevated NTC in all studied tumors. Our ontological analysis shows that these genes are associated with enhanced cell division, DNA replication, stress response, and other cellular functions and processes typically upregulated in cancer. 
We conclude that enhancement of long-range NTC is manifested in the correlated activity of genes whose tight coordination is required for survival and proliferation of all tumor cells and, thus, can be seen as a graph-theoretical equivalent to some hallmarks of cancer. The computational framework for differential network analysis presented herein is of potential interest for a wide range of network perturbation problems given by single or multiple gene-protein activation-inhibition.

Citation: Gladilin E (2017) Graph-theoretical model of global human interactome reveals enhanced long-range communicability in cancer networks. PLoS ONE 12(1): e0170953. https://doi.org/10.1371/journal.pone.0170953

Editor: Peter Csermely, Semmelweis University, HUNGARY

Received: February 24, 2016; Accepted: January 13, 2017; Published: January 31, 2017

Copyright: © 2017 Evgeny Gladilin. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All relevant data are within the paper and its Supporting Information files.

Funding: The author received no specific funding for this work.

Competing interests: The author has declared that no competing interests exist.

Introduction

Clinically relevant, macroscopically detectable tumors are known to exhibit phenotypic and molecular genetic heterogeneity [1]. Despite considerable genetic diversity, different tumor cells manage to maintain common functional capabilities that manifest in hallmarks of cancer [2]. The underlying mechanisms of cancer hallmark maintenance in different tumors with different genomic profiles are not yet well understood. As a consequence of cancer heterogeneity and plasticity, differential signatures defined by a relatively small number of genes-proteins exhibit substantial variability, which complicates the identification of cancer-specific alterations in microarrays and other omics data.

An alternative approach to quantitative characterization of malignant transformations consists in the assessment of the global architecture of cellular networks. Recent advances in network science provide a powerful theoretical framework for the description of global properties of physical, social and biological networks [3–5]. For construction of binary and weighted biological networks, gene co-expression maps [6–8], pairwise physical interactions and non-physical associations between proteins, DNA, RNA, metabolites and gene regulatory events have been applied [9–23]. Diverse parameters of local and global network organization have been used for quantitative description and differentiation of normal, diseased and random interactomes, including graph-theoretical measures such as node degree, centrality, modularity and clustering [24–27], network statistics [28], information content [29] and hyperbolicity [30]. Global information-theoretical features, such as network entropy, have been shown to significantly differ between cancer and non-cancer interactomes [31, 32].

Cancer networks have repeatedly been reported to be significantly larger, interlinked more densely and more tautly organized in comparison to non-cancer and, in particular, random networks [25, 33–37]. These findings were, however, challenged by reasonable criticism that refers to potential biases of existing network descriptors due to overrepresentation of disease-related genes. Consequently, these genes exhibit a higher number of interactions, higher degrees and other artificially exceptional features in contrast to poorly studied targets [38, 39]. To overcome shortcomings of degree-based descriptors, we present a novel degree-normalized communicability measure that is applied to study information flow in global cancer and non-cancer networks whose basic topology is defined by directional and gene expression weighted protein-protein and gene regulatory interactions.

The manuscript is organized as follows. First, methods for construction of gene expression weighted network topology are described. The experimental results of comparative analysis of cancer and non-cancer interactomes are presented and discussed. The complete set of raw and processed data used in this work can be found in supplementary information.

Methods

Microarray data preprocessing

TCGA level-3 microarray data from tumor and normal tissue samples of breast invasive carcinoma (BRCA), colon adenocarcinoma (COAD) and glioblastoma (GBM) patients are used. Lists of all TCGA samples used in this study are in S1 Table.

Statistical significance of differential gene expression between tumor and normal tissue samples is evaluated using the t-test with the p-value threshold p < 0.01. For significantly up/downregulated genes, the log2-fold average differential gene expression (ADGE) Δ_i is computed. Unidentified and non-significantly altered genes are assumed to have an unchanged level of expression (Δ_i = 0). Next, all N genes are sorted according to a hybrid score based on the product of t- and ADGE-values: λ_i = sign(Δ_i)(t_iΔ_i). To avoid dependency of subsequent calculations on statistical outliers, absolute values of gene scores are subsequently substituted by a uniform pattern of average gene expression in the range λ_i ∈ [−6.5, 6.5]. This transformation has the effect that genes with the same rank in a λ_i-sorted list receive equal weights in different cancer and non-cancer samples: (1) Sorted lists of rank-normalized gene weights for all tumor/norm, norm/tumor (i.e., reversely weighted tumor/norm lists) as well as randomized data are in S2 Table.
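
The scoring and rank-normalization steps can be sketched in Python as follows. This is a minimal sketch and not the author's implementation: the function names `hybrid_score` and `rank_normalize` and the input values are hypothetical, and because the explicit form of Eq (1) is not reproduced in this text, the linear spacing of weights over the ranks is an assumption.

```python
def hybrid_score(t, delta):
    """Hybrid score lambda_i = sign(delta_i) * (t_i * delta_i)."""
    sign = (delta > 0) - (delta < 0)
    return sign * (t * delta)


def rank_normalize(scores, bound=6.5):
    """Replace scores by uniformly spaced weights in [-bound, bound].

    Assumption: weights are spaced linearly over the ranks, so the gene
    with the highest score gets +bound and the lowest gets -bound.
    """
    genes = sorted(scores, key=scores.get, reverse=True)
    n = len(genes)
    if n == 1:
        return {genes[0]: 0.0}
    step = 2.0 * bound / (n - 1)
    return {g: bound - rank * step for rank, g in enumerate(genes)}


# Hypothetical (t, delta) pairs; delta = 0 marks non-significant genes.
raw = {"G1": (8.0, 2.0), "G2": (3.0, 0.5), "G3": (0.4, 0.0),
       "G4": (-4.0, -1.0), "G5": (-9.0, -3.0)}
scores = {g: hybrid_score(t, d) for g, (t, d) in raw.items()}
weights = rank_normalize(scores)
```

Because only the rank enters the final weight, two samples with very different score magnitudes yield the same weight for the gene at a given rank, which is the outlier-suppressing effect described above.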

Network topology compilation

Network topology is compiled on the basis of directed pairwise protein-protein and gene regulatory interactions by integration of open-source data provided with STRING (string-db.org), MSigDB (software.broadinstitute.org/gsea/msigdb) and PATHWAYCOMMONS (www.pathwaycommons.org). The complete list of 74538 directed pairwise interactions is in S3 Table.

Network communicability: plausibility considerations

First, we want to define a plausible measure for quantification of total information flow (communicability) along a single linear pathway. This should be done in such a way that the absence or malfunction of a single pathway link (i.e., network edge) results in interruption or significant impairment of the entire pathway communicability, see Fig 1(a). This intuitive constraint is satisfied by a measure defined as the product of all edge weights E_i, i.e.,

$$C = \prod_{i=1}^{N} E_i \qquad (2)$$

where N is the number of pathway links and E_i ≥ 0 are non-negative numbers whose values indicate the working (conducting) or non-working (non-conducting) state of the i-th pathway link. In the case of unweighted networks, conducting and non-conducting states of pathway links are described by binary weights of network edges E_i = 1 and E_i = 0. In weighted networks, weights of conducting network edges are positive floating-point numbers E_i > 0.

Fig 1. Principle concept of network communicability assessment.

a. Information flow along a single linear pathway joining two nodes (P1, P4) is defined as a product of weights E_i of all pathway links (here: E₁ E₂ E₃). Disruption of a single pathway link (E₂ → 0) results in interruption of the entire pathway communicability. b. Total information flow through multiple pathways is defined as a sum of communicabilities of all linear pathways joining a pair of nodes (here: three pathways with communicabilities E₁ E₂ E₃, E₄ E₅ E₆ E₇, E₄ E₈ E₉ E₇). Disruption of a single pathway link (E₂ → 0) does not interrupt the total P1→P4 communicability.

https://doi.org/10.1371/journal.pone.0170953.g001

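If two network nodes are connected by multiple pathways, total information flow should not critically depend on the state of a single pathway link or even one single pathway, see Fig 1(b). Consequently, total communicability between each two network nodes can be defined by a sum of all single pathway communicabilities:

$$C = \sum_{j=1}^{M} \prod_{i=1}^{N_j} E_i^{(j)} \qquad (3)$$

where M is the number of pathways joining the two nodes, N_j is the number of links in the j-th pathway and E_i^{(j)} is the weight of its i-th link.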

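Another plausible requirement on the network communicability measure is that intensity of information flow should decline with increasing distance from the source. Consequently, Eq (3) can be extended to

$$C = \sum_{j=1}^{M} \frac{1}{\omega(N_j)} \prod_{i=1}^{N_j} E_i^{(j)} \qquad (4)$$

where the sum runs over all M pathways joining the two nodes, N_j is the number of links in the j-th pathway, E_i^{(j)} is the weight of its i-th link, and ω(N_j) > 1 denotes a pathway length dependent weighting factor.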
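
The plausibility argument of Eqs (2)–(4) can be checked on the three pathways of Fig 1(b) with a short Python sketch. The function names are hypothetical, all edge weights are set to 1 for illustration, and two choices are assumptions: the pathway flow is divided by ω, and ω is taken to be the factorial of the pathway length by analogy with the matrix exponential used later.

```python
from math import factorial, prod


def pathway_flow(edge_weights):
    """Eq (2): product of edge weights along a single linear pathway."""
    return prod(edge_weights)


def total_flow(pathways, omega=None):
    """Eq (3), or Eq (4) if a length-dependent factor omega is given."""
    if omega is None:
        return sum(pathway_flow(p) for p in pathways)
    # Assumption: each pathway flow is divided by omega(pathway length).
    return sum(pathway_flow(p) / omega(len(p)) for p in pathways)


def fig1b_pathways(E):
    """The three P1 -> P4 pathways of Fig 1(b) as lists of edge weights."""
    return [[E[1], E[2], E[3]],
            [E[4], E[5], E[6], E[7]],
            [E[4], E[8], E[9], E[7]]]


E = {i: 1.0 for i in range(1, 10)}            # illustrative unit weights
intact = total_flow(fig1b_pathways(E))        # all three pathways conduct
E[2] = 0.0                                    # disrupt one link
disrupted = total_flow(fig1b_pathways(E))     # two pathways remain
damped = total_flow(fig1b_pathways(E), omega=factorial)
```

Setting E₂ to zero removes only the first pathway from the sum, so the total P1→P4 flow drops but does not vanish, while the length-dependent factor reduces the contribution of the two remaining four-step pathways.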

Normalized total communicability

In graph theory, the total number of walks of length n joining nodes of an arbitrary complex network is calculated as the n-th power Aⁿ of the sparse adjacency matrix A representing the graph, see Fig 2. In fact, one can show that Eq (3) is formally identical to the rule by which Aⁿ is compiled from the entries of A. In turn, the weighted version of our plausibly derived communicability measure (Eq (4)) naturally emerges within the concept of the adjacency matrix exponential e^A. Following [40, 41], the total communicability C_ij(n) between a pair of network nodes (i, j) joined by all possible walks of the maximum length n is calculated as the exponential of the adjacency matrix A computed up to the n-th power term of the series expansion:

$$C_{ij}(n) = \left( \sum_{k=0}^{n} \frac{A^k}{k!} \right)_{ij} = \left( I + A + \frac{A^2}{2!} + \dots + \frac{A^n}{n!} \right)_{ij} \qquad (5)$$

where I is the identity matrix. In simple terms, C_ij(n) represents a 1/k!-weighted sum of all walks (pathways) of the lengths k = 1, 2, 3, …, n joining a pair of network nodes with indices (i, j). In this study, the matrix exponential is calculated for n ≤ 7 using sparse matrix multiplication algorithms as available with the CSPARSE package [42]. The computational cost of iteratively compiling a C_ij(n = 7) matrix with 7018 diagonal elements on an Intel Core i5-4590 powered PC amounts to roughly one hour.
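
A truncated matrix exponential of this kind can be illustrated on a toy graph in pure Python. This is only a sketch under stated assumptions: it uses dictionaries of non-zero entries instead of the CSPARSE routines used in the study, and the helper names `matmul` and `truncated_exp` as well as the four-node example graph are hypothetical.

```python
from math import factorial


def matmul(X, Y):
    """Multiply sparse matrices stored as {(row, col): value} dictionaries."""
    rows = {}
    for (k, j), y in Y.items():
        rows.setdefault(k, []).append((j, y))
    Z = {}
    for (i, k), x in X.items():
        for j, y in rows.get(k, []):
            Z[i, j] = Z.get((i, j), 0.0) + x * y
    return Z


def truncated_exp(A, nodes, n):
    """Return C(n) = I + A + A^2/2! + ... + A^n/n! and the powers A^1..A^n."""
    C = {(i, i): 1.0 for i in nodes}
    P = dict(A)
    powers = []
    for k in range(1, n + 1):
        powers.append(P)
        for ij, v in P.items():
            C[ij] = C.get(ij, 0.0) + v / factorial(k)
        P = matmul(P, A)
    return C, powers


# Directed 3-cycle 0 -> 1 -> 2 -> 0 with one outgoing branch 2 -> 3.
nodes = [0, 1, 2, 3]
A = {(0, 1): 1.0, (1, 2): 1.0, (2, 0): 1.0, (2, 3): 1.0}
C, powers = truncated_exp(A, nodes, 3)
```

In this example the first and second powers have no diagonal entries, whereas the third power has a diagonal entry for every node on the cycle, so the diagonal of C(3) exceeds the identity contribution only for nodes 0, 1 and 2.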

Fig 2. Example of network communicability calculus using adjacency matrices.

Given initial adjacency matrices of directional weighted (W) and unweighted (A) pairwise interactions between network nodes, all weighted and unweighted walks of the length n > 1 are calculated from the n-th power matrices Wⁿ and Aⁿ, respectively. Note that the shortest possible loop in a directed graph without self-loops and reciprocal edges has the length n = 3. Consequently, diagonal entries of W³ and A³ contain non-zero values, while the n = 1, 2 power matrices are hollow.

https://doi.org/10.1371/journal.pone.0170953.g002

Unweighted adjacency matrices indicate the existence (A_ij = 1) of connections between each two nodes (i, j), but they do not consider the intensity of their interactions. In order to account for biologically relevant differences in strength of network interconnections, weighted adjacency matrices are required. Here, we make the strength of network edges dependent on differential gene expression. The basic idea consists of constructing a matrix of differential gene expression weighted interactions whose entries are non-negative numbers W_ij ≥ 0 that have the following interpretation (6) Since interactions between neighbor nodes are defined on network edges, we want to define a mapping function which maps differential gene expression of each two neighbor nodes onto their interlinking edge. Furthermore, this mapping should consider that communicability of a single network link is critically dependent on intensities of interlinked nodes. The above requirements are met by the following mapping function, which is further termed the minimum metric (shortly, min-metric): (7) In addition, we introduce an alternative average metric (shortly, avg-metric) (8) that allows accounting for a global trend of gene regulation in the entire pathway. In analogy to Eq (5), the matrix of differential gene expression weighted communicability G_ij is defined as

$$G_{ij}(n) = \left( I + W + \frac{W^2}{2!} + \dots + \frac{W^n}{n!} \right)_{ij} \qquad (9)$$

Finally, in order to avoid artificial overweighting of well-studied genes with a high number of interacting neighbors, we introduce the normalized total communicability (NTC) matrix D_ij(n), whose non-zero entries are defined by the ratio of differential gene expression weighted to unweighted communicability:

$$D_{ij}(n) = \frac{G_{ij}(n)}{C_{ij}(n)}, \quad C_{ij}(n) \neq 0 \qquad (10)$$

Consequently, entries of D_ij(n) matrices indicate relative changes in total information flow between each two network nodes (i, j) joined by pathways of the maximum length n, independently of the total number of these pathways.
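
The normalization step can be made concrete with a small dense-matrix sketch in Python. It is not the author's code: the helper names are hypothetical, the three-node graph is invented for illustration, and the edge weights W are simply given as inputs because the explicit min- and avg-metric mappings are not reproduced in this text.

```python
from math import factorial


def matmul(X, Y):
    """Dense product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def truncated_exp(M, n):
    """I + M + M^2/2! + ... + M^n/n! for a dense square matrix M."""
    size = len(M)
    total = [[float(i == j) for j in range(size)] for i in range(size)]
    power = [row[:] for row in M]
    for k in range(1, n + 1):
        for i in range(size):
            for j in range(size):
                total[i][j] += power[i][j] / factorial(k)
        power = matmul(power, M)
    return total


def ntc(W, A, n):
    """D_ij(n) = G_ij(n) / C_ij(n) wherever C_ij(n) is non-zero, else 0."""
    G, C = truncated_exp(W, n), truncated_exp(A, n)
    size = len(A)
    return [[G[i][j] / C[i][j] if C[i][j] else 0.0 for j in range(size)]
            for i in range(size)]


# Edges 0 -> 1, 1 -> 2 and 0 -> 2; two of them carry an illustrative weight 2.
A = [[0.0, 1.0, 1.0], [0.0, 0.0, 1.0], [0.0, 0.0, 0.0]]
W = [[0.0, 2.0, 1.0], [0.0, 0.0, 2.0], [0.0, 0.0, 0.0]]
D = ntc(W, A, 3)        # weighted relative to unweighted communicability
D_unit = ntc(A, A, 3)   # unit weights: every connected pair has NTC = 1
```

With unit weights every connected pair has an NTC of exactly 1 regardless of how many walks join it, which illustrates why the ratio does not grow merely because a node has many neighbors.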

Results

Starting from 74538 directed interactions between 7018 network nodes, NTC matrices of multistep pathways are computed iteratively as described above (Eqs (5)–(10)). Complete lists of weighted and unweighted pairwise interactions (i.e., 1st order adjacency matrices) for tumor/norm, norm/tumor and ‘random expression’ samples are in S3 Table. With increasing pathway lengths, communicability matrices become densely populated. As shown in Fig 3, the occupancy of communicability matrices (i.e., the ratio of non-zero matrix entries to the dimension of the fully occupied matrix 7018²) displays a particularly rapid increase from 0.15% to 53% at n = 4 and saturates around 70%. This means that the majority of network nodes are interconnected via n ≥ 4 distant pathways.
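
The occupancy measure can be sketched in Python as follows. The function name and the toy graph are hypothetical, and the sketch assumes positive edge weights, so that an entry of the communicability matrix is non-zero exactly when a walk of length at most n joins the two nodes; the identity term is left out, which matches the reported first-order value.

```python
def occupancy_by_length(edges, num_nodes, max_len):
    """Fraction of ordered node pairs joined by a walk of length <= n,
    returned as a list for n = 1..max_len."""
    succ = {}
    for i, j in edges:
        succ.setdefault(i, set()).add(j)
    frontier = set(edges)   # pairs joined by a walk of exactly length k
    reached = set()
    result = []
    for _ in range(max_len):
        reached |= frontier
        result.append(len(reached) / num_nodes ** 2)
        frontier = {(i, k) for i, j in frontier for k in succ.get(j, ())}
    return result


# Toy graph: directed 3-cycle 0 -> 1 -> 2 -> 0 with a branch 2 -> 3.
toy_edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
toy_occupancy = occupancy_by_length(toy_edges, 4, 4)

# First-order occupancy of the full network from the numbers quoted above.
first_order = 74538 / 7018 ** 2
```

For the full network, 74538 interactions among 7018² possible ordered pairs give a first-order occupancy of about 0.15%, in agreement with the starting value quoted above. In the toy graph the occupancy saturates below 100% because node 3 has no outgoing edges.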

Fig 3. Total number of non-zero entries and percentage of occupancy of communicability matrices as a function of the maximum pathway length n = 1–7.

https://doi.org/10.1371/journal.pone.0170953.g003

Differences between cancer and non-cancer interactomes are first studied using the more restrictive min-metric (Eq (7)). For this purpose, seven NTC matrices D_ij(n), n = 1–7, are computed for BRCA/COAD/GBM cancer interactomes. To analyze dependency of NTC on gene scoring (i.e., gene expression), complementary ‘non-cancer’ NTC matrices are assembled by resorting the gene lists in reverse or random orders, i.e., (11) where 1 ≤ α ≠ β ≤ N are two unequal random indices of differentially expressed genes in sorted BRCA/COAD/GBM gene lists.

To assess global differences between cancer and non-cancer interactomes, the average of all D_ij(n) entries (12) as well as the fraction of enhanced communicability r(n) as a function of the maximum pathway length (n = 1–7) are computed: (13) where (k, l) and (i, j) denote indices of elevated (i.e., D_kl(n) > 1) and all NTC entries, respectively. Fig 4 shows plots of the average NTC and r(n) for cancer and non-cancer networks. Remarkably, the average NTC in the range of n ≥ 4 distant pathways exhibits a persistent increase only in cancer interactomes. In contrast, the average NTC of all non-cancer networks declines with increasing pathway length. The difference between cancer and non-cancer NTC is also visible in the fraction of elevated NTC as a function of the maximum pathway length r(n), which grows more rapidly in cancer than in the reference non-cancer networks. Similar patterns of elevated long-range NTC in cancer interactomes are also observed when using the avg-metric (Eq (8)). To compare NTC of cancer and non-cancer networks simultaneously in both metrics, diagonal (θ) and cumulative off-diagonal (ξ) communicabilities in min- and avg-metric are computed as follows (14) Fig 5 shows diagonal and off-diagonal n ≤ 7 gene communicabilities of BRCA/COAD/GBM vs randomly weighted interactomes as two-dimensional distributions. Significance of the differences between cancer and non-cancer communicabilities in (θ^min, θ^avg) and (ξ^min, ξ^avg) representations is confirmed by the two-dimensional Kolmogorov-Smirnov test with significance level p < 0.001.
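
The two global statistics can be illustrated with a short Python sketch. The function name and example matrix are hypothetical, and two conventions are assumptions because Eqs (12) and (13) are not reproduced in this text: the average is taken over non-zero NTC entries only, and r(n) is computed as the number of elevated entries divided by the number of all non-zero entries.

```python
def ntc_statistics(D):
    """Return (mean of non-zero NTC entries, fraction of entries with D > 1).

    Assumptions: zero entries (unconnected pairs) are excluded, and the
    fraction is a ratio of entry counts.
    """
    entries = [d for row in D for d in row if d != 0.0]
    if not entries:
        return 0.0, 0.0
    mean = sum(entries) / len(entries)
    elevated = sum(1 for d in entries if d > 1.0)
    return mean, elevated / len(entries)


# Illustrative NTC matrix for one maximum pathway length n.
D_example = [[1.0, 2.0, 2.0],
             [0.0, 1.0, 0.5],
             [0.0, 0.0, 1.0]]
mean_ntc, r = ntc_statistics(D_example)
```

Evaluating such statistics for each n = 1–7 and plotting them against n would give curves of the kind compared between cancer and non-cancer networks in Fig 4.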

Fig 4. Statistics of total communicability matrices computed using the min-metric.

Left column: average NTC as a function of the maximum pathway length (n = 1–7). Right column: fraction of enhanced network communicability (Eq (13)) as a function of the maximum pathway length (n = 1–7). In contrast to normal and randomly weighted networks, BRCA/COAD/GBM cancer networks exhibit elevated long-range NTC.

https://doi.org/10.1371/journal.pone.0170953.g004

Fig 5. Diagonal and off-diagonal communicability of tumor (cyan) vs random (bordeaux) interactomes in min- and avg-metric.

Each point represents the log-sum of total diagonal (left column) or off-diagonal (right column) communicability of the i-th network node to itself or to remote neighbors, respectively, via all walks of the length n ≤ 7. Orange labels indicate a subset of 90 common genes that are associated with elevated communicability in BRCA/COAD/GBM interactomes. Green labels in the COAD plot indicate a fraction of 76 COAD specific EMT-related genes with particularly high communicability in avg-metric.

https://doi.org/10.1371/journal.pone.0170953.g005

To determine whether elevated communicability of different cancer interactomes arises from enhancement of common genes, the impact of simulated gene inhibition on the above statistical features of NTC matrices (Eqs (12)–(13)) is assessed. For this purpose, a cut set of common BRCA/COAD/GBM genes with high NTC in both min- and avg-metric is computed using an iterative procedure, as shown in Fig 6. Starting with the initial set of 530 common BRCA/COAD/GBM genes, the smallest subset of 90 genes is identified whose simulated inhibition is sufficient to decrease the difference between NTC of cancer and non-cancer interactomes, see Fig 5 (orange labels). Subsequent visualization and ontological analysis using STRING reveal association of these tightly interlinked genes with enhanced cell division, DNA replication, cellular stress response and other cancer related functional categories, see Fig 7 and S4 Table. In addition to common genes, there are cancer-type specific genes with high NTC that appear to group in separate clusters in min-avg diagrams. 76 COAD specific genes, indicated in Fig 5 with green labels, build a prominent cluster with particularly high NTC values in avg-metric. These 76 genes are enriched in Hedgehog, Hippo and Wnt pathways, which are known to promote Epithelial-to-Mesenchymal Transition (EMT) and metastatic cell transformation [43], see Fig 8 and S5 Table.
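
One way such an iterative reduction could be organized is greedy backward elimination, sketched below in Python. This is an illustration under stated assumptions and not the procedure used in the study, whose details are given only in Fig 6: the function `reduce_cut_set`, the callback `gap_after_inhibition`, the tolerance and the toy contributions are all hypothetical, and the callback stands in for a full recomputation of the NTC statistics after simulated inhibition.

```python
def reduce_cut_set(initial_genes, gap_after_inhibition, tolerance):
    """Greedy backward elimination of genes from a cut set.

    A gene is dropped if inhibiting the remaining genes still keeps the
    cancer vs non-cancer NTC gap at or below the tolerance.
    """
    kept = list(initial_genes)
    for gene in list(initial_genes):
        trial = [g for g in kept if g != gene]
        if gap_after_inhibition(trial) <= tolerance:
            kept = trial
    return kept


# Toy stand-in for the NTC computation: each gene contributes a fixed
# amount to the gap, and inhibiting it removes that contribution.
contribution = {"A": 0.5, "B": 0.05, "C": 0.3, "D": 0.02}


def toy_gap(inhibited):
    return sum(v for g, v in contribution.items() if g not in inhibited)


cut_set = reduce_cut_set(contribution, toy_gap, tolerance=0.1)
```

A greedy pass of this kind yields a set from which no single gene can be removed without exceeding the tolerance; it does not guarantee the globally smallest set, and its result can depend on the order in which genes are visited.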

Fig 6. Identification of the cut set of genes associated with elevated NTC in BRCA/COAD/GBM cancer samples.

First, an initial cut set of 530 common BRCA/COAD/GBM genes with elevated NTC is estimated. By iterative reduction of the initial gene set, 90 genes are identified whose simulated inhibition is sufficient to level down the difference between NTC of cancer and non-cancer interactomes.

https://doi.org/10.1371/journal.pone.0170953.g006

Fig 7. Subnetwork of 90 common BRCA/COAD/GBM genes with high NTC, cf. Fig 5 (orange labels).

https://doi.org/10.1371/journal.pone.0170953.g007

Fig 8. Subnetwork of 76 COAD specific genes with high NTC in the avg-metric, cf. Fig 5 (green labels).

https://doi.org/10.1371/journal.pone.0170953.g008

Since our above simulations indicate a high level of robustness of cancer networks with respect to inhibition of a relatively small number of genes, we are interested in assessing the inhibitory effects of other prominent gene signatures. For this purpose, we examine a list of 33 genes with a high differential entropy in bladder cancer highlighted in [31]. Remarkably, 12 of 33 bladder cancer genes from [31] are also present in our list of 90 common BRCA/COAD/GBM genes. According to the hypergeometric test HGT(7018, 90, 33, 12) = 2.6e-15, this overlap is statistically significant, see S6 Table. However, simulated inhibition of these 33 genes turned out to be insufficient for the suppression of elevated NTC. Fig 9 shows the results of simulated inhibition of the 33 bladder cancer genes from [31] and the 90 common BRCA/COAD/GBM genes identified in this work.
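
The overlap significance can be recomputed with exact integer arithmetic from the Python standard library. The sketch below is a plausible reading rather than the author's script: the function name is hypothetical, and interpreting HGT(7018, 90, 33, 12) as the upper-tail probability of observing at least 12 shared genes is an assumption, since the text does not state whether a tail or a point probability was reported.

```python
from math import comb


def hypergeom_upper_tail(population, marked, draws, overlap):
    """P(X >= overlap) for X ~ Hypergeometric(population, marked, draws)."""
    numerator = sum(comb(marked, k) * comb(population - marked, draws - k)
                    for k in range(overlap, min(marked, draws) + 1))
    return numerator / comb(population, draws)


# 7018 network genes, 90 common high-NTC genes, 33 bladder cancer genes,
# 12 of which are shared between the two lists.
p_value = hypergeom_upper_tail(7018, 90, 33, 12)
```

For comparison, the expected overlap under random sampling is 33 × 90 / 7018 ≈ 0.42 genes, far below the 12 shared genes observed.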

Fig 9. Effects of simulated inhibition of 90 common BRCA/COAD/GBM genes with high NTC vs 33 bladder cancer genes with high differential entropy from [31], see full gene lists in S4 Table.

Inhibition of 33 high-score targets from [31] moderately decreases global communicability features of cancer networks. However, it is obviously not sufficient to suppress their elevation with increasing pathway length. In contrast, simulated inhibition of our 90 targets results in decreasing the difference between NTC features of cancer and non-cancer interactomes.

https://doi.org/10.1371/journal.pone.0170953.g009

Discussion

Network-based approaches to mining omics data are increasingly popular. However, consistent modeling of biological networks remains a challenging task and requires the consideration of numerous factors whose impact on simulation results is still controversially debated in the literature. These factors include the role of a particular network topology, directionality and size as well as the choice of appropriate gene proximity metrics and numerical scores. In this work, we focused on the construction and evaluation of novel descriptors for measurement of network information flow. We left other issues largely unaddressed, assuming that simulation results obtained with different network topologies should, in general, be convergent.

To account for the potential bias of degree-based network features, we introduced a degree-independent measure of information flow—the normalized total communicability (NTC). NTC relies on a well-known concept of network topology characterization by means of the n-power adjacency matrices, whose entries indicate the total number of unweighted walks of the maximum length n between each two network nodes. In our approach, adjacency matrices of unweighted network topology are used for normalization of differential gene expression weighted walks. Consequently, NTC does not explicitly depend on node degrees, but rather serves as an integrative measure of up- or downregulation of all pathways of the maximum length n joining each two network nodes.

Similar to other works, we use public databases on pairwise protein-protein and gene regulatory interactions to compile the basic network topology. However, here we rely on a subset of directed interactions that naturally restrict the emergence of loops to n ≥ 3 network steps. Our simulation results show that elevated long-range communicability of different cancer networks is largely caused by circulation of information flow within compact subnetworks of tightly interlinked genes. In networks with non-directed interactions, this feature might be missing.

Our simulations indicate a high level of robustness of cancer networks with respect to inhibition of a low number of genes-proteins. Simulated inhibition of a few dozen genes, including a hit list of 33 bladder cancer genes from [31], was not sufficient to suppress elevation of long-range NTC. Despite the fact that elevated NTC arises in different cancer interactomes from heterogeneous gene expression profiles, we identified a subset of 90 common BRCA/COAD/GBM genes whose simulated inhibition is capable of reducing differences between NTC of cancer and non-cancer interactomes. These genes turn out to be associated with cancer-related ontological categories, including enhanced cell division, DNA replication, elevated energy demand and cellular stress response. We conclude that enhanced NTC reflects correlated activity of genes whose coordination is required for maintenance of sustained proliferation and replication of all tumor cells. In other words, elevated long-range NTC represents a graph-theoretical hallmark of cancer networks. Under the assumption of gradual elevation of NTC in the course of cancer development, an abnormal increase of NTC could serve as an early marker of malignant cell transformation. Further investigations of normally proliferating and cancer cells at different stages of disease development are required to test this assumption and to define reliable diagnostic measures.

While focusing on construction of a feasible graph-theoretical formalism for gene expression weighted network modeling, this work does not go into the discussion of biological mechanisms of cancer network rewiring and regulation. Different biological processes at the single gene, chromosome and whole genome level, including gene mutations, changes in gene copy number and chromothripsis, are known to accompany malignant cell transformation [44]. Consideration of this layer of information will be an important subject for future research.

Finally, our graph-theoretical framework is of potential interest for a broad spectrum of network perturbation problems such as single or multiple gene-protein activation, inhibition or malfunction due to the impact of mutations or interactions with pharmaceutical drugs.

Supporting Information

S1 Table. Lists of TCGA BRCA, COAD, GBM tumor/norm samples used in this study.

https://doi.org/10.1371/journal.pone.0170953.s001

(XLSX)

S2 Table. Sorted lists of average differential gene expression (ADGE) values of TCGA BRCA/COAD/GBM tumor/norm (TN), norm/tumor (NT) (i.e., reversed TN) and ‘random expression’ data.

https://doi.org/10.1371/journal.pone.0170953.s002

(XLSX)

S3 Table. Lists of 74538 unweighted and gene expression weighted pairwise interactions (i.e., 1st order adjacency matrices) computed on the basis of TCGA BRCA/COAD/GBM TN, NT and ‘random expression’ data.

https://doi.org/10.1371/journal.pone.0170953.s003

(XLSX)

S4 Table. 90 common BRCA/COAD/GBM genes associated with elevated NTC and their GO enrichment terms.

https://doi.org/10.1371/journal.pone.0170953.s004

(XLSX)

S5 Table. 76 COAD-specific genes associated with elevated NTC in the avg-metric and their GO enrichment terms.

https://doi.org/10.1371/journal.pone.0170953.s005

(XLSX)

S6 Table. Overlap between 90 common BRCA/COAD/GBM genes with high NTC (see S4 Table) and 33 genes with high differential entropy in bladder cancer from [31] (Suppl. Table 2).

https://doi.org/10.1371/journal.pone.0170953.s006

(XLSX)

Acknowledgments

The author is grateful to Amanda Chase for critically reading the manuscript.

Author Contributions

Conceptualization: E.G.
Data curation: E.G.
Formal analysis: E.G.
Investigation: E.G.
Methodology: E.G.
Project administration: E.G.
Software: E.G.
Validation: E.G.
Visualization: E.G.
Writing – original draft: E.G.
Writing – review & editing: E.G.

References

1. Marusyk A, Polyak K. Tumor heterogeneity: Causes and consequences. Biochim Bioph Acta—Reviews on Cancer. 2010;1805(1):105–117. pmid:19931353
2. Hanahan D, Weinberg R. Hallmarks of cancer: the next generation. Cell. 2011;144(5):646–674. pmid:21376230
3. Jeong H, Tombor B, Albert R, Oltvai ZN, Barabási AL. The large-scale organization of metabolic networks. Nature. 2000;407(6804):651–654. pmid:11034217
4. Barabási AL, Oltvai ZN. Network biology: understanding the cell’s functional organization. Nature reviews Genetics. 2004;5(2):101–113. pmid:14735121
5. Christensen C, Albert R. Using graph concepts to understand the organization of complex systems. Int J Bifurcation and Chaos. 2006;17:2201.
6. Hanisch D, Zien A, Zimmer R, Lengauer T. Co-clustering of biological networks and gene expression data. Bioinformatics. 2002;18 Suppl 1:S145–54. pmid:12169542
7. Zhang B, Horvath S. A General Framework for Weighted Gene Co-Expression Network Analysis. Stat Appl Genet Mol Biol. 2005;4(1):17. pmid:16646834
8. Horvath S, Dong J. Geometric interpretation of gene coexpression network analysis. PLoS Computational Biology. 2008;4(8):24–26. pmid:18704157
9. Bader GG, Hogue CC. An automated method for finding molecular complexes in large protein interaction networks. BMC Bioinformatics. 2003;4:2. pmid:12525261
10. Tornow S, Mewes HW. Functional modules by relating protein interaction networks and gene expression. Nucleic Acids Research. 2003;31(21):6283–6289. pmid:14576317
11. Tanay A, Sharan R, Kupiec M, Shamir R. Revealing modularity and organization in the yeast molecular network by integrated analysis of highly heterogeneous genomewide data. Proceedings of the National Academy of Sciences of the United States of America. 2004;101(9):2981–2986. pmid:14973197
12. Chua HN, Sung WK, Wong L. Exploiting indirect neighbours and topological weight to predict protein function from protein-protein interactions. Bioinformatics (Oxford, England). 2006;22(13):1623–30. pmid:16632496
13. Ulitsky I, Shamir R. Identification of functional modules using network topology and high-throughput data. BMC Systems Biology. 2007;1:8. pmid:17408515
14. Ideker T, Sharan R. Protein networks in disease. Genome Res. 2008;18:644–652. pmid:18381899
15. Liu G, Wong L, Chua HN. Complex discovery from weighted PPI networks. Bioinformatics. 2009;25(15):1891–1897. pmid:19435747
16. Li X, Wu M, Kwoh CK, Ng SK. Computational approaches for detecting protein complexes from protein interaction networks: a survey. BMC Genomics. 2010;11 Suppl 1:S3. pmid:20158874
17. Liu H, Su J, Li J, Liu H, Lv J, Li B, et al. Prioritizing cancer-related genes with aberrant methylation based on a weighted protein-protein interaction network. BMC Systems Biology. 2011;5(1):158. pmid:21985575
18. Feng J, Jiang R, Jiang T. A max-flow-based approach to the identification of protein complexes using protein interaction and microarray data. IEEE/ACM Transactions on Computational Biology and Bioinformatics. 2011;8(3):621–634. pmid:20733237
19. Milenković T, Memišević V, Bonato A, Pržulj N. Dominating biological networks. PLoS ONE. 2011;6(8):28–32. pmid:21887225
20. Cho DY, Kim YA, Przytycka TM. Chapter 5: Network Biology Approach to Complex Diseases. PLoS Computational Biology. 2012;8(12):e1002820. pmid:23300411
21. Guney E, Oliva B. Exploiting Protein-Protein Interaction Networks for Genome-Wide Disease-Gene Prioritization. PLoS ONE. 2012;7(9):e43557. pmid:23028459
22. Yu D, Kim M, Xiao G, Hwang TH. Review of biological network data and its applications. Genomics & Informatics. 2013;11(4):200–210.
23. Pandey G, Arora S, Manocha S, Whalen S. Enhancing the functional content of eukaryotic protein interaction networks. PLoS ONE. 2014;9(10):e109130. pmid:25275489
24. Tuck DP, Kluger HM, Kluger Y. Characterizing disease states from topological properties of transcriptional regulatory networks. BMC Bioinformatics. 2006;7:236. pmid:16670008
25. Platzer A, Perco P, Lukas A, Mayer B. Characterization of protein-interaction networks in tumors. BMC Bioinformatics. 2007;8:224. pmid:17597514
26. Taylor IW, Linding R, Warde-Farley D, Liu Y, Pesquita C, Faria D, et al. Dynamic modularity in protein interaction networks predicts breast cancer outcome. Nature Biotechnology. 2009;27(2):199–204. pmid:19182785
27. Komurov K, Ram PT. Patterns of human gene expression variance show strong associations with signaling network hierarchy. BMC Systems Biology. 2010;4(1):154. pmid:21073694
28. Weiss JN, Karma A, MacLellan WR, Deng M, Rau CD, Rees CM, et al. “Good enough solutions” and the genetics of complex diseases. Circulation Research. 2012;111(4):493–504. pmid:22859671
29. Schramm G, Kannabiran N, König R. Regulation patterns in signaling networks of cancer. BMC Systems Biology. 2010;4(1):162. pmid:21110851
30. Albert R, DasGupta B, Mobasheri N. Topological implications of negative curvature for biological and social networks. Phys Rev E Stat Nonlin Soft Matter Phys. 2014;89:032811. pmid:24730903
31. West J, Bianconi G, Severini S, Teschendorff AE. Differential network entropy reveals cancer system hallmarks. Scientific Reports. 2012;2:802. pmid:23150773
32. Banerji CRS, Miranda-Saavedra D, Severini S, Widschwendter M, Enver T, Zhou JX, et al. Cellular network entropy as the energy potential in Waddington’s differentiation landscape. Scientific Reports. 2013;3:25–27. pmid:24154593
33. Jonsson PF, Bates PA. Global topological features of cancer proteins in the human interactome. Bioinformatics (Oxford, England). 2006;22(18):2291–2297. pmid:16844706
34. Milenkovic T, Memisevic V, Ganesan AK, Przulj N. Systems-Level Cancer Gene Identification from Protein Interaction Network Topology Applied to Melanogenesis-Related Functional Genomics Data. Journal of The Royal Society Interface. 2010;7(44):423–437. pmid:19625303
35. Islam MF, Hoque MM, Banik RS, Roy S, Sumi SS, Hassan FMN, et al. Comparative analysis of differential network modularity in tissue specific normal and cancer protein interaction networks. Journal of Clinical Bioinformatics. 2013;3(1):19. pmid:24093757
36. Rai A, Menon AV, Jalan S. Randomness and preserved patterns in cancer network. Scientific Reports. 2014;4:6368. pmid:25220184
37. Guney E, Oliva B. Analysis of the robustness of network-based disease-gene prioritization methods reveals redundancy in the human interactome and functional diversity of disease-genes. PLoS ONE. 2014;9(4):e94686. pmid:24733074
38. Hakes L, Pinney JW, Robertson DL, Lovell SC. Protein-protein interaction networks and biology–what’s the connection? Nature Biotechnology. 2008;26(1):69–72. pmid:18183023
39. Schaefer MH, Serrano L, Andrade-Navarro MA. Correcting for the study bias associated with protein-protein interaction measurements reveals differences between protein degree distributions from different cancer types. Frontiers in Genetics. 2015;6(August):260. pmid:26300911
40. Estrada E, Rodríguez-Velázquez JA. Subgraph centrality in complex networks. Physical Review E. 2005;71(5):056103.
41. Benzi M, Klymko C. Total communicability as a centrality measure. Journal of Complex Networks. 2013;1(2):124–149.
42. Davis TA. Direct Methods for Sparse Linear Systems (Fundamentals of Algorithms 2). Philadelphia, PA, USA: SIAM; 2006.
43. Lamouille S, Xu J, Derynck R. Molecular mechanisms of epithelial-mesenchymal transition. Nat Rev Mol Cell Biol. 2014;15(3):178–196. pmid:24556840
44. Heng H, Liu G, Stevens J, Bremer S, Ye K, Abdallah B, et al. Decoding the genome beyond sequencing: the new phase of genomic research. Genomics. 2011;98(4):242–252. pmid:21640814

Supporting Information

S1 Table. Lists of TCGA BRCA, COAD, GBM tumor/norm samples used in this study.

S2 Table. Sorted lists of average differential gene expression (ADGE) values of TCGA BRCA/COAD/GBM tumor/norm (TN), norm/tumor (NT) (i.e., reversed TN) and ‘random expression’ data.

S3 Table. Lists of 74538 unweighted and gene expression weighted pairwise interactions (i.e., 1st order adjacency matrices) computed on the basis of TCGA BRCA/COAD/GBM TN, NT and ‘random expression’ data.

S4 Table. 90 common BRCA/COAD/GBM genes associated with elevated NTC and their GO enrichment terms.

S5 Table. 76 COAD-specific genes associated with elevated NTC in the avg-metric and their GO enrichment terms.

S6 Table. Overlap between the 90 common BRCA/COAD/GBM genes with high NTC (see S4 Table) and the 33 genes with high differential entropy in bladder cancer from [31] (Suppl. Table 2).
