Reassessing the modularity of gene co-expression networks using the Stochastic Block Model

doi:10.1371/journal.pcbi.1012300

Fig 1.

Schematic representation of three network architectures.

Each panel shows the adjacency matrix (top) and the corresponding network diagram (bottom). A. Modular architecture: The network is composed of five distinct modules, each containing ten interconnected traits. Modules are connected by a few inter-module links. B. Core-periphery architecture: The network consists of a single densely connected core module with ten traits and a peripheral group with 40 traits. The peripheral group is connected to the core module, but has few internal connections. C. Disassortative architecture: The network comprises five groups, each with ten traits. Traits within each group are not interconnected but are instead connected to traits in other groups, forming a pattern of between-group connections.
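
The three architectures in this schematic can be illustrated with idealized adjacency matrices. The sketch below is a minimal, noise-free version: the helper names `block_adjacency` and `edge_count` are hypothetical, every allowed pair of groups is fully connected, and the sparse inter-module links of panel A and the few peripheral links of panel B are left out for simplicity.

```python
def block_adjacency(sizes, connect):
    """Build a symmetric 0/1 adjacency matrix from group sizes and a group-level rule.

    connect(g, h) returns True when every pair of nodes drawn from groups
    g and h should be linked (an idealized pattern, not a fitted model).
    """
    labels = [g for g, size in enumerate(sizes) for _ in range(size)]
    n = len(labels)
    return [
        [1 if i != j and connect(labels[i], labels[j]) else 0 for j in range(n)]
        for i in range(n)
    ]


def edge_count(adj):
    """Count undirected edges (each edge appears twice in the matrix)."""
    return sum(map(sum, adj)) // 2


# A. Modular: five groups of ten, links only within groups.
modular = block_adjacency([10] * 5, lambda g, h: g == h)

# B. Core-periphery: a core of ten (group 0) and a periphery of forty;
#    a link exists whenever at least one endpoint is in the core.
core_periphery = block_adjacency([10, 40], lambda g, h: g == 0 or h == 0)

# C. Disassortative: five groups of ten, links only between groups.
disassortative = block_adjacency([10] * 5, lambda g, h: g != h)

for name, adj in [("modular", modular),
                  ("core-periphery", core_periphery),
                  ("disassortative", disassortative)]:
    print(name, edge_count(adj))
```

Only the modular pattern concentrates edges on the diagonal blocks of the matrix; the other two place most or all edges off the diagonal, which is why methods that search only for assortative communities can miss them.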

Fig 2.

Schematic representation of the clustering in the SBM.

Genes are clustered into level-1 blocks, level-1 blocks are clustered into level-2 blocks, and so on. A. Circular representation of the clustering we use in the following figures. Block names are constructed by following the hierarchy, starting at level 1. So in this example, the level-1 block 8 can also be referred to as 8-4-1. B. A tree-like representation that highlights the hierarchy in the nested SBM. Each level-2 block is composed of all the genes in its child level-1 blocks, each level-3 block is composed of all the genes in its child level-2 blocks, and so on.
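
The naming convention described in panel A can be sketched as a walk up the hierarchy. In the example below, the function `block_name` and the toy parent maps are hypothetical; only the path 8-4-1 is taken from the caption.

```python
def block_name(block, parents):
    """Return the hierarchical name of a level-1 block.

    parents is a list of dicts: parents[0] maps each level-1 block to its
    level-2 parent, parents[1] maps each level-2 block to its level-3
    parent, and so on.
    """
    path = [block]
    for level_map in parents:
        block = level_map[block]
        path.append(block)
    return "-".join(str(b) for b in path)


# Toy hierarchy (assumed for illustration).
parents = [
    {7: 4, 8: 4, 9: 5},  # level 1 -> level 2
    {4: 1, 5: 2},        # level 2 -> level 3
]

print(block_name(8, parents))  # 8-4-1
```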

Fig 3.

Simulations comparing the Stochastic Block Model (SBM) and Weighted Gene Co-expression Network Analysis (WGCNA) in detecting assortative and non-assortative network structure.

A. Known modular correlation matrix with 50 traits grouped into 5 assortative modules, further organized into 2 higher-level groupings and a fifth module equally correlated with the others. B. FDR-trimmed weighted network generated from observations sampled from the correlation matrix in (A). C. SBM fit on the network in (B), correctly identifying the five modules and their higher-level organization. Edges are colored according to the block assignment of the vertices. D. WGCNA fit on the network in (B), detecting only 4 assortative modules and none of the higher-level groupings. E. Modified correlation matrix with a non-assortative module (first 5 traits, highlighted in red) introduced. F. FDR-trimmed weighted network generated from observations sampled from the correlation matrix in (E). G. SBM fit on the network in (F), correctly identifying the non-assortative module (in blue with red arc) and the assortative modules. H. WGCNA fit on the network in (F), failing to recognize the non-assortative module and grouping its traits (circled in red) with an assortative module (teal module). These simulations show the ability of SBM to capture both assortative and non-assortative network structures, as well as hierarchical organization, compared to WGCNA, which is primarily designed for detecting assortative communities.
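
The FDR trimming used in panels B and F can be illustrated with a short sketch. It assumes the Benjamini-Hochberg step-up procedure, which the caption does not name, and takes one p-value per candidate edge as given; the function names `bh_reject` and `trim_edges`, the threshold, and the toy edges are all hypothetical.

```python
def bh_reject(pvalues, alpha=0.05):
    """Benjamini-Hochberg step-up: flag which hypotheses are rejected at FDR alpha."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    # Find the largest rank k with p_(k) <= alpha * k / m.
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= alpha * rank / m:
            cutoff = rank
    keep = set(order[:cutoff])
    return [i in keep for i in range(m)]


def trim_edges(edges, pvalues, alpha=0.05):
    """Keep only edges whose correlation is significant after FDR control."""
    return [e for e, ok in zip(edges, bh_reject(pvalues, alpha)) if ok]


# Toy example: (trait_i, trait_j, correlation) with an assumed p-value each.
edges = [(0, 1, 0.8), (0, 2, 0.4), (1, 2, 0.35), (2, 3, 0.05)]
pvalues = [0.001, 0.02, 0.03, 0.5]
print(trim_edges(edges, pvalues))
```

The surviving edges keep their correlation as a weight, so the trimmed graph is sparse but still weighted.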

Fig 4.

Matrix and graph representations of the SBM clustering.

A and B: SBM level-1 blocks are colored by the number of edges within and between blocks. Gray squares represent pairs of unconnected blocks. The upper levels of the nested hierarchy are shown by the red lines. C and D: A full representation of the fitted block model. Genes are shown at the perimeter, colored by their level-2 blocks. The internal graph shows the hierarchical structure of the fitted SBM. Numbers in blue circles correspond to the level-2 block. Arrows between level-1 blocks and genes are omitted, unlike Fig 1. A subsample of 30,000 edges is shown connecting the genes, and edges are colored according to their transformed weights, with more positive weights plotted on top and in a more yellow color. External labels refer to a non-exhaustive subset of level-2 blocks with clear biological functions inferred from interpreting GO enrichment. Level-2 block 8 in the body, with the blue circle highlighted in red, is the only level-2 block with no GO enrichment.

Fig 5.

Comparison of the clustering in WGCNA and levels 3 and 4 of the SBM hierarchy for gene expression in the body (left) and the head (right).

Each point corresponds to a gene. The x-axis corresponds to the level-3 SBM blocks, and the y-axis to the WGCNA modules. Colors correspond to the (coarser) level 4 of the SBM.

Fig 6.

Assortativity measured in the SBM level-1 blocks and Newman Modularity (average assortativity) at each level of the SBM hierarchy (inset).

GO-enriched blocks are shown in yellow and appear throughout the distribution of assortativity. Modularity is much higher in the head, where it peaks at level 3 and drops at the upper levels. The body has many more non-assortative blocks and lower modularity at all levels. Modularity peaks at level 4 in the body and drops strongly at level 5. Interestingly, the 4 most assortative blocks in the body do not show significant GO enrichment.
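
The relationship between per-block assortativity and Newman modularity can be illustrated with a small calculation. The sketch below uses the standard unweighted form, in which block r contributes e_rr/(2E) - (e_r/(2E))^2, where e_rr is twice the number of edges inside the block, e_r is the total degree of the block, and E is the number of edges. The function name `modularity_terms` is hypothetical, and the unweighted simplification and the exact normalization are assumptions that may differ from the one used for the figure. Multiplying each term by the number of blocks makes modularity the average of the per-block values.

```python
def modularity_terms(edges, labels):
    """Return the per-block terms of Newman modularity for an unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    labels: dict mapping each node to its block.
    """
    two_e = 2 * len(edges)
    within = {block: 0 for block in labels.values()}  # e_rr
    degree = {block: 0 for block in labels.values()}  # e_r
    for u, v in edges:
        degree[labels[u]] += 1
        degree[labels[v]] += 1
        if labels[u] == labels[v]:
            within[labels[u]] += 2
    return {r: within[r] / two_e - (degree[r] / two_e) ** 2 for r in within}


# Assortative toy graph: two triangles joined by a single edge.
tri_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
tri_labels = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
tri_terms = modularity_terms(tri_edges, tri_labels)

# Disassortative toy graph: all edges run between the two blocks.
bip_edges = [(0, 2), (0, 3), (1, 2), (1, 3)]
bip_labels = {0: "a", 1: "a", 2: "b", 3: "b"}
bip_terms = modularity_terms(bip_edges, bip_labels)

print(sum(tri_terms.values()))  # positive: assortative blocks
print(sum(bip_terms.values()))  # negative: disassortative blocks
```

Blocks with negative terms, as in the second toy graph, pull the total down, which is consistent with lower modularity where non-assortative blocks are more numerous.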

Table 1.

Fraction of blocks at each level of the SBM hierarchy that show significant GO enrichment at the 5% FDR level with a minimum of 4 genes in the enriched set.

Fig 7.

Comparison of assortativity values between level-1 blocks enriched for cytoplasmic translation and all other blocks.

Blocks enriched for cytoplasmic translation tend to be less assortative.
