Figure 1.
Minimizing the map equation over all network partitions gives an optimal clustering of the network with respect to the dynamics on the network.
Panel A shows the optimal two-level clustering and panel B the hierarchical clustering. The description length, which is 4.75 bits for the unpartitioned network, is the sum of the average codeword lengths of the index codebook(s) and the module codebooks, each weighted by the rate at which that codebook is used. For this undirected unweighted network with total degree 78, all rates can be calculated by counting links and normalizing: The codewords of the index codebook in A are used at relative rates at a total rate
and, for example, the codewords of the first module codebook are used at relative rates
at a total rate
with contribution from the exit probability
. The codewords of the smaller index codebooks in B are used at relative rates
and
at total rates
and
. The fine-level modules of this hierarchical clustering coincide with the modules of the two-level clustering.
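The link-counting recipe described in this caption can be sketched in a few lines of Python. This is a minimal illustration of the two-level description length for an undirected unweighted network, not the authors' implementation. The function names and the toy network (two triangles joined by one link, total degree 14) are assumptions made for the example, and the toy network is not the network drawn in Figure 1.

```python
from collections import defaultdict
from math import log2


def entropy_bits(rates, total):
    """Entropy in bits of `rates` normalized by `total`; zero rates add nothing."""
    return -sum((r / total) * log2(r / total) for r in rates if r > 0)


def two_level_description_length(edges, module_of):
    """Two-level description length (bits per step) for an undirected,
    unweighted network, with all rates obtained by counting link ends and
    normalizing by the total degree.

    edges     -- list of (node, node) pairs (hypothetical input format)
    module_of -- dict mapping every node to its module label
    """
    total_degree = 2 * len(edges)
    visit = defaultdict(float)      # node visit rate = degree / total degree
    exit_rate = defaultdict(float)  # module exit rate = leaving link ends / total degree
    for a, b in edges:
        visit[a] += 1 / total_degree
        visit[b] += 1 / total_degree
        if module_of[a] != module_of[b]:
            exit_rate[module_of[a]] += 1 / total_degree
            exit_rate[module_of[b]] += 1 / total_degree

    # Index codebook: used at the total exit rate, one codeword per module.
    q_total = sum(exit_rate.values())
    index_term = q_total * entropy_bits(exit_rate.values(), q_total) if q_total > 0 else 0.0

    # Module codebooks: one exit codeword plus one codeword per member node.
    members = defaultdict(list)
    for node, module in module_of.items():
        members[module].append(node)
    module_term = 0.0
    for module, nodes in members.items():
        rates = [exit_rate[module]] + [visit[n] for n in nodes]
        use_rate = sum(rates)
        module_term += use_rate * entropy_bits(rates, use_rate)

    return index_term + module_term


# Toy network (an assumption for illustration): two triangles joined by one link.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
one_module = {n: "all" for n in range(6)}
two_modules = {0: "left", 1: "left", 2: "left", 3: "right", 4: "right", 5: "right"}

L_one = two_level_description_length(edges, one_module)
L_two = two_level_description_length(edges, two_modules)
print(f"unpartitioned: {L_one:.3f} bits, two modules: {L_two:.3f} bits")
```

With a single module the exit rates vanish and the result reduces to the entropy of the node visit rates, which corresponds to the unpartitioned description length quoted in the caption. Splitting the toy network into its two triangles gives a shorter description, as expected for a network with clear modular structure.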
Figure 2.
Multilevel organization in three real-world networks.
The bottom row illustrates structures that a two-level clustering can capture. The width of each horizontal line represents the size of the module, and the number to the left of each brace gives the number of submodules within that module. For visual simplicity, we exclude submodules with less than 1 per mille (0.1%) of all flow. See Fig. 3 for a hierarchical map of science based on the journal citation network.
Figure 3.
A hierarchical map of science.
We partitioned 7,940 journals connected by 9.2 million citations [20] into four major disciplines, which we identified as life sciences, physical sciences, ecology and earth sciences, and social sciences. In physical sciences, we followed a second-level split into the areas of mathematics and of physics and chemistry. The size of the modules represents the fraction of time that a random surfer spends following citations in that field, and the arrows indicate flow volume between the fields. For visual simplicity, we exclude fields and arrows with low flow.
Table 1.
The hierarchical organization of real-world networks.
Figure 4.
The range of mixing parameters that give a well-defined three-level hierarchical structure for the benchmark networks in the paper.
The networks have nodes, coarse-level module sizes between
and
nodes, and fine-level module sizes between
and
nodes. The connected points illustrate the sets of mixing parameters we present in the paper.
Figure 5.
Panels A–D show how well the algorithm reveals the three-level organization of the hierarchical benchmark networks with 10,000 nodes and 100,000 links. The nodes share a fraction of their links with nodes in other coarse-level modules and a fraction
of their links with nodes in other fine-level modules. Each data point is the average of 100 measurements.
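The two mixing fractions mentioned in this caption can be measured on a labeled network with a short sketch such as the following. It assumes that "other fine-level modules" means fine-level modules inside the same coarse-level module, and it reports network-wide averages over links rather than per-node values. The function name, the label dictionaries, and the eight-node example are all hypothetical.

```python
def mixing_fractions(edges, coarse_of, fine_of):
    """Return (fraction of links between different coarse-level modules,
    fraction of links between different fine-level modules of the same
    coarse-level module) for an undirected, unweighted network."""
    between_coarse = 0
    between_fine = 0
    for a, b in edges:
        if coarse_of[a] != coarse_of[b]:
            between_coarse += 1
        elif fine_of[a] != fine_of[b]:
            between_fine += 1
    return between_coarse / len(edges), between_fine / len(edges)


# Hypothetical example: two coarse-level modules, each with two fine-level modules.
coarse_of = {n: "A" if n < 4 else "B" for n in range(8)}
fine_of = {0: "A1", 1: "A1", 2: "A2", 3: "A2", 4: "B1", 5: "B1", 6: "B2", 7: "B2"}
edges = [(0, 1), (2, 3), (4, 5), (6, 7),  # inside fine-level modules
         (1, 2), (5, 6),                  # between fine-level modules, same coarse module
         (3, 4)]                          # between coarse-level modules

coarse_fraction, fine_fraction = mixing_fractions(edges, coarse_of, fine_of)
print(coarse_fraction, fine_fraction)
```

Counting each link once gives the same fractions as counting link ends per node and averaging with degree weights, which is why the sketch iterates over links only.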