Modulated Modularity Clustering as an Exploratory Tool for Functional Genomic Inference

Figure 3

Modulated modularity clustering.

Panels (A)–(F) illustrate the method of modulated modularity clustering by example. (A) Example data consisting of nine observations on twelve variables drawn from a standard multivariate Normal distribution with variance-covariance matrix given by the correlation matrix from Figure 2. The variables have been permuted so that the block structure of the data is obscured. (B) Heat map showing the pairwise correlations between variables (numbered 1 through 12 after permutation). The color scheme was introduced in Figure 2. (C) Surface plot of the maximum modularity attainable for a fixed number of clusters as the tuning parameter varies. The 4,213,597 possible partitions of the twelve variables are grouped by number of parts (and hence clusters), and the maximum modularity found within each group is shown on the plot for each value of the tuning parameter. The surface appears convex and attains its maximum (located on a grid of step size 0.001) for a clustering of size 4. For larger examples, it is not possible to enumerate all partitions, and an approximate method is used to maximize marginally in the tuning parameter. (D) The optimal value of the tuning parameter defines the graph whose community structure is to be evaluated. The entries of the graph's affinity matrix have, for illustration, been shifted and linearly scaled so that the previously introduced heat map applies. To identify the partition of maximum modularity, we use a greedy forward search in which the initial graph is recursively bisected into subgraphs until the overall modularity can no longer increase. In the figure, each level (LEVEL 1, LEVEL 2) indicates a round of bisection, and each subgraph is represented by its corresponding section of the affinity matrix. There is no third level; after the second round, the overall modularity cannot be increased through further bisection. (E) The resulting clustering is used to reorder the affinity matrix by permutation of its rows and columns. Entries with colors other than dark blue are now aggregated along the main diagonal.
(F) Applying the same permutation to the correlation matrix reveals the four correlated clusters of variables ({1,9},{4,7,8,11},{2,6,12},{3,5,10}) hidden within the data.
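
Two steps in the caption can be checked with a few lines of code: the count of 4,213,597 partitions in panel (C), and the row-and-column permutation used in panels (E) and (F). The following is a minimal sketch, not the authors' implementation. The function names are hypothetical, and the 0/1 matrix is an illustrative stand-in for the affinity and correlation matrices, whose actual values are not given here.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def stirling2(n, k):
    """Number of ways to split n labeled items into k non-empty parts."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    # The n-th item either forms a new part or joins one of k existing parts.
    return stirling2(n - 1, k - 1) + k * stirling2(n - 1, k)


def partitions_by_size(n):
    """Map each number of parts k to the count of partitions with k parts."""
    return {k: stirling2(n, k) for k in range(1, n + 1)}


def cluster_order(clusters):
    """Flatten clusters of 1-based variable labels into a 0-based index order."""
    return [v - 1 for cluster in clusters for v in sorted(cluster)]


def reorder(matrix, order):
    """Apply the same permutation to the rows and columns of a square matrix."""
    return [[matrix[i][j] for j in order] for i in order]


# Panel (C): partitions of twelve variables, grouped by number of parts.
counts = partitions_by_size(12)
total = sum(counts.values())  # 4213597, the figure quoted in the caption

# Panels (E)-(F): the four clusters reported in the caption.
clusters = [{1, 9}, {4, 7, 8, 11}, {2, 6, 12}, {3, 5, 10}]
order = cluster_order(clusters)

# Stand-in matrix (assumption): 1 where two variables share a cluster, else 0.
label = {v - 1: c for c, cluster in enumerate(clusters) for v in cluster}
toy = [[int(label[i] == label[j]) for j in range(12)] for i in range(12)]
blocked = reorder(toy, order)  # non-zero entries now sit in diagonal blocks
```

Summing the per-size counts reproduces the total number of partitions, which is why exhaustive enumeration is feasible for twelve variables but not for much larger examples.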

doi: https://doi.org/10.1371/journal.pgen.1000479.g003