Clustering-independent analysis of genomic data using spectral simplicial theory
Fig 5
Feature selection on the MNIST dataset using the combinatorial Laplacian score.
Each sample is a grey-scale image of a hand-written digit from 0 to 9, and each pixel represents a feature. The degree of localization of each feature in the simplicial complex is assessed using the 0- and 1-dimensional combinatorial Laplacian scores. (a) Number of rejected null hypotheses at an FDR of 0.05 as a function of the radius ε of the balls in the Vietoris-Rips complex. In this example, the statistical power of the 0- and 1-dimensional scores is maximized for ε ≈ 0.7. The significance of the scores was determined through a permutation test in which the pixels were randomized 5,000 times. (b) Vietoris-Rips complex colored by the intensity of a pixel that is significant under the 0-dimensional score (q-value < 0.005) but not under the 1-dimensional score (q-value = 0.5). The intensity of the pixel is high in a densely connected, topologically trivial region of the complex containing images of the digit `4`. Images associated with several nodes are shown for reference, with the pixel highlighted in red. (c) Vietoris-Rips complex colored by the intensity of a pixel that is significant under both the 0-dimensional score (q-value < 0.005) and the 1-dimensional score (q-value < 0.005). The intensity of the pixel is high in a densely connected region that surrounds a large non-contractible cycle of the simplicial complex. The cycle is generated by images that belong to the sequence of digits `7`, `3`, `1`, `9`. Images associated with several nodes along the cycle are shown for reference, with the pixel highlighted in red.
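The significance procedure described in panel (a) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the classical graph Laplacian score on the 1-skeleton of the complex as a stand-in for the 0-dimensional combinatorial Laplacian score (the paper's exact normalization may differ), it assumes that two points are connected when their distance is at most `eps`, and it uses the Benjamini-Hochberg procedure to obtain q-values. All function names are hypothetical.

```python
import numpy as np

def rips_adjacency(points, eps):
    """Adjacency matrix of the 1-skeleton; edge if distance <= eps (assumed convention)."""
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    adjacency = (dist <= eps).astype(float)
    np.fill_diagonal(adjacency, 0.0)  # no self-loops
    return adjacency

def laplacian_score(f, adjacency):
    """Classical graph Laplacian score; lower values mean a more localized feature.

    Assumes f is not constant on the nodes that have at least one edge.
    """
    degree = adjacency.sum(axis=1)
    # Center the feature around its degree-weighted mean.
    f_centered = f - (f @ degree) / degree.sum()
    laplacian = np.diag(degree) - adjacency
    return (f_centered @ laplacian @ f_centered) / (f_centered @ (degree * f_centered))

def permutation_pvalue(f, adjacency, n_perm=1000, seed=None):
    """One-sided p-value: fraction of shuffled features scoring at or below the observed one."""
    rng = np.random.default_rng(seed)
    observed = laplacian_score(f, adjacency)
    null = np.array([laplacian_score(rng.permutation(f), adjacency)
                     for _ in range(n_perm)])
    # Add-one correction keeps the p-value strictly positive.
    return (1 + np.sum(null <= observed)) / (1 + n_perm)

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg q-values, returned in the input order."""
    p = np.asarray(pvalues, dtype=float)
    n = p.size
    order = np.argsort(p)
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, starting from the largest p-value.
    q_sorted = np.minimum.accumulate(scaled[::-1])[::-1]
    q = np.empty(n)
    q[order] = np.clip(q_sorted, 0.0, 1.0)
    return q
```

Under these assumptions, one would compute one p-value per pixel, convert the p-values to q-values, and count the pixels with q-value below 0.05 for each value of `eps` to obtain a curve analogous to panel (a).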