Noise robustness of persistent homology on greyscale images, across filtrations and signatures

doi:10.1371/journal.pone.0257215

Fig 1.

Persistent homology pipeline.

PH can be calculated for different types of spaces S, which can represent a single data observation (typical for classification tasks) or a complete dataset. In this figure, we calculate the PH information for an image. The input for PH is a filtration, a nested family of spaces that approximate the structure of S at different scales r₁ < r₂ < … < r_t. For example, to approximate the structure of an image at scale r, we can look only at pixels within distance r from the top left pixel (top panel). Alternatively, we can treat an image as a point cloud, and approximate its structure at resolution r by constructing an edge between two points whenever they are within distance r (bottom panel). For a homological dimension k (in the figure, k = 1), PH registers the birth and death times r of every k-dimensional cycle (connected component, loop, void, etc.) within the filtration, and is commonly summarized with a scatter plot of birth and death coordinates, referred to as a persistence diagram (PD). It is often interesting or even necessary to transform the PD into a different persistence signature, such as a persistence landscape (PL, top panel) or a persistence image (PI, bottom panel).
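
The point-cloud construction in the bottom panel can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: it handles the simplest case k = 0 (connected components) rather than the k = 1 shown in the figure, it assumes that all points are born at scale 0 and that an edge appears once two points are within distance r, and the function name `rips_h0_intervals` and the sample points are hypothetical.

```python
import math
from itertools import combinations


def rips_h0_intervals(points):
    """0-dimensional persistence intervals of a point cloud (illustrative sketch).

    Every point is a connected component born at scale 0. Edges are added in
    order of increasing length; whenever an edge joins two different
    components, one of them dies at that edge length.
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, sorted by the scale r at which they appear.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )

    intervals = []
    for r, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j        # merge the two components
            intervals.append((0.0, r))     # one component dies at scale r
    intervals.append((0.0, math.inf))      # the last component never dies
    return intervals


if __name__ == "__main__":
    # Hypothetical point cloud: two nearby points and one far away.
    print(rips_h0_intervals([(0, 0), (1, 0), (5, 0)]))
```

Each returned pair is one point (birth, death) of the persistence diagram. Computing loops (k = 1) requires tracking triangles as well as edges and is usually left to a dedicated PH library.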

Fig 2.

Filtration on a cubical complex.

The first image shows the values, in the range [0, 100], of the filtration function ϕ. The next nine images show the cubical complexes K₁₀ ⊆ K₂₀ ⊆ K₃₀ ⊆ ⋯ ⊆ K₉₀, where K_r is the union of all cubes, i.e., pixels (u, v), with filtration value ϕ(u, v) ≤ r. There is only one 1-dimensional cycle, i.e., hole (a one-pixel hole in the third row and third column), which first appears in K₄₀ and then disappears, or closes, in K₇₀.
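
The sublevel sets K_r can be reproduced on a toy example. The sketch below is an illustration under stated assumptions, not the figure's actual data: the 5 × 5 array `PHI` is a hypothetical filtration function chosen so that a single one-pixel hole appears at r = 40 and closes at r = 70, and the helper names are made up. Pixels are treated as closed squares, so a hole is a bounded region of the complement; complement pixels are therefore connected only through shared edges.

```python
from collections import deque

# Hypothetical filtration function on a 5 x 5 image (not the data of Fig 2).
PHI = [
    [100, 90, 80, 90, 100],
    [90, 50, 10, 50, 90],
    [80, 20, 70, 30, 80],
    [90, 60, 40, 60, 90],
    [100, 90, 80, 90, 100],
]


def sublevel_set(phi, r):
    """Boolean mask of K_r: pixels (u, v) with phi(u, v) <= r."""
    return [[value <= r for value in row] for row in phi]


def count_holes(mask):
    """Number of 1-dimensional holes of the cubical complex given by mask.

    The complement is padded with one ring of background pixels, its
    edge-connected components are counted by flood fill, and the single
    unbounded component is subtracted.
    """
    rows, cols = len(mask) + 2, len(mask[0]) + 2
    padded = [[False] * cols]
    padded += [[False] + list(row) + [False] for row in mask]
    padded += [[False] * cols]

    seen = [[False] * cols for _ in range(rows)]
    components = 0
    for i in range(rows):
        for j in range(cols):
            if padded[i][j] or seen[i][j]:
                continue
            components += 1
            seen[i][j] = True
            queue = deque([(i, j)])
            while queue:
                a, b = queue.popleft()
                for da, db in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    x, y = a + da, b + db
                    if (0 <= x < rows and 0 <= y < cols
                            and not padded[x][y] and not seen[x][y]):
                        seen[x][y] = True
                        queue.append((x, y))
    return components - 1


if __name__ == "__main__":
    for r in range(10, 100, 10):
        print(r, count_holes(sublevel_set(PHI, r)))
```

In this toy example the hole count is 1 for r = 40, 50, 60 and 0 otherwise, which corresponds to a single persistence interval (40, 70) in dimension 1.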

Fig 3.

Filtrations.

The first plot shows an example MNIST image Z, with greyscale pixel values in [0, 250]. The next four plots respectively show the heat map of the binary, greyscale, density and radial filtration function ϕ on K(Z), where K(Z) is the cubical complex corresponding to the given example image. The final two plots visualize the heat map of ϕ, where ϕ is the discretized version of the Rips and DTM filtration functions and X(Z, z₀) is the point cloud corresponding to the image Z.

Fig 4.

Persistent homology across filtrations.

Persistent homology is a multi-set of persistence intervals (b_i, d_i), where b_i and d_i are respectively the time when a cycle i (a connected component, loop, void, etc.) is born, and the time when it dies in a filtration. The table lists 1-dimensional PH calculated for a few example MNIST images (or an image with an outlying pixel), across selected filtrations. The notation (b, d)* indicates that multiple cycles appear and disappear at the same time (thus, PH is a multi-set, where each element has its multiplicity). The notation (b, d)** indicates that there are multiple intervals with similar birth and death values. The cardinality of the multi-set of persistence intervals determines the number of cycles. However, the definition of the filtration determines the interpretation of birth and death times, so that PH with different filtrations captures different topological (and geometric) information, which in turn influences its noise robustness and discriminating power. For example, an additional point at an outlying distance from a point cloud can have an important influence on PH with the Rips filtration (e.g., an additional black pixel within a hole will change the persistence of that hole, see persistence intervals in red), but this is less true for the DTM filtration, as the outlier will have a large distance from its nearest point cloud neighbours and will thus appear only very late in the filtration. A reverse example is a pixel with an outlying greyscale value (e.g., a white pixel in a dark region), which has an important influence on PH with the binary, greyscale and radial filtrations (in blue), but much less with the density, Rips and DTM filtrations. If geometric information is captured, PH becomes sensitive to some affine transformations.
Furthermore, 1-dimensional PH with the binary, greyscale and density filtrations cannot differentiate between digits 0, 6 and 9 (as they all have one hole of similar brightness), but the radial filtration makes it possible to discriminate between digits 6 and 9 (as the holes have a different position), and the Rips and DTM filtrations make it possible to distinguish between 0 and 6 (as the holes are of different size).
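
The multi-set view of PH described above can be made concrete with a small sketch. It is a minimal illustration only: the helper name `summarize_intervals` and the interval values are hypothetical and are not taken from the table in the figure.

```python
from collections import Counter


def summarize_intervals(intervals):
    """Summarize a multi-set of persistence intervals (b, d).

    Returns the number of cycles (cardinality counted with multiplicity),
    the intervals that occur more than once (written (b, d)* in the
    caption), and the largest persistence d - b.
    """
    multiset = Counter(intervals)
    return {
        "num_cycles": sum(multiset.values()),
        "repeated": {iv: m for iv, m in multiset.items() if m > 1},
        "max_persistence": max((d - b for b, d in multiset), default=0),
    }


if __name__ == "__main__":
    # Hypothetical 1-dimensional PH: one interval occurs twice.
    print(summarize_intervals([(1, 3), (1, 3), (2, 7)]))
```

Using a `Counter` rather than a set keeps the multiplicity of each interval, so two cycles with identical birth and death times are still counted as two cycles.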

Fig 5.

Noise robustness of 0-dimensional persistent homology on an example image.

Illustration of the effect of various image transformations when the image is represented with its filtration function values (1st row of each filtration), or 0-dimensional persistence diagram (2nd row), persistence landscape (3rd row), or persistence image (4th row).

Fig 6.

Noise robustness of 1-dimensional persistent homology on an example image.

Illustration of the effect of various image transformations when the image is represented with its filtration function values (1st row of each filtration), or 1-dimensional persistence diagram (2nd row), persistence landscape (3rd row), or persistence image (4th row).

Table 1.

Persistent homology across signatures.

Table 2.

Image noise.

Table 3.

Noise robustness of persistent homology on 1000 MNIST greyscale images.

Fig 7.

Noise robustness and discriminative power of persistent homology on 1000 MNIST greyscale images.

The figure shows the drop in SVM classification accuracy when the test dataset is noisy, compared to the non-noisy test set, averaged across 1000 images in the MNIST dataset. Each image is represented either with its filtration function values (1st row of each filtration), or with its 0- or 1-dimensional persistence diagram (2nd and 5th row), persistence landscape (3rd and 6th row) or persistence image (4th and 7th row). The size of the node reflects the absolute accuracy on the non-noisy test data. The colour of the node reflects the accuracy drop, indicated in the colour bar. In particular, the presence of red nodes for PH information (2nd to 7th row) implies that PH is not robust under all types of noise, whichever filtration and persistence signature is used.
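
The quantity encoded by the node colour can be written down directly. The following is a minimal sketch that assumes the drop is the accuracy on the non-noisy test set minus the accuracy on the noisy test set; the function names and the label lists are hypothetical, and the classifier itself (an SVM in the figure) is not reproduced here.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def accuracy_drop(y_true, pred_clean, pred_noisy):
    """Accuracy on the non-noisy test set minus accuracy on the noisy one.

    A value close to 0 indicates robustness to the given noise; a large
    positive value indicates that the representation is sensitive to it.
    """
    return accuracy(y_true, pred_clean) - accuracy(y_true, pred_noisy)


if __name__ == "__main__":
    # Hypothetical labels and predictions for four test images.
    y_true = [0, 1, 2, 3]
    pred_clean = [0, 1, 2, 3]   # predictions on the original images
    pred_noisy = [0, 1, 0, 0]   # predictions on the transformed images
    print(accuracy_drop(y_true, pred_clean, pred_noisy))
```

In this sketch one such value would be computed per combination of filtration, persistence signature and noise type, giving one node of the figure each.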
