Fig 1.
Topological data analysis pipeline.
Starting from point cloud data—e.g., cell centroid locations in tissue—we construct filtered simplicial complexes to approximate the underlying geometric structure. Persistent homology is then computed to extract topological features such as connected components, loops, and voids across multiple scales. These features are vectorised, and we explore weightings that emphasise structure at different scales. The resulting representations are used for clustering. We also introduce the persistence weighted death simplices (PWDS) visualisation to enhance interpretability of the topological features.
Fig 2.
Lupus murine spleen cell centroid data.
A: Spatial distribution of cell centroids for the 25 cell types in the lupus murine spleen CODEX data set. The data set consists of three healthy samples (BALBc-1, BALBc-2, BALBc-3), three samples in an early stage of the disease (MRL-4, MRL-5, MRL-6), two samples in an intermediate stage of the disease (MRL-7, MRL-8) and one sample in a late stage of the disease (MRL-9). B: Cell counts of the two types of pulp for each sample (entire slide), in tens of thousands of cells. C: Point clouds corresponding to the main parts of the spleen, the red pulp and the white pulp, for each sample, classified by disease stage. The white pulp forms compartments surrounded by the red pulp.
Fig 3.
Examples of persistent homology.
A: Example of the alpha filtration for a point cloud forming a loop, together with the corresponding persistence diagram, which has one persistent feature in degree 1 corresponding to the loop. B: Example of persistent homology for a point cloud consisting of four loops of various sizes and densities. For each loop, we indicate the corresponding feature in the persistence diagram. Less dense loops have larger birth values; bigger loops have larger death values. C: Steps of the computation of a persistence image with weight on persistence w1(b,p) = p.
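The persistence image computation in panel C can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `persistence_image`, the Gaussian bandwidth `sigma`, the grid `resolution`, and the choice to evaluate the Gaussians at pixel centres (rather than integrate over each pixel) are all assumptions made for the example. Only the weight w1(b,p) = p is taken from the caption.

```python
import numpy as np

def persistence_image(diagram, resolution=20, sigma=0.1):
    """Sketch of a persistence image on (birth, persistence) coordinates.

    diagram: sequence of (birth, death) pairs.
    Each point contributes a Gaussian bump weighted by w1(b, p) = p.
    `resolution` and `sigma` are illustrative defaults, not values from the paper.
    """
    diagram = np.asarray(diagram, dtype=float).reshape(-1, 2)
    births = diagram[:, 0]
    pers = diagram[:, 1] - diagram[:, 0]  # persistence = death - birth

    # Grid of evaluation points covering the diagram.
    b_grid = np.linspace(births.min(), births.max(), resolution)
    p_grid = np.linspace(0.0, pers.max(), resolution)
    B, P = np.meshgrid(b_grid, p_grid)  # rows index persistence, columns index birth

    image = np.zeros_like(B)
    for b, p in zip(births, pers):
        weight = p  # w1(b, p) = p: more persistent features count more
        image += weight * np.exp(-((B - b) ** 2 + (P - p) ** 2) / (2 * sigma ** 2))
    return image

# One persistent loop and one short-lived feature.
img = persistence_image([(0.1, 0.9), (0.2, 0.3)])
```

With this weighting, the short-lived feature contributes little to the image and a feature of zero persistence contributes nothing.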
Fig 4.
Persistence weighted death simplices (PWDS) visualisation.
A: Colouring and weighting of the persistence diagram. The colouring in the diagram distinguishes features based on their birth values, with red representing features formed by proximal points, blue for those formed by distal points, and a continuous gradation of colour for features in between. The intensity of the colour reflects the feature’s persistence (death minus birth), with darker shades indicating more prominent features. B: Visualisation of persistence weighted death simplices (PWDS) for five loops with increasing noise with thresholds and
. Each triangle indicates the approximate location of a loop detected by persistent homology with alpha complexes, coloured as described above. The PWDS visualisation of the first loop shows only red triangles, indicating features formed by proximal points, with a prominent red triangle for the large loop and smaller red triangles for voids. As noise is added inside the loop, blue triangles appear for features formed by distal points, and the large loop becomes less prominent. With increasing noise, the red triangles shrink and lighten, and eventually, only light-red triangles remain, indicating the loss of structure at the larger scale.
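The colouring rule described in panel A can be illustrated with a small sketch. It assumes a linear red-to-blue gradient in the birth value and an opacity proportional to persistence; the function name `pwds_colour` and the parameters `birth_lo`, `birth_hi` and `pers_max` are hypothetical stand-ins for the thresholds used in the figure and are not taken from the paper.

```python
import numpy as np

def pwds_colour(birth, death, birth_lo, birth_hi, pers_max):
    """Return an (r, g, b, alpha) colour for one degree-1 feature.

    Hue encodes birth: red at or below birth_lo (proximal points),
    blue at or above birth_hi (distal points), linear blend in between.
    Alpha encodes persistence (death - birth), saturating at pers_max.
    All three thresholds are hypothetical parameters for this sketch.
    """
    t = float(np.clip((birth - birth_lo) / (birth_hi - birth_lo), 0.0, 1.0))
    alpha = float(np.clip((death - birth) / pers_max, 0.0, 1.0))
    return (1.0 - t, 0.0, t, alpha)

# A prominent loop born early: fully red, fully opaque.
prominent = pwds_colour(0.0, 1.0, birth_lo=0.0, birth_hi=1.0, pers_max=1.0)
# A weaker loop born late: fully blue, half transparent.
distal = pwds_colour(2.0, 2.5, birth_lo=0.0, birth_hi=1.0, pers_max=1.0)
```

Each colour would then be applied to the triangle (death simplex) associated with the corresponding feature.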
Fig 5.
PWDS of red and white pulp cell populations in the murine spleen tissue.
Visualisation of PWDS for the red and white pulp in a healthy sample (BALBc-1) and a diseased sample (MRL-4). Triangles correspond to loops detected by PH with alpha complexes. Red triangles correspond to loops formed by proximal cells, blue ones to loops formed by distal cells, and the intensity of colour is set according to the prominence of the feature. The parameters of the colour gradation depend on the cell type as in the example in the Methods section: and
. PH detects the coarse structure of the spleen, i.e. the noisy holes left in the red pulp by the white pulp, as prominent loops (intense-red triangles). The larger number of small light-blue triangles in the red pulp PWDS of diseased samples indicates a higher level of infiltration of red pulp cells into the holes left by the white pulp.
Fig 6.
Betti curves of red pulp, white pulp, and B cell populations in the murine spleen tissue.
A: Red pulp cell locations in the lupus murine spleen CODEX data. B: Red pulp normalised Betti-1 curves, which count the proportion of loops that are alive at a given parameter value, relative to the total number of loops detected by PH (alpha). The peaks of the curves are due to a large proportion of cells lying within dense regions where many small-scale loops form. The location of the peak measures the typical scale of these dense environments. Differences in the location of the peak indicate an increase in density within the dense regions of the red pulp in disease. C: Cell counts of the red pulp cells for each sample. D: PCA plot of the red pulp normalised Betti-1 curves of the alpha filtration, where we observe two separated groups corresponding to healthy and diseased samples. E, F, G, H: Same plots for white pulp cells, where the opposite phenomenon is detected by the Betti-1 curves: a decrease in density in the dense regions of the white pulp in disease. I, J, K, L: Same plots for B cells, where, apart from a displacement of the peaks due to a decrease in density within the B follicles, we further identify a decrease in the height of the peak of the Betti-1 curves as the disease progresses, signalling a reduction in the size of the B follicles.
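A normalised Betti-1 curve as described in panel B can be computed directly from the degree-1 persistence intervals. The sketch below assumes half-open intervals [birth, death) and a user-chosen evaluation grid; the function name `normalised_betti_curve` is illustrative.

```python
import numpy as np

def normalised_betti_curve(intervals, grid):
    """Proportion of features alive at each grid value.

    intervals: sequence of (birth, death) pairs from degree-1 PH.
    grid: filtration values at which to evaluate the curve.
    A feature is counted as alive at t when birth <= t < death.
    """
    intervals = np.asarray(intervals, dtype=float).reshape(-1, 2)
    grid = np.asarray(grid, dtype=float)
    if len(intervals) == 0:
        return np.zeros_like(grid)
    births = intervals[:, 0][:, None]
    deaths = intervals[:, 1][:, None]
    alive = (births <= grid[None, :]) & (grid[None, :] < deaths)
    # Dividing by the total number of loops makes samples with
    # different cell counts comparable.
    return alive.sum(axis=0) / len(intervals)

curve = normalised_betti_curve([(0, 2), (1, 3), (1, 1.5)], grid=[0, 1, 2, 3])
# curve is [1/3, 1, 1/3, 0]
```

Stacking one such curve per sample, evaluated on a common grid, gives a matrix that can be passed to PCA as in panel D.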
Fig 7.
PWDS and Betti curves for all cell types together in the human lung tissue.
A: Visualisation of PWDS for all cell types together with a choice of six representative samples out of 32. The visualisation corresponds to loops in PH alpha, with threshold parameters and
. We observe the extensive cellular infiltrate in the lung tissue infected with COVID-19, indicated by small light-red triangles. In healthy samples, PH detects the cavities in the lungs formed by thin walls of cells, indicated by intense-red triangles. For some COVID-19 samples, holes within the cellular infiltrate are also detected. Since there is little noise inside the holes in this data set, there are few blue triangles. Most of them correspond either to holes with concave shapes or to holes whose walls are not completely contained in the sample, so that they are formed by distal cells. B: Normalised Betti curves of degree 1 of the alpha filtration for all cells. The diseased samples behave very similarly to what we observed in the spleen data set: a peak at small scales due to the presence of large dense regions. The Betti-1 curves for the healthy samples do not have a pronounced peak at small scales, since the thin walls do not contain a large number of small-scale loops, but they have heavier tails due to the large proportion of large-scale holes.
Fig 8.
Clustering based on the topology of endothelial cells in the human lung tissue.
A: Clustering based on the topological descriptor that attains the best Rand score when compared to the biological classification: percentiles of death values of degree 1 PH (alpha) of endothelial cells. We observe that the Alveolitis samples are approximately separated from the other two stages, and that the third cluster contains sparser samples. B: PCA of the topological descriptor from endothelial cells that attains the best Rand score, with the clusters separated by black lines. C: For each sample, we plot the density of endothelial cells. The colour of each point (yellow, orange or red) indicates the stage of the disease, and the colour of its label (blue, green or purple) indicates the cluster we obtain. The clustering corresponds approximately to thresholding by density.
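The descriptor and the comparison score in panel A can be sketched as follows. This is an illustration under stated assumptions: the percentile levels in `qs` are hypothetical, the function names are illustrative, and the plain (unadjusted) Rand index is used here, which may differ from the exact score computed in the paper. The clustering step itself is not shown.

```python
from itertools import combinations

import numpy as np

def death_percentiles(intervals, qs=(10, 25, 50, 75, 90)):
    """Vectorise a degree-1 diagram as percentiles of its death values.

    The percentile levels `qs` are illustrative, not taken from the paper.
    """
    deaths = np.asarray(intervals, dtype=float).reshape(-1, 2)[:, 1]
    return np.percentile(deaths, qs)

def rand_index(labels_a, labels_b):
    """Plain Rand index: fraction of sample pairs on which two labelings agree.

    A pair agrees when both labelings put the two samples together,
    or both put them apart.
    """
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)

descriptor = death_percentiles([(0, 1), (0, 2), (0, 3)], qs=(0, 50, 100))
score = rand_index([0, 0, 1, 1], [1, 1, 0, 0])  # same partition, relabelled
```

The Rand index is invariant to how the clusters are named, which is why a cluster labelling can be compared directly with the biological disease stages.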