Geometric and topological characterization of the cytoarchitecture of islets of Langerhans

The islets of Langerhans are critical endocrine micro-organs that secrete hormones regulating energy metabolism in animals. Insulin and glucagon, secreted by beta and alpha cells, respectively, are responsible for metabolic switching between fat and glucose utilization. Dysfunction in their secretion and/or counter-regulatory influence leads to diabetes. Debate in the field centers on the cytoarchitecture of islets, as the signaling that governs hormonal secretion depends on structural and functional factors, including electrical connectivity, innervation, vascularization, and physical proximity. Much effort has therefore been devoted to elucidating which architectural features are significant for function and how derangements in these features are correlated or causative for dysfunction, especially using quantitative network science or graph theory characterizations. Here, we ask if there are non-local features in islet cytoarchitecture, going beyond standard network statistics, that are relevant to islet function. An example is ring structures, or cycles, of α and δ cells surrounding β cell clusters or the opposite, β cells surrounding α and δ cells. These could appear in two-dimensional islet section images if a sphere consisting of one cell type surrounds a cluster of another cell type. To address these issues, we developed two independent computational approaches, geometric and topological, for such characterizations. For the latter, we introduce an application of topological data analysis to determine locations of topological features that are biologically significant. We show that both approaches, applied to a large collection of islet sections, are in complete agreement in the context both of developmental and diabetes-related changes in islet characteristics. The topological approach can be applied to three-dimensional imaging data for islets as well.


Introduction
Described by Paul Langerhans as part of his medical dissertation in 1869, the islets of Langerhans [1] are endocrine micro-organs embedded in the acinar tissue of the exocrine pancreas.While they comprise only about 1-4% of the total mass of the pancreas [2], they produce and secrete hormones that are crucial for regulating blood glucose levels as well as levels of amino acids, free fatty acids, keto acids, glycerol, and other energy-rich nutrients.There are several cell types within islets, including beta cells, alpha cells, delta cells, and others, each of which produces a specific hormone with complex counter-regulatory actions.These cells communicate through a complex network of paracrine [3] and autocrine signaling pathways involving hormones, neuropeptides, growth factors, and through electrical coupling via gap junctions [4,5].
Beta cells are responsible for producing insulin, which promotes the uptake and storage of glucose in muscle and adipose tissue and inhibits hepatic glucose production.Insulin also affects the storage of glucose in the liver in the form of glycogen, as well as promotes storage of lipids in fat tissue (inhibition of the hormone-sensitive lipase, promotion of lipoprotein lipase), and amino acids in muscle tissue.Alpha cells produce glucagon, which increases blood glucose levels by promoting the breakdown of glycogen stored in the liver and muscles, and by stimulating gluconeogenesis, the production of glucose from non-carbohydrate sources such as amino acids and fatty acids.Delta cells produce somatostatin, which helps regulate the secretion of both insulin and glucagon.Electrical connectivity of delta cells is just being appreciated in recent research and they play a central role in the regulation of both insulin and glucagon [6].Other cell types in the islets, such as epsilon cells, produce hormones that are involved in appetite regulation and the overall metabolic response to food intake.These cells are not randomly arranged in the islets of Langerhans, but instead form a complex three-dimensional architecture.It is believed that islets have a core of beta cells (β), surrounded by mantles of alpha cells and delta cells (αδ) that are close to the periphery, with the outer layer consisting of pancreatic polypeptide (PP) cells.For example, [7] observed that small human islets (40-60μm in diameter) in their study had β-cells in a core position, α-cells in a mantle position, and vessels at the periphery.In bigger islets they observed α-cells in a mantle position as well as along vessels that penetrated and branched inside the islets.There is some degree of variation in the precise arrangement of the cells between islets, which themselves come in a variety of sizes, and between species [2,8,9].The precise arrangement of cells may play a role in the signaling between cells and the regulation of hormone secretion [7,[10][11][12][13].How this arrangement of different cell types in islets [14] relates to innervation [15] of, and blood flow [16,17] through islets is still an area of active study [18,19].Motivated by the debate about the functional significance of core-mantle segregation of islet cells, our aim in this work is to quantitate ringstructures or cycles of αδ-cells that surround β-cells and cycles of β-cells that surround αδ-cells in 2D and 3D data sets of islet cell composition.
The development of islet cell populations and their numbers during the transition from birth to adulthood is also a complex process that is not fully characterized in humans.However, studies have shown that there are dynamic changes in the numbers of different cell types during this period [20,21].For example, the number of beta cells in the islets increases significantly during the first few months after birth, then gradually increases until the age of about 5-6 years, after which it remains relatively stable until early adulthood.The numbers of other islet cell types, such as alpha and delta cells, also change during this period, although the patterns and magnitudes of these changes may differ from those of beta cells.There can be significant individual variability in the numbers of islet cells during development, and environmental factors such as diet and lifestyle also influence these processes.The relationship between islets and acinar and ductal cells may also be different at different developmental stages and in different species (e.g., islets are more intralobular in humans and more interlobular in mice).
In type 2 diabetes, it has been possible to quantitatively assess the decrease in the number and function of different cell types in the islets of Langerhans [18,22,23].Beta cell mass and insulin secretion decrease, and delta cell number and somatostatin secretion also decrease.Early during the development of diabetes there may be an adaptive increase in mass or function (or both) and in some people (non-progressors), this may be more functional or persistent than in others (progressors) and is crucial for our understanding why some people do not develop T2D despite insulin resistance and why some respond to dietary interventions while others do not [24][25][26][27][28][29].Glucagon secretion is inhibited by insulin but is also regulated by a complex interplay of several other factors, including glucose levels, amino acids, and neural inputs.Counterintuitively, the regulation of glucagon secretion is impaired, leading to an increase in glucagon secretion.The mechanisms underlying this are not fully understood [30][31][32], but it may be due to a combination of factors besides impaired insulin secretion, including decreased sensitivity of alpha cells to glucose and altered gut hormone signaling.In addition, inflammation and oxidative stress [33], which are known to be elevated in diabetes, can also affect alpha cell function and contribute to increased glucagon secretion.This progressive loss of cellular function ultimately leads to impaired glucose homeostasis and hyperglycemia in type 2 diabetes.
Quantitative study of the arrangement of islet cells in health, disease and development has been largely defined by the availability of data, both imaging and electro-physiological.The majority of imaging data consists of immuno-fluorescence in two-dimensional (2D) sections of islets which are then processed with image analysis software to determine nuclear locations and cell types.Patch-clamp electro-physiological studies measure the electrical activity of individual cells within isolated islets, and are used to investigate the functional connectivity of the islets by recording the activity of cells in response to different stimuli [34][35][36][37][38]. Calcium imaging using fluorescent dyes monitors changes in intracellular calcium levels in response to different stimuli and allows the measurement of the coordinated activity of cells within the islet [39].
Given this data, the complex architecture of the islets of Langerhans has been described quantitatively using a variety of methods.In the context of islets of Langerhans, structural connectivity refers to the physical connections between different cell types within the islet, such as gap junctions [35,40] that facilitate direct communication and signaling between the cells.Functional connectivity [36] refers to the coordination and synchronization of different cell types within the islet.This is essential for proper regulation of glucose metabolism and insulin secretion.Spatial statistics, such as spatial auto-correlation functions which describe the degree of similarity between nearby cells, describe the distribution and arrangement of cells within islets quantitatively [19].Network science [18,21,41,42], going beyond spatial statistics by computing centrality measures, clustering coefficients, and the modularity of the graph defined by the cell-to-cell adjacency matrix obtained from images, has also been used to find quantitative characteristics of islet morphology as evidenced in 2D sections.New imaging techniques have been used to create detailed three-dimensional (3D) images of islets, which have been used to quantify relationships between intraislet capillary density and islet size [43].
More recently, there has been significant activity combining network science with mathematical modeling to understand how alterations in structural and/or functional connectivity affect islet response to glucose-stimulated insulin secretion.The role of hub nodes in networks [44][45][46][47][48][49] has been central to this activity with some conflicting reports.In this context, we set out to investigate if there are quantitative topological characteristics distinct from the network statistics that have been investigated for islet networks that may be relevant for dynamical islet function.These nonlocal topological characteristics may, for example, implicitly incorporate information about innervation and vascularization.They are also specific to the dimensionality of the islet data, unlike functional network characteristics which can in principle be defined without using any information about the spatial milieu of islet cells.
In this work, first, we develop a geometric quantitative characterization of the occurrence of ring structures or cycles comprising of β cells (β-cycles) or αδ cells (αδ-cycles) that surround cells of the other type(s) in 2D islet sections.We noted that this geometric approach was unlikely to generalize robustly to 3D islet images that are becoming available.Therefore, to check the results obtained via this intricate geometric quantitation, we developed a topological data analysis approach that could be checked to agree with the geometric 2D results, and at the same time generalize to 3D data.Specifically, persistent homology (PH) is a branch of topological data analysis that computes topological invariants from networks by varying a minimal edge-length threshold in a network.This allows PH to uncover non-local robust network features.A 2D or 3D data set of locations of cells in an islet is a network of cells spatially embedded in a Euclidean space.In this case, features computed by PH can be geometrically realized as holes in the spatial embedding.Hence, αδ-cycles of interest can be computed as representative boundaries of holes in the embedded network of αδ-cells that contain β-cells inside.However, representative boundaries around these features are not unique by definition [50] so the computed boundary may be geometrically imprecise.In [51] we developed technical tools for finding tight representatives for topological features that improve geometrical precision of estimation of their location.Here, we introduce a way to quantitate topological features that are biologically significant, specifically, those features in the spatial embedding of the αδ-network that contain β-cells inside and vice versa.With these tools in hand, we have investigated a large number of islet sections in this paper in an attempt to approach the cytoarchitecture of the islets of Langerhans in a computationally rigorous setting.We applied our computational approach to both developmental changes in islet cytoarchitecture and to compare diabetic and control islets.

Results
Two data sets of locations of beta (β), alpha (α), and delta (δ) cells in 2D slices of human pancreatic islets were analyzed.The first comprises islets in different developmental stages of gestation (stage 0), 1-35 weeks (stage 1), 12-24 months (stage 2), and 28 months and later (stage 3).Changes in pancreatic islet cyto-architecture during development have been studied previously using this data set [21].The second comprises islets from diabetic and non-diabetic human subjects.We call the former developmental data set and the latter the T2D data set.Subject-wise demographics for the T2D data sets are shown in S4 Table.These details are not available for the developmental data set.α and δ cells together will be denoted by αδ-cells.Ring structures computed using the geometrical method will be called geometric cycles and those computed using topological data analysis will be called PH-cycles.Fig 1 shows examples of cycles around non-singular (NS) components (component of a network with more than one node or cell in this case that were computed using both methods.

Cell composition of islets changes significantly during development
There were 6088, 4942, 3130, and 7203 islets with at least five β-cells and five αδ-cells in stages 0 to 3, respectively.Islets were characterized by their total number of cells (transformed to log scale) and β-cell fraction.The resulting 2D distributions of islet characteristics for each stage were compared between stages pairwise using the Kolmogorov-Smirnov test (KS-test).We found that the distributions of islet characteristics are significantly different between every pair of stages (p-values < 0.05, see S1 Table ).For a more informative comparison, we used the Kullback-Leibler divergence (KL-divergence) to quantitatively assess the relative difference between kernel density estimates (KDEs) of the 2D distributions of islet characteristics.Fig 2A shows that the KL-div between stages 0 and 1 and between 2 and 3 are smaller than all other pairwise comparisons.This indicates that islet cell composition changes more significantly from stage 1 to stage 2. Peaks of the KDEs indicate that a higher proportion of islets have higher β-cell fraction in the later developmental stages of 2 and 3. S10A Fig shows exemplary 2D sections (characteristics similar to the peak of the KDEs) for the developmental stages.We next plot the KDE of the 2D distribution of islets characterized by the number of α and δ cells in them.Only islets with at least 5 α and δ-cells were considered.Fig 2B shows that in the early stages (0 and 1) majority of islets have the same number of α and δ-cells.However, in the later stages of development clusters of islets appear that have more α than δ-cells.Moreover, almost all of the control and diabetic islets (with at least 5 cells of each kind) have more α than δ-cells (see S9 Fig) .We note that control and diabetic islets are from older human subjects (see S4 Table ).The distribution of ages of all subjects in the T2D data set has a minimum age of 15 years and themedian age of 64 years.

There is a correlation between changes in cycle formation around cores and topology of the islets across developmental stages
Computing geometric cycles found 649, 463, 168, and 453 islets with at least one NS β-component inside an αδ-cycle in stages 0 to 3, respectively.In contrast, there were 64, 88, 238, and 823 islets with at least one NS αδ-component inside a β-cycle in stages 0 to 3, respectively.Fig 3A plots the percentages with respect to the total number of islets.Albeit these percentages are small, the trends in percentages of islets with αδ-cycles around β-cores and β-cores around αδcycles are clearly different, with significant changes in both happening after stage 1.Further, a similar pattern is observed in maximum dimension-1 (H 1 ) topological persistence of all islets.S1 Appendix gives an illustration of dimension-1 persistence applied to a point-cloud and an intuitive interpretation of results.S8 Fig shows distributions of the maximum dimension-1 persistence for β-graphs (left panel) and αδ-graphs (right panel) in all 2D sections across developmental stages.Except for stages 0 and 1 all other pairwise comparisons (Mann-Whitney U test) show that the distributions are significantly different.Fig 3B shows the median and 95%tile of distributions of maximum H 1 of αδ-cells and of β-cells in all islets at different developmental stages.Both the median and 95%-tile of β-cells increase after stage 1, suggesting that there are holes with larger robustness that can wrap around NS αδ-components.Similarly, both the median and 95%-tile of the maximum H 1 persistence of αδ-cells decrease from stage 1 to 2 in accordance with a decrease in the percentage of islets with at least one NS β-component inside an αδ-cycle.S1 Fig shows that distributions of maximum persistence are significantly different after stage 1.PLOS COMPUTATIONAL BIOLOGY of 0.10 between stages 2 and 3.For such islets, the largest KL-divergence is between stages 0 and 3, which might be attributable to the high KL-divergence between KDEs of all islets in these stages (largest KL-divergence in Fig 2A is between stages 0 and 3).This is supported by the observation that the changes in peaks of KDE are similar in both cases, a larger proportion have higher β-fraction in stage 3 as compared to stage 0. KS-test estimates significant p-values (less than 0.05) for all pairwise comparisons between developmental stages in both cases of islets, with a NS β-component inside an αδ-cycle and a NS αδ-component inside a β-component (S2 Table ).S10B and S10C Fig show exemplary 2D sections (characteristics similar to the peak of the KDEs) for the developmental stages for islets with at least one β-component inside a mantle and islets with at least one αδ-component inside a mantle, respectively.

Cycles are closer to the islet's periphery than its center
Minimal distances of each geometric cycle from its islet's periphery and center were computed.Fig 6A shows that the minimal distance of the computed cycles from the islet's periphery is less than their minimal distance from the islet's center in the T2D data set.Similar was observed for the developmental data set (see S5 Fig) .Fig 6B shows that αδ-cycles in small islets (estimated area <10000) are close to the periphery in both control and diabetic.However, there exist β-cycles in small islets that are far from the periphery, as shown by large minimal distances from the periphery.We also observe that larger islets have some cycles with a larger minimal distance from the periphery.Moreover, only a few cycles (6% to 19%) contain the islet center inside them.S6 Fig shows distance from periphery vs. islet area for the developmental data set.

All results are consistent between geometric and PH-cycles
For all of the computed geometric cycles, proximal PH-cycles were computed.In at least 89.5% of the islets with at least one geometric cycle around a NS component, a PH-cycle proximal to that geometric cycle was found.Specifically, there were 629, 439, 165, and 440 islets with at least one αδ PH-cycle around a NS β-component in stages 0 to 3, respectively.There  distributions of islet characteristics in both cases are not significantly different for all different categories in both data sets, p-values from KS-test are � 0.05 (at least 0.99).The maximum KL-divergence between KDEs of islets with at least one geometric αδ-cycle around a NS βcomponent and those with at least one αδ PH-cycle around a NS β-compoment across all stages is 0.001.For islets with at least one β-cycle around a NS αδ-component, this number is 0.005.For the T2D data set, these numbers are 0 and 0.004.We note that these KL-divergences are significantly smaller (by an order of magnitude) as compared to the divergences observed in previous results.S3 and S4 Figs show all KDEs for developmental and T2D data sets, respectively.KS-tests and comparison of KL-divergences of KDEs give evidence for agreement between results from geometric and PH-cycles.

PH finds closed polyhedral structures in 3D islets consisting of αδ-cells (βcells) around multiple β-cells (αδ-cells)
We showcase the application of PH to 3D data sets.Structural information of mouse (n = 29) and human (n = 28) islets were obtained from [52].

Methods
Data acquisition.The two data sets for human pancreatic islets in this study comprise of twodimensional coordinates of beta (β), alpha (α), and delta (δ) cells in islets.The data set with islets at different developmental stages is from human pancreatic tissues that were obtained from the University of Chicago Human Tissue Resource Center with an exemption from the Institutional Review Board [21].The different stages are gestation (stage 0), 1-35 weeks (stage 1), 12-24 months (stage 2), and 28 months and later (stage 3).The data set with diabetic and non-diabetic human subjects is from [53].Locations of endocrine cells were obtained as described in the original studies, which we briefly summarize here.Two-dimensional sections of tissue samples were stained for insulin, glucagon, somatostatin, and DAPI.Each section was imaged, and two-dimensional coordinates for each cell nucleus were estimated based on high concentrations of DAPI.The cell type of each cell was recorded as β, α, or δ based on a high concentration of insulin, glucagon, or somatostatin near its nucleus, respectively.
Defining βand αδ-graphs for islets.Islets with at least five αδ-cells and five β-cells were considered.V I ad and V I b denote the sets of αδ-cells and β-cells, respectively, in islet I. Edges between αδ-cells and between β-cells were defined as follows.First, neighborhood radii t I b and t I ad were computed using the pair distribution function [18].The pair distribution function is computed as ratio of the number of cells at a radial distance of r to the number of cells expected if they are randomly distributed.The thickness of the radial shells was chosen as 0.5.At r-values where peaks occur in the curve of the pair distribution of function, it is expected that there is a larger number of cell pairs within intercell distances than would be expected from a random distribution [21].The peak at the smallest value of r represents a primary correlation between cells.Peaks of diminishing heights occur at higher r-values as a result of secondary correlations between cells.A peak-finding algorithm was implemented to compute t I b as the smallest rvalue at which the pair distribution function for β-cells is minimal between the second and the third peaks.t I ad is computed similarly.The area of an islet was defined as the area of the bounding box around all of its cells.After t I b was computed, edges between β-cells are initially all the β-cells that are at most with t I b distance apart.Edges between αδ-cells were initialized similarly.Second, a shadow algorithm was implemented to account for obstruction in the interaction of two cells due to the presence of a third cell between them [18].All edges that are obstructed by a cell were removed.The final sets of β-edges and αδ-edges are denoted by E I b and E I ad ; respectively.
Computing cycles using geometry.αδ-cycles around β-cells were computed as follows.The αδ-graph of islet I is the discrete graph on its αδ-cells, denoted by G I ad � ðV I ad ; E I ad Þ.G I b is defined similarly.We drop the superscript I for notational convenience.A αδ-cycle is a simple closed curve on G αδ such that it partitions the graph into two disjoint sets, one inside the cycle and one outside.It follows from the Jordan Curve Theorem that the graph has to be planar.G αδ was made planar using dummy vertices-if two edges, {v 1 , v 2 } and {v 3 , v 4 } intersect at p, then they are removed from E I ad ; a dummy vertex u p located at p is added to V αδ , and edges {v i , u p }, 1 � i � 4, are added to E I ad : Geometric αδ-cycles around β-cells were computed using three main steps.First, a list of all αδ-cycles was computed as follows.For each connected component of c k of G αδ , a spanning tree T k is constructed on c k .Let F k be the set of edges that are in E αδ but not in T k .The weight of an edge between two cells is defined as the Euclidean distance between the cells.For each edge {v i , v j } in F k a shortest weighted path P between v i and v j in T k is computed.The set of edges {{v i , v j }} [ {edges in P} forms an αδ-cycle or a cycle of αδ-cells in c k .αδ-cycles in all components of G αδ were computed.Second, αδ-cycles that surround β-cells were determined.For each β-cell, all αδ-cycles that contain it are computed using a windingnumber algorithm.A set of β-cells can be inside multiple cycles.Sets of β-cells were computed that are inside the same set of αδ-cycles.Let B be the collection of such sets of β-cells.Third, for each set S of β-cells in B; a minimal geometric cycle, P S , of αδ-cells is defined and computed as follows.

Determine the closest pair of cells {β
2. Let angðv; uÞ ¼ arctan where v x and v y are x and y-coordinates of vertex v in the islet.If ang(v, u) < 0, then define angðv; uÞ ¼ arctan In other words, it is the positive or counter-clockwise angle that the horizontal line through u has to turn to be parallel to the line through u and v. Let θ * = ang(β i , v * ).
3. For v j 2 V αδ , let N(v) be the neighbors of v j in G αδ .Let anglesðv j Þ ¼ ½angðv m j ; v j Þ� ¼ ½y m � 1�m�jNðv j Þj be the sorted list of angles in increasing order, where v m j 2 Nðv j Þ: 4. The notion of minimality that we define is that the counterclockwise turn at every node along the cycle should be minimal.
5. Hence, v 1 2 P S is computed as Nðv * Þ such that θ k is the smallest angle greater than θ * .Once we have P S = [v * , v 1 ], the next cells in the path are computed as follows.If v m and v m+1 are two adjacent cells in P S , the next cell v m+2 2 P S is computed as v mþ2 ¼ v k mþ1 2 Nðv mþ1 Þ such that θ k is the smallest angle greater than ang(v m , v m+1 ).The next cells are computed till we reach v * .Note that if no such θ k exists at a step, then the minimal counterclockwise turn is for k = 1, hence, v mþ2 ¼ v 1 mþ1 : 6.It is possible that the computed minimal path does not contain S inside it.Hence, we check that every β cell in S is inside the computed path using the winding number algorithm.If not true, then the pair {β i , v * } is marked as an incompatible pair and we begin with step 1 by finding the closest pair but ignoring the ones that are marked as incompatible.
Sizes of components of S are determined using the Networkx Python package [54].β-cycles around αδ-cells are computed similarly.
An advantage of using angles to find geometric loops is that we can identify components that are partially surrounded by cells of the other kind (see S7 Fig).However, analysis of partial loops is not included in this work.
Kernel density estimation and Kullback-Leibler divergence.Islets are characterized by β-cell fraction and the total number of cells in them.The number of cells in islets was transformed by ln(1 + x).KDE was estimated using scipy.stats.gaussian_kdemodule of the Scipy Python package [55], with the default method for bandwidth estimation.Two-dimensional KDE was computed on a grid of resolution 100 over the space of (β-fraction, the number of total cells in islets) coordinate pairs.KL-divergence between two KDEs was computed using scipy.stats.entropymodule with default settings.
Computing islet's periphery and distance of cycle from its islet's periphery and center.Since the periphery of a 2D slice can be non-convex, it was estimated by computing an alpha shape [56] for the set of all cells in the slice as follows.The alpha shape computation depends on the hyperparameter called the shrink factor.Shrink factor set to 0 computes the convex hull of the set of points as the alpha shape.To get a more accurate estimate of the non-convex periphery, we first initialized the shrink factor as the multiplicative inverse of the maximum of g I b and g I ad for islet I.If the computed alpha shape was composed of multiple polygons, the shrink factor was halved and the alpha shape was computed again.The periphery of the islet was defined by the alpha shape that was composed of a single polygon at the largest possible shrink factor in this iteration.The computation was done using the Python package alphashape v1.3.1.
The area of the islets and distances between cycles and the periphery were estimated using the distance method from Python package Shapely v2.0.1.distance computes the distance between two polygons as the distance between the closest pair of points.The islet center was computed as the centroid of the periphery.Containment of the islet center inside a cycle was computed using contains method of Shapely v2.0.1.
Computing dimension-1 persistent homology.Dimension-1 PH of αδ-cells was computed using the standard column algorithm to reduce boundary matrices [57].An introduction to persistent homology with precise mathematical terminology can be found at [58].Here we provide a brief overview of the standard column algorithm to compute dimension-1 persistence pairs using non-technical terminology that might be more accessible to non-experts.
Total ordered sets of vertices, edges, and triangles are defined as follows.Vertices in V αδ are indexed aribitrarily.All possible edges on V αδ are indexed by their length with longer edges having a higher index.All possible triangles on V αδ are indexed by order of the edge with the largest order in their boundary.In both cases, ties are broken arbitrarily.Boundary matrix for edges, D e , is defined as a m by n matrix with, D e [i, j] = 1 iff v i is in the boundary of e j and D e [i, j] = 0, otherwise.Similarly, boundary matrix for triangles, D t , is defined as a n by k matrix with, D t [i, j] = 1 iff e i is in the boundary of t j and D t [i, j] = 0, otherwise.Boundary matrices are reduced using standard column reduction as follows.A column is non-empty if it has at least one non-zero entry.Pivot-index of a non-empty column is defined as the maximum row index with a non-zero element.If columns i and j have the same pivot-index and i < j, then column i is replaced with its modulo 2 sum with column j.This is repeated till no two nonempty columns have the same pivot-index.D e and D t are reduced independently, and the resulting reduced matrices are denoted by R e and R t .The reduction operations are denoted by V e and V t , respectively.If (i, j) is a pivot of R t , then there is a topological feature born at the spatial scale of the length of edge e i and it dies at the spatial scale of the largest length of the edge in the boundary of triangle t j .Persistence of each topological feature is the difference between its death and birth.Dimension-1 PH of β-cells was computed similarly.
Computing an initial set of biologically significant cycles using persistent homology.We provide instructions to compute representative boundaries using non-technical language.See [59] for an explanation of the algorithm using precise terminology.αδ-cycles containing β-cell(s) and β-cycles containing αδ-cells(s) are classified as biologically significant.To compute αδcycles around β-cells in an islet I, sets of vertices, edges, and triangles were defined as follows.Vertices and edges are V I ad and E I ad ; respectively.Triangles are those that have edges in E I ad and do not contain (horizontal ray algorithm) any β-cell.Boundary matrices are defined as described previously and PH is computed for this collection of vertices, edges, and triangles.Since triangles containing β-cells are not in the boundary matrix, topological features in the αδ-graph that contain β-cells will not die.If column i of R e is empty and i is not a pivot-index of any column of R t , then column i of V e is a representative boundary of a topological feature that does not die.From these representative boundaries we ignore the ones that do not contain any β-cells inside them.This results in an initial set of αδ-cycles that contain at least one β-cell inside them.
Greedy and stochastic shortening of PH-cycles before comparison with geometric cycles.Representative boundaries around topological features are not unique by definition and can be geometrically imprecise.To improve their precision before comparison with geometric cycles, the boundaries in the initial set were shortened using greedy and stochastic shortening introduced in previous work [51].Technical details of stochastic shortening of αδ-cycles in an islet are as follows.Locations of αδ-cells were perturbed 50 times in neighborhood disks centered at the cells.Edge-lengths were rounded to the nearest integer.Since edges of the same length can be ordered arbitrarily, at most 50 unique different total ordered sets of edges were constructed for each perturbation.PH-cycles for each permutation of every perturbation were computed as described above.Moreover, this was done for ten different values of maximum neighborhood disk radii of [0.1, 0.2, . .., 1].These ranges of values were chosen because they are much less than (� 10%) the minimum neighborhood radius of 8 that was computed across all islets in both data sets.This resulted in up to 25000 sets of representative boundaries for the islet.For each boundary in a set of representatives, the set of β-cells inside it was computed.For each set of β-cells that is inside some representative boundary, a list of those with the least number of edges was constructed.Finally, we computed if any of the representative boundaries in this list is proximal to a geometric cycle as described next.
Comparing geometric cycles with PH-cycles.We say a geometric αδ-cycle matches an αδ PHcycle if they both contain the same set of β-cells inside them.Suppose L is the set of αδ-cells that are computed to form a geometric cycle and L is an αδ PH-cycle computed for an islet I.The distance between them is defined as dðL; LÞ ¼ maxfmax p i 2L fmin pj 2 L fdðp i ; pj Þgg; max pi 2 L fmin p j 2L fdðp i ; pj Þggg; where d(p, q) is the Euclidean distance between cells p and q in the islet.We say that L and L are proximal if cycle L matches with L and dðL; LÞ � t I ad : Otherwise, we say they are distant.Analogous definitions follow for proximal β geometric and PH-cycles.
Computing cycles in 3D data sets using PH.To compute αδ-cycles around β-cells in an islet I, sets of vertices, edges, and triangles were defined as before.Additionally, tetrahedrons on αδ-cells are defined as those that have edges in E I ad ; all faces as valid triangles, and do not contain any β-cell.Containment was checked using barycentric coordinates.Tetrahedrons are ordered by the length of the longest of the edges of their faces, also called their diameters.Those with the same diameter are given a unique order arbitrarily.This results in a fullordered set of tetrahedrons.The boundary matrix for tetrahedrons is defined and constructed, denoted by D h .It is reduced as before to give the reduced matrix R h and features that do not die are computed using methods analogous to those defined for the 2D case.The threshold to define edges on the graph was chosen as 25.
Tests for statistical significance.1D distributions of maximum persistences were compared using two-sided Mann-Whitney U rank test for two independent samples using scipy.stats.mannwhitneyumodule with default settings.2D distributions of islet characteristics were compared by computing p-value from KS-test using the ndtest Python package from https://github.com/syrte/ndtest.

Discussion
Studies of the structure of islets of Langerhans have shown that the relative number and arrangement of the individual cell types plays a critical role in regulating glucose metabolism [60].The arrangement is highly complex and heterogeneous, and has been investigated using various experimental and quantitative approaches, including network science methods [41,42].Changes in the structural characteristics of islet cell types have been observed during the progression of type 2 diabetes, with decreased beta-cell numbers and disrupted structural and functional connectivity being key features of the disease.Quantification of islet structure has typically been applied to 2D images of islet sections, but such 2D data is unable to capture important aspects of islet physiology such as vasculature [61] and innervation [62], both of which are known to play critical roles in islet function and in functional communication between islets.
Our contribution here is two-fold.We have developed two distinct approaches to go beyond functional network statistics or spatial descriptive or network statistics in the characterization of islet cytoarchitecture.One, a geometric approach, is much easier to apply to, and visualize in, 2D image data, and the other, a topological characterization, is applicable to 2D and 3D data.Of note here is that in contrast to network characterizations, the topological features we uncover are nonlocal by construction, and therefore are capturing a complementary view of islet cytoarchitecture relative to network approaches.While there are other computational approaches to topological characterization, our approach is the only one that explicitly gives the locations of the actual topological features in the image.Such location information is, of course, the sine qua non for studies of the functional impact of any feature, see, for example [63].We confirmed that results from our two distinct computational methods, geometric and topological, completely agree for the 2D sections for the developmental and T2D data sets.
Our results showed that a low percentage of islets contained ring structures with a NS-component of the other kind of cells.However, we observe changes in this percentage across developmental stages and between control and diabetic, that correlate with trends in persistence homology of the islets in both cases.These differences in islet cytoarchitecture may affect paracrine signaling between endocrine cells resulting in functional differences [7,[10][11][12][13].For example, NS-components of β-cells might be indicative of β-cells coupled via gap junction linkages that may play a functional role in coordinated responses to endogenous insulin secretagogues such as glucagon-like peptide-1 (GLP-1), but might not be significant for islet dynamics involved in glucose-stimulated Ca 2+ oscillations [64,65].Further, 3D structural analysis of human islets has shown that α-cells are arranged along interiorly pervading vessels [7].Hence, studying the topology of the 3D islets taking into account the blood vessel information, and comparing it between control and diabetic subjects might be important to study possible relations between morphological and functional changes.
The topological characterization [66] may be important for understanding disease susceptibility of islets of different characteristics.It is obvious that 2D images provide a limited view of the complex 3D architecture of the islet, and can result in under-or overestimation of cell sizes and numbers.In addition, automated segmentation algorithms may not always accurately distinguish individual cells, particularly in cases where cells are tightly packed or have irregular shapes.Finally, variations in staining or imaging conditions can affect the accuracy and reproducibility of quantitative measurements.The topological characterization is robust to many of these experimental uncertainties.
3D imaging techniques, including confocal microscopy, two-photon microscopy, and optical coherence tomography, are being developed to provide a more comprehensive understanding of islet architecture [19,43,61,[67][68][69][70][71][72][73], including progress on visualization of vasculature.Advances in image analysis algorithms allow the segmentation of individual cells from 3D image stacks, allowing for the quantification of various structural and functional parameters.Our topological approach can be applied without any changes to 3D imaging data, while the geometric approach is difficult to generalize to 3D without mathematical assumptions.However, current 3D imaging of islets does have limitations, such as limited penetration depth and imaging speed, which may result in incomplete imaging of large islets.Nevertheless, we illustrated application of PH to find αδ-cycles (closed polyhedral structures) in 3D data sets containing β-cells inside (and β-cycles containing αδ-cells inside) for the limited data sets of human and mice that were publicly available.It can be of interest to analyze the properties of locations of the cycles found in 3D data sets.However, in the limited 3D data sets available, we observed islets with highly convex shapes and some with multiple globules.Mathematically sound definition and stable computation of geometrical properties of these 3D point-clouds, for example equatorial plane and poles, might require a larger number of data sets for testing and validation.It can be of further interest to compare the consistency of results between 3D data sets and their 2D sections for different slicing schemes.These analyses of 3D islet cytoarchitecture can be a future direction of research as data from many 3D islets becomes available.
Mathematical models have been used to simulate the behavior of islets and predict how changes in cell number, size, and arrangement will affect glucose metabolism.These models can also be used to analyze the effects of different interventions, such as drug treatments, on islet function.Applying this type of mathematical modeling [37,47,48,74] to simulated islet cell distributions with similar network characteristics but distinct topological characteristics or vice versa would be an interesting future direction to determine the relevance of nonlocal topological features to islet function.
The contribution of this work to biology is to provide quantitation of structures that have been controversial in terms of existence and/or functional significance.The results are correlative and not causative.How these structures are related to glucose-stimulated insulin secretion (GSIS) profiles is unclear.We hope that by providing multiple mathematical methods for defining and computing such topological structures, the field can focus on the relevance of these structures for function and understand how different features in the GSIS profile are related to specific islet features.

Fig 2 .
Fig 2. Comparing cell composition across developmental stages.(A) KDE plots of the distribution of islets characterized by the total number of cells and β-cell fraction from stage 0 to stage 3.The number between every pair of plots shows the KL-divergence between respective KDEs.The KL-divergence between stages 0 and 1 and between 2 and 3 are at most 0.07 as compared to at least 0.27 for every other pairwise comparison.The mode of the density estimate is marked by a red star in each plot.Peaks are at (0.38, 3.27), (0.49, 2.95), (0.66, 3.27), and (0.75, 3.6) for stages 0 to 3. A higher proportion of islets in the last two stages have a higher β-cell fraction.(B) KDE plots of distribution of islets characterized by number of α and δ-cells.In the later developmental stages (2 and 3) there is a large proportion of islets with more α-cells than δ-cells (bright regions in the KDE under the y = x white dashed line).https://doi.org/10.1371/journal.pcbi.1011617.g002

Fig
Fig 2A showed that the KL-divergence in the distribution of characteristics of all islets between stages 2 and 3 is the smallest (� 0.07) amongst all pairwise comparisons.Further, the percentages of islets with a NS β-component inside an αδ-cycle are similar in stages 2 and 3 (see Fig 3A).However, Fig 4A shows that distributions of characteristics of such islets have a larger KL-divergence of 0.26 between stages 2 and 3. Specifically, a comparison of peaks of these KDEs indicates that a higher proportion of such islets in stage 3 contain more cells.In contrast, Fig 4B shows that islets with a NS αδ-component inside aβ-cycle have smaller KL-divergence There were 2038 and 1179 islets with at least five β-cells and five αδ-cells from control and diabetic subjects, respectively.Fig 5A shows that the KL-divergence between KDEs of islets with at least one NS β-component in an αδ-cycle (middle row in the figure panel) and of those with at least one NS αδ-component in a β-cycle (bottom row) is more than double the KL-divergence between KDEs of all islets (top row), between control and diabetic subjects.We found 175 and 85 islets with at least one geometric αδ-cycle around a NS β-component in non-diabetic and diabetic subjects, respectively.159 and 56 islets were found to have at least one geometric β-cycle around a NS αδ-component.S2 Table shows that distributions of the islets characteristics between control and diabetic are significantly different in both cases, p-values from KS-test are <0.05.S11 Fig. shows exemplary 2D sections for each case.Fig 5B left panel shows that percentages of islets with at least one NS component in a cycle are lower for diabetic subjects.A similar trend is observed for maximum dimension-1 persistence (Fig 5B right panel).This correlation between percentages of islets with NS components in cycles and percentiles of the maximum of dimension-1 persistence was also observed across developmental

Fig 5 .
Fig 5. Comparing features of islets between control and diabetic subjects.(A) KDEs for control (left column) and diabetic (right column) subjects for all islets (top row), islets with at least one NS β-component in an αδ-cycle (middle row), and islets with at least one NS αδ-component in a β-cycle (bottom row).Numbers show the KL-divergence.(B) The percentage of islets that have at least one cycle around a NS component is lower in diabetic subjects as compared to non-diabetic subjects.Percentiles of maximum dimension-1 persistence of islets also are lower for diabetic subjects.https://doi.org/10.1371/journal.pcbi.1011617.g005 were 60, 79, 219, and 753 islets with at least one β PH-cycle around a NS αδ-component.For the T2D data set, we computed 175 and 85 islets with at least one αδ PH-cycle around a NS βcomponent in non-diabetic and diabetic subjects, respectively.155 and 54 islets were computed to have at least one β PH-cycle around a NS αδ-component.S3 Tableshows that

Fig 6 .
Fig 6. Analysis of cycles with respect to islet periphery and center.(A) KDEs for a minimal distance of cycles from islet periphery vs. islet center.The majority of the cycles are below the y = x line (white dashed) showing that their minimal distance from the periphery is less than that from the islet center.(B) Minimal distances of cycles from islet-periphery vs. islet's estimated area.αδ-cycles in small islets touch the periphery and very few cycles contain the islet center inside them.There are cycles in larger islets that are far from the periphery.https://doi.org/10.1371/journal.pcbi.1011617.g006 Fig 7 illustrates results for three of the islets, two from humans and one from mice.

Fig 7 .
Fig 7. Examples of closed polyhedral structures found by PH in 3D islets.(A) β-cycles that contain αδ-cells inside them in a human islet.β-cells are in green and αδ-cells are in red.(B) αδ-cycles that contain β-cells inside them in a human islet.(C) β-cycles thet contain αδ-cells inside them in a mouse islet.https://doi.org/10.1371/journal.pcbi.1011617.g007