
The authors have declared that no competing interests exist.

Retinal fundus imaging is a non-invasive method that allows visualizing the structure of the retinal blood vessels, whose features may indicate the presence of diseases such as diabetic retinopathy (DR) and glaucoma. Here we present a novel method to analyze and quantify changes in the retinal blood vessel structure in patients diagnosed with glaucoma or with DR. First, we use an automatic unsupervised segmentation algorithm to extract a tree-like graph from the retinal blood vessel structure. The nodes of the graph represent branching (bifurcation) points and endpoints, while the links represent vessel segments that connect the nodes. Then, we quantify structural differences between the graphs extracted from the groups of healthy and non-healthy patients. We also use fractal analysis to characterize the extracted graphs. Applying these techniques to three retinal fundus image databases, we find significant differences between the healthy and non-healthy groups (p-values lower than 0.005 or 0.001, depending on the method and on the database). The results are sensitive to the segmentation method (manual or automatic) and to the resolution of the images.

Fundus images are nowadays routinely used for the early diagnosis of ocular pathologies such as glaucoma [

An analysis method with potential for diabetic retinopathy diagnosis is based on fractal analysis [

Here we propose a new method that uses concepts inspired by network science [

Precise graph comparison is a hard problem with many applications, and different methods have been proposed in the literature (see [

The proposed algorithm was tested on three databases of different sizes: a small high-resolution fundus (HRF) image database, which comprises images of 15 patients with diabetic retinopathy, 15 with glaucoma, and 15 without pathology; a large database, Messidor, where we used 230 images of patients with diabetic retinopathy classified in three groups, and 142 images of patients without pathology; and a medium-size database from the Instituto de Microcirugía Ocular (IMO), which contains 70 images of glaucoma patients and 23 images of patients without pathology. By means of nonlinear dimensionality reduction techniques we show that the DR, glaucoma, and healthy groups have statistically significant differences in their features. To support these results, we also calculate the fractal dimension of the images (segmented and skeletonized versions) and find significant differences between the three groups, which are fully consistent with the results of the graph dissimilarity analysis.

In this section we present the algorithms proposed to retrieve features from images in a database in an unsupervised manner. We also present the three databases we used to test our algorithms.

All the methods make use of the result of an unsupervised segmentation algorithm that was adapted from the one proposed in [

The box counting algorithm is a well-known method for estimating the fractal dimension (FD) of a geometrical object [. The number of boxes of linear size ε needed to cover the object scales as N(ε) ∝ ε^{−D}, where D is the fractal dimension, which can be cast as an equation: D = −lim_{ε→0} log N(ε)/log ε.
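As a sketch, the box-counting estimate can be implemented as follows for a binary mask; the function name and the power-of-two box sizes are our illustrative choices, not the exact implementation used here:

```python
# Box-counting estimate of the fractal dimension of a 2-D binary mask
# (e.g. a segmented vessel mask). Illustrative sketch: box sizes are
# restricted to powers of two for simplicity.
import numpy as np

def box_counting_dimension(mask):
    mask = np.asarray(mask, dtype=bool)
    # Pad to a square whose side is a power of two.
    side = int(2 ** np.ceil(np.log2(max(mask.shape))))
    padded = np.zeros((side, side), dtype=bool)
    padded[:mask.shape[0], :mask.shape[1]] = mask
    sizes, counts = [], []
    eps = side
    while eps >= 1:
        n = side // eps
        # Count boxes of side eps containing at least one foreground pixel.
        boxes = padded.reshape(n, eps, n, eps).any(axis=(1, 3))
        sizes.append(eps)
        counts.append(boxes.sum())
        eps //= 2
    # D is minus the slope of log N(eps) versus log eps.
    slope, _ = np.polyfit(np.log(sizes), np.log(counts), 1)
    return -slope
```

A filled square yields D ≈ 2 and a straight line D ≈ 1, which is a quick sanity check for the estimator.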

With the information retrieved from the segmented images (raw and skeletonized) we construct undirected graphs where the nodes represent the branching (bifurcation) points and the endpoints, and the links represent vessel segments that connect pairs of nodes. The links have associated weights, ω_{i,j}, that represent the cost of transporting matter from one node to the other. If nodes i and j are not connected, ω_{i,j} = 0, while if there is a segment connecting them, ω_{i,j} ≠ 0. In order to test different possibilities, the weights are defined using the values of the length, l_{i,j}, and the width, w_{i,j}, of the segment that connects nodes i and j. The length l_{i,j} and the width w_{i,j} of each link can be computed using the information contained in the skeletonized and raw segmentations: the length accounts for the number of pixels spanned by each link in the skeletonized version, while the width can be estimated from the number of pixels (n_{i,j}) each link has in the raw segmented mask as n_{i,j} = l_{i,j} × w_{i,j}.
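A minimal sketch of this graph construction, using a plain adjacency dictionary; the function name and the exponents alpha and beta used to combine length and width into a weight are illustrative assumptions, not the exact definitions used in the analysis:

```python
# Build an undirected weighted graph from vessel segments.
# Each segment is (node_i, node_j, pixels_in_skeleton, pixels_in_raw_mask).
# Hypothetical weight: (length ** alpha) * (width ** beta).
def build_vessel_graph(segments, alpha=1.0, beta=-2.0):
    graph = {}
    for i, j, n_skel, n_raw in segments:
        length = float(n_skel)        # pixels spanned in the skeleton
        width = n_raw / length        # from n_raw = length * width
        weight = (length ** alpha) * (width ** beta)
        # Undirected: store the weight in both directions.
        graph.setdefault(i, {})[j] = weight
        graph.setdefault(j, {})[i] = weight
    return graph
```

For example, a segment of 10 skeleton pixels and 40 raw pixels has width 4 and, with these exponents, weight 10 × 4⁻² = 0.625.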

Structural differences between the extracted graphs were characterized by using the measures described in this section, which provide probability distribution functions (PDFs) that can be mutually compared by using nonlinear dimensionality reduction (NLDR) techniques.

The node distance distribution (NDD) measures the heterogeneity of a graph in terms of the connectivity distances, and allows the precise comparison of two graphs, by quantifying the differences between distance-based PDFs extracted from the graphs. It is based on computing, for each node

To apply the NDD concept to the tree-like graphs extracted from the segmented images, we consider only the distribution of distances to the central node that represents the optic nerve (because all the transported blood comes from and returns to this node). Thus, we analyze the Central NDD (C-NDD) PDF that gives the distribution of distances of the nodes to the central one. The distance of one node to the central one is defined as the sum of the weights of the links along the shortest path.
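The distances entering the C-NDD can be obtained with Dijkstra's algorithm; a minimal sketch, assuming an adjacency-dictionary representation of the graph (node labels and the function name are illustrative):

```python
# Weighted shortest-path distances from the central node (optic disc)
# to every other node, via Dijkstra's algorithm.
import heapq

def central_distances(graph, center):
    """graph: {node: {neighbor: weight}} -> {node: distance to center}."""
    dist = {center: 0.0}
    heap = [(0.0, center)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry, already settled with a shorter path
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist
```

The C-NDD PDF is then simply the normalized histogram of the returned distance values over all nodes.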

As an example, in _{CNDD} (

Example result of the automatic segmentation algorithm on top of the original image; nodes are shown in light yellow, while links are shown in dark grey. An example of a shortest path from the optic disc to the node highlighted in green is shown; it consists of three links, each one having its own weight according to

A variation of the Central NDD is the central mean weight distribution (CMWD), which is the distribution of average weights, i.e., the sum of the weights of the links along the shortest path that connects two nodes, divided by the number of links.

The degree distribution, P_{DD}(k), is the probability distribution of the node degrees, i.e., P_{DD}(k) gives the fraction of nodes that have exactly k links. A weighted variant is the weighted degree distribution, P_{WDD}(s), which is the distribution of the node strengths (the strength of node i being the sum of the weights of its links, s_{i} = ∑_{j} ω_{i,j}).
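A sketch of both distributions, assuming an adjacency dictionary of link weights (function names are illustrative):

```python
# Degree distribution and node strengths from a weighted adjacency dict.
from collections import Counter

def degree_distribution(graph):
    """graph: {node: {neighbor: weight}} -> {degree: fraction of nodes}."""
    degrees = [len(nbrs) for nbrs in graph.values()]
    total = len(degrees)
    return {k: c / total for k, c in Counter(degrees).items()}

def node_strengths(graph):
    """Strength of node i: s_i = sum_j w_ij (input to the WDD histogram)."""
    return {u: sum(nbrs.values()) for u, nbrs in graph.items()}
```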

The analyses described above provide us, for each image, with various probability distributions (one for each combination of measure and weight parameters). To compare them, we compute a dissimilarity matrix whose elements D_{i,j} are the distance (JS divergence) between the probability distributions extracted from images
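A sketch of the dissimilarity-matrix computation; the JS divergence here uses base-2 logarithms (so values lie in [0, 1]), and the per-image PDFs are assumed to be already binned over the same support:

```python
# Pairwise Jensen-Shannon dissimilarity matrix between per-image PDFs.
# The resulting matrix can be fed to a NLDR method such as IsoMap.
import math

def js_divergence(p, q):
    """Jensen-Shannon divergence between two aligned PDFs (base-2 logs)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    def kl(a, b):
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def dissimilarity_matrix(pdfs):
    n = len(pdfs)
    return [[js_divergence(pdfs[i], pdfs[j]) for j in range(n)] for i in range(n)]
```

Identical distributions give 0 and disjoint ones give 1, so the matrix is a bounded, symmetric dissimilarity with zero diagonal, as IsoMap expects.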

We used three different databases to test our algorithms.

The HRF is a public database [

This database also includes a manual segmentation of the vessel network performed by a human expert. For comparison purposes we have also analyzed this set of images, as well as the images resulting from our automated segmentation method described in the supporting information.

This database, kindly provided by the Messidor program partners (see

As our method is sensitive to changes in image resolution, we worked with the first 400 images in the database, which have a resolution of 2240 × 1488 pixels. Out of the 400 images we discarded 28 in which the algorithm either failed to segment the network or to find the optic nerve, and analyzed the remaining 372 images: 230 with diabetic retinopathy and 142 without.

We also analyzed 93 images from patients at IMO (Ocular Microsurgery Institute:

We applied the analysis tools described in

The algorithms were implemented in MATLAB (segmentation, network retrieval, IsoMap) and Python (fractal analysis, network analysis), and their runtime on personal laptops was between 5 and 35 seconds per image, depending on the resolution. These runtimes could be improved by rewriting the algorithms in a compiled language; nevertheless, they provide a rough assessment of the complexity of the algorithms.

We have summarized the results in two tables (Tables

Analysis | MESSIDOR p-Val. | HRF Automated p-Val. | HRF Manual p-Val. | ||
---|---|---|---|---|---|

C-NDD | 1 | -2 | |||

C-NDD | 1 | 2 | 0.29 | 0.57 | |

CMWD | 1 | -2 | 0.82 | 0.68 | |

WDD | 0 | 1 | |||

Nodes | - | - | 0.074 | 0.69 | |

Links | - | - | 0.073 | 0.99 | |

Endpoints | - | - | 0.070 | 0.29 | |

Bifurcation points | - | - | 0.082 | 0.65 | |

FD skeletonized | - | - | 0.23 | 0.88 | 0.68 |

FD raw | - | - | |||

FD best direction | - | - | |||

Best result using FD proposed in [ |
- | - | - | - | |

FD result in [ |
- | - | - | - |

p-values obtained by comparing the features extracted from the Messidor and HRF databases (automated and manual segmentations) of the groups with and without diabetic retinopathy (p-values smaller than 0.05 in

Analysis | IMO p-Val. | HRF Automated p-Val. | HRF Manual p-Val. | ||
---|---|---|---|---|---|

C-NDD | 1 | -2 | 0.27 | ||

C-NDD | 1 | 2 | |||

CMWD | 1 | -2 | |||

WDD | 0 | 1 | 0.99 | 0.087 | |

Nodes | - | - | |||

Links | - | - | |||

Endpoints | - | - | 0.10 | ||

Bifurcation points | - | - | |||

FD skeletonized | - | - | 0.057 | ||

FD raw | - | - | 0.11 | ||

FD best direction | - | - | |||

Best result in [ |
- | - | - | - |

p-values obtained by comparing the features extracted from the IMO and HRF databases (automated and manual segmentations) of the healthy and glaucoma groups (p-values smaller than 0.05 in

Using the (a) automated and (b) manual segmentation. In both plots the horizontal axis denotes the fractal dimension of the skeletonized mask while the vertical axis accounts for the fractal dimension of the raw segmented mask. Each point represents the fractal dimensions of one image, while the ellipses represent the square root of the covariance matrix of each group. In (a) we note that the three groups are well separated (p-values 4.5e-05, 0.00041), while in (b) the healthy group is well separated from the non-healthy ones (p-values 8.5e-06, 9.6e-08). In both plots we note that using the two fractal dimensions improves the separation, in comparison to using only one.

In both segmentations a clear distinction between healthy and non-healthy groups is obtained. In addition, with the automated segmentation a clear segregation between the three groups is obtained (left panel), which is not seen in the analysis of the manual segmentation (right panel).

Similar segregation between healthy and non-healthy groups is obtained for the Messidor and for the IMO databases, with p-values of the order of 9e-12 for Messidor (see

The panels display the IsoMap features extracted from the HRF database, using the automated (a,c) and manual (b,d) segmentations, with C-NDD analysis (a,b) and with C-MWD analysis (c,d). The weights (

Again a clear distinction between the groups is obtained, with p-values (see Tables

The panels display the raw histograms of the C-MWD extracted from the HRF database (with logarithmic scale in the insets), using the automated (a) and manual (b) segmentations (it corresponds to

Similarly to the previous C-NDD analysis, in the manual segmentation the two non-healthy groups are clearly separated from the healthy one, while in the automated segmentation only the glaucoma group is found to be statistically different from the healthy one. The same results (not shown) hold for the Messidor and IMO databases.

Comparing C-NDD and C-MWD, one can see that the former performs better for diabetic retinopathy while the latter performs better for glaucoma. The p-values are summarized in Tables

IsoMap features obtained from the (a) automated and (b) manual segmentations. The weights (

Here we observe (as in the previous analysis, compare

We also analyzed other network features, such as the number of links, the number of nodes, the number of endpoints (nodes with only one neighbor), and the number of bifurcation points (nodes with 3 or more neighbors). The results, also presented in Tables
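These counts can be read directly off the node degrees; a minimal sketch, assuming an adjacency-dictionary graph (the function name is illustrative):

```python
# Count basic network features: nodes, links, endpoints (degree 1)
# and bifurcation points (degree >= 3).
def count_features(graph):
    """graph: {node: {neighbor: weight}} -> (nodes, links, endpoints, bifurcations)."""
    degrees = [len(nbrs) for nbrs in graph.values()]
    n_nodes = len(degrees)
    n_links = sum(degrees) // 2   # each undirected link is stored twice
    n_endpoints = sum(1 for d in degrees if d == 1)
    n_bifurcations = sum(1 for d in degrees if d >= 3)
    return n_nodes, n_links, n_endpoints, n_bifurcations
```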

In

For the diabetic condition, in both Messidor and HRF with automated segmentation, the best analysis turned out to be the best direction in the fractal dimension plane (i.e., a linear combination of the two proposed fractal dimensions), while for the manual segmentation the best analysis was WDD with

For the glaucoma case, in both IMO and HRF with automated segmentation the best results were obtained using the C-NDD with

The parameters

It should be noted that the analyzed network is a 2-dimensional projection of the real 3-dimensional retinal network; this implies that some of its nodes correspond, in reality, to crossovers of veins and arteries. This alters the extracted features in two ways: by generating spurious nodes whose links are fictional, and by generating spurious shortest paths to the optic nerve. The problem of distinguishing arteries from veins in fundus photographs is highly non-trivial [

Our findings are consistent with the results recently reported in [

We have demonstrated that the network-based features extracted from fundus images are useful for detecting topological changes produced in patients with diabetic retinopathy and glaucoma. For both diseases, the network features we have proposed are able to separate the healthy group from the unhealthy groups with extremely high statistical significance. We have also compared our results with those obtained from fractal geometry analysis, and we have shown that using both fractal dimensions (raw segmented, and skeletonized) improves the separation between the groups, in comparison to using only one. The most statistically significant results were obtained using high resolution images (the HRF database), and in particular, when using the manual segmentation provided with the database. We found that analyzing the manual segmentation of the HRF database with the weighted degree distribution (see

When analyzing diabetic patients, the weights that performed best were those based on the vessel widths, while for glaucoma the best-performing weights involved the squared widths (w^{2}, proportional to the vessel cross-section). This can be understood by considering that glaucoma is linked to an increase of the intraocular pressure, which can increase the volume of the vessels.

The measures proposed in this paper demonstrated very good performance in retina fundus images of different resolution, and of patients with different diseases. Therefore, it will be interesting to explore their potential with other vascular-related diseases.

(PDF)

(PDF)

(PDF)

(RAR)

P.A. and C.M. acknowledge support by the BE-OPTICAL project (H2020-675512).

C.M. also acknowledges partial support from Spanish MINECO/FEDER (PGC2018-099443-B-I00) and ICREA ACADEMIA.

CFRM and LGV acknowledge COFAA-IPN, EDI-IPN, CCA-IPN and SNI-CONACyT, México.

I.S.-N. acknowledges partial financial support from Spanish MINECO (FIS2017-84151-P).

All the authors acknowledge the collaboration from IMO and its technicians for the assistance at getting the images from their internal database and anonymizing them.

The authors also acknowledge the creators of the Messidor and HRF databases for kindly offering them to public use.