
Data augmentation via warping transforms for modeling natural variability in the corneal endothelium enhances semi-supervised segmentation

  • Sergio Sanchez,

    Roles Conceptualization, Investigation, Methodology, Software, Visualization, Writing – original draft, Writing – review & editing

    Affiliation Facultad de Ingeniería, Universidad Tecnologica de Bolivar, Cartagena, Colombia

  • Noelia Vallez,

    Roles Conceptualization, Investigation, Software, Visualization, Writing – original draft, Writing – review & editing

    Affiliation VISILAB, Universidad de Castilla-La Mancha, E.T.S. Ingeniería Industrial, Avda Camilo Jose Cela, Ciudad Real, Spain

  • Gloria Bueno,

    Roles Methodology, Supervision, Writing – review & editing

    Affiliation VISILAB, Universidad de Castilla-La Mancha, E.T.S. Ingeniería Industrial, Avda Camilo Jose Cela, Ciudad Real, Spain

  • Andres G. Marrugo

    Roles Conceptualization, Data curation, Funding acquisition, Methodology, Project administration, Resources, Supervision, Writing – original draft, Writing – review & editing

    agmarrugo@utb.edu.co

    Affiliation Facultad de Ingeniería, Universidad Tecnologica de Bolivar, Cartagena, Colombia

Abstract

Image segmentation of the corneal endothelium with deep convolutional neural networks (CNNs) is challenging due to the scarcity of expert-annotated data. This work proposes a data augmentation technique via warping to enhance the performance of semi-supervised training of CNNs for accurate segmentation. We use a unique augmentation process for images and masks involving keypoint extraction, Delaunay triangulation, local affine transformations, and mask refinement. This approach accurately captures the natural variability of the corneal endothelium, enriching the dataset with realistic and diverse images. The proposed method achieved an increase in the mean intersection over union (mIoU) and Dice coefficient (DC) metrics of 17.2% and 4.8%, respectively, for the segmentation task in corneal endothelial images on multiple CNN architectures. Our data augmentation strategy successfully models the natural variability in corneal endothelial images, thereby enhancing the performance and generalization capabilities of semi-supervised CNNs in medical image cell segmentation tasks.

Introduction

The segmentation of corneal endothelial cell images is important for assessing corneal health and diagnosing various corneal diseases based on cell morphology [1]. Accurate segmentation is still challenging despite the development of automated methods over the last decade [2]. Recent automated methods based on Deep Learning [3, 4] have superseded traditional methods [5] based on morphological operations, contour detection, and spatial frequency analysis due to their improved performance. However, their success and generalization capabilities largely depend on expensive image annotation by experts due to their supervised nature.

In this context, the segmentation of microscopic images of the corneal endothelium, as illustrated in Fig 1, faces a series of challenges. These include data acquisition issues such as calibration problems, noise, and equipment handling errors [6, 7]. Image processing is further complicated by factors like variations in lighting, shadows, blurring, glare, and unwanted artifacts [8]. At the medical evaluation stage, specialists grapple with issues like misdiagnoses and discrepancies between experts, compounded by subjectivity in image interpretation and lack of accurate annotations [9]. The complexity is increased in images of endothelial cells affected by diseases such as Fuchs dystrophy, presenting additional diagnostic challenges [10].

Fig 1. Illustration of various challenges in corneal endothelium imaging including low visibility, blurring, and images with guttae due to Fuchs' dystrophy, highlighting the complexities in data acquisition and analysis.

https://doi.org/10.1371/journal.pone.0311849.g001

Due to the complex nature of medical imaging and its high labeling cost, there has been significant growth in research on data augmentation techniques that seek to improve the performance of CNNs in medical image segmentation [11–13]. These techniques aim to enrich the diversity and representativeness of the training dataset, which in turn contributes to better generalization and accuracy of the CNN. However, not all data augmentation techniques are effective. To address this limitation, more advanced approaches based on semi-supervised learning and data augmentation are being explored and have proven to be a promising solution. These networks can extract relevant features from a large set of unlabeled images, making it possible to capture fundamental patterns and structures present in medical images; these features are subsequently used during fine-tuning with a limited number of labeled images to learn the segmentation task [14].

Hence the importance of using a data augmentation strategy that applies transformations simulating the natural variations present in the images, enriching the datasets and improving the robustness of the segmentation models. These transformations must be not only realistic but also specific to the task and anatomy at hand. For example, in the segmentation of the corneal endothelium, it is critical that the applied deformations reflect actual variations in cell morphology and surrounding tissue characteristics.

Therefore, in this article we present a novel and effective methodology that addresses the inherent challenges of medical databases with few annotations, achieving state-of-the-art results. We propose a data augmentation technique combined with semi-supervised learning to improve the segmentation task in specular microscopy images of the corneal endothelium. Our main contribution is the implementation of a warping-and-watershed strategy to model and generate transformed images and masks similar to the originals, addressing the challenge of sparsely annotated medical databases. In the following sections, the proposed method is described in detail and compared with traditional strategies.

Related work

Image segmentation of the corneal endothelium is a problem that has been the subject of research for several decades. This task faces multiple challenges, as images of these cells often present blur, low contrast, and irregular lighting. Over time, various techniques have been developed with the aim of achieving a more precise delineation and effective separation of endothelial cells. The first strategies were based on manual segmentation, which is prone to human error [15, 16]. Subsequently, techniques based on thresholding methods, filters and image processing were used. Although they improved segmentation, there were still challenges with low-quality images and noise [17, 18]. However, with advances in the field of computer vision and deep learning, there has been a significant change in the way segmentation of the corneal endothelium is approached.

Convolutional Neural Networks have demonstrated an exceptional ability to automatically learn intrinsic features in this type of data. In recent years, authors such as Okumura et al. and Sierra et al. [3, 19] have investigated deep networks based on supervised learning, achieving plausible results. However, these models require a large volume of images annotated by specialists to improve performance, and they are sensitive to domain shifts and overfitting. Moreover, these databases are often sparse and unbalanced, require informed consent from patients, and are complex to acquire and interpret [20].

To overcome the problem of unlabeled medical databases, there are different strategies, such as regularization methods, transfer learning, one-shot and zero-shot learning algorithms, semi-supervised learning, and data augmentation [21, 22].

Regularization methods are commonly used in deep network configuration; strategies such as dropout, early stopping, and batch normalization control the complexity of the neural network, prevent overfitting, and reduce excessive co-dependence between neurons. However, their effectiveness may be limited by the dataset size, network architecture, learning rate, optimization algorithm, number of training iterations, and hyperparameter settings [23, 24].

Transfer learning takes advantage of the knowledge of frozen deep neural networks, which have been trained with millions of non-medical images. However, this approach may not provide features directly relevant to medical segmentation tasks. Despite this, transfer learning can be advantageous, as previously trained networks have learned low-level visual features such as borders, textures, and basic patterns, useful for further tuning [25, 26]. Nonetheless, this strategy presents challenges, mainly to ensure the relevance and applicability of the pre-trained features in the specific context of medical imaging.

Few-shot learning algorithms, including one-shot and zero-shot learning, are machine learning strategies that offer promising solutions to the scarcity of unlabeled medical databases. These strategies have achieved promising results in the performance of deep neural networks [27, 28]. These techniques, despite their innovative use of minimal data and ancillary information, are often hampered by bias and lack of diversity in the training data.

Semi-supervised learning is a strategy that combines labeled data with unlabeled data during the training of a neural network. Initially, the network learns features from unlabeled data; the weights are then frozen and used in a fine-tuning stage to learn a specific task [29, 30]. These architectures are used in generative approaches [31], predictive tasks [32], contrastive and non-contrastive learning [33], and bootstrap approaches [34, 35]. Most of these techniques require heuristics and careful hyperparameter tuning, so each approach may perform differently depending on the type of problem being addressed. This is why it is important to obtain more labeled data, whether through conventional or synthetic means, to ensure better performance of CNNs.

Data augmentation is a strategy to increase the volume, quality, and diversity of annotated images, though its application varies depending on the task. This technique is commonly used with both traditional methods and deep learning. Traditional approaches typically involve geometric transformations (e.g., rotation, flipping, cropping, shifting, zooming, and random local rotation) [36], photometric adjustments (color space shifting, brightness) [37, 38], and noise injection or filtering [39, 40]. Yet, these methods often fail to fully capture the natural variability of biological structures, limiting their effectiveness in more complex segmentation tasks.

Neural network-based data augmentation, using architectures like GANs and style transfer [41, 42], enables the generation of data that is often indistinguishable from real data, addressing challenges in image processing. These networks have been widely adopted to increase data volume; however, they require substantial computational resources and often produce artifacts or unnatural shapes in the output [43–45].

Several data augmentation strategies for segmenting corneal endothelium images from specular microscopy have been explored in the literature. Authors like Sierra et al. [46], Vigueras et al. [47], Kolluru et al. [48], and Shilpashree et al. [49] have used basic geometric transformations such as rotations, translations, cropping, and elastic deformations. However, these traditional techniques provide limited variability and may not adequately capture the complexity of many datasets.

CNNs applied to these images are prone to overfitting, requiring regularization techniques. For example, Viguera et al. [50] and Busra et al. [51] used methods like Batch Normalization, dropout, and Early Stopping to enhance performance. In recent years, researchers like Wu et al. [52], Sánchez et al. [53], and Fabijańska et al. [1] have advanced semi-supervised learning strategies to extract multi-level features from unlabeled corneal endothelial images, using fine-tuning with minimal labeled data for segmentation.

Moreover, Kucharski and Fabijanska [54] developed a technique using GANs to generate training data for corneal endothelial image segmentation, addressing the scarcity of annotated images. They validated the approach using a UNet model, first trained with labeled images and then with synthetic images from three free web-based databases. The UNet successfully detected mask edges and achieved reasonable generalization. The results, evaluated using accuracy metrics, compared the generated masks with ground truth. The researchers concluded that GAN-based strategies can enhance medical databases with limited annotated images, although GANs are difficult to train, control, and stabilize, requiring extensive adjustments [55].

Similarly, Sierra et al. [46] proposed a segmentation method for specular microscopy images of corneal endothelium affected by Fuchs’ dystrophy, framing the task as a regression problem using distance maps rather than traditional pixel-level classification. Their method involves pre-processing to generate distance maps and post-processing to convert them into masks using the watershed technique. Results show faster convergence on clinically relevant parameters, indicating the method’s effectiveness with small datasets. However, challenges remain in pre-processing distance maps for morphometric calculations and correcting illumination in the input images.

Fabijańska et al. [4] used a U-NET architecture to segment corneal endothelial images, achieving an AUROC of 0.92 and a DICE coefficient of 0.86, indicating high precision in boundary delineation. Similarly, Nurzynska (2018) applied CNNs for automatic cell segmentation, reaching 93% accuracy compared to manual annotations and a modified Hausdorff distance of 0.14 pixels, demonstrating strong segmentation performance. Hao et al. [56] developed a deep learning system for estimating morphometric parameters and segmenting corneal endothelial cells from in vivo confocal microscopy images. Tested on 99 subjects, it achieved a Pearson correlation of 0.932 (p < 0.01) with Topcon cell density measurements and an AUC of 0.923, surpassing a U-Net model with an AUC of 0.913.

Shilpashree et al. [49] compared average perimeter length (APL) and endothelial cell density (ECD) between Fuchs’ endothelial dystrophy (FECD) patients and healthy subjects. Using U-Net and Watershed for segmentation, they found decreased ECD and increased APL with higher guttae in FECD, indicating endothelial deterioration. For healthy subjects, the method achieved an F1 score of 82.27%, an average IoU of 77.27%, a precision of 87.9%, and an ROC AUC of 96.70%, demonstrating high segmentation accuracy.

Hence, there is a compelling need to introduce a data augmentation approach that uses deformations to effectively represent the inherent variability within corneal endothelium images, ultimately enhancing the segmentation task in semi-supervised scenarios.

Materials and methods

In this section, we present the proposed methodology to improve semi-supervised segmentation. Our approach focuses on the natural variability present in corneal endothelial images through controlled and realistic deformations, thus enhancing the robustness and generalizability of segmentation models. The dataset was accessed and used for research purposes between March 2023 and November 2023.

Dataset description

A set of 90 in vivo specular microscopy images of endothelial cells in two-channel TIF format, with a resolution of 640×480 pixels, was used, obtained from 66 patients with healthy (42 from the right eye) and dystrophic (48 from the left eye) corneas. The data was split by patient: there was no overlap of patient images between the training, testing, and validation sets. From these images, 271 patches of size 96×96 were extracted and labeled by expert personnel, along with 1719 patches of the same dimensions without annotations, with balanced distributions for all possible cases. Images were acquired with a Topcon SP3000P specular microscope equipped with Cell Count software. The study protocol received approval from the ethics committee of the Technological University of Bolívar in Colombia. Due to the retrospective approach of the study, the requirement to obtain informed consent was not applicable. Furthermore, the study was carried out in accordance with the principles established in the Declaration of Helsinki.

Semi-supervised model

We propose a semi-supervised model consisting of two stages. In the first stage, unsupervised learning is used to train various encoders, including ResNet50, ResNet101, DenseNet121, and ResNet101ViT, within a Siamese network architecture based on the Barlow Twins method [57]. This process learns feature representations from a large set of unlabeled images. In the second stage, the encoder weights obtained from the first stage are frozen, and fine-tuning is performed with supervised learning to segment corneal endothelial images using a small number of labeled images.

As shown in Fig 2, the model processes unlabeled images (X) through a Siamese network, where each image undergoes geometric transformations (e.g., rotations, flipping, cropping) to generate new samples (YA and YB). These transformed images are passed through identical encoders to produce feature maps (ZA and ZB), and consistent patterns are identified using the cross-correlation function. The learned weights are then fixed and applied during fine-tuning. A decoder with attention modules and skip connections is added, and a 1×1 convolution at the output generates the final segmentation mask.
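The cross-correlation objective of Barlow Twins described above can be sketched in numpy as follows; this is a simplified illustration of the loss form, where the weight `lam` and the `1e-8` stabilizer are illustrative values, not parameters reported in this paper:

```python
import numpy as np

def barlow_twins_loss(z_a, z_b, lam=5e-3):
    """Barlow Twins objective: drive the cross-correlation matrix of two
    embeddings (z_a, z_b: (batch, dim)) toward the identity matrix."""
    # standardize each feature dimension over the batch
    z_a = (z_a - z_a.mean(0)) / (z_a.std(0) + 1e-8)
    z_b = (z_b - z_b.mean(0)) / (z_b.std(0) + 1e-8)
    n = z_a.shape[0]
    c = z_a.T @ z_b / n                                  # cross-correlation (dim, dim)
    on_diag = ((np.diag(c) - 1.0) ** 2).sum()            # invariance term
    off_diag = (c ** 2).sum() - (np.diag(c) ** 2).sum()  # redundancy-reduction term
    return on_diag + lam * off_diag
```

Feeding the two transformed views (YA, YB) of the same patch through the frozen-architecture twin encoders and minimizing this loss makes their features agree while decorrelating feature dimensions.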

Fig 2. Block diagram of the proposed semi-supervised model.

https://doi.org/10.1371/journal.pone.0311849.g002

Architectures.

We use state-of-the-art pretrained backbones—ResNet50 [58], ResNet101 [59], DenseNet121 [60], and ResNet101ViT [61]—within our semi-supervised learning framework. These models, trained on large-scale datasets, have proven effective at learning robust features that enhance performance in our corneal endothelial image segmentation task.

ResNet architectures are used for their deep learning capabilities and residual blocks that prevent performance degradation in deep networks. DenseNet is employed for its dense layer connections, allowing the model to capture detailed and contextual features. Finally, ResNet101ViT, which combines ResNet with Vision Transformer (ViT), captures both local and global features by applying attention mechanisms and processing images in patches [62]. This combination improves feature extraction and segmentation accuracy in medical imaging tasks.

Our approach demonstrates significant improvements in model generalization across different encoders, regardless of the pretrained backbone used.

Conventional data augmentation.

When there are not enough images to adequately train a model, or the classes are strongly imbalanced, applying geometric transformations is the simplest way to extend a dataset. However, not all transformations are suitable for this purpose [20]. In this type of data augmentation, a random distortion is applied to the input images before the training stage of the Siamese network, using operations such as horizontal flip, vertical flip, and rotations.

This strategy can increase the variability of the data set and improve the model’s ability to better generalize to new images under different conditions. However, these geometric transformations may not be sufficient to capture the complexity and diversity present in medical data sets.
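A minimal sketch of this conventional augmentation for an image/mask pair, assuming numpy arrays; the helper name and the 50% flip probabilities are illustrative choices:

```python
import numpy as np

def conventional_augment(image, mask, rng):
    """Apply the same random flips and 90-degree rotation to an image and
    its mask so the pixel-to-label correspondence is preserved."""
    if rng.random() < 0.5:                      # horizontal flip
        image, mask = image[:, ::-1], mask[:, ::-1]
    if rng.random() < 0.5:                      # vertical flip
        image, mask = image[::-1, :], mask[::-1, :]
    k = rng.integers(0, 4)                      # rotation by 0/90/180/270 degrees
    return np.rot90(image, k), np.rot90(mask, k)
```

Because every operation is a rigid symmetry of the square patch, no pixel values are invented or lost, which is also why the variability such transforms add is limited.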

Data augmentation via warping

The proposed warping method enhances data augmentation for corneal endothelium images by producing new images through a carefully constructed multistage transformation. The method takes as input the image I and its segmentation mask M. It involves keypoint extraction, Delaunay triangulation, and triangle mapping via affine transformations, as outlined in Fig 3.

As shown in Fig 3, the proposed data augmentation method receives corneal endothelium images and masks as input; these are then processed by the warping method, which is explained below in greater detail.

Keypoint extraction.

Keypoint extraction is a pivotal step in our warping process, applied to the segmentation mask M of the image I. This process is designed to capture crucial features from both the internal structure and the edges of the image, ensuring a comprehensive and realistic transformation.

Initially, keypoints are extracted from the mask M by identifying the centroids of cells. These points represent significant anatomical features within the corneal endothelium and are essential for guiding the warping transformations. Additionally, to mitigate potential edge artifacts, keypoints are also densely sampled along the image edges. This includes points from critical locations such as the corners, center points, and at intervals along the quarters and eighths of the image dimensions [63].

The final set of keypoints K thus encompasses a combination of these two types of extracted points, ensuring both the internal and boundary regions of the image are adequately represented. The set K is defined as

K = {k_i = (x_i, y_i) : i = 1, …, N}, (1)

where (x_i, y_i) are the coordinates of the i-th keypoint.

This set of keypoints K is instrumental in the subsequent Delaunay triangulation and affine transformations, facilitating the generation of realistic and varied warped images.

The extraction of closely spaced keypoints near the image edges prevents unrealistic transformations, such as black triangles, as illustrated in Fig 3. This approach ensures deformations are constrained to avoid unrealistic artifacts while still producing augmented training samples.
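The keypoint extraction step can be sketched with scipy as follows. The function name, the use of connected-component centroids, and the exact border-sampling pattern (corners, halves, quarters, eighths on each side) are illustrative assumptions consistent with the description above, not the authors' implementation:

```python
import numpy as np
from scipy import ndimage

def extract_keypoints(mask):
    """Keypoints = cell centroids from the binary mask M plus densely
    sampled points along the image border to pin the edges during warping."""
    labels, n = ndimage.label(mask)
    # centroids come back as (row, col); store them as (x, y)
    centroids = [(x, y) for y, x in
                 ndimage.center_of_mass(mask, labels, range(1, n + 1))]
    h, w = mask.shape
    fracs = [0, 1/8, 1/4, 3/8, 1/2, 5/8, 3/4, 7/8, 1]
    border = {(round(f * (w - 1)), 0) for f in fracs}        # top edge
    border |= {(round(f * (w - 1)), h - 1) for f in fracs}   # bottom edge
    border |= {(0, round(f * (h - 1))) for f in fracs}       # left edge
    border |= {(w - 1, round(f * (h - 1))) for f in fracs}   # right edge
    return centroids + sorted(border)
```

The dense border points are what later prevent the "black triangle" artifacts: since border keypoints barely move relative to their spacing, the convex hull of the jittered point set still covers almost the whole image.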

Delaunay triangulation.

Delaunay triangulation is used to partition the image into triangular regions, a crucial step for effective warping. This technique is selected for its distinctive properties that are especially beneficial in image processing tasks. Delaunay triangulation maximizes the minimum angle of all triangles, preventing the formation of elongated or skinny triangles, which results in more uniform and stable transformations during the warping process [64]. Moreover, it guarantees a unique set of triangles for a given set of points (assuming no four points are cocircular), providing consistency and predictability essential for repeatable image warping.

Given the set of keypoints K, Delaunay triangulation creates a network of triangles, represented mathematically as

T = {△k_a k_b k_c : k_a, k_b, k_c ∈ K}, (2)

where △k_a k_b k_c denotes a triangle formed by keypoints k_a, k_b, k_c. This structured approach ensures that the transformations applied in the warping process are evenly distributed and align with the image's natural geometry.

The choice of Delaunay triangulation is particularly advantageous for image warping. Its ability to adapt to the local geometry of the keypoints provides an optimal tessellation that minimizes potential warping artifacts and distortions. This leads to more natural and visually pleasing results, an essential factor in applications like medical imaging, where maintaining the integrity and accuracy of structural representations is paramount.
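A minimal example of the triangulation step using `scipy.spatial.Delaunay` (the wrapper function name is illustrative):

```python
import numpy as np
from scipy.spatial import Delaunay

def triangulate(keypoints):
    """Partition the plane spanned by the keypoints into Delaunay triangles.
    Returns an (n_triangles, 3) array of index triples into the keypoints."""
    pts = np.asarray(keypoints, dtype=float)
    return Delaunay(pts).simplices
```

For example, the four corners of a unit square plus its center yield exactly four triangles, each joining one square edge to the center point.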

Warping through local affine transformations.

The warping stage transforms the original image I with coordinates (x, y) into the warped image Iw with coordinates (x′, y′). This transformation is facilitated by local affine transformations applied within triangular regions.

Original keypoints K are modified to create a new set of keypoints K′, facilitating the generation of a new triangulation grid for the warped image (Fig 3). Each keypoint with coordinates (x_i, y_i) in K is shifted by random amounts to produce the new keypoints K′, given by

k′_i = (x_i + δx, y_i + δy), (3)

where δx and δy are random shifts. Let R be a random variable following a uniform distribution; a random shift δ is then computed as

δ = R · d_min / s, (4)

where d_min is the minimum distance between any two points in K, defined as

d_min = min_{i≠j} d(k_i, k_j), (5)

with

d(k_i, k_j) = √((x_i − x_j)² + (y_i − y_j)²). (6)

The parameter s in Eq 4 inversely controls the magnitude of the transformation: a smaller s results in larger shifts and consequently more pronounced warping [65].

After re-triangulation using the modified set K′, we obtain a new set of triangles that map the original image to the warped image. The continuous vector-valued warping function f is defined for each triangle, projecting the pixels from I to Iw as

I_w(x′, y′) = I(f⁻¹(x′, y′)), for (x′, y′) ∈ △′, (7)

where △′ is the domain of the triangles in Iw, and f⁻¹(x′, y′) gives the corresponding coordinates in the original image I.

This process ensures a bijective and smooth transition from I to Iw, resulting in a warped image Iw and warped mask Mw that exhibit realistic variations while maintaining the structural integrity of the original image. The local affine transformations are carefully computed to preserve the topology and avoid artifacts.
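Putting Eqs 3–7 together, a simplified numpy/scipy sketch of the warping stage might look as follows. Nearest-neighbour resampling, per-coordinate uniform jitter, and zeroing pixels that fall outside the warped hull are simplifying assumptions made here for illustration; the authors' interpolation and sampling details may differ:

```python
import numpy as np
from scipy.spatial import Delaunay

def warp_image(image, keypoints, s=3.0, rng=None):
    """Piecewise-affine warp: jitter keypoints by delta = R * d_min / s,
    re-triangulate, and pull each output pixel back through the inverse
    affine map of its triangle (Eq 7), with nearest-neighbour sampling."""
    rng = rng if rng is not None else np.random.default_rng()
    K = np.asarray(keypoints, dtype=float)              # (N, 2) as (x, y)
    d = np.linalg.norm(K[:, None] - K[None, :], axis=-1)
    d_min = d[d > 0].min()                              # closest keypoint pair
    K2 = K + rng.uniform(-1.0, 1.0, K.shape) * d_min / s  # jittered set K'
    tri = Delaunay(K2)                                  # triangulation of K'
    h, w = image.shape[:2]
    yy, xx = np.mgrid[0:h, 0:w]
    q = np.column_stack([xx.ravel(), yy.ravel()]).astype(float)
    simplex = tri.find_simplex(q)
    # barycentric coordinates of each output pixel in its warped triangle
    T = tri.transform[simplex.clip(0)]
    b = np.einsum('ijk,ik->ij', T[:, :2], q - T[:, 2])
    bary = np.column_stack([b, 1.0 - b.sum(axis=1)])
    # f^{-1}: same barycentric coordinates in the *original* triangle
    src = np.einsum('ij,ijk->ik', bary, K[tri.simplices[simplex.clip(0)]])
    sx = np.clip(np.round(src[:, 0]), 0, w - 1).astype(int)
    sy = np.clip(np.round(src[:, 1]), 0, h - 1).astype(int)
    out = image[sy, sx].reshape(image.shape)
    out.reshape(h * w, -1)[simplex < 0] = 0             # outside the warped hull
    return out
```

Applying the same function to the mask M with the same `rng` state yields the matching warped mask Mw, since the mapping depends only on the keypoints and the random shifts.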

Transformations via warping enrich the training dataset by introducing realistic variations in the images, which improves the generalization of CNNs. Fig 4 shows data augmentation for input images and masks using the warping method described above.

Fig 4. Transformation via warping with low (s = 3) and high (s = 2) deformation levels.

https://doi.org/10.1371/journal.pone.0311849.g004

From Fig 4, we can see that the proposed method can deform the original image at different levels by adjusting the parameter s in the developed algorithm. After the final deformations are obtained, the masks lack fine detail along the edges of the cells and guttae, so a mask refinement stage is proposed.

Mask refinement.

At this stage, the watershed technique is used to adjust the cell segmentation masks (Fig 5). This technique simulates the behavior of water flowing over a topography, where the points of water accumulation determine the boundaries between cells. Applying the watershed transform to the cell images identifies intensity gradients and regions of overlap, allowing for more precise and detailed segmentation of individual cells and a more effective adjustment of the mask edges. This improves the quality of the training data used for the models, resulting in higher accuracy and performance in cell segmentation [66]. A diagram of the proposed method is shown below.
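The refinement idea can be sketched with `scipy.ndimage.watershed_ift`. The seeding scheme used here (eroded cell interiors as markers plus a background marker outside the dilated mask, flooding the inverted distance transform) is an illustrative choice, not necessarily the authors' exact pipeline:

```python
import numpy as np
from scipy import ndimage

def refine_mask(mask):
    """Watershed refinement of a warped binary cell mask: seed from eroded
    cell interiors plus a background seed, then flood the inverted
    distance-transform topography to re-grow smooth cell boundaries."""
    mask = mask.astype(bool)
    dist = ndimage.distance_transform_edt(mask)
    # invert the distance map so cell centres become the deepest basins
    elevation = (255 - 255 * dist / (dist.max() + 1e-8)).astype(np.uint8)
    markers, _ = ndimage.label(ndimage.binary_erosion(mask, iterations=2))
    markers = markers.astype(np.int16)
    markers[~ndimage.binary_dilation(mask, iterations=2)] = -1  # background
    labels = ndimage.watershed_ift(elevation, markers)
    return labels > 0, labels   # refined binary mask and per-cell labels
```

The per-cell label image is also useful downstream, e.g. for recomputing centroids of the refined cells.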

Fig 5. Block diagram of the proposed strategy.

https://doi.org/10.1371/journal.pone.0311849.g005

After applying the watershed technique to the deformed images, the results obtained can be seen in Figs 6 and 7.

Fig 6. Edge refinement of a deformed mask of healthy endothelial cells using the watershed technique.

https://doi.org/10.1371/journal.pone.0311849.g006

Fig 7. Edge refinement of a deformed mask of endothelial cells with guttae using the watershed technique.

https://doi.org/10.1371/journal.pone.0311849.g007

In the figures above, it can be seen that the watershed technique helps to refine the output masks so that they are as plausible as possible with respect to the originals, since the generated masks, depending on the level of deformation, may present incomplete edges, isolated points, and overly thin or thick cellular regions. This yields very promising results for medical databases with few annotations.

Experiments and results

In this research, a semi-supervised learning model was developed, based on the Barlow Twins approach, with the purpose of improving the performance of CNNs. To train and validate the model, specular microscopy images of the corneal endothelium in vivo with various resolutions were used.

Experiment configuration

The model was trained in the unsupervised stage with different encoders, using a warmup-cosine learning-rate schedule initialized at 1e−3 for all experiments, a batch size of 8, and a weight decay of 5e−4. For the fine-tuning stage, the Adam optimizer and the binary cross-entropy plus Dice loss function were used to learn the segmentation task. The feature maps learned in the unsupervised stage were concatenated with a five-level decoder with [16, 32, 64, 128, 256] filters, which had attention modules, skip connections, and residual blocks.
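The fine-tuning objective named above can be written compactly. This numpy version is a sketch of the standard binary cross-entropy plus Dice formulation; the ε smoothing term is an assumed convention, not a value reported in the paper:

```python
import numpy as np

def bce_dice_loss(pred, target, eps=1e-7):
    """Binary cross-entropy plus (soft) Dice loss for binary segmentation.
    pred: predicted probabilities in (0, 1); target: binary ground truth."""
    pred = np.clip(pred, eps, 1 - eps)          # avoid log(0)
    bce = -(target * np.log(pred) + (1 - target) * np.log(1 - pred)).mean()
    inter = (pred * target).sum()
    dice = 1 - (2 * inter + eps) / (pred.sum() + target.sum() + eps)
    return bce + dice
```

Combining the two terms balances per-pixel calibration (BCE) with region overlap (Dice), which helps when cell borders occupy only a small fraction of the pixels.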

In order to demonstrate the applicability of the proposed data augmentation technique, the semi-supervised model was trained in three scenarios. In the first scenario, called Strategy 1 (Baseline), the corneal endothelium dataset with conventional augmentation (horizontal flip, vertical flip, and rotations) was used in all training stages of the model, employing 1719 unlabeled patches for unsupervised training and 271 labeled patches (216 for training, 25 for testing, and 30 for validation) to learn the segmentation task by fine-tuning under supervised learning. In the second scenario, called Strategy 2, the unlabeled patches for the unsupervised stage were increased from 1719 to 4220 using the proposed data augmentation technique. In the last scenario, called Strategy 3, the number of images from Strategy 2 was used in the unsupervised stage, and for the supervised stage the labeled patches were increased from 271 to 652 (597 for training, 25 for testing, and 30 for validation) using the proposed data augmentation. Table 1 shows the distribution of the data according to the strategy used, and Table 2 shows the quantitative results for several evaluation metrics on the segmentation task.

Table 1. Data distribution for training, testing and validation of the semi-supervised model.

(Deformation values of the parameter s ranged between 1 and 3, with values closer to zero giving more pronounced deformations).

https://doi.org/10.1371/journal.pone.0311849.t001

Table 2. Quantitative comparative analysis of the semi-supervised model for different data augmentation scenarios using the evaluation metrics coefficient dice (DC), accuracy (Acc), Area Under the Receiver Operating Characteristic Curve (AUROC), and mean intersection-over-union (mIoU).

https://doi.org/10.1371/journal.pone.0311849.t002

The sample distribution in Table 1 was designed with careful consideration of the data and study objectives. More samples were allocated to the validation set than the test set to fine-tune hyperparameters during training. This approach improves performance on unseen data and helps prevent overfitting. We also recognize the importance of the test set for obtaining an unbiased estimate of model performance.

During data augmentation, deformations were applied by adjusting the parameter “s” to values greater than zero. Plausible distortions were selected, while higher distortions rendered images unusable, resulting in non-uniform augmentation. This increased the unsupervised training data from 1719 to 4220 (2.76-fold) and the supervised data from 216 to 597 (2.45-fold). Greater augmentation was applied in the unsupervised stage (Barlow twins) to capture feature diversity, while less augmentation was used in the supervised stage to avoid overfitting and ensure generalization.

Fig 8 below shows the quantitative results for the segmentation task, taking into account the evaluation metrics.

Fig 8. Quantitative results of the Dice Coefficient metric (DC) using strategy 1 (Baseline), 2 and 3 with the proposed data augmentation method.

The acronym BT stands for Barlow Twins.

https://doi.org/10.1371/journal.pone.0311849.g008

Fig 8 shows the Dice coefficient (DC) calculated for the different semi-supervised models. The proposed strategy improves model performance significantly compared to the baseline (Strategy 1): improvements appear when data augmentation is applied only in the unsupervised learning stage, and results are even more promising when it is also applied during fine-tuning of the network. Table 2 reports the quantitative results for each architecture.
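Since the unsupervised stage relies on Barlow Twins, the objective it optimizes can be sketched in plain NumPy. This is a generic illustration of the Barlow Twins redundancy-reduction loss of Zbontar et al. [57], not the training code of this study; the trade-off weight `lam` is an assumption:

```python
import numpy as np

def barlow_twins_loss(z1, z2, lam=5e-3, eps=1e-9):
    """Barlow Twins loss (NumPy sketch).
    z1, z2: (N, D) embeddings of two augmented views of the same batch."""
    # Standardize each embedding dimension over the batch.
    z1 = (z1 - z1.mean(0)) / (z1.std(0) + eps)
    z2 = (z2 - z2.mean(0)) / (z2.std(0) + eps)
    n = z1.shape[0]
    c = z1.T @ z2 / n  # D x D cross-correlation matrix
    on_diag = ((np.diagonal(c) - 1.0) ** 2).sum()             # invariance term
    off_diag = (c ** 2).sum() - (np.diagonal(c) ** 2).sum()   # redundancy term
    return on_diag + lam * off_diag
```

In this setting, z1 and z2 would be the projector outputs for two augmented views of the same patch (e.g., two warped versions produced by the proposed method); the loss pushes the cross-correlation matrix toward the identity.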

Table 2 and Fig 8 show that the best performance is obtained with Strategy 3, which indicates that the proposed data augmentation process helps the semi-supervised model generalize better. This is because the generated images represent natural and realistic deformations that increase the diversity of the data.
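For reference, the two headline metrics can be computed as follows for a binary segmentation mask. This is a minimal NumPy sketch; the study's exact evaluation code may differ (e.g., in how scores are averaged over images):

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice coefficient for binary masks: 2|A∩B| / (|A| + |B|)."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

def mean_iou(pred, target, eps=1e-7):
    """Mean IoU averaged over the background (0) and foreground (1) classes."""
    ious = []
    for cls in (0, 1):
        p, t = pred == cls, target == cls
        inter = np.logical_and(p, t).sum()
        union = np.logical_or(p, t).sum()
        ious.append((inter + eps) / (union + eps))
    return float(np.mean(ious))
```

For example, a prediction that marks one extra foreground pixel relative to a one-pixel ground truth yields a Dice of 2/3, while the mIoU also averages in the background overlap.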

Our semi-supervised model, utilizing the proposed data augmentation method, demonstrates superior performance in segmenting corneal endothelium images, especially in the presence of guttae. The DenseNet121 encoder-based model achieved an AUROC of 0.92, a precision of 0.91, a Dice coefficient of 0.94, and a mean IoU of 0.86.

As shown in Table 1, our model was evaluated using cellular microscopy images of the corneal endothelium, including both healthy cells and those affected by Fuchs dystrophy. This comprehensive evaluation addresses a significant limitation of current strategies, which predominantly focus on healthy cells. The challenge of accurately segmenting cells with guttae in modern specular microscopy systems, where automatic segmentation often fails, was effectively tackled by our approach.

For comparison, Fabijańska [4] reported an AUROC of 0.92 and a Dice coefficient of 0.86 using a U-Net-based network for segmenting healthy corneal endothelial images. Nurzynska (2018) achieved 93% accuracy in overlapping automatic delineation with manual annotations and a modified Hausdorff distance of 0.14 pixels. Hao et al. [56] developed a deep learning system for segmenting corneal endothelial cells, achieving an AUC of 0.923 and a Pearson correlation coefficient of 0.932 for estimated morphometric parameters. Shilpashree et al. [49] reported an F1 score of 82.27%, a mean IoU of 77.27%, an accuracy of 87.9%, and an AUROC of 96.70% for healthy subjects using a combination of U-Net and watershed.

Our method surpasses these benchmarks, particularly in handling the variability and challenges posed by diseased cells, demonstrating the efficacy of incorporating data augmentation via warping transforms to enhance semi-supervised segmentation tasks.

Fig 9 shows the model receiving as input an image with healthy endothelial cells, in order to evaluate the predicted output mask of the semi-supervised model with a DenseNet121 encoder trained with Strategies 1, 2, and 3; Strategy 3 achieved the best performance with respect to the ground truth. In Fig 10, the model receives an image with diseased endothelial cells and challenging illumination, and Strategy 3 again achieves the best precision compared to the other strategies. These results show that the proposed data augmentation strategy helps improve the generalization of CNNs.

Fig 9. Segmentation results for an image with healthy cells using the semi-supervised model and the proposed data augmentation method.

https://doi.org/10.1371/journal.pone.0311849.g009

Fig 10. Results of the segmentation of an endothelial image with the presence of guttae using the semi-supervised model and the proposed data augmentation method.

https://doi.org/10.1371/journal.pone.0311849.g010

Conclusions

Deep learning models such as CNNs often struggle to generalize and to avoid overfitting when the labeled dataset is limited. In this study, we presented a data augmentation strategy based on warping and the watershed method to improve the performance of these algorithms in the semantic segmentation of specular microscopy images of the corneal endothelium. The results are highly promising compared to traditional data augmentation techniques, despite variations in scale, lighting, shadows, brightness, lack of sharpness, and other factors inherent to this type of image. These conclusions are supported by three data augmentation experiments using a semi-supervised learning approach, in which the proposed method proved to be the most effective. The accuracy, mIoU, and Dice coefficient metrics were used to verify the quality of the predictions. Finally, this technique should be of great help for medical databases that have few labeled images and present deformations within their ROIs.

Acknowledgments

S. Sanchez thanks Minciencias and Sistema General de Regalías (Programa de Becas de Excelencia) for a PhD scholarship.

References

  1. Fabijańska A. Automatic segmentation of corneal endothelial cells from microscopy images. Biomedical Signal Processing and Control. 2019;47:145–158.
  2. Selig B, Vermeer KA, Rieger B, Hillenaar T, Luengo Hendriks CL. Fully automatic evaluation of the corneal endothelium from in vivo confocal microscopy. BMC medical imaging. 2015;15:1–15. pmid:25928199
  3. Okumura N, Yamada S, Nishikawa T, Narimoto K, Okamura K, Izumi A, et al. U-Net Convolutional Neural Network for Segmenting the Corneal Endothelium in a Mouse Model of Fuchs Endothelial Corneal Dystrophy. Cornea. 2022;41(7):901–907. pmid:34864800
  4. Fabijańska A. Segmentation of corneal endothelium images using a U-Net-based convolutional neural network. Artificial Intelligence in Medicine. 2018;88:1–13. pmid:29680687
  5. Scarpa F, Ruggeri A. Development of a reliable automated algorithm for the morphometric analysis of human corneal endothelium. Cornea. 2016;35(9):1222–1228. pmid:27310881
  6. Prada AM, Quintero F, Mendoza K, Galvis V, Tello A, Romero LA, et al. Assessing Fuchs Corneal Endothelial Dystrophy Using Artificial Intelligence–Derived Morphometric Parameters From Specular Microscopy Images. Cornea. 2024;43(9). pmid:38334475
  7. Patel DV, McGhee CN. Quantitative analysis of in vivo confocal microscopy images: a review. Survey of ophthalmology. 2013;58(5):466–475. pmid:23453401
  8. Lan G, Twa MD, Song C, Feng J, Huang Y, Xu J, et al. In vivo corneal elastography: A topical review of challenges and opportunities. Computational and Structural Biotechnology Journal. 2023;. pmid:37181662
  9. Shen Z, Fu H, Shen J, Shao L. Modeling and Enhancing Low-Quality Retinal Fundus Images. IEEE Transactions on Medical Imaging. 2020;PP:1–1.
  10. Soh YQ, Peh GS, Naso SL, Kocaba V, Mehta JS. Automated clinical assessment of corneal guttae in fuchs endothelial corneal dystrophy. American Journal of Ophthalmology. 2021;221:260–272. pmid:32730910
  11. Aquino NR, Gutoski M, Hattori LT, Lopes HS. The effect of data augmentation on the performance of convolutional neural networks. Braz Soc Comput Intell. 2017;.
  12. Ginsburger K. Style Augmentation improves Medical Image Segmentation; 2022. pmid:35855502
  13. Vallez N, Bueno G, Deniz O, Blanco S. Diffeomorphic transforms for data augmentation of highly variable shape and texture objects. Computer Methods and Programs in Biomedicine. 2022;219:106775. pmid:35397412
  14. Araslanov N, Roth S. Self-supervised Augmentation Consistency for Adapting Semantic Segmentation; 2021.
  15. Sami AS, Rahim MSM. Trainable watershed-based model for cornea endothelial cell segmentation. Journal of Intelligent Systems. 2022;31(1):370–392.
  16. Scarpa F, Ruggeri A. Segmentation of corneal endothelial cells contour by means of a genetic algorithm. In: Ophthalmic Medical Image Analysis International Workshop. vol. 2. University of Iowa; 2015. p. 25–32.
  17. Canavesi C, Cogliati A, Hindman HB. Unbiased corneal tissue analysis using Gabor-domain optical coherence microscopy and machine learning for automatic segmentation of corneal endothelial cells. Journal of Biomedical Optics. 2020;25(9):092902–092902. pmid:32770867
  18. Al-Waisy AS, Alruban A, Al-Fahdawi S, Qahwaji R, Ponirakis G, Malik RA, et al. CellsDeepNet: A Novel Deep Learning-Based Web Application for the Automated Morphometric Analysis of Corneal Endothelial Cells. Mathematics. 2022;10(3).
  19. Sierra JS, Castro JDP, Meza J, Rueda D, Berrospi RD, Tello A, et al. Deep learning for robust segmentation of corneal endothelium images in the presence of cornea guttata. Proc SPIE. 2021;11804:118041F.
  20. Alomar K, Aysel HI, Cai X. Data Augmentation in Classification and Segmentation: A Survey and New Strategies. Journal of Imaging. 2023;9(2). pmid:36826965
  21. Ren W, Tang Y, Sun Q, Zhao C, Han QL. Visual Semantic Segmentation Based on Few/Zero-Shot Learning: An Overview; 2022.
  22. dos Santos VA, Schmetterer L, Stegmann H, Pfister M, Messner A, Schmidinger G, et al. CorneaNet: fast segmentation of cornea OCT scans of healthy and keratoconic eyes using deep learning. Biomed Opt Express. 2019;10(2):622–641. pmid:30800504
  23. Balestriero R, Bottou L, LeCun Y. The Effects of Regularization and Data Augmentation are Class Dependent; 2022.
  24. Wang Y, Huang G, Song S, Pan X, Xia Y, Wu C. Regularizing Deep Networks with Semantic Data Augmentation; 2021.
  25. Sanford TH, Zhang L, Harmon SA, Sackett J, Yang D, Roth H, et al. Data Augmentation and Transfer Learning to Improve Generalizability of an Automated Prostate Segmentation Model. American Journal of Roentgenology. 2020;215(6):1403–1410. pmid:33052737
  26. Deari S, Öksüz İ, Ulukaya S. Importance of data augmentation and transfer learning on retinal vessel segmentation. In: 2021 29th Telecommunications Forum (TELFOR). IEEE; 2021. p. 1–4.
  27. Zhao A, Balakrishnan G, Durand F, Guttag JV, Dalca AV. Data augmentation using learned transformations for one-shot medical image segmentation; 2019.
  28. Liu W, Lu Q, Zhuo Z, Liu Y, Ye C. One-Shot Segmentation of Novel White Matter Tracts via Extensive Data Augmentation; 2023.
  29. Jiao R, Zhang Y, Ding L, Cai R, Zhang J. Learning with Limited Annotations: A Survey on Deep Semi-Supervised Learning for Medical Image Segmentation; 2022.
  30. Wu Y, Ge Z, Zhang D, Xu M, Zhang L, Xia Y, et al. Mutual Consistency Learning for Semi-supervised Medical Image Segmentation; 2022.
  31. Devlin J, Chang MW, Lee K, Toutanova K. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:181004805. 2018;.
  32. Pathak D, Krähenbühl P, Donahue J, Darrell T, Efros AA. Context Encoders: Feature Learning by Inpainting. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR); 2016. p. 2536–2544.
  33. Balestriero R, LeCun Y. Contrastive and Non-Contrastive Self-Supervised Learning Recover Global and Local Spectral Embedding Methods; 2022.
  34. Grill JB, Strub F, Altché F, Tallec C, Richemond PH, Buchatskaya E, et al. Bootstrap your own latent: A new approach to self-supervised Learning; 2020.
  35. Ghosh S, Seth A, Mittal D, Singh M, Umesh S. DeLoRes: Decorrelating Latent Spaces for Low-Resource Audio Representation Learning; 2022.
  36. Nalepa J, Marcinkiewicz M, Kawulok M. Data Augmentation for Brain-Tumor Segmentation: A Review. Frontiers in Computational Neuroscience. 2019;13. pmid:31920608
  37. Taylor L, Nitschke G. Improving Deep Learning using Generic Data Augmentation; 2017.
  38. Nanni L, Paci M, Brahnam S, Lumini A. Comparison of Different Image Data Augmentation Approaches. Journal of Imaging. 2021;7:254. pmid:34940721
  39. Moreno-Barea FJ, Strazzera F, Jerez JM, Urda D, Franco L. Forward Noise Adjustment Scheme for Data Augmentation. In: 2018 IEEE Symposium Series on Computational Intelligence (SSCI); 2018. p. 728–734.
  40. Liu S, Tian G, Xu Y. A novel scene classification model combining ResNet based transfer learning and data augmentation with a filter. Neurocomputing. 2019;338:191–206.
  41. Skandarani Y, Painchaud N, Jodoin PM, Lalande A. On the effectiveness of GAN generated cardiac MRIs for segmentation; 2020. pmid:32746116
  42. Bissoto A, Valle E, Avila S. GAN-Based Data Augmentation and Anonymization for Skin-Lesion Analysis: A Critical Review; 2021.
  43. Mumuni A, Mumuni F. Data augmentation: A comprehensive survey of modern approaches. Array. 2022;16:100258.
  44. Sierra JS, Pineda J, Viteri E, Rueda D, Tibaduiza B, Berrospi RD, et al. Automated corneal endothelium image segmentation in the presence of cornea guttata via convolutional neural networks. Proc SPIE. 2020; p. 115110H.
  45. Kugelman J, Alonso-Caneiro D, Read SA, Collins MJ. A review of generative adversarial network applications in optical coherence tomography image analysis. Journal of Optometry. 2022;15:S1–S11. pmid:36241526
  46. Sierra JS, Pineda J, Rueda D, Tello A, Prada AM, Galvis V, et al. Corneal endothelium assessment in specular microscopy images with Fuchs' dystrophy via deep regression of signed distance maps. Biomed Opt Express. 2023;14(1):335–351. pmid:36698671
  47. Vigueras-Guillén JP, van Rooij J, van Dooren BTH, Lemij HG, Islamaj E, van Vliet LJ, et al. DenseUNets with feedback non-local attention for the segmentation of specular microscopy images of the corneal endothelium with guttae; 2022. pmid:35982194
  48. Kolluru C, Benetz BA, Joseph N, Menegay HJ, Lass JH, Wilson D. Machine learning for segmenting cells in corneal endothelium images. In: Medical Imaging 2019: Computer-Aided Diagnosis. vol. 10950. SPIE; 2019. p. 1126–1135.
  49. Shilpashree PS, Suresh KV, Sudhir RR, Srinivas SP. Automated Image Segmentation of the Corneal Endothelium in Patients With Fuchs Dystrophy. Translational Vision Science & Technology. 2021;10(13):27–27. pmid:34807254
  50. Vigueras-Guillén JP, Lasenby J, Seeliger F. Rotaflip: A New CNN Layer for Regularization and Rotational Invariance in Medical Images; 2021.
  51. Vigueras-Guillén J, Sari B, Goes S, Lemij H, Rooij J, Vermeer K, et al. Fully convolutional architecture vs sliding-window CNN for corneal endothelium cell segmentation. BMC Biomedical Engineering. 2019;1. pmid:32903308
  52. Wu J, Shen B, Zhang H, Wang J, Pan Q, Huang J, et al. Semi-supervised Learning for Nerve Segmentation in Corneal Confocal Microscope Photography. In: Medical Image Computing and Computer Assisted Intervention—MICCAI 2022: 25th International Conference, Singapore, September 18–22, 2022, Proceedings, Part IV. Berlin, Heidelberg: Springer-Verlag; 2022. p. 47–57.
  53. Sanchez S, Mendoza K, Quintero F, Prada AM, Tello A, Galvis V, et al. Deep neural networks for evaluation of specular microscopy images of the corneal endothelium with Fuchs' dystrophy. In: Pattern Recognition and Tracking XXXIV. vol. 12527. SPIE; 2023. p. 183–191.
  54. Kucharski A, Fabijańska A. Corneal endothelial image segmentation training data generation using GANs. Do experts need to annotate? Biomedical Signal Processing and Control. 2023;85:104985.
  55. Saxena D, Cao J. Generative Adversarial Networks (GANs Survey): Challenges, Solutions, and Future Directions; 2023.
  56. Qu JH, Qin XR, Peng RM, Xiao GG, Cheng J, Gu SF, et al. A Fully Automated Segmentation and Morphometric Parameter Estimation System for Assessing Corneal Endothelial Cell Images. American Journal of Ophthalmology. 2022;239:142–153. pmid:35288075
  57. Zbontar J, Jing L, Misra I, LeCun Y, Deny S. Barlow twins: Self-supervised learning via redundancy reduction. In: International Conference on Machine Learning. PMLR; 2021. p. 12310–12320.
  58. He K, Zhang X, Ren S, Sun J. Deep Residual Learning for Image Recognition; 2015.
  59. Lee Y, Yim B, Kim H, Park E, Cui X, Woo T, et al. Wide-Residual-Inception Networks for Real-time Object Detection; 2017.
  60. Huang G, Liu Z, van der Maaten L, Weinberger KQ. Densely Connected Convolutional Networks; 2018.
  61. Chen X, Hsieh CJ, Gong B. When Vision Transformers Outperform ResNets without Pre-training or Strong Data Augmentations; 2022.
  62. Shafiq M, Gu Z. Deep Residual Learning for Image Recognition: A Survey. Applied Sciences. 2022;12(18).
  63. Chen Y, Zheng H, Ma Y, Yan Z. Image stitching based on angle-consistent warping. Pattern Recognition. 2021;117:107993.
  64. Kulwa F, Li C, Grzegorzek M, Rahaman MM, Shirahama K, Kosov S. Segmentation of Weakly Visible Environmental Microorganism Images Using Pair-wise Deep Learning Features; 2022.
  65. Huang J, Li H, Wan X, Li G. Affine-Consistent Transformer for Multi-Class Cell Nuclei Detection. In: Proceedings of the IEEE/CVF International Conference on Computer Vision; 2023. p. 21384–21393.
  66. Kornilov A, Safonov I, Yakimchuk I. A Review of Watershed Implementations for Segmentation of Volumetric Images. Journal of Imaging. 2022;8(5). pmid:35621890