Table 1.
Comparison of underwater image enhancement methods.
Fig 1.
Schematic diagram of the proposed underwater video analysis pipeline.
The pipeline starts with the input video, applies preprocessing and enhancement techniques to improve image quality, and passes the result to a classification model, which outputs the final label.
Table 2.
Species Annotation Distribution in the LifeCLEF-2015 Dataset.
Fig 2.
Geographical locations of the Indonesian islands Bali, Komodo, Raja Ampat, Lembeh, Fakfak, and Triton Bay.
These spatial distribution maps were generated with a Python-based geospatial pipeline using Matplotlib to visualize the Indonesian region.
Fig 3.
Sample images from each class in the LifeCLEF-2015 dataset.
It contains 15 classes of underwater species [44].
Fig 4.
Frames selected from the underwater video. Rows 1 and 3 display the original images; rows 2 and 4 display the corresponding grayscale images. This shows how grayscale conversion removes color while retaining intensity and important details in the image.
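As a minimal sketch of the grayscale step (not the paper's exact implementation, and with a hypothetical function name), the conversion can be expressed with the standard ITU-R BT.601 luma weights, which retain intensity while discarding color:

```python
import numpy as np

def to_grayscale(frame: np.ndarray) -> np.ndarray:
    """Convert an RGB frame of shape (H, W, 3) to a single-channel
    grayscale image using the ITU-R BT.601 luma weights.
    Illustrative sketch; the actual pipeline may use a library routine."""
    weights = np.array([0.299, 0.587, 0.114])  # R, G, B contributions
    return (frame[..., :3] @ weights).astype(np.uint8)
```

Because the weights sum to 1, a white pixel maps to full intensity while a pure-red pixel maps to roughly 30% intensity.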
Fig 5.
First row: a) original image, b) contrast adjustment; second row: a) original image, b) flipping; third row: a) original image, b) rotation. These augmentations help improve model learning and performance.
Fig 6.
Frames selected from underwater video with clear environments. Rows 1 and 3 show the original images, and rows 2 and 4 show the images obtained by the GW algorithm.
Fig 7.
Examples of complex underwater environments.
Complex underwater environments exhibit low visibility due to turbidity, inconsistent lighting, and noisy backgrounds. Row 1 shows the original images and row 2 shows the images obtained by the GW algorithm, which corrects the color imbalance and improves the color and visual quality of the images.
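A minimal sketch of the Gray World (GW) correction, under the assumption that the average scene color should be achromatic (the function name is illustrative, not from the paper): each channel is rescaled so its mean matches the global gray mean.

```python
import numpy as np

def gray_world(img: np.ndarray) -> np.ndarray:
    """Gray World color correction for an RGB image (H, W, 3).
    Scales each channel so that its mean equals the mean over all
    channels, countering the blue-green cast of underwater frames."""
    img = img.astype(np.float64)
    channel_means = img.reshape(-1, 3).mean(axis=0)  # per-channel average
    gray_mean = channel_means.mean()                 # target achromatic level
    balanced = img * (gray_mean / channel_means)     # per-channel gain
    return np.clip(balanced, 0, 255).astype(np.uint8)
```

After correction the three channel means coincide, which is exactly the Gray World assumption; severely color-deficient frames can still need further contrast enhancement.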
Fig 8.
Example images of HE and CLAHE.
The figure shows the original frames alongside the frames obtained after applying HE and CLAHE. It illustrates the two contrast-enhancement techniques: HE improves the overall contrast, while CLAHE improves local details.
Fig 9.
Row 1 shows the original frames. Row 2 shows the Canny edge-detected frames, which highlight fine edges while reducing noise to detect clear boundaries.
Fig 10.
Edge-detection results for turtle, coral reef, and fish images with varying σ. A small σ preserves fine details, while a higher σ reduces noise but also suppresses details of the original structures. Comparisons are made at different noise levels with σ = 1, σ = 20, σ = 40, σ = 60, σ = 80, and σ = 100.
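The σ trade-off can be illustrated with a simplified stand-in for the smoothing stage of Canny (not the paper's full detector, which also applies non-maximum suppression and hysteresis; the function name and threshold are assumptions): Gaussian smoothing followed by gradient-magnitude thresholding.

```python
import numpy as np
from scipy import ndimage

def smoothed_edges(gray: np.ndarray, sigma: float, threshold: float) -> np.ndarray:
    """Simplified edge response: Gaussian smoothing with the given sigma,
    then thresholding of the gradient magnitude. A larger sigma suppresses
    noise but flattens gradients, so fewer fine edges survive the threshold."""
    smoothed = ndimage.gaussian_filter(gray.astype(np.float64), sigma)
    gy, gx = np.gradient(smoothed)       # per-axis central differences
    magnitude = np.hypot(gx, gy)         # gradient strength
    return magnitude > threshold
```

On a sharp step edge, a small σ leaves a strong, localized gradient peak, while a large σ spreads the same step over many pixels so its peak magnitude falls below the threshold, mirroring the loss of detail seen at high σ.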
Fig 11.
Histograms of images with different contrast levels.
Example images with different contrast characteristics, such as dark, bright, color, and high- and low-contrast images.
Fig 12.
The classification performance metrics.
Performance metrics of precision, recall, and F1-score are compared for the DenseNet121, VGG16, and ResNet50 transfer learning models.
Table 3.
The performance metrics of precision, recall, and F1-score for each deep learning model on the testing dataset.
Table 4.
Ablation study on the effect of Canny edge detection on classification performance.
Table 5.
Comparison between results of underwater image/video classification models.
Fig 13.
(left) VGG training performance and (right) VGG testing performance in terms of accuracy, precision, recall, and F1-score.
Fig 14.
ResNet training performance of accuracy, precision, recall, and F1-score (left) and testing performance (right).
Fig 15.
DenseNet training performance of accuracy, precision, recall, and F1-score (left) and testing performance (right).
Table 6.
Comparison of SRGAN and ESRGAN Enhancement Performance.
Fig 16.
(left) VGG confusion matrix using training data, (right) VGG confusion matrix using testing data.
Fig 17.
(left) ResNet confusion matrix using training data, (right) ResNet confusion matrix using testing data.
Fig 18.
(left) DenseNet confusion matrix using training data, (right) DenseNet confusion matrix using testing data.
Fig 19.
The plots showing accuracy, precision, recall, and F1-score using training data (left) and testing data (right).
Fig 20.
The plots showing accuracy, precision, recall, and F1-score using training data (left) and testing data (right).
Fig 21.
The plots showing accuracy, precision, recall, and F1-score using training data (left) and testing data (right).
Fig 22.
The first row shows the original images, and the second row shows the denoised images. Denoising improves edge accuracy by removing noise while keeping important details.
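As an illustrative sketch of edge-preserving denoising (the caption does not name the exact filter used, and this function name is an assumption), a median filter removes impulse noise while keeping sharp boundaries better than simple averaging:

```python
import numpy as np
from scipy import ndimage

def denoise_median(gray: np.ndarray, size: int = 3) -> np.ndarray:
    """Median-filter denoising: each pixel is replaced by the median of
    its size x size neighborhood. Isolated noise spikes are removed,
    while step edges (the majority value in the window) are preserved."""
    return ndimage.median_filter(gray, size=size)
```

A single bright outlier pixel in an otherwise uniform patch is fully suppressed, which is why such filtering helps the downstream edge detector find clean boundaries.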
Fig 23.
Results for 10 random samples of LifeCLEF-2015 dataset.
The first row shows the original images, the second row shows images enhanced with the SRGAN technique, and the third row shows images enhanced with the ESRGAN technique.