Table 1.
Comparison of underwater image enhancement methods.
Fig 1.
Schematic diagram of the proposed underwater video analysis pipeline.
The pipeline starts with the input video, applies preprocessing and enhancement techniques to improve image quality, and passes the result to a classification model, which outputs the final label.
Table 2.
Species Annotation Distribution in the LifeCLEF-2015 Dataset.
Fig 2.
Geographical locations of the Indonesian islands Bali, Komodo, Raja Ampat, Lembeh, Fakfak, and Triton Bay.
These spatial distribution maps were generated with a Python-based geospatial pipeline using Matplotlib to visualize the Indonesian region.
Fig 3.
Sample images from each class in the LifeCLEF-2015 dataset.
It contains 15 classes of underwater species [44].
Fig 4.
Frames selected from the underwater video. Rows 1 and 3 display the original images; rows 2 and 4 display the corresponding grayscale images. This shows how grayscale conversion removes color while retaining intensity and important details in the image.
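As a minimal sketch of the grayscale step (not the paper's exact implementation, and with a hypothetical function name), the conversion can be expressed with the standard ITU-R BT.601 luma weights, which retain intensity while discarding color:

```python
import numpy as np

def to_grayscale(frame: np.ndarray) -> np.ndarray:
    """Convert an RGB frame of shape (H, W, 3) to a single-channel
    grayscale image using the ITU-R BT.601 luma weights.
    Illustrative sketch; the actual pipeline may use a library routine."""
    weights = np.array([0.299, 0.587, 0.114])  # R, G, B contributions
    return (frame[..., :3] @ weights).astype(np.uint8)
```

Because the weights sum to 1, a white pixel maps to full intensity while a pure-red pixel maps to roughly 30% intensity.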
Fig 5.
First row: a) original image, b) contrast adjustment; second row: a) original image, b) flipping; third row: a) original image, b) rotation. These augmentations help improve model learning and performance.
Fig 6.
Frames selected from underwater video with clear environments. Rows 1 and 3 show the original images, and rows 2 and 4 show the images obtained by the GW algorithm.
Fig 7.
Examples of complex underwater environments.
Complex underwater environments exhibit low visibility due to turbidity, inconsistent lighting, and noisy backgrounds. Row 1 shows the original images and row 2 shows the images obtained by the GW algorithm, which corrects the color imbalance and improves the color and visual quality of the images.
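A minimal sketch of the Gray World (GW) correction, under the assumption that the average scene color should be achromatic (the function name is illustrative, not from the paper): each channel is rescaled so its mean matches the global gray mean.

```python
import numpy as np

def gray_world(img: np.ndarray) -> np.ndarray:
    """Gray World color correction for an RGB image (H, W, 3).
    Scales each channel so that its mean equals the mean over all
    channels, countering the blue-green cast of underwater frames."""
    img = img.astype(np.float64)
    channel_means = img.reshape(-1, 3).mean(axis=0)  # per-channel average
    gray_mean = channel_means.mean()                 # target achromatic level
    balanced = img * (gray_mean / channel_means)     # per-channel gain
    return np.clip(balanced, 0, 255).astype(np.uint8)
```

After correction the three channel means coincide, which is exactly the Gray World assumption; severely color-deficient frames can still need further contrast enhancement.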
Fig 8.
Example images of HE and CLAHE.
The figure shows the original frames alongside the frames obtained after applying HE and CLAHE. It illustrates the two contrast-enhancement techniques: HE improves the overall contrast, while CLAHE improves local details.
Fig 9.
Row 1 shows the original frames. Row 2 shows the Canny edge-detected frames, which highlight fine edges while reducing noise to detect clear boundaries.
Fig 10.
Edge-detection results for turtle, coral reef, and fish images with varying σ. A small σ preserves fine details, while a higher σ reduces noise but also suppresses details of the original structures. Comparisons are made at different noise levels with σ = 1, σ = 20, σ = 40, σ = 60, σ = 80, and σ = 100.
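The σ trade-off can be illustrated with a simplified stand-in for the smoothing stage of Canny (not the paper's full detector, which also applies non-maximum suppression and hysteresis; the function name and threshold are assumptions): Gaussian smoothing followed by gradient-magnitude thresholding.

```python
import numpy as np
from scipy import ndimage

def smoothed_edges(gray: np.ndarray, sigma: float, threshold: float) -> np.ndarray:
    """Simplified edge response: Gaussian smoothing with the given sigma,
    then thresholding of the gradient magnitude. A larger sigma suppresses
    noise but flattens gradients, so fewer fine edges survive the threshold."""
    smoothed = ndimage.gaussian_filter(gray.astype(np.float64), sigma)
    gy, gx = np.gradient(smoothed)       # per-axis central differences
    magnitude = np.hypot(gx, gy)         # gradient strength
    return magnitude > threshold
```

On a sharp step edge, a small σ leaves a strong, localized gradient peak, while a large σ spreads the same step over many pixels so its peak magnitude falls below the threshold, mirroring the loss of detail seen at high σ.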
Fig 11.
Histograms of images with different contrast levels.
Example images with different contrast characteristics, such as dark, bright, color, and high- and low-contrast images.
Fig 12.
The classification performance metrics.
Performance metrics of precision, recall, and F1-score are compared for the DenseNet121, VGG16, and ResNet50 transfer learning models.
Table 3.
The performance metrics of precision, recall, and F1-score for each deep learning model on the testing dataset.
Table 4.
Ablation study on the effect of Canny edge detection on classification performance.
Table 5.
Comparison between results of underwater image/video classification models.
Fig 13.
(left) VGG training performance and (right) VGG testing performance in terms of accuracy, precision, recall, and F1-score.
Fig 14.
ResNet training performance of accuracy, precision, recall, and F1-score (left) and testing performance (right).
Fig 15.
DenseNet training performance of accuracy, precision, recall, and F1-score (left) and testing performance (right).
Table 6.
Comparison of SRGAN and ESRGAN Enhancement Performance.
Fig 16.
(left) VGG confusion matrix using training data, (right) VGG confusion matrix using testing data.
Fig 17.
(left) ResNet confusion matrix using training data, (right) ResNet confusion matrix using testing data.
Fig 18.
(left) DenseNet confusion matrix using training data, (right) DenseNet confusion matrix using testing data.
Fig 19.
The plots showing accuracy, precision, recall, and F1-score using training data (left) and testing data (right).
Fig 20.
The plots showing accuracy, precision, recall, and F1-score using training data (left) and testing data (right).
Fig 21.
The plots showing accuracy, precision, recall, and F1-score using training data (left) and testing data (right).
Fig 22.
The first row shows the original images, and the second row shows the denoised images. Denoising improves edge accuracy by removing noise while keeping important details.
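As an illustrative sketch of edge-preserving denoising (the caption does not name the exact filter used, and this function name is an assumption), a median filter removes impulse noise while keeping sharp boundaries better than simple averaging:

```python
import numpy as np
from scipy import ndimage

def denoise_median(gray: np.ndarray, size: int = 3) -> np.ndarray:
    """Median-filter denoising: each pixel is replaced by the median of
    its size x size neighborhood. Isolated noise spikes are removed,
    while step edges (the majority value in the window) are preserved."""
    return ndimage.median_filter(gray, size=size)
```

A single bright outlier pixel in an otherwise uniform patch is fully suppressed, which is why such filtering helps the downstream edge detector find clean boundaries.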
Fig 23.
Results for 10 random samples of LifeCLEF-2015 dataset.
The first row shows the original images, the second row shows images enhanced with the SRGAN technique, and the third row shows images enhanced with the ESRGAN technique.