Abstract
Image Quality Assessment (IQA) plays a critical role in image-based decision-making systems, especially in domains requiring high diagnostic precision. Effective feature information is a prerequisite for the high performance of machine learning methods in parasitic organism detection, and the quality of this feature information is influenced by the quality of the images. However, No-Reference IQA (NR-IQA) models have largely ignored microscopy-based datasets, particularly those involving parasitic organisms such as Cryptosporidium spp. and Giardia spp., which are vital for public health inspection. In this study, PRIQA (Parasite ResNet-101 IQA), a novel deep learning-based NR-IQA model specifically trained on a small parasite image dataset, is presented. Using Mean Opinion Scores (MOS) from twenty human evaluators, nine Deep Convolutional Neural Network (DCNN) architectures were benchmarked, and ResNet-101 was identified as the most robust feature extractor. The features were mapped to MOS using regression models and compared with ten state-of-the-art NR-IQA algorithms. Experimental results demonstrated that PRIQA consistently outperforms existing methods, indicating its suitability as a practical quality control tool for identifying unreliable or low-quality parasite microscopy images and supporting more consistent downstream detection and diagnostic workflows in automated inspection systems.
Citation: Asri MAA, Rajagopal H, Mokhtar N, Wan Mohd Mahiyiddin WA, Lim YAL, Iwahashi M, et al. (2026) Deep learning-based no-reference image quality assessment framework for Cryptosporidium spp. and Giardia spp. PLoS One 21(1): e0341160. https://doi.org/10.1371/journal.pone.0341160
Editor: Ayush Dogra, Chitkara University Institute of Engineering and Technology, INDIA
Received: May 12, 2025; Accepted: January 4, 2026; Published: January 20, 2026
Copyright: © 2026 Asri et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: The reference dataset and source code, including scripts for feature extraction, distortion generation, and regression-based quality assessment developed during this study, are publicly available in the GitHub repository at https://github.com/Amirul-777/PRIQA---Parasite-ResNet-101-Image-Quality-Assessment-Study.
Funding: This work was partially supported by Japan Society for the Promotion of Science (JSPS) KAKENHI Grant Number JP25K21814. The funders were involved in funding acquisition, supervision, and contributions to the writing (original draft, review, and editing).
Competing interests: The authors have declared that no competing interests exist.
Introduction
Image Quality Assessment (IQA) is essential across various fields for inspection purposes, including medical imaging [1], haze detection [2], and agricultural product evaluation [3]. Xu et al. [4] highlighted that the performance of parasite detection models heavily depends on effective feature extraction, which is inherently linked to the quality of input images. High-quality microscopic images with good resolution, contrast, and minimal noise provide detailed and accurate information essential for feature extraction. Conversely, low-quality images may obscure critical details, directly impacting the ability to discern fine features necessary for identifying specific parasites. This is particularly critical for inspecting parasite images, such as Cryptosporidium and Giardia in drinking water treatment plants. These parasites are major non-viral infectious agents causing parasitic diarrhea [5], often found in natural water contaminated by agricultural and animal wastes [6]. They play a significant role in disease outbreaks and are therefore a major concern for public health policies and the operations of water treatment plants [7]. Thus, accurately assessing image quality is crucial to avoid inspection errors, and automating this process with advanced models would be highly beneficial.
Recent research on microscopic images has led to models that quantify image distortion and improve image quality, enhancing pathological image assessments [8]. For example, deep convolutional neural networks (CNNs) have advanced the recognition of formed elements in microscopic images through autofocus processes using blind IQA methods [9]. A similar approach has been developed to automatically quantify focus for image correction in microscopic hyperspectral images of cancer cells [10]. In the realm of parasitic images, algorithms have been created for real-time detection of microscopic parasites like Protozoa [11]. However, there is a lack of specific studies or models for IQA of parasite images, particularly Cryptosporidium spp. and Giardia spp. Parasite images are subject to lens distortions, such as radial and tangential distortions, which can cause straight lines to appear curved and images to appear skewed. Additionally, lens aberrations can introduce further distortions that compromise image accuracy, while increased magnification can exacerbate these issues. Grating distortions also contribute, with minimal distortion in the central region and larger distortions near the edges, consistent with geometric models of lens distortion [12].
Given the advancements in IQA for microscopic images, our study aims to fill this gap and potentially stimulate further research interest in parasitic image quality assessment, improving inspection accuracy in public health applications. Techniques for parasite image quality control can be developed using IQA algorithms, which analyze image signals to quantify visual distortions [13].
IQA methods are categorized into subjective and objective types. Subjective IQA methods, though highly accurate and dependable, are labor-intensive and time-consuming, limiting their suitability for parasite image quality control tasks. In contrast, objective IQA algorithms automatically evaluate image quality using trained models and do not require human intervention. These algorithms are further classified into Full-Reference (FR-IQA), Reduced-Reference (RR-IQA), and No-Reference (NR-IQA) categories. FR-IQA compares the entire image to a perfect or pristine reference image, RR-IQA uses partial information from a reference image, and NR-IQA assesses image quality without any reference image.
In this study, we emphasize NR-IQA, which is particularly suitable for parasitic microscopy images, as it does not rely on the availability of pristine reference images. Instead, NR-IQA predicts image quality solely based on machine learning algorithms and statistical models by analyzing intrinsic image features such as brightness, contrast, and sharpness. This makes NR-IQA especially appropriate for parasitic image analysis, where high-quality reference images are typically unavailable.
NR-IQA studies have recently been conducted for various image types such as natural images [14], underwater images [15], MRIs [16], PET scans [17], and wood images [18]. A recent NR-IQA method, Local Feature Descriptors-IQA, was designed for both authentic and artificial distortions. This method processed images by converting them to Y, Cb, and Cr color channels, applying Human Visual System (HVS) inspired filters, and extracting local features using descriptors such as speeded up robust features (SURF) and features from accelerated segment test (FAST). A regression model trained on these extracted features predicts perceptual quality scores [19]. Furthermore, a self-supervised NR-IQA method named ARNIQA (leArning distoRtion maNifold for Image Quality Assessment) was designed for natural images with both synthetic and authentic distortions. ARNIQA used a novel image degradation model and a training strategy based on the SimCLR (Simple Framework for Contrastive Learning of Visual Representations) framework. It employs a pre-trained ResNet-50 encoder and a 2-layer MLP projector to generate image representations [20].
Another NR-IQA method, IL-NIQE (Integrated Local Natural Image Quality Evaluator) used a collection of pristine naturalistic images to learn a multivariate Gaussian (MVG) model. It extracts five types of Natural Scene Statistics (NSS) features from image patches, fitting each to a local MVG model to predict local quality scores [21]. These studies have demonstrated that their proposed metrics outperformed the state-of-the-art NR-IQA methods such as Blind/Referenceless Image Spatial Quality Evaluator (BRISQUE), Blind Image Quality Index (BIQI), Blind Image Integrity Notator using DCT Statistics II (BLIINDS-II), Natural Image Quality Evaluator (NIQE), Deep Bilinear Convolutional Neural Network (DB-CNN), Ensemble of Neural Image Quality Assessment (ENIQA), No-Reference Based Image Quality Assessment (NBIQA) and Spatial-Spectral Entropy-based Quality (SSEQ). Additionally, they have shown that NR-IQA algorithms must be specifically trained for the types of images they will assess.
In specialized imaging domains, studies have adopted a similar strategy of learning a regression from image features to subjective quality scores on modestly sized, domain-specific datasets. For example, one study extracted quantitative features from CT pulmonary angiography scans and trained a Random Forest regression model to predict radiologists' MOS for image quality on approximately 150 cases [22]. In the general NR-IQA setting, a study by Hu et al. used Swin Transformer features with a regression head to predict continuous DMOS/MOS on standard IQA databases, explicitly describing CSIQ (866 distorted images from 30 reference images) as a small laboratory-synthesized dataset [23].
In parallel, CNN-based IQA frameworks have been increasingly investigated, particularly within clinical imaging contexts. One study introduced a CNN-based IQA framework for clinical skin images with human-labeled ratings to guide perceptual quality scoring [24]. A BIQA model tailored for pathological microscopy, using expert MOS, was presented for screen and immersion settings [8]. Oh et al. proposed a CNN-based IQA for ultra-widefield fundus (UWF) images; the model predicted IQA scores with EfficientNet-B3 as the backbone [25]. A comprehensive review of Medical Image Quality Assessment (MIQA) approaches emphasized the need for robust NR-IQA solutions in clinical workflows [26]. These works affirm the importance of domain-specific NR-IQA but reveal a clear gap in parasite image quality prediction, particularly for Cryptosporidium and Giardia under various distortions, which this study aims to address.
From the perspective of pathological images, CNN-based models have been widely employed for classification tasks. For instance, a CNN-based hybrid model for malaria parasite classification was proposed without addressing image quality [27]. In [28], a CNN-based model, namely the Batch Normalization, Layer Normalization, GELU (Gaussian Error Linear Unit), and Swish functions-based network (BLGSNet), was developed to detect parasites. Another study examined the efficiency of three CNN models, ConvNeXt Tiny, EfficientNet V2 S, and MobileNet V3 S, in classifying Ascaris and Taenia, which cause helminth infections in humans [29]. These studies demonstrate the effectiveness of CNN models in parasite image classification, thereby contributing to improved diagnostic accuracy. However, despite their success in classification, CNN models have not been explored for IQA of parasite images, leaving a critical gap in ensuring the reliability of visual data used for diagnosis. Hence, this highlights the need to develop an NR-IQA method to assess parasite images. To the best of our knowledge, this is the first dedicated NR-IQA framework developed specifically for parasitic microscopy images targeting Cryptosporidium and Giardia, a critical yet underrepresented area in public health diagnostics. By tailoring NR-IQA to this domain, we aim to support more robust downstream tasks such as automated parasite detection and digital screening in resource-constrained settings.
In this paper, a NR-IQA metric tailored specifically for microscopic parasite images is proposed. A controlled dataset was constructed using reference images of Cryptosporidium spp. and Giardia spp., in which four distortion types, namely Gaussian white noise (GWN), salt and pepper noise, speckle noise, and JPEG compression, which are commonly encountered in microscopic parasite imaging due to sensor noise [8], acquisition conditions [9], and compression artifacts, were synthetically applied at nine degradation levels, resulting in a total of 1,058 images. MOS were collected from twenty human observers through subjective evaluation. Features were extracted from all images using multiple convolutional neural network architectures, including ResNet variants, Inception-ResNet-v2, GoogLeNet, AlexNet, EfficientNet-B0, and DarkNet. These features were subsequently mapped to MOS using various regression models, including linear regression, support vector machines (SVMs), and decision tree regressors. To assess the reliability of the collected MOS values, correlations with established FR-IQA metrics were examined. To ensure robustness and reliability, the proposed framework was further analyzed using statistical significance testing with Wilcoxon signed-rank and Nemenyi post-hoc tests, sensitivity and ablation studies, and centered kernel alignment (CKA) analysis to examine depth-wise feature representation robustness under different distortions. Comprehensive comparisons between the three best performing configurations, namely PRIQA, PDIQA, and PEIQA, and existing state-of-the-art NR-IQA methods demonstrate that the proposed Parasite ResNet-101 IQA (PRIQA) framework achieves superior performance on the constructed parasite image dataset and offers an objective, automated solution for quality control in microscopic parasite imaging. 
PRIQA can facilitate more consistent downstream detection and diagnosis workflows, minimize manual rescreening, and provide a solid basis for expanding NR-IQA to additional specialized biomedical imaging contexts by consistently detecting low-quality unreliable images. By enabling reliable and automated quality assessment of parasitic microscopy images, this work supports more consistent diagnostic and screening workflows and aligns with the United Nations Sustainable Development Goal 3 (Good Health and Well-Being), particularly in improving disease prevention and diagnostic reliability.
Research gap
Existing NR-IQA methods have demonstrated efficacy in assessing the quality of natural and medical images. However, their suitability for parasitic image assessment, particularly for microscopic images of Cryptosporidium spp. and Giardia spp., has not been established. Parasitic images present distinct visual challenges, such as low contrast, non-uniform textures, and complex background noise, which differ significantly from natural scene distortions. These differences leave a critical gap in developing reliable quality assessment tools for public health and diagnostic applications. This study aims to address this gap by proposing a robust NR-IQA model specifically tailored to Cryptosporidium spp. and Giardia spp. parasite images.
Materials and methods
Training and testing dataset
A total of twenty-three parasite images, each with a resolution of 1376 x 1320 pixels, were obtained from the Department of Parasitology at the University of Malaya, Malaysia, featuring two parasite species: Giardia spp. and Cryptosporidium spp. The images were captured in the RGB color space, with intensity values ranging from 0 to 255, as shown in Fig 1, ensuring the detailed color representation necessary for accurate analysis. These images served as the basis for applying distortion techniques for evaluating both traditional NR-IQA models and the proposed deep convolutional neural network (DCNN)-based models.
The twenty-three reference images were distorted with four types of distortions, namely Gaussian White Noise (GWN), Speckle Noise, Salt and Pepper (SnP), and JPEG Compression, at nine levels. Nine levels of distortion were chosen to comprehensively evaluate the model's ability to manage varying degrees of image quality degradation. These distortion types were used because they are the distortions most commonly encountered in parasite images. GWN, which arises in images from electronic sources, significantly impacts image quality by introducing random variations in pixel intensity, thereby reducing the clarity and accuracy of the captured details [30–32]. Speckle noise is a common occurrence in microscopic images, arising from the interference of scattered light within the sample. This granular pattern can obscure delicate details and reduce image quality, presenting a challenge in accurate cellular imaging [33]. Salt and pepper noise, a prevalent type of impulsive noise, disrupts image quality by introducing random white and black pixels. This interference can obscure details, distort features, and complicate image analysis and segmentation processes [34]. JPEG compression is applied to microscopic images for file storage purposes, and this may introduce artifacts and degradation, potentially affecting segmentation accuracy and subsequent analysis tasks [35,36]. Therefore, it is essential to thoroughly evaluate its impact on image quality, reliability, and the efficacy of automated analysis algorithms. Table 1 shows the types of distortion and their noise levels.
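As an illustration of how the three noise distortions above can be synthesized, the following is a minimal Python/NumPy sketch (the study's own pipeline ran in MATLAB; the function names and the clip-to-uint8 convention here are our own assumptions, not the paper's implementation):

```python
import numpy as np

def add_gwn(img, sigma):
    """Add zero-mean Gaussian white noise with standard deviation sigma."""
    noisy = img.astype(np.float64) + np.random.normal(0.0, sigma, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

def add_salt_and_pepper(img, density):
    """Flip a fraction `density` of pixels to pure black (pepper) or white (salt)."""
    noisy = img.copy()
    mask = np.random.rand(*img.shape[:2])
    noisy[mask < density / 2] = 0                             # pepper
    noisy[(mask >= density / 2) & (mask < density)] = 255     # salt
    return noisy

def add_speckle(img, variance):
    """Multiplicative speckle noise: I_out = I + I * n, with n ~ N(0, variance)."""
    n = np.random.normal(0.0, np.sqrt(variance), img.shape)
    noisy = img.astype(np.float64) * (1.0 + n)
    return np.clip(noisy, 0, 255).astype(np.uint8)
```

Sweeping the `sigma`, `density`, and `variance` parameters over nine increasing values would yield the nine degradation levels per distortion type; JPEG compression would additionally require an image codec and is omitted here.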
In total, 1,058 images were generated, comprising 207 images for each distortion alongside the twenty-three reference images. Fig 2 shows sample reference images with their corresponding distorted images.
Mean Opinion Score (MOS)
Twenty human subjects with normal visual acuity, aged between 22 and 28 years, were chosen to assess the quality of the parasite images. The evaluation followed the guidelines outlined in Rec. ITU-R BT.500-11, conducted within an office setting using a 21-inch LED monitor with a resolution of 1920 x 1080 pixels [37]. Before the assessment, each participant's uncorrected near vision acuity was verified using the Snellen Chart to ensure their suitability for the task.
Subjective evaluation employed the Simultaneous Double Stimulus for Continuous Evaluation (SDSCE) methodology, wherein reference and distorted images were presented side-by-side on the monitor screen [38]. The reference image appeared on the left, while its corresponding distorted version was displayed on the right as shown in Fig 3. Participants assessed the quality of the distorted image relative to the reference, assigning ratings of Excellent (5), Good (4), Fair (3), Poor (2), or Bad (1) accordingly. To minimize bias, numerical scores were not disclosed to the participants. Each evaluation session lasted approximately 15–20 minutes.
The ratings provided by the participants were used to calculate the MOS following established procedures [39]. For each image, the MOS is the average of the scores obtained from all twenty human subjects, as given in Eq. (1):

MOS_j = (1/N) Σ_{i=1}^{N} s_{ij}     (1)

where s_{ij} is the score given by subject i for image j, and N is the number of human subjects. In this study, N = 20 as we have twenty human subjects.
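As a sketch, the per-image MOS averaging described above can be computed in a few lines of Python/NumPy (the array layout is our own illustrative choice):

```python
import numpy as np

def mean_opinion_score(ratings):
    """Compute the per-image MOS.

    ratings: array-like of shape (N_subjects, N_images) holding the
    1-5 scores. Returns the average over subjects for each image.
    """
    ratings = np.asarray(ratings, dtype=np.float64)
    return ratings.mean(axis=0)
```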
Deep Convolutional Neural Network (DCNN)
The study utilized Deep Convolutional Neural Networks (DCNNs), employing nine distinct models: EfficientNet-B0 [40], DarkNet-53 [41,42], Inception-ResNet-v2 [43], DarkNet-19 [44], ResNet-18 [45], ResNet-50 [45], ResNet-101 [45], GoogLeNet [46], and AlexNet [47]. All backbones were initialized with weights from the MATLAB Deep Learning Toolbox (ImageNet: ~1.2 M images, 1000 classes). For inference, each RGB image is resized to the backbone's required input size and normalized with the ImageNet mean and standard deviation, then forwarded through the chosen backbone to its late convolutional block. Activations are converted to a fixed-length representation via global average pooling (GAP), yielding a D-dimensional vector. This vector is mapped to a MOS using a lightweight regression learner trained on the training split; CNN weights remain frozen throughout. We favor the GAP layer because it compresses spatial activations into compact, position-agnostic descriptors that preserve global content while reducing overfitting and computation, properties well suited to downstream MOS regression. Table 2 summarizes, for each backbone, the input size, selected late layer, feature dimensionality, model size and parameter count, the batch setting used for measurement, and the observed runtime characteristics (latency per image, throughput in FPS) together with peak VRAM.
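The pooling step alone can be expressed compactly; the sketch below (Python/NumPy, illustrative only, since the backbones themselves ran in the MATLAB Deep Learning Toolbox) collapses a late-block activation volume into the fixed-length feature vector described above. For ResNet-101, the late block produces a 7 x 7 x 2048 volume, so D = 2048:

```python
import numpy as np

def global_average_pool(activations):
    """Collapse an H x W x D activation volume from a late convolutional
    block into a D-dimensional, position-agnostic feature vector by
    averaging over the two spatial dimensions."""
    return np.asarray(activations, dtype=np.float64).mean(axis=(0, 1))
```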
All experiments were run in MATLAB R2024a with the Deep Learning Toolbox and Deep Network Designer on Windows 10 (64-bit) with an NVIDIA GeForce RTX 3060 Ti (8 GB) GPU, an Intel Core i5-10400 CPU, and 16 GB RAM. GPU acceleration was enabled via the Parallel Computing Toolbox, and inference used FP32 precision. Latency and throughput were profiled on this system. Peak VRAM was recorded with nvidia-smi at steady state. All runs used fixed random seeds for reproducibility.
To assess the robustness and generalization of the models, the study conducted two different cross-validation techniques: 5-fold and 10-fold cross-validation. Utilizing both provides a comprehensive evaluation of model performance and ensures robustness, since cross-validation assesses how well the model generalizes to an independent dataset. The 5-fold cross-validation is computationally faster and useful for quick checks and when computational resources are limited [48]. In contrast, 10-fold cross-validation provides a more thorough evaluation, offering a better bias-variance trade-off and thus more reliable and stable estimates of model performance [49].
Additionally, three different test separation methods were employed, partitioning the dataset randomly into training and testing ratios of 70:30, 80:20, and 90:10, respectively. Using different training and testing images allows for evaluating the impact of the size of the test set on the model's performance and ensures robustness of the results across different proportions of training and test data. A 10% test set leaves a larger portion of the data for training, which can be beneficial for the model's learning, especially with smaller datasets where overfitting can be a concern. However, training with a larger dataset will take longer. On the other hand, 20% and 30% test sets provide a more robust and reliable assessment of the model's performance [50–52].
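The two evaluation protocols above (k-fold cross-validation and random hold-out splits) can be sketched as index generators; this is a minimal Python/NumPy illustration, not the MATLAB code used in the study:

```python
import numpy as np

def kfold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation
    over a randomly permuted sample order."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    folds = np.array_split(idx, k)
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, test_idx

def holdout_split(n_samples, test_fraction, seed=0):
    """Random train/test partition, e.g. test_fraction=0.3 for a 70:30 split."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_test = int(round(n_samples * test_fraction))
    return idx[n_test:], idx[:n_test]
```

Calling `kfold_indices(n, 5)` and `kfold_indices(n, 10)` reproduces the 5-fold and 10-fold protocols, while `holdout_split(n, 0.3)`, `(n, 0.2)`, and `(n, 0.1)` correspond to the 70:30, 80:20, and 90:10 partitions.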
Following feature extraction through CNNs, the study employed various machine learning algorithms to map the features to MOS, ensuring that the proposed NR-IQA could give quality scores similar to human evaluation. The machine learning algorithms used in this study include ten regression models: Linear Regression, Linear Support Vector Machines (SVM), Quadratic SVM, Cubic SVM, Fine Gaussian SVM, Medium Gaussian SVM, Coarse Gaussian SVM, Fine Tree, Medium Tree and Coarse Tree. Table 3 summarizes the purpose and characteristics of each regression model.
These regression models were selected to capture a wide range of relationships between the deep CNN features and the subjective MOS scores, from simple linear trends to complex nonlinear patterns. Linear and polynomial SVM models enable testing of different degrees of nonlinearity, while Gaussian SVM variants provide flexibility in modeling localized variations through the choice of kernel scale. Decision trees with varying depth were included to assess performance trade-offs between fine-grained modeling and overfitting control.
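To illustrate the feature-to-MOS mapping in its simplest form, here is a sketch of the first of the ten regressors, ordinary linear regression, via a closed-form least-squares fit (the SVM and tree variants were trained in MATLAB's Regression Learner; this Python/NumPy version is illustrative only and the function names are our own):

```python
import numpy as np

def fit_linear_regressor(features, mos):
    """Least-squares fit of MOS ≈ features @ w + b.
    features: (n_images, D) CNN feature matrix; mos: (n_images,) targets.
    Returns the stacked coefficient vector [w; b]."""
    X = np.hstack([features, np.ones((features.shape[0], 1))])
    coef, *_ = np.linalg.lstsq(X, mos, rcond=None)
    return coef

def predict_mos(features, coef):
    """Apply the fitted linear model to new feature vectors."""
    X = np.hstack([features, np.ones((features.shape[0], 1))])
    return X @ coef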
Predicted scores were then compared against the ground-truth MOS obtained through subjective evaluation using performance metrics, which quantify the accuracy and reliability of the predicted image quality scores from the best-performing model. In this study, the Pearson linear correlation coefficient (PLCC), Root Mean Square Error (RMSE), and Spearman rank-order correlation coefficient (SROCC) were used as performance metrics.
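The three metrics can be computed directly with SciPy and NumPy; this is a minimal sketch (the study computed them in MATLAB):

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

def iqa_metrics(predicted, mos):
    """Return (PLCC, SROCC, RMSE) between predicted quality scores
    and ground-truth MOS."""
    predicted = np.asarray(predicted, dtype=np.float64)
    mos = np.asarray(mos, dtype=np.float64)
    plcc = pearsonr(predicted, mos)[0]
    srocc = spearmanr(predicted, mos)[0]
    rmse = np.sqrt(np.mean((predicted - mos) ** 2))
    return plcc, srocc, rmse
```

A perfect predictor yields PLCC = SROCC = 1 and RMSE = 0, which is the target the regression configurations are ranked against.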
Based on the highest values of PLCC and SROCC and the lowest value of RMSE, the ResNet-101 model combined with Cubic SVM regression was chosen as the proposed method to predict the image quality score of the parasite images.
The whole experimental workflow is shown in Fig 4 to give a clear understanding of the experimental design, backbone selection, regression learning, and evaluation procedure. This workflow summarizes the steps to identify the most suitable CNN backbone and regression configuration prior to establishing the final framework.
Layer-wise CKA analysis
Since distortions can alter feature representation, it is important to measure how closely these representations align under varying conditions. To quantify the impact of distortions on learned representations, we employ linear Centered Kernel Alignment (CKA) [53], a similarity metric that compares centered Gram matrices of feature activations and remains robust across model architectures [54]. For each backbone, we investigate three depths (Early/Mid/Late) as shown in Table 4. For every clean image, we pair it with its distorted counterpart for each distortion (JPEG, GWN, Speckle, SnP) and compute CKA between clean and distorted global pooled features (per-channel spatial average). CKA ∈ [0,1], where higher values indicate greater representational similarity. For compact reporting, we present four depth-wise bar charts (one per distortion), showing Early/Mid/Late CKA for each backbone on a common [0,1] scale.
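Linear CKA between two aligned feature matrices has a compact closed form; the following Python/NumPy sketch (our own illustrative implementation of the metric from [53], not the study's code) compares clean and distorted feature sets row-by-row:

```python
import numpy as np

def linear_cka(X, Y):
    """Linear Centered Kernel Alignment between feature matrices
    X (n x d1) and Y (n x d2), with rows aligned sample-by-sample.
    Returns a similarity in [0, 1]; 1 indicates identical
    representations up to an orthogonal transform and scaling."""
    X = X - X.mean(axis=0)          # center features
    Y = Y - Y.mean(axis=0)
    hsic = np.linalg.norm(X.T @ Y, 'fro') ** 2
    norm_x = np.linalg.norm(X.T @ X, 'fro')
    norm_y = np.linalg.norm(Y.T @ Y, 'fro')
    return hsic / (norm_x * norm_y)
```

In the analysis described above, `X` would hold the globally pooled clean-image features at a given depth and `Y` the corresponding distorted-image features, yielding one CKA value per (backbone, depth, distortion) triple.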
Proposed framework
The pipeline of proposed Parasite ResNet-101 Image Quality Assessment (PRIQA) framework in Fig 5 starts with input parasite images (1376 x 1032 x 3) undergoing preprocessing steps, including resizing (224 x 224 x 3), to ensure consistent input dimensions for the ResNet-101 backbone. The feature extraction block utilizes convolutional layers to extract hierarchical features of varying resolutions. These features are passed through multiple predictors to generate the required outputs for training and testing the regression models.
The extracted features are mapped to subjective MOS using a regression training block. Once trained, the regression models predict image quality scores during testing. The predicted image quality score, which ranges between 1–5, is the PRIQA score which quantifies the quality of the image. In this scoring scheme, a score of 5 indicates excellent image quality, while a score of 1 reflects extremely poor image quality. Thus, higher scores correspond to better visual quality, as perceived by human evaluators. Performance analysis is conducted by comparing the PRIQA scores with ground truth MOS obtained from subjective evaluations. Metrics such as PLCC, SROCC, and RMSE are used to validate the efficacy of the proposed framework. This modular design ensures that the framework is scalable and adaptable for various image quality assessment tasks, emphasizing its robustness and reliability for parasite image evaluation.
The frameworks for Parasite EfficientNet-B0 Image Quality Assessment (PEIQA) and Parasite DarkNet-53 Image Quality Assessment (PDIQA) follow the same pipeline as PRIQA shown in Fig 5, with the only difference being the CNN backbone used for feature extraction (EfficientNet-B0 and DarkNet-53, respectively). Their extracted features are similarly mapped to MOS using the same set of regression models. The regressor is explained in detail in the Results and Discussions section. This approach allows for a direct and fair comparison between PRIQA, PEIQA, and PDIQA in the subsequent analyses.
No-Reference Image Quality Assessment (NR-IQA)
Following the training and testing phase using DCNNs, the performance of the proposed PRIQA model was further evaluated using another dataset consisting of 125 test images. The PRIQA model was compared with ten state-of-the-art NR-IQA models: EBIQA [55], NIQE [56], BIQI [57], EPIQA [58], NBIQA [59], ARNIQA [20], ILNIQE [21], BLIINDS II [60], PIQE [61], and BRISQUE [62]. Table 5 shows the development chronology of the ten NR-IQA models along with the models' descriptions.
We include these ten NR-IQA baselines as strong and reproducible reference points. They were chosen by four criteria: (i) high citation impact, (ii) public code/pretrained weights with permissive licenses, (iii) reproducibility on commodity CPU/GPU (documented dependencies), and (iv) method diversity (classic NSS vs. modern deep CNN). These criteria ensure transparent replication and complementary inductive biases.
Performance metrics were used to assess the closeness of the quality scores computed by these NR-IQA metrics to the MOS. The predicted scores are considered close to the MOS when the PLCC and SROCC are close to 1 and the RMSE value is the lowest. The metric that predicts closest to the MOS is the most suitable for parasite images.
Evaluation dataset
The 125 evaluation images were generated using the same distortion types with different levels of distortion to avoid bias. The predicted quality scores and MOS were obtained for these images, and PLCC, SROCC, and RMSE were calculated.
Ethics statement
This study involved human participants solely for the purpose of subjective image quality evaluation. All participants were adults who voluntarily participated, and no personally identifiable or sensitive information was collected. Each participant received a written declaration form outlining the study's purpose, procedures, potential risks, and their rights, including the option to withdraw at any time without penalty. Consent was documented by the participant's signature on the declaration form prior to participation, and signed forms are retained in the study records. The study consisted exclusively of visual inspection tasks, with no direct human interaction or intervention. These subjective evaluation settings followed those of Rajagopal et al., in which participants were not required to sign any declaration form [18,38,63,64]. Our survey data were completely anonymous, did not involve sensitive information, and were used purely for internal method validation.
Full-Reference Image Quality Assessment (FR-IQA)
To validate the subjective MOS collected in this study, additional benchmarking was performed using established full-reference image quality assessment (FR-IQA) metrics. Specifically, the Structural Similarity Index (SSIM), Multiscale SSIM (MSSIM), Feature Similarity Index (FSIM), and Information Weighted SSIM (IWSSIM) were computed for all distorted images relative to their pristine reference images. A previous study has shown that combining subjective and objective IQA techniques is beneficial when assessing microscopy images, including parasite samples [64].
These FR-IQA metrics quantify structural similarity between reference and distorted images, serving as objective predictors of perceived quality. Table 6 summarizes the key characteristics of each FR-IQA metric included in this study. The resulting FR-IQA scores were statistically compared with human MOS values by computing the Pearson Linear Correlation Coefficient (PLCC) and Spearman Rank-Order Correlation Coefficient (SROCC). High correlation values were interpreted as evidence that the subjective MOS aligns with established objective measures, thereby validating the use of MOS as training targets for regression models in the proposed framework.
Statistical significance testing of model performance
Non-parametric statistical tests were employed to evaluate whether differences in model performance metrics were statistically meaningful. Initially, the Friedman test was applied to assess the null hypothesis that all models performed equivalently across evaluation conditions. Upon rejection of this null hypothesis, pairwise comparisons were conducted using the Nemenyi test to identify which specific model pairs exhibit statistically significant differences in average ranks. The Nemenyi test is well-suited for multiple comparison procedures following Friedman tests and has been widely used for model performance comparisons without assuming normality [65].
Additionally, the Wilcoxon signed-rank test was applied to compare the proposed CNN-based models (PRIQA, PEIQA, PDIQA) against established NR-IQA baselines in a pairwise manner. This test evaluates whether the median of the paired differences between two models differs significantly from zero. A significance level of α = 0.05 was used for all tests. Following established practice [66], p-values were interpreted such that p < 0.05 indicates rejection of the null hypothesis in favor of the alternative, meaning that the difference in model performance is statistically significant. Conversely, p ≥ 0.05 was interpreted as insufficient evidence to reject the null hypothesis, indicating no statistically significant difference.
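A minimal sketch of this testing procedure using SciPy, with hypothetical per-condition scores for three models (all variable names and data are illustrative, not study results):

```python
import numpy as np
from scipy.stats import friedmanchisquare, wilcoxon

rng = np.random.default_rng(1)

# Hypothetical paired performance scores (e.g., per-fold SROCC) for two
# baselines and one proposed model over the same 12 evaluation conditions.
baseline_a = rng.normal(0.70, 0.02, 12)
baseline_b = rng.normal(0.72, 0.02, 12)
proposed = baseline_a + 0.10 + rng.normal(0, 0.01, 12)  # consistently better

# Friedman test: null hypothesis that all models perform equivalently.
_, p_friedman = friedmanchisquare(baseline_a, baseline_b, proposed)

# Wilcoxon signed-rank: paired comparison of the proposed model vs a baseline.
_, p_wilcoxon = wilcoxon(proposed, baseline_a)

alpha = 0.05
print(f"Friedman p = {p_friedman:.4f}, Wilcoxon p = {p_wilcoxon:.4f}")
print("significant" if p_wilcoxon < alpha else "not significant")
```

For the Nemenyi post-hoc step after a significant Friedman result, a dedicated implementation such as `posthoc_nemenyi_friedman` from the `scikit-posthocs` package is commonly used.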
These statistical analyses ensured that observed differences in performance metrics such as PLCC, SROCC, and RMSE were not attributable to random variability, supporting claims of genuine model performance advantages.
Results and discussions
The Relationship between MOS and different distortion levels
The relationship between MOS and distortion level for the four distortion types, namely Gaussian White Noise (GWN), Salt & Pepper (SnP) noise, Speckle noise, and JPEG compression, is illustrated in Fig 6. Higher MOS values indicate better perceived image quality, while higher distortion levels correspond to more severe degradation.
(a) Gaussian White Noise (GWN), (b) Salt & Pepper Noise (SnP), (c) Speckle Noise, and (d) JPEG compression. These plots illustrate the relationship between increasing distortion intensity and subjective image quality as perceived by human evaluators.
As seen in the scatter plots, MOS generally decreases with increasing distortion level, particularly for GWN and Salt & Pepper distortions, indicating that human evaluators were sensitive to the degradation introduced by these distortion types. In contrast, the MOS for Speckle noise and JPEG compression remained relatively consistent across distortion levels, implying that the visual impact of these distortions was either less distinguishable or not perceived as severe by human evaluators. These trends confirm that subjective quality ratings typically correlate inversely with distortion severity, except where the artifacts are visually subtle or harder to detect.
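For context, the four distortion families can be generated synthetically; the following sketch uses scikit-image and Pillow with illustrative severity parameters (not the study’s exact settings or images):

```python
import numpy as np
from io import BytesIO
from PIL import Image
from skimage.util import random_noise

rng = np.random.default_rng(2)
image = rng.random((64, 64))  # stand-in for a grayscale parasite micrograph

# Three noise families via scikit-image; var/amount control severity.
gwn = random_noise(image, mode="gaussian", var=0.05)
snp = random_noise(image, mode="s&p", amount=0.10)
speckle = random_noise(image, mode="speckle", var=0.05)

# JPEG compression requires an encoder; here Pillow at an aggressive quality.
buf = BytesIO()
Image.fromarray((image * 255).astype(np.uint8)).save(buf, format="JPEG", quality=10)
jpeg = np.asarray(Image.open(buf)) / 255.0
```

Sweeping `var`, `amount`, and `quality` over several levels yields the distortion ladder that evaluators rated to produce the MOS values.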
Correlation between MOS and FR-IQA metrics
To assess the validity of the collected MOS values, comparisons were made with FR-IQA scores obtained using SSIM, MSSIM, FSIM, and IWSSIM. Based on Table 7, the computed PLCC and SROCC values exceeded 0.68 for all four FR-IQA metrics; a previous study reported that a correlation coefficient greater than 0.68 indicates that subjective MOS and FR-IQA scores are closely aligned [38]. The results therefore show a strong positive correlation between subjective evaluations and objective structural metrics, validating the use of MOS as a reliable target for training regression models and demonstrating that subjective quality judgments of parasite images are consistent with established FR-IQA methods.
Table 7 shows that the subjective MOS values correlate strongly and positively with all FR-IQA metrics evaluated: PLCC values were above 0.88 and SROCC values above 0.74, with SSIM and MSSIM showing the highest values. These results suggest that the MOS scores used in this study capture quality differences in a way that aligns with established objective measures, supporting their use as a reliable ground truth for training regression models. Overall, these findings confirm that the approach can model human-perceived image quality in parasite images, which is important for developing automated inspection tools.
Performance of the Deep Convolutional Neural Network (DCNN)
Feature extraction was performed using the respective DCNN architectures, and multiple regression methods were subsequently applied to map the extracted features to MOS. The best regression model for each DCNN was identified based on achieving the highest PLCC and SROCC values, together with the lowest RMSE. Table 8 presents the selected DCNN architectures along with their corresponding regression models, cross-validation folds, and dataset splits.
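The feature-to-MOS mapping was implemented with MATLAB regression learners; a rough scikit-learn analogue is sketched below, where a polynomial-kernel SVR of degree 3 stands in for the Cubic SVM and the features and MOS targets are synthetic placeholders:

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)

# Stand-ins for DCNN features (e.g., pooled deep-layer activations) and MOS;
# the synthetic quality score depends on the first four feature dimensions.
n_images, n_features = 200, 16
X = rng.normal(size=(n_images, n_features))
mos = X[:, :4].sum(axis=1) + rng.normal(0, 0.1, n_images)

# Cubic-kernel SVR as a stand-in for the Cubic SVM regressor; coef0=1 keeps
# the lower-order terms of the polynomial kernel.
model = make_pipeline(
    StandardScaler(),
    SVR(kernel="poly", degree=3, coef0=1, C=10.0, epsilon=0.1),
)
scores = cross_val_score(model, X, mos, cv=10, scoring="r2")
print(f"10-fold CV mean R^2 = {scores.mean():.3f}")
```

In the actual framework, each DCNN backbone would supply `X`, and PLCC, SROCC, and RMSE would be computed on held-out predictions instead of R².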
Following the model selection process, regression performance of each DCNN was evaluated using PLCC, SROCC, and RMSE metrics, as shown in Table 9. Speckle noise and JPEG compression recorded lower PLCC and SROCC values, indicating that these distortions are harder for the models to assess accurately. The lower correlation for JPEG compression is consistent with previous findings in medical imaging IQA [30], while Speckle noise remains an underexplored challenge.
ResNet-101, EfficientNet-B0, and DarkNet-53 achieved higher PLCC and SROCC values with lower RMSE for Gaussian White, Salt and Pepper, and Speckle noises. This suggests these architectures are better suited for quantifying these types of distortions. Among them, ResNet-101 delivered the best overall performance, with PLCC of 0.993, SROCC of 0.921, and RMSE of 0.188. DarkNet-53 and EfficientNet-B0 also performed strongly, closely following ResNet-101 in predictive accuracy.
In addition to predictive accuracy, Table 9 reports each model’s prediction speed and training time. These factors are critical for practical deployment, especially in field laboratories or real-time inspection systems. While ResNet-101 achieved the highest accuracy, its prediction speed was moderate (291.36 images/s) compared with faster models such as ResNet-18 (2544.03 images/s) and GoogLeNet (1514.10 images/s).
EfficientNet-B0 and DarkNet-53 offered a balanced trade-off, combining high accuracy with faster prediction speeds and shorter training times. These results suggest that while ResNet-101 is ideal for accuracy-critical tasks, EfficientNet-B0 and DarkNet-53 may be more practical choices for real-time or resource-constrained deployments.
Analysis of ablation
An ablation study was conducted to investigate the impact of different feature extraction layers within each deep convolutional neural network (DCNN) on the overall performance of the regression models. This analysis aimed to determine the most suitable layer that provides high-quality features for predicting image quality in the PRIQA, PEIQA, and PDIQA models.
For each CNN backbone (ResNet-101, EfficientNet-B0, and DarkNet-53), features were extracted from multiple layers, spanning from early to the deepest layers. These features were subsequently passed to the corresponding regressors, and their performance was evaluated using three standard metrics: PLCC, RMSE, and SROCC on the testing dataset. The results in Table 10 summarize the ablation study outcomes across the top three CNN backbones, detailing the predictive performance of each extracted layer.
For ResNet-101, the final layer pool5 achieved the highest PLCC (0.942) and SROCC (0.895), along with the lowest RMSE (0.434), indicating a strong correlation with the subjective MOS. Similarly, for DarkNet-53, the avg1 layer demonstrated the best overall performance. In the case of EfficientNet-B0, the global_average_pool layer outperformed earlier layers, achieving a PLCC of 0.938 and SROCC of 0.882.
These findings suggest that deeper layers capture more abstract and semantically rich features that align more closely with human perception of image quality. Therefore, the deepest and best-performing layer from each of the top three CNNs was selected as the final feature representation for the PRIQA, PEIQA, and PDIQA models.
Analysis of sensitivity
A sensitivity analysis was conducted to examine the impact of key hyperparameters on the regression performance of the top three selected CNN–regressor pairs: ResNet-101 with Cubic SVM, DarkNet-53 with Cubic SVM, and EfficientNet-B0 with Quadratic SVM. This analysis aimed to assess the robustness and reliability of the PRIQA model outputs under controlled perturbations in support vector machine (SVM) parameters. The three hyperparameters studied were Epsilon, Box Constraint, and Kernel Scale. For each parameter, the baseline value was varied by ±25% and ±50%, and model performance was evaluated using Root Mean Square Error (RMSE) on the testing dataset.
To further validate the selection of hyperparameters, the sensitivity study systematically replaced the default “auto” configuration with manual tuning across the three parameters. It was observed that lowering the box constraint, increasing epsilon, and increasing the kernel scale produced a more rigid model, reducing the risk of overfitting but also potentially limiting flexibility. The results, visualized in Fig 7, revealed that while minor gains in RMSE could be achieved through fine-tuning, the performance was generally consistent around the baseline, indicating robustness. Notably, EfficientNet-B0 demonstrated greater sensitivity to box constraint adjustments, while ResNet-101 and DarkNet-53 exhibited more stable performance under parameter shifts.
(a–c) ResNet-101 with Cubic SVM for epsilon, box constraint, and kernel scale, respectively; (d–f) DarkNet-53 with Cubic SVM for epsilon, box constraint, and kernel scale, respectively; (g–i) EfficientNet-B0 with Quadratic SVM for epsilon, box constraint, and kernel scale, respectively.
These findings support the original decision to use the “auto” setting in the main experiments, as it provided a reliable balance between generalization and precision without the risk of overfitting specific distortions. The adaptive nature of the automatic configuration allowed the model to dynamically determine suitable hyperparameter values for the dataset, which is especially useful in scenarios involving unseen parasite image distributions. Therefore, the default “auto” setting remains a practical and effective strategy for real-world deployment of the PRIQA, PEIQA, and PDIQA frameworks.
Depth-wise robustness patterns using CKA
The similarity between the learned representations is quantified using Centered Kernel Alignment (CKA). Depth-wise CKA results for GWN, SnP, Speckle, and JPEG distortions are illustrated in Fig 8. Based on this analysis, ResNet-101 was selected as the backbone for the proposed PRIQA framework. At the first layer, ResNet-101 is highly sensitive to additive noise (GWN = 0.00, SnP = 0.04), while maintaining moderate-to-high similarity under JPEG (0.84) and near-identity similarity under Speckle (1.00). Substantial recovery is observed by mid depth (JPEG = 0.99, Speckle = 0.994, SnP = 0.75, GWN = 0.64), indicating progressive invariance with depth. At late depth, ResNet-101 shows near-identity similarity under JPEG and Speckle (0.99–1.00) and strong robustness to additive noise, achieving similarity scores of 0.91 for GWN and 0.97 for SnP. In contrast, several alternatives exhibit weaker late-layer alignment under additive noise, including AlexNet (GWN = 0.06), ResNet-50 (GWN = 0.62), and GoogLeNet (GWN = 0.60), with similarly reduced robustness under SnP distortions. Given that microscopy images frequently suffer from sensor and illumination noise, late-layer robustness is critical for reliable quality assessment. Combined with mature tooling and a stable 2048-D feature at res5c_branch2c or pool5, these findings support ResNet-101 as the most reliable extractor for small-sample datasets.
Bars denote Early, Mid, and Late layers for each model in four panels: (a) GWN, (b) Salt and Pepper (SnP), (c) Speckle, and (d) JPEG. Values ∈ [0,1]; higher = more similar.
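Linear CKA [53] has a compact closed form; a minimal NumPy implementation is shown below, applied to synthetic activations (in the actual analysis, the two matrices would be a layer’s activations on clean versus distorted versions of the same images):

```python
import numpy as np

def linear_cka(X, Y):
    """Linear Centered Kernel Alignment between two activation matrices.

    X, Y: (n_samples, n_features_*) activations for the same inputs.
    Returns a similarity in [0, 1]; 1 means identical up to rotation/scale.
    """
    X = X - X.mean(axis=0)  # center features
    Y = Y - Y.mean(axis=0)
    hsic = np.linalg.norm(Y.T @ X, "fro") ** 2
    norm_x = np.linalg.norm(X.T @ X, "fro")
    norm_y = np.linalg.norm(Y.T @ Y, "fro")
    return hsic / (norm_x * norm_y)

rng = np.random.default_rng(5)
clean = rng.normal(size=(100, 32))                # activations on clean inputs
noisy = clean + rng.normal(0, 2.0, clean.shape)   # same layer, distorted inputs

print(f"CKA(clean, clean) = {linear_cka(clean, clean):.2f}")
print(f"CKA(clean, noisy) = {linear_cka(clean, noisy):.2f}")
```

Computing this score per layer and per distortion type yields exactly the Early/Mid/Late bars reported in Fig 8.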
Computational complexity and deployment implications
To complement the accuracy analysis, Table 2 reports parameters, latency, throughput, and VRAM to clarify the operational cost of each backbone. When these factors are considered together with the accuracy results, ResNet-101 emerges as the most suitable choice, combining robust late-layer representations and the highest overall accuracy with manageable runtime on commodity hardware (RTX 3060 Ti). For scenarios constrained by throughput or memory, EfficientNet-B0 provides a favorable accuracy–efficiency trade-off and is well suited to field labs or embedded workstations.
Performance of PRIQA and state-of-the-art NR-IQA models
The ablation, sensitivity, and CKA analyses showed that PRIQA outperformed the other model configurations. To further verify its performance, PRIQA was compared with ten state-of-the-art NR-IQA models. Based on the performance metrics presented in Table 11, the three best-performing models, namely PEIQA (Parasite EfficientNet-B0 Image Quality Assessment), PRIQA (Parasite ResNet-101 Image Quality Assessment), and PDIQA (Parasite DarkNet-53 Image Quality Assessment), were selected for further evaluation on the independent evaluation dataset and compared against the ten NR-IQA baselines using the same performance metrics. This comparison aimed to identify the most suitable model for assessing the quality of parasite images, specifically targeting Cryptosporidium spp. and Giardia spp. The comparative results for PEIQA, PRIQA, PDIQA, and the NR-IQA models are summarized in Table 11, with the best values highlighted in bold. PRIQA consistently demonstrated superior performance, achieving the highest PLCC and SROCC values as well as the lowest RMSE, indicating its efficacy in predicting image quality. The enhanced performance of PRIQA with 10-fold cross-validation is attributed to its ability to leverage a larger training dataset in each fold, thereby facilitating more robust learning and improved generalization.
Statistical significance test
To evaluate the significance of performance differences between the proposed CNN-based models (PEIQA, PDIQA, and PRIQA) and existing NR-IQA baselines, two complementary non-parametric statistical tests were conducted. First, the Wilcoxon signed-rank test was applied to compare PRIQA specifically against each baseline. This test provides a focused evaluation of whether PRIQA achieves consistent performance improvements over other methods. Table 12 summarizes which pairwise comparisons were statistically significant, with significance determined at p < 0.05. Second, the Nemenyi post-hoc test was used for multiple pairwise comparisons among all models to determine whether differences in average ranks were statistically significant. This test offers a broader comparison across all tested models. Table 13 presents the pairwise significance matrix from the Nemenyi test, where a value of ‘1’ indicates a significant difference in ranking between two models, and ‘0’ indicates no significant difference.
The Wilcoxon signed-rank test results in Table 12 show that PRIQA achieved statistically significant improvements over all traditional NR-IQA baselines, reinforcing its superiority for parasite image quality assessment. The only exception was its comparison with PEIQA, which showed no significant difference, suggesting these two CNN-based approaches deliver statistically comparable performance. The results of the Nemenyi test in Table 13 further support these findings by showing that the proposed CNN-based models generally achieved significantly different and better ranks compared to traditional NR-IQA baselines, as indicated by the numerous ‘1’s in their comparisons with existing methods. In contrast, comparisons among PEIQA, PDIQA, and PRIQA themselves show frequent ‘0’s, suggesting no statistically significant difference in ranks among these proposed models. This indicates they perform similarly well in ranking-based evaluations.
Conclusion
This study proposes the first dedicated NR-IQA framework developed explicitly for parasitic microscopy images, with particular emphasis on Cryptosporidium spp. and Giardia spp., which pose significant public health risks due to their association with waterborne disease outbreaks. The proposed PRIQA framework, based on a ResNet-101 backbone and trained using MOS, demonstrated strong performance across standard IQA metrics (PLCC, SROCC, and RMSE), particularly under 10-fold cross-validation. Compared with PEIQA, PDIQA, and ten state-of-the-art NR-IQA methods, PRIQA consistently achieved higher accuracy in predicting distortion severity without requiring pristine reference images. This enables automated identification of low-quality microscopy images prior to downstream analysis, reducing manual rescreening, mitigating missed detections caused by poor image quality, and supporting more consistent diagnostic decisions across laboratories.
Layer-wise CKA analysis further supports the selection of ResNet-101, showing high feature stability at deeper layers under common distortions (e.g., JPEG, speckle, and additive noise), which facilitates reliable MOS regression while maintaining manageable computational cost for deployment.
Despite its competitive performance, this work has limitations that motivate future research. The current dataset comprises synthetically distorted images under controlled conditions, and model performance on naturally distorted or clinically acquired images from diverse microscopes remains to be validated. Future studies should incorporate broader real-world datasets, explore end-to-end deep learning architectures for joint feature learning and quality prediction, and investigate lightweight or real-time implementations suitable for resource-constrained or point-of-care settings.
Overall, this work advances microscopic image quality assessment by introducing a parasite-specific NR-IQA framework that supports quality-aware automated microscopy and diagnostic workflows. By facilitating more reliable image-based screening for parasitic infections, the study aligns with the United Nations Sustainable Development Goal 3 (Good Health and Well-Being) and indirectly supports United Nations Sustainable Development Goal 6 (Clean Water and Sanitation) through improved monitoring of waterborne pathogens.
References
- 1. Lévêque L, Outtas M, Liu H, Zhang L. Comparative study of the methodologies used for subjective medical image quality assessment. Phys Med Biol. 2021;66(15):10.1088/1361-6560/ac1157. pmid:34225264
- 2. Parihar AS, Gupta S. Dehazing optically haze images with AlexNet-FNN. J Opt. 2023;53(1):294–303.
- 3. Momin A, Kondo N, Al Riza DF, Ogawa Y, Obenland D. A Methodological Review of Fluorescence Imaging for Quality Assessment of Agricultural Products. Agriculture. 2023;13(7):1433.
- 4. Xu W, Zhai Q, Liu J, Xu X, Hua J. A lightweight deep-learning model for parasite egg detection in microscopy images. Parasit Vectors. 2024;17(1):454. pmid:39501374
- 5. Yakoob J, Abbas Z, Beg MA, Naz S, Khan R, Islam M, et al. Prevalences of Giardia lamblia and Cryptosporidium parvum infection in adults presenting with chronic diarrhoea. Ann Trop Med Parasitol. 2010;104(6):505–10. pmid:20863439
- 6. Bukhari Z, Smith HV, Sykes N, Humphreys SW, Paton CA, Girdwood RWA, et al. Occurrence of Cryptosporidium spp oocysts and Giardia spp cysts in sewage influents and effluents from treatment plants in England. Water Science and Technology. 1997;35(11–12):385–90.
- 7. Moussa AS, Ashour AA, Soliman MI, Taha HA, Al-Herrawy AZ, Gad M. Fate of Cryptosporidium and Giardia through conventional and compact drinking water treatment plants. Parasitol Res. 2023;122(11):2491–501. pmid:37632544
- 8. Guo Y, Hu M, Min X, Wang Y, Dai M, Zhai G, et al. Blind Image Quality Assessment for Pathological Microscopic Image Under Screen and Immersion Scenarios. IEEE Trans Med Imaging. 2023;42(11):3295–306. pmid:37267133
- 9. Wang X, Liu L, Du X, Zhang J, Ni G, Liu J. GMANet: Gradient Mask Attention Network for Finding Clearest Human Fecal Microscopic Image in Autofocus Process. Applied Sciences. 2021;11(21):10293.
- 10. Quintana-Quintana L, Ortega S, Fabelo H, Balea-Fernández FJ, Callico GM. Blur-specific image quality assessment of microscopic hyperspectral images. Opt Express. 2023;31(8):12261–79. pmid:37157389
- 11. Kahraman İ, Karaş İR, Turan MK. Real-Time Protozoa Detection from Microscopic Imaging Using YOLOv4 Algorithm. Applied Sciences. 2024;14(2):607.
- 12. Qian W, Li J, Zhu J, Hao W, Chen L. Distortion correction of a microscopy lens system for deformation measurements based on speckle pattern and grating. Optics and Lasers in Engineering. 2020;124:105804.
- 13. Sheikh HR, Bovik AC. Image information and visual quality. IEEE Trans Image Process. 2006;15(2):430–44. pmid:16479813
- 14. Jain P, Shikkenawis G, Mitra SK. Natural Scene Statistics And CNN Based Parallel Network For Image Quality Assessment. In: 2021 IEEE International Conference on Image Processing (ICIP), 2021. 1394–8.
- 15. Jiang Q, Gu Y, Li C, Cong R, Shao F. Underwater Image Enhancement Quality Evaluation: Benchmark Dataset and Objective Metric. IEEE Trans Circuits Syst Video Technol. 2022;32(9):5959–74.
- 16. Kastryulin S, Zakirov J, Pezzotti N, Dylov DV. Image Quality Assessment for Magnetic Resonance Imaging. IEEE Access. 2023;11:14154–68.
- 17. Qi C, Wang S, Yu H, Zhang Y, Hu P, Tan H, et al. An artificial intelligence-driven image quality assessment system for whole-body [18F]FDG PET/CT. Eur J Nucl Med Mol Imaging. 2023;50(5):1318–28. pmid:36529840
- 18. Rajagopal H, Mokhtar N, Khairuddin ASM. Image Quality Assessment for Wood Images. In: 2022 IEEE International Conference on Artificial Intelligence in Engineering and Technology (IICAIET), 2022. 1–5.
- 19. Varga D. A Human Visual System Inspired No-Reference Image Quality Assessment Method Based on Local Feature Descriptors. Sensors (Basel). 2022;22(18):6775. pmid:36146123
- 20. Agnolucci L, Galteri L, Bertini M, Del Bimbo A. ARNIQA: Learning Distortion Manifold for Image Quality Assessment. 2023.
- 21. Zhang L, Zhang L, Bovik AC. A feature-enriched completely blind image quality evaluator. IEEE Trans Image Process. 2015;24(8):2579–91. pmid:25915960
- 22. Sun Q, Liu Z, Ding T, Shi C, Hou N, Sun C. Machine Learning-Based Objective Evaluation Model of CTPA Image Quality: A Multi-Center Study. Int J Gen Med. 2025;18:997–1005. pmid:40026813
- 23. Hu Z, Yang G, Du Z, Huang X, Zhang P, Liu D. No-reference image quality assessment based on global awareness. PLoS One. 2024;19(10):e0310206. pmid:39374211
- 24. Jeong HK, Park C, Jiang SW, Nicholas M, Chen S, Henao R, et al. Image Quality Assessment Using Convolutional Neural Network in Clinical Skin Images. JID Innov. 2024;4(4):100285. pmid:39036289
- 25. Oh R, Park UC, Park KH, Park SJ, Yoon CK. Deep learning-based automatic image quality assessment in ultra-widefield fundus photographs. BMJ Open. 2025;15(5):e100058. pmid:40398939
- 26. Herath HMSS, Herath HMKKMB, Madusanka N, Lee B-I. A Systematic Review of Medical Image Quality Assessment. J Imaging. 2025;11(4):100. pmid:40278016
- 27. Boit S, Patil R. An Efficient Deep Learning Approach for Malaria Parasite Detection in Microscopic Images. Diagnostics (Basel). 2024;14(23):2738. pmid:39682645
- 28. Aytac O, Senol FF, Tuncer I, Dogan S, Tuncer T. An innovative approach to parasite classification in biomedical imaging using neural networks. Engineering Applications of Artificial Intelligence. 2025;143:110014.
- 29. Mirzaei O, Ilhan A, Guler E, Suer K, Sekeroglu B. Comparative Evaluation of Deep Learning Models for Diagnosis of Helminth Infections. JPM. 2025;15(3):121.
- 30. Luisier F, Blu T, Unser M. Image denoising in mixed Poisson-Gaussian noise. IEEE Trans Image Process. 2011;20(3):696–708. pmid:20840902
- 31. Boulanger J, Kervrann C, Bouthemy P, Elbau P, Sibarita J-B, Salamero J. Patch-based nonlocal functional for denoising fluorescence microscopy image sequences. IEEE Trans Med Imaging. 2010;29(2):442–54. pmid:19900849
- 32. Škorić T, Pantelić D, Jelenković B, Bajić D. Noise reduction in two-photon laser scanned microscopic images by singular value decomposition with copula threshold. Signal Processing. 2022;195:108486.
- 33. Kohlfaerber T, Pieper M, Münter M, Holzhausen C, Ahrens M, Idel C, et al. Dynamic microscopic optical coherence tomography to visualize the morphological and functional micro-anatomy of the airways. Biomed Opt Express. 2022;13(6):3211–23. pmid:35781952
- 34. Abdennouri A, Zouaoui E, Ferkous H, Hamza A, Grimes M, Boukabou A. An improved Symmetric Chaotic War strategy optimization algorithm for efficient Scanning electron microscopy image segmentation: Calcium oxide catalyst case. Chemometrics and Intelligent Laboratory Systems. 2024;244:105043.
- 35. Jalilian E, Linortner M, Uhl A. Impact of Image Compression on In Vitro Cell Migration Analysis. Computers. 2023;12(5):98.
- 36. Zerva MCH, Christou V, Giannakeas N, Tzallas AT, Kondi LP. An Improved Medical Image Compression Method Based on Wavelet Difference Reduction. IEEE Access. 2023;11:18026–37.
- 37. International Telecommunication Union (ITU-R). Methodology for the Subjective Assessment of the Quality of Television Pictures. 2002.
- 38. Chow LS, Rajagopal H, Paramesran R, Alzheimer’s Disease Neuroimaging Initiative. Correlation between subjective and objective assessment of magnetic resonance (MR) images. Magn Reson Imaging. 2016;34(6):820–31. pmid:26969762
- 39. Bindu K, Ganpati A, Sharma AK. A Comparative Study of Image Compression Algorithms. IJORCS. 2012;2(5):37–42.
- 40. Tan M, Le QV. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. 2019. http://arxiv.org/abs/1905.11946
- 41. Redmon J, Divvala S, Girshick R, Farhadi A. You Only Look Once: Unified, Real-Time Object Detection. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016. 779–88.
- 42. Redmon J, Farhadi A. YOLOv3: An Incremental Improvement. 2018. http://arxiv.org/abs/1804.02767
- 43. Szegedy C, Ioffe S, Vanhoucke V, Alemi A. Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning. AAAI. 2017;31(1).
- 44. Redmon J, Farhadi A. YOLO9000: Better, Faster, Stronger. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017. 6517–25.
- 45. He K, Zhang X, Ren S, Sun J. Deep Residual Learning for Image Recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016. 770–8.
- 46. Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, et al. Going deeper with convolutions. In: 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015. 1–9.
- 47. Krizhevsky A, Sutskever I, Hinton GE. ImageNet classification with deep convolutional neural networks. Commun ACM. 2017;60(6):84–90.
- 48. James G, Witten D, Hastie T, Tibshirani R, Taylor J. An Introduction to Statistical Learning. Springer International Publishing. 2023.
- 49. Kohavi R. A study of cross-validation and bootstrap for accuracy estimation and model selection. In: Proceedings of the 14th International Joint Conference on Artificial Intelligence (IJCAI), 1995. 1137–45.
- 50. Abramson N, Braverman D, Sebestyen G. Pattern recognition and machine learning. IEEE Trans Inform Theory. 1963;9(4):257–61.
- 51. Bishop CM. Pattern Recognition and Machine Learning. 1st ed. Springer. 2006.
- 52. Yao Y, Rosasco L, Caponnetto A. On Early Stopping in Gradient Descent Learning. Constr Approx. 2007;26(2):289–315.
- 53. Kornblith S, Norouzi M, Lee H, Hinton G. Similarity of neural network representations revisited. arXiv. 2019.
- 54. Cortes C, Mohri M, Rostamizadeh A. Algorithms for learning kernels based on centered alignment. arXiv. 2012.
- 55. Attar A, Rad RM, Shahbahrami A. EBIQA: An Edge Based Image Quality Assessment. In: 2011 7th Iranian Conference on Machine Vision and Image Processing, 2011. 1–4.
- 56. Mittal A, Soundararajan R, Bovik AC. Making a “Completely Blind” Image Quality Analyzer. IEEE Signal Process Lett. 2013;20(3):209–12.
- 57. Moorthy AK, Bovik AC. A Two-Step Framework for Constructing Blind Image Quality Indices. IEEE Signal Process Lett. 2010;17(5):513–6.
- 58. Mousavi SMH, Mosavi SMH. A New Edge and Pixel-Based Image Quality Assessment Metric for Colour and Depth Images. In: 2022 9th Iranian Joint Congress on Fuzzy and Intelligent Systems (CFIS), 2022. 1–11.
- 59. Ou F-Z, Wang Y-G, Zhu G. A Novel Blind Image Quality Assessment Method Based on Refined Natural Scene Statistics. In: 2019 IEEE International Conference on Image Processing (ICIP), 2019. 1004–8.
- 60. Saad MA, Bovik AC, Charrier C. Blind image quality assessment: a natural scene statistics approach in the DCT domain. IEEE Trans Image Process. 2012;21(8):3339–52. pmid:22453635
- 61. Venkatanath N, Praneeth D, Maruthi Chandrasekhar Bh, Channappayya SS, Medasani SS. Blind image quality evaluation using perception based features. In: 2015 Twenty First National Conference on Communications (NCC), 2015. 1–6.
- 62. Mittal A, Moorthy AK, Bovik AC. No-reference image quality assessment in the spatial domain. IEEE Trans Image Process. 2012;21(12):4695–708. pmid:22910118
- 63. Rajagopal H, Mokhtar N, Tengku Mohmed Noor Izam TF, Wan Ahmad WK. No-reference quality assessment for image-based assessment of economically important tropical woods. PLoS One. 2020;15(5):e0233320. pmid:32428043
- 64. Asri MAA, Mokhtar N, Rajagopal H, Wan Mohd Mahiyiddin WA, Lian Lim YA, Iwahashi M, et al. Quality assessment for microscopic parasite images. Proceedings of International Conference on Artificial Life and Robotics. 2023;28:37–42.
- 65. Kaur I, Kaur A. Comparative analysis of software fault prediction using various categories of classifiers. Int J Syst Assur Eng Manag. 2021;12(3):520–35.
- 66. Dao PB. On Wilcoxon rank sum test for condition monitoring and fault detection of wind turbines. Applied Energy. 2022;318:119209.