Abstract
De-noising convolutional neural networks (DnCNNs) are powerful nonlinear mapping models for impulse noise removal in image processing. During training and validation, a set of 12 standard test images is used to evaluate model performance. DnCNNs demonstrate strong capability in classifying impulse noise. To evaluate de-noising performance, a suitable noise ratio should be added so that the most appropriate DnCNN model can be used for impulse noise detection. This research proposes an effective image restoration technique that integrates a DnCNN and an autoencoder with a fuzzy median filter to detect and eliminate high-density impulse noise. The proposed deep learning de-noising technique is used to classify noisy and clean pixels, and the results are reported using metrics such as accuracy, false positive rate (FPR), false negative rate (FNR), and F1 score. To remove the impulse noise, an auto-encoder with a fuzzy median filter then reconstructs the clean image based on parametric values. Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index (SSIM) are used to assess the methodology. Compared with conventional impulse noise filtering techniques, the experimental results indicate a significant improvement in image quality. Based on the final de-noised images, this research contributes to the development of deep learning-based de-noising techniques that enhance image restoration quality while preserving image details and essential features.
Citation: Naeem M, Bhatti SM, Rashid M, Jaffar A, Akram S, Fida B, et al. (2026) A robust deep learning approach for impulse noise filtering using hybrid auto-encoder with fuzzy median filter. PLoS One 21(4): e0343141. https://doi.org/10.1371/journal.pone.0343141
Editor: Umair Muneer Butt, University of Management and Technology - Sialkot Campus, PAKISTAN
Received: December 4, 2025; Accepted: February 2, 2026; Published: April 3, 2026
Copyright: © 2026 Naeem et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: https://www.kaggle.com/datasets/leweihua/set12-231008/data.
Funding: This work was supported and funded by the Deanship of Scientific Research at Imam Mohammad Ibn Saud Islamic University (IMSIU) (grant number IMSIU-DDRSP2603).
Competing interests: The authors have declared that no competing interests exist.
1 Introduction
Digital images have become increasingly widespread in recent decades in photography, medicine, remote sensing, criminal investigation, banking and finance, security, and other fields. In digital image restoration, noise is an inevitable artifact of processing that can substantially reduce the quality of captured images. Among the various types of noise, impulse noise receives particular attention because it often originates from adverse conditions such as sensor imperfections and transmission errors, and it can severely disrupt pixel intensity values in the acquired image, causing a loss of information. An important objective in image de-noising is to eliminate noise effectively while preserving the essential features of the image, namely its critical edge and texture details. Fuzzy logic has proven to be an effective technique for handling uncertainty, and in this work it is combined with an auto-encoder to remove both low- and high-density impulse noise from digital images. Impulse noise is generally of two types: fixed-value impulse noise, also known as salt-and-pepper noise, and random-value impulse noise. In the fixed-value case, a corrupted pixel takes the value 0 or 255, whereas in the random-value case it can take any value in the interval [0, 255] for grayscale images [1,2]. Removing random-value impulse noise (RVIN) is more challenging than removing salt-and-pepper noise. With $n_{i,j}$ denoting a random intensity in [0, 255], the noise model of random-value impulse noise with noise probability p can be described as follows:

$$N_{i,j} = \begin{cases} n_{i,j}, & \text{with probability } p \\ O_{i,j}, & \text{with probability } 1 - p \end{cases}$$
where the noisy and original images are represented as $N_{i,j}$ and $O_{i,j}$, respectively. To recover corrupted pixels, a variety of linear and nonlinear filtering techniques have been proposed [3–5]. Linear filters, particularly average filters, blur the image in most cases. In contrast, nonlinear filters, owing to their computational efficiency and superior denoising capabilities, are more robust against impulse noise [6,7]. Various traditional and advanced denoising techniques have been proposed, including NLM, wavelet, diffusion, total variation, BM3D, sparse representation, Markov random field, and neural network-based methods, while neuro-fuzzy filters further enhance impulse noise removal by intelligently analyzing image patterns [8–10]. Traditional noise removal techniques, such as median filtering, reduce impulse noise effectively but often blur image details and edges. To enhance de-noising performance, modern approaches incorporate fuzzy logic, and researchers increasingly use neuro-fuzzy systems and machine learning-based approaches, including deep neural networks. Salt-and-pepper noise (SPN), typically caused by bit errors or faulty sensors, is a frequent form of impulse noise. The purpose of image denoising is to restore the corrupted image while preserving essential features such as edges, textures, and corners [11,12]. Since the CNN architecture first appeared in 1962, it has been used for different tasks such as image segmentation, image super-resolution, image de-blurring, image de-noising, and image de-convolution. Although initially developed to eliminate Gaussian noise, it was later applied to suppress random-valued impulse noise (RVIN), achieving remarkable performance. Weight sharing in CNNs reduces the number of trainable parameters, which lowers complexity and improves generalization compared with traditional ANNs [13].
Due to the sparse connectivity of convolutional layers, CNNs are more efficient and easier to train using backpropagation than fully connected ANNs [14]. The superior representational capacity of CNNs for digital images enables them to outperform classical machine learning techniques, including decision trees and support vector machines, in image restoration [15–17]. In addition, the weights of the convolution masks in CNNs are optimized using gradient-based training, which inherently captures pixel-level self-similarity across a large collection of training images [18]. Furthermore, once a CNN is trained, its learned weights can be reused by another network through transfer learning [19]. Fig 1 illustrates the architecture of the image de-noising model.
De-noising performance that improves on traditional filters such as median filtering, Wiener filtering, and bilateral filtering can be achieved by combining an auto-encoder with a traditional filter and leveraging fuzzy reasoning to handle uncertainty in noise detection [20,21]. De-noising auto-encoders (DAEs) reconstruct clean versions of images from corrupted inputs by learning robust representations. To further enhance performance, hybrid models that combine fuzzy logic and auto-encoders have also been proposed; their performance is measured with image restoration metrics such as PSNR and SSIM. While effective in removing impulse noise, they often compromise image details [22,23]. Conventional de-noising methods, such as median filtering, often fail to restore fine image details while removing noise [8]. In the deep learning era, neural networks offer a promising approach to identifying and eliminating noise while preserving image quality. Deep learning algorithms, especially CNNs, have proven effective in solving various image processing tasks and offer robust solutions to challenges such as impulse noise removal in digital images. In recent years, several studies [2,9,10] have shown the superiority of CNN-based and hybrid de-noising systems over conventional algorithms in both synthetic and real-world noisy conditions. Here, deep learning is used for de-noising, allowing the model to learn directly from digital images, which helps reduce training time.
In contrast to existing CNN-based impulse noise removal approaches, the proposed method introduces an explicit two-stage denoising strategy. Unlike conventional deep learning models that apply global filtering across the entire image, the proposed framework first localizes noisy pixels using a CNN-based detector and then selectively restores only the corrupted regions using an autoencoder integrated with a fuzzy median filter. This targeted restoration significantly reduces over-smoothing, preserves edge information, and improves robustness under high-density impulse noise conditions. The major contributions presented in this study include:
- In this approach, a convolutional neural network for impulse noise detection and an autoencoder integrated with a fuzzy median filter are employed for image denoising, enabling effective handling of a wide range of impulse noise levels.
- The novelty of this work lies in integrating a fuzzy median filter with an auto-encoder, a hybrid approach to image de-noising that demonstrates robustness across heterogeneous image regions (smooth, textured, and edge regions), making it suitable for real-world images with complex structures.
- The influence of different network parameters on the performance of impulse detection is analyzed while controlling the trade-off between noise reduction and detail preservation of images.
- Our method has demonstrated superior results compared to other state-of-the-art de-noising techniques.
This paper is organized as follows. Section 2 reviews related work on impulse noise de-noising. Section 3 presents the methodology and architecture of the proposed approach, learning and optimization during the de-noising operation on the CNN, and the dataset used for de-noising. Section 4 describes the performance metrics for noise detection and filtering. Section 5 gives the experiments and results. Section 6 concludes the paper.
2 Literature review
Recent advancements in deep learning have made substantial contributions to the field of image de-noising. Several studies have focused on CNN-based de-noising models, such as the work by Zhang et al. [24], who introduced a deep residual learning framework for de-noising. Additionally, hybrid approaches combining traditional and deep-learning techniques, such as wavelet transforms alongside CNNs, have been explored.
Table 1 summarizes research studies carried out in the field of image de-noising, detailing the authors, methods used, and key findings. One example is an adaptive de-noising model that dynamically adjusts filtering parameters based on noise intensity. While these methods achieve competitive results, they often lack explicit noise detection mechanisms, leading to unnecessary alterations in non-noisy regions. Our work addresses this limitation by integrating a noise localization step, improving de-noising accuracy and preserving fine image details.
2.1 Image de-noising techniques
Noise removal remains a critical challenge in digital image processing, as noise originating from transmission errors and sensor limitations degrades image quality. Numerous methods have been proposed over the years, each aiming to remove noise while keeping image features such as edges and textures intact. This review summarizes some prominent techniques in the field, highlighting their strengths, limitations, and recent advancements.
2.2 Classical de-noising methods
Traditional methods like the median filter have been widely used for impulse noise removal. However, they tend to blur edges and textures in highly corrupted images. To address this, more advanced filters, such as the directional weighted switching median filter, incorporate machine learning for improved accuracy in noise detection, especially in high-density noise situations.
2.3 Fuzzy logic and neuro-fuzzy systems
Fuzzy logic-based techniques have shown promise in image de-noising, especially for impulse noise. Neuro-fuzzy systems combine the flexibility of fuzzy sets with the learning capability of neural networks, resulting in superior performance over traditional filters. These systems can adapt to various noise levels without prior knowledge of the image, making them effective in blind restoration tasks. Techniques like the fuzzy impulse noise detection and reduction method (FIDRM) have been widely explored for their ability to preserve image detail while removing noise. The analysis and experimental results indicate that the CNN model is more successful than the other traditional/standard image filtering methods in terms of noise removal and restoration of image details.
2.4 Deep learning-based de-noising approaches
The rise of deep learning has brought significant improvements to image de-noising. Convolutional Neural Networks (CNNs) and auto-encoders, particularly the De-noising Auto-encoder (DAE), have become central to modern approaches. These models excel at separating noise from clean image features. Recent studies have also incorporated residual learning strategies, in which the network predicts the residual noise instead of directly estimating the clean image, improving both the performance and the speed of training. Moreover, techniques like DnCNN, which utilize deep CNNs with batch normalization and residual learning, have shown exceptional results, even for unknown noise levels. These models can be trained once and generalized to handle various image restoration tasks, such as Gaussian noise removal, image super-resolution, and JPEG de-blocking.
Fan et al. [25] proposed a CNN-based de-noising method that uses the power of deep learning to strip noise from an image. Using CNNs trained on pairs of noisy and clean images, noise is removed while image structures and fine details are preserved. Being data-driven, the technique adapts better to different noise levels than traditional ones, which makes it useful in practical image restoration. Neural networks offer learning power and fuzzy logic offers reasoning power. Masood et al. [26] therefore suggested a neuro-fuzzy filtering technique that integrates the strengths of both. The system adapts to local properties of the image so that edge and texture information is preserved during noise removal. This composite technique works for a variety of noise types and is more versatile than strictly statistical and deterministic techniques.
Farooq and Sava [27] developed a noise removal method based on convolutional neural networks. The model is trained to map noisy inputs to clean outputs, allowing it to learn the complex nature of noise automatically. The method is highly accurate in denoising and keeps computation fast enough to be applicable to large image processing tasks. A residual learning-based deep CNN, referred to as DnCNN, was designed to remove Gaussian noise and restore images [28]. Modeling the residual noise instead of estimating the clean image simplifies the learning problem and leads to faster convergence. Its ability to separate noise from image content, together with its strong trade-off between denoising performance and processing speed, has made DnCNN a benchmark in image denoising tasks. Meng et al. [29] proposed a gray image de-noising approach based on dilated convolution, which expands the receptive field without increasing the number of parameters. This allows the network to extract more contextual information, improving performance in very noisy scenarios where maintaining structure is of paramount importance.
To address these challenges, a switching median filter was proposed that uses decision trees and neural networks for two-stage noise detection and filtering [30]. The system detects only the noisy pixels and applies a customized filtering process that removes noise without over-smoothing the clean areas. An alternative that exploits self-similarity between image patches was presented in [31], which adopted a non-local means strategy for denoising. By averaging similar-looking patches across the whole image, the algorithm reduces noise while preserving significant structures, especially in textured areas. Meng et al. [32] combined sparse representation with dictionary learning for noise reduction. The algorithm represents image patches as sparse linear combinations of atoms in a dictionary, achieving successful denoising while retaining critical features. Muhammad et al. [33] proposed a hybrid CNN model based on deep residual learning that incorporates residual blocks to strengthen its denoising power. The method attains higher performance by improving gradient flow and suppressing vanishing gradients, especially in deeper networks. Zhang et al. [34] proposed a low-rank matrix approximation method in which a matrix A is decomposed into a useful image component and a noise component. The method is particularly successful in practice on noisy images whose noise distribution is not simple. Ansari et al. [35] performed image de-noising by decomposing the image with a multi-scale wavelet transform. This approach separates noise from significant image structures at multiple scales, so that noise can be selectively suppressed and edges better preserved. A residual U-Net neural network was designed in [36] for image denoising. The architecture combines the advantages of the U-Net encoder-decoder structure and residual connections, resulting in better reconstruction and training stability. Choi et al. [37] presented a noise removal method that decomposes a corrupted image and reconstructs its details while removing noise dynamically. In this way, the method adapts the filtering strength to local noise properties, which makes it versatile across different types of noise.
A CNN-based de-noising algorithm focused on grayscale images has also been reported. The algorithm is highly effective at removing noise from black-and-white pictures while retaining fidelity and form. A context-aware CNN removes only salt-and-pepper noise by using contextual information about the pixels [38]. This enables the model to filter the corrupted regions selectively while leaving clean pixels untouched. R. Nawaz et al. [39] used a GAN-based de-noising strategy whose generator-discriminator architecture produces high-quality restored images. The adversarial training framework promotes outputs that are both perceptually convincing and noise-free. Zhang et al. [40] studied transformer-based de-noising and incorporated the attention mechanism to capture long-range dependencies within images. This enables the model to recognize and eliminate noise patterns that are spread over larger spatial extents. EFID, an edge-oriented CNN de-noising mechanism, was introduced in [41]. By focusing on edge preservation while eliminating noise, the method keeps structures sharp in the restored images.
Thakur et al. [42] analyzed the state of the art in CNN-based image denoising in terms of the strengths and weaknesses of different architectures and noise levels. A deep CNN architecture designed explicitly for impulse noise removal applies convolutional blocks selectively so as to target the corrupted regions without needlessly processing the clean regions [43]. Genetic algorithms have been used to optimize noise removal [44]. This evolutionary computing strategy performs an online search for filter parameters that maximize image quality over a series of iterations. Structure-preserving image denoisers based on Markov random fields have also been proposed [45]. The probabilistic model captures spatial dependencies, so that noise can be reduced effectively without losing significant textures. Iterative methods for salt-and-pepper denoising were developed by J. Ke et al. [46]. The image improves gradually through an iterative process that alternates between noise detection and biased restoration [47,48].
Dictionary learning has been used to perform denoising with sparse representations. Using a small number of basis elements, the technique effectively recovers clean images from noisy measurements [49–51]. A fractional-based total variation method has been used to eliminate salt-and-pepper noise. The method denoises the noisy areas without destroying meaningful edges and fine structures. A hybrid deep learning model that uses both CNN and RNN structures for image restoration leads to better reconstruction [52].
3 Materials and methods
The proposed hybrid deep learning framework for image de-noising consists of three major phases: preprocessing, noise detection, and noise filtering. Table 1 compares a number of prominent approaches to impulse noise removal, covering both traditional filters and modern deep learning-based approaches, with performance measured using restoration metrics such as the peak signal-to-noise ratio (PSNR) and the structural similarity index (SSIM). Table 1 is meant to give a clear picture of the efficiency of the various algorithms with respect to noise suppression, detail preservation, computational efficiency, and overall performance in different set-ups. During preprocessing, the input images are standardized and normalized so that pixel intensities share a common range, which makes convergence during training easier.
Fig 2 presents the proposed methodology of this research. Data augmentation, such as resizing, conversion to grayscale, rotation, and flipping, can also be applied to improve the generalization of the model. The noise detection phase identifies and categorizes the noisy pixels with the assistance of supervised learning models. More precisely, CNNs are used to extract deep spatial features, and these features are fed into decision modules trained to distinguish between noisy and clean pixels.
All experiments were conducted using Google Colab with code execution on cloud-based computational resources. The datasets were stored and accessed from Google Drive. The proposed framework demonstrated feasible inference time under this cloud computing environment.
The complete denoising pipeline of the proposed framework is summarized as follows:
- Input image preprocessing, resizing, and normalization.
- Synthetic impulse noise injection for training data generation.
- CNN-based noisy pixel detection and confidence map generation.
- Threshold-based identification of corrupted pixels.
- Feature reconstruction using the denoising autoencoder.
- Final image refinement using the fuzzy median filter.
3.1 Preprocessing phase
The preprocessing phase is a foundational component in any deep learning pipeline, particularly in image restoration tasks such as denoising. The objective of this phase is to transform raw image data, which is often inconsistent and heterogeneous in nature, into a standardized format suitable for training deep convolutional neural networks (CNNs). In the context of this study, where the focus is on impulse noise removal, preprocessing also plays a critical role in simulating noisy environments, enabling the model to generalize effectively across various real-world scenarios.
3.1.1 Image resizing and normalization.
In the first step of preprocessing, images are resized to a uniform dimension of 224×224 pixels. This resizing step ensures compatibility with the architecture of most pre-trained CNN backbones, such as VGG, ResNet, or custom convolutional models, which expect fixed-size inputs. Additionally, uniform image sizes facilitate efficient batch processing during training and reduce computational complexity. Although resizing might introduce minor interpolation artifacts, these are typically negligible compared to the benefits gained in computational efficiency and network compatibility. Following resizing, pixel intensity values are normalized from the original 8-bit integer range of [0, 255] to a continuous floating-point range of [0, 1]. This normalization is crucial for deep learning models as it:
- Enhances numerical stability during training.
- Accelerates convergence by preventing large gradient magnitudes.
- Allows the use of higher learning rates without risking divergence.
- Makes the optimization landscape smoother and more tractable.
This step is applied consistently across both training and testing datasets to maintain coherence in the data distribution the model encounters.
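The resizing and normalization step can be illustrated with a minimal NumPy sketch. The function names (`resize_nearest`, `normalize`, `preprocess`) are hypothetical, and nearest-neighbour interpolation is an assumption made only to keep the example dependency-free; the interpolation method actually used is not stated here.

```python
import numpy as np

def resize_nearest(img, size=(224, 224)):
    """Nearest-neighbour resize of a 2-D grayscale array.

    The interpolation scheme is an assumption made for illustration.
    """
    h, w = img.shape
    # Map every target row/column back to a source index.
    rows = np.arange(size[0]) * h // size[0]
    cols = np.arange(size[1]) * w // size[1]
    return img[rows[:, None], cols[None, :]]

def normalize(img_u8):
    """Scale 8-bit intensities from [0, 255] to floats in [0, 1]."""
    return img_u8.astype(np.float32) / 255.0

def preprocess(img_u8, size=(224, 224)):
    """Resize to the fixed network input size, then normalize."""
    return normalize(resize_nearest(img_u8, size))
```

Applying the same `preprocess` call to training and test images keeps both sets in the same intensity range, as required above.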
3.2 Noise simulation and label generation
Given the focus on impulse noise removal, it is imperative that the model is trained on a dataset where both clean (ground truth) and noisy (input) versions of each image are available. Since most standard datasets do not provide such pairs, synthetic noise injection is employed to generate noisy counterparts from clean images.
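A minimal sketch of synthetic noise injection with label generation is given below, assuming 8-bit grayscale input. The function name `add_impulse_noise` and its parameters are hypothetical; the returned mask is what a pixel-level detector would use as its training label.

```python
import numpy as np

def add_impulse_noise(clean, p, kind="rvin", rng=None):
    """Corrupt a uint8 grayscale image with impulse noise of density p.

    Returns (noisy, mask), where mask marks the pixels selected for
    corruption. For RVIN a selected pixel may, by chance, receive its
    original value; the mask still labels it as corrupted.
    """
    rng = np.random.default_rng() if rng is None else rng
    mask = rng.random(clean.shape) < p
    if kind == "spn":
        # Fixed-value (salt-and-pepper) noise: 0 or 255.
        values = rng.choice(np.array([0, 255], dtype=np.uint8), size=clean.shape)
    elif kind == "rvin":
        # Random-value impulse noise: any value in [0, 255].
        values = rng.integers(0, 256, size=clean.shape, dtype=np.uint8)
    else:
        raise ValueError("kind must be 'spn' or 'rvin'")
    noisy = np.where(mask, values, clean).astype(np.uint8)
    return noisy, mask
```

Passing a seeded generator, for example `rng=np.random.default_rng(0)`, makes the generated clean/noisy pairs reproducible.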
3.3 Noise detection using convolution neural networks (CNN)
In the proposed work, a deep learning model based on a CNN is used to detect and identify corrupted (noisy) pixels in digital images, as shown in Fig 2. The proposed model consists of two stages: a classification stage using a CNN and a de-noising stage using an auto-encoder with a fuzzy median filter. Before these two stages, a pre-processing stage handles dataset loading, image augmentation, and resizing. Since the images are sourced from multiple datasets, they are standardized to a size of 224×224 (width and height). This resolution is applied to all standard images in the dataset to avoid overfitting, and the CNN architecture is then used to classify noisy and non-noisy pixels. In the de-noising auto-encoder stage, both noise types (random-value impulse noise and salt-and-pepper impulse noise) are restored, reducing their effect on the image while preserving fine details. In this second stage, the de-noising auto-encoder and fuzzy median filter are used for the whole image restoration process. The effectiveness of CNNs in noise detection has been demonstrated in various studies, highlighting their ability to accurately identify and localize noise within images.
In order to evaluate the performance of the proposed network, we used the parameters in Table 2, which were proposed for DnCNN. It is worth noting that 3×3 windows were used only during the training phase, whereas during inference, the resulting convolution masks were applied to the entire image. Table 2 presents the hyperparameters for the proposed deep learning-based image de-noising model.
In the proposed architecture, each module performs a distinct function. The CNN is responsible for identifying and localizing noisy pixels at the feature level. The autoencoder performs nonlinear feature reconstruction to recover corrupted pixel intensities, while the fuzzy median filter enhances uncertainty handling and preserves edges by adapting filtering strength according to local image statistics.
These results highlight the network’s ability to localize noise with high precision, even in visually complex scenes. At inference time, the trained CNN processes full-sized images (224×224) by convolving the learned filters across the entire image, producing a noise confidence map. Pixels with confidence scores above a threshold (e.g., 0.7) are flagged as noisy and passed to the restoration module.
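The thresholding step can be sketched as follows. The confidence map here is a hypothetical array standing in for the detector output, and `flag_noisy_pixels` is an illustrative name; only the example threshold of 0.7 comes from the text.

```python
import numpy as np

def flag_noisy_pixels(confidence, threshold=0.7):
    """Return a boolean mask of pixels whose noise confidence is
    strictly above the threshold."""
    return np.asarray(confidence) > threshold

# Hypothetical 2x2 confidence map, for illustration only.
confidence_map = np.array([[0.10, 0.90],
                           [0.70, 0.71]])
noisy_mask = flag_noisy_pixels(confidence_map)
```

Only the pixels marked in `noisy_mask` would be handed to the restoration module; a score exactly equal to the threshold is treated as clean in this sketch.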
3.4 Noise filtering using auto-encoder with fuzzy median filter
The noisy images are passed through a de-noising autoencoder equipped with a fuzzy median filter, which reconstructs their clean counterparts. The autoencoder comprises an encoder that compresses image features and a decoder that restores the image while removing noise. Both the encoder and decoder contain three layers, along with an initial batch-normalization stage, as illustrated in Fig 3. The de-noising autoencoder includes three convolutional layers that extract and transform image features. Its primary objective is to eliminate noise and recover the clean image structure. During training, the reconstructed image is repeatedly compared with the corresponding clean image, and the model updates its weights to minimize the difference and achieve the closest possible reconstruction.
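The exact fuzzy membership function of the fuzzy median filter is not specified in this section. The sketch below therefore assumes a simple piecewise-linear membership with two hypothetical thresholds, `t1` and `t2`, applied to images normalized to [0, 1]; it illustrates the general idea of fuzzy switching median filtering rather than the authors' implementation.

```python
import numpy as np

def median3x3(img):
    """3x3 median of a 2-D float image, using reflect padding at the borders."""
    padded = np.pad(img, 1, mode="reflect")
    h, w = img.shape
    windows = np.stack(
        [padded[r:r + h, c:c + w] for r in range(3) for c in range(3)], axis=0
    )
    return np.median(windows, axis=0)

def fuzzy_median_filter(img, noisy_mask, t1=0.05, t2=0.25):
    """Blend each flagged pixel with its local median.

    The blend weight mu rises linearly from 0 to 1 as the absolute
    difference to the median goes from t1 to t2 (assumed thresholds).
    Pixels outside noisy_mask are left unchanged.
    """
    med = median3x3(img)
    diff = np.abs(img - med)
    mu = np.clip((diff - t1) / (t2 - t1), 0.0, 1.0)
    mu = np.where(noisy_mask, mu, 0.0)
    return (1.0 - mu) * img + mu * med
```

Because the correction strength depends on how far a pixel deviates from its neighbourhood, mildly deviating pixels are only partially corrected, which is one way to limit over-smoothing of edges.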
Fig 3 illustrates the proposed de-noising autoencoder architecture. The input image has dimensions of 224×224, and the encoder produces an output feature map of size 224×224×32, which then serves as the input to the decoder. The decoder reconstructs the final de-noised image with dimensions 224×224. In the proposed de-noising autoencoder architecture, the encoder is composed of a batch normalization layer followed by three convolutional layers. Each convolutional layer uses a 3×3 kernel, with 128, 64, and 32 filters in the successive layers. The decoder consists of transposed convolution (ConvTrans) layers and a final convolution layer, using 32, 64, and 128 filters in its three stages. Both convolution and transposed convolution operations use a stride of 1, and the same padding strategy is applied throughout. The ReLU activation function is used in all convolutional layers, while a sigmoid function is employed during classification for impulse noise removal.
where $y_i$ denotes the original input and $\hat{y}_i$ represents the reconstructed output. The decoder output is modeled using a Gaussian distribution with variance $\sigma^2$. In this context, n refers to the output dimension, and $p(y_i|z)$ indicates the decoder distribution. The denoising autoencoder effectively separates meaningful signal from noise by learning feature representations that capture the underlying data distribution, enabling the model to robustly reconstruct the output even when the input has been partially corrupted.
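The encoder dimensions stated in Section 3.4 can be checked with a short calculation. The sketch assumes a single-channel (grayscale) input and one bias term per filter, and it ignores batch-normalization parameters; these assumptions and the helper names are illustrative only.

```python
def conv_same_output(size, stride=1):
    """Spatial size after a 'same'-padded convolution: ceil(size / stride)."""
    return -(-size // stride)

def conv_params(in_ch, out_ch, k=3, bias=True):
    """Trainable parameters of a k x k convolution layer."""
    return k * k * in_ch * out_ch + (out_ch if bias else 0)

def encoder_summary(in_size=224, in_ch=1, filters=(128, 64, 32)):
    """List (output shape, parameter count) for each encoder layer."""
    rows = []
    size, ch = in_size, in_ch
    for f in filters:
        size = conv_same_output(size)  # stride 1 keeps 224 x 224
        rows.append(((size, size, f), conv_params(ch, f)))
        ch = f
    return rows

summary = encoder_summary()
```

With stride 1 and same padding the spatial size never changes, so the last entry of `summary` has shape (224, 224, 32), matching the encoder output size given in the text.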
3.5 Learning and optimization during de-noising operation on CNN
Training of CNN models can utilize various algorithms, including hybrid first-order optimization, conjugate gradient, quasi-Newton, Levenberg–Marquardt (and variants), and least-squares-based methods. These training procedures typically operate through either an objective-function framework or a learning-function framework. In the objective-function approach, the solution is obtained by minimizing the reconstruction error, whereas in the learning-function approach, the solution of a regularized optimization problem serves as a parametric function for addressing the de-noising task. In our model, the binary cross-entropy loss is minimized during the training of the CNN, while the autoencoder and fuzzy median filter components use the Mean Squared Error (MSE) loss for pixel-level reconstruction. The Adam optimizer is employed to improve convergence and overall model performance. The loss (or cost) function is minimized to determine the optimal network parameters, with MSE serving as the primary metric in supervised reconstruction. Fig 4 shows the de-noising performance of the auto-encoder with fuzzy median filter on real-world noisy images at different noise levels.
Here, M denotes the number of output neurons, P represents the total number of training samples, and Sk refers to the number of training iterations. The terms and
correspond to the actual and target outputs of the i-th neuron for the j-th input pattern, respectively. The Mean Squared Error is minimized using the stochastic gradient descent optimization method. In SGD, the network weights are updated iteratively, where each weight w is adjusted from its value at iteration t to wt+1 according to the following update rule:
where the coefficient that scales the gradient is the learning parameter (learning rate). The general learning equation of a CNN for solving the inverse problem in imaging is as follows:
A differentiable and tractable loss function is required to train the network using the back-propagation algorithm. The noisy images are provided to the CNN, which then estimates the corresponding noise-free images. The loss function is mathematically expressed as:
The training of the CNN requires a dataset consisting of input images and their corresponding target outputs. In some datasets, the training, testing, and validation sets are already predefined, while in others, users can manually select images for these splits depending on the specific application requirements.
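To make the iterative weight update described in this section concrete, the following minimal sketch applies per-sample stochastic gradient descent to a squared-error objective. It is an illustration only, not the training code used in this work: the model is a hypothetical one-dimensional linear map `y = w * x + b`, and the names `sgd_fit`, `learning_rate`, and `epochs`, as well as the toy data, are assumptions introduced here.

```python
import random


def mse(predictions, targets):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


def sgd_fit(samples, learning_rate=0.1, epochs=200, seed=0):
    """Fit y = w * x + b with per-sample SGD on the squared error.

    The linear model and all default values are illustrative assumptions.
    """
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    data = list(samples)
    for _ in range(epochs):
        rng.shuffle(data)  # visit the samples in a new random order each epoch
        for x, y in data:
            error = (w * x + b) - y
            # Gradient of error ** 2 with respect to w and b.
            grad_w = 2.0 * error * x
            grad_b = 2.0 * error
            # Update rule: new weight = old weight - learning_rate * gradient.
            w -= learning_rate * grad_w
            b -= learning_rate * grad_b
    return w, b


# Toy, noise-free data generated from y = 2x + 1 (hypothetical example).
samples = [(i / 10.0, 2.0 * (i / 10.0) + 1.0) for i in range(11)]
w, b = sgd_fit(samples)
final_error = mse([w * x + b for x, _ in samples], [y for _, y in samples])
```

In a CNN the same rule is applied to every weight, with the gradients obtained by back-propagation rather than by the closed-form expressions used in this toy case.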
Regarding computational complexity, the proposed framework introduces moderate overhead due to its two-stage architecture. However, since the noise detection and restoration processes are convolution-based, the model benefits from parallel GPU computation. Experimental evaluation shows that inference time remains practical for real-time and offline image restoration applications.
4 Results and evaluations
Several quantitative metrics are used to evaluate the effectiveness of noise detection and filtering methods. These metrics provide insight into the accuracy, robustness, and quality of de-noising algorithms in different contexts. Below, we discuss the most commonly used performance measures along with their mathematical formulations.
4.1 Accuracy
Accuracy is one of the primary evaluation metrics used in classification tasks. Here it is used for the prediction of noise, including situations involving imbalanced datasets where noisy pixels occur far less frequently than clean ones. The mathematical formulation of accuracy is given in Equation 7.
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{7}$$
where TP = True Positives (correctly detected noise), TN = True Negatives (correctly detected clean data), FP = False Positives (clean data incorrectly detected as noise) and FN = False Negatives (missed noise).
4.2 Precision
Precision measures how many of the predicted noise instances are actually noise. Equation 8 presents the mathematical representation of precision.
$$\text{Precision} = \frac{TP}{TP + FP} \tag{8}$$
where TP = True Positives (correctly detected noise) and FP = False Positives (clean data incorrectly detected as noise). Higher precision indicates that fewer irrelevant instances have been classified as noise, making the method more reliable in practical applications.
4.3 Recall
Recall, also known as sensitivity, evaluates the ability of a method to detect actual noise. A high recall means that fewer noise instances are missed. Equation 9 presents the mathematical representation of recall.
$$\text{Recall} = \frac{TP}{TP + FN} \tag{9}$$
where TP = True Positives (correctly detected noise) and FN = False Negatives (noise that was missed and labelled as clean data).
4.4 F1-score
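During image de-noising, when the data are imbalanced, the F1 score is a suitable measure. It is the harmonic mean of precision and recall:
$$F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \tag{10}$$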
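The detection metrics of Sections 4.1 to 4.4 can all be derived from the four confusion-matrix counts. The sketch below is a minimal illustration under stated assumptions: the function name `detection_metrics` and the example counts are hypothetical and are not taken from the experiments. The false positive rate (FPR) and false negative rate (FNR) reported later are included using their standard definitions, FP / (FP + TN) and FN / (FN + TP).

```python
def detection_metrics(tp, tn, fp, fn):
    """Return pixel-level detection metrics from confusion-matrix counts.

    A "positive" is a pixel labelled as noisy. Ratios with an empty
    denominator are reported as 0.0 (an illustrative convention).
    """
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        # Harmonic mean of precision and recall.
        "f1": ratio(2.0 * precision * recall, precision + recall),
        "fpr": ratio(fp, fp + tn),  # clean pixels wrongly flagged as noise
        "fnr": ratio(fn, fn + tp),  # noisy pixels that were missed
    }


# Hypothetical counts for a 1000-pixel image with 100 noisy pixels.
example = detection_metrics(tp=90, tn=890, fp=10, fn=10)
```

With imbalanced data, accuracy is dominated by the clean pixels, which is why the F1 score, FPR, and FNR are worth reading alongside it.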
A high F1 score suggests that both precision and recall are sufficiently high, making the model robust. Equation 10 presents the mathematical representation of the F-measure. The impulse noise filtering approach using the auto-encoder with a fuzzy median filter model was also evaluated using SSIM and PSNR metrics [35]. These performance parameters are defined as follows:
4.5 Peak signal-to-noise ratio (PSNR)
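PSNR is commonly used in image and signal processing. It compares the maximum possible signal strength to the noise affecting its representation. A higher PSNR indicates a cleaner, less noisy signal. Equation 11 presents the mathematical representation of the peak signal-to-noise ratio:
$$\text{PSNR} = 10 \log_{10}\left(\frac{MAX_I^2}{\text{MSE}}\right) \tag{11}$$
where $MAX_I$ is the maximum possible pixel value of the image (255 for 8-bit images) and MSE is the mean squared error between the original and the restored image.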
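As a worked illustration of the PSNR definition, the following sketch computes it for two grayscale images stored as nested lists. The function names, the list-of-rows image representation, and the default peak value of 255 (8-bit images) are assumptions made for this example.

```python
import math


def mean_squared_error(reference, restored):
    """Mean squared error over all pixels of two same-sized images."""
    total, count = 0.0, 0
    for ref_row, res_row in zip(reference, restored):
        for ref_pixel, res_pixel in zip(ref_row, res_row):
            total += (ref_pixel - res_pixel) ** 2
            count += 1
    return total / count


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    error = mean_squared_error(reference, restored)
    if error == 0.0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / error)


# Hypothetical 2x2 images that differ by one gray level at every pixel.
clean = [[10, 20], [30, 40]]
shifted = [[11, 21], [31, 41]]
psnr_shifted = psnr(clean, shifted)
```

Because the scale is logarithmic, halving the MSE raises the PSNR by about 3 dB.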
4.6 Structural similarity index (SSIM)
SSIM is designed to measure the perceptual similarity between two images. Unlike PSNR, which focuses on pixel-wise differences, SSIM accounts for structural information, luminance, and contrast. The mathematical formula for SSIM is given in Equation 12.
$$\text{SSIM}(x, y) = \frac{(2\mu_x \mu_y + c_1)(2\sigma_{xy} + c_2)}{(\mu_x^2 + \mu_y^2 + c_1)(\sigma_x^2 + \sigma_y^2 + c_2)} \tag{12}$$
Where:
$\mu_x$: The pixel sample mean of x;
$\mu_y$: The pixel sample mean of y;
$\sigma_x^2$: The variance of x;
$\sigma_y^2$: The variance of y;
$\sigma_{xy}$: The covariance of x and y;
$c_1$, $c_2$: two variables to stabilize the division with a weak denominator.
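The SSIM formula can be illustrated with a simplified, single-window computation. The sketch below evaluates the statistics once over the whole image, whereas practical implementations usually average SSIM over local sliding windows. The stabilizing constants are set with the commonly used choice $c_1 = (k_1 L)^2$ and $c_2 = (k_2 L)^2$ with $k_1 = 0.01$, $k_2 = 0.03$ and dynamic range $L = 255$. These values and the function name are assumptions for illustration, not settings reported in this work.

```python
def ssim_global(x, y, dynamic_range=255.0, k1=0.01, k2=0.03):
    """Single-window SSIM computed from global image statistics.

    x and y are same-sized grayscale images given as lists of rows.
    """
    xs = [value for row in x for value in row]
    ys = [value for row in y for value in row]
    n = len(xs)

    mu_x = sum(xs) / n
    mu_y = sum(ys) / n
    # Population (divide-by-n) variance and covariance.
    var_x = sum((value - mu_x) ** 2 for value in xs) / n
    var_y = sum((value - mu_y) ** 2 for value in ys) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(xs, ys)) / n

    c1 = (k1 * dynamic_range) ** 2  # stabilizes the luminance term
    c2 = (k2 * dynamic_range) ** 2  # stabilizes the contrast/structure term

    numerator = (2.0 * mu_x * mu_y + c1) * (2.0 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Hypothetical checkerboard and its inverse: same mean, opposite structure.
board = [[0, 255], [255, 0]]
inverse = [[255, 0], [0, 255]]
ssim_same = ssim_global(board, board)
ssim_inverse = ssim_global(board, inverse)
```

The inverted checkerboard has the same mean and variance as the original but a negative covariance, so its score is negative even though the luminance term is unchanged. This is the structural sensitivity that PSNR lacks.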
Our sequel paper provides an extensive review of image restoration solutions using traditional and soft computing methods.
4.7 Experimental evaluation
To demonstrate the superior noise suppression capability of the proposed method, it was compared with several state-of-the-art techniques, including BnCNN [52], FTVM [32], BDCN [7], DnCNN [24], GAN [10], HCNN [35], IDCNN [23], U-NN [21], CACNN [24], FADD [34], and FASTF [36]. Impulse noise was synthetically added to the images at densities from 10% to 50% in steps of 10%. We trained the CNN de-noising model in the same way as DnCNN [24] and used the gray-level Set12 dataset, whose images were contaminated with salt-and-pepper noise at these densities. The images were 321×481 pixels in size. The predictor's performance was evaluated using the root mean square error (RMSE). Image restoration results were assessed using the peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) metrics. All experiments were performed in TensorFlow and Python on a workstation with an Intel Core i7-6700K CPU at 4.00 GHz, 16 GB RAM, and an NVIDIA Quadro M4000 GPU.
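The synthetic contamination step can be sketched as follows. This is a minimal illustration of salt-and-pepper corruption at a given density, not the exact procedure used in the experiments: the function name, the equal split between salt (255) and pepper (0), and the fixed random seed are assumptions introduced here.

```python
import random


def add_salt_and_pepper(image, density, seed=None):
    """Corrupt a grayscale image with salt-and-pepper noise.

    Each pixel is replaced with probability `density`; a corrupted pixel
    becomes 255 (salt) or 0 (pepper) with equal probability. Returns the
    noisy image and a boolean mask marking the corrupted positions.
    """
    rng = random.Random(seed)
    noisy = [row[:] for row in image]  # copy so the input is not modified
    mask = [[False] * len(row) for row in image]
    for r, row in enumerate(noisy):
        for c in range(len(row)):
            if rng.random() < density:
                row[c] = 255 if rng.random() < 0.5 else 0
                mask[r][c] = True
    return noisy, mask


# Hypothetical flat gray test image, corrupted at each density used above.
flat = [[128] * 50 for _ in range(50)]
corrupted = {
    density: add_salt_and_pepper(flat, density, seed=1)
    for density in (0.1, 0.2, 0.3, 0.4, 0.5)
}
```

The returned mask provides the ground-truth labels against which a noise detector's TP, TN, FP, and FN counts can be computed.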
Our experimental findings show that the proposed method outperforms conventional median filtering. The main results are as follows. Noise detection accuracy: the CNN-based model achieves an accuracy of 94.2%, an FPR of 0.0021, an FNR of 0.0089, and an F1 score of 0.936. Two widely used image quality measures, Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index Measure (SSIM), were used to evaluate the impulse noise filtering performance of the proposed model. PSNR and SSIM measure how similar the de-noised and original images are. The results were then compared with currently available methods such as the classical Median Filter, Total Variation De-noising, and the deep learning-based DnCNN. Table 3 shows the evaluation of noise detection metrics.
Fig 5 presents a comparison of accuracy and F1-score across various noise density levels. At a noise density of 10%, the model achieves an accuracy of 0.9984 and an F1-score of 0.9925, while maintaining a very low False Positive Rate (0.0012) and False Negative Rate (0.0017).
Fig 6 shows the False Positive Rate (FPR) and False Negative Rate (FNR) at different noise densities (10–50%). Both error rates increase with noise density. The FNR, however, increases much faster than the FPR, which shows that noise is more likely to be missed at high noise levels.
Nevertheless, both FPR and FNR remain low, with maxima of 0.0021 and 0.0089, respectively, at 50% noise density. These findings indicate the model's strong capacity to limit classification errors despite substantial noise interference. Table 4 presents the comparative analysis of state-of-the-art methods and the proposed work.
De-noising performance:
- PSNR improvement: the DAE increases PSNR by an average of 6.5 dB compared to median filtering.
- SSIM score: the proposed model achieves an SSIM of 0.92, preserving image structure better than traditional methods.
- MSE reduction: the method significantly reduces error, improving visual quality.
- Visual comparison: the reconstructed images retain finer details while eliminating noise effectively.
As observed, the proposed model consistently outperforms all other approaches. At a noise level of 10%, the proposed technique achieved a PSNR of 35.94 dB, compared to 34.56 dB for DnCNN, 33.82 dB for Total Variation, and 32.15 dB for the Median Filter. This performance gap widens as the noise level increases. For instance, at 50% noise, our method achieved 28.12 dB, which is significantly higher than DnCNN (26.10 dB) and the traditional filters.
The PSNR of the different denoising techniques on different test images (Boat, House, Parrot, Peppers) and noise densities (0.1 to 0.5) is visualized as a heatmap. Rows correspond to denoising approaches and columns to test conditions (image and noise level). PSNR increases along the color gradient from light to dark. The proposed work shows comparatively darker tones in almost all test cases, indicating the best PSNR among the compared approaches such as DnCNN, BDCN, and FTVM.
The Structural Similarity Index Measure (SSIM) results for various denoising algorithms applied to multiple images (Boat, House, Peppers, Parrot) under noise levels ranging from 0.1 to 0.5 are also visualized as a heatmap. Rows correspond to denoising models and columns to specific combinations of image and noise level. The color gradient runs from light (lower SSIM) to dark blue (higher SSIM). For the proposed work, the dark color persists in nearly all cases, which means that structural preservation remains high even at greater noise densities. Conversely, the lighter patches shown by some methods, such as DnCNN and U-NN, under heavier noise indicate lower performance. The figure effectively visualizes the strength and SSIM stability of the proposed model. The proposed method preserves fine details such as edges, textures, and contrast better than the baseline methods. Unlike the median filter, which introduces blurring and loss of sharpness, or the DnCNN model, which sometimes smooths out details, our model maintains the structural integrity of the image by focusing the de-noising process only on the detected noisy pixels. According to the results, this selective de-noising yields clearer images.
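The idea of restricting restoration to the detected noisy pixels can be sketched with a deliberately simple stand-in. In the code below, a plain median of the clean neighbours replaces each flagged pixel. This switching median step is only a placeholder for the auto-encoder with fuzzy median filter, and the boolean `noise_mask` stands in for the output of the CNN detector. The function name and the `radius` parameter are assumptions for this illustration.

```python
from statistics import median


def restore_detected_pixels(image, noise_mask, radius=1):
    """Replace only flagged pixels with the median of clean neighbours.

    Pixels not flagged in `noise_mask` are copied unchanged. A flagged
    pixel with no clean neighbour in its window is also left unchanged.
    """
    height, width = len(image), len(image[0])
    restored = [row[:] for row in image]
    for r in range(height):
        for c in range(width):
            if not noise_mask[r][c]:
                continue  # clean pixel: never modified
            clean_neighbours = [
                image[rr][cc]
                for rr in range(max(0, r - radius), min(height, r + radius + 1))
                for cc in range(max(0, c - radius), min(width, c + radius + 1))
                if not noise_mask[rr][cc]
            ]
            if clean_neighbours:
                restored[r][c] = median(clean_neighbours)
    return restored


# Hypothetical 3x3 patch with a single salt pixel in the centre.
patch = [[100, 100, 100], [100, 255, 100], [100, 100, 100]]
patch_mask = [[False, False, False], [False, True, False], [False, False, False]]
restored_patch = restore_detected_pixels(patch, patch_mask)
```

Because unflagged pixels are copied through untouched, any blurring is confined to the positions the detector marks as noisy, in contrast to a global filter, which modifies every pixel.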
The visual comparison of the results shows that the proposed IDCNN efficiently highlights nearly all impulse noise, while the remaining visible artifacts result from the limited quality of restoration in the noisy pixel regions. BDCNN achieved outstanding performance at high noise ratios. A key disadvantage of BDCNN is that it was developed to handle a combination of Gaussian and impulsive noise, so uncorrupted pixels in the image are also changed. Owing to the combination of the auto-encoder and median filter, the noise filtering results are outstanding in terms of PSNR. The results from both quantitative metrics and visual inspection clearly demonstrate the advantages of the proposed two-stage approach. The use of a CNN-based noise detection module allows the model to accurately isolate the noisy pixels, which are then passed to the de-noising auto-encoder for restoration.
This approach reduces the risk of over-smoothing or blurring clean areas, which is a common issue in global denoising strategies. Fig 7 presents a detailed visual analysis of different image restoration methods applied to a noisy image of peppers. Starting from the original and noisy inputs (noise density ρ = 0.4), the figure displays the results of several denoising algorithms, namely IDCNNd, IDCNNg, AWOD, BDCNN, DnCNN, FAPGF, FASTAMF, FWNUMI, LoTV and PARIGI. Each restored version shows how well the particular technique maintains texture, color fidelity, and edge sharpness. Notably, methods such as the IDCNN variants and FASTAMF preserve structural integrity better than the others, providing cleaner and more accurate reconstructions without heavy blurring or visible artifacts. This comparison makes the differences in visual performance across the range of restoration methods apparent. Fig 7 shows the various levels of noise density.
For fair comparison, the results of competing methods were either reproduced using publicly available implementations or directly reported from their original publications. All methods were evaluated under identical experimental conditions, including the same benchmark images, noise densities, and evaluation metrics.
This implies that the proposed method is more robust and better maintains image quality when the noise is intense. The visual comparison across all datasets confirms the robustness of the proposed model in comparison with both conventional and learning-based restoration methods. Furthermore, the model generalizes well across various levels of noise density, maintaining high performance even on heavily corrupted images. Compared to existing methods, the proposed architecture achieves better noise reduction without sacrificing image quality. Traditional methods such as median filtering and total variation are effective to an extent but are not adaptive to varying noise patterns. The auto-encoder with fuzzy median filter performs significantly better than a standalone auto-encoder in removing impulse noise from images in the BSD-68 dataset. The deep learning-based methods in Table 4 perform better but still apply denoising globally. In contrast, our method is targeted, adaptive, and structurally aware, making it more robust and efficient. Fig 4 shows noisy images at various noise levels.
5 Conclusion and future directions
The proposed hybrid de-noising framework, which leverages a CNN-based impulse noise detection network and an autoencoder coupled with a fuzzy median filter for noise removal, demonstrates remarkable efficiency in image restoration. Comparative analysis confirms that it consistently outperforms conventional state-of-the-art filtering approaches on the SSIM and PSNR metrics. This image restoration approach leads to improved de-noising performance and better preservation of image details. The method effectively removes impulse noise, preserves clean pixels, avoids visible artifacts, and efficiently processes grayscale images of sizes up to 512×512. The approach requires adjusting hyperparameters, after which the same filtering technique is applied to images with different noise densities, including challenging scenarios with different types of noise. Tests on the Set-12 and BSD-68 datasets reveal that the current model outperforms both classical de-noising methods and advanced deep learning approaches such as DnCNN.
The proposed denoising framework can be effectively applied in practical domains such as medical image preprocessing, satellite and remote sensing imagery, forensic image analysis, industrial inspection, and surveillance systems, where impulse noise frequently degrades image quality.
Future work may focus on adapting the two-stage approach to handle other types of impulsive or mixed noise and on further improving the efficiency of noisy pixel replacement through the use of alternative CNN architectures. Another promising research direction is the development of a unified network capable of performing both noise detection and pixel restoration within a single processing stage. Additionally, the proposed framework could be enhanced by integrating advanced deep learning models, such as vision transformers, diffusion-based methods, and Generative Adversarial Networks, to obtain better results after the de-noising process, potentially improving overall image quality and classification performance.
This hybrid approach filters impulsive noise while retaining image details and leaving clean pixels unaltered. Based on the experiments on the Set12 and BSD68 datasets, the present method performs better than other state-of-the-art approaches across different image restoration metrics. It remains efficient even at high noise densities, scales easily with image size, and provides a good balance between de-noising capability and preservation of perceived image quality.
The experimental outcomes show that the proposed model outperforms current state-of-the-art systems in terms of accuracy and overall effectiveness, but a few systematic deficiencies and limitations remain:
- The current design and performance analysis of the model are based on grayscale image data only, so the model can be applied to color or multichannel image datasets only to a limited extent.
- The approach is based on two separate stages (detection by DnCNN and restoration by the auto-encoder with fuzzy median filter), which may prove computationally demanding and can slow down inference.
- The hyperparameters must be carefully tuned for the system to perform efficiently, and the tuned values may not transfer to other datasets or noise types.
- The system is primarily optimized for impulse noise and might not perform as well on other noise distributions (Gaussian, Poisson, or real-life noise mixtures).
Considering the current weaknesses and limitations, the following key points can serve as directions for future research and improvement.
- To simplify the process, develop a single deep network capable of detecting and reconstructing noisy pixels simultaneously.
- Generalize the model to color images and larger image sizes (e.g., 1024×1024) so that it applies to more generic real-world data.
- Make the model more robust by training it to tolerate mixed or unknown noise in real-world settings.
5.1 Limitations
Although the proposed method demonstrates strong performance, several limitations should be acknowledged. First, the experiments are conducted only on grayscale benchmark datasets, which may limit generalization to color images. Second, impulse noise is synthetically generated using fixed assumptions, whereas real-world noise may exhibit more complex characteristics. Additionally, the two-stage architecture introduces additional computational cost compared to single-stage networks.
References
- 1. Zhang K, Zuo W, Chen Y, Meng D, Zhang L. Beyond a Gaussian Denoiser: Residual Learning of Deep CNN for Image Denoising. IEEE Transactions on Image Processing. 2017;26(7):3142–55.
- 2. Zhang K, Zuo W, Zhang L. FFDNet: Toward a Fast and Flexible Solution for CNN-Based Image Denoising. IEEE Transactions on Image Processing. 2018;27(9):4608–22.
- 3. Lehtinen J, Munkberg J, Hasselgren J, Laine S, Karras T, Aittala M, et al. Noise2Noise: Learning Image Restoration without Clean Data. In: Proceedings of the 35th International Conference on Machine Learning (ICML); 2018. p. 2965–74.
- 4. Krull A, Buchholz T, Jug F. Noise2Void: Learning Denoising from Single Noisy Images. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); 2019. p. 2129–37.
- 5. Guo S, Yan Z, Zhang K, Zuo W, Zhang L. Toward Convolutional Blind Denoising of Real Photographs. In: CVPR; 2019. p. 1712–22.
- 6. Anwar S, Barnes N. Real Image Denoising with Feature Attention. arXiv preprint arXiv:1904.07396. 2019.
- 7. Chen J, Zhang G, Xu S, Yu H. A Blind CNN Denoising Model for Random-Valued Impulse Noise. IEEE Access. 2019;7:124647–61.
- 8. Batson J, Royer L. Noise2Self: Blind Denoising by Self-Supervision. In: International Conference on Machine Learning (ICML); 2019.
- 9. Zamir SW, Arora A, Khan S, Hayat M, Khan FS, Yang M, et al. Multi-Stage Progressive Image Restoration. In: CVPR; 2021. p. 14821–31.
- 10. Rafiee AA, Farhang M. A Deep Convolutional Neural Network for Salt-and-Pepper Noise Removal Using Selective Convolutional Blocks. Applied Soft Computing. 2023;145:110535.
- 11. Zamir SW, Arora A, Khan S, Hayat M, Khan FS, Yang M, et al. Restormer: Efficient Transformer for High-Resolution Image Restoration. In: CVPR; 2022. p. 5728–39.
- 12. Tu Z, Talebi H, Yang F, Milanfar P, Bovik AC, Li Y. MAXIM: Multi-Axis MLP for Image Processing. In: CVPR; 2022. p. 5769–80.
- 13. Wang Z, Zhang Z, Li Y, Liang J, Wang Z, Zheng Y, et al. Uformer: A General U-Shaped Transformer for Image Restoration. In: CVPR Workshops; 2022.
- 14. Ulyanov D, Vedaldi A, Lempitsky V. Deep Image Prior. In: CVPR; 2018. p. 9446–54.
- 15. Brooks T, Mildenhall B, Xue T, Chen J, Sharlet D, Barron JT. Unprocessing Images for Learned Raw Denoising. In: CVPR; 2019. p. 11036–45.
- 16. Plötz T, Roth S. Benchmarking Denoising Algorithms with Real Photographs. In: CVPR; 2017. p. 1586–95.
- 17. Abdelhamed A, Lin S, Brown MS. A High-Quality Denoising Dataset for Smartphone Cameras. In: CVPR; 2018. p. 1692–700.
- 18. Zhuang F, Qi Z, Duan K, Xi D, Zhu Y, Zhu H, et al. A Comprehensive Survey on Transfer Learning. Proceedings of the IEEE. 2021;109(1):43–76.
- 19. Ancuti CO, Ancuti C, Timofte R, De Vleeschouwer C. Efficient Single Image Dehazing with Decomposition Priors. IEEE Transactions on Image Processing. 2018;27(11):5186–97.
- 20. Cun X, Pun CM. Defocus Blur Detection via Depth Distillation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); 2020. p. 261–70.
- 21. Wang Y, Liang J, Zhang K, Timofte R, Van Gool L. Denoising Autoencoders for Real Image Restoration: Bridging the Gap Between Synthetic and Real Data. IEEE Transactions on Image Processing. 2022;31:2372–86.
- 22. Maghoumi M, LaViola J. Fast and Accurate Denoising of 3D Point Clouds Using Graph Neural Networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); 2021. p. 1772–81.
- 23. Thakur RS, Yadav RN, Gupta L. State-of-Art Analysis of Image Denoising Methods Using Convolutional Neural Networks. IET Image Processing. 2019;13(13):2367–80.
- 24. Maheshwari R, Pathak K, Kamal AK. Context Aware CNN Approach to Denoise Salt and Pepper Images. In: 2024 IEEE International Conference on Interdisciplinary Approaches in Technology and Management for Social Innovation (IATMSI). IEEE; 2024. p. 1–6.
- 25. Fan L, Zhang J, Li Y, Yuan X. CNN-Based Image Denoising with Multi-Scale Feature Learning. IEEE Access. 2019;7:176150–60.
- 26. Masood S, Sharif M, Raza M. Intelligent Image Noise Detection and Removal using Neuro-Fuzzy Filtering. Applied Soft Computing. 2012;12(8):2712–28.
- 27. Farooq Y, Savasi S. Noise Removal from Images using Convolutional Neural Networks. Multimedia Tools and Applications. 2024.
- 28. Zhang K, Zuo W, Chen Y, Meng D, Zhang L. Beyond a Gaussian Denoiser: Residual Learning of Deep CNN for Image Denoising. IEEE Transactions on Image Processing. 2017;26(7):3142–55.
- 29. Meng Y, Zhang J, Xiong R. Gray Image Denoising Using Dilated Convolutional Neural Networks. Signal Processing: Image Communication. 2022;100:116549.
- 30. Mirza AM, Rehman S, Hussain M. A Two-Stage Switching Median Filter Based on Neural Networks for Impulse Noise Removal. Neural Computing and Applications. 2013;23(1):1–12.
- 31. Zhang F, Wu X, Zhang X. Image Denoising via Non-Local Means with Enhanced Self-Similarity. IEEE Signal Processing Letters. 2020;27:496–500.
- 32. Meng D, Xu Z. Image Denoising via Sparse Representation. IEEE Transactions on Image Processing. 2013;22(6):2226–35.
- 33. Ali M, Anwar S. Deep Residual Learning for Image Denoising. Neural Processing Letters. 2021;53:1–15.
- 34. Hussain A, Jaffar MA, Mirza AM. A Hybrid Image Restoration Approach: Fuzzy Logic and Directional Weighted Median Based Uniform Impulse Noise Removal. Knowledge and Information Systems. 2010;24:77–90.
- 35. Ansari AR, Younis M, Alam S. Wavelet Transform-Based Multi-Scale Image Denoising. IET Image Processing. 2018;12(4):454–62.
- 36. Malinski L, Smolka B. Fast Adaptive Switching Technique of Impulsive Noise Removal in Color Images. Journal of Real-Time Image Processing. 2019;16(4):1077–98.
- 37. Choi TS, Kim MS. Adaptive Noise Detection and Filtering Approach for Image Restoration. Pattern Recognition Letters. 2012;33(12):1534–41.
- 38. Maheshwari R, Sharma AK, Kumar P. Context-Aware CNN for Salt-and-Pepper Noise Removal. Multimedia Tools and Applications. 2024.
- 39. Nawaz R, Raza A, Malik H. GAN-Based Image Denoising for High-Quality Restoration. Neural Computing and Applications. 2020;32:15669–82.
- 40. Zhang J, Liu M, Wei Z. Transformer-Based Image Denoising. IEEE Access. 2021;9:146984–95.
- 41. Holla S, Kumar V, Singh P. Edge-Focused Image Denoising using Convolutional Neural Networks. Signal, Image and Video Processing. 2023.
- 42. Thakur RS, Gupta P, Jain S. Analysis of Image Denoising Methods using IDCNN. Multimedia Tools and Applications. 2019;78:32825–41.
- 43. Rafiee AA, Hosseini SA. Selective Convolutional Blocks for Salt-and-Pepper Noise Removal. Neural Processing Letters. 2023;55:4781–99.
- 44. Jaffar A, Hussain M, Ahmad S. Genetic Algorithm for Optimal Noise Filtering. Applied Soft Computing. 2017;52:1173–84.
- 45. Liu Y, Li X. Markov Random Field Models for Structure-Preserving Image Denoising. IEEE Transactions on Image Processing. 2014;23(1):214–24.
- 46. Ke J, Zhang F, Li Z. Computationally Iterative Methods for Salt-and-Pepper Image Denoising. Journal of Visual Communication and Image Representation. 2025.
- 47. Hussain M, Anwar S, Mirza AM. Sparse Dictionary Learning for High-Frequency Noise Removal. IET Image Processing. 2015;9(6):481–9.
- 48. Tanriover E, Ozturk BK, Demir M. Full Fractional Total Variation Method for Eliminating Salt-and-Pepper Noise. Signal Processing. 2025.
- 49. Zhang K, Li Y, Wang X. Hybrid Deep Learning with CNN and RNN for Image Restoration. Pattern Recognition Letters. 2017;100:1–8.
- 50. Hussain A, Mirza M, Masood S. Hybrid Fuzzy Logic and Directional Weighted Median Filtering for Impulse Noise Removal. Soft Computing. 2010;14(6):599–610.
- 51. Zheng M, Liu Y, Zhang P. Hybrid CNN for Image Denoising. IEEE Access. 2022;10:41230–42.
- 52. Malinski L, Wasilewski P, Nowak M. Fast Adaptive Switching Impulsive Noise Removal in Color Images. Multimedia Tools and Applications. 2019;78:22481–501.
- 53. Abiko R, Ikehara M. Blind Denoising of Mixed Gaussian-Impulse Noise by Single CNN. In: ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE; 2019. p. 1717–21.