Single image mixed dehazing method based on numerical iterative model and DehazeNet

As one of the most common adverse weather phenomena, haze has detrimental effects on many computer vision systems. To eliminate the effect of haze, image dehazing has been studied intensively in the field of image processing, and many advanced dehazing algorithms have been proposed. Physical model-based and deep learning-based methods are the two competitive approaches to single image dehazing, but achieving fidelity and effective dehazing simultaneously in real hazy scenes remains a challenging problem. In this work, a mixed iterative model is proposed that combines a physical model-based method with a learning-based method to restore high-quality clear images; it performs well in maintaining natural attributes while completely removing haze. Unlike previous studies, we first divide the image into different regions according to the density of haze to accurately calculate the atmospheric light for restoring haze-free images. Then, the dark channel prior and DehazeNet are used to jointly estimate the transmission, so that the final haze-free image is closer to the real scene. Finally, a numerical iterative strategy is employed to further optimize the atmospheric light and transmission. Extensive experiments demonstrate that our method outperforms existing state-of-the-art methods on synthetic and real-world datasets. Moreover, to show the universality of the proposed method, we further apply it to remote sensing datasets, where it also produces visually satisfactory results.


Introduction
Due to the absorption and scattering of suspended particles in the atmosphere, images taken in haze, fog, and smoke have low contrast and poor visibility. Hazy images can severely affect the performance of surveillance systems, remote sensing imaging, and computer vision tasks that depend on image quality. Single-image defogging methods recover a haze-free image from a low-quality foggy scene. According to the physical process of hazy image formation [1], a hazy image is described as:

I(x) = J(x)t(x) + A(1 − t(x)), (1)

where x is the pixel position, I denotes the hazy image, J is the clear image, t represents the scene transmission, and A represents the atmospheric light. Image dehazing aims to recover J from I, so t and A need to be estimated, which is a challenging task. Based on statistical knowledge of fog-free images [2][3][4][5][6], traditional prior-based methods estimate the atmospheric light and transmission in the atmospheric scattering model. To restore a haze-free image, Fattal [2] first estimated the transmission map by assuming that the scene transmission is statistically uncorrelated with surface shading. Zhu et al. [4] estimated the depth information in an image by establishing a linear model and then restored the fog-free image. Berman et al. [5] introduced a global algorithm based on a nonlocal prior to calculate the transmission of each pixel and then obtained the fog-free image according to the physical model. Fattal [7] found that pixels of images ordinarily exhibit one-dimensional distributions in RGB color space, called color-lines. They deduced a local formation model to explain the color-lines in the hazy scene and recovered the scene transmission from the color-lines' origin offset. Liu et al. [8] proposed a transmission-adaptive regularized image recovery method for high-quality single image dehazing.
Considering that the amplification of artifacts depends on scene transmission, a transmission-adaptive regularized recovery approach based on non-local total variation (NLTV) was presented, which can simultaneously suppress visual artifacts and retain image details in the final dehazing result. Based on observations of clear outdoor images, He et al. [6] introduced a novel haze removal method based on the dark channel prior (DCP), which restores clear images by estimating the transmission. The DCP performs well in image defogging, so a series of approaches have been presented to further improve the dehazing effect [9][10][11][12][13][14][15][16]. Based on the DCP, [13] modified the atmospheric veil and the transmission to remove haze from remote sensing images. Pan et al. [15] introduced a constant into the atmospheric scattering model to remove haze, based on the observation that the average intensity of dark channels in remote sensing images is low but not close to zero. Jiang et al. [16] introduced an experience-based single-image defogging method, which obtained the haze thickness maps of all bands from the dark channel. Taking into account the uneven distribution of haze thickness and the complexity of the light source, we presented a numerical iterative dehazing algorithm based on the DCP in our previous work [17]. The numerical iterative dehazing method can maintain image details and color fidelity, but the dehazed images still have some fog residue. The above methods rely on certain prior conditions; when the priors are violated, they yield inaccurate estimates, so these methods have limited applicability.
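As a concrete illustration of the scattering model above, the scene-recovery step that inverts I(x) = J(x)t(x) + A(1 − t(x)) can be sketched in a few lines of NumPy. The clamping threshold `t_min` is a common practical safeguard against division by near-zero transmission, not a value specified in this paper:

```python
import numpy as np

def recover_scene(I, A, t, t_min=0.1):
    """Invert the atmospheric scattering model I = J*t + A*(1 - t).

    I : hazy image, float array in [0, 1], shape (H, W, 3)
    A : atmospheric light, shape (3,)
    t : transmission map, shape (H, W)
    t_min clamps the transmission to avoid amplifying noise
    where the haze is dense (t close to 0).
    """
    t = np.clip(t, t_min, 1.0)[..., None]   # broadcast over color channels
    J = (I - A) / t + A
    return np.clip(J, 0.0, 1.0)
```

With exact A and t, synthesizing a hazy image and recovering it round-trips to the original clear image, which is a useful sanity check for any implementation of the model.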
With advances in deep learning, many data-driven methods have been presented for single-image dehazing [18][19][20][21][22][23][24][25][26][27][28][29][30][31][32][33][34][35]. Li et al. [19] presented an all-in-one dehazing network to estimate one variable that integrates atmospheric light and transmission. Ren et al. [20] introduced a gated fusion network for restoring fog-free images, which integrates three confidence maps obtained using different physical models. Li et al. [21] presented a defogging model to directly restore fog-free images by utilizing a conditional generative adversarial network. Based on an attention mechanism, Liu et al. [23] introduced GridNet [36] into single-image defogging and proposed a multiscale defogging network. Shao et al. [24] presented a domain adaptation paradigm consisting of an image translation module and two image dehazing modules. The method utilizes the images before and after translation to train networks with a consistency constraint, which further improves domain adaptability. Hong et al. [25] proposed a knowledge-distill dehazing network based on heterogeneous task imitation. By reconstructing the image and designing a spatial-weighted channel attention residual block, the method can better reconstruct the features to restore the haze-free image. Song et al. [26] introduced a deep convolutional neural network to generate the disparity and clear images simultaneously from a hazy stereo image pair. To learn the optimal fusion of depth-related features, a novel encoder-decoder architecture was presented, which extends the core idea of the attention mechanism to simultaneous stereo matching and dehazing. Recently, the problem of the irregular and nonuniform distribution of haze in remote sensing images has also been addressed. Gu et al. [27] presented a single remote sensing image dehazing method that directly produces fog-free images with a prior-based dense attentive dehazing network. Hu et al.
[28] presented an unsupervised dehazing method for high-resolution remote sensing images. These learning-based methods can directly restore the fog-free image, but they deviate from the physical model and cannot effectively preserve the physical properties of the image. In contrast to the above methods of directly recovering the fog-free image, Cai et al. [32] presented an end-to-end dehazing network (DehazeNet), which can generate the transmission map and then restore the clear image based on the physical model. Ren et al. [33] presented a multiscale network to compute coarse-to-fine transmission. Zhang and Patel [34] used a densely connected encoder-decoder network and U-Net [37] to estimate the transmission map and atmospheric light, respectively. Jiang et al. [35] introduced a remote sensing image dehazing method utilizing a multiscale residual convolutional neural network (MRCNN). These methods can effectively eliminate the haze effect in the image, but they are trained on synthetic foggy images, so the effectiveness of these methods in processing real-world images using learned intermediate parameters is limited.
Physical model-based methods perform poorly at thorough haze removal, while learning-based methods have limited generalization ability in natural scenes. To effectively remove haze from real foggy scenes, we propose a mixed iterative model that uses the DCP and DehazeNet to jointly estimate the transmission. The DCP is based on the atmospheric scattering model, and DehazeNet is data-driven, so the proposed method can preserve the natural attributes of the image and has universal applicability in achieving an effective defogging effect. Because fog density is uneven in images such as remote sensing images and ordinary natural hazy images, we adopt local atmospheric light instead of global atmospheric light. The experimental results indicate that the presented method removes fog effectively while preserving natural appearance.
The main contributions of our work are as follows:
1. A mixed numerical iterative framework is presented that uses both a physical model-based method and a deep learning-based method for image defogging.
2. We utilize the DCP and DehazeNet to jointly estimate the transmission, which can retain the physical characteristics of the image and learn the distribution of the fog to perform accurate image restoration.
3. We compare the presented method with other state-of-the-art algorithms on synthetic hazy images and natural foggy images. The results of the experiments show that the presented algorithm has good performance. The ablation study further proves the effectiveness of the presented approach.

Single image numerical iterative dehazing method
The DCP [6] is based on the observation that in most local areas of an outdoor fog-free image (excluding sky regions), some pixels have very low intensities, approaching 0, in at least one color channel. The dark channel J^dark of a fog-free image can be expressed as:

J^dark(x) = min_{y∈Ω(x)} ( min_{c∈{r,g,b}} J^c(y) ), (2)

where J denotes the fog-free image, J^c represents a color channel of J, and Ω(x) denotes a local block whose center is x. Utilizing this prior knowledge, [6] estimates the intermediate parameters A and t(x) to reconstruct the clear image according to the atmospheric scattering model. Based on the DCP, we introduced a numerical iterative method for single-image defogging [17], which can effectively remove fog from hazy images while preserving image detail and fidelity. This method first divides the image into regions, then estimates local physical features, and finally recovers the scene iteratively.

Region division. Considering that the distribution of fog in an image is not always uniform, the foggy image is divided into regions of different fog density using the affinity propagation (AP) [38] clustering algorithm before defogging. AP is an adaptive clustering algorithm. To represent the fog concentration, we choose the visibility and contrast of the input image as the clustering features. After the three steps of clustering feature selection, correlation matrix calculation and initialization, and matrix update and sample determination, the foggy image is divided into regions of different fog density.
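The dark channel computation above is a per-pixel minimum over color channels followed by a minimum over a local window. A minimal sketch (the explicit sliding-window loop is for clarity; `scipy.ndimage.minimum_filter` would be the faster drop-in replacement):

```python
import numpy as np

def dark_channel(img, patch=15):
    """Dark channel prior: min over RGB, then min over a local patch Omega(x).

    img : float array (H, W, 3); patch is the window size.
    """
    H, W, _ = img.shape
    per_pixel_min = img.min(axis=2)          # min over the color channels
    r = patch // 2
    padded = np.pad(per_pixel_min, r, mode="edge")
    out = np.empty((H, W))
    for y in range(H):
        for x in range(W):
            # min over the patch centered at (y, x)
            out[y, x] = padded[y:y + patch, x:x + patch].min()
    return out
```

For a haze-free image the resulting map is close to zero almost everywhere, which is exactly the statistical observation the DCP exploits.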
Local physical feature estimation. To reflect the differences in atmospheric light between regions and prevent halo artifacts in the restored images, we adopt the method proposed in [6] to calculate local atmospheric light instead of global atmospheric light. Then, we calculate the transmission based on the DCP. The rough transmission is estimated as:

t̃(x) = 1 − ω min_{y∈Ω(x)} ( min_c I^c(y) / A^c ),

where ω retains a small amount of haze for distant objects. In fact, the transmission is not always constant within a local block: fog in the distance is thicker than fog near the camera lens, so the farther a pixel is from the camera, the lower its transmission. Accordingly, we allow slightly different transmission estimates within a local patch by refining the rough transmission per pixel. In the refinement, D = min_c I^c denotes the per-pixel minimum over the three channels of the image block, D_min denotes the minimum value of D, and the adjusted transmission t_ori is used as the initial value for numerical iterative defogging.

Iteration for scene recovery. After region division and local physical feature estimation, an iterative model is introduced to recover a high-quality, fog-free image. The initial conditions of iterative dehazing are defined as follows: J_0 is the hazy image, t_0 = t_ori is the initial transmission, and A^m_0 (m = 1, 2, . . .) is the initial local atmospheric light computed from the separate haze density regions of J_0. A numerical iterative dehazing model is then constructed by inverting Eq (1):

J_{n+1}(x) = (J_n(x) − A^m_n) / t_n(x) + A^m_n,

where J_n represents the n-th approximation of the fog-free image, and A^m_n and t_n denote the local atmospheric light and transmission estimated from J_n at the n-th iteration.
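The rough transmission formula above can be sketched directly from the dark channel of the A-normalized image. This is a sketch of the standard DCP estimate, with ω = 0.95 as in He et al. [6]; the per-pixel refinement of [17] is not reproduced here:

```python
import numpy as np
from scipy.ndimage import minimum_filter

def rough_transmission(I, A, patch=15, omega=0.95):
    """DCP rough transmission: t = 1 - omega * dark_channel(I / A).

    I : hazy image (H, W, 3) in [0, 1]; A : atmospheric light (3,).
    omega keeps a small amount of haze for depth perception.
    """
    normalized = I / A                               # per-channel normalization
    dark = minimum_filter(normalized.min(axis=2), size=patch)
    return 1.0 - omega * dark
```

For a patch that looks exactly like the atmospheric light (pure haze), the normalized dark channel is 1 and the estimated transmission drops to 1 − ω, i.e. almost fully opaque haze.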
In the first iteration, we obtain the defogged image J_1 from the initial conditions and the iteration formula; A^m_1 and t_1 are also obtained at this point. After the first iteration, the dehazed image J_1 is taken as the hazy input for the second iteration, and A^m_2, t_2, and J_2 are obtained. The iterative process is repeated until the image J_n becomes haze-free and the iteration termination condition is satisfied.
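The feedback loop described above can be sketched as a small driver function. The helpers `estimate_A`, `estimate_t`, `recover`, and the stopping test `done` are placeholders for the per-region atmospheric light estimation, transmission estimation, and scene-recovery steps; their exact implementations are those described in the surrounding text, not fixed here:

```python
def iterative_dehaze(I, estimate_A, estimate_t, recover, max_iter=5, done=None):
    """Sketch of the numerical iterative scheme: the output of one
    dehazing pass is fed back in as the next 'hazy' input.
    """
    J = I
    for n in range(max_iter):
        A = estimate_A(J)        # local atmospheric light A^m_n from J_n
        t = estimate_t(J, A)     # transmission t_n from J_n
        J = recover(J, A, t)     # J_{n+1}
        if done is not None and done(J):
            break                # termination condition satisfied
    return J
```

The `max_iter` cap is a practical safeguard; in the paper the loop stops when the dark-pixel criterion described later is met.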

DehazeNet
The key to dehazing is to calculate a transmission map for the input foggy scene. Cai et al. [32] presented a convolutional neural network-based architecture named DehazeNet, which calculates the transmission and then uses a physical model to recover a fog-free image from a foggy image input.
DehazeNet is made up of four components: feature extraction, multiscale mapping, a local extremum, and nonlinear regression. The Maxout unit [39], a feed-forward nonlinear activation function, is introduced in the first layer of DehazeNet. By taking the pixel-wise maximum over k affine feature maps, the Maxout unit generates new feature maps that contain haze-related features. In the second layer, parallel convolutional operations are used: equal numbers of convolution kernels of different sizes are applied to the input features to perform multiscale feature extraction. To preserve the resolution of the restored image, the third layer performs a dense local extremum operation on each feature map obtained from the second layer. In the fourth layer, a novel linear unit, the bilateral rectified linear unit (BReLU), is introduced to maintain bilateral restraint and local linearity. These four layers are connected in series to form a trainable end-to-end CNN, in which the filters and biases of the convolutional layers are the network parameters to be learned.
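The two nonlinearities named above are simple enough to sketch directly. A minimal NumPy version of the Maxout grouping and the BReLU clamp (array shapes are a channels-first convention assumed here for illustration):

```python
import numpy as np

def maxout(feature_maps, k):
    """Maxout over groups of k feature maps: (k*n1, H, W) -> (n1, H, W).

    Each output map is the pixel-wise maximum of k affine feature maps.
    """
    kn, H, W = feature_maps.shape
    return feature_maps.reshape(kn // k, k, H, W).max(axis=1)

def brelu(x, t_min=0.0, t_max=1.0):
    """Bilateral ReLU: linear inside [t_min, t_max], saturated outside,
    so the regressed transmission stays in a valid range."""
    return np.minimum(t_max, np.maximum(t_min, x))
```

BReLU differs from a plain ReLU only in the upper saturation bound, which is what keeps the final regression output interpretable as a transmission value in (0, 1).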
Note that the convolutional neural network-based DehazeNet is data-driven and trained on paired synthetic hazy images and their haze-free counterparts. The transmission learned from synthetic data differs from that of natural hazy images. Moreover, DehazeNet regards atmospheric light as a global constant, which is particularly unsuitable for some multisource images and causes the dehazed images to suffer color and information loss. Based on local physical properties, the numerical iterative defogging method computes the parameters of the physical model iteratively from the natural hazy image itself; it retains the physical features of the image well and recovers natural color. However, this method relies on prior knowledge, so it has limited universal applicability. In this work, we present an iterative dehazing model based on joint transmission estimation: the transmissions estimated by the numerical iterative dehazing method and by DehazeNet are effectively fused. In this way, the clear image restored by our model effectively removes the fog while maintaining fidelity.

Proposed method
The single-image numerical iterative dehazing method we proposed in [17] is dependent on local physical features; it can remove most of the haze in images and restore appropriate brightness and color levels while maintaining the physical characteristics. DehazeNet uses a deep network to calculate the transmission, and it can effectively remove the fog by relying on paired synthetic data. Therefore, we present a fusion-based method by fusing the transmissions obtained from the physical model-based approach and data-driven method, which can effectively eliminate the fog and preserve the details of the image. This section introduces the details of the presented approach. First, we outline the presented single-image iterative dehazing method, which consists of three steps: estimation of the local atmospheric light, calculation of transmission, and iterative optimization. Second, for the key steps in our method, we introduce the estimation of transmission in detail. Then, the termination conditions of iterative optimization are described. Finally, we analyze the loss function used in the proposed method.

Method overview
Given a hazy image, the presented method aims to produce a fog-free image. The workflow is shown in Fig 1. First, without considering the prior information of the number of clusters, the foggy image is adaptively divided into areas with different fog densities using the AP algorithm. The local atmospheric light is estimated for each haze density area based on the atmospheric scattering model. Then, a physical model-based approach and a data-driven method are adopted to estimate the transmission. The joint transmission is produced by fusing the transmission obtained using the DCP-based method and DehazeNet. Next, we optimize the recovered image iteratively according to the iteration termination condition. Finally, a fog-free image is produced that fits the iteration termination condition and can effectively remove the fog.
The presented approach first trains DehazeNet with paired synthetic images to obtain a pretrained model. Then, the single-image numerical iterative defogging method is adopted to obtain the local atmospheric data and a new, refined transmission. The refined transmission is fused with the medium transmission obtained by the network. The joint transmission and local atmospheric data are used to recover the haze-free image based on the atmospheric scattering model. Finally, according to the iterated termination condition, the recovered image is re-input as the hazy image for iterative defogging. In each iteration, we fine-tune the whole network using the above pretrained model and produce the final dehazing output.

Estimation of joint transmission
The atmospheric transmittance is denoted by t (0 < t < 1); when t is close to 0, the image has low visibility. The proposed joint transmission estimation consists of two parts: a DCP-based estimation method and a data-driven estimation method.
DCP-based estimation method. We first obtain a rough transmission based on the DCP, which assumes that the transmission in a local patch is constant. In fact, the transmission in a local block is not always constant; it varies with the fog density in the image. Thus, a refined transmission t_{k,1}(x) is introduced, where k represents the number of iterations and t_{k,1}(x) denotes the transmission calculated by the DCP in the k-th iteration. The refined transmission improves the accuracy of the estimate, which ensures realistic color levels in the dehazed images.

Data-driven estimation method. The data-driven estimation method learns the mapping between foggy images and their transmission maps. DehazeNet is made up of cascaded convolutional and pooling layers, some of which use appropriate nonlinear activation functions. After four sequential layers, we obtain an estimate of the medium transmission.
The feature extraction layer is defined as follows:

F_1^i(x) = max_{j∈[1,k]} (W_1^{i,j} * I + B_1^{i,j})(x), i ∈ [1, n_1],

where W_1 = {W_1^{i,j}} and B_1 = {B_1^{i,j}}, with (i, j) ranging over [1, n_1] × [1, k], denote the filters and biases, respectively, and * indicates the convolution operation. I is the input image, and after the first layer we obtain n_1 output feature maps. In this layer, the Maxout unit maps each k·n_1-dimensional activation vector to an n_1-dimensional vector and captures haze-related features through automatic learning.
For multiscale mapping, parallel convolutional operations with three kernel sizes are used in the second layer. The output can be written as:

F_2^i = W_2^{⌈i/3⌉, (i mod 3)} * F_1 + B_2^{⌈i/3⌉, (i mod 3)}, i ∈ [1, n_2],

where W_2 and B_2 contain n_2 pairs of parameters that are divided into three groups (one per kernel size), n_2 is the output dimension of the multiscale mapping, and i ∈ [1, n_2] indexes the output features.
The third layer is a local extremum operation, applied densely to every pixel of each feature map; it maintains the resolution for image restoration. It is defined as:

F_3^i(x) = max_{y∈Ω(x)} F_2^i(y),

where Ω(x) is an f_3 × f_3 neighborhood centered at x. The output dimension of the local extremum operation is n_3 = n_2. The last layer is nonlinear regression, employing the BReLU activation function. In the fourth layer, the feature map is defined as:

F_4 = min(t_max, max(t_min, W_4 * F_3 + B_4)),

where W_4 = {W_4} includes a filter of size n_3 × f_4 × f_4, B_4 = {B_4} includes a bias, and t_min and t_max denote the marginal values of the BReLU. Accordingly, the gradient of this activation function is:

∂F_4 / ∂F_3 = W_4 if t_min ≤ W_4 * F_3 + B_4 ≤ t_max, and 0 otherwise.

From the above four layers, the transmission t_{k,2} estimated by DehazeNet is obtained. The transmission t_{k,2} estimated by the deep network and t_{k,1} estimated by the physical model are then effectively fused, and the joint transmission used for haze-free image recovery in each iteration can be described as:

t_k(x) = λ_1 t_{k,1}(x) + λ_2 t_{k,2}(x),

where λ_1 and λ_2 denote the weights of each transmission.
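The fusion of the two transmission maps is a pixel-wise weighted sum, sketched below. The final clip to [0, 1] is an assumed safeguard to keep the result a valid transmission, not stated in the text:

```python
import numpy as np

def joint_transmission(t_dcp, t_net, lam1=0.5, lam2=0.5):
    """Fuse the DCP-based and DehazeNet transmissions:
    t_k = lam1 * t_{k,1} + lam2 * t_{k,2}
    (lam1 = lam2 = 0.5 in the experiments of this paper).
    """
    t = lam1 * t_dcp + lam2 * t_net
    return np.clip(t, 0.0, 1.0)
```

With equal weights the fused map is simply the average, so pixels where the prior and the network disagree are pulled toward a compromise rather than trusting either estimator alone.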

Iterative optimization
The threshold ε of the iteration termination criterion is set to the average brightness of the training samples. At the end of each iteration, the percentage p of dark pixels in the recovered image is computed. When ε > p, the iteration terminates; otherwise, the recovered image goes through the next loop of local atmospheric light and transmission estimation until the condition ε > p is satisfied.
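The stopping test above can be sketched as a one-line statistic. The darkness cutoff `dark_thresh` used to classify a pixel as "dark" is an assumed illustrative value; the paper only specifies that ε is tied to the average brightness of the training samples:

```python
import numpy as np

def should_stop(J, eps, dark_thresh=0.1):
    """Termination test for the iterative loop.

    p is the fraction of 'dark' pixels (per-pixel channel minimum
    below dark_thresh) in the recovered image J; iterate while p >= eps.
    """
    p = float((J.min(axis=2) < dark_thresh).mean())
    return eps > p
```

Intuitively, as haze is removed the dark channel re-emerges, p grows, and once it falls below the learned brightness threshold ε (i.e. ε > p no longer fails) the loop ends.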

Loss function
DehazeNet employs the mean squared error (MSE) as the loss function, which minimizes the difference between the transmission predicted for a foggy training patch and the ground-truth transmission of the corresponding fog-free image. The MSE loss is expressed as follows:

L(Θ) = (1/N) Σ_{i=1}^{N} || F(I_i^P; Θ) − t_i ||²,

where Θ = {W_1, W_2, W_4, B_1, B_2, B_4} are the network parameters, N is the number of training patches, F maps the RGB values to the transmission, I_i^P is the i-th training patch, and t_i is its ground-truth medium transmission.
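The loss itself is a plain MSE over predicted and ground-truth transmission values, as in:

```python
import numpy as np

def mse_loss(t_pred, t_true):
    """MSE between predicted and ground-truth transmissions:
    L(Theta) = (1/N) * sum_i || F(I_i^P; Theta) - t_i ||^2.
    """
    t_pred = np.asarray(t_pred, dtype=float)
    t_true = np.asarray(t_true, dtype=float)
    return float(np.mean((t_pred - t_true) ** 2))
```

In training, `t_pred` would be the DehazeNet output for a batch of patches and `t_true` the transmissions used to synthesize those patches.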

Experimental results
The effectiveness of the presented dehazing model is evaluated by comparison with four advanced prior-based algorithms (DCP [6], CAP [4], NLD [5], and Wang et al.'s method [17]) and five learning-based approaches (MSCNN [33], DehazeNet [32], AOD-net [19], GFN [20], and CycleGAN [22]) in this section. On synthetic datasets and real-world foggy images, we compare the presented approach with other methods. The metrics peak signal-to-noise ratio (PSNR), structural similarity index measure (SSIM) and information entropy are adopted to quantitatively evaluate the experimental results. To prove the contribution of this paper, we conduct an ablation study to analyze the presented method.

Experimental data
This paper uses the RESIDE dataset [40] to synthesize the training data. The RESIDE dataset draws on a large number of data sources, including indoor and outdoor scene images (ITS and OTS, respectively). We selected 4000 random fog-free images from ITS and OTS to create the training dataset. Then, 1,000 images were selected from each of these two datasets as the fine-tuning dataset. For each image, we randomly chose 10 transmissions t ∈ (0, 1) uniformly to generate 10 foggy images. Thus, there are 100,000 hazy images and corresponding fog-free images in total, which are adopted as the training data for DehazeNet.
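The pair-synthesis step can be sketched by applying the scattering model with uniformly sampled transmissions. This is a simplified illustration with a scalar, image-wide t; the sampling range is clamped slightly away from 0 and 1 to avoid degenerate all-haze or no-haze images, which is an assumption of this sketch rather than a detail from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def synthesize_hazy(J, A=1.0, n=10):
    """Generate n (hazy image, t) pairs from a clear image J by
    sampling t ~ U(0.05, 0.95) and applying I = J*t + A*(1 - t)."""
    hazy = []
    for _ in range(n):
        t = rng.uniform(0.05, 0.95)
        hazy.append((J * t + A * (1.0 - t), t))
    return hazy
```

Each sampled t serves double duty: it hazes the clear image and becomes the ground-truth label for the transmission-regression loss.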
We made qualitative and quantitative comparisons with other competitive methods on both natural and remote sensing images. The HazeRD dataset [41] contains 75 synthesized hazy images with realistic haze conditions. The remote sensing images were derived from publicly available databases. The NWPU VHR-10 dataset [42,43] is a 10-class geographic remote sensing dataset for spatial object detection, containing 800 images: 650 images with targets and 150 background images. The NWPU-RESISC45 dataset [44] is a remote sensing image scene classification dataset. The RSOD dataset [45,46] and the Dataset for Object Detection in Aerial Images (DOTA) [47] are two datasets for remote sensing image object detection. These images cover many scenes and different objects and contain different haze densities.

Implementation details
The experiments in this article were implemented on a computer with an Intel(R) Core(TM) i5-9400F CPU @ 2.90 GHz, 16 GB of RAM, and an NVIDIA RTX 2080 Ti GPU. The learning rate of DehazeNet is set to 5e-3 and decays by half every 10 epochs. DehazeNet is trained for 50 epochs with a batch size of 16 on the training dataset. The weights of transmission fusion are set to λ_1 = 0.5 and λ_2 = 0.5. Finally, we fine-tune the network from the pretrained model on the fine-tuning dataset.
Figs 2 and 3 compare dehazing results: (b) DCP [6], (c) CAP [4], (d) NLD [5], (e) Wang [17], (f) MSCNN [33], (g) DehazeNet [32], (h) AOD-net [19], (i) GFN [20], (j) CycleGAN [22], (k) Ours, (l) Ground truth. The images restored by GFN are significantly brighter, as shown in Figs 2 and 3(i). The dehazed results of CycleGAN have artifacts and unrealistic color tones, as shown in Figs 2 and 3(j). By contrast, the proposed method effectively eliminates most of the fog and is visually closer to the ground truth, with realistic colors, on hazy images from the SOTS and HazeRD datasets. Tables 1 and 2 show the objective results of the presented approach and the comparison approaches on indoor and outdoor hazy images from the SOTS test sets. Table 3 lists the quantitative evaluations on the HazeRD dataset. The comparison indicators include PSNR, SSIM, and information entropy. Tables 1-3 show that the presented method achieves advanced performance on these evaluation metrics, which also indicates the effectiveness of the presented algorithm.
Our method achieves good results on both thin and dense fog images, recovering clear contours and rich colors while reducing the loss of details.

PLOS ONE
The dehazing results of CycleGAN have artifacts and unrealistic color tones, as shown in Figs 8(j) and 9(j). In comparison, our method effectively removes most of the fog in the remote sensing images and is visually closer to natural color. Table 4 shows the information entropy scores of our method and the other methods. Our method has the best entropy scores on the NWPU VHR-10 [42,43], RSOD [45,46], and DOTA [47] datasets. Although the performance of our method on the NWPU-RESISC45 dataset [44] is slightly worse, it is superior to the other methods in subjective visual quality. The average result of our method over the four datasets is the best, which indicates that our method is more competitive than the other methods.
The learning-based method can effectively eliminate most of the fog but sacrifices color information, as shown in the sky area of the image. The mixed model proposed in this paper effectively integrates the strengths of these two methods and can effectively remove the fog while maintaining the physical properties of the image.
In conclusion, the experimental results show that the presented approach is competitive in dehazing performance compared with other state-of-the-art methods.

Conclusion
In this paper, a new mixed iterative model is presented to improve single image dehazing. Noting that the physical model-based method can yield high-fidelity images and that the learning-based method can effectively eliminate haze, we propose fusing the two models and estimating the transmission jointly with the DCP and DehazeNet. Hence, our method can retain the physical characteristics of the image and learn the distribution of haze to accurately recover the haze-free image. Furthermore, to obtain more precise results, we also calculate the local atmospheric light in different haze density areas. Finally, by integrating the numerical iterative strategy into the proposed method, the jointly estimated transmission and atmospheric light are progressively optimized. To demonstrate its generality, our method is evaluated on synthetic and real-world datasets, as well as remote sensing datasets. Thorough experimental results demonstrate that the proposed algorithm obtains high-quality results, quantitatively and qualitatively comparable to advanced dehazing approaches.