Iterative Nonlocal Total Variation Regularization Method for Image Restoration

In this paper, a Bregman iteration based total variation image restoration algorithm is proposed. Based on the Bregman iteration, the algorithm splits the original total variation problem into sub-problems that are easy to solve. Moreover, nonlocal regularization is introduced into the proposed algorithm, together with a method to choose the nonlocal filter parameter locally and adaptively. Experimental results show that the proposed algorithms outperform several other regularization methods.


Introduction
Image restoration is a classical inverse problem which has been extensively studied in various areas such as medical imaging, remote sensing and video or image coding. In this paper, we focus on the common image acquisition model: an ideal image $u \in \mathbb{R}^n$ is observed in the presence of a spatially invariant blur kernel $A \in \mathbb{R}^{n \times n}$ and additive white Gaussian noise $n \in \mathbb{R}^n$ with zero mean and standard deviation $\sigma$. The observed image $f \in \mathbb{R}^n$ is then obtained by

$$f = Au + n. \qquad (1)$$

Restoring the ideal image from the observed image is ill-posed since the blur kernel matrix is usually singular. A common way to solve this problem is to use regularization methods, in which regularization terms are introduced to restrict the solutions. Regularization methods generally take the form

$$\arg\min_u \|Au - f\|^2 + \lambda R(u), \qquad (2)$$

where $\|\cdot\|$ denotes the Euclidean norm, $\lambda$ is a positive regularization parameter balancing the fitting term and the regularization term, and $R(u)$ is the regularization term. The total variation regularization proposed by Rudin, Osher and Fatemi [1] (also called the ROF model) is a well-known regularization method in this field. The total variation norm has a piecewise-smooth regularization property, so total variation regularization can preserve edges and discontinuities in the image. The unconstrained ROF model has the form

$$\arg\min_u \|Au - f\|^2 + \lambda \|u\|_{TV}, \qquad (3)$$

where the term $\|u\|_{TV}$ stands for the total variation of the image. The continuous form of the total variation is defined as

$$\|u\|_{TV} = \int_\Omega |\nabla u|\, dx. \qquad (4)$$

Many numerical methods have been proposed to solve (3). When $A$ is the identity matrix, the ROF model (3) reduces to a TV denoising problem, for which methods such as Chambolle's projection method [2], semismooth Newton methods [3], the multilevel optimization method [4] and the split Bregman method [5] are available.
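As a concrete illustration of the degradation model $f = Au + n$, the following sketch simulates the blur as a circular (FFT-based) convolution with a Gaussian kernel. The function names and the periodic boundary assumption are ours, not part of the paper:

```python
import numpy as np

def gaussian_psf(shape, sigma):
    """Centered Gaussian point-spread function, normalized to sum to 1."""
    h, w = shape
    yy, xx = np.meshgrid(np.arange(h) - h // 2, np.arange(w) - w // 2,
                         indexing="ij")
    k = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return k / k.sum()

def degrade(u, psf, noise_sigma, rng):
    """f = A u + n, with circular convolution standing in for the blur A."""
    A_hat = np.fft.fft2(np.fft.ifftshift(psf))   # transfer function of the blur
    blurred = np.real(np.fft.ifft2(A_hat * np.fft.fft2(u)))
    return blurred + rng.normal(0.0, noise_sigma, u.shape)

rng = np.random.default_rng(0)
u = np.zeros((64, 64)); u[16:48, 16:48] = 1.0    # toy "ideal" image
f = degrade(u, gaussian_psf(u.shape, 3.0), 0.01, rng)
```

Because the kernel is normalized and the convolution is circular, the blur preserves the mean intensity; only the zero-mean noise perturbs it.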
When $A$ is a blur kernel matrix, (3) becomes a TV deblurring problem, for which we have primal-dual optimization algorithms for TV regularization [6][7][8][9], the forward-backward operator splitting method [10], the interior point method [11], the majorization-minimization approach for image deblurring [12], Bayesian frameworks for TV regularization and parameter estimation [13][14][15], methods using local information and Uzawa's algorithm [16,17], regularized locally adaptive kernel regression [18], augmented Lagrangian methods [19,20] and so on. However, the problem is far from perfectly solved: edge and detail preservation [21,22], ringing-effect reduction [23][24][25] and restoration under varied blur kernels and noise types [26,27] still need better solutions.
The purpose of this paper is to propose an effective total variation minimization algorithm for image restoration. The algorithm is based on the Bregman iteration, which gives significant improvement over standard models [28]. We solve the proposed algorithm by alternately solving a deblurring problem and a denoising problem [29,30]. In addition, we propose a locally adaptive nonlocal regularization approach to improve the restoration results.
The structure of the paper is as follows. In the next section, an iterative algorithm for total variation based image restoration is proposed; moreover, we present a nonlocal regularization under the proposed algorithm framework with a locally adaptive filter parameter to improve the restoration results. Section Experiments shows the experimental results, and Section Conclusions concludes the paper.

Iterative Approach for TV-based Image Restoration
Bregman iteration for image restoration. We first consider a general minimization problem as follows:

$$\arg\min_u J(u) + \lambda H(u, f), \qquad (5)$$

where $\lambda$ is the regularization parameter, $J$ is a convex nonnegative regularization functional and the fitting functional $H$ is convex nonnegative with respect to $u$ for fixed $f$. This problem is difficult to solve numerically when $J$ is non-differentiable, and the Bregman iteration is an efficient method to solve it. Bregman iteration is based on the concept of the "Bregman distance". The Bregman distance of a convex functional $J(\cdot)$ between points $u$ and $v$ is defined as:

$$D_J^p(u, v) = J(u) - J(v) - \langle p, u - v \rangle, \qquad (6)$$

where $p \in \partial J(v)$ is a sub-gradient of $J$ at the point $v$. The Bregman distance is generally not symmetric, so it is not a distance in the usual sense, but it does measure the closeness of two points: $D_J^p(u,v) \ge 0$ for any $u$ and $v$, and $D_J^p(u,v) \ge D_J^p(w,v)$ for all points $w$ on the line segment connecting $u$ and $v$. Using the Bregman distance (6), the original minimization problem (5) can be solved by an iterative procedure:

$$u^{k+1} = \arg\min_u D_J^{p^k}(u, u^k) + \lambda H(u, f), \qquad p^{k+1} = p^k - \lambda \partial H(u^{k+1}, f), \qquad (7)$$

where $\partial H(u^{k+1}, f)$ denotes a sub-gradient of $H$ at $u^{k+1}$ and $\lambda > 0$. When we choose $H(u,f) = \|Au - f\|^2$ and $J(u) = \|u\|_{TV}$, (7) becomes the total variation minimization problem, and (7) can be converted into the following two-step Bregman iterative scheme [28]:

$$u^{k+1} = \arg\min_u \|u\|_{TV} + \lambda \|Au - f^k\|^2, \qquad (8)$$
$$f^{k+1} = f^k + f - Au^{k+1}. \qquad (9)$$

In [28], the authors showed that the sequence $u^k$ converges weakly to a solution of the unconstrained form of (5), and that the sequence $\|Au^k - f\|^2$ converges to zero monotonically. We can see from (8) and (9) that the Bregman iteration simply turns the original problem (5) into an iterative procedure and adds the noise residual back into the degenerated image at the end of every iteration. Bregman iteration converges fast and obtains better results than standard methods, and it has been widely used in various areas of image processing [5,[31][32][33]].
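The two-step scheme (8), (9) can be sketched as a short outer loop: solve the regularized problem against the current data $f^k$, then add the residual $f - Au^{k+1}$ back. The sketch below is ours; the inner solver is passed in as a callable, and the toy example (identity $A$ with a quadratic regularizer, whose inner problem has the closed form $f^k/2$) is a hypothetical stand-in for the TV sub-solver:

```python
import numpy as np

def bregman_iterate(f, apply_A, solve_inner, n_iters):
    """Two-step Bregman scheme (8)-(9): solve the regularized problem against
    f_k, then add the residual f - A u back for the next pass."""
    f_k = f.copy()
    u = np.zeros_like(f)
    for _ in range(n_iters):
        u = solve_inner(f_k)          # u = argmin_u J(u) + lam*||A u - f_k||^2
        f_k = f_k + f - apply_A(u)    # f_{k+1} = f_k + f - A u
    return u

# Toy check: A = I, J(u) = ||u||^2, lam = 1; the inner problem
# argmin_u ||u||^2 + ||u - f_k||^2 has the closed form f_k / 2.
f = np.array([1.0, -2.0, 3.0])
u = bregman_iterate(f, apply_A=lambda v: v,
                    solve_inner=lambda fk: fk / 2.0, n_iters=10)
```

In this toy case the iterates satisfy $u^k = (1 - 2^{-k}) f$, illustrating how adding the residual back drives $\|Au^k - f\|$ to zero even though each inner solve is biased toward zero.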
General framework of the iterative algorithm. As introduced above, we use the Bregman iteration (8), (9) to build the main iterative framework. Rather than considering (3) directly, we consider the following problem:

$$\arg\min_{u,g} \|Ag - f\|^2 + \lambda \|u\|_{TV} \quad \text{subject to } g = u. \qquad (10)$$

We separate the variable $u$ in (3) into two independent variables, so that the original problem (3) can be split into subproblems which are easy to solve. This problem is obviously equivalent to (3). We can replace (10) by its unconstrained form:

$$\arg\min_{u,g} \|Ag - f\|^2 + \lambda_1 \|g - u\|^2 + \lambda_2 \|u\|_{TV}, \qquad (11)$$

where $\lambda_1$ and $\lambda_2$ are regularization parameters balancing the three terms. If the regularization parameter $\lambda_1$ is large enough, problem (11) is close to problem (10), and the solutions of the two problems are similar. If we let $J(u,g) = \|Ag - f\|^2 + \lambda_2 \|u\|_{TV}$ and $H(u,g) = \|g - u\|^2$, we can see that $J$ and $H$ are both convex, and then (10) is a simple application of (7). Thus, the above problem can be solved by applying the Bregman iteration, and similar to [28], the procedure can be reformulated as a simple two-step iterative algorithm:

$$(u^{k+1}, g^{k+1}) = \arg\min_{u,g} \|Ag - f^k\|^2 + \lambda_1 \|g - u\|^2 + \lambda_2 \|u\|_{TV}, \qquad f^{k+1} = f^k + f - Ag^{k+1}. \qquad (13)$$

As we can see in (13), when $\lambda_1$ tends to infinity, the above algorithm becomes the original Bregman iterative algorithm in [28]. We use an alternating minimization algorithm [29,30] to solve (13), splitting it into a deblurring subproblem and a denoising subproblem. Thus, (13) can be solved by the following two-step iteration:

$$g^{k+1} = \arg\min_g \|Ag - f^k\|^2 + \lambda_1 \|g - u^k\|^2, \qquad (15)$$
$$u^{k+1} = \arg\min_u \lambda_1 \|g^{k+1} - u\|^2 + \lambda_2 \|u\|_{TV}. \qquad (16)$$

We can see that (15) is a differentiable $\ell_2$-norm problem, whose minimizer satisfies the normal equation

$$(A^T A + \lambda_1 I)\, g^{k+1} = A^T f^k + \lambda_1 u^k, \qquad (17)$$

where $I$ is the identity matrix and the matrix $A^T A + \lambda_1 I$ is invertible. Thus (15) can be solved by optimization techniques such as Gauss-Seidel, conjugate gradient or the Fourier transform. As for (16), it is exactly a total variation denoising problem, and we can use Chambolle's projection algorithm [2], the semismooth Newton method [3] or the split Bregman algorithm [5] to solve it. The proposed alternating Bregman iterative method for image restoration can then be summarized as follows: Algorithm 1: Alternating Bregman iterative method for image restoration.
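A minimal sketch of the two sub-steps, assuming $A$ is a circular convolution (so $A^T A$ is diagonalized by the FFT) and using a bare-bones Chambolle projection with periodic boundaries for the denoising step. The function names and boundary handling are our choices, not the paper's:

```python
import numpy as np

def _grad(u):                      # forward differences, periodic boundaries
    return np.roll(u, -1, 0) - u, np.roll(u, -1, 1) - u

def _div(px, py):                  # negative adjoint of _grad
    return px - np.roll(px, 1, 0) + py - np.roll(py, 1, 1)

def tv_denoise(g, theta, n_iters=80, tau=0.125):
    """Chambolle's dual projection for min_u ||u - g||^2/(2*theta) + ||u||_TV."""
    px = np.zeros_like(g)
    py = np.zeros_like(g)
    for _ in range(n_iters):
        dx, dy = _grad(_div(px, py) - g / theta)
        mag = np.sqrt(dx**2 + dy**2)
        px = (px + tau * dx) / (1.0 + tau * mag)
        py = (py + tau * dy) / (1.0 + tau * mag)
    return g - theta * _div(px, py)

def deblur_step(f_k, u, A_hat, lam1):
    """Solve (A^T A + lam1*I) g = A^T f_k + lam1*u in the Fourier domain,
    where A_hat is the transfer function of the circular blur A."""
    num = np.conj(A_hat) * np.fft.fft2(f_k) + lam1 * np.fft.fft2(u)
    return np.real(np.fft.ifft2(num / (np.abs(A_hat)**2 + lam1)))
```

One outer pass is then `g = deblur_step(f_k, u, A_hat, lam1)` followed by `u = tv_denoise(g, theta)`, where matching (16) to the Chambolle form gives $\theta = \lambda_2 / (2\lambda_1)$.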
Analysis of the proposed algorithm. First, we show some important monotonicity properties of the Bregman iteration proposed in [28].
Theorem 1 The sequence $H(u^k, f)$ obtained from the Bregman iteration is monotonically nonincreasing. Moreover, assume that there exists a minimizer $\tilde{u}$; then $H(u^k, f) \le H(\tilde{u}, f) + J(\tilde{u})/k$ and, in particular, $u^k$ is a minimizing sequence. Moreover, $u^k$ has a weak-* convergent subsequence in $BV(\Omega)$, and the limit of each weak-* convergent subsequence is a solution of $Au = f$. If $\tilde{u}$ is the unique solution of $Au = f$, then $u^k \to \tilde{u}$ in the weak-* topology in $BV(\Omega)$.
Then, we show that the alternating minimization algorithm (15), (16) also converges to the solution of the sub-problem (13) [30]. Let $L$ be the difference matrix and let $\mathrm{NULL}(\cdot)$ denote the null space of the corresponding matrix, and assume $\mathrm{NULL}(A^T A) \cap \mathrm{NULL}(L^T L) = \{0\}$; we then obtain the following theorem.
Theorem 2 For any initial guess $u^0 \in \mathbb{R}^{n^2}$, suppose $\{u^i\}$ is generated by (15) and (16); then $u^i$ converges to a stationary point of (10). Moreover, when $A$ is a matrix of full column rank, $u^i$ converges to a minimizer of (10).
Then, we can get the following convergence theorem of the proposed alternating Bregman iterative method.
Theorem 3 Let $A$ be a linear operator and consider Algorithm 1. If $\{u^i\}$ is a sequence generated by Algorithm 1, then $u^i$ converges to a solution of the original problem (3).
Proof. Let $\{u^i\}$ and $\{g^i\}$ be the sequences obtained from (13), where every $u^i$ is a solution of (13); moreover, $H(u,g) = \|u - g\|^2 \to 0$ as the iterations of Algorithm 1 increase. Suppose that in some iteration there are $u^*$ and $g^*$ satisfying $u^* = g^*$, and let the true solutions of problem (10) be $\tilde{u}$ and $\tilde{g}$. Since $u^*$ and $g^*$ satisfy (11), they attain the minimum value of the convex functional (11), so

$$\|Ag^* - f\|^2 + \lambda_1 \|g^* - u^*\|^2 + \lambda_2 \|u^*\|_{TV} \le \|A\tilde{g} - f\|^2 + \lambda_1 \|\tilde{g} - \tilde{u}\|^2 + \lambda_2 \|\tilde{u}\|_{TV}.$$

Since $u^* = g^*$ and $\tilde{u} = \tilde{g}$, we obtain

$$\|Ag^* - f\|^2 + \lambda_2 \|u^*\|_{TV} \le \|A\tilde{g} - f\|^2 + \lambda_2 \|\tilde{u}\|_{TV}.$$

Since $\tilde{u}$ and $\tilde{g}$ are the true solutions of problem (10), this inequality implies that $u^*$ and $g^*$ are also solutions of problem (10), and thus solutions of the original unconstrained problem (3).
Connection with other methods. We notice that equation (17) can be rewritten as

$$g^{k+1} = (A^T A + \lambda_1 I)^{-1} (A^T f^k + \lambda_1 u^k),$$

so the proposed Algorithm 1 can be interpreted as alternately deblurring against the residual-corrected data $f^k$ and denoising the result. The preconditioned Bregmanized nonlocal regularization (PBOS) algorithm [33] can be written in a similar two-step form in which the left and right pseudo-inverse approximations are equal.

[Figure 1. Convergence speed of Algorithm 1, operator splitting TV, FTVd and FAST-TV. A: $7 \times 7$ Gaussian kernel with $\sigma = 3$ and Gaussian noise with standard deviation 2. B: $9 \times 9$ average kernel and Gaussian noise with standard deviation 3. The figure compares the four methods on the Cameraman image with two different blur kernels; the x-axis is the iteration count and the y-axis is the relative difference $\|u^k - u^{k-1}\| / \|u^k\|$ between restored images in two successive iterations. doi:10.1371/journal.pone.0065865.g001]

Comparing these two methods, we can see that the only difference between them is the way the noise is calculated and added back into the iteration. The PBOS method calculates the un-deconvolved noise and adds it back only in the computation of $g$, while the proposed method calculates the deconvolved noise and adds it back in the computations of both $g$ and $u$; we believe this is why the proposed algorithm has a faster convergence speed and better restoration results according to the experiments in the Experiments section.

Adaptive Nonlocal Regularization
Nonlocal regularization. Recently, nonlocal methods have been extensively studied; the nonlocal means filter was first proposed by Buades et al. [34]. The main idea of the nonlocal means denoising model is to denoise every pixel by averaging the other pixels whose structures (patches) are similar to that of the current one. Based on the nonlocal means filter, Kindermann et al. [35] investigated the use of regularization functionals with nonlocal correlation terms for general inverse problems. Inspired by the graph Laplacian and the nonlocal means filter, Gilboa and Osher defined a variational framework based on nonlocal operators [36]. In the following, we use the definitions of the nonlocal regularization functionals introduced in [36].
Let $u: \Omega \to \mathbb{R}$ be a real function and let $w: \Omega \times \Omega \to \mathbb{R}$ be a nonnegative symmetric weight function. The nonlocal partial difference is $\partial_y u(x) = (u(y) - u(x)) \sqrt{w(x,y)}$, and the nonlocal gradient $\nabla_w u(x)$ is defined as the vector of all partial differences $\partial_y u(x, \cdot)$ at $x$. The graph divergence $\mathrm{div}_w$ of a vector $p: \Omega \times \Omega \to \mathbb{R}$ can be defined as

$$\mathrm{div}_w\, p(x) = \int_\Omega (p(x,y) - p(y,x)) \sqrt{w(x,y)}\, dy.$$

The weight function is defined as the nonlocal means weight function:

$$w(x,y) = \exp\!\left(-\frac{(G_a * |u(x+\cdot) - u(y+\cdot)|^2)(0)}{h^2}\right), \qquad (26)$$

where $G_a$ is the Gaussian kernel with standard deviation $a$ and $h$ is the filtering parameter related to the standard deviation of the noise. When the reference image $f$ is known, the nonlocal means filter is a linear operator. The definition of the weight function (26) shows that the weight is significant only when the patch around $y$ has a similar structure to the corresponding patch around $x$.
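The weight (26) for a single pair of pixels can be sketched as below: the squared patch difference is weighted by a normalized Gaussian kernel $G_a$ and passed through the exponential. The helper names are ours, and boundary patches are assumed to lie inside the image:

```python
import numpy as np

def patch_kernel(radius, a):
    """Normalized Gaussian patch kernel G_a of size (2*radius+1)^2."""
    ax = np.arange(-radius, radius + 1)
    yy, xx = np.meshgrid(ax, ax, indexing="ij")
    G = np.exp(-(xx**2 + yy**2) / (2.0 * a**2))
    return G / G.sum()

def nl_weight(u, x, y, h, radius=2, a=1.5):
    """w(x, y) = exp(-(G_a * |u(x+.) - u(y+.)|^2)(0) / h^2) for two pixel
    coordinates x, y whose (2*radius+1)^2 patches fit inside u."""
    G = patch_kernel(radius, a)
    px = u[x[0] - radius:x[0] + radius + 1, x[1] - radius:x[1] + radius + 1]
    py = u[y[0] - radius:y[0] + radius + 1, y[1] - radius:y[1] + radius + 1]
    return np.exp(-np.sum(G * (px - py)**2) / h**2)

img = np.random.default_rng(0).standard_normal((16, 16))
w = nl_weight(img, (8, 8), (5, 9), h=1.0)   # a single nonlocal weight in (0, 1]
```

Identical patches give the maximal weight 1, and the weight is symmetric in $x$ and $y$, matching the requirements on $w$ above.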
The nonlocal TV norm can be defined as the isotropic $L^1$ norm of the weighted graph gradient $\nabla_w u(x)$:

$$\|u\|_{NLTV} = \int_\Omega |\nabla_w u(x)|\, dx = \int_\Omega \sqrt{\int_\Omega (u(y) - u(x))^2\, w(x,y)\, dy}\; dx. \qquad (27)$$

The main idea of nonlocal regularization is to generalize the local gradient and divergence concepts to a nonlocal form, thereby generalizing the nonlocal means filter to the variational framework.
The nonlocal means filter and the nonlocal regularization functionals can reduce noise efficiently while preserving the textures and contrast of the image. Generally, it is good to choose a reference image as close as possible to the original ideal image to calculate the weights. However, the original image structures are broken in the degraded image, so we cannot obtain precise weights between the pixels; the weights should therefore be calculated from a preprocessed image [37]. In our alternating minimization framework, we obtain the deblurred image in the first step and then denoise it in the second step. Since the nonlocal regularization functionals are robust to noise, and the structures of the deblurred image are close to those of the original ideal image, we can calculate the weights using the deblurred image as the reference image, and then apply a nonlocal denoising step to obtain the restored image.
Adaptive nonlocal parameter selection. Within the alternating Bregman iterative method, we can use $g^{k+1}$ as the reference image to calculate the weights of the nonlocal regularization functionals, and then use the weights to denoise the deblurred image $g^{k+1}$ at every iteration. Note that the nonlocal filter parameter $h$ is related to the standard deviation of the noise; however, we do not know the exact noise level of the image $g^{k+1}$. Moreover, when a single filter parameter $h$ is used for the whole image, some regions of the restored image will be oversmoothed or undersmoothed, because a single filter parameter $h$ is not optimal for all the patches in the image. Following the nonlocal TV norm defined in (27), we calculate the filter parameter $h$ adaptively using local information, obtaining a local $h$ for every pixel in the image.
Inspired by the local regularization in [21], we define the local power as

$$P_r(x,y) = \sum_{(x',y') \in V} V_{x,y}(x',y')\, \big(g(x',y') - I_r(x',y')\big)^2,$$

where $V_{x,y}(x',y') = V(|x - x'|, |y - y'|)$ is a normalized smoothing window (here we use a Gaussian window), $I_r$ is the expected image, $g$ is the image being denoised, and $V$ is the region centered at $(x,y)$ over which the local power is computed. We then use the local power to calculate the local $h$ as

$$h(x,y) = a \sqrt{P_r(x,y)}.$$

The advantage of localizing the filter parameter $h$ is that it controls the denoising process over image regions according to their content: smooth regions have average weights $w$ between their neighbors, while texture and edge regions have large $w$ only when the patches are similar. Besides, we do not need to know or estimate the noise level. In this paper, we use a preprocessed oversmoothed image $u_s$ as the expected image, instead of the mean of the patch, to get more accurate results. The oversmoothed image is obtained with a standard TV model using a large regularization parameter.
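A sketch of this adaptive selection, assuming the smoothing window is applied by periodic FFT convolution (our implementation choice): the squared residual between the current image and the expected oversmoothed image is smoothed by a normalized Gaussian window, and the local $h$ is $a$ times its square root.

```python
import numpy as np

def gaussian_window(shape, sigma):
    """Normalized Gaussian smoothing window V, centered in the array."""
    h, w = shape
    yy, xx = np.meshgrid(np.arange(h) - h // 2, np.arange(w) - w // 2,
                         indexing="ij")
    V = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return V / V.sum()

def adaptive_h(g, expected, a=2.0, sigma=5.0):
    """h(x, y) = a * sqrt(P_r(x, y)), where P_r is the squared residual
    between g and the expected (oversmoothed) image, smoothed by V."""
    V = gaussian_window(g.shape, sigma)
    residual2 = (g - expected)**2
    P = np.real(np.fft.ifft2(np.fft.fft2(residual2) *
                             np.fft.fft2(np.fft.ifftshift(V))))
    return a * np.sqrt(np.maximum(P, 0.0))   # clamp tiny FFT negatives
```

Regions where the image agrees with the oversmoothed expectation get a small $h$ (little smoothing), while regions with large residual power get a larger $h$.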
By applying the above adaptive nonlocal regularization, Algorithm 1 can be reformulated as the following algorithm, where $W$ is the function that calculates the weights between points and $u_s$ is the preprocessed oversmoothed image: Algorithm 2: Adaptive nonlocal alternating Bregman iterative method for image restoration.

Experiments
In this section, we present some experimental results of the proposed alternating Bregman method and the adaptive nonlocal alternating Bregman method, and compare them with the operator splitting TV regularization [10], the NLTV-based BOS algorithm, the FTVd algorithm [38], FAST-TV [30] and the ForWaRD algorithm [39]. ForWaRD is a hybrid Fourier-wavelet regularized deconvolution algorithm that performs noise regularization via scalar shrinkage in both the Fourier and wavelet domains.
We use the conjugate gradient method to solve the first subproblem in Algorithms 1 and 2, Chambolle's projection algorithm to solve the second subproblem in Algorithm 1, and the nonlocal version of Chambolle's projection algorithm in Algorithm 2. In Algorithms 1 and 2 we set $\lambda_1 = 0.02$ and $\lambda_2 = 0.01$ based on experimental results; the number of inner iterations $n$ can be set to 1 or 2, and the stopping condition is $\|u^k - u^{k-1}\| / \|u^k\| < 10^{-3}$. There is a substantial body of work on determining the regularization parameters [40,41], but it is beyond the scope of this paper and we leave it for future work. In Algorithm 2, we set the patch size to $5 \times 5$, the search window to $21 \times 21$, the Gaussian variance parameter to $\sigma = 5$ for computing the local variance, and the nonlocal parameter factor to $a = 2$. For the operator splitting method, the regularization parameter is set to 0.1. For the NLTV-based BOS method, the regularization parameter is set to $\lambda = 0.2$, the search window to $21 \times 21$, the patch size to $5 \times 5$ and the nonlocal filter parameter to $h = 30$. For the FTVd algorithm, we set $\mu = 1000$. For FAST-TV, we set $a_1$ to 0.003 or 0.005 according to the degradation of the image, choosing the better value for each case. For the ForWaRD algorithm, we set the threshold to $3\sigma$, where $\sigma$ is the standard deviation of the noise, and the regularization parameter to 1.
First, we compare the convergence speed of the proposed Algorithm 1 with the preconditioned BOS algorithm, the FTVd algorithm and the FAST-TV algorithm in Figure 1. We find that Algorithm 1 converges faster than the other three methods initially, remains much faster than the preconditioned BOS algorithm and the FTVd algorithm later, and is close to or slightly slower than FAST-TV at the end of the iterations. Usually, the stopping condition on the relative difference is set to $10^{-3}$ or $10^{-4}$; thus the proposed Algorithm 1 can reach the stopping condition in fewer iterations than the other algorithms. In terms of computation time, the FTVd algorithm is the fastest owing to its strategies and code optimization, and the proposed Algorithm 1 is faster than the operator splitting algorithm and the FAST-TV algorithm. As for the nonlocal methods, convergence cannot be guaranteed after some iterations, so we compare the computation time of these methods. Owing to the computation of the nonlocal weights, the nonlocal algorithms cost more computing time than the local ones. The NLTV-based BOS algorithm stops after 25 steps in 180 seconds and the preconditioned NLTV-based BOS algorithm after 8 steps in 75 seconds, whereas the proposed Algorithm 2 stops after 5 steps in only 47 seconds.
Next, we show some image restoration results of these methods to illustrate the effectiveness of the proposed algorithms. We use the classical Cameraman image so as to be comparable with other image restoration works; the image can be found at http://www.imageprocessingplace.com/root_files_V3/image_databases.htm. Figure 2 and Figure 3 show the restoration results on the Cameraman image with two kinds of blur kernels. We can see from the results that the ForWaRD method obtains a good restoration when the image is only slightly blurred but performs poorly in the heavily blurred case; moreover, it cannot restore edges clearly. The restoration results of the operator splitting TV method show artificial stripes which affect the visual appearance of the restored images. The FTVd method and FAST-TV effectively remove noise from the degenerated images and achieve higher PSNRs than the ForWaRD method and the operator splitting TV method; however, many details are also smoothed away. The results of the proposed Algorithm 1 have a good visual appearance, clear edges and preserved image contrast. The NLTV-based BOS method (the preconditioned BOS gives almost the same result) and the proposed Algorithm 2 achieve better restorations than their local counterparts, and the proposed Algorithm 2 restores more details and attains a higher PSNR.
Table 1 and Table 2 show the restoration results on 5 different images under 2 different degradation settings. We can see that the PSNR and SSIM values of the proposed algorithms are generally higher than those of the compared methods.

Conclusions
In this paper, we propose a Bregman iteration based total variation image restoration algorithm. We split the restoration problem into a three-step iterative process whose steps are all easy to solve. In addition, we propose a nonlocal regularization under the framework of the proposed algorithm using a point-wise local filter parameter, together with a method to determine the filter parameter adaptively. Experiments show that the algorithm converges fast and that the adaptive nonlocal regularization method obtains better restoration results. In the future, we will study the weight-updating problem theoretically and apply the proposed algorithms to other regularization problems such as compressed sensing.