The Augmented Lagrange Multipliers Method for Matrix Completion from Corrupted Samplings with Application to Mixed Gaussian-Impulse Noise Removal

This paper studies the problem of the restoration of images corrupted by mixed Gaussian-impulse noise. In recent years, low-rank matrix reconstruction has become a research hotspot in many scientific and engineering domains such as machine learning, image processing, computer vision and bioinformatics, which mainly involves the problem of matrix completion and robust principal component analysis, namely recovering a low-rank matrix from an incomplete but accurate sampling subset of its entries and from an observed data matrix with an unknown fraction of its entries being arbitrarily corrupted, respectively. Inspired by these ideas, we consider the problem of recovering a low-rank matrix from an incomplete sampling subset of its entries with an unknown fraction of the samplings contaminated by arbitrary errors, which is defined as the problem of matrix completion from corrupted samplings and modeled as a convex optimization problem that minimizes a combination of the nuclear norm and the -norm in this paper. Meanwhile, we put forward a novel and effective algorithm called augmented Lagrange multipliers to exactly solve the problem. For mixed Gaussian-impulse noise removal, we regard it as the problem of matrix completion from corrupted samplings, and restore the noisy image following an impulse-detecting procedure. Compared with some existing methods for mixed noise removal, the recovery quality performance of our method is dominant if images possess low-rank features such as geometrically regular textures and similar structured contents; especially when the density of impulse noise is relatively high and the variance of Gaussian noise is small, our method can outperform the traditional methods significantly not only in the simultaneous removal of Gaussian noise and impulse noise, and the restoration ability for a low-rank image matrix, but also in the preservation of textures and details in the image.


Introduction
Image denoising is highly demanded in the field of image processing, since noise is usually inevitable during the process of image acquisition and transmission, which significantly degrades the visual quality and makes subsequent high-level image analysis and understanding very difficult. There exist two types of most common and extensively studied noise: additive Gaussian noise and impulse noise. Additive Gaussian noise is usually generated during image acquisition and characterized by adding each image pixel a random value from the Gaussian distribution with zero mean and standard deviation s, while impulse noise is very different in nature from Gaussian noise. Impulse noise can often be introduced in both image acquisition and transmission process by malfunctioning pixels in camera sensors, faulty memory locations in hardware, or transmission in a noisy channel, which mainly includes two models, namely salt-pepper noise and random-valued impulse noise [1]. For the former model, each gray value is replaced with a given probability p by the extreme value g min or g max , leaving remaining pixels unchanged, where g min ,g max ½ denotes the dynamic range of an image and p determines the level of the salt-pepper noise; as for the latter case, the intensity values of contaminated pixels are taken place by random values identically and uniformly distributed in g min ,g max ½ . In this paper, we will focus on the first model of impulse noise. A fundamental target of image denoising is to effectively remove noises from an image while preserving image details and textures.
For Gaussian noise removal, the non-local means (NL-means) proposed by Buades A. et al. [2] is quite an efficient method in suppressing Gaussian noise while keeping details and structures in the image intact, which better exploits the redundancy of natural images and calculates the weighted average of all the pixels in the image based on similarity of neighborhoods, to achieve a satisfying filtering result. Inspired by nonlocal concept, many more nonlocal methods such as BM3D [3] and K-SVD [4] have been introduced and get state-of-the-art denoising performance for Gaussian noise. In addition, some filters based on wavelet transform [5] and partial differential equations (PDE) [6,7] including nonlinear total variation (TV) are also powerful tools for it. The traditional approaches for impulse noise removal operate locally and nonlinearly on images. Among them, let us mention the classical median filter and its variants like adaptive median filter (AMF), progressive switching median filter (PSMF), and direction weighting median filter (DWMF). This kind of nonlinear filters can remove impulse noise effectively but cannot preserve image details well. In order to better preserve edge structures in images, variational approaches have been developed. In [8], a data-fidelity term of l 1 -norm was first introduced to achieve a significant improvement to remove outliers like impulse noise. Some authors [9,10,11] proposed two-phase schemes to better preserve image details: the main idea is to detect the location of noisy pixels corrupted by impulse noise using median-like filters, followed by some variational methods to estimate the gray values for those contaminated pixels.
However, most existing image denoising methods can only deal with a single type of noise, which violates the fact that most of the image noise we encounter in real world can normally be represented by a mixture of Gaussian and impulse noise. Effectively eliminating mixed noise is difficult due to the distinct characteristics of both types of degradation processes. It should be noted that the two-phase approaches [12][13][14][15][16] also show exuberant vitality for mixed noise removal, namely detecting or estimating the outliers before proceeding with the restoration phase. Garnett et al. [12] introduced a universal noise removal algorithm that first detects the impulse corrupted pixels, estimates its local statistics and incorporates them into the bilateral filter, resulting in the trilateral filter. In [13] somewhat similar ideas were used to establish a similarity principle which in turn supplies a simple mathematical justification for the nonlocal means filter in removing Gaussian noises. This is also the case of [15] where outliers are first detected by a median-like filter and then a K-SVD dictionary learning is performed on impulse free pixels to solve a l 1 {l 0 minimization problem, and the case of [16] where a median-type filter is used to remove impulse noise first and then NL-means based method is applied to remove the remaining Gaussian noise. Liu et al. [18] proposed a general weighted l 2 {l 0 norms energy minimization model to remove mixed noise, which was built upon maximum likelihood estimation framework and sparse representations over a trained dictionary. While in [19], weighted encoding with sparse nonlocal regularization was put forward for mixed noise removal, where there was not an explicit step of impulse pixel detection, and soft impulse pixel detection via weighted encoding was used to deal with impulse and Gaussian noise simultaneously instead. See [17,34,35] for more approaches to mixed noise removal. Although these denoising methods above were proposed specially for mixed Gaussian-impulse removal and indeed can alleviate the impact on image visual effect brought by mixed noises to some extent, most of them will erase details, and cannot better preserve regularly geometrical textures and fine structures.
In recent years, low-rank matrix reconstruction has become a research hotspot in many scientific and engineering domains, with myriad applications ranging from web search to machine learning to computer vision and image analysis, which mainly involves the problem of matrix completion (MC) [20,21,26,27] and robust principal component analysis (RPCA) [22][23][24][25]28], namely recovering a low-rank matrix from an incomplete but accurate sampling subset of its entries and from an observed data matrix with an unknown fraction of its entries being arbitrarily corrupted, respectively. Luckily, there usually exist many regularly geometrical textures and fine structures in both natural image and Remote Sensing (RS) image, due to the self-similarity and redundancy of image, which makes the grayscale matrix of the image possess low-rank features. In light of this, we propose a novel denoising framework for better preserving details and lowrank features in images while removing mixed Gaussian-impulse noise, based on the theory of low-rank matrix reconstruction.
In this paper, we consider the problem of recovering a low-rank matrix from an incomplete sampling subset of its entries with an unknown fraction of the samplings contaminated by arbitrary errors, which is defined as the problem of matrix completion from corrupted samplings (MCCS) and modeled as a convex optimization problem that minimizes a combination of the nuclear norm and the l 1 -norm. To exactly solve the problem, we introduce an effective algorithm called augmented Lagrange multipliers (ALM). By detecting impulse noises in a noisy image and treating the impulse free entries of the image matrix as available samplings first, and then regarding the Gaussian noises underlying the samplings as arbitrary errors, we can exploit the proposed algorithm to remove mixed noises and to recover the image matrix with low-rank or approximately low-rank features. To sum up, the main contributions of the paper include modeling the problem of low-rank matrix recovery from incomplete and corrupted samplings of its entries, solving the convex optimization problem via the proposed ALM algorithm and creatively applying it to mixed Gaussian-impulse noise removal.
The rest of the paper is organized as follows. We start in Section 2 by giving some introductions about the theory of low-rank matrix recovery including the problem of MC and Robust PCA. We put forward the ALM algorithm for the problem of MCCS and describe some implementation details and parameter settings for our methods in Section 3, in which we also analyze its powerful performance for exact recovery of low-rank matrices with erasures and errors. In Section 4, we will present in details our full denoising schemes for the impulse detector and mixed Gaussianimpulse noise removal. Experiments and comparisons with recent approaches are demonstrated in Section 5. Finally, we conclude this paper in Section 6.

Low-Rank Matrix Reconstruction
Low-rank matrix plays a central role in large-scale data analysis and dimensionality reduction, since low-rank structure is usually used either to approximate a general matrix, or to recover corrupted or missing data [25]. From the mathematical viewpoint, these practical problems come down to the theory of low-rank matrix reconstruction, which mainly involves the problem of MC and Robust PCA at present.

Matrix completion
To describe the problem of MC, suppose to simplify that the unknown matrix M[R n|n is square, and that one has m sampled entries M ij : where V is a random subset of cardinality m. The recovery of complete matrix M from these incomplete samplings of its entries is the MC problem.
The issue is of course that this problem is extraordinarily illposed because, with fewer samples than entries, there are infinitely many completions. Therefore, it is apparently impossible to identify which of these candidate solutions is indeed the ''correct'' one without some additional information. In many instances, however, the matrix we wish to recover has low rank or approximately low rank, which may change the property of the problem, and make the search for solutions feasible since the lowest-rank solution now tends to be the right one.
Candes and Recht [20] showed that matrix completion is not as ill-posed as once thought. Indeed, they proved that most low-rank matrices can be recovered exactly from most sets of sampled entries even though these sets have surprisingly small cardinality, and more importantly, they proved that this can be done by solving a simple convex optimization problem provided that the number of samples obeys for some positive numerical constant C, where r is the rank of matrix M. In (1), the notation . k k Ã denotes the nuclear norm of a matrix, which is the sum of its singular values. The optimization problem (1) is convex and can be recast as a semidefinite program. In some sense, this is the tightest convex relaxation of the NP-hard rank minimization problem min rank(X ) Another interpretation of Candes and Recht's result is that under suitable conditions, the rank minimization program (3) and the convex program (1) are formally equivalent in the sense that they have exactly the same unique solution. The state-of-the-art algorithms to solve the MC problem include the accelerated proximal gradient (APG) approach and the singular value thresholding (SVT) method [21].

Robust principal component analysis
Principal component analysis (PCA), as a popular tool for highdimensional data processing, analysis, compression and visualization, has wide applications in scientific and engineering fields. It assumes that the given high-dimensional data lie near a much lower-dimensional linear subspace. To large extent, the goal of PCA is to efficiently and accurately estimate this low-dimensional subspace.
Suppose that the given data are arranged as the columns of a large matrixD[R m|n . The mathematical model for estimating the low-dimensional subspace is to find a low-rank matrixA m|n , such that the discrepancy between A and D is minimized, leading to the following constrained optimization: where r% min (m,n) is the target dimension of the subspace and . k k F is the Frobenius norm, which corresponds to assuming that the data are corrupted by i.i.d. Gaussian noise. This problem can be conveniently solved by first computing the Singular Value Decomposition (SVD) of D, and then projecting the columns of D onto the subspace spanned by the r principal left singular vectors of D.
As PCA gives the optimal estimate when the corruption is caused by additive i.i.d. Gaussian noise, it works well in practice as long as the magnitude of noise is small. However, it breaks down under large corruption, even if that corruption only affects very few of the observations. Therefore, it is necessary to study whether a low-rank matrix A can still be efficiently and accurately recovered from a corrupted data matrix D~AzE, where some entries of the additive error matrix E may be arbitrarily large.
Recently, Wright et al. [23] have shown that under rather broad conditions the answer is affirmative: provided the error matrix E is sufficiently sparse, one can exactly recover the low-rank matrix A from D~AzE by solving the following convex optimization problem: where D.D 1 denotes the sum of the absolute values of matrix entries, and l is a positive weighting parameter. Due to the ability to exactly recover underlying low-rank structure in the data, even in the presence of large errors or outliers, this optimization is referred to as Robust PCA. So far, RPCA has been successfully introduced into several applications such as background modeling and removing shadows and specularities from face images. Ganesh et al. [31] proposed two new algorithms for solving the problem (5), which are in some sense complementary to each other. The first one is an accelerated proximal gradient algorithm applied to the primal, which is a direct application of the FISTA framework introduced by [30], coupled with a fast continuation technique; the other one is a gradient-ascent approach applied to the dual of the problem (5). Recently, the augmented Lagrange multiplier method was introduced by Lin et al. for exact recovery of corrupted low-rank matrices, which shows rather promising performance when dealing with the problem (5).

MCCS and its optimization model
Since the MC problem is closely connected to the RPCA problem, we may formulate the MC problem in the same way as RPCA where p V : R m|n ?R m|n is a linear operator that keeps the entries in V unchanged and sets those outside V (i.e., in V V) zeros. As E will compensate for the unknown entries of D, the unknown entries of D are simply set as zeros. Then the partial augmented Lagrangian function of (6) is where m is a positive scalar. For updating E, the constraint p V (E)~0 should be enforced when minimizing L(A,E,Y ,m). The problem of restoring matrix from corrupted entries (i.e., RPCA) is less studied than the simpler MC problem when the available entries are not corrupted. However, in many practical applications, there usually exists the case when missing entries and corrupted entries simultaneously concur in an observed data matrix, which urgently drives researchers to pay more and more attention to the problem of recovering low-rank matrix from its observed data matrix with erasures and errors.
For observed matrix D~p V (A Ã zE Ã ) where the ordered pair(A Ã ,E Ã ) denotes the true solutions to low-rank matrix and sparse error matrix respectively, suppose that one can only know the entries of D on subset V (the set of indices of known entries), and that one aims to recover low-rank matrix A Ã from the observed matrix D with erasures and errors. In this paper, we define the problem as matrix completion from corrupted samplings or robust matrix completion, which is the problem of restoring low-rank matrix from incomplete and corrupted samplings.
Similarly, as the entries of matrix E on set V V will compensate for the unknown entries of D and the entries of E on set V are sparse, and E V denotes sparse error components for the known entries. Now we can formulate the problem of MCCS as follows: To recover the low-rank matrix for the MCCS problem (8), we propose a novel and effective algorithm based on the ALM method, whose objective function is as follows: In (9), as E V and E V V are complementary to each other, we can ignore one when solving another one, and finally combine the two to obtain E. Unlike the technique presented in [29], we separate E into two parts in the objective function, which leads to the splitting of Y ,D,A afterwards, and aims to achieve subsection optimization.

The proposed ALM algorithm and parameter settings
The ALM algorithm for the Robust MC problem proposed in the paper is described in Algorithm 1. It is apparent that computing the full SVD for the Robust MC problem is unnecessary, so we only need those singular values that are larger than a particular thresholding and their corresponding singular vectors. Firstly, we should predict the dimension of principal singular space, and in the paper, we exploit the following prediction rule: where d~min (m,n), sv k is the predicted dimension and svp k denotes the number of singular values in the sv k singular values that are larger than m {1 k , and sv 0~5 . In the paper, we choose l~1=sqrt(p Ã max (m,n)),p~DVD=(mn), and set m 0~0 :5= D k k 2 . The following conditions are chosen as the stopping criterion: For convenience, we introduce the following soft-thresholding (shrinkage) operator: where x[R and ew0. This operator can be extended to vectors and matrices by applying it element-wise. In Algorithm 1, k k ? is the maximum absolute value of the matrix entries, and .
k k 2 denotes the maximum singular value of a matrix.

Algorithm 1(MC from Corrupted Samplings via the ALM Method)
Input : Observation matrix D [ R m|n , V, l: Updatem k tom kz1~m in (rm k , m m); k/kz1:

End while
Output : (A k ,E k ):

Performance analysis of the proposed algorithm
In this subsection, we first demonstrate the recovery accuracy of the proposed ALM algorithm by simulation experiments; and then for clarifying how the erasure rate (the fraction of the unknown entries in observed matrixD, i.e.,1{d(V),whered(V)~DVD=(m Ã n)) and the error probability (the fraction of randomly corrupted entries in the known entries available for restoration, i.e., E V k k 0 =DVD,where . k k 0 denotes the sparsity of a matrix) affect the performance of our method, we carry out the phase transition analysis for the proposed algorithm. We generate the rank-r matrixA Ã as a product LR T , where L and R are independent m|r matrices whose elements are i.i.d. Gaussian random variables with zero mean and unit variance, and set a fraction of entries chosen randomly in A Ã as zeros; and then, treat a fraction of non-zero entries also chosen randomly in A Ã as corrupted entries after adding arbitrarily large errors E Ã to them; finally, we get the observed data matrix D m|m with erasures valued zeros.
We exploit different data matrices and various combinations (r V ,r s ) to do the experiments, where r V~d (V) and r s stands for the error probability, and the reconstruction results are presented in Table 1. We observe that the proposed algorithm can accurately recover the low-rank matrices and sparse errors even when erasure rate reaches 30% and 20% of the available entries are corrupted.
As for the experiments of phase transition analysis, we fix m~512, set r~8,16,64 respectively, and vary r V and r s between 0 and 1. For each r and each (r V ,r s ) pair, we generate 5 pairs (A Ã ,E Ã ) to obtain 5 observed matrices as described above. We deem the recovery successful if the recoveredÂ A satisfieŝ A Ã k k F v0:01 for all the 5 observed data matrices.
The curves colored red, green and blue in Figure 1 define ''phase transition'' bounds for the case of r~8,16,64, respectively. In Figure 1, the horizontal coordinate indicates r V , while the vertical coordinate denotes r s . At the points of these curves, all 5 trials were accomplished with success, whereas for the points above the curves, at least one attempt failed. It is assumed that the regions under the curves are ''regions of success''. Our experiments show that for fixed m, the smaller rank r is, the larger area the region of success will have. In fact, when other factors keep unchanged, the recovery performance of the algorithm is inversely proportional to rank(A Ã )=m. In addition, we can also observe that erasure rate and error probability keep the relationship of interacting.

Our Denoising Scheme for Mixed Gaussian-Impulse Noise
Image denoising is a research hotspot in the area of image processing all the way. However, in real world, images are typically contaminated by more than one type of noise during image acquisition and transmission process. As a matter of fact, we often encounter the case where an image is corrupted by both Gaussian and impulse noise. Such mixed noise could occur when an image that has already been stained by Gaussian noise in the procedure of image acquisition with faulty equipment suffers impulsive corruption during its transmission over noisy channels successively.
As we know, there usually exist many regularly geometrical textures and similar structures in both natural image and Remote Sensing image, due to the self-similarity and redundancy of image, which makes the grayscale matrix of the image possess low-rank features. Matrix completion can accurately reconstruct low-rank information by exploiting reliable pixels after detecting the isolated noises, which is quite suitable for impulse noise (especially for saltpepper noise) removal. Robust PCA can recover low-rank matrix from observed matrix with sparse errors, which may deal with the problem of Gaussian noise removal to some extent, since for Gaussian noise, corruptions with large magnitude are rarely distributed while most of the corruptions are located near zero. Luckily, the proposed ALM algorithm for the Robust MC problem, also supplies a powerful technique for removing mixed Gaussian-impulse noise from images with low-rank features.
In this paper, we adopt the two-phase scheme for mixed noise removal: By detecting impulse noises in a noisy image and treating the impulse free entries of the image matrix as available samplings first, and then regarding the Gaussian noises underlying the samplings as arbitrary errors, we can exploit the proposed algorithm to remove mixed noises and to recover the grayscale matrix of the image with low-rank or approximately low-rank features. It should be noted that, we cannot simply use MC for Table 1. Recovery results of low-rank matrices with different (r V ,r s ) using the proposed ALM.  removing impulse noises after detecting them, since the remaining Gaussian noises no longer make the image possess low-rank features; in addition, neither can we directly employ Robust PCA to restore the noisy image at first, because impulse noises destroy the sparsity of large errors, which may greatly decrease the accuracy of the restoration. The success of two-phase approaches for noise removal relies on the accurate detection of impulse noise. Many impulse noise detectors are proposed in the literatures, e.g. AMF is used to detect salt-pepper noise, while adaptive center-weighted median filter (ACWMF) and rank-ordered logarithmic difference (ROLD) [32] are utilized to detect random-valued impulse noise. In this paper, study on detectors for random-valued impulse noise is beyond the scope of the topic.
For salt-pepper noise, we employ the following strategy to detect it after considering its features taken on in the image: (1) Grayscale thresholding. Specifically, set a fixed thresholding a, and regard those pixels whose gray-levels are distributed in the interval ½255{a,255 or ½0,a as candidate noises. (2) Median detecting. For current candidate noise, we consider the absolute difference between its gray-level and the median gray-level of its neighborhood pixels. If the absolute difference is larger than another given thresholding b, we then treat current candidate noise as true noise. Empirically, when the density of salt-pepper noise namely noise level is high, the window size of the neighborhood for median detecting should be large.

Experiments and Discussion
This section is devoted to the experimental analysis and discussion of the denoising scheme introduced in the previous section. We compare our ALM algorithm with some recently proposed approaches [12,15,16] for dealing with the mixture of impulse and Gaussian noise. Comparison results are presented both under the form of statistical index tables and that of visual effects. To evaluate the quality of the restoration results, peak signal to noise ratio (PSNR) and structural similarity (SSIM) are employed for objective evaluation and for subjective visual evaluation, respectively.
Given an image x[ 0,255 ½ m|n , the PSNR of the restored imagê x x is defined as follows: The larger the value of PSNR is, the better the quality of restoration will be. We can formulate the SSIM between original image and the recovered image as (see [33] for more details). The dynamic range of SSIM is ½0,1, which means a better recovery performance with the SSIM closer to the value 1.
As the proposed algorithm is suitable for restoring images with low-rank information, we first choose the classical Mondrian and Barbara image in the field of image processing to do the experiments. In all experiments, parameters of each method have been tuned to yield the best results. On image Mondrian, we     successively add Gaussian noise with s~15=255 and salt-pepper noise with p~0:1. The visual comparison of the four approaches is presented in Figure 2, from where we will see that our method can simultaneously remove Gaussian noise and salt-pepper noise in the restoration phase, and well reconstruct the original image with low-rank structures; above all, it can better preserve details and edges in the image. By contrast, the other three approaches either cannot thoroughly remove the mixed noises or will destroy fine structures. The values of PSNR and SSIM also demonstrate that the ALM-based approach outperforms other methods significantly not only in objective index, but also in subjective visual effect. In the following experiment, image Barbara abundant in geometrical textures, is corrupted by Gaussian noise with zero mean and different standard deviations s~5=255,15=255 first, and then we add salt-pepper noise with different levels p~0:1,0:2,0:3,0:4,0:5 on the image. Comparisons of statistical indices under different combinations of (p,s) are provided in Table 2, and the comparison of visual effect when p~0:3,s~5=255 is also shown in Figure 3. From Figure 3, we also observe that our method can preserve regularly geometrical textures perfectly while removing the mixed Gaussian-impulse noises from the image. However, other approaches will more or less blur the textured information consisting in the image.
Meanwhile, we can see that the differences of recovery performance between the proposed ALM algorithm and the other three methods will be more remarkable when the level of salt-pepper noise is relatively high while the variance of Gaussian noise is small.
To better demonstrate the excellent performance of our method, we do a further comparison experiment of the four approaches in removing mixed Gaussian-impulse noise from Remote Sensing image with low-rank features. As in the previous experiment, we add Gaussian and salt-pepper noise with different values of p and s on RS image. The experimental results are presented in Figure 4 and Table 3. Again, it shows that our method can better remove the mixed noise and restore the image especially when the level of impulse noise is relatively high while Gaussian noise is small.
The above experiments mainly involve images with significant low-rank features such as similar structures and regular textures, which violates the fact that most of images in the real world cannot possess globally low-rank features. However, most local parts in the image will meet low-rank or approximately low-rank condition, due to the self-similarity and spatial correlation of images. Consequently, we can utilize the proposed algorithm for image de-noising via block processing-based technique. The following experiments demonstrate the results of removing mixed Gaussian-impulse noise from Lena and Boat images, compared to trilateral filter method. Both of the noisy images are of size 512|512 pixels, either divided into many image blocks of size 16|16 pixels. The de-noising results for Lena and Boat are presented in Figure 5 and Figure 6 respectively, which also show the better performance of our method over trilateral filter. We can see that the details of hairs are not preserved well in Figure 5c, while in Figure 6c, some structures of the masts on the boats are lost.

Conclusion
In this paper, regarding the problem of low-rank matrix recovery from incomplete and corrupted samplings of its entries namely the Robust MC problem, we construct a mathematical model based on convex optimization, and put forward a novel and effective ALM algorithm, to solve this kind of optimization problem which can be considered as the extension of the Robust PCA and the MC problem. Experiments on performance analysis and mixed Gaussian-impulse noise removal demonstrate the reliability and practicability of the proposed algorithm, which also show that our method can well preserve details and textures and keep the consistency of structures, while simultaneously removing mixed noises from images with low-rank features. By virtue of the novelty and powerful advantages, the approach will bring promising application value in the fields of data mining, image processing, machine learning, RS information processing and so forth.