Compressive Sensing via Nonlocal Smoothed Rank Function

Compressive sensing (CS) theory asserts that we can reconstruct signals and images with only a small number of samples or measurements. Recent works exploiting the nonlocal similarity have led to better results in various CS studies. To better exploit the nonlocal similarity, in this paper, we propose a non-convex smoothed rank function based model for CS image reconstruction. We also propose an efficient alternating minimization method to solve the proposed model, which reduces a difficult and coupled problem to two tractable subproblems. Experimental results have shown that the proposed method performs better than several existing state-of-the-art CS methods for image reconstruction.


Introduction
Compressive sensing (CS) [1,2] allows us to reconstruct high dimensional data with only a small number of samples or measurements, and captures only useful information and has the potential of significantly improving the energy efficiency of sensors in the real-world applications. The key idea behind CS is that the majority of real-world signals including images and videos, can be sparsely represented by given some appropriate basis. Due to the positive theoretical and experimental results, many CS-based imaging methods have been proposed and applied to the various areas such as magnetic resonance imaging [3,4] in medicine, compressed spectral and hyperspectral imaging [5,6] in industry, neural network [7] in biotechnology.
CS also makes it possible to well restore corrupted signals at a fast speed and the small memory cost. Conventional CS recovery uses ℓ 1 norm to characterize the sparsity of a signal, and the resulting convex optimization problems are tractable. Although there are several methods can be used to efficiently solve the ℓ 1 regularization based model for signal recovery [8][9][10], they only achieve suboptimal recovery performance due to their relaxation of the ℓ 0 norm based sparse optimization [11]. More recently, structured or group sparsity based methods [12][13][14] and nonlocal sparsity based methods [15,16] have provided better results for CS recovery. Intuitively, the structured or group sparsity can reduce the degrees of freedom in the solution, and the nonlocal sparsity explicitly exploits self-similarities of the signal, thereby obtaining more accurate recovery performance than the common sparsity.
In CS studies, a number of works have suggested that non-convex optimization based approach often yields better results than convex ones though costing higher computational complexity. Therefore, we propose a CS recovery model by considering a non-convex smoothed function to approximate the rank, denoted as smoothed rank function (SRF), as a low-rank regularization. The emphasis behind the proposed model is the utilization of nonlocal sparsity by image patch grouping. Concretely, we group a set of similar image patches to form a matrix X i for i-th exemplar image patch, which is extracted from the test image. Then the matrix X i is low-rank since similar patches have similar structure. To solve the low-rank optimization problem, we minimize the SRF function to approximate the rank minimization problem, and the resulting problem is a non-convex optimization problem. In order to avoid getting trapped in local solutions, we initialize a rough approximation of the rank, and gradually improve the approximation as the iteration proceeds.
The basic idea of using nonlocal sparsity for image patches has already been used in [16][17][18] with the very impressive results. There have been an abundant research literatures using the patch-based low-rank as plenty with nonlocal sparsity [19][20][21][22]. In [20,21], the low-rank problem is solved by minimizing the nuclear norm of the low-rank matrix, which leads to a convex minimization problem with many efficient methods available. In [22], a more accurate approximation for rank is proposed by exploiting the logdet function, which can be derived as a weighted nuclear norm. Specifically, compared with previous surrogates for the rank such as nuclear norm and weighted nuclear norm, our surrogate is differentiable and can approximate the rank adaptively.
The outline of this paper is organized as follows. Firstly, we briefly review the background of CS. Secondly, the SRF function is introduced and the model based on SRF for CS image reconstruction is proposed. Thirdly, the optimization algorithm is presented to solve the proposed model. To demonstrate the effectiveness of the proposed method, we show some numerical results on several test images in the following section. In the end, the conclusion and future work are given.

Background
The CS recovery problem aims to find the sparsest solution x 2 R N from the underdetermined linear system y = F x, where y 2 R M is the measurements and F 2 R MÂN , M < N is the measurement matrix. It can be formulated as follows: where ||Á|| 0 is the ℓ 0 norm counting the number of nonzero elements of x. In practical applications, such as signal reconstruction problem, the measurement noise is unavoidable. Then the noisy CS recovery problem is formulated as where ||Á|| 2 is the ℓ 2 norm and is the residual error. However, since problem (2) is NP-hard, it is infeasible to solve it directly. Thus it is proposed to replace the ℓ 0 norm by the convex ℓ 1 norm, namely min x jjxjj 1 s:t: jjy À Fxjj 2 ; ð3Þ problem (2) and easy to solve by some methods including iterative shrinkage algorithm [8], alternating direction method of multipliers (ADMM) [23] and Bregman split algorithm [24]. Although problems (2) and (3) are fundamentally different, they return the same solution in many interesting situations [25]. By using an appropriate regularization parameter λ, we can equivalently rewrite the problem (3) as follows: By replacing the ℓ 1 norm by the non-convex ℓ p (0 > p > 1) norm, in [26], the non-convex optimization problem based on ℓ p norm can achieve the more exact CS reconstruction result than the convex one based on the ℓ 1 norm. Moreover, it has been shown in [18,27] that the nonlocal sparsity exploiting the self-similarity of the natural image leads to the state-of-the-art performance. The nonlocal sparsity is a significant prior for CS recovery. In this work, we will propose a non-convex CS image recovery approach, which exploits the nonlocal sparsity.

Nonlocal smoothed rank function
In this section, we present a procedure of patch grouping that uses image self-similarity and leads to the low-rank problem. Hence one underlying assumption is the image exhibits abundant self-similarities. For a test image, we select some exemplar patches of size ffiffiffi n p Â ffiffiffi n p , reordered into the column vector lexicographically and denoted as b x i 2 R n . For each exemplar patch, the underlying assumption makes it possible to find a number of similar patches. Specially, we employ a variant of k-nearest-neighbor search in a local window for each exemplar patch b x i to find its similar patches as follows: where c is a pre-defined threshold and G i indexes the positions of corresponding similar patches. After patch grouping, we aline the similar patches as column vectors to form a matrix Under the above assumption, each matrix X i has the low-rank property since the similar patches have the similar structures. In this way, the nonlocal sparsity leads to a rank minimization problem in our image reconstruction approach. Since the resulting matrix X i also contains the noise during patch grouping, in order to obtain a clean and clear image to well match the ground truth, let X i = L i +W i , where L i denotes the low-rank matrix and W i represents the noise matrix. We can obtain the low-rank matrix L i by solving the following problem: where ||Á|| F is the Frobenius norm and is the residual error. Unfortunately, problem (6) is NPhard. A popular heuristic method is to adopt the nuclear norm (sum of the singular values) to replace the rank, i.e., where ||Á||Ã is the nuclear norm. The problem (7) is a convex surrogate of the rank and can be solved efficiently by the singular value thresholding (SVT) algorithm [28]. However, it is in practice suboptimal due to equally treating each singular value. In [19], Gu et al. have demonstrated that non-convex low-rank approximations adaptively treating the singular values at different scales yield better results than those convex ones.
In [29,30], they have been shown that for a low-rank matrix X 2 R nÂm , the rank can be approximated by the following SRF function: where σ j (X) is the j-th singular value of X, ℓ = min{n, m} and δ is an adjustable parameter.
Although the problem of minimizing G δ (Á) function using a small δ will lead to many local minima, the solution of minimizing G δ (Á) function converges to the minimum rank solution as δ goes to zero [30] as the red curve shown in Fig 1. In order to avoid trapping in local solutions, we initialize a large δ and gradually decrease δ to improve the degree of approximation for the rank. With the decreasing δ, the rank can be better approximated by SRF Eq (8). This technique refines the minimizer of the SRF minimization problem by considering the minimizer of the previous iteration (large δ) as the new initial point of current iteration (small δ), which makes the solution of the SRF minimization problem get closer to the minimum rank solution during the iterations. Fig 1 intuitively illustrates the comparison of SRF, the rank, the nuclear norm and logdet function in the case of a scalar. We observe that SRF can better approximate the rank than the nuclear norm and logdet function. Therefore, we speculate that the problem of minimizing SRF toward rank minimization problem could achieve better performance. Then, we consider the SRF function as the low-rank regularization for CS image recovery.
For the low-rank matrix L i , the rank minimization problem can be replaced to which can be reformulated with an appropriate parameter λ, Obviously, the problem (10) is smoothed and differentiable, and we can adopt the gradient descent method to solve it. For each matrix X i , the same method can be used to obtain the corresponding low-rank matrix L i . For CS image reconstruction problem, based on above presented G δ (L i ), we propose the global model as follows: where η is a regularization parameter and is the matrix constituted by the set of similar patches for each exemplar patch x i . The model (11) exploits the nonlocal sparsity of the image patches and non-convexity of the SRF function G δ (Á). Therefore, we conjecture the proposed method can achieve the good performance. To solve the model (11), we develop an efficient alternative minimization method in the next section.

Optimization algorithm
It is difficult to directly solve the global model (11) since x and L i is coupled. We use the alternative minimization method to decouple x and L i as follows: • For x-subproblem, its optimization problem is In the following subsections we will present the optimization algorithm to solve subproblems Eqs (12) and (13) in detail.

Low-rank matrix optimization algorithm
Then the low-rank matrix L i can be obtained via the gradient descent method, i.e., L ðkþ1Þ [30] as follows: where matrixes U and V and singular values σ i (i = 1, Á Á Á, ℓ) come from the singular value decomposition (SVD) of L i , which is obtained in previous iteration, Then, it is easy to derive the gradient of the function f(L i ), According to the gradient descent method, at each iteration, one has In addition, the step size μ in formula (17) should be set in a decreasing order to return a better result. Following the papers [30,31], we set μ = δ 2 to decrease μ proportional to δ 2 . Then, above formula (17) incorporating the Eq (16) can be rewritten as where ξ = 1 − 2δ 2 .

Image reconstruction via alternating direction method of multipliers
After obtaining the low-rank matrix L i , we reconstruct the image x via solving the problem (13). It is clear that the problem (13) is a quadratic optimization problem with a closed-form solution, However, the inverse of the matrix ðF H F þ Z P i P T i P i Þ is very large and difficult to compute. Therefore, we consider to reconstruct the image x in the framework of ADMM, which also leads to the closed-form solutions for each subproblem. The ADMM method is often used to image restoration [32][33][34][35][36].
Based on the definition of ADMM, we present the augmented Lagrangian function of the problem (13) as follows: where z = x is the auxiliary variable, d is the Lagrangian multiplier, and β > 0 is a scalar. With respect to the variables x, z and d, they are decoupled in the framework of ADMM, thus, can be solved separately, leading to the following iterations: x ðkþ1Þ ¼ arg min x jjy À Fxjj Clearly, both subproblems Eqs (21) and (22) are quadratic optimization problem and have the closed-form solutions. For the subproblem Eq (21), its explicit solution is: For the subproblem Eq (22), according to its first-order derivation for x, we can derive the following equation: Considering measurement matrix F is a partial Fourier transform matrix, we can transform above problem from image space to Fourier space to efficiently obtain x. Concretely, we let F = DF, where D is the down-sampling matrix and F is the fourier transform matrix. It is easy to achieve and then x (k+1) can be drawn We simultaneously obtain the low-rank matrix L i and the image x by the alternative minimization method, and the overall procedure is summarized below as Algorithm 1. Empirically we have found that Algorithm 1 is convergent, but in theory the convergence analysis of Algorithm 1 is difficult to give due to the non-convex subproblem. Although there are some papers have proved the convergence of their non-convex optimization problem, these proofs hold in a few unrealistic and rigorous assumptions. Specially, to save computational complexity, we set J = K = 1 in Algorithm 1. Moreover, we estimate an image b x using the discrete cosine transform (DCT) method as the initial solution for a better initial point, which has been seen in [22]. As iteration increases, the high accuracy results will be achieved.
The complexity of Algorithm 1 is O([T s + log N]N) (N is the total number of image pixels and T s is the average complexity to compute similar patches per exemplar patch), which is mainly generated by the DCT method in 1 step of initialization and the Fourier transform in 4 (b) step of inner loop. The complexity of SVD in 3 (a) step of inner loop, i.e. O(n × m 2 ), can be ignored due to n, m ( N. Therefore, the proposed method is practical feasible and promising.

Numerical results
Here, we present the experimental results of our approach for CS image reconstruction based on the SRF regularization. We generate the CS measurements by random and pseudo-radial sampling of the Fourier transform coefficients of test images respectively. The number of measurements is M = rate Ã N, where rate is the sampling rate. We use peak signal to noise ratio (PSNR) and Structural SIMilarity (SSIM) index [37] as the quantitative measures in our numerical experiments, and the PSNR is defined as where MAX F is the maximum possible pixel value of the image and MSE is the mean squared error, defined as ½f ðjÞ À gðjÞ 2 ; where f and g are the original image and the restored image, respectively. In all experiments, the main parameters of the Algorithm 1 are set as follows: patch size ffiffiffi n p ¼ 6; the number of similar patches for each exemplar patch m = 45 (the more similar patches form a more low-rank matrix but lead to high computational complexity); initialize value d ð0Þ ¼ d, where d is a constant around two times of the largest singular value of initial L i [30]; the parameter c is set as 0.08 experimentally in formula δ (j) = cδ (j − 1) ; the outer loop iteration number S = 400 (that is selected based on the convergence rate of Algorithm 1). To achieve better performance, we select exemplar patch in each 5 pixels along both horizontal and vertical directions.
In addition, the total variation (TV) method [38], the BM3D based CS method (BM3D-CS) [16], the nuclear norm method (denoted as NLR-CS-baseline) [22] and the logdet function method (denoted as NLR-CS) [22] for CS image recovery are compared with the proposed SRF approach (called as SRF-CS). The TV method just considered the underlying sparsity of an image, and the BM3D-CS method used the nonlocal sparsity with an outstanding performance. The NLR-CS-baseline method exploited the nuclear norm to replace the rank for solving the rank minimization problem. The NLR-CS approach adopted the logdet function to achieve a more accurate surrogate than the nuclear norm for the rank. The source codes of above methods [16,22,38] are publicly downloaded from the authors's websites. These methods are significantly state-of-the-art CS algorithms for image recovery. We have carefully tuned their parameters to obtain the best results for fair comparison. Both noiseless and noisy experiments are presented to demonstrate the performance of the proposed approach for CS recovery. The test images (256 × 256) are exhibited in Fig 2, where Fig 2(a), 2(c), 2(e) and 2(f) can be publicly downloaded from the test image net http://decsai.ugr.es/cvg/dbimagenes/g256.php, Fig 2(b), 2(d) and 2(g) can be publicly downloaded from the author's homepage of [22] http://see. xidian.edu.cn/faculty/wsdong/NLR_Exps.htm.

Noiseless experiments for CS recovery
We reconstruct the images from less M = 0.05N and M = 0.1N measurements by randomly sampling the Fourier transform coefficients of the test images. Fig 3 shows the restored images by TV [38], NLR-CS-baseline [22], NLR-CS [22] and the proposed methods for CS image recovery. The TV method for CS recovery can't work well due to less measurements. The close-ups of Number image and Book image obtained by the TV method are indistinct and lose much structured information. The NLR-CS-baseline method gives rise to some artificial shadow like fog as shown in the close-up of Number image. It is well known that the NLR-CS method is a very competitive method for CS recovery due to its remarkable performance. For these test images in Fig 3, however, the performance of the NLR-CS method is undesirable. Obviously, our approach achieves the best visual quality and highest PSNR value among all test methods. Noteworthily, the better performance of our approach compared with the NLR-CS-baseline method verifies that the SRF function Eq (8) is a more accurate surrogate for the rank than the nuclear norm. Moreover, Table 1 displays the PSNR values and the SSIM values obtained by all test methods.
Although the proposed method leads to a non-convex optimization problem, we can see that the proposed method for CS image recovery converges in a few iterations in Fig 4. In the following subsection, we will verify that the proposed method is also robust in a noisy situation.

Noisy experiments for CS recovery
In the noisy experiments, we add Gaussian noise with 0.05 standard deviation to the original images. Firstly, we recover the Head image from M = 0.15N measurements and the Boat image from M = 0.2N measurements by randomly sampling their Fourier transform coefficients respectively. The restoration results by the TV [38], BM3D-CS [16], NLR-CS-baseline [22], NLR-CS [22] and the proposed methods are presented in Fig 5. Clearly, the restored Head image becomes blocky obtained by the TV method, and contains heavy noise obtained by the NLR-CS-baseline method. Meanwhile, some edges in the close-up of the restored Head image achieved by the NLR-CS method are unclear, and the close-up of Head image generated by the BM3D-CS method is blurring. However, compared with other methods, the reconstructed   Table 1. The PSNR (dB) and SSIM values of the restoration results achieved by the TV method [38], the NLR-CS-baseline method [22], the NLR-CS method [22] and the proposed method for noiseless images. Head image by the proposed approach returns the clearer edges and less noise. For the Boat image, we can see that the proposed approach also outperforms the other test methods with the better visual quality, the higher PSNR and SSIM values. Thus the proposed method with less measurements is robust in noisy situation. Secondly, we recover the Boy image from M = 0.24N (65 radial lines) measurements and the Barbara image from M = 0.29N (80 radial lines) measurements by pseudo-radial sampling their Fourier transform coefficients. The pseudo-radial sampling way generates streaking artifacts leading to more difficult image reconstruction than randomly sampling way. We illustrate the performance results of all the test methods in Fig 6. Visually, the close-ups of Plane image generated by the TV, NLR-CS-baseline, NLR-CS and BM3D-CS methods are blurring. Nevertheless, the close-up of Plane image obtained by our approach is good with less noise and higher PSNR value. The close-ups of Barbara image achieved by the NLR-CSbaseline method and the NLR-CS method create some artificial strips, which do not exist in the original Barbara image. The bad performance may be caused by the pseudo-radial sampling way or noise. The texture of restored Barbara image by BM3D method is lost. However, the proposed approach removes the artifacts and a lot of noise returning a better result, which demonstrates its robustness. In addition, Table 2 displays the PSNR and SSIM values obtained by all test methods. To sum up, the proposed approach outperforms other competing methods for CS image reconstruction by both randomly sampling and pseudo-radial sampling schemes.

Conclusion
To better exploit the nonlocal sparsity of similar patches and non-convexity of rank minimization, in this paper, we use the non-convex SRF function surrogating the rank as a low-rank regularization for CS image recovery. This SRF function can better approximate the rank than the standard nuclear norm and the logdet function. We propose an efficient algorithm in the framework of alternative minimization method, which divides this CS problem into the SRF minimization subproblem and the least square subproblem. With respect to the minimization subproblem of the SRF function, we adopt the gradient descent method to solve it since it is differentiable. Simultaneously, the clear image is reconstructed by solving the least square subproblem using the ADMM method. Both noiseless and noisy numerical experiments demonstrate that the proposed approach achieves the better performance and vision quality under the lower sampling rate situation. In the future, we would like to explore better surrogates for the rank to improve performance, and solve other practical problems.