
Robust Face Recognition via Multi-Scale Patch-Based Matrix Regression

  • Guangwei Gao ,

    csggao@gmail.com

    Affiliation Institute of Advanced Technology, Nanjing University of Posts and Telecommunications, Nanjing, 210023, China

  • Jian Yang,

    Affiliation School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, 210094, China

  • Xiaoyuan Jing,

    Affiliation School of Automation, Nanjing University of Posts and Telecommunications, Nanjing, 210023, China

  • Pu Huang,

    Affiliation School of Computer Science and Technology, Nanjing University of Posts and Telecommunications, Nanjing, 210023, China

  • Juliang Hua,

    Affiliation School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, 210094, China

  • Dong Yue

    Affiliation Institute of Advanced Technology, Nanjing University of Posts and Telecommunications, Nanjing, 210023, China

Abstract

In many real-world applications such as smart card solutions, law enforcement, surveillance and access control, the limited number of training samples per subject is the most fundamental problem. By making use of the low-rank structure of the reconstruction error image, the so-called nuclear norm-based matrix regression has been demonstrated to be effective for robust face recognition with contiguous occlusions. However, the recognition performance of nuclear norm-based matrix regression degrades greatly in the face of the small sample size problem. An alternative solution to this problem is to perform matrix regression on each patch and then integrate the outputs from all patches. However, it is difficult to set an optimal patch size across different databases. To fully utilize the complementary information from different patch scales for the final decision, we propose a multi-scale patch-based matrix regression scheme in which the ensemble of multi-scale outputs is achieved optimally. Extensive experiments on benchmark face databases validate the effectiveness and robustness of our method, which outperforms several state-of-the-art patch-based face recognition algorithms.

Introduction

Object classification is an active topic in the area of pattern recognition [1–8]. Owing to its non-intrusive nature and pronounced uniqueness, face recognition has been an active research topic and has been incorporated into many multimedia applications [9–14], such as surveillance, human-machine interaction, access control and photo album management in social networks. Recently, linear regression-based face recognition approaches have led to state-of-the-art performance [15–18], with representative examples being sparse representation-based classification (SRC) [15] and linear regression-based classification (LRC) [16]. In SRC, the query image is coded as a sparse linear combination of all the training images, and the classification is then made by checking which class yields the least reconstruction error. Many extensions of SRC have been developed for vision applications, e.g., super-resolution [19, 20], facial expression recognition [21] and human gait recognition [22, 23]. Alternatively, Naseem et al. [16] proposed LRC for face recognition. Based on the assumption that samples from a specific object class lie on a linear subspace, LRC represents a query image as a linear combination of the training images of each class. Yang et al. [24] provided an insight into SRC and sought reasonable explanations for its effectiveness. They viewed the L1-regularizer as having two properties, sparseness and closeness: sparseness selects a small number of nonzero representation coefficients, and closeness makes the nonzero coefficients concentrate on the training samples with the same class label as the test sample. Zhang et al. [18] discussed the working mechanism of SRC and demonstrated that it is collaborative representation rather than L1-norm sparseness that improves the classification performance. In their work, a collaborative representation-based classification (CRC) model was presented with a squared L2-regularization, which achieves competitive classification performance with significantly lower complexity than the sparse representation method.

It is worth noting that the majority of studies assume that the testing images are taken under well-controlled settings (e.g., reasonable illumination, poses and variations, without occlusion or disguise). Their performance degrades when the testing images are contaminated. By introducing an identity matrix I as a dictionary to code the outliers (e.g., corrupted or occluded pixels), SRC [15] exhibits excellent robustness and promising performance. However, SRC is not robust to contiguous occlusion such as sunglasses or scarves when the occlusion level exceeds the breakdown point of the algorithm. Yang et al. [25] modified the SRC framework for handling outliers such as occlusions in face recognition by modeling the sparse coding as a sparsity-constrained robust regression problem. He et al. [26] unified the algorithms for error correction and detection by using the additive and multiplicative forms, respectively, and established a half-quadratic framework to solve the robust sparse representation problem. From the viewpoint of dictionary learning, Yang et al. [27] constructed a feature pattern dictionary that captures structured information and prior knowledge of image features to represent the unknown feature pattern weight of a query image. Similarly, Ou et al. [28] simultaneously learned a clean dictionary and a noise dictionary, and applied the learned clean dictionary for classification. Observing the distribution of the reconstruction error image, Yang et al. [29–32] used the nuclear norm to characterize the structural information of an error image and proposed a nuclear norm-based matrix regression model that has achieved state-of-the-art performance for face recognition with occlusion and illumination changes.

In spite of the aforementioned tremendous achievements, the small sample size (SSS) problem still remains one of the most fundamental and challenging issues in the face recognition community. In many real-world applications such as smart card solutions, law enforcement, surveillance and access control, the available training samples per subject may be very limited [33]. Thus, the performance of these regression-based methods is greatly degraded because the query sample cannot be well represented by the few training samples. To tackle the SSS problem, many efforts have been made in the past few decades. Existing methods mainly fall into three categories. The first are patch-based methods, which generally consist of local patch representation, local feature extraction and the combination of classification results [34–36]. However, the patch size has a great impact on the output performance of patch-based methods [37, 38]. The second integrate local and global features for classification [39, 40], because the two provide complementary information for the final results. The third employ different feature extractors to extract multiple types of features and then utilize a decision-level fusion scheme for the final classification [41, 42]. We mainly focus on patch-based methods in the sequel.

To improve the recognition performance of matrix regression in the SSS problem while preserving its outstanding ability to deal with occlusion and illumination changes, in this paper we propose performing matrix regression on patches. The so-called patch-based matrix regression (PMR) classifies each query matrix patch and then integrates the recognition outputs of all patches for the final decision. Nevertheless, the patch size strongly affects the final performance of PMR, and the optimal patch size varies greatly across databases. If the patch size is too small, each patch carries little information and the method cannot capture the geometric structure of the image; if it is too large, only a few patches are available per image, which limits the information that can be used. To fully exploit the classification ability and appearance information of different patch sizes, we devise a multi-scale PMR (MSPMR) scheme that integrates the complementary information from different scales. MSPMR first performs PMR on each scale and then learns optimal scale weights to adaptively fuse the multi-scale outputs. To evaluate the performance of the proposed method, we use four databases that involve different recognition tasks: the Extended Yale B, AR and LFW datasets for face recognition without occlusion, the AR database for face recognition with real disguise, and the Extended Yale B dataset for face recognition with block occlusion. The experimental results demonstrate the effectiveness and robustness of the proposed method.

The remainder of the paper is organized as follows. Section 2 briefly reviews two related works. The proposed multi-scale PMR via margin distribution optimization is presented in Section 3. Section 4 conducts extensive experiments, and Section 5 concludes this paper.

Related Works

1. Nuclear norm based matrix regression

By observing the distribution of the reconstruction error image, a nuclear norm-based matrix regression (NMR) [29] model was proposed that uses the nuclear norm to characterize the whole structure of the error image. Here, we define Ni as the number of images from the i-th class and N = N1 + N2 + … + Nc as the total number of training samples from c classes. Given a set of N training image matrices A1, A2, …, AN ∈ ℜ^{row×col} and a query image matrix B ∈ ℜ^{row×col}, the NMR model can be represented as

min_x ||B − A(x)||_* + (λ/2)||x||_2^2,  (1)

where λ is the regularization parameter, and x and A(x) = x1A1 + x2A2 + … + xNAN are the representation coefficient vector and the reconstructed image, respectively. The query image is then classified into the class that yields the minimal reconstruction error, i.e.,

Identity(B) = argmin_i ||B − A(δi(x*))||_*,  (2)

where x* is the optimal solution of Eq (1) and δi(x) is a vector whose only nonzero entries are the entries in x that correspond to class i. NMR is much more robust and effective for face recognition, particularly with respect to occlusion and illumination changes.
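The class-assignment rule of Eq (2) is straightforward to sketch in code. The following is a minimal illustration, assuming the coefficient vector x* has already been obtained by solving Eq (1); the function and variable names are ours, not the authors':

```python
import numpy as np

def nmr_classify(B, A_list, labels, x_star):
    """Eq (2): assign query matrix B to the class whose
    coefficient-restricted reconstruction yields the smallest
    nuclear-norm residual."""
    labels = np.asarray(labels)
    best_class, best_err = None, np.inf
    for c in np.unique(labels):
        # delta_c(x*): keep only the coefficients of class c
        x_c = np.where(labels == c, x_star, 0.0)
        recon = sum(xc * A for xc, A in zip(x_c, A_list))
        # nuclear norm = sum of singular values of the residual
        err = np.linalg.norm(B - recon, ord='nuc')
        if err < best_err:
            best_class, best_err = c, err
    return best_class
```

Note that only the classification step is shown here; solving Eq (1) itself requires an iterative solver such as the ADMM scheme discussed later.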

2. Patch-based CRC

Suppose that we have c known pattern classes. Let Xi = [xi1, xi2, …, xiNi] be the matrix formed by the training samples of class i, where Ni is the number of training samples of class i. Let X = [X1, X2, …, Xc] ∈ ℜ^{M×N} be the dataset of all training samples, where N = N1 + N2 + … + Nc. To alleviate the performance degradation of CRC in the small sample size problem, the patch-based CRC (PCRC) [36] model was proposed. For a given query image y, it is first divided into multiple overlapped patches {y1, y2, …, yp}. Then, each patch yi is tackled by representing it as a linear combination over a local dictionary Di. Finally, one can apply a plurality or linear-weighted combination scheme to the recognition outputs for a final decision.

For each patch yi, its representation weights can be obtained by minimizing the following error:

βi = argmin_β ||yi − Di β||_2^2 + λ||β||_2^2,  (3)

where Di = [Di1, Di2, …, Dic] denotes the local dictionary located at the same position as yi, and Dik is the sub-dictionary of the k-th class. The recognition result of patch yi is Identity(yi) = argmin_k {||yi − Dik βik||_2 / ||βik||_2}, where βik is the vector of coefficients in βi associated with the k-th class.
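Since the ridge problem in Eq (3) has a closed-form solution, the PCRC patch-coding and classification step can be sketched compactly. The sketch below follows the standard CRC regularized-residual rule [18]; the function name and the small stabilizing constant are illustrative:

```python
import numpy as np

def pcrc_patch(y, D, labels, lam=0.001):
    """Code patch y over the local dictionary D (columns = atoms)
    with a squared-L2 regularizer, then classify by the
    regularized class-wise residual."""
    labels = np.asarray(labels)
    M = D.shape[1]
    # closed-form collaborative coding: (D^T D + lam*I)^{-1} D^T y
    beta = np.linalg.solve(D.T @ D + lam * np.eye(M), D.T @ y)
    best_class, best_score = None, np.inf
    for c in np.unique(labels):
        mask = labels == c
        resid = np.linalg.norm(y - D[:, mask] @ beta[mask])
        # regularized residual: small residual AND large class energy win
        score = resid / (np.linalg.norm(beta[mask]) + 1e-12)
        if score < best_score:
            best_class, best_score = c, score
    return best_class
```

In PCRC this routine runs once per patch, and the per-patch labels are then fused for the final decision.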

For clarity, four key components (i.e., multi-scale trick, local patch strategy, structure error and pixel error characterization) of several related methods are compared in Table 1.

Multi-Scale Patch-Based Matrix Regression (MSPMR)

1. Motivation

In PCRC, each local patch matrix is first converted to a vector, and then the L2-norm is used to characterize the reconstruction error. However, the L2-norm (or L1-norm) is based on pixel values and thus ignores the structural information of the error image. Fig 1 shows an example where the error between (b) and (a) is shown in (c). By re-arranging the pixels of image (c), we can obtain image (d). The following observations can be made:

  1. The nuclear norm can better characterize the structure error than the L1 or L2 norm. For example, the L2-norm (or L1-norm) value of image (c) is the same as that of image (d), so it is difficult to distinguish between them. Fortunately, the nuclear norm values of images (c) and (d) are 47.75 and 58.14, respectively.
  2. From the distribution perspective, we can observe that the distribution of the error image does not follow a Laplacian or Gaussian distribution in Fig 1e. Fortunately, it can be seen from Fig 1f that the singular values of the error image (c) fit the Laplacian distribution well. We know that the nuclear norm is the sum of all singular values of a matrix, which can also be considered as the l1-norm of the singular value vector. Based on the above example, we believe that the nuclear norm is more suitable to describe the structural error.
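The pixel-rearrangement argument above is easy to verify numerically. The following is a minimal sketch, assuming a rank-one (hence highly structured) error image; for such a matrix the nuclear norm equals the Frobenius norm, while any unstructured rearrangement of the same pixels has a strictly larger nuclear norm:

```python
import numpy as np

rng = np.random.default_rng(0)
# a structured (rank-one) "error image"
E = np.outer(rng.random(8), rng.random(8))
# rearrange its pixels at random, destroying the low-rank structure
F = rng.permutation(E.ravel()).reshape(E.shape)

l2_E = np.linalg.norm(E)              # pixel-based (Frobenius/L2) norm
l2_F = np.linalg.norm(F)
nuc_E = np.linalg.norm(E, ord='nuc')  # sum of singular values
nuc_F = np.linalg.norm(F, ord='nuc')

# The pixel-based norm cannot tell the two images apart,
# whereas the nuclear norm is smaller for the structured one.
print(abs(l2_E - l2_F) < 1e-9, nuc_E < nuc_F)
```

This mirrors the comparison between images (c) and (d) in Fig 1: identical pixel multisets, hence identical L1/L2 norms, but different nuclear norms.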
Fig 1. (a) Original image; (b) observed image; (c) error image; (d) rearranged error image; (e) distributions of error image; and (f) distributions of singular values of error image.

https://doi.org/10.1371/journal.pone.0159945.g001

2. Patch-based matrix regression (PMR)

To make the model robust and efficient for face recognition with occlusion and illumination changes, matrix regression [29, 30, 32] was proposed using the nuclear norm to characterize the structure of the error image. In our patch-based matrix regression, all local patches are denoted in matrix form. Given a set of N local patches Xi1, Xi2, …, XiN ∈ R^{p×q} and a query patch Yi ∈ R^{p×q} located at position i, Yi can be represented linearly using Xi1, Xi2, …, XiN, i.e.,

Yi = F(αi) + Ei,  (4)

where F(αi) = αi1Xi1 + αi2Xi2 + … + αiNXiN, αi = (αi1, …, αiN)^T is the representation coefficient vector and Ei is the representation error. Generally, αi can be determined by the following regularized model:

min_{αi} ||Yi − F(αi)||_* + λ||αi||_1,  (5)

where ||·||_* denotes the nuclear norm (the sum of the singular values) on R^{p×q}.

By introducing the auxiliary variables Ei = Yi − F(αi) and zi = αi, the problem is equivalent to

min_{αi, Ei, zi} ||Ei||_* + λ||zi||_1  s.t.  Yi − F(αi) = Ei,  αi = zi.  (6)

Problem (6) can be solved by the alternating direction method of multipliers (ADMM), which minimizes the following augmented Lagrangian function:

L(αi, Ei, zi, Λ1, Λ2) = ||Ei||_* + λ||zi||_1 + ⟨Λ1, Yi − F(αi) − Ei⟩ + ⟨Λ2, αi − zi⟩ + (μ/2)(||Yi − F(αi) − Ei||_F^2 + ||αi − zi||_2^2),  (7)

where Λ1 and Λ2 are the Lagrange multipliers and μ > 0 is a penalty parameter.

That is, after completing the squares,

L(αi, Ei, zi, Λ1, Λ2) = ||Ei||_* + λ||zi||_1 + (μ/2)||Yi − F(αi) − Ei + Λ1/μ||_F^2 + (μ/2)||αi − zi + Λ2/μ||_2^2 − (1/(2μ))(||Λ1||_F^2 + ||Λ2||_2^2).  (8)

The entire algorithm is briefly summarized in Algorithm 1, which mainly consists of two steps: a soft-thresholding operator [44] and a singular value thresholding operator [45].

Based on the optimal solution αi*, we can obtain the reconstructed image of Yi as Ŷi = F(αi*). Let δk: R^N → R^N be the characteristic function that selects the coefficients associated with the k-th class. For α ∈ R^N, δk(α) is a vector whose nonzero entries are the entries in α that are associated with class k. Using the coefficients associated with the k-th class, one can obtain the reconstruction of Yi in class k as Ŷi^k = F(δk(αi*)).

Algorithm 1. Solving problem (6) via ADMM

Input: A set of N patches Xi1, Xi2, …, XiN ∈ R^{p×q} and a query patch Yi ∈ R^{p×q}, parameters λ and μ, the termination condition parameter ε.

1: Fix the others and update αi^{k+1} by solving the least squares problem min_α ||Yi − F(α) − Ei^k + Λ1^k/μ||_F^2 + ||α − zi^k + Λ2^k/μ||_2^2.

2: Fix the others and update Ei^{k+1} by the singular value thresholding operator: Ei^{k+1} = SVT_{1/μ}(Yi − F(αi^{k+1}) + Λ1^k/μ).

3: Fix the others and update zi^{k+1} by the soft-thresholding operator: zi^{k+1} = soft_{λ/μ}(αi^{k+1} + Λ2^k/μ).

4: Update the multipliers: Λ1^{k+1} = Λ1^k + μ(Yi − F(αi^{k+1}) − Ei^{k+1}), Λ2^{k+1} = Λ2^k + μ(αi^{k+1} − zi^{k+1}).

5. If the termination condition max(||Yi − F(αi^{k+1}) − Ei^{k+1}||_F, ||αi^{k+1} − zi^{k+1}||_2) < ε is satisfied, go to 6; otherwise go to 1.

6. Output: Optimal coding vector αi* = αi^{k+1}.

The corresponding class reconstruction error is defined as

e_ik(Yi) = ||Yi − F(δk(αi*))||_*.  (9)

The recognition output of the query patch Yi is then Identity(Yi) = argmin_k {e_ik(Yi)}.
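Algorithm 1 can be sketched in a few dozen lines. The following is a minimal sketch that assumes, consistent with the soft-thresholding and singular value thresholding operators mentioned above, an l1-regularized nuclear-norm model with auxiliary variables for the error matrix and the coefficients; the update formulas follow the standard ADMM recipe and are not the authors' exact implementation:

```python
import numpy as np

def soft_threshold(v, tau):
    """Soft-thresholding operator: proximal map of the l1-norm."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def svt(M, tau):
    """Singular value thresholding: proximal map of the nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def pmr_admm(Y, X_list, lam=0.01, mu=1.0, n_iter=200):
    """Sketch: solve min_a ||Y - sum_j a_j X_j||_* + lam*||a||_1
    via ADMM with splitting E = Y - F(a), z = a."""
    N = len(X_list)
    p, q = Y.shape
    # stack vectorized patches as columns so F(a) = (Phi @ a).reshape(p, q)
    Phi = np.stack([X.ravel() for X in X_list], axis=1)
    G = Phi.T @ Phi + np.eye(N)          # from the two quadratic couplings
    a = np.zeros(N); z = np.zeros(N)
    E = np.zeros((p, q)); L1 = np.zeros((p, q)); L2 = np.zeros(N)
    for _ in range(n_iter):
        # a-step: least squares coupling both constraints
        t1 = (Y - E + L1 / mu).ravel()
        a = np.linalg.solve(G, Phi.T @ t1 + z - L2 / mu)
        R = (Phi @ a).reshape(p, q)
        # E-step: singular value thresholding of the residual
        E = svt(Y - R + L1 / mu, 1.0 / mu)
        # z-step: soft thresholding of the coefficients
        z = soft_threshold(a + L2 / mu, lam / mu)
        # multiplier steps
        L1 = L1 + mu * (Y - R - E)
        L2 = L2 + mu * (a - z)
    return a
```

At convergence a ≈ z, and the per-class residuals e_ik of Eq (9) can be computed from the returned coefficients exactly as in the NMR classification rule.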

Then we can combine the classification outputs of all patches by linear weighted combination [37], probabilistic model [40], kernel plurality [34] or majority voting [35]. In this paper, we use majority voting in the final decision making. Fig 2 shows the main diagram of the patch-based matrix regression for face recognition.
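The majority-voting fusion used here reduces to a one-liner; a sketch (note that `Counter.most_common` breaks ties by first encounter):

```python
from collections import Counter

def majority_vote(patch_labels):
    """Fuse per-patch identities by majority voting over all patches."""
    return Counter(patch_labels).most_common(1)[0][0]
```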

Fig 2. Diagram of patch-based matrix regression for face recognition.

https://doi.org/10.1371/journal.pone.0159945.g002

3. Multi-scale ensemble

From the previous introduction of PMR, we can see that the patch size plays an important role in the final output performance. In addition, how to set an optimal scale in advance for various databases remains unclear. Fig 3 exhibits the recognition rate curves versus different training sample sizes and patch sizes on the LFW and Extended Yale B databases, respectively. From Fig 3, the following observations can be made. First, the optimal patch size varies greatly between databases. Second, the optimal patch size also varies with the training sample size per person. To tackle these difficulties, the recognition outputs of multi-scale PMR can be fused optimally; thus, the complementary information from different scales can be fully exploited to further enhance the recognition performance. Motivated by [36], we incorporate an ensemble learning scheme into our method to integrate the multi-scale outputs.

Fig 3. Impact of patch size on PMR (1–5 denote the training sample size per subject).

https://doi.org/10.1371/journal.pone.0159945.g003

The diagram of the proposed multi-scale PMR is shown in Fig 4. In the following text, we first formulate the multi-scale ensemble problem, and then introduce a margin distribution optimization to obtain the optimal solution.

Problem formulation.

Suppose that we have two scales and two sample classes labeled +1 and -1. For any query sample, we can obtain its classification result, +1 or -1, on each scale. Thus, each sample has four possible classification results on these two scales: {-1,+1}, {+1,+1}, {-1,-1} and {+1,-1}. Given an available training data set, our goal is to learn a classification function f that classifies all the given samples correctly.

Given a sample set S = {(xi, zi)} (i = 1, 2, …, n, where zi is the label of xi) and s scales, the classification results of the samples xi on these s scales form a space H ⊆ R^{n×s}. Denote by w = [w1, w2, …, ws] the scale fusion vector, whose entries sum to one.

Definition 1 [36]: For multi-class classification tasks, the classification results of a given query sample xi ∈ S on the s different scales are denoted as Lj(xi), j = 1, 2, …, s. Then the decision matrix D = {dij}, i = 1, 2, …, n, j = 1, 2, …, s, can be defined as

dij = +1 if Lj(xi) = zi, and dij = −1 otherwise,  (10)

where zi is the label of the sample xi.

Definition 2 [36]: For a query sample xi ∈ S, Lj(xi) (j = 1, 2, …, s) are the classification results on the s different scales. Then the ensemble margin of xi is denoted as

ε(xi) = Σ_{j=1}^{s} wj dij.  (11)

The ensemble loss of xi can be denoted as [36]

loss(xi) = (1 − ε(xi))^2,  (12)

where ε(xi) is the ensemble margin of sample xi. The square loss applied in CRC [18], SRC [15] and least squares regression [16] can be used here to evaluate the ensemble loss. For a sample set S, its ensemble square loss can be formulated as

L(S) = ||e1 − Dw||_2^2,  (13)

where e1 is a column vector whose entries are all 1.
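Definitions 1 and 2 translate directly into a few lines of code. A sketch, assuming the usual ±1 correct/incorrect encoding of the decision matrix (function names are illustrative):

```python
import numpy as np

def decision_matrix(preds, z):
    """Eq (10): d_ij = +1 if the scale-j prediction for sample i
    matches its label, -1 otherwise. preds is n x s, z has length n."""
    return np.where(preds == np.asarray(z)[:, None], 1.0, -1.0)

def ensemble_square_loss(D, w):
    """Eq (13): squared loss between the all-ones margin target e1
    and the fused margins D @ w."""
    n = D.shape[0]
    return np.sum((np.ones(n) - D @ w) ** 2)
```

A perfectly fused ensemble drives every margin ε(xi) toward 1, i.e., drives this loss toward zero.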

Algorithm of MSPMR.

In order to obtain the optimal scale fusion weights, the ensemble square loss in Eq (13) should be minimized. Nevertheless, the solution of this linear system may be non-unique. Intuitively, we should impose a constraint on the objective function in Eq (13) to make the solution unique and stable. Moreover, Shawe-Taylor [46] provided a bound on the generalization error and pointed out that both the norm of w and the ensemble square loss should be optimized simultaneously to enhance the generalization ability.

As in [36], the following constrained l1-regularized least squares optimization can be used to obtain the optimal scale weights [47]:

min_w ||e1 − Dw||_2^2 + τ||w||_1  s.t.  Σ_{j=1}^{s} wj = 1,  (14)

where τ is a regularization parameter and the regularization term helps to achieve a stable solution. The constraint can be written as e2w = 1, where e2 = [1, 1, …, 1] is a row vector of length s. Then, we have

min_w ||e1 − Dw||_2^2 + τ||w||_1  s.t.  e2w = 1.  (15)

Denoting D̃ = [D; e2] and ẽ = [e1; 1] (i.e., appending the constraint row e2 to D and the entry 1 to e1), we then have [36]

min_w ||ẽ − D̃w||_2^2 + τ||w||_1.  (16)

Algorithm 2. Algorithm of multi-scale ensemble learning for PMR

1: Choose s patch scales δ = [δ1, δ2, …, δs];

2: Obtain the recognition outputs on each scale by PMR;

3: Obtain the decision matrix D via Eq (10);

4: Learn the fusion weights w via Eq (16).

Due to the fact that the decision matrix is usually very small, the scale fusion weights w can be obtained by commonly used l1-minimization solvers; in our method, l1_ls [48] is employed. Based on the above description, the proposed multi-scale PMR (MSPMR) scheme is summarized in Algorithm 2. Once the optimal scale fusion weights are obtained, the recognition output for an arbitrary sample xi is given by weighted voting over its s scale outputs with the weights w.
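The weight-learning step can be sketched as follows. This sketch solves the problem of Eq (14) with a generic SLSQP solver instead of the l1_ls package used in the paper, and additionally assumes a nonnegativity constraint on w (a common choice in such ensemble schemes, not stated explicitly above):

```python
import numpy as np
from scipy.optimize import minimize

def learn_scale_weights(D, tau=0.1):
    """Sketch of Eq (14): min_w ||e1 - D w||^2 + tau*||w||_1
    subject to sum(w) = 1 and (assumed) w >= 0."""
    n, s = D.shape
    e1 = np.ones(n)
    obj = lambda w: np.sum((e1 - D @ w) ** 2) + tau * np.sum(np.abs(w))
    cons = ({'type': 'eq', 'fun': lambda w: np.sum(w) - 1.0},)
    bounds = [(0.0, 1.0)] * s            # nonnegativity (assumed)
    w0 = np.full(s, 1.0 / s)             # start from uniform weights
    res = minimize(obj, w0, method='SLSQP', bounds=bounds, constraints=cons)
    return res.x
```

With the nonnegativity and sum-to-one constraints in force, the solver concentrates the weight on the scales whose decisions are most often correct on the held-out subset.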

4. Computational complexity

In this subsection, we evaluate the computational complexity of the proposed method. Since the multi-scale fusion weights can be learned off-line, we only discuss the computational complexity of the on-line recognition process. As illustrated in Algorithm 2, the proposed face recognition method spends most of its cost on the patch-based matrix regression process. Four factors affect the cost of our method: the training sample size N, the dimension of one patch m = p×q, the number of iterations k in Algorithm 1, and the number of patches M in one image.

As described in [30], the matrix regression of each patch costs O(k(m^1.5 + mN + N^2)) (in the case that p = q). For M image patches, the computational cost is O(k(m^1.5 + mN + N^2)M). In addition, the number of scales s also affects the final running time. Therefore, the computational cost of the proposed method is about O(sk(m^1.5 + mN + N^2)M). In Section 4, we further compare the proposed method with state-of-the-art approaches in terms of CPU runtime.

Experimental Results and Discussion

In this section, we conduct experiments on benchmark face databases and compare the proposed method with state-of-the-art models. For each method, we perform 20 test runs on each database, and the average recognition rates and the corresponding standard deviations are reported. As in [36], seven scales are adopted in our MSPMR, with patch sizes 4×4, 6×6, 8×8, 10×10, 12×12, 14×14 and 16×16. In the single-scale-based PNN, PSRC, PCRC and PMR, the patch size is 10×10, and the patches overlap with their neighbors by 5 pixels. For PMR and MSPMR, we choose the optimal λ∈[0.01,0.1]. The parameter τ is set to 0.1 for MSPMR. It should be mentioned that all experiments are performed on the original face images, without any feature extraction or image preprocessing step. Several face image datasets were used in this paper to verify the performance of our methods. These datasets are publicly available for face recognition research, so consent was not needed. The face images and the experimental results are reported in this paper without any commercial purpose.

As in [36], to learn the optimal scale weights, the training set is divided into subset1 (one image per person) and subset2 (the remainder of the training set). Then, PMR is used to classify the samples from subset1 using subset2 as the gallery set, and the optimal weights for the seven scales are learned. It should be noted that at least two samples per person are required to find the optimal scale fusion weights.

1. Face recognition without occlusion

In this subsection, we test the MSPMR for face recognition without occlusion on four face databases (Extended Yale B [49] and AR [50] in controlled environments together with the LFW [36] in uncontrolled environments). The baseline CRC [18], SRC [15], NSC [30], and patch-based methods including PNN [34], BlockFLD [38], Volterra [35], PCRC and MSPCRC [36] are used for comparison.

Extended Yale B database.

The first experiment was conducted on the Extended Yale B database, which includes 38 human subjects in 9 poses under 64 illumination conditions [49]. The 64 images of a person in a particular pose are acquired at a camera frame rate of 30 frames per second, so the variations in head pose and facial expression are small. All the frontal images marked with P00 are utilized in this experiment, and each is reshaped to 32×32. Some examples are shown in Fig 5. For each subject, 2~5 samples are randomly selected from the first 32 images for training, and another 5 samples are randomly chosen from the remaining 32 images for testing. Table 2 tabulates the experimental results.

Fig 5. Sample images of a person under various illumination conditions in the Extended Yale B database from different sessions.

https://doi.org/10.1371/journal.pone.0159945.g005

Table 2. Recognition rates (%) on the Extended Yale B database.

https://doi.org/10.1371/journal.pone.0159945.t002

It can easily be seen that MSPMR obtains the best recognition performance for all tests. Compared with PCRC and MSPCRC, PMR and MSPMR lead to much better results, thus verifying the effectiveness of characterizing the reconstruction error by the nuclear norm.

AR database.

The AR database [50] gathers over 4,000 color face images from 126 subjects, containing frontal facial images with different lighting conditions, facial expressions and occlusions. Pictures of 120 subjects were taken in two sessions (separated by two weeks), and each subject has 13 color images per session. As in [18], in this experiment we choose a subset with only illumination and expression changes, which includes 50 male subjects and 50 female subjects. Fourteen face images (seven from each session) of each of these 100 individuals are selected and used. For each subject, 2~5 samples from session 1 are randomly chosen for training, and another 3 samples from session 2 are randomly chosen for testing. All the images are manually cropped and then resized to 32×32 pixels. Some sample images of one person are presented in Fig 6.

The recognition results of different methods are listed in Table 3. The proposed methods always achieve better performance than the other methods. We can observe that in AR database, multi-scale ensemble learning in MSPMR leads to limited improvement over PMR. As described in [36], the reason may be that in this database, the average weight value for the scale 10×10 is approximately 0.9, indicating that patch size 10×10 is a proper choice for PMR in the AR database.

LFW database.

Labeled Faces in the Wild (LFW) [43] is a large-scale database of face photographs designed for unconstrained face recognition with variations in pose, illumination, expression, misalignment and occlusion; it contains images of 5,749 subjects. LFW-a is an extension of LFW obtained by applying commercial face alignment software. As in [36], the subjects who have more than ten samples are gathered to form a dataset with 158 subjects from LFW-a. All the images are manually cropped and then resized to 32×32 pixels. Fig 7 shows some sample images from this database. For each subject, we randomly choose 2~5 samples for training and another 2 samples for testing.

Table 4 shows the face recognition results of each method on the LFW dataset. From Table 4, we can clearly see that the performance of our PMR and MSPMR is superior to that of all the other methods. Moreover, the recognition performance is greatly improved by the multi-scale ensemble in MSPMR.

2. Face recognition with occlusion

In the following experiments, we evaluate the robustness and effectiveness of the proposed method when face images are subject to different occlusions, such as real disguise or block occlusion. In this subsection, our method is compared with CRC [18], SRC [15], NSC [30], HQ_A and HQ_M [26], PSRC [15], PCRC and MSPCRC [36].

Face recognition with real disguise.

As in [29, 32], a subset of the AR face database is applied, containing 50 males and 50 females. Each face image is manually cropped and normalized to a size of 42×30. Fig 8 shows the sample images for one person from the AR database. In our experiment, for each individual, the first four images (with various facial expressions) from session 1 and session 2 are chosen to form the training set. Two image sets with sunglasses and scarves are used for testing, each of which includes 600 images (three images per session of each individual). For each individual, 2~5 samples are randomly chosen from the training set and another 3 samples from the testing set to evaluate the performance of each method.

Fig 8. Training and testing images of a person in the AR database.

https://doi.org/10.1371/journal.pone.0159945.g008

The recognition results of each method are shown in Tables 5 and 6, from which we can see that the patch-based methods achieve better performance than the corresponding holistic ones. PMR also gives better results than PCRC and MSPCRC. MSPMR obtains the best performance among all the competing methods when testing on images with sunglasses and achieves comparable results when testing on images with scarves.

Table 5. Recognition rates (%) on the AR database testing with sunglasses.

https://doi.org/10.1371/journal.pone.0159945.t005

Table 6. Recognition rates (%) on the AR database testing with scarves.

https://doi.org/10.1371/journal.pone.0159945.t006

Face recognition with block occlusions.

In this subsection, we evaluate the robustness of our method against block occlusions. We adopt Subsets 1 and 2 of the Extended Yale B database for training and Subset 3 for testing. All the face images are normalized to 48×42 pixels. The testing images are corrupted by a randomly located square block of a “baboon” image with an occlusion level of 40%. Fig 9 shows the training and testing sample images for one person from the Extended Yale B database. For each individual, 2~5 samples are randomly chosen from the training set and another 5 samples from the testing set to evaluate the performance of each method.

Fig 9. Training (the first row) and testing (the second row) sample images of a person in the Extended Yale B database.

https://doi.org/10.1371/journal.pone.0159945.g009

The face recognition results of each method are tabulated in Table 7. We can see that by characterizing the reconstruction error with the nuclear norm, NSC overall outperforms CRC, SRC, HQ_A and HQ_M. By virtue of the patch trick, our PMR always outperforms PCRC and PSRC. By incorporating the multi-scale ensemble learning trick, the proposed MSPMR achieves the best performance among all the competing methods.

Table 7. Recognition rates (%) on the Extended Yale B database with block occlusion.

https://doi.org/10.1371/journal.pone.0159945.t007

3. Parameter discussion

In this subsection, we mainly discuss how the regularization parameter λ affects the performance of our PMR and MSPMR in different face recognition scenarios. The experimental settings are the same as in the aforementioned experiments in Sections 4.1 and 4.2 except that the number of training samples per person is fixed at 3. Fig 10 plots the recognition results of PMR and MSPMR versus the regularization parameter λ on different face image databases. We can observe that PMR and MSPMR always achieve their optimal or nearly optimal performance in the range [0.01, 0.1]. Thus, we can set the regularization parameter of the proposed method within this range for real-world scenarios.

Fig 10. Recognition rate curves of PMR and MSPMR versus the variations in the regularization parameter in different face recognition scenarios.

https://doi.org/10.1371/journal.pone.0159945.g010

4. Running time comparisons

In this subsection, the CPU runtime of the proposed method is compared with that of the state-of-the-art methods. The results on the AR face database testing with scarves are listed for demonstration. For each individual, 3 samples are randomly chosen for training and another 3 samples for testing. Table 8 tabulates the CPU time spent on all test images, measured using Matlab R2012b on a Windows PC with an Intel Core 8 CPU at 3.6 GHz and 8 GB of memory. Owing to the singular value shrinkage operator in matrix regression, the proposed method consumes much more time than the other methods. Although the patch-based methods achieve promising results, they come at the cost of expensive running time. Because the recognition process of each test image is independent, the cost can be reduced by parallel computation.

Table 8. Comparisons of CPU time on AR face database testing with scarves.

https://doi.org/10.1371/journal.pone.0159945.t008
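The singular value shrinkage operator mentioned above is the standard proximal operator of the nuclear norm [45]; each application requires a full SVD, which explains the runtime gap in Table 8. A minimal sketch (in Python rather than the Matlab used in the experiments):

```python
import numpy as np

def svt(X, tau):
    """Singular value shrinkage (soft-thresholding) operator: the proximal
    operator of the nuclear norm and the per-iteration bottleneck of
    nuclear norm-based matrix regression."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)   # soft-threshold each singular value
    return U @ np.diag(s_shrunk) @ Vt

# One SVD per patch per iteration, O(min(m, n) * m * n) time each.
A = np.diag([3.0, 1.0, 0.2])
print(np.linalg.svd(svt(A, 0.5), compute_uv=False))  # [2.5, 0.5, 0.0]
```

Since the operator is applied independently to each patch of each test image, the patch-level calls are embarrassingly parallel, which is the basis of the parallelization remark above.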

5. Evaluation of the experimental results

The aforementioned experimental results have shown that the proposed method always obtains better performance than several state-of-the-art methods. However, is this superiority statistically significant? In this subsection, we assess the experimental results with a null hypothesis statistical test [51]. If the evaluated p-value is below the desired significance level (here, 0.05), the performance difference between the compared approaches is deemed statistically significant. The evaluation results are summarized as follows:

  1. For face recognition without occlusion, such as on the LFW database, MSPMR significantly outperforms MSPCRC in all tests (p = 0.014, 0.013, 0.016 and 0.020). On the other databases, although MSPMR performs better than the other state-of-the-art methods, the performance discrepancies between MSPMR and the other approaches are not statistically significant.
  2. For face recognition with occlusion, MSPMR performs significantly better than the other approaches in the cases of real disguise and block occlusion (p < 0.001).
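One common instance of such a test, sketched below for illustration (the paper's exact procedure follows [51]), is a paired t-test over per-trial recognition rates; the per-trial rates here are hypothetical numbers, not results from the paper.

```python
import math
from statistics import mean, stdev

def paired_t(a, b):
    """t statistic for the paired difference between two methods'
    per-trial recognition rates (degrees of freedom df = n - 1).
    The p-value follows from the t(df) distribution."""
    d = [x - y for x, y in zip(a, b)]
    n = len(d)
    return mean(d) / (stdev(d) / math.sqrt(n)), n - 1

# Hypothetical per-trial rates for MSPMR versus a baseline:
mspmr = [0.91, 0.89, 0.92, 0.90, 0.93]
base  = [0.87, 0.86, 0.88, 0.87, 0.89]
t, df = paired_t(mspmr, base)
print(t, df)  # a large positive t implies a small p-value
```

A consistent per-trial advantage, even a small one, yields a large t statistic and hence a small p-value, which is why the occlusion experiments reach p < 0.001.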

Conclusions and Future Work

To improve the performance of matrix regression in the face of the small sample size problem while preserving the desired performance level in the presence of occlusion and illumination changes, in this paper we proposed a patch-based matrix regression (PMR) method. PMR first performs matrix regression on each raw patch (without matrix-to-vector conversion), and then combines the recognition outputs of all patches by majority voting. However, it is difficult to pre-define an optimal patch size across different databases. Fortunately, the complementary information across multiple patch scales can be exploited to further enhance recognition performance. To this end, we proposed the multi-scale version of PMR, MSPMR, which optimally combines the multi-scale outputs. Our extensive experimental results demonstrate that the proposed methods are more effective and robust than the state-of-the-art methods.
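The majority-voting step of PMR summarized above is simple to state in code; this is a minimal sketch of the combination rule only (the per-patch matrix regression that produces the labels is omitted).

```python
from collections import Counter

def majority_vote(patch_labels):
    """Combine per-patch predicted identities by majority voting, as PMR
    does after running matrix regression on each raw patch."""
    return Counter(patch_labels).most_common(1)[0][0]

# Identities predicted independently for 7 patches of one test face:
print(majority_vote([3, 3, 5, 3, 1, 3, 5]))  # 3
```

MSPMR replaces this unweighted vote with an optimally weighted combination of the outputs from several patch scales.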

Although our proposed method has achieved good performance, there are still issues to be addressed in the future. Generally, two main improvements can be made. (1) With advances in storage devices, large numbers of images can be collected in real-world applications, so one challenge for our method is its high computational cost. We will try to design more efficient matrix regression algorithms to reduce this cost while maintaining robustness and effectiveness. (2) In our method, several patch scales must be pre-defined in advance, whereas different databases may exhibit scale variations in real-world applications. We can borrow the idea of scale-selective local binary patterns [52] to design an adaptive scale selection strategy and further improve the flexibility of our method.

Ethics Statement

Several face image datasets were used in this paper to verify the performance of our methods. These datasets are publicly available for face recognition research, so consent was not needed. The face images and experimental results are reported in this paper without any commercial purpose.

Author Contributions

  1. Conceived and designed the experiments: GWG JY XYJ.
  2. Performed the experiments: GWG PH JLH.
  3. Analyzed the data: GWG JY XYJ DY.
  4. Wrote the paper: GWG JY XYJ DY.

References

  1. Wen X, Shao L, Xue Y, Fang W (2015) A rapid learning algorithm for vehicle classification. Information Sciences 295:395–406.
  2. Gu B, Sheng VS, Tay KY, Romano W, Li S (2015) Incremental support vector learning for ordinal regression. IEEE Transactions on Neural Networks and Learning Systems 26(7):1403–1416. pmid:25134094
  3. Juang C-F, Chiu S-H, Shiu S-J (2007) Fuzzy system learned through fuzzy clustering and support vector machine for human skin color segmentation. IEEE Transactions on Systems, Man, and Cybernetics-Part A: Systems and Humans 37(6):1077–1087.
  4. Gu B, Sheng VS, Wang Z, Ho D, Osman S, Li S (2015) Incremental learning for v-support vector regression. Neural Networks 67:140–150. pmid:25933108
  5. Deng Z, Choi K-S, Jiang Y, Wang S (2014) Generalized hidden-mapping ridge regression, knowledge-leveraged inductive transfer learning for neural networks, fuzzy systems and kernel methods. IEEE Transactions on Cybernetics 44(12):2585–2599. pmid:24710838
  6. Gu B, Sheng VS (2016) A robust regularization path algorithm for v-support vector classification. IEEE Transactions on Neural Networks and Learning Systems.
  7. Jiang Y, Chung F-L, Ishibuchi H, Deng Z, Wang S (2015) Multitask TSK fuzzy system modeling by mining intertask common hidden structure. IEEE Transactions on Cybernetics 45(3):534–547.
  8. Gu B, Sun X, Sheng VS (2016) Structural minimax probability machine. IEEE Transactions on Neural Networks and Learning Systems.
  9. Jain AK, Ross A, Prabhakar S (2004) An introduction to biometric recognition. IEEE Transactions on Circuits and Systems for Video Technology 14(1):4–20.
  10. Wong WK, Lai Z, Xu Y, Wen J, Ho CP (2015) Joint tensor feature analysis for visual object recognition. IEEE Transactions on Cybernetics 45(11):2425–2436. pmid:26470058
  11. Yang M, Feng Z, Shiu SC, Zhang L (2014) Fast and robust face recognition via coding residual map learning based adaptive masking. Pattern Recognition 47(2):535–543.
  12. Gao G, Yang J, Wu S, Jing X, Yue D (2015) Bayesian sample steered discriminative regression for biometric image classification. Applied Soft Computing 37:48–59.
  13. Chen B, Shu H, Coatrieux G, Chen G, Sun X, Coatrieux JL (2015) Color image analysis by quaternion-type moments. Journal of Mathematical Imaging and Vision 51(1):124–144.
  14. Yang W, Wang Z, Sun C (2015) A collaborative representation based projections method for feature extraction. Pattern Recognition 48(1):20–27.
  15. Wright J, Yang AY, Ganesh A, Sastry SS, Ma Y (2009) Robust face recognition via sparse representation. IEEE Transactions on Pattern Analysis and Machine Intelligence 31(2):210–227. pmid:19110489
  16. Naseem I, Togneri R, Bennamoun M (2010) Linear regression for face recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 32(11):2106–2112. pmid:20603520
  17. Wagner A, Wright J, Ganesh A, Zhou Z, Mobahi H, Ma Y (2012) Toward a practical face recognition system: Robust alignment and illumination by sparse representation. IEEE Transactions on Pattern Analysis and Machine Intelligence 34(2):372–386. pmid:21646680
  18. Zhang L, Yang M, Feng X (2011) Sparse representation or collaborative representation: Which helps face recognition? 2011 IEEE International Conference on Computer Vision (ICCV), IEEE, pp. 471–478.
  19. Gao G, Yang J (2014) A novel sparse representation based framework for face image super-resolution. Neurocomputing 134:92–99.
  20. Jiang J, Hu R, Wang Z, Han Z (2014) Noise robust face hallucination via locality-constrained representation. IEEE Transactions on Multimedia 16(5):1268–1281.
  21. Tawari A, Trivedi MM (2013) Face expression recognition by cross modal data association. IEEE Transactions on Multimedia 15(7):1543–1552.
  22. Lai Z, Xu Y, Jin Z, Zhang D (2014) Human gait recognition via sparse discriminant projection learning. IEEE Transactions on Circuits and Systems for Video Technology 24(10):1651–1662.
  23. Lai Z, Xu Y, Chen Q, Yang J, Zhang D (2014) Multilinear sparse principal component analysis. IEEE Transactions on Neural Networks and Learning Systems 25(10):1942–1950. pmid:25291746
  24. Yang J, Zhang L, Xu Y, Yang J-y (2012) Beyond sparsity: The role of L1-optimizer in pattern classification. Pattern Recognition 45(3):1104–1118.
  25. Yang M, Zhang L, Yang J, Zhang D (2011) Robust sparse coding for face recognition. 2011 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, pp. 625–632.
  26. He R, Zheng W-S, Tan T, Sun Z (2014) Half-quadratic-based iterative minimization for robust sparse representation. IEEE Transactions on Pattern Analysis and Machine Intelligence 36(2):261–275. pmid:24356348
  27. Yang M, Zhu P, Liu F, Shen L (2015) Joint representation and pattern learning for robust face recognition. Neurocomputing 168:70–80.
  28. Ou W, You X, Tao D, Zhang P, Tang Y, Zhu Z (2014) Robust face recognition via occlusion dictionary learning. Pattern Recognition 47(4):1559–1572.
  29. Yang J, Luo L, Qian J, Tai Y, Zhang F, Xu Y (2016) Nuclear norm based matrix regression with applications to face recognition with occlusion and illumination changes. IEEE Transactions on Pattern Analysis and Machine Intelligence, preprint.
  30. Luo L, Yang J, Qian J, Yang J (2014) Nuclear norm regularized sparse coding. 2014 22nd International Conference on Pattern Recognition (ICPR), IEEE, pp. 1834–1839.
  31. Zhang F, Yang J, Tai Y, Tang J (2015) Double nuclear norm-based matrix decomposition for occluded image recovery and background modeling. IEEE Transactions on Image Processing 24(6):1956–1966. pmid:25667350
  32. Qian J, Luo L, Yang J, Zhang F, Lin Z (2015) Robust nuclear norm regularized regression for face recognition with occlusion. Pattern Recognition 48(10):3145–3159.
  33. Tan X, Chen S, Zhou Z-H, Zhang F (2006) Face recognition from a single image per person: A survey. Pattern Recognition 39(9):1725–1745.
  34. Kumar R, Banerjee A, Vemuri BC, Pfister H (2011) Maximizing all margins: Pushing face recognition with kernel plurality. 2011 IEEE International Conference on Computer Vision (ICCV), IEEE, pp. 2375–2382.
  35. Kumar R, Banerjee A, Vemuri BC (2009) Volterrafaces: Discriminant analysis using Volterra kernels. IEEE Conference on Computer Vision and Pattern Recognition, IEEE, pp. 150–155.
  36. Zhu P, Zhang L, Hu Q, Shiu SC (2012) Multi-scale patch based collaborative representation for face recognition with margin distribution optimization. 2012 European Conference on Computer Vision (ECCV), Springer, pp. 822–835.
  37. Tan X, Chen S, Zhou Z-H, Zhang F (2005) Recognizing partially occluded, expression variant faces from single training image per person with SOM and soft k-NN ensemble. IEEE Transactions on Neural Networks 16(4):875–886. pmid:16121729
  38. Chen S, Liu J, Zhou Z-H (2004) Making FLDA applicable to face recognition with one sample per person. Pattern Recognition 37(7):1553–1555.
  39. Su Y, Shan S, Chen X, Gao W (2009) Hierarchical ensemble of global and local classifiers for face recognition. IEEE Transactions on Image Processing 18(8):1885–1896. pmid:19556198
  40. Lin D, Tang X (2006) Recognize high resolution faces: From macrocosm to microcosm. IEEE Computer Society Conference on Computer Vision and Pattern Recognition, IEEE, pp. 1355–1362.
  41. Wolf L, Hassner T, Taigman Y (2011) Effective unconstrained face recognition by combining multiple descriptors and learned background statistics. IEEE Transactions on Pattern Analysis and Machine Intelligence 33(10):1978–1990. pmid:21173442
  42. Guillaumin M, Verbeek J, Schmid C (2009) Is that you? Metric learning approaches for face identification. 2009 IEEE International Conference on Computer Vision (ICCV), IEEE, pp. 498–505.
  43. Huang GB, Ramesh M, Berg T, Learned-Miller E (2007) Labeled faces in the wild: A database for studying face recognition in unconstrained environments. Technical Report 07–49, University of Massachusetts, Amherst.
  44. Lin Z, Chen M, Ma Y (2010) The augmented Lagrange multiplier method for exact recovery of corrupted low-rank matrices. arXiv preprint arXiv:1009.5055.
  45. Cai JF, Candes EJ, Shen ZW (2010) A singular value thresholding algorithm for matrix completion. SIAM Journal on Optimization 20(4):1956–1982.
  46. Shawe-Taylor J, Cristianini N (1999) Robust bounds on generalization from the margin distribution. 4th European Conference on Computational Learning Theory.
  47. Shen C, Li H (2010) On the dual formulation of boosting algorithms. IEEE Transactions on Pattern Analysis and Machine Intelligence 32(12):2216–2231. pmid:20975119
  48. Kim S-J, Koh K, Lustig M, Boyd S, Gorinevsky D (2007) An interior-point method for large-scale l1-regularized least squares. IEEE Journal of Selected Topics in Signal Processing 1(4):606–617.
  49. Lee KC, Ho J, Kriegman DJ (2005) Acquiring linear subspaces for face recognition under variable lighting. IEEE Transactions on Pattern Analysis and Machine Intelligence 27(5):684–698. pmid:15875791
  50. Martinez AM, Benavente R (1998) The AR face database. CVC Technical Report 24.
  51. Beveridge JR, She K, Draper B, Givens GH (2001) Parametric and nonparametric methods for the statistical evaluation of human id algorithms. 3rd Workshop on the Empirical Evaluation of Computer Vision Systems, pp. 1–14.
  52. Guo Z, Wang X, Zhou J, You J (2016) Robust texture image representation by scale selective local binary patterns. IEEE Transactions on Image Processing 25(2):687–699. pmid:26685235