Advertisement
Browse Subject Areas
?

Click through the PLOS taxonomy to find articles in your field.

For more information about PLOS Subject Areas, click here.

  • Loading metrics

Illumination Normalization of Face Image Based on Illuminant Direction Estimation and Improved Retinex

  • Jizheng Yi,

    Affiliation School of Electronic and Information Engineering, Beihang University, Beijing, 100191, China

  • Xia Mao,

    Affiliation School of Electronic and Information Engineering, Beihang University, Beijing, 100191, China

  • Lijiang Chen ,

    clj@ee.buaa.edu.cn

    Affiliation School of Electronic and Information Engineering, Beihang University, Beijing, 100191, China

  • Yuli Xue,

    Affiliation School of Electronic and Information Engineering, Beihang University, Beijing, 100191, China

  • Alberto Rovetta,

    Affiliation Department of Mechanics, Polytechnic University of Milan, Milan, 20156, Italy

  • Catalin-Daniel Caleanu

    Affiliation Applied Electronics Department, University POLITEHNICA Timisoara, Timisoara, 300223, Romania

Illumination Normalization of Face Image Based on Illuminant Direction Estimation and Improved Retinex

  • Jizheng Yi, 
  • Xia Mao, 
  • Lijiang Chen, 
  • Yuli Xue, 
  • Alberto Rovetta, 
  • Catalin-Daniel Caleanu
PLOS
x

Abstract

Illumination normalization of face image for face recognition and facial expression recognition is one of the most frequent and difficult problems in image processing. In order to obtain a face image with normal illumination, our method firstly divides the input face image into sixteen local regions and calculates the edge level percentage in each of them. Secondly, three local regions, which meet the requirements of lower complexity and larger average gray value, are selected to calculate the final illuminant direction according to the error function between the measured intensity and the calculated intensity, and the constraint function for an infinite light source model. After knowing the final illuminant direction of the input face image, the Retinex algorithm is improved from two aspects: (1) we optimize the surround function; (2) we intercept the values in both ends of histogram of face image, determine the range of gray levels, and stretch the range of gray levels into the dynamic range of display device. Finally, we achieve illumination normalization and get the final face image. Unlike previous illumination normalization approaches, the method proposed in this paper does not require any training step or any knowledge of 3D face and reflective surface model. The experimental results using extended Yale face database B and CMU-PIE show that our method achieves better normalization effect comparing with the existing techniques.

Introduction

Many objective factors restrict the development of face recognition (FR) and facial expression recognition (FER) systems, such as face posture, illumination variation, and so on. Some works [1, 2] have pointed out that the changes caused by the variation of illumination could be more significant than the differences between individual’s physical appearance. Furthermore, some researchers [3] affirm that the variation of illumination could bring more negative influence to FR comparing with the pose and expression. Removing the negative effects of illumination over FR and FER is a challenging problem in image processing which generates intensive research efforts. The relevant work in the field [49] could be briefly divided into three categories: face modeling, invariant features extraction, and illumination normalization. Face modeling assumes some face images with same pose but with different illumination condition. In this way, a low-dimensional illuminant subspace is created and the main idea is to find out the degree of illumination variation in the low-dimensional illuminant subspace. In order to diminish or even eliminate the negative effects of illumination variations over FR and FER, the method of invariant features extraction focuses on extracting those features which are insensitive to illumination variation from face images. The method of illumination normalization preprocesses face images and aims to obtain face images with uniform illumination. This method does not require any training images, prior knowledge of the 3D face models or the reflection parameters. Moreover, its calculation process is relatively simple. These advantages make illumination normalization the preferred solution for removing the negative effects of illumination over FR and FER. Some relevant works are briefly summarized below.

Histogram Equalization (HE) [10] usually increases the global contrast of many face images and often produces realistic effects in those face images with backgrounds and foregrounds that are both bright or both dark. However, this method fails in images with backgrounds and foregrounds that are quite different, such as photographs. There are also a number of improved HE algorithms, such as Block-based Histogram Equalization (BHE) [11], Adaptive Histogram Equalization (AHE) [12], Oriented Local Histogram Equalization (OLHE) [13], and so on. Local Normalization Technology (LNT) [14] can effectively eliminate the negative effect of uneven illumination while keeping the local statistical properties of the processed image the same as in the corresponding image under normal illumination condition. Xie and Lam [15] use both LNT and HE to eliminate the effects of nonuniform illumination, resulting higher accuracy face recognition rates. Ruiz-del-Solar and Quinteros [16] apply the Illumination Plane Subtraction (IPS) together with HE and in order to reduce shadows caused by extreme lighting angles. In their paper, they have given some experimental results and claimed that IPS together with HE could be applied in face detection and recognition. Before applying the illumination compensation and normalization algorithms, they apply other pre-processing stages to obtain aligned face images of uniform size. IPS together with HE is evaluated in a face identification scenario (1 against n comparison). In the first stage of their study, IPS together with HE was applied in 16 different face recognition systems, which are built using four different projections methods and four different similarity measures. The recognition rates can be found in table 1 of their paper [16]. Emadi et al. [2] conclude that the illumination components reside in the low frequency sub-band. In their method, an input face image is decomposed into its high frequency and low frequency components. After setting low frequency components to zero and approximating new coefficients, face image is reconstructed by wavelet inverse transform. Self-Quotient Image (SQI) [17, 18] is based on the Quotient Image (QI) [19] method and has become an important illuminant normalization method for FR and FER. Wang et al. [20] introduce a nine-dimension face illumination subspace based on QI and construct a lower-dimension training matrix. Thus, they synthesize the nine typical illuminant samples and standard illuminant samples, and implement the illumination normalization of the gray and the color images. Based on the mathematical foundations of Land’s Retinex theory, Jobson et al. define a practical implementation of the Retinex without particular concern for its validity as a model for face imagery and color perception and present the Single-Scale Retinex (SSR) algorithm [21] and the Multi-Scale Retinex (MSR) algorithm [22]. In order to achieve illumination invariant eye detection under varying lighting conditions, Jung et al. [23] use adaptive smoothing based on the Retinex theory to remove the illumination effects. To some extent, although the MSR algorithm can partially reduce the ‘halo’ phenomenon of SSR algorithm, the result of illumination normalization is still not ideal. Local Binary Pattern (LBP) [24] is a typical illumination invariant feature extraction method. Cheng et al. [25] propose Local Binary Patterns Image (LBPI), which uses the LBP texture descriptors to every point of face image and combines the LBP features for all points into an image, to improve the performance of face recognition under various illumination conditions. Bayu and Miura [26] create an adaptive contrast ratio based on Fuzzy by considering two models (appearance estimation model and shadow coefficient model) of individual face as input, and then apply a Genetic Algorithm to optimize the Fuzzy’s rule. They use Principal Component Analysis (PCA) and Nearest Neighbor (NN) based on correlation distance as the classifiers, and the experimental result shows that their algorithm is robust enough in order to normalize uneven illumination moderate and hard illumination [26]. For avoiding the ‘halo’ phenomenon, Yang et al. [27] proposes a face recognition method under the condition of illumination changes based on the improved the Retinex algorithm. The experimental results show that the improved Retinex algorithm reduces the time complexity and gets good image enhancement effect, and it can be a better solution to the ‘halo’ issue [27]. In addition, there are some algorithms [2830] claim that they can provide good illumination normalization effect.

Although all the methods mentioned above can diminish or eliminate the impact of uneven illumination in face images, still some drawbacks persist: the need for training images, prior knowledge of the face models, reflection parameters, etc. It is known that the face images will appear different under different illumination directions: face regions which are closer to the light source tend to be brighter and the ones which are farther from the light source have the opposite characteristic. Our proposed method estimates the illuminant directions of face images and takes advantage of the improved Retinex algorithm. The experiments are performed using the extended Yale face database B and CMU-PIE. The results show that the proposed method can restore the original texture information and structure characteristic, and achieve more favorable normalization effect comparing with the existing techniques.

The rest of the paper is organized as follows. In Section 2, the framework and details of our proposed illuminant normalization method are given. In Section 3, the experimental results and analysis are performed on the databases whereby the proposed method is evaluated and compared to other methods. Finally, conclusions and discussions are presented in Section 4.

The Proposed Method

The system architecture of the proposed illuminant normalization method is shown in Fig 1, which includes two parts: illuminant direction estimation of the input face image and illumination normalization based on improved Retinex. The basic procedure of the proposed method is that we firstly estimate the illumination directions of face images based on local region complexity analysis and average gray value, then we improve the Retinex algorithm considering the results of illuminant direction estimation, and finally we cope with the problem of illumination normalization.

thumbnail
Fig 1. System architecture for the proposed illuminant normalization method.

Reprinted from http://vision.ucsd.edu/~leekc/ExtYaleDatabase/ExtYaleB.html under a CC BY license, with permission from David Kriegman, original copyright 2001.

https://doi.org/10.1371/journal.pone.0122200.g001

2.1 Illuminant Direction Estimation

The illuminant direction estimation method we applied in this paper was formulated in our previous work [31], and the reasons why we need to estimate the illuminant directions of face images for illumination normalization are given in Subsection 2.2.

Different regions in face images make different contributions to illuminant direction estimation. For example, the smooth regions play a more important role than the concavo-convex ones [32]. Usually, as illustrated in Fig 2, the smooth regions have similar textures and simple edges, whereas the concavo–convex regions have opposite characteristics. So it is worthwhile to find out regions with simple edges for illuminant direction estimation. The system architecture of the illuminant direction estimation method is shown in Fig 3. In order to better describe our method, we list the basic six steps of the method, as shown below.

thumbnail
Fig 2. Some examples of different illumination distributions.

Reprinted from http://vision.ucsd.edu/~leekc/ExtYaleDatabase/ExtYaleB.html under a CC BY license, with permission from David Kriegman, original copyright 2001.

https://doi.org/10.1371/journal.pone.0122200.g002

thumbnail
Fig 3. The main steps of the illuminant direction estimation method.

It should be noted that three of sixteen local regions are selected to calculate the final illuminant direction.

https://doi.org/10.1371/journal.pone.0122200.g003

  1. Adjust the sizes of face images (color or gray) to a uniform value. Color face images are transformed to YCbCr color space, and the luminance component Y is selected as the input gray image. Gray face images are directly utilized as the input.
  2. Use the Canny edge detector to find object boundaries in the luminance component Y and get the binary edge image.
  3. Divide the binary edge image and the luminance component Y into sixteen local regions. The reason why we split the image into exactly sixteen regions has been described in our previously published manuscript [31].
  4. For each local region, analyze its complexity depending on the edge level percentage and calculate its average gray value.
  5. Estimate the illuminant directions of the three selected local regions with less complexity and large average gray value. The reason why we choose exactly three regions has been described in our previously published manuscript [31].
  6. Synthesize the three illuminant directions and transform the result to the final illuminant direction in the original face image (color or gray).

In all the steps mentioned above, the local region complexity analysis and illuminant direction estimation are particularly important. Chacon et al. [33, 34] present a description of image complexity analysis based on the edge level percentage which is defined by 1 Where |•| indicates cardinality of a set, p(x,y) denote the gray value at pixel (x,y), and width × height is the dimension of the image. After receiving the edge level percentage of each local region, we calculate the average gray value of each local region, and then select three local regions with less complexity and large gray value for estimating the illuminant direction.

Model-based illuminant direction estimation is an effective and popular way to estimate a complex scene illumination. These models include Lambertian, dichromatic, Torrance-Sparrow models, and so on. Our proposed illuminant direction estimation method is inspired by model-based illuminant direction estimation and aims to achieve a higher recognition rate in less execution time.

The normal direction of a pixel could be seen as the direction in which the most dramatic change in gray values occurs, which means the points on the normal lines have maximal gray-scale difference from the center pixel. Sometimes, a plurality of points meets the requirements of the calculation process. For this case, the final normal direction is defined as the summation of normal lines, and the magnitude of the normal is the value of maximal gray scale, which is the difference between the points on the normal lines and the center pixel. So the normal direction of any pixel in the image could be calculated.

The mth and nth selected regions are represented as fm(x,y) and fn(x,y) (m,n = 1,2,3 and mn), respectively. Thus, the error function between the calculated intensity Mv and the measured intensity b is given by 2 Where ∥•∥ indicates the magnitude of a matrix, and are two components of the illumination direction corresponding to the mth selected region, and are two components of the illumination direction corresponding to the nth selected region, Nx(fm(xi,yi)) and Ny(fm(xi,yi)) are two components of the normal vector in pixel (xi,yi) of the mth selected region, Nx(fn(xi,yi)) and Ny(fn(xi,yi)) are two components of the normal vector in pixel (xi,yi) of the nth selected region, and kα is the constant intensity of environmental light. Because the human face occupies a very small percentage of the entire illuminant scene, the face images are considered to meet the infinite light source model. Therefore, the constraint function is given by

3

Thus, the final error function is defined as 4 where λ is the Lagrange multiplier. So the process of calculating v is translated into seeking the optimal solution for the following system of equations:

5

Seeking the optimal solution for Eq. (5) is equivalent to solving the following equations:

6

For the constraint function being equal to zero, we know that Lm is equal to Ln. Our approach assumes that L(m,n) represent Lm or Ln. Supposing that (m,n) exists three particular cases ((1,2), (2,3), and (3,1)), the three illuminant directions (L(1,2), L(1,2), and L(1,2)) in a face image could be calculated by using Eq. (6).

Although L(1,2), L(1,2), and L(1,2) are often different from the actual illuminant direction, they are all useful and have different contributions to estimate the final illuminant direction. The edge level percentages can be used to adjust L by calculating

7

Let represent the synthesized illuminant direction:

8

We assume the size of resized face image is Q×Q. The final transformed illuminant direction Lfinal is calculated by

9

Real illuminant directions and estimated illuminant directions are shown in Fig 4.

thumbnail
Fig 4. Real illuminant directions and estimated illuminant directions.

Reprinted from http://vision.ucsd.edu/~leekc/ExtYaleDatabase/ExtYaleB.html under a CC BY license, with permission from David Kriegman, original copyright 2001.

https://doi.org/10.1371/journal.pone.0122200.g004

2.2 The Improved Retinex Theory

Retinex theory was described in 1971 by Land and McCann [35]. The word ‘Retinex’ is a portmanteau formed from ‘retina’ and ‘cortex’, suggesting that both the eye and the brain are involved in the processing. Retinex theory supposes that the image S(x,y) is the product of two different images, the reflectance R(x,y) with high frequency components and the illumination L(x,y) with low frequency components. The relationship between the three can be expressed as

10

By estimating the illuminant component, illumination normalization based on Retinex theory decomposes the reflection component, thus eliminates the influence of uneven illumination to improve the visual effect of face image. Eq. (10) is usually transformed into logarithmic form

11

Let s(x,y), r(x,y), and l(x,y) represent log[S(x,y)], log[R(x,y)], and log[L(x,y)], respectively.

12

Thus,

13

That is to say, since s(x,y) is known, we can calculate r(x,y) by estimating l(x,y). Therefore, a reasonable and effective method for estimating l(x,y) is the key to illumination normalization based on Retinex theory.

Compared to the previous Retinex, the center/surround Retinex can significantly increase the operation speed and present better processing effect. SSR [21] is a typical center/surround Retinex method whose mathematical representation derives from Eq. (13). 14 where ‘*’ denotes the convolution operation, and F(x,y) is the surround function. The surround function determines the type of SSR and has several forms: 15 16 17 18 where c is the scale parameter of surround function, and K is selected such that

19

For illumination normalization of face image, Eq. (19) has the discrete representation:

20

Jobson et al. [21] have proved that SSR can present the best illumination normalization effect when the surround function is a Gaussian function. Let G(x,y) represent F(x,y), then we can rewrite Eq. (14):

21

For SSR, with the decreasing of scale parameter, image details become more prominent. Conversely, the dynamic range compression of gray value is more obvious. In order to overcome the shortcomings of SSR, Jobson et al. [22] present MSR which is an extension of SSR. The MSR output is a weighted sum of the outputs of several different SSR using different scale parameters (small, middle, and large). Mathematically, this is expressed by, 22 where k designates the number of scales and usually takes the value of 3, {ωi}(i = 1,2,3) are the weights associated with the different scales and usually take the value of {1/3,1/3,1/3}, and the scale parameters {ci}(i = 1,2,3) usually take the value of {15,80,250}. When the image to be processed is RGB color image, three color bands are usually individually enhanced in MSR, which might create color distortion. To solve this problem, the Multi-Scale Retinex with Color Restoration (MSRCR) [36, 37] has emerged, 23 where denotes the color restoration factor of color band X (X = R,G,B), and C is a constant whose value is 125.

Due to the existence of logarithmic computation, image pixels tend to be negative in MSR and MSRCR and cannot provide a good visual effect after the directly inverse logarithmic computation. Therefore, it is necessary to translate and stretch the gray levels of the output images into the dynamic range of display device by the gain compensation technique [38]. This process can be represented as: 24 where is the color band X (X = R,G,B) of face image after MSRCR, is the color band X (X = R,G,B) of face image after the gain compensation, α and β are the gain and the compensation respectively, rmax and rmin are maximum and minimum of the gray levels of the three color bands respectively, and dmax is the dynamic range of display device and usually takes the value of 255.

We mainly improve Retinex algorithm from two aspects. Firstly, we optimize the surround function. Secondly, we intercept the values in both ends of histogram of face image, determine the range of gray levels, and stretch the range of gray levels into the dynamic range of display device.

The traditional center/surround Retinex estimates the illuminant component by the convolution between the Gaussian function and the face image. We give the three-dimensional and two-dimensional representations of Gaussian function in Fig 5, where the points with the same value have the same Euclidean distance to the center. For illumination normalization of face image, the slowly changing illuminant components are often seen as the low frequency components. Furthermore, the structure of Gaussian function exhibit low pass filter characteristic. Therefore, the convolution between the Gaussian function and the face image can perform illuminant component extraction.

thumbnail
Fig 5. The three-dimensional and two-dimensional representations of Gaussian function used in MSRCR.

(a) No normalization. (b) Normalization.

https://doi.org/10.1371/journal.pone.0122200.g005

The central region’s coordinates of conventional Gaussian and their corresponding values are shown in Fig 6A and 6B Gaussian convolution in the center/surround Retinex method is essentially a process of weighted average for the central pixel and its surrounding pixels. a, b, and c (a > b > c > 0) in Fig 6B are the weights whose values represent the confidence levels. Here, the confidence level means the pixel’s contribution to illuminant component estimation of the central pixel. The traditional Gaussian function makes center/surround Retinex method have an essential property: these pixels with the same Euclidean distance to the center have the same contribution to illuminant component estimation of the central pixel. Therefore, we can conclude that the traditional center/surround Retinex method holds the opinion that illuminant distribution is even within its specified image region. However, Gaussian convolution template with large size usually occupies most of the image, and the assumption of illuminant uniform distribution within this template is not realistic.

thumbnail
Fig 6. Central region’s coordinates of Gaussian and their corresponding values.

(a) Coordinate. (b) The distribution of conventional Gaussian’s values. (c) The real situation.

https://doi.org/10.1371/journal.pone.0122200.g006

In reality, the pixels which are closer to the light source have the stronger illuminant components, and the ones which are away from the light source have the opposite characteristic. Therefore, we believe that Gaussian function for estimating the illuminant component should have directivity, thus the pixels whose illuminant components are stronger should have smaller weights and the ones with the opposite characteristic larger weights. It is assumed that the illuminant direction is from top to bottom. We adjust Gaussian function and get a new distribution about the central region’s values, as shown in Fig 6C where a, b, c, d, and e (a > b > c > d > e > 0) indicate the weights.

For improving the Retinex theory, we firstly introduce the parameters t1 and t2 (t1,t2 ≥ 1) to Gaussian function.

25

Eq. (25) generates three different cases according to the differences of t1 and t2.

Case 1: if t1 = t2 then

26

Case 2: if t1t2 and 1 ≤ t2 < t1 then

27

Case 3: if t1t2 and 1 ≤ t1 < t2 then

28

Comparing with the original Gaussian function, Eq. (26) has no essential differences in the distribution of values. Therefore, this case does not meet our requirement. Obviously, it can be transformed into each other between Eq. (27) and Eq. (28) through rotation and scale change. Therefore, the two equations can be simplified as:

29

We further optimize Eq. (29) and obtain the modified Gaussian function:

30

It should be noted that the Gaussian function in Eq. (30) is adapted to the illuminant direction at an angle of zero degrees (from top to bottom).

We have estimated the final illuminant direction Lfinal, as is presented in Subsection 2.1. Let be the rotation function where θ is the angle of rotation. If θ is positive, the Gaussian function rotates θ degrees counterclockwise; if θ is negative, the Gaussian function rotates −θ degrees clockwise. The rotated Gaussian function can be defined as

31

For example, Fig 7 shows the three-dimensional and two-dimensional representations of rotated Gaussian function, where Lfinal is 60°, 120°, −60°, and −120°, respectively. As can be seen from Fig 7, according to the different illuminant distributions of face images, the improved Gaussian function with different directivities makes the pixels whose illuminant components are stronger obtain the smaller weights and the ones with the opposite characteristics have the larger weights.

thumbnail
Fig 7. The three-dimensional and two-dimensional representations of Gaussian function used in our method.

(a) 60°. (b) 120°. (c) −60°. (d) −120°.

https://doi.org/10.1371/journal.pone.0122200.g007

Moreover, another shortcoming of traditional Retinex theory in Eq. (24) that rmax and rmin are just maximum and minimum of the gray levels in the image, respectively. After analyzing the MSR, Zhao [38] suggested that the selection of two intercept point in histogram is the key to the global luminance adjustment. The gain compensation technique used in Eq. (24) makes the values which have small probability and locate in both ends of histogram take up too much gray level. So these values which truly present the details of face images do not have enough gray level. From this we intercept the values in both ends of histogram, determine the range of gray levels, and stretch the range of gray levels into the dynamic range of display device. Interception rules can be expressed as: 32 where and are the intercepted maximum and minimum in the histogram of the color band X (X = R,G,B) of face image, respectively. Let and , the range of gray levels in intercepted image is [rlow,rup]. Then, linear stretching can be implemented as following: 33

For determining and , we adjust the method proposed in [38] and state the rule: 34 where and are the numbers of abandoned pixels locating in the left and right of the histogram, respectively, Nsum denotes the total number of pixels whose value is width × height. The detail about the interception can be seen in Fig 8.

Experimental Results and Analysis

In this section, we test and verify our proposed method by several experiments on gray face images from the extended Yale face database B [39, 40] and color face images from CMU-PIE [41], since the two face databases are commonly used to evaluate the performance of illumination normalization. Under the same experimental condition, we compare the proposed method with Histogram Equalization (HE) [12], Local Normalization Technology (LNT) [14], Local Mean Map (LMM) of an image representing its low-frequency contents [15], Local Variance Map (LVM) carrying the image’s high-frequency components [15], LNT together with HE [15], Illumination Plane Subtraction (IPS) together with HE [16], Single-Scale Retinex (SSR) [21], Multi-Scale Retinex (MSR) [22], Multi-Scale Retinex with Color Restoration (MSRCR) [36, 37], and Local Binary Pattern (LBP) [24]. As our work focus on illumination normalization, only frontal face images without variations in head pose are considered and resized to 394×326 pixels. It should be noted that the parameters of Canny edge detector [T1 T2] and σ are set as [0.04 0.10] and 1.5, respectively. All the experiments are performed in Matlab2011 environment on a computer with an Intel (R) Core (TM)2 Duo CPU with a clock speed of 2.2 GHz, 2 GB RAM and Windows XP Professional.

Five experiments were carried out in our work. In the first part, we compare the different experimental results for different parameter t which is presented in Eq. (30). In the second part, the experimental results for the case of gray face images are described. Then, the third part describes the experimental results for the case of color face images. In the fourth part, for face images which are processed by different illumination normalization methods, we perform the feature point localization by Active Appearance Model (AAM) [42]. Finally, in the fifth part, we show the results of face recognition task, and present the performance measures used to evaluate the feasibility of our proposed method.

3.1 Choice of Parameter t

For the parameter t in Eq. (30), we aim to determine its best value in illumination normalization. We perform some comparisons among three different values of parameter t which are small, middle, and large, respectively, whereas the number of face images is 10. The experimental results of one sample are given in Fig 9 where the parameter t takes the value of 1.5, 5.5, and 10.5, respectively. Obviously, when the parameter t is 1.5, the effect of illumination normalization is satisfactory.

thumbnail
Fig 9. Comparison among three different sizes of parameter t.

Reprinted from http://vision.ucsd.edu/~leekc/ExtYaleDatabase/ExtYaleB.html under a CC BY license, with permission from David Kriegman, original copyright 2001.

https://doi.org/10.1371/journal.pone.0122200.g009

3.2 Experimental Results for Gray Face Images

To evaluate the effectiveness of the proposed method, gray face images from the extended Yale face database B are used in the second experiment. This database is set up for performance evaluation of face recognition and illuminant-processing algorithms under large variations in illumination and pose. In our work, 342 face images of 38 human subjects representing nine illuminant conditions (0° elevation) under frontal pose are employed. The human subjects comprise 10 individuals in the original Yale face database B and 28 individuals in the extended Yale face database B. It should be noted that the parameter t is 1.5 in this experiment. The left side of Fig 10 illustrates the original images in the databases on the first row, the images processed by HE on the second row, those processed by LNT on the third row, those processed by LMM on the fourth row, those processed by LVM on the fifth row, those processed by LNT + HE on the sixth row. The right side of Fig 10 illustrates those processed by IPS on the first row, those processed by IPS + HE on the second row, those processed by SSR on the third row, those processed by MSR on the fourth row, those processed by LBP on the fifth row, and those processed by our algorithm on the bottom row. Obviously, the results in Fig 10 prove that comparing with others, our proposed method can better recover uncertain contours of face images, more effectively remove the negative effect of illumination, more satisfactorily preserve the texture information of face images, and more successfully enhance the image contrast. Of course, all the advantages mentioned above are useful and important to the feature point localization and then for FR and FER.

thumbnail
Fig 10. Comparison of different illuminant normalization methods for gray face images from the Yale face database B.

Reprinted from http://vision.ucsd.edu/~leekc/ExtYaleDatabase/ExtYaleB.html under a CC BY license, with permission from David Kriegman, original copyright 2001.

https://doi.org/10.1371/journal.pone.0122200.g010

3.3 Experimental Results for Color Face Images

Between October 2000 and December 2000, Sim et al. [41] collected a total of 41,368 facial images of 68 people and built the CMU-PIE database. All the images were captured in the CMU 3D Room, and each subject was imaged across 13 different poses, under 43 different illumination conditions, and with 4 different expressions. For our third experiment, we employ the frontal facial images (c27 in the numbering scheme) with significant illumination variation and the room lights switched on and off. It should be noted that all the experimental images are colorful and the parameter t is also 1.5 in this experiment. Otherwise, for the facial images of three color bands (R, G, and B), we firstly apply the illuminant normalization methods to one color band and then combine the three results into the final color image. Fig 11 shows some representative results processed by some different illuminant normalization methods and ours. From this figure, it can be seen that the visual performance of our method is superior to that of the other methods.

thumbnail
Fig 11. Comparison of different illuminant normalization methods for color face images from the CMU-PIE database.

Reprinted from http://www.ri.cmu.edu/research_project_detail.html?project_id=418&menu_id=261 under a CC BY license, with permission from Takeo Kanade, original copyright 2003.

https://doi.org/10.1371/journal.pone.0122200.g011

3.4 Feature Point Localization

It is known that texture information and structural characteristic are two important factors for FR or FER. The results of the above two experiments have shown our method can effectively remove the negative effect of illumination and satisfactorily preserve the texture information of face images. In this experiment, we aim to test and verify if method is helpful to supply the structural characteristics to the subsequent studies. Facial feature point located by AAM is a popular and important factor to represent the structural characteristic of face image. Therefore, for face images processed by different illuminant normalization methods, we compare the effects of feature point location in this experiment. We select the face images with uniform illumination to construct the training set of AAM. Furthermore, it should be noted that the training set is rebuilt for the different samples in our experiments. That is to say, for one location experiment, the training set and testing set are from the same person. Some samples’ effects of feature point location are shown in Fig 12.

thumbnail
Fig 12. The analysis of feature point location.

Reprinted from http://vision.ucsd.edu/~leekc/ExtYaleDatabase/ExtYaleB.html under a CC BY license, with permission from David Kriegman, original copyright 2001.

https://doi.org/10.1371/journal.pone.0122200.g012

The results in Fig 12 demonstrate the advantage of our proposed method. In order to further analyze the effect of feature point location, we assume that the coordinate of feature point A marked by human is (xh,yh) and the coordinate of feature point A marked by computer is (xc,yc). The error rA is defined as the Euclidean distance between the two coordinates.

35

Let k be the error that can be allowed. When rA is less than or equal to k, we assume that the location is successful, otherwise it is a fail. A total of 2320 feature points of 40 face images (20 gray images and 20 color images) were selected for the experiment. It should be noted that the participants have different age, skin color, and gender. When k is 3, 7, and 10, we calculate the location accuracies of different methods (see Fig 13). As shown in Fig 13, as k increases, the location accuracies of all the methods also increase. Meanwhile, the location accuracy of our method always has the highest value.

thumbnail
Fig 13. The location accuracies for different methods.

(a) Gray image. (b) Color image.

https://doi.org/10.1371/journal.pone.0122200.g013

3.5 Face Recognition

In order to further verify the superiority of our proposed method, we perform face recognition in the two databases based on feature point localization. For each sample of the extended Yale face database B, three face images with uniform illumination are selected to the training set, and the rest with uneven illumination constitutes the testing set. So the training set consists of 114 face images. For each sample of CMU-PIE database, we employ four frontal face images (c27 in the numbering scheme) with uniform illumination to constitute the training set, and the rest with uneven illumination constitutes the testing set. So the training set consists of 272 face images. Note that all the selected face images of the two databases are in a neutral emotional state.

In a face image, the distance between two points can be expressed as:

36

All the distances constitute one set {dij} (0 < i < j ≤ 58). Let d′ be the maximum value of the set, that is

37

We normalize every dij and get set

38

Note that D is the input feature of our face recognition task. After calculating the input features to all face images, we reference [30] and apply Nearest Neighbor Decision (NND) to recognize every test sample’s identity. We compare our method with other well-known methods to prove its improvement. As shown in Fig 14, our proposed method can reach the highest recognition rate compared to the others for both the extended Yale face database B and CMU-PIE database. The satisfactory recognition rate obtained by our proposed method further shows that our algorithm is capable to remove the negative effect of illumination and preserve the texture and structural information of face image.

Conclusion

In this paper, a new method for illumination normalization of face image has been proposed. In our method, we firstly estimate the given face image’s illuminant direction based on local region complexity analysis and average gray value. Then, with the help of the calculated illuminant direction, we improve the original Retinex algorithm by redefining the surround function and adjusting global luminance. Our original method creates promising perspectives in illumination normalization for face images. The experimental results for gray and color face images show the significant advantages of the proposed method over the existing ones.

Our proposed method aims to improve the facial image illumination condition and could be used in the subsequent FR and FER processes. Although the proposed method performs well on overall illumination normalization, it fails to completely eliminate the local shadows and false edges around the nose, which is the most challenging task for all of the illuminant normalization methods. Fig 15 illustrates this problem. Therefore, the authors will devote research effort to overcome this limitation in the future.

thumbnail
Fig 15. Local shadow and false edge around the nose.

Reprinted from http://vision.ucsd.edu/~leekc/ExtYaleDatabase/ExtYaleB.html under a CC BY license, with permission from David Kriegman, original copyright 2001.

https://doi.org/10.1371/journal.pone.0122200.g015

The authors would like to thank the providers of the extended Yale face database B and the CMU-PIE.

Author Contributions

Conceived and designed the experiments: JY XM LC YX AR. Performed the experiments: JY XM LC YX AR. Analyzed the data: JY XM LC YX AR. Contributed reagents/materials/analysis tools: JY XM LC YX AR. Wrote the paper: JY XM LC YX AR CC.

References

  1. 1. Adini Y, Moses Y, Ullman S. Face recognition: The problem of compensation for changes in illumination direction. IEEE Transactions on Pattern Analysis and Machine Intelligence 1997; 19 (7): 721–732.
  2. 2. Emadi M, Khalid M, Yusof R, Navabifar F. Illumination Normalization using 2D Wavelet. Procedia Engineering 2012; 41: 854–859.
  3. 3. Chen W, Er MJ, Wu S. Illumination compensation and normalization for robust face recognition using discrete cosine transform in logarithm domain. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics 2006; 36 (2): 458–466.
  4. 4. Zhou Y, Zhou ST, Zhong ZY, Li HG. A de-illumination scheme for face recognition based on fast decomposition and detail feature fusion. Optics Express 2013; 21 (9): 11294–11308. pmid:23669986
  5. 5. Qin J, Silver RM, Barnes BM, Zhou H, Goasmat F. Fourier domain optical tool normalization for quantitative parametric image reconstruction. Applied Optics 2013; 52 (26): 6512–6522. pmid:24085127
  6. 6. Filippi AM, Güneralp İ. Influence of shadow removal on image classification in riverine environments. Optics Letters 2013; 38 (10): 1676–1678. pmid:23938908
  7. 7. Restrepo R, Uribe-Patarroyo N, Belenguer T. Improvement of the signal-to-noise ratio in interferometry using multi-frame high-dynamic-range and normalization algorithms. Optics Communications 2012; 285 (5): 546–552.
  8. 8. Mian A. Illumination invariant recognition and 3D reconstruction of faces using desktop optics. Optics Express 2011; 19 (8): 7491–7506. pmid:21503057
  9. 9. Diaz-Ramirez VH, Kober V. Target recognition under nonuniform illumination conditions. Applied Optics 2009; 48 (7): 1408–1418. pmid:19252643
  10. 10. Pizer SM, Amburn EP, Austin JD, Cromartie R, Geselowitz A, Greer T, et al. Adaptive histogram equalization and its variations. Computer Vision, Graphic, and Image Processing 1987; 39: 355–368.
  11. 11. Xie X, Lam K. Face recognition under varying illumination based on a 2D face shape model. Pattern Recognition 2005; 38 (2): 221–230.
  12. 12. Vishwakarma VP, Pandey S, Gupta MN. Adaptive histogram equalization and logarithm transform with rescaled low frequency DCT coefficients for illumination normalization. International Journal of Recent Trends in Engineering 2009; 1 (1): 318–322.
  13. 13. Lee PH, Wu SW, Hung YP. Illumination compensation using oriented local histogram equalization and its application to face recognition. IEEE Transactions on Image Processing 2012; 21 (9): 4280–4289. pmid:22692906
  14. 14. Xie X, Lam K. An efficient method for face recognition under varying illumination. Proc. IEEE International Symposium on Circuits and Systems 2005; 4: 3841–3844.
  15. 15. Xie X, Lam K. An efficient illumination normalization method for face recognition. Pattern Recognition Letters 2006; 27 (6): 609–617.
  16. 16. Ruiz-del-Solar J, Quinteros J. Illumination compensation and normalization in eigenspace-based face recognition: A comparative study of different pre-processing approaches. Pattern Recognition Letters 2008; 29 (14): 1966–1979.
  17. 17. Wang H, Li SZ, Wang Y. Face recognition under varying lighting conditions using self quotient image. Proc. the 6th IEEE International Conference on Automatic Face and Gesture Recognition 2004; 819–824.
  18. 18. Wang H, Li SZ, Wang Y. Generalized quotient image. Proc. the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition 2004; 2: 498–505. pmid:20448825
  19. 19. Shashua A, Riklin-Raviv T. The quotient image: Class-based re-rendering and recognition with varying illuminations. IEEE Transactions on Pattern Analysis and Machine Intelligence 2001; 23 (2): 129–139. pmid:25210210
  20. 20. Wang Y, Ning X, Yang C, Wang Q. A method of illumination compensation for human face image based on quotient image. Information Sciences 2008; 178 (12): 2705–2721.
  21. 21. Jobson DJ, Rahman Z, Woodell GA. Properties and performance of a center/surround retinex. IEEE Transactions on Image Processing 1997; 6 (3): 451–462. pmid:18282940
  22. 22. Jobson DJ, Rahman Z, Woodell GA. A multiscale retinex for bridging the gap between color images and the human observation of scenes. IEEE Transactions on Image Processing 1997; 6 (7): 965–976. pmid:18282987
  23. 23. Jung C, Sun T, Jiao L. Eye detection under varying illumination using the retinex theory. Neurocomputing 2013; 113: 130–137.
  24. 24. Timo O, Pietikäinen M, Harwood D. A comparative study of texture measures with classification based on featured distributions. Pattern Recognition 1996; 29 (1): 51–59.
  25. 25. Cheng Y, Jin Z, Hao C. Illumination normalization based on local binary pattern image. Proc. The 4th International Conference on Intelligent Human-Machine Systems and Cybernetics (IHMSC) 2012; 1: 94–97.
  26. 26. Bayu BSD, Miura J. Fuzzy-based illumination normalization for face recognition. Proc. the 2013 IEEE Workshop on Advanced Robotics and its Social Impacts (ARSO) 2013; 131–136.
  27. 27. Yang J, Wan L, Qu C. Illumination processing recognition of face images based on improved Retinex algorithm. Journal of Multimedia 2013; 8 (5): 541–547.
  28. 28. Luo Y, Guan YP, Zhang CQ. A robust illumination normalization method based on mean estimation for face recognition. ISRN Machine Vision 2013; 1–10.
  29. 29. Patil NK, Vasudha S, Boregowda LR. A novel method for illumination normalization for performance improvement of face recognition system. Proc. the 2013 International Symposium on Electronic System Design 2013; 148–152.
  30. 30. Ge W, Li GJ, Cheng YQ, Xue C, Zhu M. Face image illumination processing based on improved Retinex. Optics and Precision Engineering 2010; 18 (4): 1011–1020.
  31. 31. Yi JZ, Mao X, Chen LJ, Xue YL, Compare A. Illuminant direction estimation for a single image based on local region complexity analysis and average gray value. Applied Optics 2014; 53 (2): 226–236. pmid:24514054
  32. 32. Yang J, Deng ZP, Guo YK, Li JG. Two new approaches for illuminant direction estimation. Journal of Shanghai Jiaotong University 2002; 36 (6): 894–896.
  33. 33. Chacon M, Aguilar LE, Delgado A. Fuzzy adaptive edge definition based on the complexity of the image. Proc. the 10th IEEE International Conference on Fuzzy Systems 2001; 675–678.
  34. 34. Chacon M, Alma D, Corral S. Image complexity measure: a human criterion free approach. Proc. IEEE Annual Meeting of the North American Fuzzy Information Processing Society 2005; 241–246.
  35. 35. Land EH, McCann JJ. Lightness and retinex theory. Journal of the Optical society of America 1971; 61 (1): 1–11. pmid:5541571
  36. 36. Rahman Z, Jobson DJ, Woodell GA. Retinex processing for automatic image enhancement. Proc. SPIE 4662, Human Vision and Electronic Imaging VII 2002; 390–401.
  37. 37. Rahman Z, Jobson DJ, Woodell GA. Retinex processing for automatic image enhancement. Journal of Electronic Imaging 2004; 13 (1): 100–110.
  38. 38. Zhao XX. Research of video images enhancement system based on retinex theory. D. Sc. Thesis, China University of Mining and Technology. 2011.
  39. 39. Georghiades A, Belhumeur P, Kriegman D. From few to many: Illumination cone models for face recognition under variable lighting and pose. IEEE Transactions on Pattern Analysis and Machine Intelligence 2001; 23 (6): 643–660.
  40. 40. Lee KC, Ho J, Kriegman D. Acquiring linear subspaces for face recognition under variable lighting. IEEE Transactions on Pattern Analysis and Machine Intelligence 2005; 27 (3): 684–698.
  41. 41. Sim T, Baker S, Bsat M. The CMU pose, illumination, and expression database. IEEE Transactions on Pattern Analysis and Machine Intelligence 2003; 25 (12): 1615–1618.
  42. 42. Park S, Shin J, Kim D. Facial expression analysis with facial expression deformation. Proc. the 19th International Conference on Pattern Recognition 2008; 1–4.