Conceived and designed the experiments: MRQ DM AT JV. Performed the experiments: MRQ. Analyzed the data: MRQ DM AT JV. Contributed reagents/materials/analysis tools: MRQ. Wrote the paper: MRQ DM AT JV.
The authors have declared that no competing interests exist.
Evaluating other individuals with respect to personality characteristics plays a crucial role in human relations and is a focus of research in diverse fields such as psychology and interactive computer systems. In psychology, face perception has been recognized as a key component of this evaluation system. Multiple studies suggest that observers use face information to infer personality characteristics. Interactive computer systems are trying to take advantage of these findings and apply them to make interaction more natural and to improve the performance of such systems. Here, we experimentally test whether the automatic prediction of facial trait judgments (e.g. dominance) can be made by using the full appearance information of the face, or whether a reduced representation of its structure is sufficient. We evaluate two separate approaches: a holistic representation model using the facial appearance information and a structural model constructed from the relations among facial salient points. State of the art machine learning methods are applied to a) derive a facial trait judgment model from training data and b) predict a facial trait value for any face. Furthermore, we address the issue of whether there are specific structural relations among facial points that predict perception of facial traits. Experimental results over a set of labeled data (9 different trait evaluations) and classification rules (4 rules) suggest that a) prediction of perception of facial traits is learnable by both holistic and structural approaches; b) the most reliable prediction of facial trait judgments is obtained by certain types of holistic descriptions of the face appearance; and c) for some traits such as attractiveness and extroversion, there are relationships between specific structural features and social perceptions.
There is a long tradition of research, including non-scientific traditions (as in ancient Egypt, China or Greece
Although the accuracy of personality judgments from faces is questionable
In a world characterized by an ever growing amount of interactive artifacts, it is important to develop better human-centric systems that incorporate human communicative behaviors. Natural interaction with machines, one that mimics interactions between humans, is hence an important research goal for computer science that converges with similar interests from other disciplines such as social psychology. The understanding of the social value of objects, including faces, requires the development of engaging interactive systems that act in socially meaningful ways
For instance in
Other personality traits seem to have a more permanent effect on the relations and perceptions in social groups. The perception of dominance has been shown to be an important part of social roles at different stages of life, and to play a role in mate selection. Such perceptions positively correlate with dominant behaviors and relational aggression
If the information on which the evaluation of faces is based could be automatically learned, it could be modeled and used as a tool for designing better interactive systems
The aim of this paper is to study to what extent this information is learnable from the point of view of computer science. Specifically, we formulate the task as a classification problem with the objective of predicting a facial trait judgment. Additionally, a second objective of the study is to find out what information is computationally useful for the prediction task.
To achieve these objectives, we use a machine learning framework and derive a system that captures and interprets facial information in several different ways. Subsequently, via state of the art classification rules, the proposed system learns several trait judgments. Once learned, these models are used to evaluate the system on new, previously unseen examples. That is, the system is able to produce a confidence measure on the most likely trait judgment that could be made by a person, when presented with a new image.
The development of the system consists of two stages: the learning stage, where the models of facial information with respect to the trait judgments are learned from data, and the prediction stage, where trait judgments are produced by the classification rules.
The first stage also attempts to determine which is the best face representation. To this end, we test two approaches: a holistic, appearance-based representation, which encodes all available information about a face, and a structural representation, which encodes exclusively the geometry of the face. The latter approach aims to decrease the amount of information used to describe the face, i.e., the representation is reduced to the relations among a small number of points located either in positions perceived to be perceptually relevant or physically descriptive of the face. In this case, we address the question of the possible relation between components of this structural representation and specific facial trait evaluations. The objective is to establish if there are specific relations and/or points within the face that can be associated with any of the facial trait evaluations.
Regarding the main question of the study, the experiments using a labeled facial data set show that two of the studied traits, dominance and threat, can be predicted well beyond chance (over
Furthermore, comparisons among the techniques used to describe the facial information indicate that the predictability of facial trait evaluation tends to be more reliable when based on a global representation of the face appearance than on the information that can be compounded from the structural approach.
With respect to the relations between facial trait evaluation and the facial structure, the experimental results suggest some interesting relations that could serve as starting points for further studies. For instance, there were specific relations between points in the mouth area and perceptions of extroversion.
The paper is structured as follows. In the next section, we review prior findings. The results and the general findings of the experiments are introduced in the subsequent section. Thereafter, the structural and holistic approaches are evaluated and their performance discussed in relation to the proposed objectives. Finally, the Material and Methods section explains in a more detailed manner the data sets, models and experimental framework.
In
In
In
In this study, our aim is to find whether appearance or structure information of the face is useful for the prediction of facial trait evaluation. We adopt a classification framework to evaluate visual information cues, using standard machine learning algorithms. In contrast to
Many feature extraction techniques can be applied to the pixel values in order to extract discriminant and invariant descriptors (such as Gabor Jets, PCA or HOG). In that context, a holistic approach is one that takes into account the whole appearance and texture of the face
The structural approach uses only the locations of specific fiducial facial points, which are considered to be salient from a perceptual point of view. These landmarks are combined in different ways to form a geometric descriptor of the face.
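As a minimal sketch of this idea (not the paper's actual descriptor; the pairwise distance/orientation features and the normalization by the distance between the first two points are assumptions made for illustration), a structural descriptor can be built from relations among landmark coordinates:

```python
import numpy as np

def geometric_descriptor(points):
    """Build a structural descriptor from 2D facial landmarks.

    `points` is an (n, 2) array of fiducial point coordinates. The
    descriptor concatenates all pairwise Euclidean distances and the
    orientations of the segments joining each pair of points.
    (Hypothetical feature choice; the paper's exact sets may differ.)
    """
    points = np.asarray(points, dtype=float)
    n = len(points)
    dists, angles = [], []
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = points[j] - points[i]
            dists.append(np.hypot(dx, dy))       # pairwise distance
            angles.append(np.arctan2(dy, dx))    # segment orientation
    # Normalize distances for scale invariance, assuming (for this
    # sketch) that points[0] and points[1] are the two eye centers.
    scale = np.hypot(*(points[1] - points[0]))
    return np.concatenate([np.array(dists) / scale, angles])

# Example: 4 toy landmarks -> 2 * C(4,2) = 12 features
pts = [(0, 0), (2, 0), (1, 1), (1, 2)]
desc = geometric_descriptor(pts)
print(desc.shape)  # (12,)
```

Distances are made scale-invariant so that the descriptor reflects face geometry rather than image size.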
Finally, both approaches are validated through a bank of state of the art machine learning classifiers to assess their general performance and the validity of the results.
The problem is tackled from the perspective of a classification task. We use machine learning techniques to evaluate the two proposed hypotheses: first, whether the automatic prediction of facial trait judgments can be performed using a structural or holistic approach, and second, whether there are points in the structure or relations in the geometric descriptor that can be related to any of the analyzed trait judgments (for details on the traits analyzed see the
The results presented in this section were computed as follows. First, we obtained a descriptor for each facial image, using the proposed feature extraction techniques (see
Trait | Attractive | Competent | Trustworthy | Dominant | Mean | Frightening | Extroverted | Threatening | Likable |
| 82.52 (6.5) | 68.81 (8.7) | 75.59 (8.0) | 87.52 (7.1) | 76.15 (6.0) | 76.98 (7.6) | 83.11 (6.4) | 90.86 (4.4) | 72.55 (8.9) |
| 75.45 (5.5) | 72.27 (9.0) | 77.95 (8.9) | 87.09 (6.1) | 82.25 (5.0) | 80.74 (7.6) | 91.42 (5.7) | 87.52 (5.2) | 70.45 (8.7) |
| 63.51 (7.0) | 65.77 (8.1) | 75.05 (8.7) | 74.48 (6.9) | 71.85 (7.1) | 71.85 (8.9) | 64.93 (9.8) | 76.98 (4.3) | 52.27 (9.8) |
| 66.58 (5.8) | 63.81 (7.6) | 70.47 (7.5) | 79.46 (6.3) | 71.15 (7.0) | 67.84 (5.4) | 75.18 (9.6) | 81.69 (6.2) | 62.57 (8.8) |
| 75.59 (9.0) | 62.70 (12.0) | 67.14 (10.2) | 77.79 (8.7) | 67.68 (8.4) | 64.91 (5.8) | 70.47 (9.2) | 71.42 (6.2) | 75.61 (7.7) |
Trait | Attractive | Competent | Trustworthy | Dominant | Mean | Frightening | Extroverted | Threatening | Likable |
| 46.87 (8.1) | 60.23 (10.3) | 57.97 (9.0) | 84.89 (6.7) | 65.16 (7.9) | 75.99 (8.1) | 57.30 (9.4) | 73.24 (8.4) | 50.47 (10.1) |
| 63.54 (9.3) | 69.91 (8.7) | 59.50 (8.0) | 93.22 (5.6) | 74.89 (6.0) | 80.72 (8.0) | 62.57 (10.8) | 82.55 (5.5) | 59.64 (8.3) |
| 57.00 (8.0) | 60.63 (9.3) | 52.84 (6.8) | 89.89 (6.0) | 67.82 (7.7) | 54.21 (9.2) | 50.07 (8.2) | 77.41 (6.4) | 54.77 (10.3) |
| 62.14 (8.2) | 61.17 (8.2) | 56.04 (9.5) | 89.05 (6.0) | 73.24 (8.0) | 66.01 (7.7) | 52.84 (9.0) | 77.27 (5.0) | 47.27 (8.5) |
| 67.41 (6.9) | 63.67 (11.2) | 66.44 (8.2) | 77.79 (7.3) | 66.28 (8.8) | 64.08 (6.4) | 70.05 (9.9) | 83.36 (6.3) | 76.85 (8.6) |
Trait | Attractive | Competent | Trustworthy | Dominant | Mean | Frightening | Extroverted | Threatening | Likable |
| 75.02 (6.0) | 69.05 (8.6) | 79.46 (6.3) | 96.67 (3.0) | 84.89 (5.8) | 75.05 (8.0) | 90.02 (5.0) | 94.46 (3.4) | 70.32 (10.8) |
| 81.13 (6.0) | 81.55 (6.7) | 91.13 (4.4) | 96.67 (3.0) | 88.09 (5.9) | 87.25 (6.3) | 85.59 (6.4) | 97.79 (2.4) | 83.49 (8.6) |
| 66.85 (7.9) | 55.47 (6.3) | 73.92 (8.7) | 84.73 (6.1) | 77.68 (8.6) | 72.27 (8.7) | 72.95 (6.8) | 84.21 (7.3) | 74.21 (7.7) |
| 73.81 (5.7) | 68.54 (6.1) | 78.24 (5.9) | 93.06 (4.3) | 81.28 (6.7) | 78.38 (7.5) | 77.55 (8.3) | 91.82 (5.5) | 76.71 (7.6) |
| 75.07 (7.8) | 66.35 (11.4) | 70.13 (9.0) | 81.68 (7.8) | 70.33 (8.4) | 67.72 (5.9) | 73.77 (9.4) | 81.26 (6.1) | 80.04 (8.0) |
The two variables involved –appearance and structure– were analyzed separately. For the holistic approach the images of the faces were projected on a reference image shape to normalize the structure, thus measuring only appearance (see
The mean accuracy results shown are computed using a
Comparison between the implemented classification rules (vertical lines represent the confidence intervals).
In light of these results, further analysis was done to find out whether the information conveyed by a holistic representation is complementary to the one conveyed by a structural one. In this analysis, we took the labels predicted by the appearance and geometric descriptors and correlated them to test if the same images were labeled in the same way by the classifiers.
The correlation was done over the predicted classes per trait for the SVM classifier.
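In essence, this amounts to correlating the two classifiers' prediction vectors over the same images. A minimal sketch (the binary labels below are illustrative, not experimental data):

```python
import numpy as np

# Hypothetical per-image binary predictions (high/low trait) produced
# by the same SVM from the two descriptors; illustrative values only.
pred_holistic  = np.array([1, 1, 0, 1, 0, 0, 1, 0])
pred_geometric = np.array([1, 0, 0, 1, 0, 1, 1, 0])

# Pearson correlation between the prediction vectors: values near 1
# mean the two descriptors label the same images the same way.
r = np.corrcoef(pred_holistic, pred_geometric)[0, 1]
print(round(r, 3))  # 0.5
```

A high correlation indicates redundant information between the two representations; a low one suggests they capture complementary cues.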
It can be seen that the correlation is high for dominance, suggesting that regardless of the method used, this trait judgment can be accurately predicted.
On the other hand, the low correlation of the EigenFaces-Geometric descriptor pair suggests that there is little relation in the way the information is described by the two methods. The values in
In the case of the HOG-Geometric descriptor pair, the correlation scores are close to
This section presents the experiment that aims to establish if there are specific regions within the face that can be associated with any of the facial trait evaluations. The experiment was performed using the geometric descriptor (see
Results reveal that there is correlation between the geometry of several points and the perception of attractiveness and extroversion. For the first, the area around the eyes shows a clear correlation with the trait judgment; the alignment, size, and distance between the points extracted from the region of the eyes are correlated with that trait judgment. Furthermore, there are relations between the eyes and the lips, and between the eyes and the nose that show correlation to that trait judgment as well.
In the second case, the perception of extroversion is correlated with the mouth area, specifically with the size of the lips. There is also a relation between the mouth and the chin, in terms of spatial distribution, and the judgment of extroversion. These relations are in concordance with the results presented in
Left: Attractive, Right: Extroverted. The size and color of the circles are proportional to the number of times a given point is used in a specific feature of the geometric descriptor. Small dark blue circles represent low correlation.
No other clear relations between the geometric descriptor and other trait judgments were found.
This analysis was then applied to the possible correlations between the geometric descriptor and the labels projected on the first two principal components, Valence and Dominance, of a PCA of all trait judgments. This is based on the results of Oosterhof and Todorov in
In this analysis, relations were found between the upper half of the face, specifically the eye and eyebrow areas, and the first principal component. A weaker relation between the first principal component and the nose and cheekbones was also found.
In general, angles were more correlated with the trait judgments than distances (almost
We studied the problem of determining the prediction capabilities of an automatic system with respect to the task of facial trait judgments. We tackled the question from two perspectives, a holistic and a structural approach, and used machine learning techniques to answer the question on the automatic predictability.
We implemented two different methods for the holistic approach, namely EigenFaces and HOG methods, and one method for the structural approach. The former describes the images in terms of the appearance information, and the latter uses the relations among a few salient points in the image of the face to describe it.
The classification was done using state of the art classifiers. Five algorithms were employed: GentleBoost as an example of an additive method, Support Vector Machine with a Radial Basis Function kernel as an example of a non-linear classifier, K-Nearest Neighbor as an example of a non-parametric classifier, Parzen Windows with Random Subspace, and Binary Decision Trees. The evaluation of the system was performed by using a
The results of the experiment confirm that facial trait evaluation from neutral faces can be computationally learned. More specifically, three traits, “Dominant”, “Threatening”, and “Mean”, can be learned by an automatic system well beyond chance. Furthermore, it was observed that both facial representations are complementary to one another, and that each trait was encoded differently, suggesting that there are representations better suited for specific traits.
Regarding the comparison between the holistic and the structural approaches, the results show that a more consistent and reliable prediction can be obtained when considering the appearance of the face. Nevertheless, this does not necessarily marginalize the prediction capability of the structural approach. As can be seen from
In summary, we experimentally validated the computational prediction capabilities of facial trait judgments. We have shown that all the analyzed trait judgments can be predicted. Furthermore, at least three judgments exhibit prediction accuracy beyond
In this study, we used the behavioral data obtained by Oosterhof and Todorov in
In a first step, the facial trait dimensions were identified in an experiment involving
In a second step, the
A data-driven model for the evaluation of facial trait inferences was built. A Principal Component Analysis resulted in two prevalent orthogonal dimensions accounting for over
In a third step, a synthetic face database was generated using the FaceGen software
Subsequently a new set of dimensions (
Using the synthetic images data set (available under request at
Given that the holistic approach was used to describe appearance, variations in structural information needed to be standardized. To do this, all the faces of the data set were projected onto a reference image shape. This image was chosen to be the closest to the mean face, to balance the amount of deformation the faces would suffer. The projection process was done by means of an affine-based registration and data fitting of the 2D intensity data, using a b-spline grid to control the process. We used the implementation developed by Dr. Dirk-Jan Kroon of the University of Twente, available on the Mathworks file exchange web site.
The appearance variable is isolated by controlling for the structural part of the face, projecting the images of each face onto a reference face image.
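The full b-spline registration is beyond a short example, but the core idea of mapping one face's landmarks onto a reference shape can be sketched with a least-squares affine fit (a simplified stand-in for the registration actually used; the toy point sets are illustrative):

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares 2D affine transform mapping `src` points to `dst`.

    Solves dst ~= [src, 1] @ M for a 3x2 matrix M combining the linear
    part and the translation. Simplified stand-in for the paper's
    b-spline registration.
    """
    src = np.asarray(src, float)
    dst = np.asarray(dst, float)
    ones = np.ones((len(src), 1))
    X = np.hstack([src, ones])              # homogeneous coordinates
    M, *_ = np.linalg.lstsq(X, dst, rcond=None)
    return M  # apply with: np.hstack([pts, ones]) @ M

# Toy check: a pure translation by (+1, +2) is recovered exactly.
src = np.array([[0, 0], [1, 0], [0, 1], [1, 1]], float)
dst = src + np.array([1.0, 2.0])
M = fit_affine(src, dst)
warped = np.hstack([src, np.ones((4, 1))]) @ M
print(np.allclose(warped, dst))  # True
```

Once the shapes are aligned, the remaining pixel differences are attributable to appearance rather than geometry.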
On account of the ranking of each trait (in a range
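One common way of turning such ratings into a binary classification problem is to keep only the clearly high and clearly low rated faces and label them accordingly. A sketch of such a scheme (the quartile thresholds and ratings below are illustrative assumptions; the paper's exact procedure may differ):

```python
import numpy as np

# Hypothetical trait ratings for 10 faces on some numeric scale.
ratings = np.array([2.1, 8.7, 5.0, 1.3, 9.2, 4.8, 7.9, 3.3, 6.1, 5.5])

# Keep only faces rated outside the central quartiles, discarding the
# ambiguous middle; label the high group 1 and the low group 0.
lo, hi = np.percentile(ratings, [25, 75])
keep = (ratings <= lo) | (ratings >= hi)
labels = (ratings[keep] >= hi).astype(int)
print(keep.sum(), labels)
```

Discarding the ambiguous middle yields cleaner class labels at the cost of a smaller sample, which motivates the cross-validation scheme described next.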
Because of the small sample size resulting from the previous procedure, for each classifier of the bank, the error rate was estimated with a N-fold cross-validation scheme. This is a way of splitting a data set where
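The scheme can be sketched as follows; the `nearest_mean` classifier here is a deliberately simple placeholder standing in for the classifiers of the bank:

```python
import numpy as np

def cross_val_accuracy(X, y, classify, n_folds=5, seed=0):
    """Estimate accuracy with N-fold cross-validation.

    The data are shuffled and split into `n_folds` disjoint folds; each
    fold is held out once as the test set while the classifier is fit
    on the remaining folds. `classify(Xtr, ytr, Xte)` returns predicted
    labels for Xte (hypothetical interface for this sketch).
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    folds = np.array_split(idx, n_folds)
    accs = []
    for k in range(n_folds):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        pred = classify(X[train], y[train], X[test])
        accs.append(np.mean(pred == y[test]))
    return np.mean(accs), np.std(accs)

def nearest_mean(Xtr, ytr, Xte):
    # Minimal two-class classifier: assign each test point to the
    # class whose training mean is closest.
    m0 = Xtr[ytr == 0].mean(axis=0)
    m1 = Xtr[ytr == 1].mean(axis=0)
    d0 = np.linalg.norm(Xte - m0, axis=1)
    d1 = np.linalg.norm(Xte - m1, axis=1)
    return (d1 < d0).astype(int)

# Two well-separated Gaussian blobs -> accuracy near 1.0
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.3, (50, 2)), rng.normal(3, 0.3, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
mean_acc, std_acc = cross_val_accuracy(X, y, nearest_mean)
print(mean_acc)
```

Averaging over folds makes the most of a small labeled set, since every example is used for both training and testing (in different folds).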
The results shown for the performance are given with a confidence interval (shown in brackets in
We have used the
Twenty-one predefined point locations
The first
The second set encodes the spatial relations between each salient point
The third set encodes the intra face structural relationships, and consists of
The EigenFaces method
For the experiment, we cropped the images to a size of
The first ten principal components of the dataset.
In order to verify the separability of the dataset with respect to appearance, we projected the data for the two traits that showed the highest prediction capabilities in our experiments. We used the PCA technique to reduce the pixel data to only two dimensions. Using this approach, and for visualization purposes, each facial picture was projected to a 2D feature space using the first two EigenFaces as bases.
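A minimal sketch of this projection (random data stands in for the face images; the array sizes are arbitrary):

```python
import numpy as np

def pca_project(images, n_components=2):
    """Project flattened images onto their first principal components
    (the EigenFaces basis). `images` is an (n_samples, n_pixels)
    array."""
    X = np.asarray(images, float)
    Xc = X - X.mean(axis=0)          # center on the mean face
    # SVD of the centered data: rows of Vt are the EigenFaces.
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T  # 2D coordinates per image

rng = np.random.default_rng(0)
faces = rng.normal(size=(30, 64))    # 30 "images" of 64 pixels each
coords = pca_project(faces)
print(coords.shape)  # (30, 2)
```

Plotting `coords` colored by trait label gives exactly the kind of 2D scatter used to inspect class separability.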
The traits with highest prediction scores in the experiments - (Top) Dominance and (Bottom) Threatening, are shown.
This method was developed for general object recognition, where the appearance and shape are the targets of the characterization, as mentioned in
Our implementation of the HOG method is applied to the entire object, hence obtaining a single descriptor: the image containing the object is divided into a uniform grid of cells and a histogram is computed for each cell. Illumination normalization is performed by grouping cells into blocks to avoid local changes in illumination. These blocks take overlapping cells according to a user-defined parameter, thus replicating the presence of a cell histogram in the final descriptor, but normalized to a different block. We therefore define the HOG approach as holistic because of the way the descriptor is built, using overlapping normalizing blocks all across the object. Notice that neither separate HOG descriptors for separate regions are computed, nor is any geometric relationship among cells or blocks used to build this descriptor (which could be considered a local approach as in
Finally, we extract a concatenation of the histograms computed at the different cells. The current implementation uses an unsigned gradient, that is, the orientation bins are evenly spaced over
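A simplified numpy version of such a descriptor (hard orientation binning and no vote interpolation, so it departs from the full formulation; cell, bin, and block sizes are arbitrary choices for illustration):

```python
import numpy as np

def hog_descriptor(img, cell=8, bins=9, block=2):
    """Minimal histogram-of-oriented-gradients sketch: unsigned
    gradient orientations, a uniform grid of cells, and overlapping
    L2-normalized blocks."""
    img = np.asarray(img, float)
    gy, gx = np.gradient(img)
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0   # unsigned: [0, 180)
    ncy, ncx = img.shape[0] // cell, img.shape[1] // cell
    bin_idx = np.minimum((ang / (180.0 / bins)).astype(int), bins - 1)
    hist = np.zeros((ncy, ncx, bins))
    for i in range(ncy):
        for j in range(ncx):
            sl = np.s_[i*cell:(i+1)*cell, j*cell:(j+1)*cell]
            # magnitude-weighted orientation histogram of the cell
            hist[i, j] = np.bincount(bin_idx[sl].ravel(),
                                     weights=mag[sl].ravel(),
                                     minlength=bins)
    # overlapping blocks of `block` x `block` cells, L2-normalized,
    # concatenated into the final holistic descriptor
    feats = []
    for i in range(ncy - block + 1):
        for j in range(ncx - block + 1):
            b = hist[i:i+block, j:j+block].ravel()
            feats.append(b / (np.linalg.norm(b) + 1e-9))
    return np.concatenate(feats)

img = np.random.default_rng(0).random((32, 32))
d = hog_descriptor(img)
print(d.shape)  # 4x4 cells -> 3x3 blocks of 2*2*9 = (324,)
```

Because each cell appears in several overlapping blocks, its histogram is replicated under different normalizations, which is what gives the descriptor its robustness to local illumination changes.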
The algorithms for the generation of the HOG descriptor and the Geometric descriptor, as well as an implementation of the PCA method, can be downloaded from
The evaluation of the study was done by using a bank of classifiers composed of state of the art methods. Five were selected as examples of the different types of approaches:
GentleBoost
Support Vector Machine
Binary Decision Trees
K-Nearest Neighbor
Parzen Window
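As a self-contained illustration, the two non-parametric members of the bank can be sketched in a few lines of numpy (the actual experiments relied on existing toolbox implementations; the toy data below are illustrative):

```python
import numpy as np

def knn_predict(Xtr, ytr, Xte, k=3):
    """K-Nearest Neighbor: majority vote among the k closest training
    samples (Euclidean distance)."""
    d = np.linalg.norm(Xte[:, None, :] - Xtr[None, :, :], axis=2)
    nn = np.argsort(d, axis=1)[:, :k]
    return (ytr[nn].mean(axis=1) > 0.5).astype(int)

def parzen_predict(Xtr, ytr, Xte, h=1.0):
    """Parzen window: Gaussian-kernel density estimate per class; each
    test point is assigned to the class with higher density."""
    d2 = ((Xte[:, None, :] - Xtr[None, :, :]) ** 2).sum(axis=2)
    K = np.exp(-d2 / (2 * h ** 2))
    dens0 = K[:, ytr == 0].mean(axis=1)
    dens1 = K[:, ytr == 1].mean(axis=1)
    return (dens1 > dens0).astype(int)

# Two separable clusters: both classifiers label the test points
# by their nearest cluster.
rng = np.random.default_rng(2)
Xtr = np.vstack([rng.normal(0, 0.4, (20, 2)), rng.normal(4, 0.4, (20, 2))])
ytr = np.array([0] * 20 + [1] * 20)
Xte = np.array([[0.1, -0.2], [3.9, 4.2]])
print(knn_predict(Xtr, ytr, Xte), parzen_predict(Xtr, ytr, Xte))
```

Running several classifiers of different families over the same descriptors, as done here, helps separate the contribution of the representation from that of any single learning algorithm.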
The implementation of the GentleBoost classifier used is publicly available at Antonio Torralba's web site
Although this is not the main goal of this evaluation paper, we performed a proof-of-concept experiment using a gallery of celebrity images and the FaceGen software. Images of famous public characters (projected on the same synthetic system used in the study) are shown as illustrative examples of the prediction capabilities of the system. The results of the classifiers can usually be interpreted as a continuous confidence value, or degree of support for the classification task, rather than a simple binary label. We use this support to rank the image gallery.
The faces were projected on the same synthetic portraying system used in the study. Images are sorted in increasing rank order from left to right, by Dominance (top row), Threatening (middle row) and Attractiveness (bottom row).
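Given per-image confidence values, the ranking step reduces to a sort. A minimal sketch (the gallery names and scores below are illustrative, not actual outputs):

```python
import numpy as np

# Hypothetical confidence scores (e.g., signed distances to a decision
# boundary) for a small gallery; higher = stronger support for the
# trait being present.
gallery = ["face_a", "face_b", "face_c", "face_d"]
scores = np.array([0.2, -1.1, 1.7, 0.9])

order = np.argsort(scores)            # increasing rank order
ranked = [gallery[i] for i in order]
print(ranked)  # ['face_b', 'face_a', 'face_d', 'face_c']
```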
Scatter plots of the projections of judgments of the nine traits on the first two principal components derived from a PCA of the traits. Each trait projects differently: dominance projects well onto the second principal component, whereas mean and threatening do not. Hence, using the information of each trait allows learning the specific features that make each trait unique.
(PDF)
Inter-rater agreement and reliability of nine social judgments of emotionally neutral faces for the
(PDF)
We would like to thank Dr Loris Nanni for his help on the implementation of the Random Subspace method.