Optimal Geometrical Set for Automated Marker Placement to Virtualized Real-Time Facial Emotions


  • Vasanthan Maruthapillai, 
  • Murugappan Murugappan

Abstract

In recent years, real-time face recognition has been a major topic of interest in developing intelligent human-machine interaction systems. Over the past several decades, researchers have proposed different algorithms for facial expression recognition, but there has been little focus on detection in real-time scenarios. The present work proposes a new algorithmic method of automated marker placement used to classify six facial expressions: happiness, sadness, anger, fear, disgust, and surprise. Emotional facial expressions were captured using a webcam, while the proposed algorithm placed a set of eight virtual markers on each subject’s face. Facial feature extraction methods, including marker distance (distance between each marker to the center of the face) and change in marker distance (change in distance between the original and new marker positions), were used to extract three statistical features (mean, variance, and root mean square) from the real-time video sequence. The initial position of each marker was subjected to the optical flow algorithm for marker tracking with each emotional facial expression. Finally, the extracted statistical features were mapped into corresponding emotional facial expressions using two simple non-linear classifiers, K-nearest neighbor and probabilistic neural network. The results indicate that the proposed automated marker placement algorithm effectively placed eight virtual markers on each subject’s face and gave a maximum mean emotion classification rate of 96.94% using the probabilistic neural network.

Introduction

Non-verbal communication plays an important role in developing intelligent machines that can exhibit better interaction with humans by closely emulating human-human communications. Researchers have increased their focus on developing an intelligent human-machine interface (HMI) system for assisting elderly people that could improve their quality of life [1, 2]. Human body gestures, postures, and facial expressions are used as non-verbal communication mediums to develop HMI systems. Among these modalities, facial expression is the most common due to its cost effectiveness, more reliable detection, and shorter computation time, among other advantages [2–5]. Over the past several decades, researchers have developed intelligent methodologies to effectively recognize human facial expressions that have been implemented in real-time systems for a variety of applications, such as video gaming, machine vision, pain assessment, psychology, behavioral analysis, and clinical diagnosis [6–9]. As a result, recent HMI systems can easily “understand” the expressions of humans and perform different tasks [10–12].

Emotions can be universally categorized into six types: anger, sadness, surprise, fear, happiness, and disgust. Emotions can be assessed using different modalities, such as physiological signals, gestures, speech, and facial expressions [5, 13–15]. Each method of emotion recognition has its own advantages and limitations. Although physiological signals inherently detect human emotions through either central and/or peripheral nervous system activities, issues with higher computational complexity, presence of noise and artifacts in acquired signals, and intrusive electrode placement on the human body limit the development of intelligent real-time systems. Furthermore, most subjects become uncomfortable wearing the electrodes all day long when interacting with systems for any given application. Indeed, most physiological signal-based emotion recognition systems have been developed within a controlled laboratory environment, and very few have been developed in real-time scenarios [16, 17]. Therefore, recent developments in novel image processing algorithms will likely make facial expression detection more reliable and effective for real-time system development over other modalities.

Facial Action Coding System (FACS)

The Facial Action Coding System (FACS) was originally proposed by Ekman and Friesen [18, 19] to identify facial expression of human emotions. There are 46 action units (AUs) and 7000 proposed combinations for facial expression detection in the FACS. Although researchers have used different numbers of AUs for developing facial expression recognition systems in laboratory environments, very few have proposed the detection of facial expressions in real-time [20, 21]. Thus, no standard has been proposed for using either a specific set or combination of AUs to identify facial expression. Ekman and Friesen [18] previously discussed facial muscle activation with different emotions and defined the facial AU system for classification of facial expressions. S1 Table shows the effective changes of AUs in facial muscles for each emotion [22]. In this research, the FACS was therefore used as a guideline to identify the expressions. It served as an investigative tool for studying the movement of the markers as each expression takes place. All eight virtual markers were placed and investigated according to the AUs.

Face and Eye Detection

Face detection is a very important step in facial expression recognition. An efficient automated face detection system should be able to detect a subject’s face in complex scenes with cluttered backgrounds and be able to locate its exact position therein [23]. In face detection, facial features, such as the eyes, nose, and mouth, serve as reference points [24]. By scanning facial images at different resolutions, the scale or size of a face can be analyzed [25]. Several face detection methods have been reported for the recognition of facial expressions [11, 20]. However, AU-based face detection has been used more often in previous studies [4, 18] than other methods, such as the Viola and Jones face detection method [26]. In real-time scenarios, face detection is performed through either image pixels or Haar-like features [27]. The image pixels-based approach requires a longer computation time to detect the face, and the number of pixels varies in proportion to face shape and pigmentation [28]. Haar-like features can be used to compute changes in pixel contrast (white and black) between adjacent rectangular groups instead of using original pixel values (Fig 1). A more detailed description of the Haar-like features method can be found in a previous study [29]. The Haar-like features method efficiently detects any object, including human faces, in a given image sequence using the AdaBoost cascade classifier [26]. The Viola and Jones method is used to detect faces using Haar-like features in a shorter time with less computational complexity [26]. S1 Fig shows the flow chart of the Viola and Jones algorithm for face detection.

Fig 1. Common Haar features: (i) edge feature; (ii) side features; (iii) centre-surrounded feature.

https://doi.org/10.1371/journal.pone.0149003.g001

In the current report, Haar-like features were used to detect the front of each subject’s face and their eyes. The facial image captured by webcam was passed to the OpenCV library in order to detect faces via the Haar cascade classifier. Before being sent to OpenCV, the acquired facial image was subjected to double-precision formatting and converted to grayscale to reduce computational time and memory. Haar-like features in OpenCV were used to detect each subject’s face. Haar-like classifiers can detect a subject’s face in 0.067 s, which is fast relative to other face detection methods [26]. The system then creates an ellipse around the subject’s face and places a “+” mark on both eyes in order to position virtual markers on the subject’s face [30, 31]. Because most human faces are relatively ellipsoidal in shape, we drew the ellipse based on methods discussed previously [30, 31]. S2A Fig shows one subject’s facial image captured by webcam and S2B Fig shows the image after face and eye detection.

In this research, six basic emotions were classified. Haar-like features were used to identify the user’s face and eyes. A total of eight virtual markers were automatically placed on the user’s face at specific locations, which are discussed in the Proposed Method section. This research previously examined sets of ten, eight, and six virtual markers to determine the number of markers that gives the best emotion recognition. Eight virtual markers gave better accuracy and, based on a previous study, eight is the optimal number of markers [32]. Hence, eight virtual markers are discussed in this paper. All the markers are then passed to the optical flow algorithm (OFA) [33, 34] to predict their future positions. The movement of the markers for each emotion is then investigated with the FACS as a guide. The methods and results are discussed in the following sections.

Proposed Method

The present work proposes a new method of automated virtual marker placement on a subject’s face that can be used to detect six basic emotional facial expressions and compares the emotion recognition performance of this new method with manual marker placement. Eight virtual markers were automatically placed at specific locations on each subject’s face, while a webcam captured emotional facial expression sequences. Initially, subjects were requested to place the markers manually on their face based on the given guidelines from the instructor. The guidelines were obtained from the FACS, and the resulting marker positions were then used for developing an algorithm for automated marker placement. The flow of manual and automated marker placement methods for facial emotion detection is given in S3 Fig. Our complete algorithm was implemented in Microsoft Visual Studio with the OpenCV library using the C++ programming language on a desktop computer with an Intel i3 processor, 2 GB RAM, and the Windows 8 operating system. The Haar cascade database in the Open Computer Vision (OpenCV) library was used to detect each subject’s face from video sequences captured via webcam. Initial marker positions (x-y coordinates) were passed through the Lucas–Kanade OFA for predicting future marker positions. The distance of each marker from the center point of each subject’s face defined features for facial expression detection. Extracted features were then mapped to the corresponding emotions using the two nonlinear classifiers K-nearest neighbor (KNN) and probabilistic neural network (PNN).

Manual Marker Placement

Manual marker placement was carried out to detect the mean position (distance between the center of the face and the marker’s location) of each marker on the subject’s face. This position was used to develop the automated marker placement algorithm for facial emotion recognition. In this experiment, subjects were requested to digitally place eight markers on their face in specified locations. The number of markers used for facial expression detection was devised by trial and error. The background was set with a black and white screen and the room light intensity was maintained at 41 lx. All subjects were seated comfortably on a chair placed in front of the computer monitor at a distance of 0.95 m. In total, 10 subjects with a mean (± standard deviation) age of 24 ± 0.84 years were assisted with manually placing the eight markers at defined locations on their facial image using the FACS [18]. Markers were placed by clicking the cursor at each position on the facial image. The system then automatically computed the center of the face [31], calculated each marker’s position, and subsequently saved the newly acquired information. Manually clicking the cursor at each of the eight defined facial positions allowed the system to record the exact x-y coordinates of each spot and insert a virtual marker (pink). Using the Pythagorean theorem, the distance between each marker and the center point of the face was calculated [31]. Each subject underwent three marker placement trials for each emotional facial expression, and the mean marker position distance was calculated with respect to the center of the face. S4 and S5 Figs show the experimental setup and manual marker placement on one subject, respectively, and Fig 2 shows the marker positions recorded by the system. The marker positions calculated with reference to the center of the face via manual marker placement were then used to develop the automatic marker placement algorithm.

Fig 2. Marker placement and each marker's position: (left) coordinate and distance calculation; (right) all markers placed at their specific positions.

https://doi.org/10.1371/journal.pone.0149003.g002

Automatic Marker Placement

The eight virtual markers positioned according to the previous section were then used for automated facial expression detection, which is extremely convenient, computationally efficient (less computational time and memory), and works with the OFA for tracking markers. Liu et al. proposed a geometric facial model that created a rectangle around the subject’s face using eye positions [35]. The distance between the eyes is used to identify the center point of the face, followed by the mouth; this geometric model is shown in S6A Fig. Thus, the geometric facial model proposed by Liu et al. [35] was used to identify the center of each subject’s face by detecting the eyes. Herein, a total of eight markers were placed on the upper and lower face with reference to the center of the subject’s face; four markers each were placed on the upper face (two each on the left and right eyebrows) and lower face (one each above, below, to the left, and to the right of the mouth; S6B Fig). These marker positions were used for classifying the subject’s emotional facial expressions.

S6B Fig shows the placement of eight virtual markers (black) on a subject’s face from the center point. Initially, Haar-like features were used to detect the face and eyes of the subject. The system places a center marker after computing the distance between the two eyes. According to Liu et al. [35], the center point is located a quarter of the distance (in cm) from the eye to the mouth if the facial geometry is rectangular. In the current study, the mean marker position calculated after manual marker placement across 10 subjects was used for computing the center point of the subject’s face in the automated marker placement algorithm. Hence, half the distance (in cm) from the eye to the mouth was used to position the center point for all subjects in the study. An ellipse was then created around each subject’s face with reference to the center point. The radius of the ellipse was taken from the facial features previously detected by Haar cascade classifiers.

In general, facial shapes are not constant at all times; each face has its own shape (e.g., ellipsoidal, circular, etc.) and radius [36]. Thus, our new method uses ratios to calculate the radius of the ellipse. Initially, a vertical line was drawn from the center point to the intersection point between the ellipse and x-axis of the center point of the face (S6B Fig). Placement of the eight virtual markers was automatically done at a certain angle and distance from the center marker. The angle and distance (position) of each marker were computed from manual marker placement. The first and second markers of the upper face (p_e1 and p_e2) were placed at a 45° angle from the x-axis on the left and right sides of the face, respectively. Later, the radius of the ellipse at the 45° angle was calculated. From manual marker placement, it was determined that a mean marker distance ratio of 6.5:9 with the ellipse radius at a 45° angle provided the best positions for markers (p_e1 and p_e2) on the upper face. The second set of markers (p_e3 and p_e4) on the upper face was placed at a 65° angle from the x-axis on the left and right sides, respectively. The mean distance ratio of the radius of the ellipse at an angle of 65° was found to be 5:9. The method used for upper face marker placement was then applied to the lower face. A complete description of the ratio calculation and placement of markers on the upper and lower face of each subject is given in Fig 3. Lower face markers were placed to the left, to the right, above, and below the subject’s mouth. Points p_m1 and p_m2 were placed at a 130° angle from the x-axis and had a mean distance ratio of 11:15; points p_m3 and p_m4 were fixed on the y-axis with mean distance ratios of 3:9 and 7:9, respectively. The placement of markers after the automated marker placement algorithm is shown in Fig 4.
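The angle-and-ratio placement rule can be illustrated with a short sketch. This is not the authors' C++ implementation. It assumes an axis-aligned ellipse with semi-axes `a` (horizontal) and `b` (vertical), image coordinates with y increasing downward, and angles measured from the horizontal axis through the center point toward the marker's side of the face. The function names, the example center, and the example semi-axes are hypothetical. The 130° mouth markers are left out because the text does not state the reference direction for that angle.

```python
import math


def ellipse_radius(a, b, angle_deg):
    """Distance from the ellipse center to its boundary along a ray
    at angle_deg from the horizontal axis (axis-aligned ellipse)."""
    t = math.radians(angle_deg)
    return (a * b) / math.hypot(b * math.cos(t), a * math.sin(t))


def place_marker(center, a, b, angle_deg, ratio, side=1, above=True):
    """Place a marker at `ratio` times the ellipse radius along the ray.

    side:  +1 for the right of the center point, -1 for the left.
    above: True for upper-face markers (smaller y in image coordinates).
    """
    xc, yc = center
    t = math.radians(angle_deg)
    d = ratio * ellipse_radius(a, b, angle_deg)
    dx = side * d * math.cos(t)
    dy = d * math.sin(t)
    return (xc + dx, yc - dy if above else yc + dy)


# Hypothetical face: center point and ellipse semi-axes in pixels.
center, a, b = (320.0, 240.0), 90.0, 120.0

markers = {
    # Upper face: 45 degrees with ratio 6.5:9, 65 degrees with ratio 5:9.
    "p_e1": place_marker(center, a, b, 45.0, 6.5 / 9, side=-1),
    "p_e2": place_marker(center, a, b, 45.0, 6.5 / 9, side=+1),
    "p_e3": place_marker(center, a, b, 65.0, 5.0 / 9, side=-1),
    "p_e4": place_marker(center, a, b, 65.0, 5.0 / 9, side=+1),
    # On the y-axis, assumed here to lie below the center point.
    "p_m3": place_marker(center, a, b, 90.0, 3.0 / 9, above=False),
    "p_m4": place_marker(center, a, b, 90.0, 7.0 / 9, above=False),
}

for name, (x, y) in markers.items():
    print(f"{name}: ({x:.1f}, {y:.1f})")
```

Scaling by the ellipse radius along each ray, rather than by a fixed pixel distance, is what lets the same ratios apply to faces of different sizes and proportions.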

Fig 3. Marker Placement; The positions of upper face and lower face markers from center point with the respective angles and distance.

https://doi.org/10.1371/journal.pone.0149003.g003

Fig 4. Automatic marker positions on the subject’s face: (a) user, (b) face and eye detection, (c) automated marker placement, (d) geometrical model of automated marker.

https://doi.org/10.1371/journal.pone.0149003.g004

The distance between the center point and each marker is referred to as a distance feature and considered an important feature of facial expression classification. A total of nine features were used: p_e1, p_e2, p_e3, p_e4, p_m1, p_m2, p_m3, p_m4, and p_m5. Eight features were the distance of each marker from the center point, while the ninth (p_m5) was the distance between the points to the left (p_m1) and right (p_m2) of the mouth (S7 Fig). In the current study, distance feature m1 was calculated using the Pythagorean theorem [31]. Every marker was assigned its own x-y coordinates [e.g., center point, (xc, yc); p_m1, (xm1, ym1)].

In Fig 3, in the left mouth column, line m1 is the hypotenuse of a right triangle, wherein the line parallel to the x-axis is dx [the difference between x-coordinates of the center point (xc) and p_m1 (xm1)]; and the line parallel to the y-axis is dy [the difference between y-coordinates of the center point (yc) and p_m1 (ym1)]. By the Pythagorean theorem, these quantities are related as in Eq (1):

$$m_1^2 = d_x^2 + d_y^2, \qquad d_x = x_c - x_{m1}, \qquad d_y = y_c - y_{m1} \tag{1}$$

Therefore, the formula for feature m1 computation is given as in Eq (2):

$$m_1 = \sqrt{(x_c - x_{m1})^2 + (y_c - y_{m1})^2} \tag{2}$$

In a similar fashion, the distance of each marker from the center point was calculated using Eq 2. The coordinates of each marker were calculated using trigonometry formulas. The (x, y) position of each marker was found after calculating the feature distances at specific angles. Markers p_m3 and p_m4 were placed on the y-axis. Thus, their x-coordinates were the same as that of the center point, and their y-coordinates were found from the ratio of the y-axes. The coordinates of each feature were subjected to the OFA for future coordinate prediction. Initial coordinate values of each marker were replaced with future coordinate values for each marker, and the new distance from the center point was evaluated during facial expression. S2 Table presents the changes in distance of each marker for different emotions. Distance features e1, e2, e3, e4, m1, m2, m3, m4, and m5 indicate the initial position of the marker before facial expression, while features e1', e2', e3', e4', m1', m2', m3', m4', and m5' show the new position of each marker after facial expression.
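The nine distance features and their change between the neutral and expressive frames can be sketched as follows. This is a minimal illustration with hypothetical names and coordinates. It assumes that the change in marker distance is the difference between the new and the initial distance feature (for example, m1' minus m1), which is one reading of the description; the tracked coordinates would come from the OFA and are simply given as input here.

```python
import math

# Markers whose distance to the center point is used as a feature.
CENTER_MARKERS = ("e1", "e2", "e3", "e4", "m1", "m2", "m3", "m4")


def distance(p, q):
    """Euclidean distance between two (x, y) points (Pythagorean theorem)."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def distance_features(center, markers):
    """Return the nine distance features for one frame.

    `markers` maps names such as "p_e1" to (x, y) coordinates.
    """
    feats = {n: distance(center, markers["p_" + n]) for n in CENTER_MARKERS}
    # Ninth feature: distance between the left and right mouth markers.
    feats["m5"] = distance(markers["p_m1"], markers["p_m2"])
    return feats


def change_in_distance(initial, new):
    """Assumed definition: new feature value minus initial feature value."""
    return {name: new[name] - initial[name] for name in initial}


# Hypothetical coordinates for a neutral frame (pixels).
center = (320.0, 240.0)
neutral = {
    "p_e1": (268.0, 188.0), "p_e2": (372.0, 188.0),
    "p_e3": (296.0, 185.0), "p_e4": (344.0, 185.0),
    "p_m1": (280.0, 300.0), "p_m2": (360.0, 300.0),
    "p_m3": (320.0, 280.0), "p_m4": (320.0, 333.0),
}
# Same markers after tracking; here the mouth corners have moved apart.
expressive = dict(neutral, p_m1=(272.0, 296.0), p_m2=(368.0, 296.0))

before = distance_features(center, neutral)
after = distance_features(center, expressive)
print(change_in_distance(before, after))
```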

Results and Discussion

Facial expression recognition has been considered a major research topic over the past several decades for developing intelligent systems [16, 17, 20, 37]. Most early work focused on AUs, and little attention was paid to manual and virtual marker-based facial expression detection. AU-based facial expression detection is computationally intensive due to the large number and combination of AUs. On the other hand, manual marker placement is highly intrusive because subjects must wear the markers (stickers) at all times. In contrast, this work demonstrates automated placement of a smaller set of eight virtual markers. Most previous research has focused on recognizing six basic emotions (happiness, sadness, anger, fear, disgust, and surprise) through different numbers of AUs [20, 21, 38–42], and the same six emotions were taken into consideration here.

Data Collection for Emotion Recognition

The performance and reliability of emotion recognition systems are mainly based on data samples that are used to train the system. Participants provided written informed consent to participate in this study. The individual in this manuscript has given written informed consent to publish these case details. All research involving human participants has been approved by the Human Research Ethics Committee of Universiti Malaysia Perlis (HREC-UniMAP) and written consent has been obtained from the participants. In the present study, a total of 30 subjects (14 male, 16 female) with a mean (± standard deviation) age of 22.73 ± 1.68 years were used to collect data on the six different emotional facial expressions in a video sequence. The subjects included in this study were of mixed ethnicities and religions (Hindu, Muslim, and Chinese) in Malaysia. All of the subjects were asked to express specific emotions following an instructor command in a controlled environment (lighting intensity: 41 lx; room temperature: 26°C; distance between the subject and camera: 0.95 m). The lighting intensity and distance between the subject and camera for this experiment were selected based on studies on different light intensities (3.00, 23.00, and 41.00 lx) and distances (near, 0.75 m; middle, 0.95 m; long, 1.15 m). Each emotional facial expression lasted 6 s and each expression was performed twice by each subject (two trials). Data collection was performed in a laboratory environment with two different backgrounds (one completely black in color, and another with a wall poster display). The total time required for completing the six emotional facial expressions (including instructions and baseline state) by one subject was 20 min. All subjects were healthy university students without any previous history of muscular, cognitive, or emotional disorders. Different emotional facial expressions for one subject with the eight virtual markers are shown in Fig 5, and different subjects with the eight virtual markers are shown in S8 Fig.

Fig 5. One subject’s emotional expressions with virtual markers: (a) anger, (b) disgust, (c) fear, (d) sadness, (e) happiness, (f) surprise.

https://doi.org/10.1371/journal.pone.0149003.g005

Initially, manual marker placement on 10 subjects (triplicate) for each emotion was analyzed to identify the marker position (distance between the center of the face and each marker) on each subject’s face. Each subject was asked to place eight markers at defined facial locations based on the FACS. The mean value of each marker position with reference to the centre point and its position angle over 10 subjects were used to develop the automated marker placement algorithm, and the results are shown in Table 1. Different sets of subjects were used for automated and manual marker placement. All the subjects were treated as unknown. Because subjects from different ethnic groups were tested, the subjects gave different types of expression for each emotion. This issue was overcome by choosing eligible subjects and explaining the task clearly to each subject.

Evaluate Position of Automated Marker Placement

Table 1 shows the marker distance ratios and angles of three trials of six emotional expressions for 10 subjects using manual marker placement. The angles of deviation of the left eye_1 and right eye_1 markers were approximately 45°, those of the left eye_2 and right eye_2 markers were 65°, those of the left mouth and right mouth markers were 130°, and those of the upper and lower mouth markers were 90°. Similarly, marker distance ratios from the center point to the left eye, right eye, mouth, and above and below the mouth were approximately 0.72, 0.55, 0.73, and 0.33, respectively. These mean marker angles and distance ratios were used for marker positioning using the automated marker placement algorithm that was later used to evaluate marker positioning on the same 10 subjects who underwent manual marker placement. S3 Table shows the differences (error) in marker distance ratios and position angles between manual and automated marker placement algorithms. In some cases, the marker angle over eight markers gave an error value <0.05° and a distance ratio <0.2 between automated and manual marker placement methods (S3 Table). This indicates that automated marker placement successfully located markers on subjects’ faces and effectively recognized emotional facial expressions. Hence, marker distance ratios and angles reported in Table 1 were used to develop and test our proposed emotional facial expression recognition system with a greater number of subjects.

Evaluation with Classifier

Next, a new set of 30 subjects was recruited to test the six emotional facial expressions using the automated marker placement algorithm to develop a facial emotion recognition system. The same experimental setup described in manual marker placement was used to develop the facial expression recognition system. The proposed marker placement algorithm placed markers on each subject's face using distance ratios and position angles shown in Table 1. Each marker transferred its original position (x, y) to the OFA to trace future marker movement and direction. If any facial expression occurred immediately after automated marker placement, the new position of each marker for each emotion was saved by the system. The data were collected in real-time with automated marker placement, and facial expressions were recorded for every subject.

Most existing facial expression recognition systems in the literature perform offline analysis rather than real-time system development [37, 38]. In the current study, three simple statistical features [mean, root mean square (RMS), and variance] of marker distance (the distance between each marker and the center of the face) and of changes in marker distance (between the original, neutral marker position and the new marker position during emotional facial expression) were extracted and normalized using binary and bipolar normalization methods [43]. Finally, these normalized features were fed into two nonlinear classifiers (KNN [44] and PNN [45]) to classify the emotional facial expressions. A 10-fold cross-validation was used to segregate the training and testing data for facial expression classification [20]. In the KNN classifier, the value of K was varied from 2 to 10, and K = 5 provided a higher mean emotion recognition rate; Euclidean distance was used as the distance measure. Therefore, only the emotion recognition rate with K = 5 is reported in the present study. In the PNN, the spread value (σ) was varied from 0.01 to 0.1.
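The two classifiers can be sketched in a few lines to show how K and σ enter the decision. This is an illustrative pure-Python version, not the implementation used in the study. The PNN is written in its common Parzen-window form with a Gaussian kernel, which is an assumption about the variant used; the function names and the toy training data are hypothetical.

```python
import math
from collections import Counter


def squared_distance(u, v):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def knn_predict(train_x, train_y, x, k=5):
    """Majority vote among the k training samples nearest to x.

    Ties are resolved in favor of the label seen first among the neighbors.
    """
    ranked = sorted(zip(train_x, train_y),
                    key=lambda s: squared_distance(s[0], x))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def pnn_predict(train_x, train_y, x, sigma=0.05):
    """Pick the class with the largest mean Gaussian kernel response.

    sigma is the spread value; smaller values make the decision more local.
    """
    sums = {}
    counts = Counter(train_y)
    for xi, yi in zip(train_x, train_y):
        k = math.exp(-squared_distance(xi, x) / (2.0 * sigma ** 2))
        sums[yi] = sums.get(yi, 0.0) + k
    return max(sums, key=lambda c: sums[c] / counts[c])


# Toy normalized feature vectors for two emotions (hypothetical values).
train_x = [[0.10, 0.10], [0.12, 0.09], [0.11, 0.12],
           [0.90, 0.90], [0.88, 0.91], [0.92, 0.89]]
train_y = ["happiness"] * 3 + ["sadness"] * 3

print(knn_predict(train_x, train_y, [0.10, 0.11], k=3))
print(pnn_predict(train_x, train_y, [0.90, 0.90], sigma=0.05))
```

With very small σ and a test point far from every training sample, all kernel responses can underflow to zero, so normalizing the features to a fixed range before classification matters for the PNN in particular.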

Marker distance (the distance between each marker to the center of the face) (MD).

This feature was extracted from the distance of each marker from the centre point. Table 2 and S4 Table show the KNN and PNN classifier results for MD features, respectively.

Table 2. Facial emotional expression recognition rate (in %) based on marker distance (MD) using KNN.

https://doi.org/10.1371/journal.pone.0149003.t002

Changes in marker distance (small change of distance between the original and new marker positions) (CMD).

This feature captures the small changes in each marker’s position when a facial expression takes place. The system calculates the distance moved by the marker, and the recognition rate is obtained by classifying these changes. S5 Table and Table 3 show the KNN and PNN classifier results for CMD features, respectively.

Table 3. Facial emotional expression recognition rate (in %) based on changes in marker distance (CMD) using PNN.

https://doi.org/10.1371/journal.pone.0149003.t003
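The three statistical features computed from a sequence of marker distances (or changes in marker distance) can be sketched as below. This is a minimal illustration: the function name and sample values are hypothetical, and the population form of the variance (division by N) is an assumption, since the text does not say which form was used.

```python
import math


def statistical_features(values):
    """Return the mean, variance, and root mean square of a sequence."""
    n = len(values)
    mean = sum(values) / n
    # Population variance (divide by n); an assumed convention.
    variance = sum((v - mean) ** 2 for v in values) / n
    rms = math.sqrt(sum(v * v for v in values) / n)
    return {"mean": mean, "variance": variance, "rms": rms}


# Hypothetical distances (pixels) of one marker over consecutive frames.
m1_over_time = [72.1, 72.4, 73.0, 74.8, 76.2, 76.5]
print(statistical_features(m1_over_time))
```

With this convention the three features are linked by rms² = mean² + variance, so the RMS combines the level of a marker distance with its spread over the sequence.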

Based on current experimental results, the RMS feature of the changes in marker distance gave a slightly higher mean accuracy rate (96.94%) than the marker distance (96.81%) using the PNN. This result is likely due to the fact that changes in mean marker distance (i) effectively reflect the effect of different emotional expressions on selective markers compared to all markers used for analysis of marker distance, and (ii) measure subtle changes in marker position with each emotional expression. Researchers have previously analyzed marker distance for facial expression recognition and achieved a maximum mean classification rate of 94% using 19 facial features and the Random Forests classifier [46].

Herein, simple nonlinear classifiers KNN and PNN were used to classify emotional facial expressions. Statistical features extracted from two different methods of feature extraction and normalization (bipolar and binary) were used to map corresponding emotional expressions using the KNN and PNN. Bipolar normalization gave a slightly higher mean emotional expression classification rate (96.94%) compared with binary normalization (96.81%) and unnormalized data (93.83%) in the PNN. In the case of the KNN, binary normalization gave a higher mean emotional expression recognition rate (92.36%) than bipolar normalization (90.00%) and unnormalized data (91.39%).
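The two normalization schemes can be illustrated as follows. The text cites [43] without giving formulas, so this sketch assumes the usual meaning of the terms: binary normalization as min-max scaling to [0, 1] and bipolar normalization as min-max scaling to [-1, 1]. The function names and sample values are hypothetical.

```python
def binary_normalize(values):
    """Min-max scale a feature to the range [0, 1] (assumed definition)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no information; map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def bipolar_normalize(values):
    """Min-max scale a feature to the range [-1, 1] (assumed definition)."""
    return [2.0 * u - 1.0 for u in binary_normalize(values)]


# Hypothetical RMS values of one distance feature across samples.
rms_values = [72.0, 75.0, 78.0, 84.0]
print(binary_normalize(rms_values))
print(bipolar_normalize(rms_values))
```

Both schemes preserve the ordering and relative spacing of the samples; they differ only in the output range, which changes the distances seen by the KNN and the effective spread seen by the PNN.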

The current experimental results indicate that the RMS gives a higher facial expression recognition rate compared to other statistical features (mean and variance). For the marker distance and changes in marker distance methods, the RMS gave maximum mean facial expression recognition rates of 96.81% and 96.94%, respectively; the mean feature performed better than the variance but worse than the RMS in emotional facial expression classification. In both methods of feature extraction, variance provided a much lower mean emotional facial expression recognition accuracy (81.81%). However, the variance feature performed better with the KNN classifier than with the PNN on emotional facial expression recognition. Researchers have used these statistical features for emotional facial expression recognition [44, 46, 47] and shown that variance provided a lower mean facial expression recognition rate with raw data than other statistical features (i.e., mean and RMS) [46].

Comparison

Table 4 shows a comparison of emotional facial expression classification from the present work with previous reports [20, 32, 37–40, 46, 48–51]. Most previous studies have used a greater number of facial features or AUs to classify emotional facial expressions. A maximum mean classification rate of 96.00% was achieved by classifying six emotional facial expressions using 26 facial features and the Random Forests classifier. The CK database [52] and geometric facial features were commonly used in earlier studies, and facial expression analysis was performed offline. The highest (122) and lowest (12) number of facial features used previously for facial expression classification achieved maximum mean accuracies of 88.83% and 85%, respectively. In contrast, the present study proposed a new method of automated marker placement on subjects’ faces to classify six facial expressions and achieved a maximum mean classification rate of 96.94% using the RMS and PNN.

Table 4. Comparison of facial emotional expression classification (%) of this present work with earlier works.

https://doi.org/10.1371/journal.pone.0149003.t004

Most previous research has utilized virtual markers to analyze the movement of facial muscles during emotion recognition tasks [20, 21, 38–42]. Recently, virtual marker-based facial emotion recognition has become popular in addition to AUs [4]. Different sets of markers, from 12 to 62 [39], have been used to detect facial expressions in laboratory and real-time environments. In these works, virtual markers were placed on the subject’s face manually, and no automated marker placement procedure was reported. In contrast to AUs, virtual markers are highly flexible for investigating marker movement when facial expressions take place, convenient (no physical stickers/labels are worn on the face), and therefore more suitable for real-time facial expression detection [38]. However, virtual marker-based face detection is affected by poor lighting (light intensities <30 lx) and camera pixel resolution. Virtual markers used in real-time applications are more stable if the lighting intensity is >30 lx and the minimum camera pixel resolution is 640 × 480 [38]. Besides facial features, geometric feature-based emotion recognition is more popular in human expression detection [53].

In contrast to earlier studies, the present work requires fewer facial features and uses simple feature extraction and classification methods to classify emotional facial expressions in real time with a higher recognition rate. Most of the earlier studies adopted manual marker placement and facial AU methods for identifying different facial expressions, and very few studies have addressed real-time facial expression detection. Our proposed automated marker placement algorithm works effectively in real-time scenarios with less computational complexity (memory and computation time) than previously reported methods. However, it also has the following limitations: (i) Emotional facial expression recognition was performed with a limited number of subjects, so the recognition rate might differ if our method were tested with a larger number of unseen subjects. (ii) The proposed algorithm should be tested with other international databases for standardization purposes. In the future, we hope to analyze different types of statistical features that more efficiently reflect subtle changes in marker position with each emotion to improve the mean classification accuracy. In addition, we intend to implement more intelligent statistical learning algorithms, such as SVM and ANN, to enhance the mean emotional facial expression classification rate.
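
As a point of reference for the feature extraction discussed here, the following minimal Python sketch computes the two marker signals used in this work (marker distance and change in marker distance) and the three statistical features (mean, variance, and root mean square). It is an illustration under stated assumptions, not the authors' implementation: the function names, the array layout (frames × markers × 2 pixel coordinates), the use of the population variance, and the reading of "change in marker distance" as the displacement of each marker from its first-frame position are all assumptions.

```python
import numpy as np


def marker_distances(markers, center):
    """Euclidean distance from each marker to the face center, per frame.

    markers: array of shape (n_frames, n_markers, 2) with (x, y) positions.
    center:  array of shape (2,) or (n_frames, 2) with the face center.
    Returns an array of shape (n_frames, n_markers).
    """
    markers = np.asarray(markers, dtype=float)
    center = np.asarray(center, dtype=float)
    if center.ndim == 1:
        center = center[None, :]  # reuse the same center for every frame
    return np.linalg.norm(markers - center[:, None, :], axis=-1)


def change_in_marker_distance(markers):
    """Displacement of each marker from its position in the first frame.

    Assumption: the first frame holds the original (neutral) marker positions.
    """
    markers = np.asarray(markers, dtype=float)
    return np.linalg.norm(markers - markers[0], axis=-1)


def statistical_features(signal):
    """Mean, population variance, and RMS over frames for each marker."""
    signal = np.asarray(signal, dtype=float)
    return {
        "mean": signal.mean(axis=0),
        "variance": signal.var(axis=0),
        "rms": np.sqrt(np.mean(signal ** 2, axis=0)),
    }
```

With eight markers, each feature is a vector of length eight per video sequence, which can then be passed to a classifier such as KNN or PNN.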

Conclusion

Facial expression recognition is an intense research topic with several applications. This paper presents a novel automated marker placement algorithm for emotional facial expression classification using marker distance ratios and angles. A minimal set of eight virtual markers was placed automatically at particular locations on each subject’s face to identify their emotional facial expressions. The OFA was used to track the positions of the markers during emotional expression. A simple set of statistical features was used to classify the six emotions tested (happiness, sadness, anger, fear, surprise, and disgust) using two nonlinear classifiers (KNN and PNN). The proposed automated marker placement method gives a maximum mean emotional facial expression recognition rate of 96.94%, which is higher than the rates reported in the earlier studies compared here, with a computational time of approximately 0.3 seconds. Our proposed automated marker placement algorithm will be highly useful for developing intelligent diagnostics for clinical investigation and analysis of emotional behaviors in facial muscle disorder patients and other clinical situations. Furthermore, it may also prove to be very useful for developing intelligent automated systems that can assist the elderly with HMI devices that enable communication with their surroundings.
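
To make the classification step concrete, the sketch below shows a generic probabilistic neural network (PNN) in Python with NumPy: each class score is the average Gaussian kernel response of the query to that class's training vectors, and the class with the highest score is returned. This is a textbook-style illustration, not the configuration used in this study; the class name, the smoothing parameter `sigma`, and the toy feature vectors are hypothetical.

```python
import numpy as np


class ProbabilisticNeuralNetwork:
    """Minimal PNN: Gaussian Parzen-window density estimate per class."""

    def __init__(self, sigma=1.0):
        self.sigma = sigma  # kernel width (hypothetical default, not from the paper)

    def fit(self, X, y):
        # A PNN stores the training patterns; there is no iterative training.
        self.X_ = np.asarray(X, dtype=float)
        self.y_ = np.asarray(y)
        self.classes_ = np.unique(self.y_)
        return self

    def predict(self, X):
        X = np.atleast_2d(np.asarray(X, dtype=float))
        # Pattern layer: squared distance from each query to each training vector.
        d2 = ((X[:, None, :] - self.X_[None, :, :]) ** 2).sum(axis=-1)
        kernel = np.exp(-d2 / (2.0 * self.sigma ** 2))
        # Summation layer: average kernel response within each class.
        scores = np.stack(
            [kernel[:, self.y_ == c].mean(axis=1) for c in self.classes_],
            axis=1,
        )
        # Decision layer: pick the class with the largest estimated density.
        return self.classes_[scores.argmax(axis=1)]


# Toy usage with made-up two-dimensional feature vectors.
X_train = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
y_train = ["happiness", "happiness", "surprise", "surprise"]
pnn = ProbabilisticNeuralNetwork(sigma=1.0).fit(X_train, y_train)
predicted = pnn.predict([[0.2, 0.3], [5.1, 5.4]])
```

In practice the inputs would be the per-marker statistical features rather than toy points, and `sigma` would need to be tuned on held-out data.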

Supporting Information

S1 Fig. Flow chart.

Viola-Jones algorithm flow chart for face detection.

https://doi.org/10.1371/journal.pone.0149003.s001

(DOCX)

S2 Fig. Initial position of image capture.

(A) Webcam image. (B) Face and eye detection using Haar cascade classifiers.

https://doi.org/10.1371/journal.pone.0149003.s002

(DOCX)

S3 Fig. Flow chart of the study.

(A) Flowchart of the manual marker placement. (B) Flowchart of the automatic marker placement.

https://doi.org/10.1371/journal.pone.0149003.s003

(DOCX)

S4 Fig. Manual marker placement.

A total of eight markers are placed manually.

https://doi.org/10.1371/journal.pone.0149003.s004

(DOCX)

S5 Fig. Experimental setup.

A setup for data collection in manual marker placement.

https://doi.org/10.1371/journal.pone.0149003.s005

(DOCX)

S6 Fig.

Geometrical model implementation. (A) Geometrical model of the face by Liu et al. (B) Placement of markers based on the geometrical model.

https://doi.org/10.1371/journal.pone.0149003.s006

(DOCX)

S7 Fig. The positions of markers and features.

Positions of all eight markers with their specific names.

https://doi.org/10.1371/journal.pone.0149003.s007

(DOCX)

S8 Fig. Different subject emotional expressions with virtual markers.

(A) anger, (B) disgust, (C) fear, (D) sadness, (E) happiness, (F) surprise.

https://doi.org/10.1371/journal.pone.0149003.s008

(DOCX)

S1 Table. Action Units studied by Ekman and Friesen.

https://doi.org/10.1371/journal.pone.0149003.s009

(DOCX)

S2 Table. Changes in marker distance for each emotion.

https://doi.org/10.1371/journal.pone.0149003.s010

(DOCX)

S3 Table. Error between manual and automated marker placement.

https://doi.org/10.1371/journal.pone.0149003.s011

(DOCX)

S4 Table. Facial emotional expression recognition rate (in %) based on marker distance (MD) using PNN.

https://doi.org/10.1371/journal.pone.0149003.s012

(DOCX)

S5 Table. Facial emotional expression recognition rate (in %) based on changes in marker distance (CMD) using KNN (K = 5).

https://doi.org/10.1371/journal.pone.0149003.s013

(DOCX)

Author Contributions

Conceived and designed the experiments: VM MM. Performed the experiments: VM. Analyzed the data: VM MM. Contributed reagents/materials/analysis tools: MM. Wrote the paper: VM.

References

  1. Casas R, Blasco Marín R, Robinet A, Delgado A, Yarza A, McGinn J, et al. User Modelling in Ambient Intelligence for Elderly and Disabled People. In: Miesenberger K, Klaus J, Zagler W, Karshmer A, editors. Computers Helping People with Special Needs. Lecture Notes in Computer Science. 5105: Springer Berlin Heidelberg; 2008. p. 114–22.
  2. Tablado A, Illarramendi A, Bermúdez J, Goñi A. Intelligent Monitoring Of Elderly People. 4th Annual IEEE Conf on Information Technology Applications in Biomedicine; United Kingdom; 2003. p. 78–81.
  3. Rani P, Liu C, Sarkar N, Vanman E. An Empirical Study Of Machine Learning Techniques For Affect Recognition In Human–Robot Interaction. Pattern Analysis & Applications. 2006;9:58–69.
  4. Zhang L, Tong Y, Ji Q. Active Image Labeling and Its Application to Facial Action Labeling. In: Forsyth D, Torr P, Zisserman A, editors. Computer Vision–ECCV 2008. Lecture Notes in Computer Science. 5303: Springer Berlin Heidelberg; 2008. p. 706–19.
  5. Maaoui C, Pruski A. Emotion Recognition Through Physiological Signals For Human-Machine Communication. Cutting Edge Robotics 2010, Vedran Kordic (Ed). 2010.
  6. Cantos S, Miranda J, Tiu M, Yeung M. Marker-Less Gesture and Facial Expression Based Affect Modeling. In: Nishizaki S-y, Numao M, Caro J, Suarez M, editors. Theory and Practice of Computation. Proceedings in Information and Communications Technology. 7: Springer Japan; 2013. p. 221–41.
  7. Deriso D, Susskind J, Tanaka J, Winkielman P, Herrington J, Schultz R, et al. Exploring the Facial Expression Perception-Production Link Using Real-Time Automated Facial Expression Recognition. In: Fusiello A, Murino V, Cucchiara R, editors. Computer Vision–ECCV 2012 Workshops and Demonstrations. Lecture Notes in Computer Science. 7584: Springer Berlin Heidelberg; 2012. p. 270–9.
  8. Yang X, You C-W, Lu H, Lin M, Lane N, Campbell A. Visage: A Face Interpretation Engine for Smartphone Applications. In: Uhler D, Mehta K, Wong J, editors. Mobile Computing, Applications, and Services. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering. 110: Springer Berlin Heidelberg; 2013. p. 149–68.
  9. Zhan C, Li W, Ogunbona P, Safaei F. Facial Expression Recognition For Multiplayer Online Games. Proceedings of the 3rd Australasian conference on Interactive entertainment; Perth, Australia. 1231903: Murdoch University; 2006. p. 52–8.
  10. Martinez A, Du S. A Model of the Perception of Facial Expressions of Emotion by Humans: Research Overview and Perspectives. Journal of Machine Learning Research. 2012;13:1589–608. pmid:23950695
  11. Bartneck C. Affective Expressions of Machines. Eindhoven: Stan Ackerman Institute; 2000.
  12. Hay M. Could a machine or an AI ever feel human-like emotions? 2014. Available: http://www.vitamodularis.org/articles/could_a_machine_feel_human-like_emotions.shtml.
  13. Horlings R, Datcu D, Rothkrantz LJM. Emotion Recognition Using Brain Activity. International Conference on Computer Systems and Technologies–CompSysTech 2008.
  14. Kulic D, Croft EA. Affective State Estimation for Human–Robot Interaction. IEEE Transactions on Robotics. 2007;23:991–1000.
  15. Takahashi K, Sugimoto I, editors. Remarks On Emotion Recognition From Breath Gas Information. IEEE International Conference on Robotics and Biomimetics; 2009 December 19–23; Guilin, China.
  16. Kolakowska A, Landowska A, Szwoch M, Szwoch W, Wróbel MR. Emotion Recognition and its Application in Software Engineering. 6th International Conference on Human System Interaction; June 06–08; Gdansk, Poland: IEEE Xplore Digital Library; 2013.
  17. Monajati M, Abbasi SH, Shabaninia F, Shamekhi S. Emotions States Recognition Based on Physiological Parameters by Employing of Fuzzy-Adaptive Resonance Theory. International Journal of Intelligence Science. 2012;2:166–75.
  18. The Facial Action Coding System: A Technique for the Measurement of Facial Movement [Internet]. San Francisco, CA: Consulting Psychologists Press Inc; 1978.
  19. Ekman P, Friesen WV, Ancoli S. Facial Signs of Emotional Experience. Journal of Personality and Social Psychology. 1980;39:1123–34.
  20. Suk M, Prabhakaran B. Real-time Mobile Facial Expression Recognition System—A Case Study. The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops: IEEE; 2014. p. 132–7.
  21. Ryan A, Cohn JF, Lucey S, Saragih J, Lucey P, Rossi A. Automated Facial Expression Recognition System. 43rd Annual 2009 International Carnahan Conference on Security Technologies; 2009. p. 172–7.
  22. Ekman P, Rosenberg EL. What the face reveals: basic and applied studies of spontaneous expression using the facial action coding system (FACS). New York; Toronto: Oxford University Press; 2005. Available: http://www.myilibrary.com?id=42846.
  23. Fasel B, Luettin J. Automatic Facial Expression Analysis: A Survey. Pattern Recognition. 2003;36(1):259–75. http://dx.doi.org/10.1016/S0031-3203(02)00052-3.
  24. Essa I, Pentland AP. Coding, analysis, interpretation, and recognition of facial expressions. Pattern Analysis and Machine Intelligence, IEEE Transactions on. 1997;19(7):757–63.
  25. Rowley HA, Baluja S, Kanade T. Neural network-based face detection. IEEE Transactions on Pattern Analysis and Machine Intelligence. 1998;20(1):23–38.
  26. Viola P, Jones M, editors. Rapid Object Detection Using A Boosted Cascade Of Simple Features. Computer Vision and Pattern Recognition, 2001 CVPR 2001 Proceedings of the 2001 IEEE Computer Society Conference on; 2001: IEEE.
  27. Tripathy R, Daschoudhury R. Real-time Face Detection and Tracking Using Haar Classifier on SoC. International Journal of Electronics and Computer Science Engineering. 2014;3:175–84.
  28. Wilson PI, Fernandez J. Facial Feature Detection Using Haar Classifiers. Journal of Computing Sciences in Colleges. 2006;21(4):127–33.
  29. Bradski G, Kaehler A, Pisarevsky V. Learning-Based Computer Vision with Intel's Open Source Computer Vision Library. Intel Technology Journal. 2005;9(2):119–30.
  30. Jain V, Learned-Miller EG. Fddb: A benchmark for face detection in unconstrained settings. UMass Amherst Technical Report. 2010.
  31. Sally JD, Sally P. Roots to Research: A Vertical Development of Mathematical Problems: American Mathematical Soc.; 2007.
  32. Saeed A, Al-Hamadi A, Niese R, Elzobi M. Frame-Based Facial Expression Recognition Using Geometrical Features. Advances in Human-Computer Interaction. 2014;2014:4.
  33. Lonare A, Jain SV. A Survey on Facial Expression Analysis for Emotion Recognition. International Journal of Advanced Research in Computer and Communication Engineering. 2013;2:4647–50.
  34. Lucas BD, Kanade T. An Iterative Image Registration Technique with an Application to Stereo Vision. Proceedings of the 7th international joint conference on Artificial intelligence; Vancouver, BC, Canada. 1623280: Morgan Kaufmann Publishers Inc.; 1981. p. 674–9.
  35. Liu Z, Li W, Zhang X, Yang J, editors. Efficient Face Segmentation Based On Face Attention Model And Seeded Region Merging. Signal Processing, 2008 ICSP 2008 9th International Conference on; 2008: IEEE.
  36. Thornhill R, Gangestad SW. Facial Attractiveness. Trends in Cognitive Sciences. 1999;3(12):452–60. pmid:10562724
  37. Zhang L, Mistry K, Hossain A, editors. Shape And Texture Based Facial Action And Emotion Recognition. Proceedings of the 2014 international conference on Autonomous agents and multi-agent systems; 2014: International Foundation for Autonomous Agents and Multiagent Systems.
  38. Ghandi BM, Nagarajan R, Desa H. Real-Time System for Facial Emotion Detection Using GPSO Algorithm. IEEE Symposium on Industrial Electronics and Applications (ISIEA 2010). 2010:40–5.
  39. Kotsia I, Pitas I. Real Time Facial Expression Recognition from Image Sequences using Support Vector Machines. IEEE International Conference on Image Processing 2005; 2005. p. 1–8.
  40. Michel P, Kaliouby RE, editors. Real Time Facial Expression Recognition in Video using Support Vector Machines. Proceedings of the 5th international conference on Multimodal interfaces—ICMI '03; 2003: ACM Press.
  41. Srivastava S. Real Time Facial Expression Recognition Using A Novel Method. The International Journal of Multimedia & Its Applications (IJMA). 2012;4:49–57.
  42. Bajpai A, Chadha K. Real-time Facial Emotion Detection using Support Vector Machines. International Journal of Advanced Computer Science and Applications (IJACSA). 2010;1:37–40.
  43. Lee H. Justifying Database Normalization: A Cost/Benefit Model. Inf Process Manage. 1995;31(1):59–67.
  44. Sutton O. Introduction to k Nearest Neighbour Classification and Condensed Nearest Neighbour Data Reduction. 2012.
  45. Lotfi A, Benyettou A. A Reduced Probabilistic Neural Network For The Classification Of Large Databases. Turkish Journal of Electrical Engineering and Computer Science. 2014;22(4):979–89.
  46. Loconsole C, Runa Miranda C, Augusto G, Frisoli A, Costa Orvalho V, editors. Real-time Emotion Recognition—Novel Method for Geometrical Facial Features Extraction; 2014.
  47. Ryu S-J, Kim J-H. An Evolutionary Feature Selection Algorithm for Classification of Human Activities. Robot Intelligence Technology and Applications 2: Springer; 2014. p. 593–600.
  48. Ghimire D, Lee J. Geometric Feature-Based Facial Expression Recognition in Image Sequences Using Multi-Class AdaBoost and Support Vector Machines. Sensors. 2013;13(6):7714–34. pmid:23771158
  49. Huang K-C, Kuo Y-H, Horng M-F. Emotion Recognition By A Novel Triangular Facial Feature Extraction Method. International Journal of Innovative Computing, Information and Control. 2012;8(11).
  50. Loconsole C, Chiaradia D, Bevilacqua V, Frisoli A. Real-Time Emotion Recognition: An Improved Hybrid Approach for Classification Performance. Intelligent Computing Theory: Springer; 2014. p. 320–31.
  51. Youssif AA, Asker WA. Automatic Facial Expression Recognition System Based on Geometric and Appearance Features. Computer and Information Science. 2011;4(2):p115.
  52. Lucey P, Cohn JF, Kanade T, Saragih J, Ambadar Z, Matthews I, editors. The Extended Cohn-Kanade Dataset (CK+): A complete dataset for action unit and emotion-specified expression. Computer Vision and Pattern Recognition Workshops (CVPRW), 2010 IEEE Computer Society Conference on; 2010: IEEE.
  53. Kapoor A, Qi Y, Picard RW, editors. Fully Automatic Upper Facial Action Recognition. Analysis and Modeling of Faces and Gestures, 2003 AMFG 2003 IEEE International Workshop on; 2003: IEEE.
  54. Lyons M, Akamatsu S, Kamachi M, Gyoba J, editors. Coding Facial Expressions With Gabor Wavelets. Automatic Face and Gesture Recognition, 1998 Proceedings Third IEEE International Conference on; 1998: IEEE.