Effect of Subliminal Lexical Priming on the Subjective Perception of Images: A Machine Learning Approach

The purpose of the study is to examine the effect of subliminal priming on the perception of images influenced by words with positive, negative, and neutral emotional content, using electroencephalograms (EEGs). Participants were instructed to rate how much they liked the stimulus images, on a 7-point Likert scale, after being subliminally exposed to masked lexical prime words that carried positive, negative, or neutral connotations with respect to the images. Simultaneously, the EEGs were recorded. Statistical tests such as repeated measures ANOVAs and two-tailed paired-samples t-tests were performed to measure significant differences in the likability ratings among the three prime affect types; the results showed a strong shift in the likeness judgment for the images in the positively primed condition compared to the other two. The acquired EEGs were examined to assess the difference in brain activity associated with the three conditions. The consistent results obtained confirmed the overall priming effect on participants' explicit ratings. In addition, machine learning algorithms such as support vector machines (SVMs) and AdaBoost classifiers were applied to infer the prime affect type from the ERPs. The highest classification rates of 95.0% and 70.0%, obtained for the average-trial binary classifier and the average-trial multi-class classifier, respectively, further emphasize that the ERPs encode information about the different kinds of primes.


Introduction
Understanding how affective content influences decision making without being consciously perceived has been an area of active research [1]. Such evaluative and affective responses rely on the interplay of underlying emotional and cognitive processes, which are assumed to be instantaneous and automatic [2,3]. When a subliminal presentation of a prime object prejudices or changes one's evaluation of a subsequently presented target object in the direction of the affective valence of the prime object, there is said to be an affective priming effect [4]. This effect occurs faster and more accurately when the prime and target are affectively congruent (i.e., positive-positive or negative-negative) than when they are affectively incongruent (i.e., positive-negative or negative-positive) [4,5].
Various past and recent studies have found the affective priming effect across an array of prime stimuli, from emotional pictures [6][7][8][9][10][11] to emotional words [12][13][14][15] or both [16][17][18]. Most of these studies used ERPs alongside behavioral performance measurements and, as a result, have identified two emotion-related ERP components [19]. The early posterior negativity (EPN) occurs around 100-200 ms [20,21] and the late positive complex (LPC) at around 200-500 ms [22,23]. Neurophysiological analysis of subliminal priming has made these seemingly automatic unconscious processes visible, allowing online measurement of complex information processing at a temporal resolution that is not possible with standard behavioral measurements alone, such as reaction time or accuracy scores [17,18,24]. ERP studies have also been used in applications such as object recognition [25], decoding of visual attention [26,27], and prediction of human cognitive states [28].
However, while EEG has been beneficial in investigating brain-behavior relationships, can ERP data accurately and efficiently reveal one's intention or what one is thinking? Is it possible to 'decode' an individual's thoughts or even unconscious mental state based only on measurements of their brain activity? It seems that this prospect has remained purely hypothetical [29]. Therefore, in the present study, we address the question of whether unconscious mental states can be consistently decoded from performance patterns during a subliminal affective priming task. Standard ERP analyses and pattern classification techniques, i.e., Support Vector Machine (SVM) and AdaBoost classifiers, are implemented in a comparative study to provide additional evidence to bolster the reliability of ERP data alongside behavioral data. Pattern classifiers integrate neural activity into a decision variable, so that the resulting performance parameters can be compared with the corresponding behavioral performance. Classifiers such as SVM and AdaBoost have been found to perform extremely well for brain data [30][31][32][33]. The main goal of the present study is to quantitatively assess the ability of ERP metrics to predict the affective valence (positive, negative, and neutral) of the visual lexical stimulus (prime word) presented to the participant and hence to reconstruct the mental states across observers from ERP data.
To the best of our knowledge, no study published to date has used pattern classifiers like the SVM to characterize the neural correlates of behavioral performance during a subliminal priming presentation of affective cross-domain stimuli so as to predict the mental states of young, healthy participants. The primes used in the present study are words across three valences (positive, negative, and neutral), and the subsequent target stimuli are originally neutral images (cf. Gibbons, 2009 for similar experimental stimuli). In a recent study by Grotegerd et al. (2013), SVM was used in a subliminal priming experiment on unipolar and bipolar depression patients but with emotional faces as both prime and target stimuli. Das et al. (2010) used pattern classifiers during perceptual decision paradigms (single-trial EEG) for predicting perceptual decision biases, and both the prime and the target stimuli were pictures (face/car paradigm). Bode et al. (2012) likewise used SVM in their multivariate pattern analyses to study choice priming biases in a perceptual decision paradigm with static noise-masked images of pianos and chairs.
The use of subliminal emotional stimuli (words and images) in our experimental design has various implications on the predictions for this study. We expect that the affective primes will cause participants to respond differently to the target images and emotional-content-dependent ERP modulations can be observed as early as at the P1 (40-120 ms), N1 (80-170 ms) and P2 (100-210 ms) time windows, characteristic of the EPN [34]. In addition, the LPC can be expected in the posterior regions due to a shift in likeness judgment in the positively primed condition not found in the negative and neutral conditions [35][36][37]. Specifically, the affective word priming conditions would elicit a lexical priming effect, notably the P300 and N400 effects, associated with attentional capture, evaluation, and memory encoding [38][39][40]. There would also be a more pronounced late ERP component in the posterior regions of the right hemisphere rather than the left, as the right hemisphere is said to play a dominant role in emotional prosody and semantics [41]. Lastly, the SVM as a pattern classifier is predicted to be a successful tool for discriminating among the prime types and thereby the mental states of participants.
The preliminary results obtained were previously published in [42].

Methods
In this section, we describe the experimental protocol, stimuli selection, procedure, EEG signal recording, preprocessing, feature extraction, and classification techniques. The theoretical background of SVM and AdaBoost is also briefly reviewed.

Participants
Forty English-speaking students (26 males, 14 females; M = 22.3 years, ranging 19-33) at Nanyang Technological University volunteered to participate in this study. All had normal or corrected-to-normal vision and were naive to the purpose of the experiment. The Edinburgh Handedness Inventory [43] was administered to determine handedness (39 right-handed and 1 left-handed). The Nanyang Technological University Institutional Review Board approved this study and experimental paradigm. All participants gave informed written consent and received monetary remuneration for their participation.

Stimuli
An initial pilot study was conducted to construct the stimulus set. Six hundred and seventy-five images were acquired at random from the Internet and converted to grayscale. All images depicted objects or places that could be named with a single word (e.g., bangles, restaurant). Each image in the initial set was paired with a positive, negative, and neutral word prime based on suggestions from four analysts of varied cultural backgrounds. Words that were semantically unrelated to the image were considered neutral word primes. The resultant word-image pairs were then submitted to a preliminary rating study to determine the strength of association between each image and its three suggested words.
In the rating study, 10 participants (6 females) rated the following on a 7-point Likert scale: The order of the images and the three prime words to be rated per image was randomized among participants. Each image and its word primes were rated by all 10 participants. The selection of suitable word-image pairs for each affect type was based on the following criteria: positive word-image pairs were rated between 5 and 7 for affect valence by at least 80% of participants, neutral pairs were rated 4 by at least 80% of participants, and negative pairs were rated between 1 and 3 by at least 80% of participants. This procedure created a stimulus pool of 417 word-image pairs consisting of 163 positive pairs, 128 neutral pairs, and 126 negative pairs. No images passed the scoring criteria for more than one affect type, thus there are no repeated images within the stimulus pool.
Next, 150 word-image pairs consisting of 50 positive pairs, 50 neutral pairs, and 50 negative pairs were selected from the stimulus pool for the experiment. To ensure that the images within each condition had similar distributions of qualities, rating scores for each word-image pair were averaged across the participants and submitted to separate one-way ANOVAs for verification. Affect valence scores differed significantly among the three affect conditions (F(2,147) = 383.68, p<0.001). The mean affect valence scores for the positive, negative, and neutral conditions were 5.40, 2.77, and 3.96, respectively. Likability (Likert score) and ease of recognition scores did not differ significantly among the three conditions (F(2,147) = 0.57, p>0.1; F(2,147) = 0.15, p>0.1).
It should be noted that the association strength for positive and negative conditions was significantly different (F(1,98) = 45.30, p<0.001). To limit this effect, only word-image pairs with mean association strengths greater than 3 were retained as stimuli. The mean association strength scores for positive, negative, and neutral conditions were 5.14, 3.70, and 1.56, respectively. A few samples of prime word-image pairs chosen are shown in Fig 1. Words were used as subliminal affective primes for the images. Visual stimuli were presented on the LCD monitor (Dell computer, resolution 800 × 600, refresh rate of 60 Hz, color depth of 16-bit) at a viewing distance of 60 cm.

Experiment Procedure
The sequence of the events in a single-trial is schematized in Fig 2. The start of each trial was triggered by presenting a blank screen for 1000 ms followed by displaying a fixation point, the mark '+' at the center of the white screen for 1000 ms. Offset from the fixation point, a prime word was presented subliminally for 34 ms, followed by a mask '##########' for 34 ms. The duration of the prime words was carefully chosen according to the previous literature showing that a presentation of a simple shape [44], or a word [45,46] for 34 ms causes a subliminal priming effect. Following the mask, a target image was exposed for 1500 ms. On the target image offset, participants were prompted to rate how much they liked the presented image on a 7-point Likert scale, ranging from one (liked the least) to seven (liked the most). The prompt remained in view until the participant's response was obtained, which was made by pressing one of the seven buttons of the keyboard. Simultaneously, the EEG signals were recorded. The inter-trial interval was fixed at 1000 ms. Each participant performed 150 trials of the rating task, split up into 5 different blocks consisting of 30 trials per block, with a short break between the blocks. The sequence of 'prime word-image' pairs was randomized between blocks, and each pair was unique.
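The trial timeline described above can be written down as a small data structure. The following is a minimal sketch, not the presentation script used in the experiment: the names `TRIAL_EVENTS`, `frames_for`, and `make_blocks` are hypothetical, and the plain shuffle used to form blocks is an assumption, since the text does not state whether valence was balanced within blocks. The sketch also shows that a 34 ms presentation corresponds to two frames on the 60 Hz monitor.

```python
import random

# Event durations in ms, taken from the trial description above.
TRIAL_EVENTS = [
    ("blank", 1000),
    ("fixation", 1000),
    ("prime", 34),
    ("mask", 34),
    ("target_image", 1500),
]

REFRESH_HZ = 60  # monitor refresh rate reported in the Stimuli section


def frames_for(duration_ms, refresh_hz=REFRESH_HZ):
    """Nearest whole number of display frames for a requested duration."""
    return round(duration_ms * refresh_hz / 1000)


def make_blocks(pair_ids, n_blocks=5, seed=0):
    """Shuffle the pair ids and split them into equally sized blocks.

    Assumption: a plain shuffle; the paper does not specify the scheme.
    """
    ids = list(pair_ids)
    random.Random(seed).shuffle(ids)
    size = len(ids) // n_blocks
    return [ids[i * size:(i + 1) * size] for i in range(n_blocks)]


# Time from trial start to the rating prompt, and the 5 x 30 block layout.
time_to_prompt_ms = sum(duration for _, duration in TRIAL_EVENTS)
blocks = make_blocks(range(150))
```

Under these assumptions, each trial takes 3568 ms from its start to the rating prompt, excluding the self-paced response and the inter-trial interval.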
An additional procedure was carried out with 10 participants (8 males; mean age of 23.4), different from the participants in the experiment with primes, to determine image rating behavior in the absence of subliminal priming. The procedure for the experiment without primes excluded the 34 ms-long prime word presentation from the original sequence and was otherwise identical.

EEG Recording and Preprocessing
The EEG was recorded using a 32-channel HydroCel GSN (HCGSN) sensor array from Electrical Geodesics Inc. (EGI), arranged according to the 10-20 system [47], at a sampling rate of 250 Hz. Net Amplifier 300 was used to amplify the signal at each electrode by a factor of approximately 20. The EEG data were processed with EEGLAB [48] running in the MATLAB (Mathworks, Natick, MA, USA) environment. The recorded data were band-pass filtered at 0.1-30 Hz and then re-referenced to the average of all electrodes.
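Re-referencing to the average of all electrodes amounts to subtracting, at every time sample, the mean across channels. The sketch below illustrates only this step in plain Python; it is not the EEGLAB routine used in the study, and the function name `average_reference` is hypothetical. The band-pass filter is deliberately left out.

```python
def average_reference(channels):
    """Subtract the across-channel mean from every channel at each sample.

    `channels` is a list of equally long lists, one per electrode.
    """
    n_ch = len(channels)
    n_samples = len(channels[0])
    means = [sum(ch[t] for ch in channels) / n_ch for t in range(n_samples)]
    return [[ch[t] - means[t] for t in range(n_samples)] for ch in channels]
```

After this step the channel values sum to zero at every sample, which is a quick sanity check on real data.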
Epochs for ERPs were collected at −1000 ms to 1500 ms around the image onset for each priming condition. The baseline was set to be −1000 ms to 0 ms. Infomax [49], an independent components analysis algorithm implemented in EEGLAB, was applied to the remaining data to eliminate eye, muscle, and line noise artifacts. In a small number of participants, noisy channels in raw data were removed and interpolated after back-projection using spherical spline interpolation. Individual epochs were then visually inspected for the remaining artifacts, and 8.2% of all epochs were rejected from the final analysis.
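The epoching and baseline step can be sketched as follows, assuming a single channel stored as a Python list and a known sample index for the image onset. The helper names `ms_to_samples` and `extract_epoch` are hypothetical; the study itself used EEGLAB for this step.

```python
FS_HZ = 250  # sampling rate from the recording description


def ms_to_samples(ms, fs_hz=FS_HZ):
    """Convert a duration in ms to a whole number of samples."""
    return int(round(ms * fs_hz / 1000))


def extract_epoch(signal, onset_idx, pre_ms=1000, post_ms=1500):
    """Cut [-pre_ms, +post_ms) around onset and subtract the pre-onset mean."""
    pre = ms_to_samples(pre_ms)
    post = ms_to_samples(post_ms)
    if onset_idx - pre < 0 or onset_idx + post > len(signal):
        raise ValueError("epoch extends beyond the recording")
    epoch = signal[onset_idx - pre:onset_idx + post]
    baseline = sum(epoch[:pre]) / pre  # mean over the -1000..0 ms interval
    return [v - baseline for v in epoch]
```

At 250 Hz this yields 625 samples per epoch, of which the first 250 form the baseline.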

ERP Feature Extraction
We extracted features from the time, frequency, and time-frequency domains, which include window-based mean amplitudes, relative power from alpha, beta, and gamma bands, power spectral density (PSD) estimates from short time Fourier transform (STFT), and wavelet coefficients from the discrete wavelet transform (DWT).
The pre-processed artifact-free single-trial ERP waveforms were averaged across the trials for each participant, electrode, and prime affect type. The mean amplitudes in 25 ms discrete time windows, from 0 to 500 ms of the ERP segment, after the image onset, were then extracted. The neural activity associated with different prime affect conditions, the variation as time elapses, and the existence of ERP components related to various brain activities could be measured and differentiated among the prime affect types. The mean amplitudes are used as input features to the classifiers.
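The window-based mean amplitudes can be illustrated with a short sketch. It assumes that the list passed in starts at the image onset and is sampled at 250 Hz; the function name `window_means` is hypothetical. Because a 25 ms window spans 6.25 samples at this rate, samples are assigned to windows by time stamp, which is one reasonable choice and not necessarily the one used by the authors.

```python
def window_means(epoch, fs_hz=250, start_ms=0, stop_ms=500, win_ms=25):
    """Mean amplitude per consecutive window; epoch[0] is the image onset.

    Samples are assigned to windows by their time stamp, so windows may
    hold 6 or 7 samples at 250 Hz.
    """
    n_win = (stop_ms - start_ms) // win_ms
    sums = [0.0] * n_win
    counts = [0] * n_win
    for i, value in enumerate(epoch):
        t_ms = i * 1000.0 / fs_hz
        if start_ms <= t_ms < stop_ms:
            w = int((t_ms - start_ms) // win_ms)
            sums[w] += value
            counts[w] += 1
    return [s / c for s, c in zip(sums, counts)]
```

For the 0-500 ms segment this gives 20 features per electrode and prime affect type.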
For a non-stationary signal like the ERP, time-frequency analyses such as STFT and the wavelet transform (WT) help identify the time-varying spectral content. STFT is applied to single-trial ERPs using a 128-point Hamming window with 50% overlap. Then, the FFT algorithm is applied to each segment. The PSD estimates of each segment, corresponding to different frequency bands, are extracted and used as input to the classifier.
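The segmentation and spectral estimate described above can be sketched in plain Python. This is an illustration only: a naive DFT stands in for the FFT, the power scaling is simplistic, and the function names as well as any band edges passed to `band_power` are assumptions, since the text does not give the exact band limits.

```python
import cmath
import math


def hamming(n_points):
    """Hamming window coefficients."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (n_points - 1))
            for n in range(n_points)]


def stft_power(signal, win_len=128, overlap=0.5):
    """Power spectrum of each Hamming-windowed segment (naive DFT)."""
    hop = int(win_len * (1 - overlap))
    window = hamming(win_len)
    spectra = []
    for start in range(0, len(signal) - win_len + 1, hop):
        seg = [signal[start + n] * window[n] for n in range(win_len)]
        power = []
        for k in range(win_len // 2 + 1):  # non-negative frequencies only
            coeff = sum(seg[n] * cmath.exp(-2j * math.pi * k * n / win_len)
                        for n in range(win_len))
            power.append(abs(coeff) ** 2 / win_len)
        spectra.append(power)
    return spectra


def band_power(power, fs_hz, win_len, lo_hz, hi_hz):
    """Sum the power of the DFT bins whose frequency lies in [lo_hz, hi_hz)."""
    return sum(p for k, p in enumerate(power)
               if lo_hz <= k * fs_hz / win_len < hi_hz)
```

With a 625-sample epoch, a 128-point window, and 50% overlap, this produces 8 segments with a frequency resolution of about 1.95 Hz.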
In the STFT algorithm, a fixed-duration time window is applied across all frequencies. In general, high-frequency signals require shorter time windows and low-frequency signals require longer time windows to optimally characterize the signal. This limitation is eliminated by using WT, in which the window size varies across the frequencies. The DWT is used to calculate the wavelet coefficients at discrete intervals of time and scale. This technique provides optimal resolution in both the time and the frequency domains. The DWT of a signal $x(t)$ is expressed as:

$$\mathrm{DWT}(j,k) = \frac{1}{\sqrt{|2^{j}|}} \int_{-\infty}^{\infty} x(t)\, \psi\!\left(\frac{t - 2^{j}k}{2^{j}}\right) dt,$$

where $\psi$ is the mother wavelet and the scale $a$ and translation $b$ of the continuous transform are replaced by $2^{j}$ and $2^{j}k$, respectively. We applied decomposition levels up to 5 to the single-trial ERP data. The approximation coefficients ($cA_j$) at level $j$ were used for reconstructing the signal. We observed that the significant ERP features were well preserved up to decomposition level 3. Hence, we selected the level-3 decomposition for further analyses. Tests were conducted with several mother wavelet functions such as Daubechies (db2, db4, and db8), Symlet (sym8), and Biorthogonal (Bior4.4) wavelets, and the one that yielded the maximum efficiency was selected for the application [50][51][52]. The approximation coefficients at level 3 ($cA_3$) were used for classification.
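To illustrate how level-3 approximation coefficients arise, the sketch below uses the Haar wavelet, which is the simplest member of the Daubechies family and is not among the wavelets tested in the study. The Haar choice, the padding of odd-length signals, and the function names are all assumptions made to keep the example short.

```python
import math


def haar_step(signal):
    """One DWT level with the Haar wavelet: (approximation, detail)."""
    if len(signal) % 2:
        signal = signal + [signal[-1]]  # pad odd lengths by repeating the last sample
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def approximation_coeffs(signal, level=3):
    """Approximation coefficients cA_level after repeated Haar steps."""
    approx = list(signal)
    for _ in range(level):
        approx, _detail = haar_step(approx)
    return approx
```

Each level roughly halves the number of coefficients, so a 625-sample epoch is reduced to 79 values at level 3 under this padding rule.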
The above-mentioned features were acquired from the single-trial ERPs and were averaged across the trials to generate the average ERP features to be used for classification.

Feature Selection
In order to acquire a set of optimal features that allows us to differentiate the three prime affect types, we employed a dimensionality reduction technique called linear local Fisher discriminant analysis (LFDA) [53]. LFDA transforms the high-dimensional data samples into a low-dimensional space while preserving most of the intrinsic information [53]. This technique combines the ideas of Fisher discriminant analysis (FDA) and locality preserving projection (LPP). As a result, the between-class separability is increased, and the within-class local structure is preserved. The samples $x_i \in \mathbb{R}^n$ in the $n$-dimensional space are transformed to an $r$-dimensional space (we set $r = 5$) by using an $n \times r$ transformation matrix $T$ as follows:

$$z_i = T^{\top} x_i,$$

where $z_i \in \mathbb{R}^r$ ($1 \le r \le n$) are the samples in the reduced space (embedded samples). The features are normalized for each participant by using Z-scores, where the mean is set to zero and the variance is set to 1. The high-dimensional normalized ERP feature set is fed to the LFDA, and the resultant embedded samples are provided to the classifiers.
For each classifier, optimized feature selection through LFDA was carried out.
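The normalization and the linear embedding can be sketched as below. The sketch does not learn the LFDA matrix; it only shows Z-scoring of one participant's feature matrix and the application of a given $n \times r$ matrix $T$. The use of the population variance and the function names are assumptions.

```python
import math


def zscore_columns(rows):
    """Standardize each feature (column) to zero mean and unit variance."""
    n = len(rows)
    n_feat = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(n_feat)]
    # Population standard deviation (divide by n); an assumption of this sketch.
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n)
            for j in range(n_feat)]
    return [[(r[j] - means[j]) / stds[j] if stds[j] > 0 else 0.0
             for j in range(n_feat)] for r in rows]


def embed(x, T):
    """Compute z = T^T x for an n-vector x and an n-by-r matrix T."""
    n, r = len(T), len(T[0])
    return [sum(T[i][k] * x[i] for i in range(n)) for k in range(r)]
```

In the study, $T$ would come from LFDA fitted on the training data with $r = 5$; here any matrix of matching shape can be supplied.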

Learning Algorithms: An Overview
We applied two different classification algorithms to infer the prime affect from the ERP data: SVMs and AdaBoost classifiers. These algorithms have successfully been applied to various classification problems [54][55][56][57]. A brief review of the theory behind the two learning algorithms is given in the following subsections.

Support Vector machines
Support vector machines (SVMs), introduced by Vapnik [58], are large-margin classifiers. In the context of decoding information from EEGs, SVMs have exhibited satisfactory classification rates [59,60]. In addition, they are known to have good generalization performance (i.e., error rate on test sets) and insensitivity to overtraining and to the curse of dimensionality.
Let us consider a training set of $m$ vectors $x_i \in \mathbb{R}^n$, where $x_i$ belongs to an $n$-dimensional feature space $\mathcal{X}$. Each vector $x_i$ is associated with a label $y_i$, where $y_i$ belongs to a finite label space $\mathcal{Y}$. For binary classification, we assume $\mathcal{Y} = \{-1, +1\}$, i.e., the prime affect type ({positive, negative}, {positive, neutral}, or {negative, neutral}). Let us consider a hyperplane $w \cdot x + b = 0$, where $w$ is the normal to the plane, $\|w\|$ is the Euclidean norm of $w$, and $|b|/\|w\|$ is the perpendicular distance from the hyperplane to the origin. Also, assume that the hyperplane separates the two classes in some space $\mathcal{H}$ and no prior knowledge is available about the data distribution. Then, the optimal hyperplane is the one that maximizes the margin. The optimal values of $w$ and $b$ are obtained by solving the constrained minimization problem using Lagrange multipliers $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_m)$:

$$f(x) = \mathrm{sgn}\left(\sum_{i=1}^{m} \alpha_i y_i K(x_i, x) + b\right),$$

where $K(x_i, x)$ is the kernel function. We refer the readers to [58] for more details on SVMs.
The multi-class problem is formulated using the 'one-against-all' (OAA) strategy, which constructs $k$ binary SVM classifiers (one per class label), each of which distinguishes one class from the rest. The OAA strategy seems to be robust for cases having a small number of classes and a small set of training samples. We trained the SVMs using the radial basis function (RBF) (a.k.a. Gaussian) kernel. It is highly effective in problems where the relationship between the class labels and the attributes is non-linear. The optimal values of the parameters, namely the RBF width $\sigma$ and the regularization constant $C$, are set by cross-validation. This results in the values $\sigma = 8$ and $C = 1$ for average ERP features, and $\sigma = 2$ and $C = 1$ for single-trial ERP features.
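How a trained kernel SVM scores a new sample, and how the one-against-all classifiers can be combined, is sketched below. The parametrization $K(x, z) = \exp(-\|x - z\|^2 / (2\sigma^2))$ and the winner-takes-all rule are common conventions assumed here; the text does not state which ones were used. The multipliers, support vectors, and bias would come from a solver; all names are hypothetical.

```python
import math


def rbf_kernel(x, z, sigma):
    """Gaussian kernel exp(-||x - z||^2 / (2 sigma^2)) (assumed form)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def decision_value(x, support_vectors, labels, alphas, b, sigma):
    """Kernel expansion sum_i alpha_i y_i K(x_i, x) + b."""
    return sum(a * y * rbf_kernel(sv, x, sigma)
               for sv, y, a in zip(support_vectors, labels, alphas)) + b


def predict_one_against_all(x, models, sigma):
    """Pick the class whose binary SVM gives the largest decision value.

    `models` maps a class name to (support_vectors, labels, alphas, b).
    """
    scores = {name: decision_value(x, *params, sigma)
              for name, params in models.items()}
    return max(scores, key=scores.get)
```

The sign of `decision_value` gives the binary prediction, while the one-against-all rule compares the raw values across the $k$ classifiers.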

AdaBoost
The AdaBoost algorithm proposed by Freund and Schapire has been successfully applied in numerous classification problems [61][62][63][64][65]. It is a type of learning algorithm that combines many simple and moderately inaccurate classifiers into a single highly accurate classifier.
The AdaBoost algorithm repeatedly calls a given weak learning algorithm in a series of iterations $t = 1, 2, \ldots, T$. The weak learner accepts the sample set $S = \{(x_1, y_1), (x_2, y_2), \ldots, (x_m, y_m)\}$ along with a distribution $D_t$ over $\{1, 2, \ldots, m\}$ and outputs a weak hypothesis $h_t : \mathcal{X} \to \{-1, +1\}$. $D_t$ denotes the distribution, or set of weights, for the training set. Initially, all weights are set equally; they are updated in the subsequent iterations in such a way that the misclassified samples receive higher weights and the correctly classified samples lower weights. This technique forces the weak learner in the subsequent round to focus on the hard samples [61,62]. For each instance $x$, the sign of $h_t(x)$ identifies the predicted class label, and the absolute value gives the confidence in this classification.
The final hypothesis $H$ is computed as a weighted majority vote of the $T$ weak hypotheses $h_t$, with $\alpha_t$ being the weight assigned to $h_t$. Therefore:

$$H(x) = \mathrm{sign}\left(\sum_{t=1}^{T} \alpha_t h_t(x)\right).$$

AdaBoost is adaptive in that it adapts to the error rates of the individual weak hypotheses. The AdaBoost algorithm has been extended to handle the multi-class case, where the goal is to find weak hypotheses with small pseudo-loss rather than hypotheses whose classification error is small. This is often referred to as AdaBoost.M2 [61].
For a given training sample $(x_i, y_i)$, where $x_i \in \mathcal{X}$ and $y_i \in \mathcal{Y} = \{1, 2, \ldots, k\}$, the hypothesis $h$ is used to answer $k - 1$ binary questions, where $k$ is the number of distinct class labels ($k = 3$). For an instance $x_i$ and an incorrect label $y \neq y_i$, assume a weight $q(i, y)$ associated with the question that discriminates $y$ from the correct label $y_i$. Provided with $D_t$ and the label weighting function $q_t$, the pseudo-loss of $h_t$ is expressed as:

$$\epsilon_t = \frac{1}{2} \sum_{i=1}^{m} D_t(i) \left(1 - h_t(x_i, y_i) + \sum_{y \neq y_i} q_t(i, y)\, h_t(x_i, y)\right).$$

We performed several classification tests with a decision-tree-based AdaBoost algorithm with $T$ = 10, 20, 30, 40, and 50. The value $T = 20$ yielded a good compromise between the computation time and classification accuracy, and was chosen for the binary as well as the multi-class problems. In comparison with SVM, no parameter tuning (except $T$) is required for AdaBoost. Further, SVMs are more computationally demanding than AdaBoost because SVM requires quadratic programming, whereas AdaBoost requires only linear programming.
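The reweighting loop and the weighted vote can be illustrated with a compact sketch of discrete binary AdaBoost. It uses single-threshold decision stumps as the weak learner, which is an assumption; the study used a decision-tree-based learner and AdaBoost.M2 for the multi-class case, neither of which is reproduced here. All function names are hypothetical.

```python
import math


def best_stump(X, y, weights):
    """Threshold rule on a single feature with the lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for sign in (1, -1):
                preds = [sign if row[j] <= thr else -sign for row in X]
                err = sum(w for w, p, t in zip(weights, preds, y) if p != t)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best


def stump_predict(stump, row):
    """Apply a stump returned by best_stump to one sample."""
    _err, j, thr, sign = stump
    return sign if row[j] <= thr else -sign


def adaboost_fit(X, y, n_rounds=20):
    """Discrete binary AdaBoost; returns a list of (alpha_t, stump_t)."""
    m = len(X)
    weights = [1.0 / m] * m  # D_1: uniform weights
    ensemble = []
    for _ in range(n_rounds):
        stump = best_stump(X, y, weights)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Raise the weight of misclassified samples, lower the rest.
        weights = [w * math.exp(-alpha * t * stump_predict(stump, row))
                   for w, row, t in zip(weights, X, y)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def adaboost_predict(ensemble, row):
    """H(x) = sign(sum_t alpha_t h_t(x))."""
    score = sum(alpha * stump_predict(s, row) for alpha, s in ensemble)
    return 1 if score >= 0 else -1
```

A toy one-dimensional pattern such as labels `[1, 1, -1, -1, 1, 1]` over the values 0 to 5 cannot be fitted by any single stump, but the weighted vote of several stumps can fit it.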

Results and Discussion
In this section, we present the results for:
• Behavioral data (Likert scores) analysis for the experiment with and without subliminal primes
• EEG data analysis for the experiment with subliminal primes
• Decoding of the ERP data using the learning algorithms:
  • SVM (average-trial and single-trial classification)
  • AdaBoost (average-trial and single-trial classification)

Analysis of behavioral data
The responses on the 7-point Likert scale for each participant in the experiments with (40 subjects) and without (10 subjects) subliminal primes were averaged across the trials within each affect condition (positive, negative, and neutral) and then analyzed by means of one-way repeated measures ANOVA and two-tailed paired t-tests to determine the effect of subliminal priming on participants' likability ratings of the images.
The repeated measures ANOVA test with three conditions was significant for the experiment with subliminal primes (p = 1.11E-16<0.05), indicating significant differences in the Likert scores across the three affect conditions. However, no such effect was observed in the experiment without subliminal primes (p = 0.861).
The priming effect on behavior was further examined with a two-tailed paired t-test for each pair of conditions (see Table 1). For the experiment with primes, the test returned significant results for the positive-negative (p = 1.41E-09) and positive-neutral (p = 3.36E-12) pairs, implying a strong bias in the likeness judgment toward positive for the images in the positively primed condition compared to the negative and neutral conditions. The effect of negative primes on behavior was, however, not evident in the data (p = 0.596 for negative-neutral). A possible explanation for this might be the weak association between the negative prime words and the stimulus images compared to that of the positive. The mean Likert score ratings for the positive, negative, and neutral conditions were 5.02 (SD = 0.46), 4.48 (SD = 0.43), and 4.51 (SD = 0.34), respectively, in the experiment with primes. Conversely, all three t-tests were non-significant for the experiment without primes (see Table 1). The mean Likert score ratings in this
In summary, the differences among the positive, negative, and neutral conditions were observed only in the experiment with subliminal primes and not in the one without primes. This confirms that the observed differences in behavior are due to the effect of the subliminal primes shown prior to the stimulus images and not due to the physical and subjective qualities of the supraliminal images.
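For reference, the paired-samples t statistic used in these comparisons can be computed as in the sketch below. It takes one mean rating per participant for each of two conditions. The function name `paired_t` is hypothetical, and the p-value lookup from the t distribution is omitted because it is not available in the Python standard library.

```python
import math


def paired_t(a, b):
    """Paired-samples t statistic and degrees of freedom for two conditions."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
    t = mean / math.sqrt(var / n)
    return t, n - 1
```

With 40 participants, each comparison has 39 degrees of freedom.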

Analysis of EEG data
The grand ERP averages at different channels reflect the differences in brain activity among positive, negative, and neutral conditions at the early (50-100 ms) and late (400-450 ms) latencies (see Fig 3). The modulations in the EEG signal before the image onset could be due to the effect of prime word and the use of filter. We observed the N400 effect at channel Pz, associated with lexical priming [66]. One-way repeated measures ANOVA tests were carried out to examine the difference in brain activity among the conditions. The artifact-free ERP signals corresponding to three different conditions were first averaged across the trials, and then the mean amplitudes from discrete time windows were extracted using a window of length 25 ms for each subject. The mean amplitudes at discrete time windows corresponding to different prime affect conditions were analyzed by means of ANOVA tests. The p-values are summarized in Tables 2 and 3. It is interesting to note that the ANOVA test showed a significant difference between the negative and the neutral conditions in 400-425 ms and 425-450 ms time windows of channel Pz. Thus, the effect of negative primes, which was not visible in the behavioral data, was observed and confirmed through ERP analysis. This finding emphasizes the relevance of ERP studies in detecting a subliminal priming effect, which is rather subtle.
The effect was prominent in the occipital, lower temporal, and parietal lobes, and the difference was mainly between positive-negative, and positive-neutral pairs. The consistent differences among the three affect types demonstrated an overall priming effect.
The role of the left/right dorsolateral prefrontal cortex (DLPFC) in predicting the neural activity of fMRI associated with sentence polarities was addressed in [67]. It was claimed that the right hemisphere (RDLPFC) can predict the sentence polarity with the highest accuracy as compared to the left hemisphere (LDLPFC). As can be seen from Table 2, the most significant p-value was reported at channel O2, which is located in the right hemisphere. This is in line with the statement in [67].

Decoding ERPs
Here, we discuss the performance of the applied classifiers (SVM and AdaBoost) in inferring the prime affect type from the ERPs. We focus on two major classification tasks: (i) average-ERP classification and (ii) single-trial ERP classification.

Average-trial ERP classification. Performance evaluation:
Leave-one-subject-out cross-validation (LOSO-CV) was adopted to assess the performance of the classifiers in the average-trial ERP classification. The training set comprised the feature set of 39 participants' average-trial ERP data. The model was then tested against the remaining subject. The process was repeated until every subject had served as the test set. Finally, we report confusion matrices and the classifier accuracy, which is the average accuracy over all the subjects. The confusion matrices for the multi-class and the binary (positive-negative) classifiers are given in Tables 4 and 5, respectively; the confusion matrices for the positive-neutral and negative-neutral classifiers are constructed similarly.
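The LOSO-CV loop can be sketched independently of the classifier. In the sketch below, `fit` and `predict` are caller-supplied placeholders for any learning algorithm, and all names are hypothetical; the per-subject accuracies are averaged as described above.

```python
def loso_splits(subject_ids):
    """Yield (train_subjects, test_subject) pairs, one per held-out subject."""
    subjects = sorted(set(subject_ids))
    for held_out in subjects:
        train = [s for s in subjects if s != held_out]
        yield train, held_out


def loso_accuracy(features, labels, subject_ids, fit, predict):
    """Mean per-subject accuracy; `fit` and `predict` are caller-supplied."""
    per_subject = []
    for train, held_out in loso_splits(subject_ids):
        train_idx = [i for i, s in enumerate(subject_ids) if s in train]
        test_idx = [i for i, s in enumerate(subject_ids) if s == held_out]
        model = fit([features[i] for i in train_idx],
                    [labels[i] for i in train_idx])
        hits = sum(predict(model, features[i]) == labels[i] for i in test_idx)
        per_subject.append(hits / len(test_idx))
    return sum(per_subject) / len(per_subject)
```

With 40 participants this gives 40 folds, each trained on the data of 39 participants.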
Results: Classification was performed using the average ERP features (averaged across the trials for each prime affect type) acquired from the 0-500 ms segment of the ERP, after the stimulus onset.
For each binary SVM classifier, the classification rate (% accuracy) for LOSO-CV is presented in Table 6. The highest classification accuracies obtained for SVMs were 95.0%, 87.5%, and 85.0% for positive-negative, positive-neutral, and negative-neutral, respectively. The features from the channels located at the central, temporal, and parietal lobes were found to be significant for discerning negative from positive, and also from neutral. However, features from only temporal and parietal lobes were required to discriminate between positive and neutral samples.
To further investigate and confirm that the participant's mental state could be inferred from the average ERP features with a high performance rate, we conducted similar classification tasks using the AdaBoost classifier. Maximum classification rates of 91.25%, 92.50%, and 81.25% were attained with AdaBoost for positive-negative, positive-neutral, and negative-neutral, respectively (see Table 7). The highest individual classification performance was accomplished when using ERP data from channels at locations other than frontal. We did not notice any differences in decoding performance when training with features from the right and left hemispheres in any individual classifiers.
The performance of the individual binary classifiers of SVM and AdaBoost was slightly different, but both were found to be effective for the classification problem at hand. This decoding analysis revealed that different prime affect types induce significant changes in the ERP waveforms, which can be identified by means of any powerful classifier with appropriately tuned parameters and optimally selected input features. In other words, it is possible to reliably decode one's mental states, induced by subliminal primes, using ERPs.
In addition, we investigated multi-class classification problems using the average ERP data. Satisfactory accuracy rates of 70% and 61.67% were obtained for SVM and AdaBoost, respectively. The sensitivities of the positive, negative, and neutral classes were 80.00%, 72.50%, and 57.50%, respectively, for SVM. The corresponding figures were 67.50%, 75.00%, and 42.50% for AdaBoost (see Table 8). In summary, the SVM multi-class classifier outperformed AdaBoost.
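The reported overall accuracies follow from the per-class sensitivities when each class contains the same number of samples. The sketch below checks this arithmetic, assuming 40 averaged samples per class (one per participant and prime type), which is consistent with every reported sensitivity being a multiple of 2.5%. The variable and function names are hypothetical.

```python
N_PER_CLASS = 40  # assumed: one averaged sample per participant and prime type

# Per-class sensitivities (%) as reported in the text.
sensitivity_pct = {
    "SVM": {"positive": 80.0, "negative": 72.5, "neutral": 57.5},
    "AdaBoost": {"positive": 67.5, "negative": 75.0, "neutral": 42.5},
}


def overall_accuracy(per_class_pct, n_per_class=N_PER_CLASS):
    """Overall accuracy (%) from per-class sensitivities with equal class sizes."""
    correct = sum(round(p / 100 * n_per_class) for p in per_class_pct.values())
    return 100.0 * correct / (n_per_class * len(per_class_pct))


accuracy = {name: overall_accuracy(s) for name, s in sensitivity_pct.items()}
```

Under this assumption, 84 of 120 samples are correctly classified by the SVM and 74 of 120 by AdaBoost, which matches the reported 70% and 61.67%.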
The multi-class SVM and AdaBoost classifier performances were statistically validated by performing identical classification procedures on randomly permuted data (see Fig 4). One thousand synthetic data sets were generated by randomly reassigning the class labels to the data. The performance on the actual set is marked with an 'X'. For both SVM and AdaBoost, the performance on the synthetic data sets was never better than that on the actual set. As can be seen from Fig 4, the highest accuracies obtained on the permuted data with SVM and AdaBoost were 49.17% and 58.33%, respectively.
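The permutation procedure can be sketched generically. In the sketch below, `evaluate` is a caller-supplied placeholder that reruns the complete cross-validated classification for a given labelling and returns an accuracy; the function name and the p-value estimator with the +1 correction are assumptions, not details given in the text.

```python
import random


def permutation_p_value(labels, evaluate, n_permutations=1000, seed=0):
    """Share of label shuffles scoring at least as well as the true labels.

    `evaluate(labels)` is a caller-supplied function that reruns the whole
    cross-validated classification and returns an accuracy.
    """
    rng = random.Random(seed)
    actual = evaluate(labels)
    shuffled = list(labels)
    null_scores = []
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        null_scores.append(evaluate(shuffled))
    n_better = sum(score >= actual for score in null_scores)
    # The +1 terms count the actual labelling as one of the permutations.
    p_value = (n_better + 1) / (n_permutations + 1)
    return actual, max(null_scores), p_value
```

When none of the 1000 permuted data sets reaches the actual accuracy, this estimator gives p = 1/1001, i.e., roughly 0.001.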

Single-trial ERP classification. Performance evaluation:
We applied leave-one-subject-out cross-validation (LOSO-CV) and leave-one-trial-out cross-validation (LOTO-CV) to assess subject-independent and subject-dependent models of single-trial classification, respectively. In the subject-independent approach, a classifier was trained on the feature set of 39 participants' single-trial ERPs, and the resulting classifier was tested against the remaining subject's single-trial ERPs. The process was repeated for all subjects. Finally, we calculated the classifier accuracy as the average accuracy over all the subjects. The subject-dependent model, on the other hand, reserved one trial for testing and used the remaining trials (approximately 99 trials in the case of the binary classifier) for training. Here, the final classifier accuracy was the average accuracy over all the trials of the 40 subjects.
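The two splitting schemes can be sketched as index generators. This is a minimal illustration under stated assumptions: trials are identified by position in a flat list, `subject_ids` (a hypothetical name) gives the subject of each trial, and every subject is assumed to contribute exactly 100 trials, which approximates the "approximately 99 training trials" mentioned above rather than reproducing the actual trial counts.

```python
def leave_one_subject_out(subject_ids):
    """Yield (subject, train_idx, test_idx): all trials of one subject are held out."""
    for subject in sorted(set(subject_ids)):
        train = [i for i, sid in enumerate(subject_ids) if sid != subject]
        test = [i for i, sid in enumerate(subject_ids) if sid == subject]
        yield subject, train, test


def leave_one_trial_out(subject_ids):
    """Yield (subject, train_idx, test_idx): one trial is held out, and
    training uses only the remaining trials of the same subject."""
    for subject in sorted(set(subject_ids)):
        own = [i for i, sid in enumerate(subject_ids) if sid == subject]
        for held_out in own:
            yield subject, [i for i in own if i != held_out], [held_out]


# ASSUMPTION: 40 subjects with exactly 100 trials each (illustrative layout).
subject_ids = [s for s in range(40) for _ in range(100)]

loso_folds = list(leave_one_subject_out(subject_ids))
loto_folds = list(leave_one_trial_out(subject_ids))
print(len(loso_folds), len(loto_folds))
```

In this layout, each LOSO fold trains on the trials of 39 subjects and tests on the remaining subject, while each LOTO fold trains on 99 trials of a single subject and tests on the one held-out trial. The final accuracy is then averaged over subjects (LOSO) or over all held-out trials (LOTO).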
Results: To make use of the information available in all single-trial ERPs, we trained classifiers with single-trial ERP features. Both SVM and AdaBoost performances were examined, using the subject-independent and subject-dependent approaches. Tables 9 and 10 show the performance of the subject-independent and subject-dependent classifiers, respectively, when using features from single-trial ERPs. For the subject-independent case, the SVM and AdaBoost accuracies were similar: 59.42%, 58.49%, and 53.67% for the positive-negative, positive-neutral, and negative-neutral SVM classifiers, respectively, and 59.80%, 58.20%, and 54.00% for the positive-negative, positive-neutral, and negative-neutral AdaBoost classifiers, respectively. As expected, for the subject-dependent case, the prediction accuracies were higher: 65.03%, 65.16%, and 62.65% for the positive-negative, positive-neutral, and negative-neutral SVM classifiers, respectively, and 67.65%, 67.34%, and 63.23% for the positive-negative, positive-neutral, and negative-neutral AdaBoost classifiers, respectively. Fig 5 shows the single-trial SVM and AdaBoost classification results for individual subjects. For some subjects, classification rates of 85% were achieved for single-trial ERPs. This finding further emphasizes that the ERPs encode information about the different kinds of primes.

SVM and AdaBoost performance on identical input feature sets. The best performance results of the SVM and AdaBoost classifiers on average-trial and single-trial ERP features were given in the previous sections. It is also interesting to compare the results on identical input feature sets, as shown in Tables 11-14. The features that yielded the best classification performance for the binary SVM did not perform well for AdaBoost, and vice versa (see Tables 11 and 12). This is because the features were optimized for each classifier separately. Subject-independent and subject-dependent performance on identical input features was computed and is presented in Tables 13 and 14.

Conclusions
The current study investigated the changes in behavioral and electrophysiological responses to relatively natural and neutral images after participants were subliminally exposed to three different types of prime words, which were deliberately designed to generate positive, negative, and neutral emotional associations with the images. Consistent with previous related studies on subliminal priming, the results showed a significant effect of priming on subjective judgment. The analysis of the behavioral data demonstrated a shift in the likeness judgment toward the positive for the images in the positive prime affect condition compared with the negative condition. A significant effect of negative priming on image rating was not visible in the behavioral response data. A similar experiment conducted in the absence of subliminal prime words confirmed that the difference obtained in the behavioral data of the primed experiment was due to the influence of priming. More interestingly, we were curious to examine how this behavioral shift affects the brain signals (EEG in particular), which could be considered a more objective measure for assessing the priming effect. We observed early and late response differences in the ERPs among the three prime affect types, mainly in the posterior region. These intriguing findings inspired us to explore further to what extent the ERPs encode information relevant to the priming effect. Promising correct classification rates of 95.00%, 87.50%, and 85.00% were obtained for the positive-negative, positive-neutral, and negative-neutral binary SVM classifiers, respectively, and 91.25%, 92.50%, and 81.25% for the AdaBoost classifiers, using average ERP data. The performance on the multi-class problem was lower than that on the binary problems (70.00% and 61.67% for SVM and AdaBoost, respectively), as expected, since it is a more difficult classification problem.
In addition, the decoding accuracies of the single-trial ERP classifications were also reasonable, with accuracies of 80%-85% for certain subjects.
In summary, our results not only support the previous literature on priming, but also highlight the significance of ERP studies for gaining a better understanding of brain-behavior correlations. The promising results could benefit research in areas such as brain and cognition research, health science, and rehabilitation research. In addition, the results could also be used for motivational research, for instance, for subliminally motivating or influencing staff and students toward better productivity and creativity. Further research will need to be carried out to explore the short-term and long-term effects of priming on subjective and objective judgments of images, as well as whether a gender-specific effect can be observed.