Recently, brain-computer interface (BCI) systems based on steady-state visual evoked potentials (SSVEPs) have attracted much attention because of their high information transfer rate (ITR) and increasing number of targets. However, SSVEP-based methods can still be improved in terms of accuracy and target detection time. We propose a new method based on canonical correlation analysis (CCA) that integrates subject-specific models with subject-independent information to enhance BCI performance. Specifically, the training data of other subjects are used to optimize the hyperparameters of the CCA-based model of a specific subject. An ensemble version of the proposed method is also developed for a fair comparison with ensemble task-related component analysis (TRCA). The proposed method is compared with the TRCA and extended CCA methods. A publicly available, 35-subject SSVEP benchmark dataset is used for the comparison, and performance is quantified by classification accuracy and ITR. The ITR of the proposed method is higher than those of TRCA and extended CCA. The proposed method outperforms extended CCA in all conditions and TRCA for time windows greater than 0.3 s. It also outperforms TRCA when the numbers of training blocks and electrodes are limited. This study illustrates that adding subject-independent information to subject-specific models can improve the performance of SSVEP-based BCIs.
Citation: Mehdizavareh MH, Hemati S, Soltanian-Zadeh H (2020) Enhancing performance of subject-specific models via subject-independent information for SSVEP-based BCIs. PLoS ONE 15(1): e0226048. https://doi.org/10.1371/journal.pone.0226048
Editor: Xiang Gao, West China Medical School of Sichuan University, CHINA
Received: July 22, 2019; Accepted: November 17, 2019; Published: January 14, 2020
Copyright: © 2020 Mehdizavareh et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: The data that support the findings of this study are openly available at ftp://sccn.ucsd.edu/pub/ssvep_benchmark_dataset/. Further questions can be directed to the data owners: Yijun Wang, Xiaogang Chen, Xiaorong Gao, and Shangkai Gao.
Funding: The authors received no specific funding for this work.
Competing interests: The authors have declared that no competing interests exist.
Brain-computer interface (BCI) systems provide novel communication channels for humans, especially severely disabled individuals [1–3]. A character speller is an important BCI system that allows disabled individuals to communicate with their surrounding environment. Electroencephalography (EEG) is a noninvasive, low-cost, and simple modality that is widely used to implement BCI spellers. In recent years, steady-state visual evoked potential (SSVEP)-based BCI spellers have attracted more attention than other BCI systems, including those based on motor imagery and P300, because of their high information transfer rate (ITR), reduced need for user training, and ability to handle a large number of classes [4–7].
There are many target coding methods in SSVEP-based BCIs, among which frequency coding is a popular choice [8, 9]. Several methods have been proposed to combine phase and frequency coding [10–12]. The most discriminative of these is the joint frequency-phase modulation (JFPM) method, which assigns different frequencies and phases to adjacent targets. Target identification is another crucial issue in SSVEP-based BCIs, for which numerous methods have been proposed. Initially, single-channel methods based on power spectral density analysis (PSDA) were presented [13, 14]; multichannel methods were then introduced to improve the signal-to-noise ratio (SNR) of the SSVEP response. In these methods, channels are combined using appropriate spatial filters so that noise common to the channels is reduced and the quality of the SSVEP response is improved. Prominent examples are minimum energy combination (MEC), maximum contrast combination (MCC), and canonical correlation analysis (CCA). Although these methods are widely used because they are simple and need no training, they detect frequency only: they cannot discriminate between two different phases, and their performance degrades in short time windows because of the background EEG noise. To solve these problems, calibration data have been used [12, 17–20].
Extended CCA was introduced to combine CCA coefficients with the Pearson correlation coefficients between the test and training data. Multiway CCA (MwayCCA), L1-regularized MwayCCA, and multiset CCA (MsetCCA) were proposed to optimize the artificial sine-cosine reference signals embedded in CCA using the training trials of each subject. In addition, task-related component analysis (TRCA) was suggested to enhance the SNR of the SSVEP response using optimized spatial filters. TRCA extracts task-related components by maximizing their reproducibility during the task period. Comparison studies have shown that the extended CCA and TRCA methods are superior to other methods in terms of classification accuracy and ITR, especially in short time windows [20, 22]. We therefore compare our proposed method with these two methods.
From a training point of view, target identification methods can be classified into three main categories: 1) training-free methods such as PSDA and CCA, which do not need any calibration data; 2) subject-specific training methods such as extended CCA and TRCA, for which calibration data are collected for each subject and the parameters of the algorithm are optimized individually; and 3) subject-independent training methods such as transfer template-based CCA (tt-CCA), which use the training data of existing subjects to create a fixed model for a new subject.
In this paper, we propose a new CCA-based method that exploits both subject-specific and subject-independent training to enhance the performance of a BCI system. A publicly available, 35-subject SSVEP benchmark dataset is used to evaluate the proposed method. First, the most informative CCA-based correlation coefficients are found using subject-independent training; the selected coefficients are then used for a new subject. In addition, an ensemble version of the CCA-based method is introduced in which a linear combination of the correlation coefficients derived from the basic and ensemble spatial filters is used to construct the final feature for target identification.
The remainder of the paper is organized as follows. Section 2 introduces the benchmark dataset and the data preprocessing applied to all methods, and reviews the standard CCA, extended CCA, and TRCA methods. The basic and ensemble versions of the proposed algorithm are then described in detail, and finally, filter bank analysis is explained. Section 3 presents the experimental results. Section 4 discusses the difference between the proposed algorithm and the extended CCA method and describes the advantages of our method over other methods. Section 5 concludes the paper.
2.1. Benchmark dataset
In this study, the 35-subject SSVEP benchmark dataset is used. This dataset is freely available to the BCI community to facilitate the comparison of SSVEP detection algorithms. The dataset was collected from 35 subjects (17 females and 18 males, with a mean age of 22 years; 27 naïve and 8 experienced). The experiment uses a 40-target speller in which the JFPM method encodes the characters with a 0.2 Hz frequency difference and a 0.5π phase difference between neighboring targets. The stimulus frequencies lie in the range [8, 15.8] Hz. It has been shown that a phase interval of 0.35π leads to the best performance of the BCI system. Thus, the method proposed in [12, 25] is used to shift the EEG data circularly such that the phase difference becomes 0.35π. For each subject, the task consists of six blocks, each of 40 trials, one per target, presented in random order on an LCD. In each trial, a visual cue (a red square) is shown on the screen for 0.5 s and the subject is asked to shift their gaze to the cued target. When the cue disappears, all 40 targets flicker simultaneously for 5 s. When the stimulation ends, the screen is blank for 0.5 s before the next trial starts. Each trial therefore lasts 6 s. In every block, the subjects are asked to avoid blinking during stimulus presentation. To avoid eye fatigue, there are several minutes of rest between successive blocks.
The EEG data were acquired from 64 channels using a Synamps2 system (Neuroscan Company) with a sampling rate of 1000 Hz. The electrodes were placed according to the international 10–20 system. The ground electrode was placed between Fz and FPz and the reference electrode was placed at the vertex. The passband of the amplifier was 0.15 Hz to 200 Hz, and the electrode impedances were kept below 10 kΩ. During data recording, a notch filter was used to remove the 50 Hz power line noise. The synchronous signal generated by the stimulus program was sent to the amplifier and recorded on an event channel synchronized to the visual cue onset. To reduce the data size, all EEG epochs were down-sampled to 250 Hz. Further details are given in the publication that introduced the dataset.
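The frequency and phase layout described above can be illustrated with a short sketch. It assumes that targets are indexed in ascending frequency order, which the text does not state; the function name `jfpm_targets` is hypothetical.

```python
import math

def jfpm_targets(n_targets=40, f0=8.0, df=0.2, dphi=0.5 * math.pi):
    """Return (frequency_hz, phase_rad) pairs for a JFPM-style layout.

    Assumption: target k (0-based) has frequency f0 + k*df and phase
    k*dphi wrapped into [0, 2*pi). The on-screen ordering of the
    targets in the dataset may differ.
    """
    targets = []
    for k in range(n_targets):
        freq = round(f0 + k * df, 1)          # 8.0, 8.2, ..., 15.8 Hz
        phase = (k * dphi) % (2 * math.pi)    # 0, 0.5*pi, pi, 1.5*pi, 0, ...
        targets.append((freq, phase))
    return targets

targets = jfpm_targets()
```

With the default arguments the 40 frequencies span 8.0 Hz to 15.8 Hz, matching the range quoted in the text.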
2.2. Data preprocessing
The first step of the EEG data preprocessing is channel selection. The SSVEP topographic scalp maps show high activity over the parietal and visual areas [26, 27]. Based on previous studies [12, 25], nine electrodes located in these areas (O1, O2, Oz, PO3, PO4, PO5, PO6, POz, and Pz) are selected. To account for the 140 ms latency of the visual system [12, 28], for a time window of length Tw s, all epochs are extracted in the interval [0.14 s, 0.14 + Tw s], where time 0 indicates the stimulus onset. All segmented epochs are then band-pass filtered from 6 Hz to 90 Hz using a Chebyshev Type II infinite impulse response (IIR) filter. The filtfilt() function in MATLAB is used to apply the filter in the forward and reverse directions, which yields zero-phase filtering.
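As a small illustration of the epoch extraction, the sketch below converts the interval [0.14 s, 0.14 + Tw s] into sample indices at the 250 Hz rate of the down-sampled data. It assumes that sample 0 coincides with the stimulus onset; the function name is hypothetical.

```python
def epoch_sample_range(tw_s, fs_hz=250, latency_s=0.14):
    """Return (start, stop) sample indices, stop exclusive, for a window of
    tw_s seconds that begins latency_s after stimulus onset.

    Assumption: index 0 is the stimulus onset. If the stored epochs include
    a pre-stimulus period, that offset must be added to both indices.
    """
    start = int(round(latency_s * fs_hz))
    stop = start + int(round(tw_s * fs_hz))
    return start, stop

start, stop = epoch_sample_range(0.5)
```

For a 0.5 s window this gives 125 samples starting at sample 35.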
2.3. Reference methods
2.3.1. Standard CCA method.
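CCA is a multivariate statistical method that maximizes the correlation between two sets of variables and has been widely used in SSVEP-based BCIs for frequency detection [16, 29]. Let $f_k$, $F_s$, $N_t$, $M$, $K$, and $N_h$ denote the k-th stimulus frequency, the sampling rate, the number of time points, the number of EEG channels, the number of targets, and the number of harmonics considered, respectively. The multichannel EEG data are represented by $X \in \mathbb{R}^{M \times N_t}$, and the reference signals $Y_k \in \mathbb{R}^{2N_h \times N_t}$ are sinusoidal and defined as:
$$ Y_k = \begin{bmatrix} \sin(2\pi f_k t) \\ \cos(2\pi f_k t) \\ \vdots \\ \sin(2\pi N_h f_k t) \\ \cos(2\pi N_h f_k t) \end{bmatrix}, \quad t = \frac{1}{F_s}, \frac{2}{F_s}, \dots, \frac{N_t}{F_s} \tag{1} $$
CCA finds the weight vectors $w_x$ and $w_y$ so that the correlation between the two canonical variables $x = X^T w_x$ and $y = Y_k^T w_y$ (which are linear combinations of $X$ and $Y_k$, respectively) is maximized by solving the following optimization problem:
$$ \rho_k = \max_{w_x, w_y} \rho(x, y) = \max_{w_x, w_y} \frac{E[w_x^T X Y_k^T w_y]}{\sqrt{E[w_x^T X X^T w_x]\, E[w_y^T Y_k Y_k^T w_y]}} \tag{2} $$
where $\rho(x, y)$ is the Pearson correlation coefficient between $x$ and $y$, and $\rho_k$ is the maximum of $\rho$ with respect to $w_x$ and $w_y$. To recognize the frequency of the SSVEP, $\rho_k$ is calculated for all targets ($k = 1, 2, \dots, K$) and the target with the maximal $\rho_k$ is selected:
$$ \tau = \arg\max_{k} \rho_k, \quad k = 1, 2, \dots, K \tag{3} $$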
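The sine-cosine reference signals used by CCA can be generated as in the following sketch. It uses only the Python standard library, takes the 250 Hz sampling rate and three harmonics reported elsewhere in the paper as defaults, and assumes that the rows alternate between sine and cosine for each harmonic; the function name is hypothetical.

```python
import math

def reference_signals(freq_hz, n_samples, fs_hz=250, n_harmonics=3):
    """Build a (2 * n_harmonics) x n_samples reference matrix as nested lists.

    Rows alternate sin and cos for harmonics 1..n_harmonics, and time runs
    over t = 1/Fs, 2/Fs, ..., Nt/Fs.
    """
    rows = []
    for h in range(1, n_harmonics + 1):
        angles = [2 * math.pi * h * freq_hz * n / fs_hz
                  for n in range(1, n_samples + 1)]
        rows.append([math.sin(a) for a in angles])
        rows.append([math.cos(a) for a in angles])
    return rows

Y = reference_signals(10.0, n_samples=125)
```

Each target gets its own matrix of this form, and CCA is then run once per target against the same EEG epoch.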
2.3.2. Extended CCA-based method.
The standard CCA method is unsupervised, meaning that it does not use any calibration data for target identification. It was originally developed for frequency detection. Since phase detection requires training data, CCA cannot be used to distinguish different phases. Incorporating training data in target identification methods captures the temporal features of the SSVEP response more effectively and enhances the performance of CCA-based approaches [12, 22]. Extended CCA, which combines standard CCA with individual training-based methods, has been proposed in several studies [5, 7, 12, 30], and its superiority over other CCA-based training methods has been demonstrated. In this method, an individual SSVEP template signal $\bar{X}_k \in \mathbb{R}^{M \times N_t}$ is derived by averaging multiple training trials of the k-th target. Projections of the test data $X$ and the individual template $\bar{X}_k$ are then computed using the CCA-based spatial filters, and the correlation coefficients between some pairs of the projections are used as features to identify the target. Specifically, in addition to the standard CCA coefficient, four features are used, which gives the correlation vector:
$$ r_k = \begin{bmatrix} r_k(1) \\ r_k(2) \\ r_k(3) \\ r_k(4) \\ r_k(5) \end{bmatrix} = \begin{bmatrix} \rho\big(X^T w_X(X Y_k),\ Y_k^T w_{Y_k}(X Y_k)\big) \\ \rho\big(X^T w_X(X \bar{X}_k),\ \bar{X}_k^T w_X(X \bar{X}_k)\big) \\ \rho\big(X^T w_X(X Y_k),\ \bar{X}_k^T w_X(X Y_k)\big) \\ \rho\big(X^T w_{\bar{X}_k}(\bar{X}_k Y_k),\ \bar{X}_k^T w_{\bar{X}_k}(\bar{X}_k Y_k)\big) \\ \rho\big(\bar{X}_k^T w_X(X \bar{X}_k),\ \bar{X}_k^T w_{\bar{X}_k}(X \bar{X}_k)\big) \end{bmatrix} \tag{4} $$
Here, $w_A(AB)$ represents the spatial filter that is derived from CCA between two multidimensional variables $A$ and $B$ and that corresponds to variable $A$. The sum of these five correlation values is used as the final feature for target identification:
$$ \rho_k = \sum_{i=1}^{5} r_k(i) \tag{5} $$
Eq (5) also captures the discriminative information in negative correlation coefficients (all except $r_k(1)$ can be negative). Although the original method uses the sum of the squares of the coefficients along with their signs, Eq (5) is used in this study because of its superior performance. Finally, the stimulus target is identified by Eq (3).
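The combination and decision steps of Eqs (5) and (3) amount to a sum followed by an arg-max. The sketch below shows this on made-up correlation values for three targets; the numbers and the function name are illustrative assumptions, not results from the paper.

```python
def combine_and_classify(features_per_target):
    """Sum the correlation features of each target and return the index of
    the target with the largest sum, together with all sums."""
    scores = [sum(features) for features in features_per_target]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores

# Hypothetical feature vectors (five correlations per target).
toy_features = [
    [0.20, -0.10, 0.00, 0.10, 0.05],
    [0.60, 0.40, 0.50, 0.30, 0.20],
    [0.30, -0.20, 0.10, 0.00, -0.10],
]
best, scores = combine_and_classify(toy_features)
```

Because the signed values are summed directly, negative correlations lower a target's score instead of being discarded.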
2.3.3. TRCA-based method.
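TRCA was originally proposed for functional neuroimaging and was later used in SSVEP-based BCIs to obtain optimized spatial filters that improve the SNR of the SSVEP response. The method recovers the task-related components (here, the SSVEP) as a linear, weighted sum of the observed signals (here, the multichannel EEG signals):
$$ y(t) = \sum_{j=1}^{M} w_j x_j(t) = w^T x(t) \tag{6} $$
where $j$ is the channel index, $y(t) \in \mathbb{R}$ is the recovered signal, $x(t) \in \mathbb{R}^{M}$ is the multichannel EEG signal, and $w \in \mathbb{R}^{M}$ is the optimized spatial filter derived from the TRCA method. The filter is obtained by maximizing the inter-trial covariance. Let $x^{(h)}(t)$, $y^{(h)}(t)$, and $H$ denote the h-th trial of $x(t)$, the h-th trial of $y(t)$, and the number of training trials, respectively. The covariance between the $h_1$-th and $h_2$-th trials of $y(t)$ is defined by:
$$ C_{h_1 h_2} = \mathrm{Cov}\big(y^{(h_1)}(t), y^{(h_2)}(t)\big) = \sum_{j_1, j_2 = 1}^{M} w_{j_1} w_{j_2}\, \mathrm{Cov}\big(x_{j_1}^{(h_1)}(t), x_{j_2}^{(h_2)}(t)\big) \tag{7} $$
Summing over all pairs of distinct trials gives:
$$ \sum_{\substack{h_1, h_2 = 1 \\ h_1 \neq h_2}}^{H} C_{h_1 h_2} = w^T S w, \quad S_{j_1 j_2} = \sum_{\substack{h_1, h_2 = 1 \\ h_1 \neq h_2}}^{H} \mathrm{Cov}\big(x_{j_1}^{(h_1)}(t), x_{j_2}^{(h_2)}(t)\big) \tag{8} $$
To limit the weight vector in Eq (8), the variance of $y(t)$ is normalized to one:
$$ \mathrm{Var}\big(y(t)\big) = \sum_{j_1, j_2 = 1}^{M} w_{j_1} w_{j_2}\, \mathrm{Cov}\big(x_{j_1}(t), x_{j_2}(t)\big) = w^T Q w = 1 \tag{9} $$
The constrained optimization problem then becomes:
$$ \hat{w} = \arg\max_{w} \frac{w^T S w}{w^T Q w} \tag{10} $$
The optimal weight vector is the eigenvector corresponding to the largest eigenvalue of the matrix $Q^{-1}S$. The following correlation coefficient is then computed:
$$ r_k = \rho\big(X^T w_k,\ \bar{X}_k^T w_k\big) \tag{11} $$
where, as in Subsection 2.3.2, $X$ and $\bar{X}_k$ are the single-trial test data and the SSVEP template signal computed by averaging across the trials of the k-th target, respectively, and $w_k$ is the spatial filter obtained by applying the TRCA algorithm to the training data of the k-th visual stimulus. Finally, the target is recognized by the rule in Eq (3).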
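The step from the constrained maximization to the eigenvector solution can be spelled out with a Lagrange multiplier. The sketch below writes $S$ for the symmetric matrix of summed between-trial covariances and $Q$ for the covariance matrix of the EEG signal, assumes that $Q$ is invertible, and uses $\lambda$ as a hypothetical symbol for the multiplier.

```latex
\begin{aligned}
\mathcal{L}(w, \lambda) &= w^{T} S w - \lambda \left( w^{T} Q w - 1 \right), \\
\frac{\partial \mathcal{L}}{\partial w} = 2 S w - 2 \lambda Q w = 0
  \;&\Rightarrow\; S w = \lambda Q w
  \;\Rightarrow\; Q^{-1} S w = \lambda w, \\
w^{T} S w &= \lambda \, w^{T} Q w = \lambda .
\end{aligned}
```

Every stationary point is therefore an eigenvector of $Q^{-1}S$, and the objective evaluated there equals its eigenvalue, so the maximum is attained at the eigenvector with the largest eigenvalue.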
An ensemble TRCA method was also proposed, in which the spatial filters derived for the different visual stimuli are integrated to construct an ensemble of spatial filters:
$$ W = [w_1, w_2, \dots, w_K] \tag{12} $$
Since the mixing coefficients from the SSVEP source to the scalp recordings are approximately the same across the frequency range used, the $K$ spatial filters can be considered similar to one another, which explains the effectiveness of the ensemble TRCA method. In this method, Eq (11) is extended to:
$$ r_k = \psi\big(X^T W,\ \bar{X}_k^T W\big) \tag{13} $$
where $\psi(A, B)$ indicates the two-dimensional correlation coefficient between $A$ and $B$. Finally, Eq (3) is used for target identification.
2.4. Proposed method
The extended CCA method has two shortcomings. First, there are numerous ways to project the training data or the test data on the CCA-based spatial filters and to compute the correlation between each pair of these projections. Extended CCA uses only five such correlation coefficients, given in Eq (4), and it is unclear why these five features are selected and the others ignored. Second, there is no ensemble extension of this or any other CCA-based method. These methods therefore cannot compete with ensemble TRCA, which has the best performance among the current methods. To mitigate these limitations, a new method is proposed in this study in which the best CCA-based features are selected. Moreover, an ensemble version of the method is proposed to enhance its performance. The structures of the proposed algorithms are illustrated in Fig 1 and their details are presented below.
Fig 1. Structures of the proposed algorithms. Green and purple backgrounds represent subject-independent and subject-specific training, respectively.
2.4.1. Basic algorithm.
In the first step, all possible canonical variables (CVs) derived from the CCA-based spatial filters are constructed. The CCA-based methods involve three types of data: 1) the test data $X$; 2) the template signal $\bar{X}_k$, obtained by averaging across the training blocks of the k-th target; and 3) the sinusoidal signals $Y_k$. Computing CCA between each pair of these three data types generates six spatial filters: 1) $w_X(X\bar{X}_k)$; 2) $w_{\bar{X}_k}(X\bar{X}_k)$; 3) $w_X(XY_k)$; 4) $w_{\bar{X}_k}(\bar{X}_kY_k)$; 5) $w_{Y_k}(XY_k)$; and 6) $w_{Y_k}(\bar{X}_kY_k)$. Projecting $X$ and $\bar{X}_k$ on the first four spatial filters (eight CVs) and $Y_k$ on the fifth and sixth spatial filters (two CVs) generates a total of 10 CVs. These CVs are listed in Table 1.
In the second step, the best correlation features are sought among the correlations between pairs of CVs. Since there are 10 CVs, $\binom{10}{2} = 45$ correlation features can be computed. Fig 2 shows the block diagram of the proposed method for generating the 45 correlation features. Most of these features can be used for target identification. The eight correlation coefficients between the projections of $\bar{X}_k$ and the projections of $Y_k$ cannot detect SSVEPs, even when the test data are used to construct the spatial filters, because neither projected signal contains the test data. The correlation between CV9 and CV10 is not useful either. Therefore, a combination of the remaining 36 features can be selected for the subject-specific training.
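The feature counts quoted above (45 candidate pairs, of which 36 remain) can be checked with a few lines of Python. The labeling of the CVs below is an assumption made for illustration: CV1 to CV4 are taken to be projections of the test data, CV5 to CV8 projections of the template, and CV9 and CV10 projections of the sinusoidal reference; the actual order in Table 1 may differ.

```python
from itertools import combinations

# Hypothetical assignment of the 10 CVs to the data type that is projected.
cv_source = {f"CV{i}": "test" for i in range(1, 5)}
cv_source.update({f"CV{i}": "template" for i in range(5, 9)})
cv_source.update({f"CV{i}": "reference" for i in range(9, 11)})

all_pairs = list(combinations(cv_source, 2))

def is_useful(pair):
    """Drop template-reference pairs and the reference-reference pair,
    since neither involves the test data."""
    kinds = {cv_source[pair[0]], cv_source[pair[1]]}
    return kinds not in ({"template", "reference"}, {"reference"})

useful_pairs = [pair for pair in all_pairs if is_useful(pair)]
```

The eight template-reference pairs and the single reference-reference pair account for the nine discarded features.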
There are a variety of feature selection algorithms in the literature [31, 32]. In this paper, a simple feature selection algorithm called forward selection (FS) is used to find the best set of correlation features. The algorithm first selects, among the 36 features, the single feature that maximizes the average classification accuracy. The classification rule is the one presented in Eq (3). The second feature is then selected such that, together with the feature selected in the previous step, it leads to the best performance. As in Eq (5), features are combined for classification by summing them. Features continue to be added until the average classification accuracy no longer improves. The feature set obtained in the last step is taken as the best feature set.
The subject-independent training is used to create the 45 features and to apply the FS algorithm to them. Applying the FS algorithm to the seven folds described in Subsection 2.4.3 yields seven feature sets, one containing the best features for each fold. Notably, in all of these feature sets the maximum performance is provided by the same six features, although the order in which they are selected differs across folds. Further information regarding the features selected in each fold can be found in the Supporting information. These six best features are: (14)
The coefficients $r_k(1)$, $r_k(3)$, $r_k(4)$, and $r_k(5)$ are present in both the extended CCA and the proposed method, while the coefficients $r_k(2)$ and $r_k(6)$ are exclusive to our method. These coefficients are used for subject-specific training in the basic algorithm. As in Eq (5), the following relation is used to build the final feature for classification:
$$ \rho_k = \sum_{i=1}^{6} r_k(i) \tag{15} $$
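The greedy procedure described above can be sketched as follows. The accuracy function here is a toy stand-in with made-up per-feature gains; in the actual method it would be the average classification accuracy across the training subjects. All names are hypothetical.

```python
def forward_selection(candidates, accuracy_of):
    """Greedy forward selection.

    accuracy_of(features) must return the accuracy obtained with the given
    list of features. A feature is added only if it strictly improves the
    accuracy; otherwise the search stops.
    """
    selected, best_acc = [], float("-inf")
    remaining = list(candidates)
    while remaining:
        acc, feature = max((accuracy_of(selected + [f]), f) for f in remaining)
        if acc <= best_acc:
            break
        selected.append(feature)
        remaining.remove(feature)
        best_acc = acc
    return selected, best_acc

def toy_accuracy(features):
    # Made-up additive gains, used only to exercise the search loop.
    gains = {"a": 0.5, "b": 0.3, "c": 0.0, "d": -0.1}
    return sum(gains[f] for f in features)

selected, best_acc = forward_selection(["a", "b", "c", "d"], toy_accuracy)
```

In this toy run the search keeps the two helpful features and stops when the third candidate brings no improvement.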
2.4.2. Ensemble algorithm.
Ensemble TRCA showed that integrating the spatial filters derived from the calibration data of different classes enhances the performance of the SSVEP BCI. In fact, using both between-class and within-class information in pattern classification methods can boost classifier performance. According to Eq (13), two conditions must be satisfied to exploit an ensemble of spatial filters for a correlation-based feature between two sets. First, the two sets should be projected on the same group of spatial filters. Second, the group must contain the spatial filters of all classes. Evaluating these conditions for the six features in Eq (14) shows that only $r_k(3)$, $r_k(4)$, and $r_k(5)$ satisfy the first condition and only $r_k(5)$ satisfies the second. Consequently, the six features $r_k$ in Eq (14) can be converted to six features $\tilde{r}_k$ that are identical to $r_k$ except for $\tilde{r}_k(5)$. This feature is constructed using the two-dimensional correlation between two projections on the ensemble of the spatial filters derived from CCA between the template signals $\bar{X}_k$ and the sinusoidal signals $Y_k$. Since $\tilde{r}_k(5)$ is the most discriminative of the six coefficients, a uniform combination similar to Eq (15) is not the best solution. To take the differences between features into account, a linear weighted sum of the coefficients is proposed:
$$ \rho_k = \sum_{i=1}^{6} \alpha(i)\, \tilde{r}_k(i) \tag{16} $$
The mixing weights $\alpha(i)$ are estimated using the subject-independent data (see Subsection 2.4.3). The objective is to maximize the average classification accuracy, computed using Eqs (3) and (16). Since the objective function is a complex nonlinear function of $\alpha(i)$, gradient-based optimization methods cannot be applied easily. Given the limited parameter space of the problem, metaheuristic optimization methods such as the genetic algorithm (GA) or particle swarm optimization (PSO) can be used. We use GA to estimate the coefficients $\alpha(i)$ that maximize the objective function. GA is implemented using the ga function in MATLAB. For simplicity, and to limit the search space, the coefficients are confined to the interval [0, 1]. The estimation process assigns the largest weight, $\alpha(5)$, to $\tilde{r}_k(5)$ because it is the most discriminative feature. Finally, it should be noted that the estimated weights $\alpha(i)$ may differ between folds.
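To make the weight estimation concrete, the sketch below pairs a weighted-sum classifier with a deliberately small genetic algorithm written in plain Python. It is not MATLAB's ga function and does not reproduce its operators or settings; the population size, mutation scale, toy trials, and all names are assumptions chosen for illustration.

```python
import random

def weighted_scores(features_per_target, weights):
    """Weighted sum of the six features for each target."""
    return [sum(a * r for a, r in zip(weights, feats))
            for feats in features_per_target]

def accuracy(trials, weights):
    """Fraction of trials whose arg-max score matches the true target."""
    hits = 0
    for true_k, feats in trials:
        scores = weighted_scores(feats, weights)
        hits += max(range(len(scores)), key=scores.__getitem__) == true_k
    return hits / len(trials)

def simple_ga(fitness, n_weights=6, pop_size=20, generations=30, seed=0):
    """Tiny GA: truncation selection, uniform crossover, clipped mutation."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(n_weights)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]        # the best half always survives
        children = []
        while len(parents) + len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(p1, p2)]
            i = rng.randrange(n_weights)
            child[i] = min(1.0, max(0.0, child[i] + rng.gauss(0, 0.1)))
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

# Toy trials (true target, features per target): only the fifth feature
# is informative, the other five are slightly misleading.
toy_trials = [
    (0, [[0.1, 0.1, 0.1, 0.1, 0.9, 0.1], [0.3, 0.3, 0.3, 0.3, 0.2, 0.3]]),
    (1, [[0.3, 0.3, 0.3, 0.3, 0.2, 0.3], [0.1, 0.1, 0.1, 0.1, 0.9, 0.1]]),
]
best_weights = simple_ga(lambda w: accuracy(toy_trials, w))
```

On these toy trials a uniform weighting misclassifies both trials, whereas any weighting that emphasizes the fifth feature classifies them correctly, which mirrors the motivation for non-uniform weights.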
As mentioned before, both of the subject-independent and the subject-specific trainings are used in the proposed method. Cross-validation is performed on the subjects and the six blocks of a specific subject data for the first and second training techniques, respectively. Further information about cross-validation techniques is presented below.
Subject-independent training: The parts related to this training technique are shown in green in Fig 1. In this approach, the K-fold (K = 7) approach is used and the data of 30 subjects is utilized to obtain the best hyperparameters for the remaining 5 subjects. Then, the obtained hyperparameters are used to create the subject-specific models. Specifically, in the basic algorithm, for each fold, 45 CCA-based features are constructed for the 30 subjects and then, the features that maximize the average recognition accuracy for the mentioned subjects are selected (Subsection 2.4.1). Finally, the subject-specific models are created for the remaining 5 subjects using the selected features. Similarly, in the ensemble algorithm, the weights (Subsection 2.4.2) that maximize the average accuracy for the 30 subjects of the corresponding fold are used to build the subject-specific models of the remaining subjects. Therefore, the selected features in the basic algorithm and the weights α(i) in the ensemble algorithm are considered as the hyperparameters.
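The 7-fold split over subjects can be sketched as below. Grouping the subjects in index order is an assumption; the text does not say how subjects were assigned to folds, and the function name is hypothetical.

```python
def subject_folds(n_subjects=35, n_folds=7):
    """Return a list of (training_subjects, held_out_subjects) pairs.

    Assumption: held-out groups are contiguous blocks of subject indices.
    """
    size = n_subjects // n_folds
    subjects = list(range(1, n_subjects + 1))
    folds = []
    for f in range(n_folds):
        held_out = subjects[f * size:(f + 1) * size]
        training = [s for s in subjects if s not in held_out]
        folds.append((training, held_out))
    return folds

folds = subject_folds()
```

Hyperparameters would be chosen on the 30 training subjects of each fold and then applied unchanged to the 5 held-out subjects.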
Subject-specific training: In both the basic and ensemble algorithms, the subject-specific models are built using the hyperparameters derived from the other subjects’ data. For each subject, the leave-one-out technique is applied to the six blocks. In other words, the data from five of the six blocks are used as training data to construct a reference signal for each target, while the left-out (sixth) block is used for validation. This procedure is repeated six times so that every block serves as validation data once. Finally, the average recognition accuracy across the six blocks is computed. Note that the classification accuracies reported in the Results section are obtained with this type of training.
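A minimal sketch of the leave-one-block-out loop and of the template averaging follows. The single-channel, two-sample trials are toy values, and the helper names are hypothetical.

```python
def leave_one_block_out(n_blocks=6):
    """Return (training_blocks, validation_block) pairs, one per block."""
    blocks = list(range(n_blocks))
    return [([b for b in blocks if b != held], held) for held in blocks]

def average_template(trials):
    """Element-wise mean of equally long single-channel trials."""
    n = len(trials)
    return [sum(samples) / n for samples in zip(*trials)]

splits = leave_one_block_out()
template = average_template([[1.0, 2.0], [3.0, 4.0]])
```

In each split, the template of every target would be averaged over the five training blocks and then evaluated on the trials of the held-out block.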
2.5. Filter bank analysis
Higher harmonics of the SSVEP stimulus frequency contain useful information that can improve the recognition accuracy. To extract this information, filter bank analysis has been proposed as a practical solution in which a signal is decomposed into multiple frequency sub-bands [29, 34]. Filter bank analysis can reduce the detection errors caused by background EEG activity. Chen et al. applied the filter bank technique to SSVEP-based BCIs, significantly enhancing the performance of the standard CCA method. This technique is applied to all methods presented here and its effect is reported. To design the filter bank, a procedure similar to [12, 29] is used. The EEG data are decomposed into N sub-bands using N band-pass filters, and a feature extraction algorithm is applied to each sub-band separately. The lower and upper cut-off frequencies of the n-th sub-band are set to n×8 Hz and 70 Hz, respectively. A zero-phase Chebyshev Type II IIR band-pass filter is used to extract each sub-band signal. The features computed from the sub-bands are combined as follows:
$$ \tilde{\rho}_k = \sum_{n=1}^{N} w_{SB}(n)\, \big(\rho_k^{(n)}\big)^2 \tag{17} $$
where $\rho_k^{(n)}$, $\tilde{\rho}_k$, and $w_{SB}(n)$ are the feature value for the n-th sub-band and the k-th target, the final feature for classification, and the weights of the sub-band components, respectively. Previous studies have shown that the SNR of the SSVEP decreases as the response frequency increases. Therefore, the sub-band weights are determined using:
$$ w_{SB}(n) = n^{-a} + b, \quad n \in [1, N] \tag{18} $$
where $a$ and $b$ are constants.
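The sub-band layout and the feature combination can be sketched as follows. The cut-off frequencies come from the text. Two assumptions are made: that the weights follow a power law, `n ** (-a) + b`, with the constants a = 1.25 and b = 0.25 that are often used in filter bank CCA work, and that squared sub-band features are combined; the weight constants are not given in the text above. All names are hypothetical.

```python
def subband_edges(n_subbands=4, step_hz=8.0, upper_hz=70.0):
    """Lower and upper cut-off frequencies of each sub-band (n*8 Hz to 70 Hz)."""
    return [(n * step_hz, upper_hz) for n in range(1, n_subbands + 1)]

def subband_weights(n_subbands=4, a=1.25, b=0.25):
    """Assumed power-law weights that decrease with the sub-band index."""
    return [n ** (-a) + b for n in range(1, n_subbands + 1)]

def combine_subbands(features, weights):
    """Assumed combination rule: weighted sum of squared sub-band features."""
    return sum(w * r ** 2 for w, r in zip(weights, features))

edges = subband_edges()
weights = subband_weights()
score = combine_subbands([0.8, 0.6, 0.4, 0.2], weights)
```

Whatever the exact constants, the weights decrease with the sub-band index, so that the higher sub-bands, where the SSVEP SNR is lower, contribute less to the final feature.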
Classification accuracy and ITR were used as the evaluation metrics to compare the performance of the methods. These two metrics were calculated with various data lengths from 0.2 s to 1 s with a step of 0.1 s. The 0.5 s gaze shifting duration was considered to compute the simulated ITR in the offline analysis. Also, the number of harmonics in Eq (1) was set to 3. Fig 3 shows the average accuracies and ITRs across subjects for three basic methods at different time windows, with and without the filter bank. For the filter bank, the number of sub-bands was set to 4. In all possible cases, TRCA showed a superior performance over the other methods for the time windows shorter than 0.3 s. For the 0.3 s time window, the one-way repeated measures analysis of variance (ANOVA) showed no significant difference between the accuracy (F(2,68) = 1.35, p = 0.26) and ITR (F(2,68) = 1.09, p = 0.33) of the three methods without the filter bank. When filter bank was applied in the 0.3 s time window, ANOVA revealed significant difference in the accuracy (F(2,68) = 17.79, p<0.001) and ITR (F(2,68) = 18.45, p<0.001) of the three methods. The post-hoc paired t-tests showed that there was no significant difference in accuracy (p = 0.67) and ITR (p = 0.62) between the TRCA method and the proposed method while both methods outperformed the extended CCA method (p<0.001). For time windows greater than 0.3 s, ANOVA indicated significant difference (p<0.01) between the three methods in all conditions. Post-hoc paired t-tests confirmed superior performance of the proposed method relative to TRCA and extended CCA (p<0.01). In Fig 3B, the time windows corresponding to the highest ITR are different for each method (extended CCA: 0.8 s; TRCA: 0.8 s; the proposed method: 0.7 s) while in Fig 3D, all methods reached their highest ITR in 0.7 s.
Fig 3. Average accuracies, (a) and (c), and ITRs, (b) and (d), across subjects for the three basic methods at different time windows. Results in the first and second rows are obtained without and with the filter bank, respectively. The number of sub-bands is set to 4. Asterisks indicate a significant difference between the three methods, using ANOVA, at time windows greater than 0.3 s (*p<0.01, **p<0.001). Error bars show standard errors.
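For reference, the simulated ITR can be computed from the accuracy, the number of targets, and the selection time. The sketch below assumes the standard ITR definition for an N-class selection task, which is not written out in the text, adds the 0.5 s gaze-shifting time to the data length, and returns zero at or below chance level as a convention; the function name is hypothetical.

```python
import math

def itr_bits_per_min(accuracy, n_targets, window_s, gaze_shift_s=0.5):
    """Information transfer rate in bits per minute.

    Assumption: the usual ITR formula with uniform target probabilities,
    and a selection time of window_s + gaze_shift_s seconds.
    """
    p, k = accuracy, n_targets
    if p <= 1.0 / k:
        return 0.0                      # convention for chance level or worse
    bits = math.log2(k)
    if p < 1.0:
        bits += p * math.log2(p) + (1 - p) * math.log2((1 - p) / (k - 1))
    return bits * 60.0 / (window_s + gaze_shift_s)

perfect = itr_bits_per_min(1.0, 40, 0.5)
```

With 40 targets, perfect accuracy, and a 0.5 s data length, each selection takes 1 s and carries log2(40) bits.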
The ensemble version of the proposed method is compared with the ensemble TRCA method in Fig 4. To estimate the weights (α(i)) in Eq (16) using the procedure described in Subsection 2.4.2, the time window was set to 0.5 s. Similar to the basic methods, the ensemble TRCA method performed better than the proposed ensemble method in all cases when the data length was less than 0.3 s. For 0.3 s, paired t-tests showed no significant difference between the two methods, with and without filter bank (Fig 4A: p = 0.62; Fig 4B: p = 0.50; Fig 4C: p = 0.12; Fig 4D: p = 0.35). For the data lengths greater than 0.3 s, the proposed ensemble method led to significantly (p<0.001) higher accuracy and ITR than the ensemble TRCA method for both cases. Both methods reached their highest ITRs at 0.6 s in Fig 4B and 0.5 s in Fig 4D.
Fig 4. Average accuracies, (a) and (c), and ITRs, (b) and (d), across subjects for ensemble TRCA and the ensemble version of the proposed method at different time windows. Results in the first and second rows are obtained without and with the filter bank, respectively. The number of sub-bands is set to 4. Asterisks indicate a significant difference between the two methods by paired t-tests at time windows greater than 0.3 s (*p<0.001). Error bars show standard errors.
The performance of the training methods depends on the number of sub-bands, electrodes, and training blocks. The effects of varying these parameters on the classification accuracy of the basic and ensemble TRCA methods and of the basic and ensemble versions of the proposed method are therefore investigated in Figs 5 and 6. The time window was set to 0.5 s for this analysis. In Fig 5, the numbers of training blocks and electrodes were fixed at 5 and 9, respectively, and the effect of the number of sub-bands was explored. The proposed method achieves significantly (p<0.001) higher classification accuracies than TRCA in all cases. For both the basic and the ensemble versions of the two methods, the highest accuracy is achieved with 4 sub-bands. The number of sub-bands was therefore fixed at 4, and the variation of the average accuracy with the number of electrodes and the number of training blocks was examined in Fig 6. The results show that, in both the basic and ensemble cases, the proposed method outperforms TRCA, especially for low numbers of training blocks and electrodes (p<0.001). Furthermore, TRCA needs at least two training blocks to obtain optimal spatial filters, while the proposed method delivers an acceptable performance even with a single training block (see Fig 6B and 6D). This characteristic is one of the major advantages of our method over TRCA. In SSVEP BCIs, training data typically have to be collected at the beginning of each session, which can be time-consuming; because it needs fewer training blocks, our method reduces the training time considerably.
Fig 5. Average accuracies across subjects for different numbers of sub-bands: (a) basic TRCA and the proposed method; and (b) ensemble TRCA and the proposed ensemble method. Asterisks show significant differences between the two methods by paired t-tests (*p<0.001). Error bars show standard errors.
Fig 6. Average accuracies across subjects obtained with different numbers of electrodes, (a) and (c), and training blocks, (b) and (d). The first row compares the two basic methods and the second row compares the two ensemble methods. Asterisks show significant differences between the two methods by paired t-tests (*p<0.001). Error bars show standard errors.
Classification accuracy and ITR are the most important factors for the practical development of SSVEP-based BCI spellers and should therefore be improved as much as possible. In this study, an ensemble CCA-based training method was proposed for the first time, and it improved on the performance of the extended CCA and TRCA methods. The proposed method outperformed extended CCA in all conditions. Furthermore, it outperformed TRCA in terms of both accuracy and ITR for data lengths greater than 0.3 s. The lower performance of our method for short data lengths could be related to inaccurate estimation of the spatial filters by the CCA algorithm from a small number of samples. As the data length increases, however, the spatial filters are estimated more accurately, and the combination of several coefficients that exploit the CCA-based spatial filters improves the performance of the proposed method relative to TRCA.
In practical applications, for the majority of subjects, the maximum speed (highest ITR) is reached at time windows greater than 0.3 s, which justifies applying the proposed method to such subjects. All in all, TRCA is preferable to the proposed method only when the numbers of blocks and electrodes are large and the subject reaches his/her highest ITR in 0.3 s or less. Otherwise, the proposed method is recommended. Also, in this paper, because the number of training blocks per subject is limited, the subject-independent training technique was used to find the best CCA-based features and to estimate the mixing weights in Eq (16). For a new subject, Eqs (14), (15) and (16) and one set of weights α(i) are sufficient for target detection.
To further investigate the performance of the proposed method relative to TRCA, the feature values of the two methods can be compared. Since the scales of the final features obtained by the two methods differ, the feature vector derived from each trial is linearly normalized into [-1, +1] before comparison. Fig 7A and 7B show normalized feature values for a sample frequency derived from the two basic and the two ensemble methods, respectively. The numbers of sub-bands, electrodes, and training blocks were 4, 9, and 5, respectively. A short data length (0.6 s) was selected for the comparison. In both figures, the feature values of the two methods decline with a similar trend in the neighborhood of the true frequency. However, farther from the true frequency, the feature values of the proposed method become significantly (p<0.001) lower than those of TRCA. Therefore, the probability of a false detection is lower for our method than for TRCA, which contributes to its superiority over TRCA.
An example of normalized feature values, averaged across subjects and blocks, obtained by: (a) two basic methods; and (b) two ensemble methods. Red vertical line indicates true frequency. Data length is 0.6 s. Asterisks represent a significant difference between the two methods by paired t-tests (*p<0.01, **p<0.001). Error bars show standard errors.
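The normalization used for this comparison can be sketched as follows. The text states only that each trial's feature vector is linearly normalized into [-1, +1]; the sketch assumes this means min-max scaling over the candidate frequencies of one trial, and the function name is hypothetical.

```python
def normalize_features(features):
    """Linearly map one trial's feature vector onto [-1, +1].

    features: list of feature values, one per candidate frequency.
    The smallest value maps to -1 and the largest to +1 (min-max scaling,
    assumed here). A constant vector is mapped to all zeros.
    """
    lo, hi = min(features), max(features)
    if hi == lo:
        return [0.0 for _ in features]
    return [2.0 * (f - lo) / (hi - lo) - 1.0 for f in features]


# Hypothetical feature values for three candidate frequencies.
normalized = normalize_features([2.0, 4.0, 6.0])
```

The mapping is increasing, so it does not change which candidate frequency has the largest feature value; it only puts the two methods on a common scale.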
Several parameters in this paper can be further optimized for each method (or subject) separately, including the filter bank design, the stimulus design, and the electrode setting. As a representative example, consider the possible sets of n (n<9) electrodes that can be selected from the nine electrodes introduced in Subsection 2.2. For a given n, the optimal electrode layout per method can be found by a grid search, i.e., by calculating the average accuracy across subjects for each layout and selecting the layout with the highest accuracy. This analysis was carried out on the benchmark dataset with three to six electrodes for the proposed ensemble method and the ensemble TRCA method. The best layout per method and the corresponding accuracies are shown in Fig 8A. This figure shows that a suitable subset of four or five electrodes can achieve acceptable accuracies, comparable with those obtained with nine electrodes. It also illustrates that, within a local area (i.e., the visual area), the best layout obtained by a grid search is almost independent of the spatial filter-based target identification method used.
(a) The best layout of the electrodes per method, derived from a grid search for all subjects, and the corresponding average accuracies; and (b) the potential average accuracies across subjects after selecting the best layout of the electrodes per subject. In both panels, the data length is 0.5 s. Asterisks represent a significant difference between the two methods by paired t-tests (*p<0.001). Error bars show standard errors.
Another approach to optimizing the electrode setting is unsupervised channel selection [35]. The maximum achievable accuracy per subject derived from a grid search can serve as a reference for comparing channel selection algorithms in future studies. For example, Fig 8B shows the average accuracies after selecting the best electrodes per subject. This figure reveals the considerable potential of an effective channel selection algorithm to enhance the performance of the methods. The superior performance of the proposed method compared with TRCA is illustrated in both Fig 8A and 8B.
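The layout search described above amounts to an exhaustive search over electrode subsets. A minimal sketch is given below, covering both the best layout across subjects (as in Fig 8A) and the best layout per subject (as in Fig 8B). The function name, the `accuracy_fn` callback, and the electrode labels are assumptions for illustration; the section itself only refers to the nine electrodes of Subsection 2.2.

```python
from itertools import combinations
from statistics import mean

# Assumed labels for the nine electrodes; not stated in this section.
NINE_ELECTRODES = ["Pz", "PO5", "PO3", "POz", "PO4", "PO6", "O1", "Oz", "O2"]


def search_layouts(electrodes, n, subjects, accuracy_fn):
    """Exhaustively evaluate every n-electrode layout.

    accuracy_fn(subject, layout) -> float is a hypothetical callback that
    runs a target identification method for one subject on one layout
    and returns its classification accuracy.

    Returns (group_best, per_subject_best, table) where
      group_best        layout with the highest mean accuracy across subjects
      per_subject_best  dict mapping each subject to its own best layout
      table             table[layout][subject] = accuracy
    """
    layouts = list(combinations(electrodes, n))
    table = {
        layout: {s: accuracy_fn(s, layout) for s in subjects}
        for layout in layouts
    }
    group_best = max(layouts, key=lambda l: mean(table[l].values()))
    per_subject_best = {
        s: max(layouts, key=lambda l: table[l][s]) for s in subjects
    }
    return group_best, per_subject_best, table
```

For nine electrodes the search stays small: there are 84 layouts for n = 3 or 6 and 126 layouts for n = 4 or 5, each evaluated once per subject and method.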
In this study, a method was proposed that uses both subject-specific and subject-independent training techniques. Since collecting training data is time-consuming and may be exhausting for some subjects, transfer learning methods have been proposed that use the training data of other subjects [24] or of different sessions of the same subject [36]. Furthermore, using the benchmark dataset, which contains a large number of subjects [25], various training-free algorithms can be devised and evaluated in future studies to improve the effectiveness of such methods. Since the optimal data length can differ between trials, adaptive selection of the window length using a dynamic stopping criterion can be a solution for BCI users [37–38]. In addition, combining SSVEP with other modalities, e.g., eye-tracking systems [39], can improve performance compared with using either modality alone. However, the advantage of hybrid methods over single-modality methods needs further investigation.
The advantages of our approach relative to the TRCA and extended CCA methods for target detection in SSVEP-based BCI can be summarized as follows.
- Our method integrates subject-specific models with subject-independent information and enhances the BCI performance.
- The classification accuracy and information transfer rate (ITR) of our method are significantly higher than those of the extended CCA in all conditions and those of TRCA in time windows larger than 0.3 s.
- Our method can be easily implemented in online BCI applications to realize a high-speed SSVEP-based speller.
- Our method outperforms TRCA when the number of the training blocks and the number of the electrodes are small. Also, for subject-specific training, TRCA needs at least two training blocks while our method works with a single training block. This facilitates the development and application of the BCI systems.
- A problem with the SSVEP-based BCI spellers is false detection, due to interference from the nearest neighbors of the target frequency. The likelihood of this error for our method is lower than that of the TRCA method.
This study proposed a framework to improve traditional CCA-based training methods by finding the best hyperparameters for each subject using other subjects’ training data. These hyperparameters were used to construct the basic and ensemble versions of the proposed method. An offline analysis was performed on a benchmark dataset, and the proposed method was compared with the extended CCA and TRCA methods. Our method showed significantly higher performance than extended CCA in all conditions and than TRCA for time windows greater than 0.3 s. All three methods can be implemented in online BCI applications to realize a high-speed SSVEP-based speller.
The authors would like to thank the authors of [25] for making the benchmark dataset freely available.
- 1. Choi I, Rhiu I, Lee Y, Yun MH and Nam CS. A systematic review of hybrid brain-computer interfaces: Taxonomy and usability perspectives. PLoS One. 2017; 12(4): e0176674. pmid:28453547
- 2. Nicolas-Alonso L F and Gomez-Gil J. Brain computer interfaces, a review. Sensors. 2012; 12: 1211–79. pmid:22438708
- 3. Nuyujukian P, Sanabria J A, Saab J, Pandarinath C, Jarosiewicz B, Blab C H, et al. Cortical control of a tablet computer by people with paralysis. PLoS One. 2018; 13(11): e0204566. pmid:30462658
- 4. Gao S, Wang Y, Gao X and Hong B. Visual and auditory brain-computer interfaces. IEEE Trans. Biomed. Eng. 2014; 61: 1435–47.
- 5. Chen X, Chen Z, Gao S and Gao X. A high-ITR SSVEP based BCI speller. Brain-Comput. Interfaces. 2014; 1: 181–91.
- 6. Spüler M. A high-speed brain-computer interface (BCI) using dry EEG electrodes. PLoS One. 2017; 12(2): e0172400.
- 7. Nakanishi M, Wang Y, Wang Y T, Mitsukura Y and Jung T P. A high-speed brain speller using steady-state visual evoked potentials. Int. J. Neural Syst. 2014; 24: 1–18.
- 8. Zhu D, Bieger J, Molina G G and Aarts R M. A survey of stimulation methods used in SSVEP-based BCIs. Comput. Intell. Neurosci. 2010; 1: 702357.
- 9. Vialatte F-B, Maurice M, Dauwels J and Cichocki A. Steady-state visually evoked potentials: focus on essential paradigms and future perspectives. Prog. Neurobiol. 2010; 90: 418–38. pmid:19963032
- 10. Jia C, Gao X, Hong B and Gao S. Frequency and phase mixed coding in SSVEP-based brain-computer interface. IEEE Trans. Biomed. Eng. 2011; 58: 200–6. pmid:20729160
- 11. Chen X, Wang Y, Nakanishi M, Jung T P and Gao X. Hybrid frequency and phase coding for a high-speed SSVEP-based BCI speller. Proc. 36th Ann. Int. IEEE Conf. Engineering in Medicine and Biology Society. 2014; pp 3993–6.
- 12. Chen X, Wang Y, Nakanishi M, Gao X, Jung T-P and Gao S. High-speed spelling with a noninvasive brain-computer interface. Proc. Natl Acad. Sci. 2015; 112: E6058–67. pmid:26483479
- 13. Cheng M, Gao X and Gao S. Design and implementation of a brain-computer interface with high transfer rates. IEEE Trans. Biomed. Eng. 2002; 49: 1181–6. pmid:12374343
- 14. Wang Y, Wang R, Gao X and Gao S. A practical VEP-based brain-computer interface. IEEE Trans. Neural Syst. Rehabil. Eng. 2006; 14: 234–40. pmid:16792302
- 15. Friman O, Volosyak I and Graser A. Multiple channel detection of steady-state visual evoked potentials for brain-computer interfaces. IEEE Trans. Biomed. Eng. 2007; 54: 742–50. pmid:17405382
- 16. Lin Z, Zhang C, Wu W and Gao X. Frequency recognition based on canonical correlation analysis for SSVEP-based BCIs. IEEE Trans. Biomed. Eng. 2006; 53: 2610–4. pmid:17152442
- 17. Zhang Y, Zhou G, Zhao Q, Onishi A, Jin J, Wang X, et al. Multiway canonical correlation analysis for frequency components recognition in SSVEP-based BCIs. Neural Information Processing (ICONIP 2011) (Lect. Notes Comput. Sci.). 2011; 7062: 287–95.
- 18. Zhang Y, Zhou G, Jin J, Wang M, Wang X and Cichocki A. L1-regularized multiway canonical correlation analysis for SSVEP-based BCI. IEEE Trans. Neural Syst. Rehabil. Eng. 2013; 21: 887–96. pmid:24122565
- 19. Zhang Y, Zhou G, Jin J, Wang X and Cichocki A. Frequency recognition in SSVEP-based BCI using multiset canonical correlation analysis. Int. J. Neural Syst. 2014; 24: 1450013. pmid:24694168
- 20. Nakanishi M, Wang Y, Chen X, Wang Y-T, Gao X and Jung T-P. Enhancing detection of SSVEPs for a high-speed brain speller using task-related component analysis. IEEE Trans. Biomed. Eng. 2018; 65: 104–12. pmid:28436836
- 21. Tanaka H, Katura T and Sato H. Task-related component analysis for functional neuroimaging and application to near-infrared spectroscopy data. NeuroImage. 2013; 64: 308–327. pmid:22922468
- 22. Nakanishi M, Wang Y, Wang Y-T and Jung T-P. A comparison study of canonical correlation analysis based methods for detecting steady-state visual evoked potentials. PLoS One. 2015; 10: e0140703. pmid:26479067
- 23. Zerafa R, Camilleri T, Falzon O and Camilleri K. To train or not to train? A survey on training of feature extraction methods for SSVEP-based BCIs. J. Neural Eng. 2018; 15: 051001. pmid:29869996
- 24. Yuan P, Chen X, Wang Y, Gao X and Gao S. Enhancing performances of SSVEP-based brain-computer interfaces via exploiting inter-subject information. J. Neural Eng. 2015; 12: 046006. pmid:26028259
- 25. Wang Y, Chen X, Gao X and Gao S. A benchmark dataset for SSVEP-based brain-computer interfaces. IEEE Trans. Neural Syst. Rehabil. Eng. 2017; 25: 1746–52. pmid:27849543
- 26. Bin G, Lin Z, Gao X, Hong B and Gao S. The SSVEP topographic scalp maps by canonical correlation analysis. 30th Annu. Int. Conf. IEEE Engineering in Medicine and Biology Society. 2008; pp 3759–3762.
- 27. Bin G, Gao X, Yan Z, Hong B and Gao S. An online multi-channel SSVEP-based brain-computer interface using a canonical correlation analysis method. J. Neural Eng. 2009; 6: 046002. pmid:19494422
- 28. Russo F D and Spinelli D. Electrophysiological evidence for an early attentional mechanism in visual processing in humans. Vision Res. 1999; 39: 2975–85. pmid:10664797
- 29. Chen X, Wang Y, Gao S, Jung T-P and Gao X. Filter bank canonical correlation analysis for implementing a high speed SSVEP-based brain-computer interface. J. Neural Eng. 2015; 12: 046008.
- 30. Wang Y, Nakanishi M, Wang Y-T and Jung T-P. Enhancing detection of steady-state visual evoked potentials using individual training data. 36th Annu. Int. Conf. IEEE Engineering in Medicine and Biology Society. 2014; pp 3037–40.
- 31. Fukunaga K. Introduction to statistical pattern recognition. San Diego: Academic Press; 1990.
- 32. Theodoridis S. Introduction to Pattern recognition. Burlington, MA: Academic Press; 2010.
- 33. Weise T. Global Optimization Algorithms—Theory and Application. 2008. Available from: http://www.it-weise.de/projects/book.pdf.
- 34. Ang K K, Chin Z Y, Wang C, Guan C and Zhang H. Filter bank common spatial pattern algorithm on BCI competition IV datasets 2a and 2b. Front. Neurosci. 2012; 6: 1–9.
- 35. Webster E, Habibzadeh H, Norton J, Vaughan T and Soyata T. An Unsupervised Channel-Selection Method for SSVEP-based BCI Systems. Available from: http://www.tolgasoyata.com/file/webster.uemcon18.pdf. 2018.
- 36. Nakanishi M, Wang Y and Jung T-P. Session-to-session transfer in detecting steady-state visual evoked potentials with individual training data. Foundations of Augmented Cognition: Neuroergonomics and Operational Neuroscience. AC 2016 (Lect. Notes Comput. Sci.). 2016; 9743: 253–60.
- 37. Yang C, Han X, Wang Y, Saab R, Gao S and Gao X. A dynamic window recognition algorithm for SSVEP-based brain-computer interfaces using a spatio-temporal equalizer. Int. J. Neural Syst. 2018; 28: 1850028. pmid:30105920
- 38. Jiang J, Yin E, Wang C, Xu M and Ming D. Incorporation of dynamic stopping strategy into the high-speed SSVEP-based BCIs. J. Neural Eng. 2018; 15: 046025. pmid:29774867
- 39. Yao Z, Ma X, Wang Y, Zhang X, Liu M, Pei W, et al. High-speed spelling in virtual reality with sequential hybrid BCIs. IEICE Trans. Inf. Syst. 2018; E101.D: 2859–2862.