A novel framework for classification of two-class motor imagery EEG signals using logistic regression classification algorithm

  • Rabia Avais Khan ,

    Contributed equally to this work with: Rabia Avais Khan, Nasir Rashid, Muhammad Shahzaib

    Roles Methodology, Writing – original draft

    Affiliation Department of Mechatronics Engineering, National University of Sciences & Technology, Islamabad, Pakistan

  • Nasir Rashid ,

    Contributed equally to this work with: Rabia Avais Khan, Nasir Rashid, Muhammad Shahzaib

    Roles Conceptualization, Methodology, Supervision, Validation, Writing – review & editing

    n.rashid@ceme.nust.edu.pk

    Affiliations Department of Mechatronics Engineering, National University of Sciences & Technology, Islamabad, Pakistan, Robot Design and Development Lab, National Centre of Robotics and Automation (NCRA), Punjab, Pakistan

  • Muhammad Shahzaib ,

    Contributed equally to this work with: Rabia Avais Khan, Nasir Rashid, Muhammad Shahzaib

    Roles Software, Validation

    Affiliation Department of Mechatronics Engineering, National University of Sciences & Technology, Islamabad, Pakistan

  • Umar Farooq Malik,

    Roles Data curation

    Affiliation Department of Mechatronics Engineering, National University of Sciences & Technology, Islamabad, Pakistan

  • Arshia Arif,

    Roles Formal analysis

    Affiliation Department of Mechatronics Engineering, National University of Sciences & Technology, Islamabad, Pakistan

  • Javaid Iqbal,

    Roles Conceptualization, Methodology

    Affiliations Department of Mechatronics Engineering, National University of Sciences & Technology, Islamabad, Pakistan, Robot Design and Development Lab, National Centre of Robotics and Automation (NCRA), Punjab, Pakistan

  • Mubasher Saleem,

    Roles Methodology, Software, Writing – review & editing

    Affiliation Department of Mechatronics Engineering, National University of Sciences & Technology, Islamabad, Pakistan

  • Umar Shahbaz Khan,

    Roles Funding acquisition, Visualization

    Affiliations Department of Mechatronics Engineering, National University of Sciences & Technology, Islamabad, Pakistan, Robot Design and Development Lab, National Centre of Robotics and Automation (NCRA), Punjab, Pakistan

  • Mohsin Tiwana

    Roles Conceptualization, Methodology, Supervision, Writing – review & editing

    Affiliations Department of Mechatronics Engineering, National University of Sciences & Technology, Islamabad, Pakistan, Robot Design and Development Lab, National Centre of Robotics and Automation (NCRA), Punjab, Pakistan

Abstract

Robotics and artificial intelligence have played a significant role in developing assistive technologies for people with motor disabilities. A Brain-Computer Interface (BCI) is a communication system that allows humans to communicate with their environment by detecting and quantifying control signals produced from different modalities and translating them into voluntary commands for actuating an external device. For that purpose, classifying brain signals with very high accuracy and minimal error is of profound importance to researchers. In this study, a novel framework is proposed to classify binary-class electroencephalogram (EEG) data. The proposed framework is tested on BCI Competition IV dataset 1 and BCI Competition III dataset 4a. Artifacts are removed from the EEG data through preprocessing, followed by feature extraction to recognize discriminative information in the recorded brain signals. Signal preprocessing involves the application of independent component analysis (ICA) to raw EEG data, followed by common spatial pattern (CSP) and log-variance for extracting useful features. Six different classification algorithms, namely support vector machine, linear discriminant analysis, k-nearest neighbor, naïve Bayes, decision trees, and logistic regression, have been compared to classify the EEG data accurately. The proposed framework achieved the best classification accuracies with the logistic regression classifier for both datasets. An average classification accuracy of 90.42% has been attained on BCI Competition IV dataset 1 for seven different subjects, while for BCI Competition III dataset 4a, an average accuracy of 95.42% has been attained on five subjects. This indicates that the model can be used in real-time BCI systems and can provide strong results for 2-class motor imagery (MI) signal classification. With some modifications, the framework can also be made compatible with multi-class classification in the future.

Introduction

Brain-Computer Interface is a technology that creates a communication channel between the human brain and the external devices by picking up brain signals and translating them into artificial outputs. This system includes collecting data from the human brain, processing it to detect the user’s intent, and then training the system to actuate an external device.

Electroencephalography is a non-invasive technique that records brain signals by recognizing the change in brain wave patterns. The EEG signal is often an amalgamation of many base frequencies known to describe cognitive, affective, or attentional states. These frequencies are grouped into particular ranges or bands. The EEG signal frequency range is 0–100 Hz, which is divided into five bands: delta ‘δ’ (0.5–4 Hz), theta ‘θ’ (4–7 Hz), alpha ‘α’ (8–13 Hz), beta ‘β’ (13–30 Hz), and gamma ‘γ’ (35 Hz and above) [1]. The μ frequency band overlaps with the α frequency band, but the former arises in the sensorimotor cortex while the latter originates in the occipital and posterior regions of the brain [2].

Brain signals can be recorded non-invasively by various techniques such as functional magnetic resonance imaging (fMRI), magnetoencephalography (MEG) and electroencephalography (EEG), etc., as well as invasively through electrocorticography (ECoG) and microelectrode arrays (MEAs) [3]. For motor imagery (MI) data, EEG is mostly preferred due to its non-invasiveness, low cost, portability, less sensitivity to movement, and good temporal resolution [4].

The brain activity due to MI shows amplitude changes in certain frequency bands, also referred to as variations in sensorimotor rhythms. When a voluntary movement is performed, there is a decrease in amplitude, referred to as event-related desynchronizations (ERD), and after the activity is over, there is an increase in amplitude known as event-related synchronizations (ERS) [5]. The ERD and ERS are known as event-related potential (ERP). The MI-related EEG signals originating in the sensorimotor region of the brain are based on μ (8–12 Hz) and β (14–30 Hz) frequency bands [6, 7].

Brain signals are recorded from different brain regions, but directly using the EEG signals from all the channels would increase noise interference and may decrease the classification performance. Common spatial pattern (CSP) [8] is used to separate the appropriate signal characteristics from raw EEG data and represent them in a form interpretable by a human or a computer. Independent component analysis (ICA) [9] is a common approach for artifact removal. For identifying human brain activity patterns and translating them into commands, classification of EEG data is required. Various classification techniques such as support vector machine (SVM) [10], k-nearest neighbor (k-NN) [11], linear discriminant analysis (LDA) [12], naïve Bayes [13], decision trees [14], and logistic regression [15] are widely used.

This research is mainly motivated by the aim of presenting a balanced and optimized framework that classifies MI signals with higher accuracy without compromising execution time, which is a key factor for the successful implementation of any framework on real-time BCI devices. It combines pre-processing techniques that effectively remove noise and unwanted signals from the data with feature extraction methods that extract an optimum number of features without making the system complex. It thus contributes to the literature a combination of pre-processing and feature extraction techniques that improves classification with reduced complexity.

To improve the classification accuracy of binary-class EEG data, this research puts forward a new framework that uses a channel selection technique, a combination of a Butterworth bandpass filter and ICA for pre-processing, and CSP and log-variance for feature extraction, along with different classification techniques: SVM, LDA, naïve Bayes, decision trees, k-NN, and logistic regression. Comparing these classifiers allowed us to choose the most suitable one and to obtain a pronounced improvement in the average classification accuracy on both datasets compared with the approaches proposed earlier.

The remainder of the paper is structured as follows: Section II reviews the previous work done on the chosen datasets; Section III describes the EEG data paradigm for both datasets; Section IV describes the proposed method, covering pre-processing, feature extraction, and classification; Section V presents the results; Section VI discusses the findings and the key factors that contributed to them; Section VII presents the conclusion; and Section VIII contains the acknowledgement.

Literature review

The proposed framework is tested on BCI Competition IV dataset 1 and BCI Competition III dataset 4a. Researchers have previously applied various techniques to these two datasets and reported commendable accuracies. Miao, Yangyang, et al. used regularized CSP (RCSP) for feature extraction and an AdaBoost classifier, attaining an average classification accuracy of 78.4% on BCI Competition IV dataset 1 [16]. On the same dataset, Qian, L., et al. used CSP to compute spatial filters, the Stockwell transform to obtain time-frequency information, and a CNN for classification, attaining an average accuracy of 81.22% with the rectified linear unit (ReLU) activation function and 81.34% with the exponential linear unit (ELU) activation function [17]. Park, Y., et al. employed local region frequency optimized CSP (LRFCSP) with SVM and obtained an average classification accuracy of 84.7% [18]. Fu, Rongrong, et al. combined RCSP with RDA and achieved a maximum average accuracy of 87.21% [19]. More recently, Zhang, et al. used CSP-Wavelet+LOG with FLDA as the classifier and achieved an average accuracy of 88.86% [20].

On BCI Competition III dataset 4a, Arabshahi, R., et al. employed CNN and Stacked Auto Encoders and attained an average classification accuracy of 82.0% [21]. Miao, Y., et al. used CTFSP to extract sparse CSP features and SVM with an RBF kernel to identify the MI task, attaining an average accuracy of 85.0% on the same dataset [22]. Chen, S., et al. used CSP and SVM and achieved an average classification accuracy of 86.86% [23], and Kirar, S. K., et al. attained an average classification accuracy of 90.58% by using graph theory and a Quantum Genetic Algorithm followed by CSP and SVM [24]. Wijaya, A., et al. suggested a feature extraction technique based on logistic regression and two-stage detection (TSD) in the channel instantiation approach and attained an accuracy of 95.21% on this dataset [25]. There is still room for improvement in the accuracies on both datasets, and we propose a new methodology to optimize them.

As seen above, the previous research can be summarized into two main categories. The first group comprises methods that require the initial raw data to be very precise, with minimal noise. The second group comprises complex frameworks, most of which involve two- or three-stage data pre-processing and classification; this increases the computational complexity, which is not feasible for real-time applications.

There is therefore a need for a simple yet effective framework that can provide good results without increasing the complexity of the model.

EEG data description

Two public MI datasets have been used in this study to evaluate the efficacy of the proposed framework.

Dataset 1

The first dataset used in this work is dataset 1 from BCI Competition IV, provided by the Berlin BCI group [26]. The data were recorded from 59 channels for seven healthy subjects, namely a, b, c, d, e, f, and g. The sampling frequency of this dataset is 100 Hz. Three classes of motor imagery tasks are involved (left hand, right hand, and foot), of which two were performed by each subject. Visual cues in the form of arrows were shown on the screen for 4 s, during which the subject performed the cued mental task. These periods were interleaved with 2 s of fixation cross and 2 s of blank screen. The fixation cross was superimposed on the cues, so it was displayed for 6 s in total. Each trial is therefore 4 s long and is followed by a rest period of 4 s. Every subject has a total of 200 trials, 100 for each of the two classes selected for that subject, and each trial contains 400 samples (4 s × 100 Hz). The data paradigm is shown in Fig 1.

Dataset 2

The second dataset used in this work is dataset 4a of BCI Competition III. It comprises binary-class MI EEG data collected from 5 healthy subjects, namely aa, av, al, ay, and aw, using 118 EEG channels positioned according to the extended international 10–20 electrode system. Data of 280 trials have been collected for each subject. During each trial, the subject was shown a 3.5 s visual cue indicating one of three motor imagery tasks: left hand, right hand, or right foot. However, only the cues for the classes ’right hand’ and ’right foot’ have been provided for the competition. The signals have been bandpass filtered between 0.05 and 200 Hz, digitized at 1000 Hz, and then down-sampled to 100 Hz. The data paradigm is shown in Fig 2.

Materials and methods

Fig 3 illustrates the flow of the proposed method. It includes data preprocessing, feature extraction, and classification, which are discussed in detail in the following sections.

Data preprocessing

Data preprocessing involves the filtering of raw EEG data into a suitable format for further analysis. The EEG signals acquired from the human scalp are not generally an accurate reflection of the actual signals originating from the brain as they contain a great deal of noise and artifacts. So for the separation of required signals, preprocessing techniques are employed.

Filtering

After removing the outermost channels of the headset, a Butterworth bandpass filter is used to restrict the EEG signals to the 8–15 Hz frequency band, which removes interference from EOG and EMG sources [27]. This frequency band exhibits the strongest indication of motor imagery.

Independent Component Analysis (ICA)

For removing biological artifacts from data, ICA is widely used [28]. ICA enables effective source estimation, and thus plays a very important role in extracting useful information from raw EEG data [29]. ICA is a statistical technique that assumes non-Gaussian signal distribution [30] to separate a mixture of unknown signals into statistically independent components depending on the characteristics of the data.

The ICA generative model [31] is given by (1).

$$Y = AS \tag{1}$$

where Y is the data matrix [a×b], in which a denotes the number of samples and b the number of measured variables, and A is the mixing matrix that linearly combines the source activities, S, to construct the input data matrix Y.

The ICA source activities, S, also referred to as independent components, can be evaluated by taking the product of the input data matrix Y with the matrix W, the inverse of the matrix A [31], as shown in (2).

$$S = WY, \qquad W = A^{-1} \tag{2}$$

If the source signals are independent and have non-Gaussian distributions, ICA generates more accurate results [32].
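The generative model in (1) and the unmixing step in (2) can be illustrated with a small numerical sketch. It is not the authors' MATLAB pipeline: the two sources, the mixing matrix `A`, and the choice of which component counts as an artifact are all hypothetical, and the signals are stored as rows (one row per source or channel) for convenience. In practice W is not known and has to be estimated by an ICA algorithm; here it is simply taken as the inverse of the assumed `A`.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 1000

# Two hypothetical non-Gaussian sources (one row per source).
S = np.vstack([
    np.sign(np.sin(np.linspace(0, 40, n_samples))),  # "brain-like" square wave
    rng.uniform(-1.0, 1.0, n_samples),               # "artifact-like" noise
])

A = np.array([[1.0, 0.5],
              [0.3, 1.0]])      # assumed mixing matrix (illustrative values)

Y = A @ S                       # Eq. (1): observed mixtures
W = np.linalg.inv(A)            # ideal unmixing matrix, W = A^-1
S_hat = W @ Y                   # Eq. (2): recovered independent components

# Artifact removal: zero the component judged to be an artifact
# (index 1 here, by assumption) and project back to channel space.
S_clean = S_hat.copy()
S_clean[1, :] = 0.0
Y_clean = A @ S_clean

print(np.allclose(S_hat, S))
```

The last two steps mirror how ICA is used for cleaning: components are inspected, the artefactual ones are discarded, and the remaining ones are mixed back into channel space.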

Time window segmentation

The time window chosen to extract the data of each trial for both dataset 1 and dataset 2 is 0.5 s to 2.5 s. This time window has been used by previous researchers [33, 34] on the same datasets, as it provided better classification accuracy than other time windows.
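As a rough illustration of this segmentation step, the sketch below converts the 0.5 s to 2.5 s window into sample indices at the 100 Hz sampling rate of both datasets and crops a trial. The function names and the 3-channel dummy trial are hypothetical, and the trial is assumed to start at cue onset.

```python
def window_indices(start_s, stop_s, fs_hz):
    """Convert a time window in seconds to a half-open sample range."""
    start = int(round(start_s * fs_hz))
    stop = int(round(stop_s * fs_hz))
    return start, stop


def crop_trial(trial, start_s, stop_s, fs_hz):
    """Crop every channel of a trial (list of per-channel sample lists)."""
    start, stop = window_indices(start_s, stop_s, fs_hz)
    return [channel[start:stop] for channel in trial]


FS_HZ = 100  # sampling rate of both datasets, as stated in the text

# Hypothetical 3-channel, 4 s trial (400 samples per channel).
trial = [list(range(400)) for _ in range(3)]
segment = crop_trial(trial, 0.5, 2.5, FS_HZ)

print(len(segment), len(segment[0]))  # 3 200
```

At 100 Hz the 2 s window therefore keeps 200 of the 400 samples in each trial.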

Feature extraction

Common Spatial Pattern (CSP)

CSP is a powerful feature extraction technique commonly used for binary classification problems [35]. CSP can effectively extract the features of ERS and ERD in the motor imagery signals, so it has been widely used in BCI systems [36]. CSP maximizes the variance of spatially filtered signals for one class and minimizes it for the other to distinguish the features of both the classes, thus separating the classes by their variances.

The normalized spatial covariance matrices Ca and Cb are first calculated for every trial of the binary-class dataset and then averaged over all the trials of each class, giving the averaged normalized covariance matrices $\bar{C}_a$ and $\bar{C}_b$. A composite covariance matrix, C, is then obtained by adding the two averaged matrices and is factorized by eigendecomposition [37], as given in (3).

$$C = \bar{C}_a + \bar{C}_b = U_0 \Sigma U_0^{T} \tag{3}$$

where U0 represents the eigenvectors and Σ denotes the diagonal matrix of the corresponding eigenvalues.

A whitening transformation matrix, P, is obtained from the eigenvalues and the eigenvectors of this composite covariance matrix, C [37], using (4).

$$P = \Sigma^{-1/2} U_0^{T} \tag{4}$$

The matrix P transforms the averaged normalized covariance matrices $\bar{C}_a$ and $\bar{C}_b$ into another space [37], as shown in (5) and (6). The transformed matrices Sa and Sb share common eigenvectors, and the sum of their corresponding eigenvalues is always equal to 1 [37], as given by (7), where Σa and Σb are the diagonal eigenvalue matrices of Sa and Sb.

$$S_a = P \bar{C}_a P^{T} \tag{5}$$

$$S_b = P \bar{C}_b P^{T} \tag{6}$$

$$S_a = Q \Sigma_a Q^{T}, \qquad S_b = Q \Sigma_b Q^{T}, \qquad \Sigma_a + \Sigma_b = I \tag{7}$$

The common eigenvector matrix, Q, together with the whitening transformation matrix, P, forms the spatial filter (projection) matrix, W [37], as given in (8).

$$W = Q^{T} P \tag{8}$$

If the data has n channels, then the matrix W has [n x n] dimensions, i.e. it contains n spatial filters that each assign a weight to every channel. The filters associated with the largest and smallest eigenvalues give the largest difference in variance between the two classes, so retaining only these filters discards surplus and repetitive data and reduces the dimensions of the data.

The use of two or three CSP filters from both ends of the eigenvectors is commonly recommended [36]. In this study, m = 3 has been chosen, i.e. six spatial filters have been used: the first and the last three spatial filters of the projection matrix have been selected, as they give the maximum variance difference between the two classes. The spatial filter matrix is then multiplied with the original data of each trial of both classes, Xa and Xb, to obtain the desired signals in the new subspace [37], as shown in (9) and (10).

$$Z_a = W X_a \tag{9}$$

$$Z_b = W X_b \tag{10}$$
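The CSP steps described above can be sketched compactly with NumPy. This is a minimal illustration under stated assumptions rather than the authors' MATLAB implementation: trials are assumed to be stored as channels × samples arrays, the spatial filters are taken as the rows of W, the function names are hypothetical, and the synthetic data (8 channels, 20 trials per class) only serves to make the example runnable.

```python
import numpy as np


def normalized_cov(trial):
    """Normalized spatial covariance of one trial (channels x samples)."""
    c = trial @ trial.T
    return c / np.trace(c)


def csp_filters(trials_a, trials_b, m=3):
    """Return 2*m CSP spatial filters (as rows) and the class-a eigenvalues."""
    ca = np.mean([normalized_cov(t) for t in trials_a], axis=0)
    cb = np.mean([normalized_cov(t) for t in trials_b], axis=0)
    # Composite covariance and its eigendecomposition, cf. (3).
    evals, u0 = np.linalg.eigh(ca + cb)
    # Whitening transformation, cf. (4).
    p = np.diag(evals ** -0.5) @ u0.T
    # Whitened class-a covariance; its eigenvectors are shared with class b.
    sa = p @ ca @ p.T
    lam_a, q = np.linalg.eigh(sa)          # eigenvalues in ascending order
    # Full projection matrix, cf. (8).
    w = q.T @ p
    # Keep the first and last m filters (smallest and largest eigenvalues).
    keep = np.r_[0:m, w.shape[0] - m:w.shape[0]]
    return w[keep], lam_a


# Synthetic two-class data: class a is strong on channel 0, class b on channel 1.
rng = np.random.default_rng(0)
n_ch, n_samp = 8, 200
scale_a = np.ones(n_ch)
scale_a[0] = 5.0
scale_b = np.ones(n_ch)
scale_b[1] = 5.0
trials_a = [scale_a[:, None] * rng.standard_normal((n_ch, n_samp)) for _ in range(20)]
trials_b = [scale_b[:, None] * rng.standard_normal((n_ch, n_samp)) for _ in range(20)]

w, lam_a = csp_filters(trials_a, trials_b, m=3)
z_a = w @ trials_a[0]                      # cf. (9): filtered class-a trial
print(w.shape, z_a.shape)                  # (6, 8) (6, 200)
```

Because the whitened class covariances sum to the identity, every eigenvalue in `lam_a` lies between 0 and 1; a filter whose eigenvalue is close to 1 yields high variance for class a and low variance for class b, and vice versa.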

Log-variance

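Fig 4 shows the visual representation of steps involved in feature extraction of preprocessed data. After applying CSP, a feature vector is obtained by calculating the variance of each spatially filtered signal for the trials of both classes. The logarithm of the obtained variances is then computed as given in (11).

$$f_p = \log\left(\operatorname{var}(Z_p)\right) \tag{11}$$

where Zp denotes the p-th spatially filtered signal of a trial.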
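A minimal sketch of the log-variance step is shown below, using only the Python standard library. It assumes that each CSP-filtered trial is stored as one list of samples per spatial filter and that the plain (unnormalized) variance is used; some CSP implementations additionally divide each variance by the sum of variances across filters before taking the logarithm, which is not shown here. The function name and sample values are hypothetical.

```python
import math
from statistics import pvariance


def log_variance_features(projected_trial):
    """Return one log-variance feature per CSP-filtered signal.

    projected_trial: list of sample lists, one per spatial filter.
    """
    return [math.log(pvariance(row)) for row in projected_trial]


# Two hypothetical filtered signals with variances 1.0 and 4.0.
z = [
    [1.0, -1.0, 1.0, -1.0],
    [2.0, -2.0, 2.0, -2.0],
]
features = log_variance_features(z)
print(features)
```

With six spatial filters, as chosen in this study, each trial is reduced to a six-element feature vector.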
Fig 4. Visual representation of steps involved in feature extraction of preprocessed Data.

https://doi.org/10.1371/journal.pone.0276133.g004

Once data has undergone the process of feature extraction, it is ready to enter the stage of classification.

All the analysis during this research was done in MATLAB and some classifiers were implemented through the Classification Learner app of MATLAB.

Classification

Support Vector Machine (SVM)

SVM is a robust classifier widely used for both binary [38] and multi-class [39] classification problems. The major motivation for employing SVM for EEG data classification is to address the objective of good generalization by optimizing the performance of the machine while reducing the computational complexity of the learned model, simultaneously [40].

SVM maps the input vector into a high-dimensional feature space using a non-linear mapping and constructs an optimal separating hyperplane in that space [41]. The optimal hyperplane maximizes the distance between itself and the points closest to it, in order to distinguish data points belonging to different classes.

For binary classification problems, the classifier function f(a) of a training data set {(a1, b1), (a2, b2), ……., (an, bn)} is obtained by solving the associated Lagrangian optimization problem and is of the form given by Eq (12):

$$f(a) = \operatorname{sign}\left(\sum_{i=1}^{n} \omega_i\, b_i\, k(a, a_i) + c\right) \tag{12}$$

Here k(a, ai) is a kernel function denoting the dot product between the two entities in the feature space, ωi is the Lagrange multiplier, and c is the bias term.

Gaussian Radial Basis Function (RBF) is the kernel function used in this research for applying SVM.
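To make the role of the RBF kernel concrete, the following sketch evaluates a kernel-based decision function of the standard SVM form for a new point. It does not train an SVM: the support vectors, labels, multipliers, bias, and `gamma` value are hypothetical numbers chosen for illustration, whereas in practice they come from solving the SVM optimization problem.

```python
import math


def rbf_kernel(a, a_i, gamma=1.0):
    """Gaussian RBF kernel: exp(-gamma * ||a - a_i||^2)."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, a_i))
    return math.exp(-gamma * sq_dist)


def svm_decision(a, support_vectors, labels, multipliers, bias=0.0, gamma=1.0):
    """Sign of the weighted kernel sum over the support vectors."""
    score = sum(
        w * b * rbf_kernel(a, sv, gamma)
        for sv, b, w in zip(support_vectors, labels, multipliers)
    )
    return 1 if score + bias >= 0 else -1


# Hypothetical trained model with one support vector per class.
support_vectors = [(0.0, 0.0), (2.0, 2.0)]
labels = [-1, 1]
multipliers = [1.0, 1.0]

print(svm_decision((1.8, 1.9), support_vectors, labels, multipliers))
print(svm_decision((0.1, 0.2), support_vectors, labels, multipliers))
```

A point is assigned to the class whose support vectors it lies closest to, because the RBF kernel decays with squared distance.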

Logistic regression

Logistic regression is widely used as a classifier for problems where two or more distinct classes or outcomes are to be classified. Logistic regression is very easy to realize and achieves very good performance with linearly separable classes [42]. It finds a wide range of applications in BCI systems where the existence of a feature is predicted based on the set of predictor variables. This algorithm is best suited for models with dichotomous dependent variables [43].

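The logistic regression model [44] can be stated as given by (13):

$$\operatorname{logit}(P) = \ln\left(\frac{P}{1-P}\right) = n_0 + n_1 x_1 + n_2 x_2 + \dots + n_n x_n \tag{13}$$

where P is the probability of the dichotomous dependent variable, related to the predictor variables as given by (14) [44]:

$$P = \frac{1}{1 + e^{-(n_0 + n_1 x_1 + n_2 x_2 + \dots + n_n x_n)}} \tag{14}$$

Here, x1, x2,…, xn are the independent or predictor variables, n0 is the intercept, and n1, n2, ….., nn are the coefficients associated with the predictor variables.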
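The logistic model can be sketched in a few lines of plain Python. The example below is an illustrative gradient-descent fit, not the solver used in the study (the authors worked in MATLAB); the learning rate, epoch count, and the six two-dimensional samples standing in for log-variance features are assumptions made only for this sketch.

```python
import math


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def predict_proba(x, coeffs):
    """coeffs[0] is the intercept n0; coeffs[1:] pair with the features."""
    return sigmoid(coeffs[0] + sum(n * xi for n, xi in zip(coeffs[1:], x)))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit the coefficients by batch gradient descent on the log-loss."""
    coeffs = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(coeffs)
        for x, target in zip(X, y):
            err = predict_proba(x, coeffs) - target
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        coeffs = [c - lr * g / len(X) for c, g in zip(coeffs, grad)]
    return coeffs


# Hypothetical two-feature samples for two classes (0 and 1).
X = [[-2.0, 1.5], [-1.5, 1.0], [-1.0, 2.0], [1.0, -1.5], [1.5, -1.0], [2.0, -2.0]]
y = [0, 0, 0, 1, 1, 1]

coeffs = fit_logistic(X, y)
predictions = [int(predict_proba(x, coeffs) >= 0.5) for x in X]
print(predictions)
```

A sample is assigned to class 1 when its predicted probability P is at least 0.5, which corresponds to a linear decision boundary in the feature space.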

Linear Discriminant Analysis (LDA)

LDA is a robust classifier that is simple to implement and has fewer computational requirements, making it ideal for BCI systems. LDA does not modify the data rather provides the best possible decision boundary for distinguishing the given classes, making it one of the best-suited linear classifiers [45].

LDA uses a hyperplane to separate or characterize the data points belonging to two or more classes by maximizing the distance between their means and minimizing the within-class variances, assuming that the data points are linearly separable and have a normal distribution.

A simple representation of LDA for binary classification problems can be demonstrated through Fisher’s discriminant ratio [46], which finds a projection, ω, to maximize the objective function given in (15):

$$J(\omega) = \frac{(\mu_1 - \mu_2)^2}{S_1 + S_2} \tag{15}$$

Here, μ1 and μ2 are the means of the first and second class after projection onto ω, and their squared difference represents the ‘scatter between the classes’, whereas S1 and S2 are the within-class variances of the projected data and represent the ‘scatter within the class’.
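As a small worked example of Fisher’s discriminant ratio, the sketch below projects two hypothetical classes onto two candidate directions and compares the resulting ratios. It assumes the common form of the criterion (squared difference of projected means divided by the sum of projected within-class variances); the sample values and function names are illustrative.

```python
from statistics import mean, pvariance


def project(samples, w):
    """Project each sample onto the direction w (dot product)."""
    return [sum(wi * xi for wi, xi in zip(w, x)) for x in samples]


def fisher_ratio(proj_1, proj_2):
    """Between-class scatter divided by within-class scatter (1-D)."""
    between = (mean(proj_1) - mean(proj_2)) ** 2
    within = pvariance(proj_1) + pvariance(proj_2)
    return between / within


# Hypothetical classes that differ along the first feature only.
class_1 = [(1.0, 5.0), (1.2, 3.0), (0.8, 4.0)]
class_2 = [(3.0, 4.5), (3.2, 3.5), (2.8, 4.0)]

ratio_x = fisher_ratio(project(class_1, (1.0, 0.0)), project(class_2, (1.0, 0.0)))
ratio_y = fisher_ratio(project(class_1, (0.0, 1.0)), project(class_2, (0.0, 1.0)))
print(ratio_x > ratio_y)
```

LDA selects the direction with the largest ratio, here the first feature axis, because the class means are far apart along it while the spread within each class is small.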

K- Nearest Neighbour (K-NN)

The k-NN algorithm is a potential classifier for large noisy data as it requires less training and testing time and provides good classification accuracy [12].

k-NN is a supervised machine learning algorithm that predicts the class of a new data point based on how closely it matches the neighboring points in the training set. It works on the principle of ‘feature similarity’. First, a value for the number of nearest neighbors, k, is chosen. The distance between each point in the test data and each row of the training data is then estimated using one of several metrics, such as the Euclidean, Mahalanobis, Minkowski, or Manhattan distance [47]. The distances are sorted in ascending order and the top k rows are chosen. The most frequent class among these rows is assigned to the test point.
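The procedure just described can be sketched with the standard library alone. The example assumes the Euclidean distance and k = 3; the training points and class labels are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Distance from the query to every training row, sorted ascending.
    dists = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    # Majority class among the k closest rows.
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


# Hypothetical 2-D feature vectors for two motor imagery classes.
train_X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
train_y = ["right hand", "right hand", "right hand", "foot", "foot", "foot"]

print(knn_predict(train_X, train_y, (0.5, 0.5)))
print(knn_predict(train_X, train_y, (5.5, 5.2)))
```

No model is fitted in advance; all the work happens at prediction time, when the distances to the stored training rows are computed.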

Naïve Bayes

Naïve Bayes is a widely used classifier in BCI systems for detecting binary-class MI tasks as it provides a flexible approach for dealing with any number of classes and is one of the fastest learning algorithms for simultaneously analyzing all its training input [48].

Naïve Bayes is a classification technique that assumes independence among the various features of a class. The naïve Bayes algorithm evaluates a frequency table for the training dataset and then generates a likelihood table by estimating the various probabilities. The posterior probability P(y|z) of the target class y given the predictor z is evaluated using Bayes' theorem [49] as given in (16):

$$P(y \mid z) = \frac{P(z \mid y)\, P(y)}{P(z)} \tag{16}$$

where P(z|y) represents the likelihood, P(y) represents the prior probability of the class, and P(z) represents the prior probability of the predictor. The maximum posterior probability among the probabilities of the various classes is the classifier output [50].
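The following sketch shows how the maximum-posterior rule can be applied with Gaussian likelihoods, the variant named in the Results section. It is an illustrative standard-library implementation, not the MATLAB one used in the study: the function names, the small variance floor of 1e-9, and the sample data are assumptions. The evidence term P(z) is omitted because it is the same for every class and does not change which class has the largest posterior.

```python
import math
from statistics import mean, pvariance


def fit_gnb(X, y):
    """Estimate the class prior and per-feature mean/variance for each class."""
    model = {}
    for label in set(y):
        rows = [x for x, target in zip(X, y) if target == label]
        cols = list(zip(*rows))
        model[label] = {
            "prior": len(rows) / len(X),
            "means": [mean(c) for c in cols],
            "vars": [pvariance(c) + 1e-9 for c in cols],  # floor avoids zero variance
        }
    return model


def log_posterior(x, params):
    """Unnormalized log posterior: log prior + sum of Gaussian log-likelihoods."""
    lp = math.log(params["prior"])
    for xi, mu, var in zip(x, params["means"], params["vars"]):
        lp += -0.5 * math.log(2 * math.pi * var) - (xi - mu) ** 2 / (2 * var)
    return lp


def predict_gnb(model, x):
    """Return the class with the maximum posterior probability."""
    return max(model, key=lambda label: log_posterior(x, model[label]))


# Hypothetical two-feature training data for classes 0 and 1.
X = [(1.0, 2.0), (1.2, 1.8), (0.8, 2.2), (3.0, 0.5), (3.2, 0.4), (2.8, 0.6)]
y = [0, 0, 0, 1, 1, 1]

model = fit_gnb(X, y)
print(predict_gnb(model, (1.1, 2.0)), predict_gnb(model, (3.1, 0.5)))
```

Working with log probabilities avoids numerical underflow when the likelihoods of many features are multiplied together.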

Decision tree

A decision tree is a supervised learning algorithm used for classification problems. One of the major advantages of the decision tree classifier is its ability to use multiple feature subsets and decision rules at different stages of classification [51].

A classification and regression tree (CART) is the most successful method for constructing a Decision Tree [52]. This approach is non-parametric and selects the best attributes for each node of the tree using information gain (IG), Gini diversity index (GDI), and gain ratio [53]. The information gain splits the dataset into segments based on an attribute and assesses how much information a feature provides about a class by evaluating changes in entropy. Based on the value of information gain, the nodes are split and a decision tree is built [54]. The information gain is given by (17).

(17)

Here, E represents the entropy, S represents the total number of samples, WA represents the weighted average, and F denotes the number of features.
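A worked example may help to make the information gain criterion concrete. The sketch below uses the standard definition, the entropy of the parent node minus the sample-weighted average entropy of the child nodes, for a binary split on a single feature at a threshold. The function names, labels, feature values, and thresholds are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(labels, feature_values, threshold):
    """Entropy of the parent minus the weighted average entropy of the children."""
    left = [l for l, v in zip(labels, feature_values) if v <= threshold]
    right = [l for l, v in zip(labels, feature_values) if v > threshold]
    n = len(labels)
    weighted = sum(len(part) / n * entropy(part) for part in (left, right) if part)
    return entropy(labels) - weighted


labels = [0, 0, 1, 1]
feature = [0.1, 0.2, 0.8, 0.9]

gain_good = information_gain(labels, feature, 0.5)    # separates the classes
gain_poor = information_gain(labels, feature, 0.15)   # leaves a mixed child
print(gain_good > gain_poor)
```

A tree-building algorithm evaluates candidate splits in this way and keeps the one with the highest gain at each node.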

The hyper-parameter settings of the proposed framework are obtained through Bayesian optimization. The optimized hyper-parameters for subject ‘a’ of Dataset 1 are shown below in Table 1.

Table 1. Optimized hyper-parameters from subject ‘a’ of dataset 1.

https://doi.org/10.1371/journal.pone.0276133.t001

Results

In this study, a novel framework for the accurate classification of two-class motor imagery EEG signals has been proposed. To evaluate the efficacy of the proposed framework, it has been tested on two publicly available datasets, i.e., BCI Competition IV dataset 1 and BCI Competition III dataset 4a. Different preprocessing methods, feature extraction, and classification techniques have been employed to attain better classification accuracies on both datasets as compared to the accuracies already achieved in literature.

The proposed method employs the filtering of EEG data for each trial using a Butterworth bandpass filter within a frequency range of 8–15 Hz [27]. After attaining the required frequency band, ICA has been applied to extract the useful signals from the EEG data. Furthermore, CSP and log-variance are used for feature extraction. Six different classifiers, i.e., fine Gaussian SVM, fine k-NN, LDA, Gaussian naive Bayes, complex tree, and logistic regression are used to select the best one for classifying the EEG data most accurately. A 10-fold cross-validation was performed and logistic regression outperformed all the other classifiers by attaining an average classification accuracy of 90.42%, followed by LDA with 89.78%, Gaussian naive Bayes with 89.20%, fine Gaussian SVM with 88.14%, fine k-NN with 86.21%, and complex tree with 85.57%, respectively on dataset 1, as demonstrated in Table 2. Logistic regression surpassed other classifiers in terms of achieving the highest accuracies because it is less inclined to overfitting and it is easy to regularize. It doesn’t need tuning and outputs well-calibrated predicted probabilities.
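The 10-fold cross-validation procedure can be sketched as follows. This is a generic illustration rather than the MATLAB Classification Learner implementation: the shuffling seed, the helper names, the one-dimensional toy features, and the simple threshold classifier are assumptions. The 200 samples match the number of trials per subject in dataset 1.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test


def cross_val_accuracy(X, y, fit, predict, k=10):
    """Average test accuracy over the k folds."""
    accuracies = []
    for train, test in k_fold_indices(len(X), k):
        model = fit([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(model, X[i]) == y[i] for i in test)
        accuracies.append(correct / len(test))
    return sum(accuracies) / len(accuracies)


# Toy classifier: threshold halfway between the two class means.
def fit_threshold(X, y):
    m0 = sum(x for x, t in zip(X, y) if t == 0) / y.count(0)
    m1 = sum(x for x, t in zip(X, y) if t == 1) / y.count(1)
    return (m0 + m1) / 2


def predict_threshold(threshold, x):
    return int(x > threshold)


# 200 hypothetical trials, 100 per class, with well-separated 1-D features.
X = [i / 100 for i in range(100)] + [2 + i / 100 for i in range(100)]
y = [0] * 100 + [1] * 100

print(cross_val_accuracy(X, y, fit_threshold, predict_threshold))
```

Each trial is used for testing exactly once and for training nine times, so the reported accuracy is an average over ten models trained on 180 trials each.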

Table 2. BCI competition IV dataset 1 classification accuracy (%) with different classifiers.

https://doi.org/10.1371/journal.pone.0276133.t002

To validate the effectiveness of the proposed method, its performance has been analyzed and compared on dataset 2 as well. Out of all the six classifiers, logistic regression has better performance with the highest classification accuracy among all subjects, with a mean classification accuracy of 95.42%. LDA was second with 95.06%, followed by Gaussian naïve Bayes, fine k-NN, complex tree, and SVM with 94.44%, 92.72%, 90.66%, and 87.78%, respectively. To better comprehend the overall classification using the proposed method on dataset 2, Table 3 shows the individual classification accuracies of each of the 5 subjects and the mean classification accuracy of all the subjects with all the six classifiers.

Table 3. BCI competition III dataset 4a classification accuracy (%) with different classifiers.

https://doi.org/10.1371/journal.pone.0276133.t003

Previously, Li, Y, et al. used a cross-correlation technique for feature extraction and logistic regression to classify BCI Competition III dataset 4a and attained an average classification accuracy of 90.29% [55]. In another study, Li, Y, et al. improved their method by developing a modified version of the cross-correlation-based logistic regression algorithm and attained an average classification accuracy of 93.91% on the same dataset [56]. The preprocessing of raw data is very important before it is fed into the feature extraction phase. It filters the data from noise and artifacts and results in the improvement of the classification accuracy. In our work, ICA has been employed to clean the data from artefactual components, which has a significant contribution in obtaining higher average classification accuracy i.e., 95.42% on the same dataset compared to methods proposed earlier.

To evaluate the performance of various classifiers on dataset 1 and dataset 2, two histograms have been computed as shown in Figs 5 and 6, respectively.

Fig 5. Comparison of classification accuracies of dataset 1 with different classifiers.

https://doi.org/10.1371/journal.pone.0276133.g005

Fig 6. Comparison of classification accuracies of dataset 2 with different classifiers.

https://doi.org/10.1371/journal.pone.0276133.g006

The classification results of the two datasets suggest that using a bandpass filter and ICA for preprocessing, along with CSP, log-variance, and logistic regression, gives better classification accuracies. Average accuracy that has been attained on dataset 1 is 90.42%, while on dataset 2 is 95.42%, when logistic regression has been used for classification. Commendable accuracy for both datasets has been achieved as ICA removes the ocular artifacts from the data, which is the most considerable noise in EEG. A combination of CSP and log-variance has proved to be very effective. Moreover, logistic regression has less computational complexity, and is very efficient to train for linearly separable data.

Tables 4 and 5 list the classification accuracies of all the methods previously employed on dataset 1 and dataset 2, respectively, along with the classification accuracies attained by our proposed method.

Table 4. BCI competition IV dataset 1 classification accuracy (%) of dataset 1 with proposed method compared with other methodologies.

https://doi.org/10.1371/journal.pone.0276133.t004

Table 5. BCI competition III dataset 4a classification accuracy (%) of dataset 2 with proposed method compared with other methodologies.

https://doi.org/10.1371/journal.pone.0276133.t005

Discussion

In this research, binary-class MI EEG signals have been classified using a novel framework with a logistic regression classification algorithm. The method employs a bandpass filter and ICA for preprocessing, CSP and log-variance for feature extraction, and six different classifiers, from which the best one is selected for effective classification of EEG signals. The bandpass filter has reduced the noise interference. ICA has revealed the patterns of active brain regions, enabling effective source estimation for accurate identification of MI, and has played a significant role in extracting useful information. CSP and log-variance have helped in discriminating intent patterns and extracting spatial information from the signal. After frequency filtering, time-window segmentation, and feature extraction, the selected features are fed into the classifier. Of the classifiers used, logistic regression has proved to be the simplest and yet a very powerful one for classifying two-class motor imagery EEG signals efficiently.

The proposed method is evaluated on dataset 1 of BCI Competition IV and dataset 4a of BCI Competition III. Both datasets are validated using 10-fold cross-validation. Table 2 shows the classification accuracies of all 7 subjects of dataset 1. The results show that the highest average classification accuracy, 90.42%, is achieved with the logistic regression classifier. Table 3 lists the classification results of individual subjects for dataset 2. For dataset 2 as well, logistic regression surpasses all the other classifiers, with the highest average classification accuracy of 95.42%. Compared with the existing methods for both datasets, the proposed method shows excellent classification performance. Tables 4 and 5 demonstrate that the proposed method improves the average classification accuracy by 1.56% for BCI Competition IV dataset 1 and by 0.21% for BCI Competition III dataset 4a, relative to the accuracies attained by previous researchers.
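The 10-fold cross-validation protocol mentioned above can be sketched as follows. This is an illustrative NumPy implementation under stated assumptions, not the evaluation code used in the paper: the gradient-descent logistic regression, its learning rate and L2 penalty, the unstratified random folds, and the two-dimensional synthetic features are all hypothetical choices.

```python
import numpy as np

def fit_logreg(x, y, lr=0.5, n_iter=500, l2=1e-3):
    """Logistic regression by batch gradient descent (assumed hyperparameters)."""
    xb = np.hstack([x, np.ones((len(x), 1))])  # append a bias column
    w = np.zeros(xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-xb @ w))      # predicted probabilities
        # Gradient of the mean log-loss plus an L2 term (bias included for brevity)
        w -= lr * (xb.T @ (p - y) / len(y) + l2 * w)
    return w

def predict(w, x):
    xb = np.hstack([x, np.ones((len(x), 1))])
    return (xb @ w > 0).astype(int)            # threshold at probability 0.5

def kfold_accuracy(x, y, k=10, seed=0):
    """Mean test accuracy over k random, unstratified folds."""
    order = np.random.default_rng(seed).permutation(len(y))
    folds = np.array_split(order, k)
    accs = []
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        w = fit_logreg(x[train_idx], y[train_idx])
        accs.append(np.mean(predict(w, x[test_idx]) == y[test_idx]))
    return float(np.mean(accs))

# Synthetic two-class features standing in for CSP log-variance features
rng = np.random.default_rng(1)
x = np.vstack([rng.normal(-1.5, 1.0, (100, 2)), rng.normal(1.5, 1.0, (100, 2))])
y = np.r_[np.zeros(100), np.ones(100)]
mean_acc = kfold_accuracy(x, y)
```

Each sample is used for testing exactly once, and the reported figure is the mean of the ten per-fold accuracies. In a real pipeline the CSP filters would also have to be fitted inside each training fold to avoid leaking test information.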

The key advantages of the proposed framework are improved preprocessing, optimized feature extraction, and reduced classification complexity, since using logistic regression as the classifier reduces the execution time. These advantages are discussed below.

EEG signals are a mixture of both cerebral and artefactual sources. Most previous studies apply only a bandpass filter and do not remove other artifacts from the EEG data. In this study, we have focused on removing EOG artifacts, and ICA has been applied for this purpose. ICA has successfully removed ocular artifacts (artifacts related to eye movement) from the meaningful data and contributed to EEG signal enhancement. ICA decomposes the observed signals into independent components (ICs); once the components are extracted from the original signal, the clean signal is reconstructed by discarding the ICs that contain artifacts. Applying ICA to the EEG data after the bandpass filter gives us an edge over the methods proposed in previous studies and is one of the key factors contributing to a better average classification accuracy.
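The reconstruction step described above, in which artifact components are discarded before back-projection, can be sketched in NumPy. This example makes strong simplifying assumptions: the mixing matrix is known rather than estimated by an ICA algorithm, the sources are synthetic, and the index of the artifact component (`artifact_idx`) is given rather than detected. It illustrates only the zero-and-back-project idea, not the authors' ICA procedure.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 1000
t = np.arange(n_samples) / 100.0  # assumed 100 Hz sampling rate

# Three hypothetical sources: one blink-like artifact and two rhythms
blink = np.zeros(n_samples)
blink[200:230] = 50.0
blink[700:730] = 50.0
sources = np.vstack([
    blink,
    np.sin(2 * np.pi * 10 * t),
    np.sin(2 * np.pi * 22 * t + 0.5),
])

# Observed signals on 4 channels are a linear mixture of the sources
mixing = rng.normal(size=(4, 3))
observed = mixing @ sources

# In practice ICA would estimate the unmixing matrix; here the pseudo-inverse
# of the known mixing matrix stands in for that estimate.
unmixing = np.linalg.pinv(mixing)
components = unmixing @ observed

# Zero the component assumed to carry the ocular artifact, then back-project
artifact_idx = [0]
kept = components.copy()
kept[artifact_idx] = 0.0
cleaned = mixing @ kept
```

The cleaned channels contain only the contribution of the retained components, which is the sense in which the signal is "reconstructed" without the artifact.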

Another factor contributing to higher classification accuracy is the use of CSP for feature extraction and logistic regression for classification. The CSP feature extraction technique employs a frequency selection approach to identify the most significant features associated with the motor imagery task. The logistic regression classifier is one of the basic classification algorithms for binary classification problems. It is relatively simple to execute, interpret, and train. Unlike decision trees or support vector machines, it allows models to be easily updated to reflect new data. In low-dimensional space, logistic regression is less prone to overfitting. Moreover, it does not require input features to be scaled, is easy to regularize, does not need tuning, and outputs well-calibrated predicted probabilities. All of these are major factors in its outperforming the other classifiers.

The first proposed method without proper pre-processing of the signals failed to provide any extraordinary results. So, in order to obtain significant results, all the filtering and pre-processing techniques that are a part of the second proposed framework must be implemented on the raw data to remove artifacts and obtain a specific band of frequencies before moving towards the feature extraction and classification stages.

Conclusion

This paper introduces a novel approach for improving the classification accuracy of binary-class EEG data. The importance of ICA for preprocessing raw EEG data and the power of CSP for extracting the appropriate signal characteristics from extraneous data have been highlighted. Moreover, the classification performance of various classifiers has been analyzed and compared. The analysis has shown that logistic regression, with ICA as a preprocessor and CSP and log-variance as the feature extraction techniques, performs better than SVM, LDA, decision trees, naïve Bayes, and k-NN. With logistic regression, an average classification accuracy of 90.42% is achieved on BCI Competition IV dataset 1, and 95.42% on BCI Competition III dataset 4a, which are comparable to the accuracies of the research methodologies proposed earlier on both datasets.

One of the limitations of CSP is that it applies only to binary classification problems. For multi-class problems, some modifications to the basic CSP algorithm are required. Moreover, CSP filters are sensitive to noise and prone to overfitting: if a subject has fewer training samples, the performance of CSP is much lower than for subjects with a larger number of training samples. A concern with the use of ICA for preprocessing is that it suffers from ambiguity in the order and the normalization (scaling) of the extracted components. Furthermore, logistic regression cannot be used for non-linear problems and is prone to overfitting in classification problems where the number of features is greater than the number of observations.

In the future, we aim to optimize the proposed method further by applying new and improved preprocessing techniques and by selecting a more appropriate time window, to make it more feasible for real-time implementation. We also plan to apply improved classification methodologies, develop algorithms for optimized channel selection, and extend the method to multi-class classification.

Supporting information

S1 File. Code file that can be used to generate the results shown in this paper.

https://doi.org/10.1371/journal.pone.0276133.s001

(RAR)

References

  1. Fatehi TA, Suleiman A-BR. Features extraction techniques of EEG signals for BCI application. 2011 [cited 2021 May 31]; Available from: https://www.semanticscholar.org/paper/fd767631927e2c26db7b3cc3ab003ee798121b76
  2. Hobson HM, Bishop DVM. Mu suppression—A good measure of the human mirror neuron system? Cortex. 2016;82:290–310. pmid:27180217
  3. Hwang H-J, Kim S, Choi S, Im C-H. EEG-based brain-computer interfaces: A thorough literature survey. Int J Hum Comput Interact. 2013;29(12):814–26.
  4. Dong S-Y, Kim B-K, Lee S-Y. EEG-based classification of implicit intention during self-relevant sentence reading. IEEE Trans Cybern. 2016;46(11):2535–42. pmid:26441465
  5. Pfurtscheller G, Lopes da Silva FH. Event-related EEG/MEG synchronization and desynchronization: basic principles. Clin Neurophysiol. 1999;110(11):1842–57. pmid:10576479
  6. Cheyne D, Gaetz W, Garnero L, Lachaux J-P, Ducorps A, Schwartz D, et al. Neuromagnetic imaging of cortical oscillations accompanying tactile stimulation. Brain Res Cogn Brain Res. 2003;17(3):599–611. pmid:14561448
  7. Pfurtscheller G, Neuper C. Future prospects of ERD/ERS in the context of brain-computer interface (BCI) developments. Prog Brain Res. 2006;159:433–7. pmid:17071247
  8. Kang H, Nam Y, Choi S. Composite common spatial pattern for subject-to-subject transfer. IEEE Signal Process Lett. 2009;16(8):683–6.
  9. Kachenoura A, Albera L, Senhadji L, Comon P. ICA: a potential tool for BCI systems. IEEE Signal Process Mag. 2008;25(1):57–68.
  10. Mavroforakis ME, Theodoridis S. A geometric approach to support vector machine (SVM) classification. IEEE Trans Neural Netw. 2006;17(3):671–82. pmid:16722171
  11. Md Isa NE, Amir A, Ilyas MZ, Razalli MS. The performance analysis of K-nearest neighbors (K-NN) algorithm for motor imagery classification based on EEG signal. MATEC Web Conf. 2017;140:01024.
  12. Wu S-L, Wu C-W, Pal NR, Chen C-Y, Chen S-A, Lin C-T. Common spatial pattern and linear discriminant analysis for motor imagery classification. In: 2013 IEEE Symposium on Computational Intelligence, Cognitive Algorithms, Mind, and Brain (CCMB). IEEE; 2013.
  13. Rakshit A, Khasnobish A, Tibarewala DN. A Naïve Bayesian approach to lower limb classification from EEG signals. In: 2016 2nd International Conference on Control, Instrumentation, Energy & Communication (CIEC). IEEE; 2016.
  14. Ishfaque A, Awan AJ, Rashid N, Iqbal J. Evaluation of ANN, LDA and Decision trees for EEG based Brain Computer Interface. In: 2013 IEEE 9th International Conference on Emerging Technologies (ICET). IEEE; 2013.
  15. Mishuhina V, Jiang X. Complex common spatial patterns on time-frequency decomposed EEG for brain-computer interface. Pattern Recognit. 2021;115(107918):107918.
  16. Miao Y, Yin F, Zuo C, Wang X, Jin J. Improved RCSP and AdaBoost-based classification for Motor-Imagery BCI. In: 2019 IEEE International Conference on Computational Intelligence and Virtual Environments for Measurement Systems and Applications (CIVEMSA). IEEE; 2019.
  17. Qian L, Feng Z, Hu H, Sun Y. A novel scheme for classification of motor imagery signal using Stockwell transform of CSP and CNN model. In: 2020 IEEE International Conference on Systems, Man, and Cybernetics (SMC). IEEE; 2020.
  18. Park Y, Chung W. Frequency-optimized local region common spatial pattern approach for motor imagery classification. IEEE Trans Neural Syst Rehabil Eng. 2019;27(7):1378–88. pmid:31199263
  19. Fu R, Han M, Tian Y, Shi P. Improvement motor imagery EEG classification based on sparse common spatial pattern and regularized discriminant analysis. J Neurosci Methods. 2020;343(108833):108833. pmid:32619588
  20. Zhang S, Zhu Z, Zhang B, Feng B, Yu T, Li Z. The CSP-based new features plus non-convex log sparse feature selection for motor imagery EEG classification. Sensors (Basel). 2020;20(17):4749. pmid:32842635
  21. Arabshahi R, Rouhani M. A convolutional neural network and stacked autoencoders approach for motor imagery based brain-computer interface. In: 2020 10th International Conference on Computer and Knowledge Engineering (ICCKE). IEEE; 2020.
  22. Miao Y, Jin J, Daly I, Zuo C, Wang X, Cichocki A, et al. Learning common time-frequency-spatial patterns for motor imagery classification. IEEE Trans Neural Syst Rehabil Eng. 2021;29:699–707. pmid:33819158
  23. Chen S, Sun Y, Wang H, Pang Z. Channel selection based similarity measurement for motor imagery classification. In: 2020 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). IEEE; 2020.
  24. Kirar JS, Agrawal RK. A combination of spectral graph theory and quantum genetic algorithm to find relevant set of electrodes for motor imagery classification. Appl Soft Comput. 2020;97(105519):105519.
  25. Wijaya A, Mada UG, Adji T, Setiawan N, Mada UG, Mada UG. Logistic Regression based Feature Selection and Two-Stage Detection for EEG based Motor Imagery Classification. Int J Intell Eng Syst. 2021;14(1):134–46.
  26. Blankertz B, Dornhege G, Krauledat M, Müller K-R, Curio G. The non-invasive Berlin Brain-Computer Interface: fast acquisition of effective performance in untrained subjects. Neuroimage. 2007;37(2):539–50. pmid:17475513
  27. Shin Y, Lee S, Lee J, Lee H-N. Sparse representation-based classification scheme for motor imagery-based brain-computer interface systems. J Neural Eng. 2012;9(5):056002. pmid:22872668
  28. Mahajan R, Morshed BI. Unsupervised eye blink artifact denoising of EEG data with modified multiscale sample entropy, Kurtosis, and wavelet-ICA. IEEE J Biomed Health Inform. 2015;19(1):158–65. pmid:24968340
  29. Qin L, Ding L, He B. Motor imagery classification by means of source analysis for brain-computer interface applications. J Neural Eng. 2004;1(3):135–41. pmid:15876632
  30. Yao F, Coquery J, Lê Cao K-A. Independent Principal Component Analysis for biologically meaningful dimension reduction of large biological data sets. BMC Bioinformatics. 2012;13:24. pmid:22305354
  31. Calabrese B. Data Reduction. In: Encyclopedia of Bioinformatics and Computational Biology. Elsevier; 2019. p. 480–5.
  32. Wen Z, Hou J, Atkin J. A review of electrostatic monitoring technology: The state of the art and future research directions. Prog Aerosp Sci. 2017;94:1–11.
  33. Zhang H, Guan C, Ang KK, Wang C. BCI competition IV—data set I: Learning discriminative patterns for self-paced EEG-based motor imagery detection. Front Neurosci. 2012;6:7. pmid:22347153
  34. Feng JK, Jin J, Daly I, Zhou J, Niu Y, Wang X, et al. An optimized channel selection method based on multifrequency CSP-rank for motor imagery-based BCI system. Comput Intell Neurosci. 2019;2019:8068357. pmid:31214255
  35. Selim S, Tantawi MM, Shedeed HA, Badr A. A CSP\AM-BA-SVM approach for motor imagery BCI system. IEEE Access. 2018;6:49192–208.
  36. Blankertz B, Tomioka R, Lemm S, Kawanabe M, Muller K-R. Optimizing Spatial filters for Robust EEG Single-Trial Analysis. IEEE Signal Process Mag. 2008;25(1):41–56.
  37. Goel P, Joshi R, Sur M, Murthy HA. A common spatial pattern approach for classification of mental counting and motor execution EEG. In: Intelligent Human Computer Interaction. Cham: Springer International Publishing; 2018. p. 26–35.
  38. Arvaneh M, Guan C, Ang KK, Quek HC. Spatially sparsed Common Spatial Pattern to improve BCI performance. In: 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE; 2011.
  39. Dong E, Li C, Li L, Du S, Belkacem AN, Chen C. Classification of multi-class motor imagery with a novel hierarchical SVM algorithm for brain-computer interfaces. Med Biol Eng Comput. 2017;55(10):1809–18. pmid:28238175
  40. Garrett D, Peterson DA, Anderson CW, Thaut MH. Comparison of linear, nonlinear, and feature selection methods for EEG signal classification. IEEE Trans Neural Syst Rehabil Eng. 2003;11(2):141–4. pmid:12899257
  41. Vapnik VN. An overview of statistical learning theory. IEEE Trans Neural Netw. 1999;10(5):988–99. pmid:18252602
  42. Subasi A. Practical machine learning for data analysis using python. San Diego, CA: Academic Press; 2020.
  43. Kurt I, Ture M, Kurum AT. Comparing performances of logistic regression, classification and regression tree, and neural networks for predicting coronary artery disease. Expert Syst Appl. 2008;34(1):366–74.
  44. Roy K, Kar S, Das RN. Understanding the basics of QSAR for applications in pharmaceutical sciences and risk assessment. San Diego, CA: Academic Press; 2015.
  45. Bhatnagar M, Gupta GS, Sinha RK. Linear discriminant analysis classifies the EEG spectral features obtained from three class motor imagination. In: 2018 2nd International Conference on Power, Energy and Environment: Towards Smart Technology (ICEPE). IEEE; 2018.
  46. Wang S, Li D, Song X, Wei Y, Li H. A feature selection method based on improved fisher’s discriminant ratio for text sentiment classification. Expert Syst Appl. 2011;38(7):8696–702.
  47. Bhattacharyya S, Khasnobish A, Konar A, Tibarewala DN, Nagar AK. Performance analysis of left/right hand movement classification from EEG signal by intelligent algorithms. In: 2011 IEEE Symposium on Computational Intelligence, Cognitive Algorithms, Mind, and Brain (CCMB). IEEE; 2011.
  48. Siuly, Wang H, Zhang Y. Detection of motor imagery EEG signals employing Naïve Bayes based learning process. Measurement (Lond). 2016;86:148–58.
  49. Veetil S, Gao Q. Real-time network intrusion detection using Hadoop-based Bayesian classifier. In: Emerging Trends in ICT Security. Elsevier; 2014. p. 281–99.
  50. Machado J, Balbinot A, Schuck A. A study of the Naive Bayes classifier for analyzing imaginary movement EEG signals using the Periodogram as spectral estimator. In: 2013 ISSNIP Biosignals and Biorobotics Conference: Biosignals and Robotics for Better and Safer Living (BRC). IEEE; 2013.
  51. Du C-J, Sun D-W. Object Classification Methods. In: Computer Vision Technology for Food Quality Evaluation. Elsevier; 2008. p. 81–107.
  52. Breiman L, Friedman J, Stone CJ, Olshen RA. Classification and Regression Trees. 1st ed. Philadelphia, PA: Chapman & Hall/CRC; 1984.
  53. Rashid M, Bari BS, Hasan MJ, Razman MAM, Musa RM, Ab Nasir AF, et al. The classification of motor imagery response: an accuracy enhancement through the ensemble of random subspace k-NN. PeerJ Comput Sci. 2021;7(e374):e374. pmid:33817022
  54. Haq AU, Zhang D, Peng H, Rahman SU. Combining multiple feature-ranking techniques and clustering of variables for feature selection. IEEE Access. 2019;7:151482–92.
  55. Siuly, Li Y, Wu J, Yang J. Developing a logistic regression model with cross-correlation for motor imagery signal recognition. In: The 2011 IEEE/ICME International Conference on Complex Medical Engineering. IEEE; 2011.
  56. Siuly, Li Y, Paul Wen P. Modified CC-LR algorithm with three diverse feature sets for motor imagery tasks classification in EEG based brain-computer interface. Comput Methods Programs Biomed. 2014;113(3):767–80. pmid:24440135