Questioning the evidence for BCI-based communication in the complete locked-in state

When Birbaumer and colleagues [1] showed for the first time in 1999 that a person in a locked-in state can use a brain–computer interface (BCI) to communicate, this also raised the hope that BCIs could restore communication in the complete locked-in state (CLIS), in which a patient has no remaining muscle control. Since this pioneering work, multiple electroencephalography (EEG)-based BCI systems have been successfully tested with locked-in patients [2]. However, these systems did not work for completely locked-in patients, leading to the conclusion that voluntary brain regulation is not possible in the CLIS [2,3]. This changed in 2014, when Gallegos-Ayala and colleagues [4] presented a case study suggesting that near-infrared spectroscopy (NIRS) could be used for communication in the CLIS. This work was followed up in 2017 by Chaudhary and colleagues [5], who recorded NIRS in 4 patients in the CLIS. They presented results from offline and online classification with accuracies significantly above chance level, which led the authors to conclude that NIRS-based BCI communication works in the CLIS. For this commentary, I performed a reanalysis of the data from Chaudhary and colleagues [5]. As my results differ substantially from those reported in the original paper, I question the claim of NIRS-based BCI communication in the CLIS.


A2. Results from analysis of Chaudhary et al. with randomly permuted trials
In the first review round, Chaudhary and colleagues responded to this comment and gave a detailed description of the statistical procedure they had performed. They averaged the data first over all trials of one session, then over all sessions. This order of averaging removes the variance over trials and sessions and retains only the variance over channels, which is low because the channels are not independent but highly correlated. A statistical analysis will therefore (erroneously) show significant results. That this method is not correct can easily be shown by performing a permutation test, in which the yes/no labels of all trials are randomly permuted. In a permuted dataset, a correctly applied statistical analysis should not show any significant effects. However, the (incorrect) method of Chaudhary et al. yields significant effects even on permuted data, as shown in Figure A1.
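The effect of the averaging order can be illustrated with a small simulation. The following Python sketch is not the procedure of Chaudhary et al. and does not use their data; it rests on several assumptions made only for illustration: synthetic noise trials in which all channels share one common component (to mimic highly correlated channels), randomly permuted yes/no labels, a simple t statistic, and a fixed threshold `T_CRIT` of 2.1 as an approximate two-sided 5 % critical value. All function names and parameter values are hypothetical.

```python
import random
from statistics import mean, stdev

# Assumption: approximate two-sided 5 % critical t value (about 2.09 for
# 19 degrees of freedom, slightly conservative for larger samples).
T_CRIT = 2.1

def simulate_trials(rng, n_trials=40, n_channels=20, channel_noise=0.1):
    """Pure-noise trials: one shared component per trial plus small
    channel-specific noise, so channels are highly correlated."""
    data = []
    for _ in range(n_trials):
        shared = rng.gauss(0.0, 1.0)
        data.append([shared + rng.gauss(0.0, channel_noise)
                     for _ in range(n_channels)])
    return data

def t_across_channels(data, labels):
    """Average over trials first, then test the yes-minus-no difference
    across channels (only the variance over channels is retained)."""
    diffs = []
    for c in range(len(data[0])):
        yes = mean(row[c] for row, lab in zip(data, labels) if lab)
        no = mean(row[c] for row, lab in zip(data, labels) if not lab)
        diffs.append(yes - no)
    return mean(diffs) / (stdev(diffs) / len(diffs) ** 0.5)

def t_across_trials(data, labels):
    """Average over channels first, then compare yes and no trials
    (Welch t statistic; the variance over trials is retained)."""
    yes = [mean(row) for row, lab in zip(data, labels) if lab]
    no = [mean(row) for row, lab in zip(data, labels) if not lab]
    se = (stdev(yes) ** 2 / len(yes) + stdev(no) ** 2 / len(no)) ** 0.5
    return (mean(yes) - mean(no)) / se

def false_positive_rates(n_runs=200, seed=0):
    """Fraction of label-permuted noise datasets flagged as significant
    by each of the two analyses."""
    rng = random.Random(seed)
    hits_channels = hits_trials = 0
    for _ in range(n_runs):
        data = simulate_trials(rng)
        half = len(data) // 2
        labels = [True] * half + [False] * (len(data) - half)
        rng.shuffle(labels)  # random permutation of the yes/no labels
        hits_channels += abs(t_across_channels(data, labels)) > T_CRIT
        hits_trials += abs(t_across_trials(data, labels)) > T_CRIT
    return hits_channels / n_runs, hits_trials / n_runs

if __name__ == "__main__":
    rate_channels, rate_trials = false_positive_rates()
    print("flagged when testing across channels:", rate_channels)
    print("flagged when testing across trials:  ", rate_trials)
```

Under these assumptions, the test across channels should flag most of the permuted datasets as significant, whereas the test across trials should stay close to the nominal 5 % rate. The exact rates depend on the chosen noise levels and are not meant to reproduce Figure A1.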

A4. Statistical analysis of online classification results presented in (Guger et al., 2017)
In (Guger et al., 2017), a vibrotactile P300 BCI system is evaluated online in 9 patients (7 LIS, 2 CLIS). As the accuracy for all patients was at least 70 %, Guger et al. argue that the system worked for all patients. However, only a very small dataset (10 trials per patient) was collected, and the online results were not evaluated statistically in that publication.
The statistical significance of a classification result can be modeled by a binomial cumulative distribution (Combrisson et al., 2015). For n=10 trials, c=2 classes and a significance level α = 0.05, the classification accuracy has to be greater than (not merely equal to) 80 % to be significant. With the given number of trials, the actual p-value for 80 % is p=0.0547 and for 90 % p=0.0107. Thus, only 3 of the 9 online sessions (P1, P5, P6) have a classification accuracy with p<0.05. As significance was assessed individually for each patient and 9 tests were performed, the p-values need to be corrected for multiple comparisons. With a Bonferroni correction, none of the patients has an online classification accuracy that is significantly above chance level.
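The thresholds above can be checked with a short calculation. The following is a minimal Python sketch, assuming a one-sided binomial test against a chance level of 1/c; the function names are illustrative and are not taken from (Combrisson et al., 2015).

```python
from math import comb

def binomial_p_value(n_correct, n_trials, n_classes=2):
    """Probability of at least n_correct correct trials out of n_trials
    when every trial is classified at chance level 1/n_classes."""
    p_chance = 1.0 / n_classes
    return sum(
        comb(n_trials, k) * p_chance ** k * (1.0 - p_chance) ** (n_trials - k)
        for k in range(n_correct, n_trials + 1)
    )

def min_significant_correct(n_trials, alpha=0.05, n_classes=2):
    """Smallest number of correct trials whose p-value is below alpha,
    or None if even a perfect score is not significant."""
    for k in range(n_trials + 1):
        if binomial_p_value(k, n_trials, n_classes) < alpha:
            return k
    return None

if __name__ == "__main__":
    print(binomial_p_value(8, 10))   # 80 % accuracy: 56/1024
    print(binomial_p_value(9, 10))   # 90 % accuracy: 11/1024
    print(min_significant_correct(10))                  # uncorrected
    print(min_significant_correct(10, alpha=0.05 / 9))  # Bonferroni, 9 tests
```

For n=10 and c=2, the two p-values evaluate to 56/1024 ≈ 0.0547 and 11/1024 ≈ 0.0107, matching the values quoted above up to rounding. With the Bonferroni-corrected level of 0.05/9 ≈ 0.0056, `min_significant_correct` returns 10, so under this sketch only a perfect score in all 10 trials would remain significant.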
As the online results are not significantly above chance level, the results presented in (Guger et al., 2017) should not be used to claim that the presented EEG-based BCI system established communication in the complete locked-in state. However, the results warrant further investigation, so Guger and colleagues are encouraged to test their system with a larger sample size (more trials per patient) and to assess statistically whether the classification accuracy is significantly above chance level.

A5. Inconsistencies and missing data in the supporting information to (Chaudhary et al., 2017)
For the analysis in this paper, all data published on Zenodo as supporting information to (Chaudhary et al., 2017) was used. As PLOS Biology requires authors to make all data publicly available, Chaudhary et al. write in their Data Availability Statement that all data pertaining to the results in their paper can be found on Zenodo. Despite this statement, a comparison of the data published on Zenodo with the results presented in (Chaudhary et al., 2017) reveals several inconsistencies and missing data. Because data is missing from the files uploaded to Zenodo, the number of sessions presented in this paper differs from the number of sessions presented in (Chaudhary et al., 2017), and not all aspects of the original paper could be reanalyzed. In the following, all data pertaining to the results presented in (Chaudhary et al., 2017)