Information spreading by a combination of MEG source estimation and multivariate pattern classification

Masashi Sato; Okito Yamashita; Masa-aki Sato; Yoichi Miyawaki

doi:10.1371/journal.pone.0198806

Abstract

To understand information representation in human brain activity, it is important to investigate its fine spatial patterns at high temporal resolution. One possible approach is to use source estimation of magnetoencephalography (MEG) signals. Previous studies have mainly quantified the accuracy of this technique according to positional deviations and dispersion of estimated sources, but it remains unclear how accurately MEG source estimation restores information content represented by spatial patterns of brain activity. In this study, using simulated MEG signals representing artificial experimental conditions, we performed MEG source estimation and multivariate pattern analysis to examine whether MEG source estimation can restore information content represented by patterns of cortical current in source brain areas. Classification analysis revealed that the corresponding artificial experimental conditions were predicted accurately from patterns of cortical current estimated in the source brain areas. However, accurate predictions were also possible from brain areas whose original sources were not defined. Searchlight decoding further revealed that this unexpected prediction was possible across wide brain areas beyond the original source locations, indicating that information contained in the original sources can spread through MEG source estimation. This phenomenon of “information spreading” may easily lead to false-positive interpretations when MEG source estimation and classification analysis are combined to identify brain areas that represent target information. Real MEG data analyses also showed that presented stimuli could be predicted in the higher visual cortex at the same latency as in the primary visual cortex, which likewise suggests that information spreading took place. These results indicate that careful inspection is necessary to avoid false-positive interpretations when MEG source estimation and multivariate pattern analysis are combined.

Citation: Sato M, Yamashita O, Sato M-a, Miyawaki Y (2018) Information spreading by a combination of MEG source estimation and multivariate pattern classification. PLoS ONE 13(6): e0198806. https://doi.org/10.1371/journal.pone.0198806

Editor: Christos Papadelis, Boston Children's Hospital / Harvard Medical School, UNITED STATES

Received: October 18, 2017; Accepted: May 27, 2018; Published: June 18, 2018

Copyright: © 2018 Sato et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All relevant data are within the paper and its Supporting Information files.

Funding: This work was supported by JSPS KAKENHI (JP16H01541, JP17H01755 and JP26120514), MIC SCOPE (141203025), JST PRESTO (JPMJPR1778), the KDDI Foundation, the Narishige Neuroscience Research Foundation, the Naito Foundation, and the Yazaki Memorial Foundation for Science and Technology to Yoichi Miyawaki, JSPS KAKENHI (JP16J00417) to Masashi Sato, and a contract research 173 with NICT to Masa-aki Sato. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Close investigation of human brain activity patterns plays a vital role in revealing how information is represented in the human brain. However, no method can directly measure fine spatial patterns of human brain activity at a high temporal resolution. Functional magnetic resonance imaging (fMRI) can capture the spatial patterns of human brain activity, but it lacks temporal resolution, as it measures metabolic processes reflected by blood-oxygenation-level-dependent signals. In contrast, magnetoencephalography (MEG) and electroencephalography (EEG) can capture the fast dynamics of brain activity patterns via electromagnetic signals that propagate from neurons; however, both of these techniques lack fine spatial resolution because their signals are measured by a small number of sensors placed around the head.

Given these constraints, researchers have tried to extract necessary information by solving an inverse problem to estimate cortical current sources from measured MEG/EEG signals. MEG is more suitable than EEG for this purpose because of the homogeneous magnetic permeability of brain tissues. The typical approach for MEG source estimation uses an equivalent current dipole or distributed source model, assuming a few or many (typically 10^3–10^4) current dipoles in the brain. Although additional constraints are necessary to resolve the ill-posed condition, a distributed source model seems preferable for analyzing fine spatial patterns of brain activity because of its descriptive power. However, no method can reconstruct cortical current sources with perfect accuracy because of the ill-posed nature of MEG source estimation. It is hence important to be aware of the limitations of MEG source estimation.

Previous studies have mainly evaluated the performance of MEG source estimation with regard to the accuracy of source position estimation. Thus, error has typically been quantified in terms of spatial displacement [1], dispersion [2–5], and overlap/non-overlap of the estimated sources relative to the original ones [6–8]. However, it remains unclear how accurately the spatial patterns of the original source are restored through MEG source estimation.

Restorability of the spatial patterns of the original sources through MEG source estimation is crucial for investigating the information content represented by brain activity patterns [9–17]. Previous fMRI studies have shown that multivariate pattern analysis (MVPA) allows information of images seen by participants to be extracted from the spatial patterns of their brain activity [18–20]. Recent MEG studies have also applied MVPA to estimated cortical current and evaluated the information represented by them [21–24]. This progress emphasizes the importance of evaluating whether the original spatial pattern of the source is preserved or whether a spurious pattern is fabricated through MEG source estimation.

In this study, we examine whether MEG source estimation can restore a pattern of the original cortical current and the represented information content of the source brain areas (Fig 1). For this purpose, we assumed patterns of source cortical current in certain areas on participants’ cortical surface models that would encode differences in artificial experimental conditions as multidimensional patterns. Given the source cortical current, we calculate the expected magnetic fields at the MEG sensors and estimate the spatial patterns of the cortical current from the simulated MEG sensor signals. We use four representative methods of MEG source estimation based on distributed source models. We quantify the similarity between the spatial patterns of the original and estimated cortical currents and show significant correlation between them. We then demonstrate that the estimated cortical current can be used to predict the corresponding artificial experimental conditions by pattern classification analysis but that significant prediction is possible even in cortical areas where the original source is not defined. This unexpected phenomenon makes it appear as if represented information spreads over a wide cortical area when MEG source estimation is used, which could lead to misinterpretations about the cortical areas that represent target information. Further analyses confirm that this information spreading can be similarly observed in real MEG data.

Fig 1. Schematic view of multivariate pattern analysis of cortical current estimated by MEG source estimation.

Two patterns of source cortical current are simulated (step 1) and then converted to MEG signals (step 2). Next, cortical current is estimated from the converted MEG signals using MEG source estimation methods (step 3). To evaluate pattern reproducibility and information restorability, the original and estimated patterns of cortical current are compared using multivariate pattern analyses (steps 4 and 5).

https://doi.org/10.1371/journal.pone.0198806.g001

Methods

Modeling cortical surfaces

Magnetic resonance imaging (MRI) was used to obtain each participant’s cortical structural images. Cortical surface models were then extracted from the cortical structural images to simulate source cortical current. The simulations described in the following sections were conducted for each participant independently using the individual cortical surface models.

Participants.

Cortical structural images were obtained from nine participants (eight male and one female). They participated in our study voluntarily, and the same participants also took part in the MEG experiments described later. Each participant gave written informed consent before participating. The procedure was approved by the institutional review board of The University of Electro-Communications and Advanced Telecommunications Research Institute International (ATR) Brain Activity Imaging Center.

MRI acquisition.

Cortical structural images were obtained using 3.0-Tesla Siemens MAGNETOM Trio A Tim and Prisma fit scanners located at the ATR Brain Activity Imaging Center. T1-weighted magnetization-prepared rapid-acquisition gradient-echo (MP-RAGE) fine structural images of the whole head were acquired for each participant (208 sagittal slices; TR, 2250 ms; TE, 3.06 ms; TI, 900 ms; flip angle, 9°; field of view, 256 × 256 mm; voxel size, 1.0 × 1.0 × 1.0 mm; the same parameters were used on the Trio and Prisma scanners).

Extraction of cortical surfaces.

The cortical surface was defined as a polygon model of the gray matter surface extracted from each participant’s MRI data using the FreeSurfer software suite (http://surfer.nmr.mgh.harvard.edu/), with about 300,000 vertices. The polygonized cortical surface models were then imported into Brainstorm [25], and the number of vertices was downsampled to 15,002 to reduce computational load.

Simulation of source cortical current

A source cortical current was simulated for each participant’s cortical surface model with a time course designed to roughly imitate the evoked response caused by visual stimulation along the visual cortical hierarchy. The source cortical current was defined over the period from −100 to 300 ms, which corresponds to a single trial in the simulation. We assumed two source areas in different time windows and ROIs: at 25–75 ms in the primary visual cortex (V1) and at 200–250 ms in the inferotemporal cortex (IT; Fig 2). ROIs were manually defined on the ICBM152 standard brain [26] and then projected onto each participant’s cortical surface model using FreeSurfer’s spherical morphing procedure (Fig 2A). In addition to V1 and IT, the parietal cortex (PR) was also defined without any sources for control analyses (Fig 2A). V1 was defined to encompass the occipital pole, IT was defined to overlap with the inferoposterior part of the temporal lobe, and PR was defined as the middle part of the lateral parietal lobe. Each ROI enclosed 120 vertices (60 vertices for each hemisphere). No sources were assumed on other vertices. The time origin (0 ms) was considered to be the onset of visual stimulation. Although we used this particular timing setup, it does not affect the presented results qualitatively unless the two source activities overlap temporally.

Fig 2. ROIs and patterns of source cortical current.

(a) Three bilateral ROIs arranged on the cortical surface of the standard brain. (b) Source cortical current within each ROI. Line colors correspond to ROI colors shown in (a).

https://doi.org/10.1371/journal.pone.0198806.g002

Each of the sources was a combination of multiple current dipoles with the same amplitude within each hemisphere, but different amplitudes across hemispheres. Current dipoles were placed at all vertices of the cortical surface, with directions perpendicular to the cortical surface (the positive direction was defined as from the inside to the outside of the brain). Note that current dipoles with non-zero amplitude were placed only in the source ROIs, whereas those placed in other areas had zero amplitude. The amplitude of the source cortical current in each ROI was temporally modulated as a sinusoidal waveform whose phase started at zero and ended at π within each time window (Fig 2B). Thus, the amplitude was maximal at the middle of each time window (50 ms and 225 ms for V1 and IT, respectively). We created two artificial experimental conditions with differences in maximum amplitude for each hemisphere: for condition 1, the maximum amplitude of the source cortical current was 1 nA⋅m for the left hemisphere and −1 nA⋅m for the right, while condition 2 was the opposite of condition 1 across hemispheres. Thus, the mean amplitude of the source cortical current across hemispheres was the same in both of the artificial experimental conditions.
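
The source definition above can be sketched in a few lines of NumPy. This is only an illustrative sketch: the 1-ms sampling step, the ordering of vertices (left hemisphere first), and the helper names `source_time_course` and `roi_current` are assumptions made here, not details taken from the study.

```python
import numpy as np

def source_time_course(times_ms, window_ms, peak_amplitude):
    """Half-sine amplitude profile: the phase runs from 0 to pi inside the window."""
    start, end = window_ms
    phase = np.pi * (times_ms - start) / (end - start)
    inside = (times_ms >= start) & (times_ms <= end)
    return np.where(inside, peak_amplitude * np.sin(phase), 0.0)

times_ms = np.arange(-100, 301)   # one trial; 1-ms steps are an assumption
n_per_hemi = 60                   # 60 vertices per hemisphere in each ROI

def roi_current(window_ms, condition):
    """Return a (120, n_times) array in nA*m: left-hemisphere rows first, then right."""
    sign = 1.0 if condition == 1 else -1.0
    left = source_time_course(times_ms, window_ms, +sign)
    right = source_time_course(times_ms, window_ms, -sign)
    return np.vstack([np.tile(left, (n_per_hemi, 1)),
                      np.tile(right, (n_per_hemi, 1))])

v1_cond1 = roi_current((25, 75), condition=1)
v1_cond2 = roi_current((25, 75), condition=2)
it_cond1 = roi_current((200, 250), condition=1)
```

Because the two hemispheres carry opposite signs, the mean over the 120 vertices is zero in both conditions, so only the spatial pattern distinguishes them.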

Calculation of MEG sensor signals

To simulate MEG sensor signals generated by the original cortical current, we measured head position relative to the MEG sensors for each participant.

Head position measurement and registration.

We first measured each participant’s head shape and the positions of five electromagnetic marker coils attached to the participant’s head (three on the forehead and two on the ear tragi) using a three-dimensional digitizer (FastSCAN, Polhemus Inc., USA). The magnetic field generated by the marker coils was then measured by an MEG system (PQ1400RM, Yokogawa Electric Co., Japan) with 400 SQUID sensors (210 axial and 190 planar gradiometers), and each participant’s head position was estimated from the measured signals. Each participant’s head position in the MEG system was then coregistered to the cortical structural image of the same individual’s MRI data. The positional relationship between the MEG sensors and the cortical surface was determined from each participant’s coregistration results.

MEG signal synthesis.

MEG sensor signals were calculated by a forward model that describes magnetic propagation processes from the original cortical current to the MEG sensors. The forward model can be described as

$$b = Gj + e \tag{1}$$

where b (ℝ^N×1 vector; N, number of MEG sensors) represents measured MEG signals, G (ℝ^N×M matrix; M, number of vertices on the cortical surface) represents a leadfield matrix, j (ℝ^M×1 vector) represents the original cortical current, and e (ℝ^N×1 vector) represents sensor noise. Each column of the leadfield matrix indicates the amplitude of the MEG signals generated by a unit current dipole of the corresponding vertex. G was calculated by the boundary element method using the OpenMEEG software package [27,28]. Sensor noise was modeled to follow a Gaussian distribution N(0,σ²). In this simulation, σ was set to twice the largest MEG signal amplitude across all time points and sensors to represent the low signal-to-noise ratio that would occur in an actual experiment. The noise was added to each time point and MEG sensor independently. We constructed five artificial experimental runs, each generated by repeating this procedure 50 times for each artificial experimental condition, so that each run consisted of 100 trials in total.
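
As a rough illustration of this signal synthesis, the sketch below applies the forward model b = Gj + e to a toy source. The leadfield here is a random stand-in (the study computed G with OpenMEEG), the array sizes are reduced for speed, and the function name `simulate_trial` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes; the study used N = 400 sensors and M = 15,002 vertices.
n_sensors, n_vertices, n_times = 40, 500, 401
# Random stand-in leadfield, NOT a physically valid one.
G = rng.standard_normal((n_sensors, n_vertices)) * 1e-3

def simulate_trial(G, j, rng, sigma):
    """b = G j + e with independent Gaussian noise at every sensor and time point."""
    clean = G @ j
    return clean + rng.normal(0.0, sigma, size=clean.shape)

# Toy source: half-sine activity in the first 120 vertices between samples 125 and 175.
j = np.zeros((n_vertices, n_times))
j[:120, 125:176] = np.sin(np.pi * np.arange(51) / 50)

clean = G @ j
sigma = 2.0 * np.abs(clean).max()   # twice the largest noiseless signal amplitude
trials = np.stack([simulate_trial(G, j, rng, sigma) for _ in range(50)])
```

With noise this large, a single trial is dominated by e, which is why the accuracy evaluations described later rely on trial averages.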

Estimation of cortical current from simulated MEG sensor signals

Suppose that $\hat{j}$ is an estimate of j (ℝ^M×1 vector) given b and G. To estimate $\hat{j}$, we need to solve an inverse problem that is ill-posed because there are fewer MEG sensors than assumed current dipoles on the cortical surface. To obtain a unique solution, we used the following four representative frameworks for MEG source estimation, which have been used extensively in recent studies.

L2-norm regularization.

We first tested L2-norm regularization, whose objective function is written as

$$\hat{j} = \underset{j}{\operatorname{argmin}} \left\{ (b - Gj)^{T} C^{-1} (b - Gj) + \lambda\, j^{T} R^{-1} j \right\} \tag{2}$$

where C is the ℝ^N×N noise covariance matrix, R is the ℝ^M×M source covariance matrix, λ is a regularization parameter, and T indicates the transpose. The first term on the right-hand side of Eq (2) indicates the estimation error weighted by noise covariance, and the second term indicates L2-norm regularization weighted by source covariance. This framework is known as minimum norm estimation (MNE) [29]. Although classical studies set C and R as identity matrices, here we used a modified version of MNE, in which C was computed from simulated MEG signals with a period of −100 to −1 ms, R was weighted by depth from the cortical surface, and the leadfield matrix was spatially whitened. We tested different λ values ranging from 10⁻² to 10³ in exponential steps. We used MNE implemented in Brainstorm (http://neuroimage.usc.edu/brainstorm/).
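
For intuition, the L2-regularized objective (squared error weighted by the inverse noise covariance plus λ times the source-covariance-weighted L2 norm) has the closed-form minimizer R Gᵀ (G R Gᵀ + λC)⁻¹ b. The sketch below evaluates it on toy data. It is not the Brainstorm implementation: whitening and depth weighting are omitted, R is assumed diagonal, and `minimum_norm_estimate` is a name chosen here.

```python
import numpy as np

def minimum_norm_estimate(b, G, C, R_diag, lam):
    """Closed-form minimizer of (b-Gj)^T C^-1 (b-Gj) + lam * j^T R^-1 j.

    R is assumed diagonal and is passed as its diagonal R_diag (length M).
    """
    GR = G * R_diag                    # equals G @ diag(R_diag), shape (N, M)
    gram = GR @ G.T + lam * C          # G R G^T + lam * C, shape (N, N)
    return GR.T @ np.linalg.solve(gram, b)

rng = np.random.default_rng(0)
N, M = 30, 200                         # toy numbers of sensors and vertices
G = rng.standard_normal((N, M))
C = np.eye(N)
R_diag = np.ones(M)
j_true = np.zeros(M)
j_true[:10] = 1.0
b = G @ j_true + 0.1 * rng.standard_normal(N)

lambdas = 10.0 ** np.arange(-2, 4)     # 1e-2 ... 1e3 in exponential steps
norms = [np.linalg.norm(minimum_norm_estimate(b, G, C, R_diag, lam))
         for lam in lambdas]
```

Solving the N × N system rather than the M × M one keeps the cost low when there are far fewer sensors than vertices. Larger λ shrinks the estimate toward zero, which is visible in `norms`.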

L1-norm regularization.

L2-norm regularization yields a cortical current broadly distributed over the cortical surface. Such a solution is often considered undesirable, because one of the purposes of MEG source estimation is to identify the brain areas related to the experimental conditions under investigation. As an MEG source estimation method that gives preference to more sparse cortical currents than MNE, we tested L1-norm regularization, also known as minimum current estimation (MCE; [30,31]) or least absolute shrinkage and selection operator (LASSO; [32]), whose objective function is written as

$$\hat{j} = \underset{j}{\operatorname{argmin}} \left\{ \tfrac{1}{2} \lVert b - Gj \rVert_2^2 + \lambda \lVert j \rVert_1 \right\} \tag{3}$$

where λ is a regularization parameter, and ‖x‖_p represents the Lp-norm of a vector x. L1-norm regularization promotes sparse solutions in which only small numbers of elements of $\hat{j}$ have non-zero values. In this formulation, λ should be smaller than $\lambda_{\max} = \lVert G^{T} b \rVert_\infty$ [33,34]; otherwise $\hat{j}$ becomes the zero vector. As λ_max was on the order of 10⁻¹¹ in our simulation, we tested λ values smaller than λ_max, ranging from 10⁻¹⁶ to 10⁻¹¹ in exponential steps. We used the L1-norm minimization implemented in the scikit-learn software package (http://scikit-learn.org/stable/).
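
The role of the upper bound on λ can be illustrated with a small proximal-gradient (ISTA) solver. This is a swapped-in solver written for illustration, not the scikit-learn routine used in the study, and it assumes the objective is scaled as 0.5·‖b − Gj‖₂² + λ‖j‖₁, in which case the bound is the largest absolute entry of Gᵀb. The names `soft_threshold` and `lasso_ista` are hypothetical.

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of t * ||x||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def lasso_ista(b, G, lam, n_iter=500):
    """Minimize 0.5 * ||b - G j||_2^2 + lam * ||j||_1 by proximal gradient descent."""
    step = 1.0 / np.linalg.norm(G, 2) ** 2   # 1 / Lipschitz constant of the gradient
    j = np.zeros(G.shape[1])
    for _ in range(n_iter):
        grad = G.T @ (G @ j - b)
        j = soft_threshold(j - step * grad, step * lam)
    return j

rng = np.random.default_rng(0)
N, M = 30, 200                               # toy numbers of sensors and vertices
G = rng.standard_normal((N, M))
j_true = np.zeros(M)
j_true[[5, 50, 120]] = [1.0, -1.0, 0.5]
b = G @ j_true + 0.01 * rng.standard_normal(N)

lam_max = np.abs(G.T @ b).max()              # at or above this value the estimate is all zeros
j_zero = lasso_ista(b, G, lam_max)
j_sparse = lasso_ista(b, G, 0.1 * lam_max)
```

If one uses scikit-learn's `Lasso` instead, note that its data term is divided by the number of samples, so its `alpha` corresponds to λ/N under the scaling assumed here.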

Hierarchical Bayesian estimation.

The above two methods add a constraint for the minimization of the L2 and L1-norm of $\hat{j}$, respectively. The constraint can be further improved if prior information on the locations of the MEG signal sources is available. Previous studies have shown that fMRI activity evoked by experimental conditions similar to those used to induce MEG signals can serve as prior information about the possible locations of MEG signal sources [35,36].

In this study, we used a hierarchical Bayesian model to introduce prior information on the MEG signal sources [36]. In this model, the prior information for the cortical current j is introduced as a prior probability that follows a zero-mean Gaussian distribution with inverse variance r:

$$P(j \mid r) = \mathcal{N}(j \mid 0, R^{-1}) \tag{4}$$

where R = diag(r) and r is a ℝ^M×1 vector whose elements are hyperparameters that control the source variance. The inverse variance r is also treated as a random variable that follows a gamma distribution

$$P(r) = \prod_{i=1}^{M} \Gamma(r_i \mid \bar{r}_{0,i}, \gamma_{0,i}) \tag{5}$$

where $\Gamma(r_i \mid \bar{r}_{0,i}, \gamma_{0,i})$ represents a gamma distribution with mean $\bar{r}_{0,i}$ and degrees of freedom γ_0,i. The γ_0,i is also known as a confidence parameter. The prior distribution combined with this hyperprior distribution is equivalent to the automatic relevance determination (ARD) prior [36–38]. Information about the fMRI activity is introduced to control $\bar{r}_{0,i}$, with a large source variance being more likely where larger fMRI activity is observed. The likelihood function is defined with the assumption that the MEG sensor noise follows a Gaussian distribution as

$$P(b \mid j) = \mathcal{N}(b \mid Gj, \alpha I) \tag{6}$$

where α indicates noise variance and I is the ℝ^N×N identity matrix. Using Bayes’ theorem, the posterior probability distribution for the estimated cortical current can be described as

$$P(j, r \mid b) = \frac{P(b \mid j)\, P(j \mid r)\, P(r)}{P(b)} \tag{7}$$

where P(b) indicates the marginal likelihood defined as

$$P(b) = \iint P(b \mid j)\, P(j \mid r)\, P(r)\, dj\, dr \tag{8}$$

Using $P(j, r \mid b)$, the posterior distribution of j can be obtained as

$$P(j \mid b) = \int P(j, r \mid b)\, dr \tag{9}$$

The cortical current estimate $\hat{j}$ is obtained by taking the expectation of this posterior distribution.

We conducted this hierarchical Bayesian estimation using variational Bayesian multimodal encephalography software (VBMEG; http://vbmeg.atr.jp/). For prior information, we set the magnitude of the fMRI activity to 1 for all vertices in V1 and IT and to 0 for all other vertices. Note that cortical current was estimated for all vertices on the whole cortical surface regardless of the prior information. The width of the time window for estimating source variance was set to 100 ms, and the time window was shifted in 50-ms steps. We used a single value of the confidence parameter for all vertices (i.e., γ_0,i = γ₀). The tested values were γ₀ = 0 and values ranging from 10⁻¹ to 10³ in exponential steps.

LCMV beamformer.

The fourth representative method for MEG source estimation that we tested was the linearly constrained minimum variance beamformer (LCMV; [39]), which is based on a different concept from the regularization methods described above. LCMV constructs a spatial filter w_i for b to estimate the ith cortical current under the constraint $w_i^{T} g_i = 1$, where g_i is the column vector of the leadfield matrix corresponding to the ith vertex. Under this constraint, LCMV minimizes the power of $w_i^{T} b$. This minimization effectively suppresses the influence of spurious sources and sensor noise because the lower bound of the power of $w_i^{T} b$ should match the original cortical current as a result of the constraint. This procedure yields a spatial filter w_i:

$$w_i = \frac{B^{-1} g_i}{g_i^{T} B^{-1} g_i} \tag{10}$$

where B indicates the covariance matrix of b. Because the signal-to-noise ratio of b can be low in an actual experiment, noise can significantly affect w_i. Thus, w_i is normalized by noise variance as

$$\tilde{w}_i = \frac{w_i}{\sqrt{w_i^{T} C\, w_i}} \tag{11}$$

where C is the noise covariance matrix. Thus, $\hat{j}_i$ can be obtained as

$$\hat{j}_i = \tilde{w}_i^{T} b \tag{12}$$

$\hat{j}$ can thus be estimated by calculating $\hat{j}_i$ for all vertices. In this study, B was computed using a 100-ms time window shifted with 50-ms steps. Noise covariance matrix C was computed from simulated MEG signals with a period of −100 to −1 ms. We performed the above procedure using custom written MATLAB (MathWorks, Natick, MA) programs.
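
A compact NumPy sketch of an LCMV beamformer is given below for orientation. It is not the authors' MATLAB code. The unit-gain filter is the standard LCMV solution; the noise normalization by the square root of the filter's noise power is one common convention and is an assumption here, as are the toy data and the name `lcmv_filters`.

```python
import numpy as np

def lcmv_filters(G, B, C):
    """Return (unit-gain filters W, noise-normalized filters W_norm), each (N, M).

    Column i of W is B^-1 g_i / (g_i^T B^-1 g_i); W_norm divides each column
    by sqrt(w_i^T C w_i), an assumed form of the noise normalization.
    """
    Binv_G = np.linalg.solve(B, G)                  # B^-1 g_i for every vertex
    gain = np.einsum("nm,nm->m", G, Binv_G)         # g_i^T B^-1 g_i
    W = Binv_G / gain
    noise_power = np.einsum("nm,nm->m", W, C @ W)   # w_i^T C w_i
    return W, W / np.sqrt(noise_power)

rng = np.random.default_rng(0)
N, M, T = 20, 100, 500                              # toy sensors, vertices, samples
G = rng.standard_normal((N, M))
source = np.zeros((M, T))
source[3] = np.sin(np.linspace(0, 20, T))           # a single active toy vertex
noise = 0.5 * rng.standard_normal((N, T))
b = G @ source + noise

B = np.cov(b)        # data covariance (N x N)
C = np.cov(noise)    # noise covariance; the study used the -100 to -1 ms baseline
W, W_norm = lcmv_filters(G, B, C)
j_hat = W_norm.T @ b # estimated current at every vertex and time point
```

Each column of `W` passes its own leadfield column with unit gain, which is the linear constraint that gives the method its name.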

In this study, MEG source estimation was conducted using all trials of each artificial experimental run independently, so that the same data were never used for both training and testing (double dipping) in leave-one-run-out cross-validation (see Time-resolved decoding in each ROI section).

Evaluation of source estimation accuracy

To evaluate source estimation accuracy, we used two indices: the area under the precision-recall curve and the correlation of spatial patterns. Note that because the value of the MEG sensor noise was set high to resemble actual measurements, these evaluations would have been strongly influenced by noise if based on the data from a single trial. We therefore averaged the estimated cortical current across all trials for each artificial experimental condition and used the averaged data for evaluations.

Area under the precision-recall curve.

First, we used the area under the precision-recall curve (APR) to evaluate the accuracy of localization of source cortical current by the MEG source estimation methods. As we simulated the source cortical current as a distribution rather than a current dipole, it is suitable to evaluate the degree of overlap between the source ROIs and the source positions defined by the estimated cortical current. Representative methods for such an evaluation are the area under the receiver operating characteristic curve (AUC) [40] and APR [41,42]. A previous study showed that AUC is less sensitive than APR when the number of positive samples is significantly different from the number of negative ones [43]. As the numbers of vertices within (positive samples) and outside (negative samples) a source ROI differed substantially (120 vs. 14,882) in our simulation, we used APR instead of AUC. The detailed procedure to calculate APR is shown in Appendix A. The chance level of APR was calculated as 0.008. APR was calculated at the time when the source cortical current was at its maximum for each ROI (50 ms and 225 ms for V1 and IT, respectively) in each artificial experimental condition.
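
The chance level quoted above follows from the fraction of positive vertices, 120/15,002 ≈ 0.008. The sketch below shows one standard way to compute the area under the precision-recall curve (as average precision) from a per-vertex score. The study's exact procedure is given in its Appendix A and may differ; the score and the function name `average_precision` are assumptions of this sketch.

```python
import numpy as np

def average_precision(is_source, score):
    """Mean of the precision values at the rank of each true source vertex."""
    order = np.argsort(-score, kind="stable")       # highest score first
    hits = is_source[order].astype(float)
    precision_at_k = np.cumsum(hits) / np.arange(1, hits.size + 1)
    return float((precision_at_k * hits).sum() / hits.sum())

n_vertices, n_roi = 15002, 120
is_source = np.zeros(n_vertices, dtype=bool)
is_source[:n_roi] = True                            # toy ROI membership

chance_level = n_roi / n_vertices                   # fraction of positive vertices

rng = np.random.default_rng(0)
apr_perfect = average_precision(is_source, is_source.astype(float))
apr_random = average_precision(is_source, rng.random(n_vertices))
```

In practice the score could be, for example, the absolute value of the trial-averaged estimated current at the peak latency of each ROI.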

Correlation of spatial patterns.

Second, we calculated the correlation coefficients between the original and estimated cortical currents to evaluate how accurately spatial patterns were restored by MEG source estimation. Spatial correlations were computed between the original and estimated cortical current patterns over vertices within each source ROI for each artificial experimental condition at 50 ms and 225 ms for V1 and IT, respectively. We considered the spatial patterns of the original cortical current to be restored if a significant correlation (P < 0.05, uncorrected) was observed.

Time-resolved decoding in each ROI

We also used pattern classification analysis to evaluate whether the MEG source estimation methods restored information represented by differences in the spatial patterns of the original cortical currents. Pattern classification was performed to predict the artificial experimental conditions from the estimated cortical current at each time interval in each ROI on a trial-by-trial basis. We used the linear support vector machine (SVM; [44]) implemented in libsvm (http://www.csie.ntu.edu.tw/~cjlin/libsvm/) as a pattern classifier or “decoder”. The decoder’s prediction accuracy was evaluated by time-resolved leave-one-run-out cross-validation analysis [16,45], in which the decoder was trained with four out of five artificial experimental runs and tested with the remaining one at a particular time interval of 5 ms (see below for the preprocessing of the estimated cortical current). This procedure was repeated until all artificial experimental runs were tested once at each time interval.

The estimated cortical current was preprocessed before decoding as follows. First, the time course of estimated cortical current was downsampled at each 5-ms interval with an equally-weighted smoothing window of 10 ms. Then, the downsampled data underwent z-score normalization across trials at each time interval for each vertex. The parameters for normalization (i.e., mean and standard deviation) were calculated using only a training data set and were then applied to both the training and test data sets. A decoder received the preprocessed data corresponding to each ROI as an input feature (ℝ^120×1 vector, for all ROIs). Custom software programs partially based on the Brain Decoder Toolbox (http://www.cns.atr.jp/dni/download/brain-decoder-toolbox/) were written to perform the analysis.
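
The cross-validation and normalization scheme can be sketched as follows for a single time interval. To keep the example dependency-free, a nearest-class-mean classifier stands in for the linear SVM (libsvm) used in the study; the toy data and all function names are assumptions of this sketch.

```python
import numpy as np

def zscore_fit(X):
    """Per-feature mean and standard deviation (zero deviations replaced by 1)."""
    mean, std = X.mean(axis=0), X.std(axis=0)
    return mean, np.where(std > 0, std, 1.0)

def nearest_mean_predict(X_train, y_train, X_test):
    """Stand-in linear classifier: assign each test trial to the closer class mean."""
    classes = np.unique(y_train)
    means = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
    dists = ((X_test[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    return classes[np.argmin(dists, axis=1)]

def leave_one_run_out_accuracy(X, y, runs):
    """X: (n_trials, n_features) at one time interval; runs: run label per trial."""
    correct = 0
    for held_out in np.unique(runs):
        train, test = runs != held_out, runs == held_out
        mean, std = zscore_fit(X[train])            # parameters from training data only
        X_train, X_test = (X[train] - mean) / std, (X[test] - mean) / std
        predicted = nearest_mean_predict(X_train, y[train], X_test)
        correct += int((predicted == y[test]).sum())
    return correct / y.size

rng = np.random.default_rng(0)
n_runs, n_per_cond, n_features = 5, 50, 120
y = np.tile(np.repeat([1, 2], n_per_cond), n_runs)
runs = np.repeat(np.arange(n_runs), 2 * n_per_cond)
pattern = np.concatenate([np.ones(60), -np.ones(60)])   # toy left/right contrast
signal = 0.2 * np.where(y[:, None] == 1, pattern, -pattern)
X = rng.standard_normal((y.size, n_features)) + signal

accuracy = leave_one_run_out_accuracy(X, y, runs)
```

Estimating the normalization parameters inside the loop, from the training runs only, keeps the held-out run from influencing the features it is later tested on.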

Searchlight decoding

MEG source estimation cannot reconstruct the original cortical current with perfect accuracy, and spurious cortical current may be found beyond the source ROIs. If this spurious cortical current has systematic differences related to the artificial experimental conditions, a decoder can capture these differences and predict the artificial experimental conditions, even from irrelevant brain areas far from the source ROIs.

To test this possibility, we performed searchlight decoding [46] to examine whether and how far such systematic differences in estimated cortical current spread beyond the source ROIs. The estimated cortical current at each vertex and its 119 neighboring vertices was used as an input feature (ℝ^120×1 vector, the same dimensions as for the ROI-based decoding). Decoding was performed for all vertices using the same procedure as that employed for the ROI-based decoding, but only the time points of 50 ms and 225 ms were tested in this analysis.

To assess how far the systematic differences had spread, we analyzed the relationship between the distance from the source ROI to a particular vertex and the prediction accuracy at that vertex. Distance was calculated from the center of mass of the source ROI. This calculation was performed separately for each hemisphere.
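
A minimal sketch of the two geometric ingredients, the 120-vertex searchlight and the distance from an ROI's center of mass, is shown below. How neighbors and distances were measured in the study is not stated in this section; Euclidean distance on vertex coordinates is an assumption here, as are the toy coordinates and the function names.

```python
import numpy as np

def searchlight_indices(coords, center, n_features=120):
    """Indices of the center vertex and its nearest neighbors (Euclidean distance)."""
    dist = np.linalg.norm(coords - coords[center], axis=1)
    return np.argsort(dist, kind="stable")[:n_features]

def distance_from_roi(coords, roi_indices):
    """Distance of every vertex from the ROI's center of mass."""
    center_of_mass = coords[roi_indices].mean(axis=0)
    return np.linalg.norm(coords - center_of_mass, axis=1)

rng = np.random.default_rng(0)
# Toy "hemisphere": 2,000 random points on a unit sphere.
coords = rng.standard_normal((2000, 3))
coords /= np.linalg.norm(coords, axis=1, keepdims=True)

neighborhood = searchlight_indices(coords, center=0)   # features for one searchlight
toy_roi = searchlight_indices(coords, center=10)       # stand-in for a source ROI
dist_to_roi = distance_from_roi(coords, toy_roi)       # x-axis of accuracy vs. distance
```

Pairing `dist_to_roi` with the searchlight accuracy at each vertex, hemisphere by hemisphere, gives the distance profile described above.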

Statistical significance of decoding analysis

We defined the statistical significance level of these decoding analyses (ROI-based and searchlight) with reference to the prediction accuracy obtained from the whole cortical area at 0 ms using searchlight decoding for each participant. As the MEG signals at 0 ms consisted of noise, the prediction accuracies at this time served as a null distribution. We used the 99th percentile of the null distribution as the statistical significance level. We also defined the significance level using the binomial distribution and permutation tests, and examined whether the significance of decoding accuracy depends on how the significance level is defined. The significance levels of the permutation tests were defined as the 99th percentile of 200 random permutations at each time point.
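
The percentile-based and binomial thresholds can be sketched as below. The null accuracies are simulated toy values standing in for searchlight accuracies at 0 ms, and the assumption of 500 test trials with a chance level of 0.5 is made for illustration only; both function names are hypothetical.

```python
import numpy as np
from math import comb

def percentile_threshold(null_accuracies, q=99):
    """Significance level as the q-th percentile of a null distribution."""
    return float(np.percentile(null_accuracies, q))

def binomial_threshold(n_trials, p_chance=0.5, alpha=0.01):
    """Smallest accuracy whose upper-tail binomial probability is below alpha."""
    tail = 0.0
    for k in range(n_trials, -1, -1):
        tail += comb(n_trials, k) * p_chance**k * (1 - p_chance)**(n_trials - k)
        if tail >= alpha:                 # P(X >= k) too large, so k + 1 is the cutoff
            return (k + 1) / n_trials
    return 0.0

rng = np.random.default_rng(0)
n_trials = 500                            # assumed number of test trials
# Toy null: chance-level accuracies, one per searchlight center.
null_accuracies = rng.binomial(n_trials, 0.5, size=15002) / n_trials

threshold_null = percentile_threshold(null_accuracies)
threshold_binomial = binomial_threshold(n_trials)
```

A permutation-based threshold follows the same percentile rule, with the null distribution built by repeating the decoding on label-shuffled data (200 permutations per time point in the study).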

Real data analysis

We also applied pattern classification analysis to real data. In this analysis, we used VBMEG as a source estimation method because it showed the most accurate source localization and pattern reconstruction in the simulations (see Results).