
## Abstract

### Background

Electroencephalographic (EEG) microstate analysis is a method of identifying quasi-stable functional brain states (“microstates”) that are altered in a number of neuropsychiatric disorders, suggesting their potential use as biomarkers of neurophysiological health and disease. However, use of EEG microstates as neurophysiological biomarkers requires assessment of the test-retest reliability of microstate analysis.

### Methods

We analyzed resting-state, eyes-closed, 30-channel EEG from 10 healthy subjects over 3 sessions spaced approximately 48 hours apart. We identified four microstate classes and calculated the average duration, frequency, and coverage fraction of these microstates. Using Cronbach's α and the standard error of measurement (SEM) as indicators of reliability, we examined: (1) the test-retest reliability of microstate features using a variety of different approaches; (2) the consistency between TAAHC and *k*-means clustering algorithms; and (3) whether microstate analysis can be reliably conducted with 19 and 8 electrodes.

### Results

The approach of identifying a single set of “global” microstate maps showed the highest reliability (mean Cronbach's α>0.8, SEM ≈10% of mean values) compared to microstates derived by each session or each recording. There was notably low reliability in features calculated from maps extracted individually for each recording, suggesting that the analysis is most reliable when maps are held constant. Features were highly consistent across clustering methods (Cronbach's α>0.9). All features had high test-retest reliability with 19 and 8 electrodes.

**Citation: **Khanna A, Pascual-Leone A, Farzan F (2014) Reliability of Resting-State Microstate Features in Electroencephalography. PLoS ONE 9(12): e114163. https://doi.org/10.1371/journal.pone.0114163

**Editor: **Thomas Koenig, University of Bern, Switzerland

**Received: **May 8, 2014; **Accepted: **November 5, 2014; **Published: ** December 5, 2014

**Copyright: ** © 2014 Khanna et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

**Data Availability: **The authors confirm that, for approved reasons, some access restrictions apply to the data underlying the findings. Please contact the corresponding author to request data from this study. Approval from the BIDMC Institutional Ethics Board is required to grant access to data. Other data sharing agreements may also be necessary.

**Funding: **This study was supported in part by the Canadian Institute of Health Research (CIHR - 201102MFE-246635-181538), the Sidney R. Baer Jr. Foundation, the National Institutes of Health (R01HD069776, R01NS073601, R21 MH099196, R21 NS082870, R21 NS085491, R21 HD07616), Harvard Catalyst - The Harvard Clinical and Translational Science Center (NCRR and the NCATS NIH, M01-RR-01066, UL1 RR025758), and the Temerty Family through the Centre for Addiction and Mental Health (CAMH) Foundation. This does not alter the authors' adherence to PLOS ONE policies on sharing data and materials. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

**Competing interests: ** AK and FF have no conflict of interest to disclose. APL serves on the scientific advisory boards for Nexstim, Neuronix, starlab Neuroscience, Neosync, and Novavision, and is an inventor on patents and patent applications related to noninvasive brain stimulation and real-time integration of TMS with EEG and fMRI.

## Introduction

Neurophysiological impairments may precede the appearance of clinical symptomology in several neuropsychiatric illnesses [1]–[3]. Frequent and longitudinal monitoring of neurophysiological “biomarkers” could enable early detection of disease pathogenesis, and enhance understanding of the neurophysiological impairments underlying these disorders. Thus, there is great interest in developing techniques to detect neurophysiological biomarkers associated with impairments in the brain's functional health.

Electroencephalography (EEG) is a widely used tool for measuring the electrophysiology of the brain. EEG detects and records microvolt-scale fluctuations of electric potentials over the cortex with very high temporal resolution [4]. A number of approaches have been proposed to extract features of neurophysiologic relevance from the recording. One such approach is to use characteristics of the recorded oscillations to define “states” of the signal that evolve over time. For example, state characteristics such as chaotic complexity [5] or synchronicity [6] have been extracted from resting-state EEG. In this approach, brain activity is described in relation to state characteristics, such as the duration or frequency of occurrence of certain states.

Microstate analysis is one such method that defines states of the multichannel EEG signal by spatial topographies of electric potentials over the electrode array. This method was first proposed by Lehmann et al. (1987), who showed that the α frequency band (8–12 Hz) of multichannel resting-state EEG could be parsed into discrete states in this way [7]. When the multichannel resting-state EEG signal is considered as a time series of topographies of electric potentials, two remarkable properties emerge. First, although there are a large number of possible topographies in multichannel recording, a majority of the signal can be represented by surprisingly few maps. Interestingly, most studies of resting-state EEG consistently find the same four archetypal maps that explain more than 70% of the total topographic variance. Second, there is a well-defined temporal structure of these maps, in that a single topography remains dominant for about 80–120 ms before abruptly transitioning to another topography. These periods of quasi-stability of a single topography are “microstates.” Thus, the multichannel EEG signal can be represented by a single time series of microstates alternating among themselves at discrete intervals.

Compared to traditional frequency power EEG analysis, spatial analysis of EEG using microstates has several advantages. Perhaps most importantly, spatial EEG analysis does not assume the EEG signal is a linear dynamical system that can be represented through the Fourier series as a linear function of a set of sine waves. The spatial topography of the EEG signal can be defined at any point in time independently of the preceding or subsequent topography and therefore has millisecond resolution, unlike conventional frequency power analysis that integrates activity over seconds. Indeed, although microstates were initially described in the alpha frequency band, they can in fact be defined within any signal bandwidth; the dominant generator frequency in any given bandwidth dictates the speed of polarity inversions, but resting-state microstate topographies are considered independently of polarity (i.e. two topographies with opposite polarities are considered the same microstate). Microstates are therefore better suited to detect rapid, dynamic activity in large-scale neurocognitive networks than traditional frequency analysis of EEG. These large-scale neural networks, which link spatially distributed cortical areas into functional entities, have been shown to underlie complex neurocognitive activities [8], [9], including those that occur at rest, the so-called “resting-state networks” (RSNs). Accordingly, recent data has indicated that individual microstates may correspond to specific RSNs identified in functional magnetic resonance imaging (fMRI) studies, as there appears to be a temporal correlation between the appearance of microstates and specific RSN activity [10]–[12]. Spatial EEG signal analysis with microstates may be a valuable approach to studying these and other large-scale neurocognitive networks in health and disease.

Consistent with the idea that EEG microstates may reflect the activity of large-scale neurocognitive networks, preliminary reports have found correlations between features of the microstate time series and various cognitive activities, behavioral states, and neuropsychiatric diseases. For example, microstates of certain topographies have a shorter average lifespan in schizophrenia [13], are longer in panic disorder [14], shortened in Alzheimer's disease [15], and appear more frequently in Tourette's syndrome [16]. Neurotropic drugs commonly used to treat neuropsychiatric disease alter microstate features [17], [18]. Microstates vary with cognitive/behavioral states such as drowsiness [19], sleep stages [20], age [21], and even personality characteristics [22]. Pre-stimulus spontaneous EEG microstates also have profound impacts on the electrophysiological [23]–[25] and perceptual [26]–[28] responses to stimuli. These preliminary reports suggest that features of the microstate time series may be related to the neurophysiological basis of these disorders, brain states, and cognitive functions, and can potentially offer insight into the function of the brain in health and disease [29].

The aforementioned cross-sectional studies of microstate features demonstrate intriguing relationships between microstates and disease. To further explore the significance of the microstate time series and its potential utility in the detection of neurophysiological changes underlying disease, longitudinal cohort studies are required to characterize microstate changes over time in individual patients. However, the design of these studies is difficult, because limited information exists about the variance in common microstate features and the test-retest reliability of these values. Furthermore, although there are several different methodological approaches to microstate analysis, few studies have assessed the validity and consistency of these various methods.

In this study, we investigate the test-retest reliability of resting-state EEG microstate features in healthy subjects across three sessions. We extract four of the most common features from the time series, namely, the topography that defines each microstate, the average lifespan of each microstate, the frequency of appearance of each microstate, and the fraction of total time covered by each microstate. We chose these features because they are the most commonly examined in previous studies, and changes in each of them have been associated with one or more neuropsychiatric disorders.

Microstate analysis involves two basic steps: first, a set of microstate topographies is selected, and second, the original data is re-expressed as an alternating sequence of these microstate topographies from which values of interest can be calculated. The first step is usually performed by mathematical clustering of maps in the original data. A single set of microstate topographies can be identified for all subjects (i.e. all of the original data is clustered together), or a unique set of topographies can be generated for subsets of subjects (e.g. different experimental groups may be assigned unique microstates, or each recording may be assigned an individual set of microstates). We tested the consistency of microstate analysis using three different approaches. First, we assumed one set of four *global* maps and identified a single set of four maps that was applied to all subjects across all sessions. Second, we generated a set of topographies *by session* independently. Third, we generated a set of topographies *by recording*, i.e. generated four maps for each individual recording. We also compared two common clustering algorithms (topographic atomize and agglomerate hierarchical clustering and *k*-means clustering) and assessed whether microstate analysis can be performed reliably with as few as 8 electrodes.

## Methods

### Subjects

We studied 10 healthy subjects (mean age: 30±10 yr, 5 females). Subjects were recruited through advertisement in the greater Boston area. Exclusion criteria included a self-reported medical illness and a history of drug or alcohol abuse. All participants gave their written informed consent, and the protocol was approved by the local ethics committee at the Beth Israel Deaconess Medical Center in accordance with the Declaration of Helsinki.

### EEG Recording

Data used in this study was collected as part of a baseline assessment in a larger research study investigating the effect of non-invasive brain stimulation on various cortical processes. Subjects were instructed to sit in a comfortable armchair. Approximately three minutes of resting-state, eyes-closed EEG were recorded in three sessions separated by at least 48 hours. EEG recording was obtained through a 32-channel EEG system (BrainProducts, GmbH) with the CPZ and AFZ electrodes set as reference and ground electrode, respectively. EOG was recorded through two channels placed underneath each eye. The data was sampled at 5 kHz with the online filter setting set to DC to 1 kHz. The skin/electrode impedance was kept below 5 kOhm.

### EEG Preprocessing

Data were imported into MATLAB (The MathWorks, Inc., Natick, MA, USA) for preprocessing. The open-source signal processing functions available through the EEGLAB toolbox version 11b [30] were used for data import and preprocessing. The EEG waveforms were epoched into segments of 2 second duration and downsampled to 2 kHz. A notch filter (band-stop: 55–65 Hz) was used to remove the 60 Hz noise. EEG signals were band-pass filtered to the frequency range of 1–50 Hz to further minimize contamination by high-frequency artifact. A second-order infinite impulse response (IIR) Butterworth filter was employed, and both forward and backward filtering was applied (MATLAB function ‘filtfilt’) to maintain zero phase shift. All epochs were manually reviewed, and trials and channels containing eye movements, muscle activity, or any other non-physiological artifact were discarded. The data were then average re-referenced.
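The zero-phase band-pass step can be illustrated with a minimal Python sketch. This is not the authors' MATLAB/EEGLAB pipeline: it assumes SciPy's `butter` and `filtfilt` as stand-ins for the MATLAB functions, and the function name `bandpass_zero_phase`, the synthetic signals, and the mapping of "second order" onto SciPy's `order` argument are all illustrative assumptions.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def bandpass_zero_phase(data, fs, low=1.0, high=50.0, order=2):
    """Band-pass each channel with a Butterworth IIR filter run forward and backward.

    data: array of shape (n_channels, n_samples); fs: sampling rate in Hz.
    Running the filter in both directions cancels the phase shift.
    """
    nyquist = fs / 2.0
    b, a = butter(order, [low / nyquist, high / nyquist], btype="bandpass")
    return filtfilt(b, a, data, axis=-1)

# Hypothetical 2-second epoch sampled at 2 kHz: a 10 Hz rhythm, with and without slow drift.
fs = 2000.0
t = np.arange(0, 2.0, 1.0 / fs)
alpha_rhythm = np.sin(2 * np.pi * 10.0 * t)
slow_drift = np.sin(2 * np.pi * 0.1 * t)
epoch = np.vstack([alpha_rhythm + slow_drift, alpha_rhythm])
filtered = bandpass_zero_phase(epoch, fs)
```

Because the filter is applied in both directions, the 10 Hz component in the output stays aligned in time with the input, which matters for any analysis that depends on the timing of topographies.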

### EEG Power Analysis

The EEGLAB function *spectopo* was used to obtain the power spectrum. The absolute and relative power was obtained for delta (1–3.5 Hz), theta (4–7 Hz), alpha (8–12 Hz), and beta (12–30 Hz) frequency bands.

### EEG Microstate Analysis

EEG microstate analysis was conducted using the freely-available CARTOOL software [31]. Preprocessed data were imported into CARTOOL, band passed to 1–30 Hz, and downsampled to 200 Hz before microstate analysis as described below.

In microstate analysis, the multichannel EEG signal is considered as a series of instantaneous topographies of electric potentials. We identified points with the greatest signal-to-noise ratio (SNR) by calculating the global field power (GFP) of each topography in the time series. The GFP at each point in time is equal to the root mean square across the average-referenced electrodes or, equivalently, the standard deviation of the signal at all electrodes:

$$\mathrm{GFP}(t) = \sqrt{\frac{\sum_{i=1}^{n}\left(v_i(t) - \bar{v}(t)\right)^2}{n}} \tag{1}$$

where $v_i(t)$ is the voltage at electrode *i* at time *t*, $\bar{v}(t)$ is the mean voltage across all electrodes at time *t*, and *n* is the number of electrodes. Maps that occur at local maxima in the GFP curve (i.e. all points with GFP higher than the preceding and following point) represent instants of highest field strength and greatest SNR. Furthermore, because the field topography remains essentially constant between two local minima of the GFP curve, topographies at GFP maxima are representative of topographies at surrounding points in time [7], [32]. Thus, data reduction of the original signal to points at local GFP maxima is a valid method of enhancing topographic SNR. These maps at local GFP maxima, hereafter referred to as the “original maps,” were extracted and submitted to further analysis (**Figure 1**).

(**A**) The GFP (drawn in red) is calculated at each instant of the multichannel EEG recording. Peaks of the GFP curve represent moments of highest SNR. At peaks of the GFP curve, the potential recorded at each electrode of the multichannel signal is plotted onto a map of the channel array. This collection of maps is entered into a clustering algorithm (TAAHC or *k*-means clustering), which results in a small number of representative microstate maps that explain a large proportion of the global topographic variance. Four topographies are repeatedly found using this method; these maps are labeled A, B, C, or D in the figure. Crosshairs indicate points of maximum or minimum recorded electric potential. (**B**) The original maps at peaks of the GFP curve are assigned to a microstate class A, B, C, or D based on the degree of correlation with the microstate maps and statistical smoothing of the time series. This reassignment results in a representation of the original multichannel data as an alternating series of microstates A, B, C, and D. A microstate is considered dominant in the time during which all successive original maps are assigned to the same microstate class, starting and ending at the midpoint in time between the last original map of the preceding microstate and the first original map of the following microstate, respectively. Each period of dominance is considered a unique appearance of a microstate. The frequency of a microstate is the number of unique appearances per second. The coverage of each microstate is the fraction of total recording time that each microstate is dominant.
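The GFP calculation and the selection of maps at local GFP maxima can be sketched as follows. This is an illustrative NumPy sketch rather than the CARTOOL implementation; the function names and the random test signal are assumptions made for the example.

```python
import numpy as np

def global_field_power(eeg):
    """GFP per sample for eeg of shape (n_electrodes, n_samples).

    Each time point is average-referenced, then the root mean square
    across electrodes is taken (the spatial standard deviation).
    """
    avg_ref = eeg - eeg.mean(axis=0, keepdims=True)
    return np.sqrt((avg_ref ** 2).mean(axis=0))

def gfp_peak_indices(gfp):
    """Indices of samples whose GFP is higher than both neighbouring samples."""
    is_peak = (gfp[1:-1] > gfp[:-2]) & (gfp[1:-1] > gfp[2:])
    return np.flatnonzero(is_peak) + 1

# Hypothetical 30-electrode recording with 400 samples of random data.
rng = np.random.default_rng(0)
eeg = rng.standard_normal((30, 400))
gfp = global_field_power(eeg)
peaks = gfp_peak_indices(gfp)
original_maps = eeg[:, peaks].T  # one row per "original map"
```

Each row of `original_maps` is the topography at one GFP peak; these rows are what a clustering step would receive as input.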

Microstate analysis involves two basic steps – first, a set of microstate maps is identified, and second, this set of maps is fit onto the original data to re-express the multichannel EEG as a sequence of microstates (**Figure 1**).

### 1. Derivation of Maps Globally, by Session, and by Recording

We chose *a priori* to identify four group-level classes in order to remain consistent with the majority of previous studies, which also use four microstate classes. We compared three different strategies for identifying the four maps that would be used to identify microstates in our original data. In the *global* maps strategy, we clustered original maps from each recording into four maps and then entered this set of 4×30 = 120 maps into another round of clustering to identify four *global* maps that were then fit to all of the original maps. This is similar to the strategy used by Lehmann et al. (2005) [13]. In the *by session* strategy, we clustered original maps from each recording into four maps, and then entered these maps into another round of clustering separately for each session. This resulted in three separate sets of four microstate maps (one for each of three sessions). The maps from each session were fit onto all of the original maps from the respective session. Finally, in the *by recording* strategy, we clustered original maps from each recording into four maps, and fit each of these 30 sets of four maps onto the original maps from its respective recording.

#### 1.1. Topographic Atomize and Agglomerate Hierarchical Clustering (TAAHC).

The original maps were submitted to a modified hierarchical clustering algorithm known as the topographic atomize and agglomerate hierarchical clustering (TAAHC) method [33], as implemented in the CARTOOL program [31]. Briefly, all maps submitted to the procedure are initially considered to be independent clusters. In each iteration of the algorithm, the “worst” cluster is identified and split into its constituent maps (“atomized”). The “worst” cluster in each iteration is the one with the lowest summated correlation between each constituent map and the average cluster map. Correlation is analogous to the Pearson product-moment correlation coefficient between two topographies:

$$C_{u,v} = \frac{\sum_{i} u_i v_i}{\sqrt{\sum_{i} u_i^2}\,\sqrt{\sum_{i} v_i^2}} \tag{2}$$

where $u_i$ and $v_i$ are the average-referenced potentials at electrode *i* in the two topographies being compared, and sums are taken over *i* electrodes. Maps of the “worst” cluster are redistributed (“agglomerated”) to any of the remaining clusters to which they are most strongly correlated. This process is continued until the desired number of clusters is achieved.

We used TAAHC to cluster the original maps from each subject and session into four cluster maps for each subject and session. In the *by recording* strategy, we fit these four cluster maps onto each recording. In the *by session* strategy, we submitted these 120 cluster maps (4 from each of 10 subjects over 3 sessions) to another round of clustering separately for each session, and fit the maps from each session onto original maps from the respective session. Finally, in the *global* maps strategy, we submitted these 120 cluster maps to the TAAHC algorithm to identify four group-level cluster maps. These four maps were the “microstate maps” and were labeled class A, B, C, and D (**Figure 2**). Cluster maps identified *by recording* and *by session* were labeled A, B, C, and D depending on their degree of correlation with maps A, B, C, and D from the *global* maps strategy. After these labels were assigned to the sets of 4 cluster maps derived *by recording* and *by session*, these unique sets of 4 cluster maps were fit onto the original data.

Crosshairs indicate points of maximum or minimum recorded electric potential. The four microstate classes A, B, C, and D have been reported in a number of prior studies. TAAHC and *k*-means clustering give almost identical microstate maps (first two rows). Because microstates are defined by the topography of electric potentials over the entire scalp, it is possible to identify microstates with fewer electrodes. The microstate classes A, B, C, and D are identifiable in 19 and 8 electrode data. These lower-resolution electrode arrays give highly reliable results.

#### 1.2. Fitting Microstate Maps onto Original Maps.

In the final step, original maps are labeled either A, B, C, or D depending on which microstate map has the highest correlation to the original map.
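The labeling step can be sketched in a few lines of NumPy. This is an illustrative sketch, not CARTOOL's fitting routine. Two assumptions are made loudly here: the absolute value of the correlation is used so that maps of opposite polarity receive the same label (following the polarity convention stated in the Introduction), and the temporal smoothing mentioned in the Figure 1 caption is omitted. The function names and synthetic templates are hypothetical.

```python
import numpy as np

def spatial_correlation(maps_a, maps_b):
    """Correlation between every row of maps_a and every row of maps_b.

    Rows are topographies; each is average-referenced and scaled to unit
    length, so the dot product is a Pearson-style spatial correlation.
    """
    a = maps_a - maps_a.mean(axis=1, keepdims=True)
    b = maps_b - maps_b.mean(axis=1, keepdims=True)
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

def fit_maps(original_maps, microstate_maps, class_names="ABCD"):
    """Label each original map with the best-correlated microstate class."""
    # Assumption: polarity is ignored by taking the absolute correlation.
    corr = np.abs(spatial_correlation(original_maps, microstate_maps))
    return [class_names[k] for k in corr.argmax(axis=1)]

# Hypothetical templates and three noisy original maps (the second is polarity-inverted).
rng = np.random.default_rng(1)
templates = rng.standard_normal((4, 30))
noisy = np.vstack([templates[2], -templates[0], templates[3]])
noisy = noisy + 0.05 * rng.standard_normal((3, 30))
assigned = fit_maps(noisy, templates)
```

In this toy example the second map is an inverted copy of template A and is still assigned to class A, which is the behaviour the polarity convention calls for.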

#### 1.3. Extraction of Features from the Microstate Time Series.

After labeling at the local maxima of the GFP curve, we could express each of the original signals as an alternating sequence of maps A, B, C, and D. From this microstate time series, we calculated several features. Our outcomes of interest were: (1) topographies of the four cluster maps identified in each clustering strategy, (2) average lifespan of each microstate and all microstates, (3) frequency of appearance of each microstate and all microstates, and (4) fraction of total covered time of each microstate. We calculated these features separately for each subject in each session.

1.3.1. *Average Lifespan of Microstates:* The lifespan of a microstate was calculated as the time during which all successive original maps were assigned to the same microstate class, starting and ending at the midpoint in time between the last original map of the preceding microstate and the first original map of the following microstate, respectively (**Figure 1b**) [13].

1.3.2. *Frequency of Appearance of Microstates:* We calculated the frequency of appearance of each microstate class by counting the number of unique appearances of each microstate divided by the total length of recording (**Figure 1b**) [13].

1.3.3. *Fraction Total Covered Time of Microstates:* We calculated the fraction total covered time (coverage) of each microstate by taking the ratio of the total time spent in each microstate over the total recording time [13]. Note that the coverage of all four microstates can be calculated from their respective average lifespans and frequencies and the total length of recording, i.e. these are not completely independent measures.
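The three temporal features can be computed from a labeled sequence of GFP peaks as sketched below. This is an illustrative sketch, not the CARTOOL implementation. Segment boundaries are placed at the midpoint between neighbouring peaks of different classes, as described above; the handling of the first and last segments (which here begin at the first peak and end at the last peak) is an assumption, as are the function name and the toy input.

```python
import numpy as np

def microstate_features(peak_times, labels, classes="ABCD"):
    """Average lifespan (s), frequency (per s), and coverage for each class.

    peak_times: times in seconds of the GFP peaks (increasing).
    labels: microstate class assigned to each peak.
    """
    peak_times = np.asarray(peak_times, dtype=float)
    labels = np.asarray(list(labels))
    # Index of the first peak of every new segment (where the label changes).
    change = np.flatnonzero(labels[1:] != labels[:-1]) + 1
    # Boundaries lie halfway between the last peak of one segment and the first of the next.
    midpoints = (peak_times[change - 1] + peak_times[change]) / 2.0
    seg_start = np.concatenate(([peak_times[0]], midpoints))
    seg_end = np.concatenate((midpoints, [peak_times[-1]]))
    durations = seg_end - seg_start
    seg_labels = labels[np.concatenate(([0], change))]
    total_time = peak_times[-1] - peak_times[0]
    features = {}
    for c in classes:
        d = durations[seg_labels == c]
        features[c] = {
            "lifespan": d.mean() if d.size else float("nan"),
            "frequency": d.size / total_time,
            "coverage": d.sum() / total_time,
        }
    return features

# Toy sequence: seven peaks spaced 20 ms apart.
times = [0.00, 0.02, 0.04, 0.06, 0.08, 0.10, 0.12]
features = microstate_features(times, "AABBBAC")
```

For every class that appears, coverage equals average lifespan multiplied by frequency, which is the dependency among the three measures noted above.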

### 2. K-Means Clustering

To assess the reliability of microstate analysis performed with the *k*-means clustering algorithm, we repeated the entire analysis, but used *k*-means clustering instead of TAAHC to identify a set of microstate maps. In the *k*-means clustering method, clustering is first initialized and then entered into a convergence loop. During initialization, to find *n* clusters, *n* non-identical maps are randomly selected out of the set of maps entered into the analysis to serve as templates. All maps are then assigned to a cluster seeded by one of the *n* templates based on the degree of correlation to each template. In the convergence loop, all maps in each of the *n* clusters are averaged. These *n* average maps then serve as seeds for new clusters, and all input maps are again assigned to one of *n* clusters based on correlation to cluster seeds. A measure of the quality of the current cluster assignment is computed, in our case the global explained variance (GEV):

$$\mathrm{GEV} = \frac{\sum_{j=1}^{m}\left(\mathrm{GFP}_j \cdot C_{j}\right)^2}{\sum_{j=1}^{m}\mathrm{GFP}_j^2} \tag{3}$$

where $\mathrm{GFP}_j$ is the GFP of original map *j*, $C_j$ is the spatial correlation between original map *j* and the template map of the cluster to which it is assigned, and *m* is the number of original maps. The convergence loop is repeated until the quality of the cluster assignment does not improve. The entire initialization and convergence algorithm is repeated several times to increase the likelihood of finding an optimal set of *n* clusters. The procedure can be repeated for many values of *n*, so the topographies of any number of clusters can be derived. Because the initialization step picks *n* maps to serve as templates randomly, *k*-means clustering is non-deterministic. To overcome this drawback, the algorithm is repeated 300 times for each value of *n* to minimize run-to-run variance. Maps identified using *k*-means clustering were labeled A, B, C, or D depending on degree of correlation with maps extracted using TAAHC with the *global* maps approach.
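The initialization and convergence loop described above can be sketched in NumPy as follows. This is a simplified illustration and not the CARTOOL implementation. Assumptions: maps are average-referenced and scaled to unit length before averaging, polarity is ignored by using the absolute correlation and flipping signs before averaging, and seeds are drawn as distinct rows (which does not strictly guarantee non-identical maps). All function names and the synthetic data are hypothetical.

```python
import numpy as np

def normalize_maps(maps):
    """Average-reference each row and scale it to unit length."""
    centered = maps - maps.mean(axis=1, keepdims=True)
    return centered / np.linalg.norm(centered, axis=1, keepdims=True)

def global_explained_variance(maps, templates, labels):
    """GEV of an assignment: GFP-weighted squared correlation to the assigned template."""
    gfp = maps.std(axis=1)
    corr = np.sum(normalize_maps(maps) * normalize_maps(templates)[labels], axis=1)
    return np.sum((gfp * corr) ** 2) / np.sum(gfp ** 2)

def kmeans_microstates(maps, n_clusters=4, n_restarts=300, max_iter=100, seed=0):
    """Return (best GEV, templates, labels) over repeated random initializations."""
    rng = np.random.default_rng(seed)
    unit = normalize_maps(maps)
    best = (-np.inf, None, None)
    for _ in range(n_restarts):
        seeds = rng.choice(len(unit), size=n_clusters, replace=False)
        templates = unit[seeds].copy()
        previous = -np.inf
        for _ in range(max_iter):
            corr = unit @ templates.T
            labels = np.abs(corr).argmax(axis=1)  # assumption: polarity ignored
            current = global_explained_variance(maps, templates, labels)
            if current > best[0]:
                best = (current, templates.copy(), labels.copy())
            if current <= previous:  # stop once the GEV no longer improves
                break
            previous = current
            signs = np.sign(corr[np.arange(len(unit)), labels])
            for k in range(n_clusters):
                members = labels == k
                if members.any():
                    mean_map = (signs[members, None] * unit[members]).mean(axis=0)
                    templates[k] = mean_map / np.linalg.norm(mean_map)
    return best

# Synthetic data: 200 noisy maps drawn from four hypothetical templates with random polarity.
rng = np.random.default_rng(42)
true_templates = rng.standard_normal((4, 30))
membership = rng.integers(0, 4, size=200)
amplitude = rng.uniform(0.5, 1.5, size=200) * rng.choice([-1.0, 1.0], size=200)
maps = amplitude[:, None] * true_templates[membership] + 0.1 * rng.standard_normal((200, 30))
best_gev, templates, labels = kmeans_microstates(maps)
```

Keeping the best result across restarts is what reduces run-to-run variance: a single random initialization can settle in a poor local optimum, while the best of many restarts is far more stable.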

### 3. Microstates from Smaller Electrode Arrays

To investigate whether microstate analysis can be reliably conducted using fewer electrodes, we selected 19 electrodes from the original 30-channel recording (AF3, AF4, F7, F3, Fz, F4, F8, T7, C3, C4, T8, Cz, P7, P3, Pz, P4, P8, O1, and O2) and performed TAAHC on these 19-channel data to identify 4 microstates. We also selected 8 electrodes from the original recording (F3, F4, C3, Cz, C4, P3, Pz, and P4) and again performed TAAHC to extract 4 microstates using the *global* maps strategy (**Figure 2**).

### 4. Determining Test-Retest Reliability and Consistency of Microstate Characteristics

All statistical analyses were performed using the SAS software.

Cronbach's α is a well-established measure of the internal consistency and reliability of a test, and was calculated to determine the reliability of microstate characteristics across time, and the consistency of these characteristics across different methods (TAAHC *vs* *k*-means clustering and various electrode arrays). It is calculated as:

$$\alpha = \frac{K}{K-1}\left(1 - \frac{\sum_{i=1}^{K}\sigma_{Y_i}^{2}}{\sigma_{X}^{2}}\right) \tag{4}$$

where *K* is the number of repeated tests, $\sigma_{X}^{2}$ is the variance of the observed data, and $\sigma_{Y_i}^{2}$ is the variance of component *i*. Cronbach's α is a measure of reliability relative to between-subject variance. To give a measure of absolute reliability of these features, we also calculated the standard error of measurement (SEM) for microstate characteristics, where appropriate.
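As an illustration of the Cronbach's α calculation, the sketch below treats each session as one component and each subject as one observation. It is a NumPy sketch rather than the SAS procedure used in the study; the function name and the example values are hypothetical, and the SEM is not implemented because its exact formula is not stated in the text.

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for scores of shape (n_subjects, K sessions)."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    # Sum of the between-subject variances of each session (component).
    component_variance = scores.var(axis=0, ddof=1).sum()
    # Variance of each subject's total across sessions.
    total_variance = scores.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - component_variance / total_variance)

# Hypothetical average lifespans (ms) for 4 subjects over 3 sessions.
lifespans = [
    [80, 82, 79],
    [95, 97, 96],
    [70, 69, 72],
    [88, 90, 87],
]
alpha = cronbach_alpha(lifespans)
```

In this toy example the subjects keep nearly the same rank order in every session, so α is close to 1; values that fluctuate within subjects as much as they differ between subjects would drive α toward 0.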

#### 4.1. Comparison of Microstate Maps Derived from TAAHC and K-Means Clustering.

We tested whether the four microstate maps extracted with *k*-means clustering differed significantly from those extracted with TAAHC using topographic analysis of variance (TANOVA). TANOVA is a randomization procedure that uses the GFP of the electrode-by-electrode subtraction between two maps as the test statistic (effect size) of the difference between maps [34]. When the maps being compared are GFP-normalized, the test statistic is equal to the global map dissimilarity (GMD):

$$\mathrm{GMD} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(u_i - v_i\right)^2} \tag{5}$$

where $u_i$ and $v_i$ are the potentials at electrode *i* in the GFP-normalized maps being compared, and *n* is the number of electrodes.

To derive the distribution of the test statistic under the null hypothesis that there is no difference between maps from *k*-means clustering and TAAHC, we randomly shuffled maps from each subject between two groups and calculated the test statistic between the average group maps, and repeated this procedure 5000 times. The fraction of iterations with a test statistic greater than the one calculated from the actual data was the p value.
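The randomization procedure can be sketched as a paired permutation test. This is an illustrative NumPy sketch and not the implementation used in the study. It assumes one map per subject per method, shuffles by randomly swapping the two methods' maps within each subject, and counts permuted statistics strictly greater than the observed one, following the description above. Function names and the synthetic maps are hypothetical.

```python
import numpy as np

def gfp_normalize(topography):
    """Average-reference a map and divide by its GFP (spatial standard deviation)."""
    centered = topography - topography.mean()
    return centered / centered.std()

def global_map_dissimilarity(u, v):
    """GMD: the GFP of the difference between two GFP-normalized maps."""
    diff = gfp_normalize(u) - gfp_normalize(v)
    return np.sqrt(np.mean(diff ** 2))

def tanova_paired(maps_a, maps_b, n_permutations=5000, seed=0):
    """Return (observed GMD, p value) for per-subject maps of shape (n_subjects, n_electrodes)."""
    rng = np.random.default_rng(seed)
    observed = global_map_dissimilarity(maps_a.mean(axis=0), maps_b.mean(axis=0))
    exceed = 0
    for _ in range(n_permutations):
        # Randomly swap the two maps within each subject.
        swap = rng.random(len(maps_a)) < 0.5
        perm_a = np.where(swap[:, None], maps_b, maps_a)
        perm_b = np.where(swap[:, None], maps_a, maps_b)
        stat = global_map_dissimilarity(perm_a.mean(axis=0), perm_b.mean(axis=0))
        exceed += int(stat > observed)
    return observed, exceed / n_permutations

# Hypothetical per-subject maps: two clearly different topographies, then two near-identical ones.
rng = np.random.default_rng(7)
topo_1, topo_2 = rng.standard_normal((2, 30))
different_a = topo_1 + 0.05 * rng.standard_normal((10, 30))
different_b = topo_2 + 0.05 * rng.standard_normal((10, 30))
gmd_diff, p_diff = tanova_paired(different_a, different_b, n_permutations=1000)
similar_b = different_a + 0.05 * rng.standard_normal((10, 30))
gmd_same, p_same = tanova_paired(different_a, similar_b, n_permutations=1000)
```

When the two methods yield genuinely different topographies, almost every shuffle pulls the group averages closer together, so few permuted statistics exceed the observed GMD and the p value is small.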

#### 4.2. Test-Retest Reliability of Average Microstate Lifespan, Frequency, and Coverage Across 3 Sessions.

To determine the test-retest reliability of the average lifespan, frequency, and coverage of each and all microstates across 3 sessions, we calculated Cronbach's α and the SEM of these values from 30-electrode data analyzed with TAAHC clustering, as well as with *k*-means clustering, as described above.

#### 4.3. Consistency of Average Microstate Lifespan, Frequency, and Coverage between TAAHC and K-Means Clustering.

To determine the level of consistency between microstate features extracted using TAAHC and *k*-means clustering, we calculated Cronbach's α between the average microstate lifespan, frequency, and coverage measured using microstates extracted by TAAHC and *k*-means clustering.

#### 4.4. Consistency of Average Microstate Duration, Frequency, and Coverage among Channel Arrays with 30, 19, and 8 Electrodes.

To determine the level of consistency between microstate features using topographies extracted using channel arrays with 30, 19, and 8 electrodes with TAAHC, we calculated Cronbach's α between the average microstate lifespan, frequency, and coverage measured with these arrays.

## Results

After the data were preprocessed and epochs with artifacts removed, we had a mean of 127.87 seconds of data (SD = 23.87, range = 80–204) per recording that were submitted to microstate analysis from which we extracted the “original maps” at local maxima in the GFP curve. We chose *a priori* to cluster the original maps from each session into four microstates. Four microstate maps had a mean GEV of 69.93% (SD = 3.58, range = 65.34–77.99) across all recordings using TAAHC.

### 1. Reliability with Maps Derived Globally, by Session, and by Recording

We calculated test-retest reliability of the average lifespan, frequency, and coverage with maps derived *globally*, *by session*, and *by recording* using the TAAHC method. The four *global* maps appear in **Figure 2**. The average lifespan, frequency, and coverage of each microstate calculated using maps derived *globally* appear in **Table 1**, along with Cronbach's α and SEM. The mean Cronbach's α for these values is 0.811. Most values have Cronbach's α>0.7 and SEM is less than 10% of the mean for all values.

Results from maps derived *by session* and *by recording* are also presented in **Table 1**. For maps derived *by session*, the mean Cronbach's α is 0.648 and SEM is approximately 10% of the mean for all values. For maps derived *by recording*, the mean Cronbach's α is 0.523 and SEM is approximately 10% of the mean.

### 2. Reliability with Maps Derived Using K-Means Clustering

To determine the reliability of the analysis conducted with maps derived using the *k*-means clustering algorithm, we used *k*-means clustering to identify a set of *global* maps and assessed the reliability of the resulting features over three sessions. The four *global* maps derived using *k*-means appear in **Figure 2**. These four maps had a mean GEV of 70.92% (SD = 3.65, range = 65.88–78.70) across all recordings. TANOVA analysis of maps A, B, C, and D extracted using TAAHC and *k*-means reveals no significant difference in the topography of any of the maps (p>0.01). The average lifespan, frequency, and coverage of microstates calculated using these maps appear in **Table 2**. The mean Cronbach's α for these values is 0.830 and SEM is less than 10% of the mean for all values.

We also assessed the degree of agreement between values calculated using maps derived from *k*-means clustering and TAAHC in a single session. These Cronbach's α values appear in **Table 2**. All values are above 0.9.

### 3. Reliability with 19 and 8 Electrodes

To determine whether microstate analysis can be reliably conducted with fewer electrodes, we repeated the analysis after selecting 19 and 8 electrodes from the original 30-electrode array using TAAHC and a *global* maps strategy. The four microstate maps derived using 19 and 8 electrodes appear in **Figure 2**. We could clearly identify maps belonging to classes A, B, C, and D in both 19 and 8 electrode data. Average microstate duration, frequency, and coverage from 19 and 8 electrode data appear in **Table 2**. In 19 electrode data, these microstate features have a mean Cronbach's α of 0.873 and SEM approximately 10% of mean values. In 8 electrode data, these features have a mean Cronbach's α of 0.906 and similar SEM.

We also determined the degree of consistency between values extracted from 30, 19, and 8 electrode data in a single session. These Cronbach's α values appear in **Table 3**. Most values are highly consistent across these electrode arrays (average Cronbach's α = 0.834). Notably, the consistency of the average lifespan, frequency, and coverage of microstates C and D tended to be lower than corresponding values for A and B.

### 4. Correlations Among Microstate Features

We calculated the correlation between each pair of microstate features in a single session. As expected, microstate lifespan is inversely correlated with frequency (average correlation R = −0.72 across 4 pairs of microstate lifespans and frequencies) for each individual microstate (e.g. lifespan of microstate A compared to frequency of microstate A, etc.). When comparing these values among all 4 microstates, we found positive correlations among all average lifespans (average correlation R = 0.79) and frequencies (average correlation R = 0.51) (e.g. comparing lifespans of microstates A, B, C, and D) in each individual recording.
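
One way to see why an inverse relationship between lifespan and frequency is expected: for any single class, coverage equals mean duration multiplied by frequency, so at a roughly stable coverage a longer lifespan implies fewer occurrences per second. The sketch below computes the three features from a per-sample label sequence and illustrates this identity. It is a simplified illustration, not the pipeline used in this study; the function name, the label sequence, and the sampling rate are hypothetical.

```python
from itertools import groupby


def microstate_features(labels, sfreq):
    """Mean duration (ms), frequency (per s), and coverage for each class.

    labels: sequence of class labels, one per sample (hypothetical input).
    sfreq:  sampling rate in Hz.
    """
    total_s = len(labels) / sfreq
    # Collapse the sequence into (label, run length in samples) pairs.
    runs = [(k, sum(1 for _ in g)) for k, g in groupby(labels)]
    feats = {}
    for k in sorted(set(labels)):
        lengths = [n for lab, n in runs if lab == k]
        time_s = sum(lengths) / sfreq
        feats[k] = {
            "duration_ms": 1000.0 * time_s / len(lengths),
            "frequency_hz": len(lengths) / total_s,
            "coverage": time_s / total_s,
        }
    return feats


# Hypothetical 1-second recording sampled at 10 Hz.
features = microstate_features(list("AAABBAAAAC"), sfreq=10.0)
a = features["A"]
# Identity: coverage = mean duration (s) x frequency (per s).
identity_gap = a["coverage"] - (a["duration_ms"] / 1000.0) * a["frequency_hz"]
```

Here class A occurs in two segments of 3 and 4 samples, giving a mean duration of 350 ms, 2 occurrences per second, and a coverage of 0.7.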

### 5. Correlation between Microstate Features and Spectral Power

To explore the relationship between microstate features and the power spectra of EEG recordings, we also calculated the correlation between various microstate features and the absolute and relative power in the delta, theta, alpha, and beta frequency bands averaged across all electrodes. Multiple regression modeling showed that relative beta power is negatively associated (p = 0.0001) and relative alpha power is positively associated (p = 0.0174) with global average microstate duration (R^{2} = 0.92). Conversely, relative alpha power is negatively associated (p = 0.023) and relative beta power is positively associated (p = 0.0003) with overall microstate frequency (R^{2} = 0.89). Power is not significantly associated with coverage fraction of any microstate class. There were no significant correlations between any microstate feature and absolute power in any frequency band.

## Discussion

In this study, we sought to assess the test-retest reliability of resting-state EEG microstate analysis in healthy subjects over time. We used a number of variations of the method to determine the reliability and the degree of consistency among these approaches. This study has four major findings. First, we found that using a *global* set of microstates for all subjects yields average microstate durations, frequencies, and coverage fractions that have high Cronbach's α, indicating excellent test-retest reliability. Second, we found that the use of *global* maps yields results that are in general more reliable than maps identified *by session* or *by recording*. Third, we showed that TAAHC and *k*-means clustering yield highly consistent results. Finally, we showed that microstate analysis can be reliably conducted with as few as 8 electrodes.

The maps A, B, C, and D (**Figure 2**) have been reported by numerous previous studies of resting-state EEG microstate analysis [5], [13], [21], and the average microstate lifespans, frequencies, and coverage fractions calculated in this study are in general agreement with prior studies (**Tables 1** and **2**). With few exceptions, microstate features calculated from a set of *global* maps have Cronbach's α>0.7. Cronbach's α is equivalent to the ICC(3,k) intraclass correlation coefficient and is a measure of between-subject variance relative to within-subject variance, i.e. it is a *relative* measure of test-retest reliability. A high (generally >0.7) Cronbach's α value suggests that variance in these values over three spaced sessions is small compared to the distribution of these values in the entire sample. The SEM values reported in **Tables 1** and **2** are measures of absolute test-retest reliability, and are approximately 10% of the mean for all values. We conclude that these values have high relative test-retest reliability.
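
As an illustration of the relative reliability measure discussed above, the following sketch computes Cronbach's α from a subjects × sessions table using the standard variance-based formula, treating each session as an "item". The function name and the example values are hypothetical and are not data from this study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a subjects x sessions table (list of rows).

    alpha = k / (k - 1) * (1 - sum of per-session variances / variance of
    per-subject totals), where k is the number of sessions.
    """
    k = len(scores[0])
    sessions = list(zip(*scores))  # one column per session
    session_var = sum(variance(col) for col in sessions)
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1.0 - session_var / total_var)


# Hypothetical mean durations (ms) for 4 subjects over 3 sessions.
durations = [
    [60, 62, 61],
    [70, 69, 72],
    [80, 83, 79],
    [90, 88, 91],
]
alpha = cronbach_alpha(durations)
```

In this toy table the differences between subjects are large relative to the session-to-session fluctuation within each subject, so α is close to 1. If the subjects were more similar to one another, the same within-subject fluctuation would produce a lower α, which is why α is a relative measure.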

We compared three different strategies for identifying microstate maps to be fit onto each EEG recording. In the *global* maps strategy, we derived a universal set of four maps using all sessions. We also derived maps *by session*, where maps were re-calculated for each session but held constant for all subjects within each session, and *by recording*, where maps were identified for each individual recording. We found that *global* maps gave the most reliable results. This suggests that microstate analysis is highly sensitive to the topographies that are fit onto the data and used to calculate values of interest. Minor differences in the microstate maps are introduced when maps are recalculated *by session* or *by recording*, which appear to generate within-subject error and lower test-retest reliability. This also indicates that comparison of microstate features, for example between two studies, is most valid when the same maps are used.

Our *a priori* selection of 4 microstates represented our data well, explaining about 70% of the global topographic variance of the data. We chose to cluster our data into 4 microstates to remain consistent with the majority of previous studies. However, methods of deriving a data-driven estimate of the number of microstates required to “best” explain the data have been proposed [33], [35]. The most common of these approaches is to minimize the cross-validation (CV) criterion, which is proportional to the ratio between the GEV and the degrees of freedom of the maps [35]. Importantly, the CV criterion is highly sensitive to the number of electrodes used, and some have argued against its use when fewer than 64 electrodes make up the channel array [36]. Another measure, the Krzanowski-Lai criterion, has recently been proposed for microstate analysis [31]. In our data, the CV criterion was on average minimized between 4 and 5 microstates for all recordings; thus, our selection of 4 microstates was similar to the data-driven estimate.
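
The sensitivity of the CV criterion to electrode count can be illustrated with a short sketch. It assumes the commonly cited form of the criterion, in which an estimate of residual (unexplained) variance is multiplied by a degrees-of-freedom penalty ((C − 1)/(C − 1 − K))² for C electrodes and K maps. This form is an assumption here and should be checked against [35]; the function names are hypothetical.

```python
def cv_penalty(n_electrodes, n_maps):
    """Degrees-of-freedom penalty ((C - 1) / (C - 1 - K))**2 (assumed form)."""
    dof = n_electrodes - 1 - n_maps
    if dof <= 0:
        raise ValueError("need n_electrodes - 1 > n_maps")
    return ((n_electrodes - 1) / dof) ** 2


def cv_criterion(residual_variance, n_electrodes, n_maps):
    """CV value: residual variance estimate times the penalty (assumed form)."""
    return residual_variance * cv_penalty(n_electrodes, n_maps)


# Penalty for 4 maps at the three electrode counts considered in this study.
penalties = {c: cv_penalty(c, 4) for c in (30, 19, 8)}
```

Under this assumed form, the penalty for 4 maps grows from about 1.35 with 30 electrodes to about 5.44 with 8 electrodes, so with sparse arrays the penalty term can dominate the residual variance and bias the estimate toward fewer maps.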

We compared the maps derived using TAAHC and *k*-means clustering to determine the extent of agreement between these two clustering algorithms. We found that TAAHC and *k*-means clustering (iterated 300 times) both give results with excellent test-retest reliability across three sessions. The four *global* maps derived using both methods are highly similar (**Figure 2**). Cronbach's α values for microstate features calculated using maps from these two methods are all above 0.9, indicating that TAAHC and *k*-means give highly consistent results.

Because microstate analysis considers the topography of potentials over the entire cortex in a global representation of brain state, and given the nature of the microstate maps extracted using 30 electrodes, we hypothesized that fewer electrodes could successfully identify the four microstate maps A, B, C, and D and be used to conduct microstate analysis reliably. To test this hypothesis, we eliminated all but 19 and 8 electrodes of our original data and repeated the analysis. We could clearly identify maps representing microstates A, B, C, and D in both 19 and 8 electrode data (**Figure 2**). The results of these analyses were also highly reliable (**Table 2**) and, interestingly, even appeared slightly *more* reliable than 30-electrode data. It is possible that elimination of superfluous electrodes reduced noise in the data and refined the results. We also compared the results from 30, 19, and 8 electrode data in a single session to assess the degree of consistency between these electrode arrays. In general, results from these arrays are consistent (**Table 3**). Notably, results of microstates C and D appear less consistent. This is unsurprising, as the topographies of C and D are similar and are probably more difficult to resolve with fewer electrodes. Nevertheless, these data suggest that microstate analysis can be conducted reliably with as few as 8 electrodes. This may be particularly relevant in the development of microstates as clinically useful biomarkers of disease for longitudinal assessment over time, because it reduces the invasiveness and inconvenience that are otherwise associated with clinical EEG studies.

Within individual recordings, we found positive correlations among the average durations and frequencies of all microstates, suggesting that individuals have a tendency toward relatively longer or shorter and relatively more or less frequent microstates in general. This likely reflects natural inter-subject variance, and may also be a function of age [21]. We were also interested in determining how microstate features relate to power in EEG frequency bands. We found that the global average microstate duration decreases and global average frequency increases with increasing relative power in higher spectral frequencies (beta *vs* alpha bands). These associations are likely due to the fact that increased power in higher frequencies reflects faster cortical oscillations, which gives more frequent local GFP maxima and enables finer resolution of microstate transitions. This also suggests that eyes-open EEG data might be expected to yield shorter and more frequent microstates, as alpha power is reduced in the eyes-open state. Our findings agree somewhat with the findings of Koenig et al. (2002), in which shortening overall average microstate lifespans with age was correlated with increasing relative power in higher frequency bands, although they reported lower correlation values (R^{2}<0.44) [21]. We did not find evidence of significant association between the coverage of any microstate and power in any frequency band, in agreement with the findings of Britz et al. (2010) [10].

The statistics we present here are an important contribution to the translation of microstates to clinical practice as potential biomarkers of neurophysiological health for longitudinal monitoring in individual patients. Unlike experimental paradigms, in which results from multiple subjects are aggregated to produce estimates of variance that are used to determine the significance of microstate differences between groups (see, for example, [13]–[16], [37]–[40]), assessing the significance of changes in microstates observed across repeated measurements in a single subject requires estimates of measurement reliability. To this end, for example, the SEM can be used to estimate a 95% confidence interval for microstate features outside of which differences in repeated measurements in a single individual can be reasonably attributed to changes in the true value, rather than measurement error [41], [42]. This estimate is given by $Z_{\alpha/2} \times \sqrt{2} \times \mathrm{SEM}$, where $Z_{\alpha/2} = 1.96$ for a Type I error threshold of 5%. Thus, for the overall microstate duration using 30 electrodes, TAAHC, and a *global* maps strategy, the SEM (6.41 ms from **Table 1**) suggests that, for repeated measurements in a single individual, a change in overall microstate duration of 17.77 ms can be considered significant with 95% confidence. Similarly, reliability can be used to estimate the false-positive and false-negative rate if microstate features are used as decision thresholds [43]. As Cronbach's α is an ICC, it can be used in power calculations in the design of large trials [44]–[46]. We encourage other investigators to use the values reported herein to optimally design future studies.
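
The 17.77 ms threshold quoted above is consistent with the usual minimal detectable change expression, 1.96 × √2 × SEM, where the factor √2 accounts for measurement error in both of the two measurements being compared. The sketch below reproduces the arithmetic; the function name is hypothetical, and only the SEM of 6.41 ms and the z-value of 1.96 come from the text.

```python
import math


def minimal_detectable_change(sem, z=1.96):
    """Smallest difference between two measurements that exceeds measurement
    error at the confidence level implied by z: z * sqrt(2) * SEM."""
    return z * math.sqrt(2.0) * sem


# SEM for overall microstate duration (ms), as quoted from Table 1.
mdc_ms = minimal_detectable_change(6.41)
```

Rounded to two decimals, `mdc_ms` matches the 17.77 ms reported in the text. The same function can be applied to the SEM of any other feature in the tables to obtain its corresponding threshold.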

## Conclusions

In this study, we found that when a *global* set of microstates is used to conduct microstate analysis over multiple sessions, resting-state EEG microstate analysis has high test-retest reliability in healthy subjects as measured by Cronbach's α and SEM. We also determined the consistency of the *k*-means clustering and TAAHC algorithms in extracting microstate maps. Finally, we found that microstate analysis can be reliably conducted with as few as 8 electrodes.

The microstate features we analyzed in this study have been shown to vary in altered cognitive/behavioral states and neuropsychiatric disease, and may be related to the neurophysiological changes that underlie these disorders. The clinical use of microstates as potential biomarkers of disease presupposes within-patient reliability of relevant features, so that changes in the features can reasonably be attributed to changes in neurophysiology. Our aim in this study was to assess the degree of this reliability. Our results indicate good reliability of all the features we examined, and suggest potential value in further exploring microstates as neurophysiological markers of disease in future studies.

## Acknowledgments

This study was supported in part by the Canadian Institute of Health Research (CIHR - 201102MFE-246635-181538), the Sidney R. Baer Jr. Foundation, the National Institutes of Health (R01HD069776, R01NS073601, R21 MH099196, R21 NS082870, R21 NS085491, R21 HD07616), Harvard Catalyst - The Harvard Clinical and Translational Science Center (NCRR and the NCATS NIH, M01-RR-01066, UL1 RR025758), and the Temerty Family through the Centre for Addiction and Mental Health (CAMH) Foundation. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. The content is solely the responsibility of the authors and does not necessarily represent the official views of Harvard Catalyst, Harvard University and its affiliated academic health care centers, the National Institutes of Health or the Sidney R. Baer Jr. Foundation.

## Author Contributions

Conceived and designed the experiments: FF APL. Performed the experiments: FF. Analyzed the data: FF AK. Contributed reagents/materials/analysis tools: APL. Wrote the paper: AK FF.

## References

- 1. Avila MT, McMahon RP, Elliott AR, Thaker GK (2002) Neurophysiological markers of vulnerability to schizophrenia: Sensitivity and specificity of specific quantitative eye movement measures. Journal of Abnormal Psychology 111:259–267.
- 2. Jelic V, Johansson SE, Almkvist O, Shigeta M, Julin P, et al. (2000) Quantitative electroencephalography in mild cognitive impairment: longitudinal changes and possible prediction of Alzheimer's disease. Neurobiology of Aging 21:533–540.
- 3. Ponomareva NV, Fokin VF, Selesneva ND, Voskresenskaia NI (1998) Possible Neurophysiological Markers of Genetic Predisposition to Alzheimer's Disease. Dementia and Geriatric Cognitive Disorders 9:267–273.
- 4. Ingber L, Nunez PL (2011) Neocortical dynamics at multiple scales: EEG standing waves, statistical mechanics, and physical analogs. Mathematical Biosciences 229:160–173.
- 5. Wackermann J, Lehmann D, Dvorak I, Michel CM (1993) Global dimensional complexity of multi-channel EEG indicates change of human brain functional state after a single dose of a nootropic drug. Electroencephalography and Clinical Neurophysiology 86:193–198.
- 6. Carmeli C, Knyazeva MG, Innocenti GM, De Feo O (2005) Assessment of EEG synchronization based on state-space analysis. NeuroImage 25:339–354.
- 7. Lehmann D, Ozaki H, Pal I (1987) EEG alpha map series: brain micro-states by space-oriented adaptive segmentation. Electroencephalography and Clinical Neurophysiology 67:271–288.
- 8. Bressler SL (1995) Large-scale cortical networks and cognition. Brain Research Reviews 20:288–304.
- 9. Fuster JM (2006) The cognit: A network model of cortical representation. International Journal of Psychophysiology 60:125–132.
- 10. Britz J, Van De Ville D, Michel CM (2010) BOLD correlates of EEG topography reveal rapid resting-state network dynamics. NeuroImage 52:1162–1170.
- 11. Yuan H, Zotev V, Phillips R, Drevets WC, Bodurka J (2012) Spatiotemporal dynamics of the brain at rest — Exploring EEG microstates as electrophysiological signatures of BOLD resting state networks. NeuroImage 60:2062–2072.
- 12. Van De Ville D, Britz J, Michel CM (2010) EEG microstate sequences in healthy humans at rest reveal scale-free dynamics. Proceedings of the National Academy of Sciences 107:18179–18184.
- 13. Lehmann D, Faber PL, Galderisi S, Herrmann WM, Kinoshita T, et al. (2005) EEG microstate duration and syntax in acute, medication-naïve, first-episode schizophrenia: a multi-center study. Psychiatry Research: Neuroimaging 138:141–156.
- 14. Kikuchi M, Koenig T, Munesue T, Hanaoka A, Strik W, et al. (2011) EEG Microstate Analysis in Drug-Naive Patients with Panic Disorder. PLoS ONE 6:e22912.
- 15. Dierks T, Jelic V, Julin P, Maurer K, Wahlund LO, et al. (1997) EEG-microstates in mild memory impairment and Alzheimer's disease: possible association with disturbed information processing. Journal of Neural Transmission 104:483–495.
- 16. Stevens A, Günther W, Lutzenberger W, Bartels M, Müller N (1996) Abnormal topography of EEG microstates in Gilles de la Tourette syndrome. European Archives of Psychiatry and Clinical Neuroscience 246:310–316.
- 17. Lehmann D, Wackermann J, Michel CM, Koenig T (1993) Space-oriented EEG segmentation reveals changes in brain electric field maps under the influence of a nootropic drug. Psychiatry Research: Neuroimaging 50:275–282.
- 18. Yoshimura M, Koenig T, Irisawa S, Isotani T, Yamada K, et al. (2007) A pharmaco-EEG study on antipsychotic drugs in healthy volunteers. Psychopharmacology 191:995–1004.
- 19. Cantero J, Atienza M, Salas R, Gómez C (1999) Brain Spatial Microstates of Human Spontaneous Alpha Activity in Relaxed Wakefulness, Drowsiness Period, and REM Sleep. Brain Topography 11:257–263.
- 20. Brodbeck V, Kuhn A, von Wegner F, Morzelewski A, Tagliazucchi E, et al. (2012) EEG microstates of wakefulness and NREM sleep. Neuroimage 62:2129–2139.
- 21. Koenig T, Prichep L, Lehmann D, Sosa PV, Braeker E, et al. (2002) Millisecond by Millisecond, Year by Year: Normative EEG Microstates and Developmental Stages. NeuroImage 16:41–48.
- 22. Schlegel F, Lehmann D, Faber P, Milz P, Gianotti LR (2012) EEG Microstates During Resting Represent Personality Differences. Brain Topography 25:20–26.
- 23. Kondakor I, Lehmann D, Michel CM, Brandeis D, Kochi K, et al. (1997) Prestimulus EEG microstates influence visual event-related potential microstates in field maps with 47 channels. J Neural Transm 104:161–173.
- 24. Kondakor I, Pascual-Marqui RD, Michel CM, Lehmann D (1995) Event-related potential map differences depend on the prestimulus microstates. J Med Eng Technol 19:66–69.
- 25. Lehmann D, Michel CM, Pal I, Pascual-Marqui RD (1994) Event-related potential maps depend on prestimulus brain electric microstate map. Int J Neurosci 74:239–248.
- 26. Britz J, Landis T, Michel CM (2009) Right Parietal Brain Activity Precedes Perceptual Alternation of Bistable Stimuli. Cerebral Cortex 19:55–65.
- 27. Mohr C, Michel CM, Lantz G, Ortigue S, Viaud-Delmon I, et al. (2005) Brain state-dependent functional hemispheric specialization in men but not in women. Cereb Cortex 15:1451–1458.
- 28. Muller TJ, Koenig T, Wackermann J, Kalus P, Fallgatter A, et al. (2005) Subsecond changes of global brain state in illusory multistable motion perception. J Neural Transm 112:565–576.
- 29. Lehmann D, Strik WK, Henggeler B, Koenig T, Koukkou M (1998) Brain electric microstates and momentary conscious mind states as building blocks of spontaneous thinking: I. Visual imagery and abstract thoughts. International Journal of Psychophysiology 29:1–11.
- 30. Delorme A, Makeig S (2004) EEGLAB: an open source toolbox for analysis of single-trial EEG dynamics including independent component analysis. J Neurosci Methods 134:9–21.
- 31. Brunet D, Murray MM, Michel CM (2011) Spatiotemporal analysis of multichannel EEG: CARTOOL. Intell Neuroscience 2011:1–15.
- 32. Lehmann D, Skrandies W (1980) Reference-free identification of components of checkerboard-evoked multichannel potential fields. Electroencephalography and Clinical Neurophysiology 48:609–621.
- 33. Tibshirani R, Walther G (2005) Cluster Validation by Prediction Strength. Journal of Computational and Graphical Statistics 14:511–528.
- 34. Koenig T, Melie-Garcia L (2009) Statistical analysis of multichannel scalp field data. In: Koenig T, Melie-Garcia L, editors. Electrical Neuroimaging. Cambridge, United Kingdom: Cambridge University Press. pp. 169–190.
- 35. Pascual-Marqui RD, Michel CM, Lehmann D (1995) Segmentation of brain electrical activity into microstates: model estimation and validation. Biomedical Engineering, IEEE Transactions on 42:658–665.
- 36. Murray M, Brunet D, Michel C (2008) Topographic ERP Analyses: A Step-by-Step Tutorial Review. Brain Topography 20:249–264.
- 37. Irisawa S, Isotani T, Yagyu T, Morita S, Nishida K, et al. (2006) Increased Omega Complexity and Decreased Microstate Duration in Nonmedicated Schizophrenic Patients. Neuropsychobiology 54:134–139.
- 38. Nishida K, Morishima Y, Yoshimura M, Isotani T, Irisawa S, et al. (2013) EEG microstates associated with salience and frontoparietal networks in frontotemporal dementia, schizophrenia and Alzheimer's disease. Clinical Neurophysiology 124:1106–1114.
- 39. Strik WK, Chiaramonti R, Muscas GC, Paganini M, Mueller TJ, et al. (1997) Decreased EEG microstate duration and anteriorisation of the brain electrical fields in mild and moderate dementia of the Alzheimer type. Psychiatry Research: Neuroimaging 75:183–191.
- 40. Strik WK, Dierks T, Becker T, Lehmann D (1995) Larger topographical variance and decreased duration of brain electric microstates in depression. Journal of Neural Transmission/General Section JNT 99:213–222.
- 41. Roebroeck ME, Harlaar J, Lankhorst GJ (1993) The Application of Generalizability Theory to Reliability Assessment: An Illustration Using Isometric Force Measurements. Physical Therapy 73:386–395.
- 42. Eliasziw M, Young SL, Woodbury MG, Fryday-Field K (1994) Statistical Methodology for the Concurrent Assessment of Interrater and Intrarater Reliability: Using Goniometric Measurements as an Example. Physical Therapy 74:777–788.
- 43. Charter RA, Feldt LS (2001) Meaning of Reliability in Terms of Correct and Incorrect Clinical Decisions: The Art of Decision Making is Still Alive. Journal of Clinical and Experimental Neuropsychology 23:530–537.
- 44. Fleiss JL (2011) Design and Analysis of Clinical Experiments: Wiley.
- 45. Shrout PE (1998) Measurement reliability and agreement in psychiatry. Statistical Methods in Medical Research 7:301–317.
- 46. Perkins DO, Wyatt RJ, Bartko JJ (2000) Penny-wise and pound-foolish: the impact of measurement error on sample size requirements in clinical trials. Biological Psychiatry 47:762–766.