Reliability of Resting-State Microstate Features in Electroencephalography

Background: Electroencephalographic (EEG) microstate analysis is a method of identifying quasi-stable functional brain states ("microstates") that are altered in a number of neuropsychiatric disorders, suggesting their potential use as biomarkers of neurophysiological health and disease. However, use of EEG microstates as neurophysiological biomarkers requires assessment of the test-retest reliability of microstate analysis.
Methods: We analyzed resting-state, eyes-closed, 30-channel EEG from 10 healthy subjects over 3 sessions spaced approximately 48 hours apart. We identified four microstate classes and calculated the average duration, frequency, and coverage fraction of these microstates. Using Cronbach's α and the standard error of measurement (SEM) as indicators of reliability, we examined: (1) the test-retest reliability of microstate features using a variety of different approaches; (2) the consistency between TAAHC and k-means clustering algorithms; and (3) whether microstate analysis can be reliably conducted with 19 and 8 electrodes.
Results: The approach of identifying a single set of "global" microstate maps showed the highest reliability (mean Cronbach's α > 0.8, SEM ≈ 10% of mean values) compared to microstates derived by each session or each recording. There was notably low reliability in features calculated from maps extracted individually for each recording, suggesting that the analysis is most reliable when maps are held constant. Features were highly consistent across clustering methods (Cronbach's α > 0.9). All features had high test-retest reliability with 19 and 8 electrodes.
Conclusions: High test-retest reliability and cross-method consistency of microstate features suggest their potential as biomarkers for assessment of the brain's neurophysiological health.


Introduction
Neurophysiological impairments may precede the appearance of clinical symptomatology in several neuropsychiatric illnesses [1][2][3]. Frequent and longitudinal monitoring of neurophysiological "biomarkers" could enable early detection of disease pathogenesis and enhance understanding of the neurophysiological impairments underlying these disorders. Thus, there is great interest in developing techniques to detect neurophysiological biomarkers associated with impairments in the brain's functional health.
Electroencephalography (EEG) is a widely used tool that has been explored as one such method for measuring the electrophysiology of the brain. EEG detects and records microvolt-scale fluctuations of electric potentials over the cortex with very high temporal resolution [4]. A number of approaches have been proposed to extract features of neurophysiologic relevance from the recording. One such method is to use characteristics of the recorded oscillations to define "states" of the signal that evolve over time. For example, state characteristics such as chaotic complexity [5] or synchronicity [6] have been extracted from resting-state EEG. In this method, brain activity is described in relation to state characteristics, such as the duration or frequency of occurrence of certain states.
Microstate analysis is one such method that defines states of the multichannel EEG signal by spatial topographies of electric potentials over the electrode array. This method was first proposed by Lehmann et al. (1987), who showed that the alpha frequency band (8-12 Hz) of multichannel resting-state EEG could be parsed into discrete states in this way [7]. When the multichannel resting-state EEG signal is considered as a time series of topographies of electric potentials, two remarkable properties emerge. First, although there are a large number of possible topographies in a multichannel recording, a majority of the signal can be represented by surprisingly few maps. Interestingly, most studies of resting-state EEG consistently find the same four archetypal maps that explain more than 70% of the total topographic variance. Second, there is a well-defined temporal structure of these maps, in that a single topography remains dominant for about 80-120 ms before abruptly transitioning to another topography. These periods of quasi-stability of a single topography are "microstates." Thus, the multichannel EEG signal can be represented by a single time series of microstates alternating among themselves at discrete intervals.
Compared to traditional frequency power EEG analysis, spatial analysis of EEG using microstates has several advantages. Perhaps most importantly, spatial EEG analysis does not assume the EEG signal is a linear dynamical system that can be represented through the Fourier series as a linear function of a set of sine waves. The spatial topography of the EEG signal can be defined at any point in time independently of the preceding or subsequent topography and therefore has millisecond resolution, unlike conventional frequency power analysis that integrates activity over seconds. Indeed, although microstates were initially described in the alpha frequency band, they can in fact be defined within any signal bandwidth; the dominant generator frequency in any given bandwidth dictates the speed of polarity inversions, but resting-state microstate topographies are considered independently of polarity (i.e. two topographies with opposite polarities are considered the same microstate). Microstates are therefore better suited to detect rapid, dynamic activity in large-scale neurocognitive networks than traditional frequency analysis of EEG. These large-scale neural networks, which link spatially distributed cortical areas into functional entities, have been shown to underlie complex neurocognitive activities [8,9], including those that occur at rest, the so-called ''resting-state networks'' (RSNs). Accordingly, recent data has indicated that individual microstates may correspond to specific RSNs identified in functional magnetic resonance imaging (fMRI) studies, as there appears to be a temporal correlation between the appearance of microstates and specific RSN activity [10][11][12]. Spatial EEG signal analysis with microstates may be a valuable approach to studying these and other large-scale neurocognitive networks in health and disease.
Consistent with the idea that EEG microstates may reflect the activity of large-scale neurocognitive networks, preliminary reports have found correlations between features of the microstate time series and various cognitive activities, behavioral states, and neuropsychiatric diseases. For example, microstates of certain topographies have a shorter average lifespan in schizophrenia [13], are longer in panic disorder [14], are shortened in Alzheimer's disease [15], and appear more frequently in Tourette's syndrome [16]. Neurotropic drugs commonly used to treat neuropsychiatric disease alter microstate features [17,18]. Microstates vary with cognitive/behavioral states such as drowsiness [19], sleep stages [20], age [21], and even personality characteristics [22]. Pre-stimulus spontaneous EEG microstates also have profound impacts on the electrophysiological [23][24][25] and perceptual [26][27][28] responses to stimuli. These preliminary reports suggest that features of the microstate time series may be related to the neurophysiological basis of these disorders, brain states, and cognitive functions, and can potentially offer insight into the function of the brain in health and disease [29].
The aforementioned cross-sectional studies of microstate features demonstrate intriguing relationships between microstates and disease. To further explore the significance of the microstate time series and its potential utility in the detection of neurophysiological changes underlying disease, longitudinal cohort studies are required to characterize microstate changes over time in individual patients. However, the design of these studies is difficult, because limited information exists about the variance in common microstate features and the test-retest reliability of these values. Furthermore, although there are several different methodological approaches to microstate analysis, few studies have assessed the validity and consistency of these various methods.
In this study, we investigate the test-retest reliability of resting-state EEG microstate features in healthy subjects across three sessions. We extract four of the most common features from the time series, namely, the topography that defines each microstate, the average lifespan of each microstate, the frequency of appearance of each microstate, and the fraction of total time covered by each microstate. We chose these features because they are the most commonly examined in previous studies, and changes in each of them have been associated with one or more neuropsychiatric disorders.
Microstate analysis involves two basic steps: first, a set of microstate topographies is selected, and second, the original data is re-expressed as an alternating sequence of these microstate topographies from which values of interest can be calculated. The first step is usually performed by mathematical clustering of maps in the original data. A single set of microstate topographies can be identified for all subjects (i.e. all of the original data is clustered together), or a unique set of topographies can be generated for subsets of subjects (e.g. different experimental groups may be assigned unique microstates, or each recording may be assigned an individual set of microstates). We tested the consistency of microstate analysis using three different approaches. First, we assumed one set of four global maps and identified a single set of four maps that was applied to all subjects across all sessions. Second, we generated a set of topographies by session independently. Third, we generated a set of topographies by recording, i.e. generated four maps for each individual recording. We also compared two common clustering algorithms (topographic atomize and agglomerate hierarchical clustering and k-means clustering) and assessed whether microstate analysis can be performed reliably with as few as 8 electrodes.

Subjects
We studied 10 healthy subjects (mean age: 30 ± 10 yr, 5 females). Subjects were recruited through advertisement in the greater Boston area. Exclusion criteria included a self-reported medical illness and history of drug or alcohol abuse. All participants gave their written informed consent and the protocol was approved by the local ethics committee at the Beth Israel Deaconess Medical Centre in accordance with the Declaration of Helsinki.

EEG Recording
Data used in this study was collected as part of a baseline assessment in a larger research study investigating the effect of non-invasive brain stimulation on various cortical processes. Subjects were instructed to sit in a comfortable armchair. Approximately three minutes of resting-state, eyes-closed EEG were recorded in three sessions separated by at least 48 hours. EEG recording was obtained through a 32-channel EEG system (BrainProducts, GmbH) with the CPZ and AFZ electrodes set as reference and ground electrode, respectively. EOG was recorded through two channels placed underneath each eye. The data was sampled at 5 kHz with the online filter setting set to DC to 1 kHz. The skin/electrode impedance was kept below 5 kOhm.

EEG Preprocessing
Data were imported into MATLAB (The MathWorks, Inc., Natick, MA, USA) for preprocessing. The open source signal processing functions available through the EEGLAB toolbox version 11b [30] were used for data import and preprocessing. The EEG waveforms were epoched into segments of 2 s duration and downsampled to 2 kHz. A notch filter (band-stop: 55-65 Hz) was used to remove the 60 Hz line noise. EEG signals were band-pass filtered to the frequency range of 1-50 Hz to further minimize contamination by high-frequency artifact. A second-order infinite impulse response (IIR) Butterworth filter was applied in both the forward and backward directions (MATLAB function 'filtfilt') to maintain zero phase shift. All epochs were manually reviewed, and trials and channels containing eye movements, muscle activity, or any other non-physiological artifact were discarded. The data were then re-referenced to the average.

EEG Microstate Analysis
EEG microstate analysis was conducted using the freely-available CARTOOL software [31]. Preprocessed data were imported into CARTOOL, band passed to 1-30 Hz, and downsampled to 200 Hz before microstate analysis as described below.
In microstate analysis, the multichannel EEG signal is considered as a series of instantaneous topographies of electric potentials. We identified points with the greatest signal-to-noise ratio (SNR) by calculating the global field power (GFP) of each topography in the time series. The GFP at each point in time is equal to the root mean square across the average-referenced electrodes (equivalently, the standard deviation of the signal at all electrodes):
$$\mathrm{GFP}(t) = \sqrt{\frac{\sum_{i=1}^{n} \left(v_i(t) - \bar{v}(t)\right)^2}{n}}$$
where $v_i(t)$ is the voltage at electrode $i$ at time $t$, $\bar{v}(t)$ is the mean voltage across all electrodes at time $t$, and $n$ is the number of electrodes. Maps that occur at local maxima in the GFP curve (i.e. all points with GFP higher than the preceding and following point) represent instants of highest field strength and greatest SNR. Furthermore, because the field topography remains essentially constant between two local minima of the GFP curve, topographies at GFP maxima are representative of topographies at surrounding points in time [7,32]. Thus, data reduction of the original signal to points at local GFP maxima is a valid method of enhancing topographic SNR. These maps at local GFP maxima, hereafter referred to as the "original maps," were extracted and submitted to further analysis (Figure 1).
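The GFP computation and peak selection described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the CARTOOL implementation: the function names `gfp` and `gfp_peak_indices` and the toy voltages are hypothetical, and the data are assumed to be stored as one list of electrode voltages per time point.

```python
import math

def gfp(voltages):
    """Global field power of one map: population SD across electrodes."""
    n = len(voltages)
    mean_v = sum(voltages) / n
    return math.sqrt(sum((v - mean_v) ** 2 for v in voltages) / n)

def gfp_peak_indices(gfp_curve):
    """Indices whose GFP is strictly higher than both neighbouring points."""
    return [t for t in range(1, len(gfp_curve) - 1)
            if gfp_curve[t - 1] < gfp_curve[t] > gfp_curve[t + 1]]

# Toy recording: 5 time points x 4 electrodes (hypothetical values).
eeg = [
    [1.0, -1.0, 0.5, -0.5],
    [2.0, -2.0, 1.0, -1.0],
    [1.0, -1.0, 0.5, -0.5],
    [3.0, -3.0, 1.5, -1.5],
    [0.5, -0.5, 0.2, -0.2],
]
curve = [gfp(m) for m in eeg]
peaks = gfp_peak_indices(curve)
original_maps = [eeg[t] for t in peaks]  # maps kept for clustering
```

Only the maps at the selected indices would be passed on to clustering; the first and last samples are never selected here because they lack one neighbour.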
Microstate analysis involves two basic steps: first, a set of microstate maps is identified, and second, this set of maps is fit onto the original data to re-express the multichannel EEG as a sequence of microstates (Figure 1).
Figure 1. (A) The GFP (drawn in red) is calculated at each instant of the multichannel EEG recording. Peaks of the GFP curve represent moments of highest SNR. At peaks of the GFP curve, the potential recorded at each electrode of the multichannel signal is plotted onto a map of the channel array. This collection of maps is entered into a clustering algorithm (TAAHC or k-means clustering), which results in a small number of representative microstate maps that explain a large proportion of the global topographic variance. Four topographies are repeatedly found using this method; these maps are labeled A, B, C, or D in the figure. Crosshairs indicate points of maximum or minimum recorded electric potential. (B) The original maps at peaks of the GFP curve are assigned to a microstate class A, B, C, or D based on the degree of correlation with the microstate maps and statistical smoothing of the time series. This reassignment results in a representation of the original multichannel data as an alternating series of microstates A, B, C, and D. A microstate is considered dominant in the time during which all successive original maps are assigned to the same microstate class, starting and ending at the midpoint in time between the last original map of the preceding microstate and the first original map of the following microstate, respectively. Each period of dominance is considered a unique appearance of a microstate. The frequency of a microstate is the number of unique appearances per second. The coverage of each microstate is the fraction of total recording time that each microstate is dominant. doi:10.1371/journal.pone.0114163.g001

Derivation of Maps Globally, by Session, and by Recording
We chose a priori to identify four group-level classes in order to remain consistent with the majority of previous studies that also use four microstate classes. We compared three different strategies to identify the four maps that would be used to identify microstates on our original data. In the global maps strategy, we clustered original maps from each recording into four maps and then entered this set of 4 × 30 = 120 maps into another round of clustering to identify four global maps that were then fit to all of the original maps. This is similar to the strategy used by Lehmann et al. (2005) [13]. In the by session strategy, we clustered original maps from each recording into four maps, and then entered these maps into another round of clustering separately for each session. This resulted in three separate sets of four microstate maps (one for each of three sessions). The maps from each session were fit onto all of the original maps from the respective session. Finally, in the by recording strategy, we clustered original maps from each recording into four maps, and used these 30 sets of four maps to fit onto original maps from each respective recording.

Topographic Atomize and Agglomerate Hierarchical Clustering (TAAHC)
The original maps were submitted to a modified hierarchical clustering algorithm known as the topographic atomize and agglomerate hierarchical clustering (TAAHC) method [33] as implemented by the CARTOOL program [31]. Briefly, all maps submitted to the procedure are initially considered to be independent clusters. In each iteration of the algorithm, the "worst" cluster is identified and split into its constituent maps ("atomized"). The "worst" cluster in each iteration is the one with the lowest summated correlation between each constituent map and the average cluster map. Correlation is analogous to the Pearson product-moment correlation coefficient between two topographies:
$$C(u, v) = \frac{\sum_{i} u_i v_i}{\sqrt{\sum_{i} u_i^2}\,\sqrt{\sum_{i} v_i^2}}$$
where $u_i$ and $v_i$ are the average-referenced potentials at electrode $i$ in the two maps and sums are taken over all electrodes $i$. Maps of the "worst" cluster are redistributed ("agglomerated") to whichever of the remaining clusters they are most strongly correlated with. This process is continued until the desired number of clusters is achieved. We used TAAHC to cluster the original maps from each subject and session into four cluster maps for each subject and session. In the by recording strategy, we fit these four cluster maps onto each recording. In the by session strategy, we submitted these 120 cluster maps (4 from each of 10 subjects over 3 sessions) to another round of clustering separately for each session, and fit the maps from each session onto original maps from the respective session. Finally, in the global maps strategy, we submitted these 120 cluster maps to the TAAHC algorithm to identify four group-level cluster maps. These four maps were the "microstate maps" and were labeled class A, B, C, and D (Figure 2). Cluster maps identified by recording and by session were labeled A, B, C, and D depending on their degree of correlation with maps A, B, C, and D from the global maps strategy. After these labels were assigned to the sets of 4 cluster maps derived by recording and by session, these unique sets of 4 cluster maps were fit onto the original data.

Fitting Microstate Maps onto Original Maps
In the final step, original maps are labeled either A, B, C, or D depending on which microstate map has the highest correlation to the original map.
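As a rough illustration of this fitting step, the sketch below assigns each original map to the template with the highest spatial correlation. It makes two assumptions that go beyond this paragraph: the absolute value of the correlation is used, so that maps of opposite polarity receive the same label (in line with the polarity-independent definition of resting-state microstates given in the Introduction), and the statistical smoothing of the label sequence mentioned in the Figure 1 legend is omitted. The function names and template values are hypothetical.

```python
import math

def spatial_correlation(u, v):
    """Correlation between two average-referenced maps (sums over electrodes)."""
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u) * sum(b * b for b in v))
    return num / den

def fit_map(original_map, templates):
    """Label of the template with the highest absolute correlation (assumption)."""
    return max(templates,
               key=lambda k: abs(spatial_correlation(original_map, templates[k])))

# Hypothetical 4-electrode templates standing in for microstate maps A-D.
templates = {
    "A": [1.0, -1.0, 1.0, -1.0],
    "B": [1.0, 1.0, -1.0, -1.0],
    "C": [1.0, -1.0, -1.0, 1.0],
    "D": [3.0, -1.0, -1.0, -1.0],
}
original_maps = [
    [2.0, -2.1, 1.9, -1.8],   # close to A
    [-1.0, -0.9, 1.1, 0.8],   # close to B with inverted polarity
]
labels = [fit_map(m, templates) for m in original_maps]
```

Because the correlation is normalized by the norm of both maps, the amplitude of an original map does not influence which label it receives.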

Extraction of Features from the Microstate Time Series
After labeling at the local maxima of the GFP curve, we could express each of the original signals as an alternating sequence of maps A, B, C, and D. From this microstate time series, we calculated several features. Our outcomes of interest were: (1) topographies of the four cluster maps identified in each clustering strategy, (2) average lifespan of each microstate and all microstates, (3) frequency of appearance of each microstate and all microstates, and (4) fraction of total covered time of each microstate. We calculated these features separately for each subject in each session.
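Average lifespan of microstates: The lifespan of a microstate was calculated as the time during which all successive original maps were assigned to the same microstate class, starting and ending at the midpoint in time between the last original map of the preceding microstate and the first original map of the following microstate, respectively (Figure 1). Frequency of microstates: Each period of dominance was considered a unique appearance of a microstate, and the frequency of a microstate is the number of unique appearances per second (Figure 1). Coverage of microstates: We calculated the fraction of total covered time (coverage) of each microstate by taking the ratio of the total time spent in each microstate over the total recording time [13]. Note that the coverage of all four microstates can be calculated from their respective average lifespans and frequencies and the total length of recording, i.e. these are not completely independent measures.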
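The three features can be computed from a labeled sequence of GFP peaks roughly as follows. This is a sketch under stated assumptions rather than the procedure used by CARTOOL: boundaries between microstates are placed at the midpoint between neighbouring peaks with different labels, and the first and last segments are assumed to extend to the start and end of the recording. All names and the toy values are hypothetical.

```python
def microstate_segments(peak_times, labels, t_start, t_end):
    """Merge consecutive identical labels into (label, onset, offset) segments."""
    segments = []
    onset = t_start
    for i in range(1, len(labels)):
        if labels[i] != labels[i - 1]:
            # Boundary at the midpoint between the two neighbouring peaks.
            boundary = (peak_times[i - 1] + peak_times[i]) / 2.0
            segments.append((labels[i - 1], onset, boundary))
            onset = boundary
    segments.append((labels[-1], onset, t_end))
    return segments

def microstate_features(segments, t_start, t_end):
    """Mean lifespan (s), frequency (per s) and coverage for each class."""
    total = t_end - t_start
    features = {}
    for label in sorted({s[0] for s in segments}):
        durations = [off - on for lab, on, off in segments if lab == label]
        features[label] = {
            "mean_lifespan_s": sum(durations) / len(durations),
            "frequency_hz": len(durations) / total,
            "coverage": sum(durations) / total,
        }
    return features

# Toy example: six GFP peaks in a 0.24 s recording (hypothetical values).
peak_times = [0.02, 0.06, 0.10, 0.14, 0.18, 0.22]
labels = ["A", "A", "B", "B", "B", "A"]
segments = microstate_segments(peak_times, labels, 0.0, 0.24)
features = microstate_features(segments, 0.0, 0.24)
```

With these definitions, the coverage of a class equals its mean lifespan multiplied by its frequency, which is the dependence among the measures noted above.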

K-Means Clustering
To assess the reliability of microstate analysis performed with the k-means clustering algorithm, we repeated the entire analysis, but used k-means clustering instead of TAAHC to identify a set of microstate maps. In the k-means clustering method, clustering is first initialized and then entered into a convergence loop. During initialization, to find n clusters, n non-identical maps are randomly selected out of the set of maps entered into the analysis to serve as templates. All maps are then assigned to a cluster seeded by one of the n templates based on the degree of correlation to each template. In the convergence loop, all maps in each of the n clusters are averaged. These n average maps then serve as seeds for new clusters, and all input maps are again assigned to one of n clusters based on correlation to cluster seeds. A measure of the quality of the current cluster assignment is computed, in our case the global explained variance (GEV):
$$\mathrm{GEV} = \frac{\sum_{j=1}^{m} \left(\mathrm{GFP}_j \cdot C(u_j, a_{k(j)})\right)^2}{\sum_{j=1}^{m} \mathrm{GFP}_j^2}$$
where $m$ is the number of original maps, $\mathrm{GFP}_j$ is the global field power of original map $u_j$, and $C(u_j, a_{k(j)})$ is the correlation between $u_j$ and the cluster map $a_{k(j)}$ to which it is assigned. The convergence loop is repeated until the quality of the cluster assignment does not improve. The entire initialization and convergence algorithm is repeated several times to increase the likelihood of finding an optimal set of n clusters. The procedure can be repeated for many values of n, so the topographies of any number of clusters can be derived. Because the initialization step randomly picks n maps to serve as templates, k-means clustering is non-deterministic. To overcome this drawback, the algorithm is repeated 300 times for each value of n to minimize run-to-run variance. Maps identified using k-means clustering were labeled A, B, C, or D depending on their degree of correlation with maps extracted using TAAHC with the global maps approach.
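The initialization, convergence loop, and restarts described above can be sketched as follows. This is a simplified illustration, not the CARTOOL code: it assumes polarity-invariant assignment (absolute correlation) and flips anti-correlated members before averaging, neither of which is specified in this section, and all function names and the toy data are hypothetical.

```python
import math
import random

def corr(u, v):
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u) * sum(b * b for b in v))
    return num / den if den > 0 else 0.0

def gfp(u):
    mean_u = sum(u) / len(u)
    return math.sqrt(sum((a - mean_u) ** 2 for a in u) / len(u))

def assign(maps, templates):
    """Index of the best-correlated template for each map (polarity ignored)."""
    return [max(range(len(templates)), key=lambda k: abs(corr(m, templates[k])))
            for m in maps]

def gev(maps, templates, labels):
    """Global explained variance of a cluster assignment."""
    num = sum((gfp(m) * corr(m, templates[l])) ** 2 for m, l in zip(maps, labels))
    return num / sum(gfp(m) ** 2 for m in maps)

def average_map(members, reference):
    """Average members after flipping those anti-correlated with the reference."""
    total = [0.0] * len(reference)
    for m in members:
        sign = 1.0 if corr(m, reference) >= 0 else -1.0
        total = [t + sign * x for t, x in zip(total, m)]
    return [t / len(members) for t in total]

def kmeans_once(maps, n_clusters, rng, max_iter=100):
    templates = rng.sample(maps, n_clusters)       # random initial templates
    labels = assign(maps, templates)
    best = gev(maps, templates, labels)
    for _ in range(max_iter):
        new_templates = []
        for k in range(n_clusters):
            members = [m for m, l in zip(maps, labels) if l == k]
            new_templates.append(average_map(members, templates[k])
                                 if members else templates[k])
        new_labels = assign(maps, new_templates)
        score = gev(maps, new_templates, new_labels)
        if score <= best:                          # stop when GEV stops improving
            break
        templates, labels, best = new_templates, new_labels, score
    return templates, labels, best

def kmeans_restarts(maps, n_clusters, n_restarts=300, seed=0):
    """Keep the run with the highest GEV over several random initializations."""
    rng = random.Random(seed)
    runs = (kmeans_once(maps, n_clusters, rng) for _ in range(n_restarts))
    return max(runs, key=lambda r: r[2])

# Toy data: noisy, randomly scaled and flipped copies of two prototypes.
data_rng = random.Random(1)
prototypes = [[1.0, -1.0, 1.0, -1.0], [1.0, 1.0, -1.0, -1.0]]
maps = []
for j in range(40):
    sign, amp = data_rng.choice([-1.0, 1.0]), data_rng.uniform(0.5, 2.0)
    noisy = [sign * amp * x + data_rng.gauss(0.0, 0.1) for x in prototypes[j % 2]]
    mean = sum(noisy) / len(noisy)
    maps.append([x - mean for x in noisy])         # average reference
templates, labels, score = kmeans_restarts(maps, 2, n_restarts=20)
```

Fixing the seed makes a single call reproducible, but different seeds can still give different solutions, which is why the restart count matters. Note that `rng.sample` draws distinct positions rather than distinct map values, so duplicated input maps would need extra handling.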

Determining Test-Retest Reliability and Consistency of Microstate Characteristics
All statistical analyses were performed using the SAS software. Cronbach's α is a well-established measure of the internal consistency and reliability of a test, and was calculated to determine the reliability of microstate characteristics across time, and the consistency of these characteristics across different methods (TAAHC vs k-means clustering and various electrode arrays). It is calculated as:
$$\alpha = \frac{K}{K-1}\left(1 - \frac{\sum_{i=1}^{K} \sigma_{Y_i}^2}{\sigma_X^2}\right)$$
where $K$ is the number of repeated tests, $\sigma_X^2$ is the variance of the observed total scores, and $\sigma_{Y_i}^2$ is the variance of component $i$ (the $i$-th repeated test) across subjects. Cronbach's α is a measure of reliability relative to between-subject variance. To give a measure of absolute reliability of these features, we also calculated the standard error of measurement (SEM) for microstate characteristics, where appropriate.
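For readers who want to reproduce this kind of reliability estimate outside SAS, a minimal Python sketch is given below. Cronbach's α follows the standard formula with sample variances. The SEM formula is not stated in this section, so the sketch assumes the common definition SEM = SD × sqrt(1 − α), with SD taken over all observations; that choice, the function names, and the toy values are assumptions made for illustration only.

```python
import math
import statistics

def cronbach_alpha(scores):
    """scores[s][i] is the value for subject s in session (component) i."""
    k = len(scores[0])
    component_vars = [statistics.variance([row[i] for row in scores])
                      for i in range(k)]
    total_var = statistics.variance([sum(row) for row in scores])
    return k / (k - 1) * (1.0 - sum(component_vars) / total_var)

def standard_error_of_measurement(scores, alpha):
    """Assumed definition: SD of all observations times sqrt(1 - alpha)."""
    sd = statistics.stdev([x for row in scores for x in row])
    return sd * math.sqrt(max(0.0, 1.0 - alpha))

# Hypothetical average lifespans (ms) for 4 subjects over 3 sessions.
lifespans = [
    [80.0, 82.0, 79.0],
    [95.0, 93.0, 96.0],
    [70.0, 72.0, 71.0],
    [88.0, 85.0, 90.0],
]
alpha = cronbach_alpha(lifespans)
sem = standard_error_of_measurement(lifespans, alpha)
```

In the toy data the differences between subjects are much larger than the session-to-session changes within a subject, so α comes out close to 1.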

Comparison of Microstate Maps Derived from TAAHC and K-Means Clustering
We compared the four microstates extracted from k-means clustering with those extracted from TAAHC for significant differences using topographic analysis of variance (TANOVA). TANOVA is a randomization procedure that uses the GFP of the electrode-by-electrode subtraction between two maps as the test statistic (effect size) of the difference between maps [34]. When the maps being compared are GFP-normalized, the test statistic is equal to the global map dissimilarity (GMD):
$$\mathrm{GMD} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \left(u'_i - v'_i\right)^2}$$
where $u'_i$ and $v'_i$ are the potentials at electrode $i$ in the GFP-normalized maps being compared and $n$ is the number of electrodes.
To derive the distribution of the test statistic under the null hypothesis that there is no difference between maps from k-means clustering and TAAHC, we randomly shuffled maps from each subject between two groups and calculated the test statistic between the average group maps, and repeated this procedure 5000 times. The fraction of iterations with a test statistic greater than the one calculated from the actual data was the p value.
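The randomization procedure can be sketched as below. The exact shuffling scheme is not fully specified here, so the sketch assumes a paired design in which, for each subject, the maps from the two methods are swapped with probability 0.5; this assumption, the function names, and the toy data are hypothetical and are not taken from the TANOVA implementation used in the study.

```python
import math
import random

def gfp_normalize(u):
    """Average-reference a map and scale it to unit GFP."""
    mean_u = sum(u) / len(u)
    centered = [a - mean_u for a in u]
    g = math.sqrt(sum(a * a for a in centered) / len(u))
    return [a / g for a in centered]

def gmd(u, v):
    """Global map dissimilarity between two maps."""
    un, vn = gfp_normalize(u), gfp_normalize(v)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(un, vn)) / len(un))

def mean_map(maps):
    return [sum(col) / len(maps) for col in zip(*maps)]

def tanova_p(maps_a, maps_b, n_perm=5000, seed=0):
    """Observed GMD and fraction of permutations with a larger GMD."""
    rng = random.Random(seed)
    observed = gmd(mean_map(maps_a), mean_map(maps_b))
    exceed = 0
    for _ in range(n_perm):
        grp_a, grp_b = [], []
        for a, b in zip(maps_a, maps_b):
            if rng.random() < 0.5:   # swap this subject's maps (assumption)
                a, b = b, a
            grp_a.append(a)
            grp_b.append(b)
        if gmd(mean_map(grp_a), mean_map(grp_b)) > observed:
            exceed += 1
    return observed, exceed / n_perm

# Toy example: two clearly different topographies for 10 subjects.
proto_a = [1.0, -1.0, 1.0, -1.0]
proto_b = [1.0, 1.0, -1.0, -1.0]
maps_a = [[(1.0 + 0.1 * s) * x for x in proto_a] for s in range(10)]
maps_b = [[(1.0 + 0.1 * s) * x for x in proto_b] for s in range(10)]
observed, p_value = tanova_p(maps_a, maps_b, n_perm=500)
```

For GFP-normalized maps the GMD ranges from 0 (identical topographies) to 2 (identical topographies with inverted polarity), so a polarity-invariant comparison would need the maps to be aligned in sign first.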

Test-Retest Reliability of Average Microstate Lifespan, Frequency, and Coverage Across 3 Sessions
To determine the test-retest reliability of the average lifespan, frequency, and coverage of each and all microstates across 3 sessions, we calculated Cronbach's α and the SEM of these values from 30-electrode data analyzed with TAAHC clustering, as well as with k-means clustering, as described above.

Consistency of Average Microstate Lifespan, Frequency, and Coverage between TAAHC and K-Means Clustering
To determine the level of consistency between microstate features extracted using TAAHC and k-means clustering, we calculated Cronbach's α between the average microstate lifespan, frequency, and coverage measured using microstates extracted by TAAHC and k-means clustering.

Consistency of Average Microstate Duration, Frequency, and Coverage among Channel Arrays with 30, 19, and 8 Electrodes
To determine the level of consistency between microstate features using topographies extracted using channel arrays with 30, 19, and 8 electrodes with TAAHC, we calculated Cronbach's α between the average microstate lifespan, frequency, and coverage measured with these arrays.

Results
After the data were preprocessed and epochs with artifacts removed, we had a mean of 127.87 seconds of data (SD = 23.87, range = 80-204) per recording that were submitted to microstate analysis, from which we extracted the "original maps" at local maxima in the GFP curve. We chose a priori to cluster the original maps from each session into four microstates. Four microstate maps had a mean GEV of 69.93% (SD = 3.58, range = 65.34-77.99) across all recordings using TAAHC.

Reliability with Maps Derived Globally, by Session, and by Recording
We calculated test-retest reliability of the average lifespan, frequency, and coverage with maps derived globally, by session, and by recording using the TAAHC method. The four global maps appear in Figure 2. The average lifespan, frequency, and coverage of each microstate calculated using maps derived globally appear in Table 1, along with Cronbach's α and SEM. The mean Cronbach's α for these values is 0.811. Most values have Cronbach's α > 0.7 and SEM is less than 10% of the mean for all values.
Results from maps derived by session and by recording are also presented in Table 1. For maps derived by session, the mean Cronbach's α is 0.648 and SEM is approximately 10% of the mean for all values. For maps derived by recording, the mean Cronbach's α is 0.523 and SEM is approximately 10% of the mean.

Reliability with Maps Derived Using K-Means Clustering
To determine the reliability of the analysis conducted with maps derived using the k-means clustering algorithm, we used k-means clustering to identify a set of global maps and assessed the reliability of the resulting features over three sessions. The four global maps derived using k-means appear in Figure 2. TANOVA analysis of maps A, B, C, and D extracted using TAAHC and k-means reveals no significant difference in the topography of any of the maps (p > 0.01). The average lifespan, frequency, and coverage of microstates calculated using these maps appear in Table 2. The mean Cronbach's α for these values is 0.830 and SEM is less than 10% of the mean for all values. We also assessed the degree of agreement between values calculated using maps derived from k-means clustering and TAAHC in a single session. These Cronbach's α values appear in Table 2. All values are above 0.9.

Reliability with 19 and 8 Electrodes
To determine whether microstate analysis can be reliably conducted with fewer electrodes, we repeated the analysis after selecting 19 and 8 electrodes from the original 30-electrode array using TAAHC and a global maps strategy. The four microstate maps derived using 19 and 8 electrodes appear in Figure 2. We could clearly identify maps belonging to classes A, B, C, and D in both 19 and 8 electrode data. Average microstate duration, frequency, and coverage from 19 and 8 electrode data appear in Table 2. In 19 electrode data, these microstate features have a mean Cronbach's α of 0.873 and SEM approximately 10% of mean values. In 8 electrode data, these features have a mean Cronbach's α of 0.906 and similar SEM.
We also determined the degree of consistency between values extracted from 30, 19, and 8 electrode data in a single session. These Cronbach's α values appear in Table 3. Most values are highly consistent across these electrode arrays (average Cronbach's α = 0.834). Notably, the consistency of the average lifespan, frequency, and coverage of microstates C and D tended to be lower than corresponding values for A and B.

Correlations Among Microstate Features
We calculated the correlation between each pair of microstate features in a single session. As expected, microstate lifespan is inversely correlated with frequency (average correlation R = −0.72 across 4 pairs of microstate lifespans and frequencies) for each individual microstate (e.g. lifespan of microstate A compared to frequency of microstate A, etc.). When comparing these values among all 4 microstates, we found positive correlations among all average lifespans (average correlation R = 0.79) and frequencies (average correlation R = 0.51) (e.g. comparing lifespans of microstates A, B, C, and D) in each individual recording.

Correlation between Microstate Features and Spectral Power
To explore the relationship between microstate features and the power spectra of EEG recordings, we also calculated the correlation between various microstate features and the absolute and relative power in the delta, theta, alpha, and beta frequency bands averaged across all electrodes. Multiple regression modeling showed that relative beta power is negatively associated (p = 0.0001) and relative alpha power is positively associated (p = 0.0174) with global average microstate duration (R² = 0.92). Conversely, relative alpha power is negatively associated (p = 0.023) and relative beta power is positively associated (p = 0.0003) with overall microstate frequency (R² = 0.89). Power is not significantly associated with coverage fraction of any microstate class. There were no significant correlations between any microstate feature and absolute power in any frequency band.

Discussion
In this study, we sought to assess the test-retest reliability of resting-state EEG microstate analysis in healthy subjects over time. We used a number of variations of the method to determine the reliability and the degree of consistency among these approaches. This study has four major findings. First, we found that using a global set of microstates for all subjects yields average microstate durations, frequencies, and coverage fractions that have high Cronbach's α, indicating excellent test-retest reliability. Second, we found that the use of global maps yields results that are in general more reliable than maps identified by session or by recording. Third, we showed that TAAHC and k-means clustering yield highly consistent results. Finally, we showed that microstate analysis can be reliably conducted with as few as 8 electrodes.
The maps A, B, C, and D (Figure 2) have been reported by numerous previous studies of resting-state EEG microstate analysis [5,13,21], and the average microstate lifespans, frequencies, and coverage fractions calculated in this study are in general agreement with prior studies (Tables 1 and 2). With few exceptions, most microstate features calculated from a set of global maps have Cronbach's α > 0.7. Cronbach's α is equivalent to the ICC(3,k) intraclass correlation coefficient and is a measure of between-subject variance relative to within-subject variance, i.e. it is a relative measure of test-retest reliability. A high (generally, > 0.7) Cronbach's α value suggests that variance in these values over three spaced sessions is small compared to the distribution of these values in the entire sample. The SEM values reported in Tables 1 and 2 are measures of absolute test-retest reliability, and are approximately 10% of the mean for all values. We conclude that these values have high relative test-retest reliability.
We compared three different strategies for identifying microstate maps to be fit onto each EEG recording. In the global maps strategy, we derived a universal set of four maps using all sessions. We also derived maps by session, where maps were re-calculated for each session but held constant for all subjects within each session, and by recording, where maps were identified for each individual recording. We found that global maps gave the most reliable results. This suggests that microstate analysis is highly sensitive to the topographies that are fit onto the data and used to calculate values of interest. Minor differences in the microstate maps are introduced when maps are recalculated by session or by recording, which appear to generate within-subject error and lower test-retest reliability. This also indicates that comparison of microstate features, for example between two studies, is most valid when the same maps are used.
Our a priori selection of 4 microstates represented our data well, explaining about 70% of the global topographic variance of the data. We chose to cluster our data into 4 microstates to remain consistent with the majority of previous studies. However, methods of deriving a data-driven estimate of the number of microstates required to ''best'' explain the data have been proposed [33,35]. The most common of these approaches is to minimize the cross-validation (CV) criterion, which is proportional to the ratio between the GEV and the degrees of freedom of the maps [35]. Importantly, the CV criterion is highly sensitive to the number of electrodes used, and some have argued against its use when fewer than 64 electrodes make up the channel array [36]. Another measure, the Krzanowski-Lai criterion, has recently been proposed for microstate analysis [31]. In our data, the CV criterion was on average minimized between 4 and 5 microstates for all recordings; thus, our selection of 4 microstates was similar to the data-driven estimate.
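For readers unfamiliar with the global explained variance (GEV) referred to above, the following is a minimal sketch of one commonly used definition: the GFP-weighted squared spatial correlation between each sample and its assigned map, normalized by the total squared GFP. It is an assumption that this matches the exact computation used here; the function name, the array layout, and the choice to pass in all samples (rather than only GFP peaks) are hypothetical.

```python
import numpy as np


def global_explained_variance(data, maps, labels):
    """GEV of a microstate segmentation (one common definition; a sketch).

    data   : (n_samples, n_channels) EEG potentials
    maps   : (n_maps, n_channels) microstate topographies
    labels : (n_samples,) index of the map assigned to each sample
    """
    # Re-reference both data and maps to the common average.
    data = data - data.mean(axis=1, keepdims=True)
    maps = maps - maps.mean(axis=1, keepdims=True)
    # Global field power: spatial standard deviation at each sample.
    gfp = data.std(axis=1)
    assigned = maps[labels]
    # Spatial correlation between each sample and its assigned map.
    num = (data * assigned).sum(axis=1)
    den = np.linalg.norm(data, axis=1) * np.linalg.norm(assigned, axis=1)
    corr = num / den
    # Squaring the correlation makes the measure polarity-invariant.
    return float(((gfp * corr) ** 2).sum() / (gfp ** 2).sum())
```

Because the correlation is squared, a sample and its polarity-inverted counterpart contribute equally, which matches the usual treatment of resting-state microstate maps.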
We compared the maps derived using TAAHC and k-means clustering to determine the extent of agreement between these two clustering algorithms. We found that TAAHC and k-means clustering (iterated 300 times) both give results with excellent test-retest reliability across three sessions. The four global maps derived using both methods are highly similar (Figure 2). Cronbach's α values for microstate features calculated using maps from these two methods are all above 0.9, indicating that TAAHC and k-means give highly consistent results.
Because microstate analysis considers the topography of potentials over the entire cortex in a global representation of brain state, and given the nature of the microstate maps extracted using 30 electrodes, we hypothesized that fewer electrodes could successfully identify the four microstate maps A, B, C, and D and be used to conduct microstate analysis reliably. To test this hypothesis, we reduced our original data to subsets of 19 and 8 electrodes and repeated the analysis. We could clearly identify maps representing microstates A, B, C, and D in both the 19- and 8-electrode data (Figure 2). The results of these analyses were also highly reliable (Table 2) and, interestingly, even appeared slightly more reliable than the 30-electrode data. It is possible that elimination of superfluous electrodes reduced noise in the data and refined the results. We also compared the results from 30-, 19-, and 8-electrode data in a single session to assess the degree of consistency between these electrode arrays. In general, results from these arrays are consistent (Table 3). Notably, results for microstates C and D appear less consistent. This is unsurprising, as the topographies of C and D are similar and are probably more difficult to resolve with fewer electrodes. Nevertheless, these data suggest that microstate analysis can be conducted reliably with as few as 8 electrodes. This may be particularly relevant in the development of microstates as clinically useful biomarkers of disease for longitudinal assessment over time, because it reduces the invasiveness and inconvenience that are otherwise associated with clinical EEG studies.
Within individual recordings, we found positive correlations among the average durations and frequencies of all microstates, suggesting that individuals have a tendency toward relatively longer or shorter and relatively more or less frequent microstates in general. This likely reflects natural inter-subject variance, and may also be a function of age [21]. We were also interested in determining how microstate features relate to power in EEG frequency bands. We found that the global average microstate duration decreases and global average frequency increases with increasing relative power in higher spectral frequencies (beta vs alpha bands). These associations are likely due to the fact that increased power in higher frequencies reflects faster cortical oscillations, which gives more frequent local GFP maxima and enables finer resolution of microstate transitions. This also suggests that eyes-open EEG data might be expected to yield shorter and more frequent microstates, as alpha power is reduced in the eyes-open state. Our findings agree somewhat with the findings of Koenig et al. (2002), in which the shortening of overall average microstate lifespans with age was correlated with increasing relative power in higher frequency bands, although they reported lower correlation values (R² < 0.44) [21]. We did not find evidence of significant association between the coverage of any microstate and power in any frequency band, in agreement with the findings of Britz et al. (2010) [10].
The statistics we present here are an important contribution to the translation of microstates to clinical practice as potential biomarkers of neurophysiological health for longitudinal monitoring in individual patients. Unlike experimental paradigms, in which results from multiple subjects are aggregated to produce estimates of variance that are used to determine the significance of microstate differences between groups (see, for example, [13][14][15][16][37][38][39][40]), assessing the significance of changes in microstates observed across repeated measurements in a single subject requires estimates of measurement reliability. To this end, for example, the SEM can be used to estimate a 95% confidence interval for microstate features outside of which differences in repeated measurements in a single individual can be reasonably attributed to changes in the true value, rather than measurement error [41,42]. This estimate is given by $Z_{\alpha/2}\,\sqrt{2}\cdot\mathrm{SEM}$, where $Z_{\alpha/2} = 1.96$ for a Type I error threshold of 5%. Thus, for the overall microstate duration using 30 electrodes, TAAHC, and a global maps strategy, the SEM (6.41 ms from Table 1) suggests that, for repeated measurements in a single individual, a change in overall microstate duration exceeding 17.77 ms can be considered significant with 95% confidence. Similarly, reliability can be used to estimate the false-positive and false-negative rates if microstate features are used as decision thresholds [43]. As Cronbach's α is an ICC, it can be used in power calculations in the design of large trials [44][45][46]. We encourage other investigators to use the values reported herein to optimally design future studies.
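The threshold calculation described above can be reproduced with a short worked example. This is a sketch only: the function name is hypothetical, and the SEM of 6.41 ms is the value quoted in the text for overall microstate duration.

```python
import math


def minimal_detectable_change(sem, z=1.96):
    """Smallest within-subject change exceeding measurement error: z * sqrt(2) * SEM."""
    # sqrt(2) accounts for measurement error being present in both measurements.
    return z * math.sqrt(2.0) * sem


# SEM for overall microstate duration quoted in the text (ms).
threshold_ms = minimal_detectable_change(6.41)
print(round(threshold_ms, 2))  # 17.77
```

The same function can be applied to any other feature by substituting its SEM from Tables 1 and 2.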

Conclusions
In this study, we found that when a global set of microstates is used to conduct microstate analysis over multiple sessions, resting-state EEG microstate analysis has high test-retest reliability in healthy subjects as measured by Cronbach's α and SEM. We also determined the consistency of the k-means clustering and TAAHC algorithms in extracting microstate maps. Finally, we found that microstate analysis can be reliably conducted with as few as 8 electrodes.
The microstate features we analyzed in this study have been shown to vary in altered cognitive/behavioral states and neuropsychiatric disease, and may be related to the neurophysiological changes that underlie these disorders. The clinical use of microstates as potential biomarkers of disease presupposes within-patient reliability of relevant features, so that changes in the features can reasonably be attributed to changes in neurophysiology. Our aim in this study was to assess the degree of this reliability. Our results indicate good reliability of all the features we examined, and suggest potential value in further exploring microstates as neurophysiological markers of disease in future studies.