Dynamical Hurst analysis identifies EEG channel differences between PTSD and healthy controls

We employ a time-dependent Hurst analysis to identify EEG signals that differentiate between healthy controls and subjects with combat-related PTSD. The Hurst exponents, calculated using a rescaled range analysis, show a significant difference between healthy and PTSD samples, which may lead to diagnostic applications. To overcome the non-stationarity of EEG data, we apply a window length within which the EEG data display stationary behavior. We then use the Hurst exponents for each channel as hypothesis test statistics to identify differences between PTSD cases and controls. Our cohort comprised 12 subjects, half of whom were healthy controls. The Hurst exponent of the PTSD subjects is found to be significantly smaller than that of the healthy controls in channel F3. Our results indicate that F3 may be a useful channel for diagnostic applications of Hurst exponents in distinguishing PTSD subjects from healthy subjects.


1-Introduction
The electroencephalogram (EEG) measures temporal variations in voltage that reflect the electrical activity of neurons in the brain [1]. EEG signals contain relevant dynamic information about the brain's electrophysiological activity. Thus, the prediction and modeling of EEG signals is an important area of biological and biomedical research [2,3]. EEG signals feature non-linear and non-stationary pseudo-oscillatory behavior that characterizes spontaneous brain oscillations such as alpha waves. To extract important features of EEG for the diagnosis of different diseases, advanced signal processing techniques are required. Various states and conditions influence the signals, such as sleep, epilepsy, reflexology, drugs/anesthesia, diabetes, meditation, experiencing emotions, and listening to music, as do artifacts [4]. Long-term and short-term characteristics of EEG time series have been investigated in biological applications [5], and EEG time series have been studied to identify affected regions of the brain in diseases such as epilepsy [6].
In the current study, EEG was employed to study time-series differences related to posttraumatic stress disorder (PTSD). In a study of the dynamical complexity of EEG time series in 27 PTSD and 14 healthy people, Jeong-Ho Chae et al. (2004) found reduced complexity in channels Fp1, F8, C4, P4, T3, T4, T5, T6 and O1 for PTSD cases [7]. Another group calculated non-linear independence (NI) values of EEG data from 16 channels for 18 pairs of PTSD patients and healthy controls. They showed that, in PTSD patients, NI factors increase in channels F3, F7, C3, T5, and P3 and decrease in channels F4, C4, P4, and O2 [8]. In five case studies, Rutter (2014) identified channels F3, F4, C3, C4, P3, P4, Fz, Cz, and Pz as those most influenced by the disorder [9].
There have been several studies on the application of the Hurst exponent to EEG signals [10]. The Hurst exponent is a measure of the long-memory properties of signals [11,12]. In this study, we explore the possibility of developing a Hurst exponent-based method for selecting channels that may be important for prediction. We hypothesize that the long memory of the EEG signals differentiates the PTSD and healthy control groups. To this end, we investigate the long-memory properties of the EEG data by applying a time-dependent Hurst analysis using the rescaled range (R/S) technique.
The manuscript is organized as follows. First, the EEG data are described statistically. Next, the theoretical approach to the Hurst exponent calculation, including the R/S analysis method and the importance of stationary data, is explained. Finally, the results are presented and discussed.

Material and methods
EEG data were collected at the Laureate Institute for Brain Research as part of a simultaneous EEG and fMRI study [13] conducted on individuals with combat-related PTSD and healthy controls. The study was approved by the Western Institutional Review Board, Puyallup, WA. All procedures with human subjects were conducted according to the code of ethics of the World Medical Association (Declaration of Helsinki) for experiments involving humans. All subjects gave written informed consent to participate in the study and received financial compensation.
For the EEG preprocessing, the MRI gradient artifact and the ballistocardiogram (BCG) artifact were removed using the template subtraction method. After the gradient artifact removal, the EEG data were downsampled to 250 samples/s (4 ms temporal resolution) and low-pass filtered at 40 Hz. Residual BCG artifacts, as well as blink and saccade artifacts, were removed using independent component analysis (ICA). Because PTSD subjects moved during the fMRI scan, we removed time periods with subject head motion. Each scan lasted 526 s. The first 6 s were removed to allow the signal to reach steady state, leaving 130,000 time points in each channel (520 s at 250 samples/s). For the analysis, we included only the first 50,000 available data points without subject motion. Provided that there are sufficient EEG data points to reach stationarity, using fewer data points does not affect the results statistically but decreases the calculation time.
For the Hurst analysis, we calculated the temporal changes in the preprocessed data. As we will discuss later in Section 2.2, the Hurst exponent differentiates most strongly between healthy and PTSD subjects for the F3 channel. Thus, we summarize the statistics of the F3 channel data for all subjects (Table 1). Note that positive skewness and kurtosis of the EEG data are found for both groups of subjects. The positive skewness indicates an asymmetrical distribution of the EEG signal amplitude with a long tail to the right. Furthermore, the positive kurtosis suggests that the distribution about the mean is more peaked than a Gaussian distribution. EEG time-series distributions in μV for channel F3 for each subject are shown in Fig 1. The distributions of the other channels are given in the supplementary materials, S1 Appendix.
The Hurst exponent is calculated using the rescaled range (R/S) analysis [14]. This method can be described by the following steps:
Step 1: Calculate the logarithmic returns of the detrended time series. The resulting series has length N = t − 1, where t is the length of the original time series.
Step 2: Split the time series into m adjoining subsets $S_j$ of length n, where m × n = N and j = 1, 2, ..., m. The elements of each subset are denoted $N_{k,j}$, with k = 1, 2, ..., n. The average of each subset $S_j$ is calculated as:
$$e_j = \frac{1}{n} \sum_{k=1}^{n} N_{k,j}$$
Step 3: Calculate the cumulative deviation from the average for each subset $S_j$ as:
$$X_{k,j} = \sum_{i=1}^{k} \left( N_{i,j} - e_j \right), \quad k = 1, 2, \ldots, n$$
Step 4: The range of each subset is calculated as:
$$R_{I_j} = \max_{1 \le k \le n} X_{k,j} - \min_{1 \le k \le n} X_{k,j}$$
Step 5: The standard deviation of each subset is calculated as:
$$S_{I_j} = \sqrt{\frac{1}{n} \sum_{k=1}^{n} \left( N_{k,j} - e_j \right)^2}$$
Step 6: The range $R_{I_j}$ of each subset is rescaled by the corresponding standard deviation $S_{I_j}$. The average R/S value for windows of length n is therefore:
$$(R/S)_n = \frac{1}{m} \sum_{j=1}^{m} \frac{R_{I_j}}{S_{I_j}}$$
All of the above steps are repeated for different values of n.
Step 7: Plot $\log (R/S)_n$ versus $\log(n)$. The slope of this graph gives the Hurst exponent H [15].
The Hurst exponent is thus obtained from the rescaled range relation estimated through the above steps:
$$(R/S)_n \propto n^{H}$$
where H is the Hurst exponent for each EEG signal and n is the number of data points [16,17,18].
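The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in this study: the input is taken to be the series already prepared in Step 1, the subset lengths n are doubled from a hypothetical minimum of 8, the standard deviation is the population form (division by n), and the function names `rescaled_range` and `hurst_rs` are illustrative only.

```python
import math
import random


def rescaled_range(segment):
    """R/S for one subset: range of the cumulative deviations over the std."""
    n = len(segment)
    mean = sum(segment) / n
    cumulative, running = [], 0.0
    for value in segment:
        running += value - mean          # cumulative deviation (Step 3)
        cumulative.append(running)
    spread = max(cumulative) - min(cumulative)                   # Step 4
    std = math.sqrt(sum((v - mean) ** 2 for v in segment) / n)   # Step 5
    return spread / std if std > 0 else None


def hurst_rs(series, min_n=8):
    """Estimate H as the least-squares slope of log (R/S)_n against log n."""
    total = len(series)
    log_n, log_rs = [], []
    n = min_n
    while n <= total // 2:
        m = total // n                   # number of adjoining subsets (Step 2)
        values = [rescaled_range(series[j * n:(j + 1) * n]) for j in range(m)]
        values = [v for v in values if v is not None]
        if values:
            log_n.append(math.log(n))
            log_rs.append(math.log(sum(values) / len(values)))   # Step 6
        n *= 2
    mean_x = sum(log_n) / len(log_n)
    mean_y = sum(log_rs) / len(log_rs)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_n, log_rs))
    denominator = sum((x - mean_x) ** 2 for x in log_n)
    return numerator / denominator       # slope of the log-log fit (Step 7)


# Synthetic, uncorrelated noise (not EEG data) as a sanity check.
rng = random.Random(0)
white_noise = [rng.gauss(0.0, 1.0) for _ in range(4096)]
h_noise = hurst_rs(white_noise)
```

For uncorrelated noise the estimate is expected to lie near 0.5, although R/S estimates on short series are known to be biased upward, whereas a strongly persistent series gives values closer to 1.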

Stationarity of data.
A time series is considered stationary when its statistical properties, such as mean, variance, and autocorrelation, are constant over time. In terms of probability, if the probability distribution function of a time series does not change with time, it can be considered a stationary process [19,20]. In practice, most statistical forecasting methods are based on the assumption that the time series can be rendered approximately stationary through the use of mathematical transformations.
The R/S method yields reliable Hurst exponents only for stationary time series, whereas EEG signals are strongly non-stationary [21]. Thus, to investigate the dynamical Hurst exponents of EEG signals, the non-stationarity of the data must be resolved [22]. One possibility is to process the data within a window that is large enough for the data to behave statistically like a stationary time series. This approach is beneficial only if the statistical properties of the data, such as mean and standard deviation, saturate over an increasing time scale.
In this study, we used the variation of the standard deviation calculated within different time window lengths to estimate the window width that best fulfills the stationarity criterion. The stationarity criterion is calculated separately for each channel and may differ between channels. Since the Hurst exponent calculation for each channel in one subject was time consuming, we performed a preliminary data examination using a smaller set of subjects (the first eight available subjects) to determine which electrodes to focus on in further analysis. In the preliminary analysis, we calculated the time variations of the standard deviations and the Hurst exponents for all channels. Since the Hurst exponents for all channels except F3 did not show any significant group difference, we focus on channel F3 for the Hurst exponent calculations and further analysis. The standard deviation of F3 against time window length for all subjects is shown in Fig 2.
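The window-length search described above can be illustrated with a short Python sketch. Several choices here are assumptions made for illustration and are not taken from the study: the standard deviation is computed over the first n samples for growing n, saturation is declared when the curve stays within a hypothetical relative `tolerance` of its final value, and the function names are illustrative.

```python
import math
import random


def population_std(values):
    mean = math.fsum(values) / len(values)
    return math.sqrt(math.fsum((v - mean) ** 2 for v in values) / len(values))


def std_by_window_length(series, step):
    """Standard deviation of the first n samples for n = step, 2*step, ..."""
    return [(n, population_std(series[:n]))
            for n in range(step, len(series) + 1, step)]


def saturation_length(curve, tolerance=0.05):
    """First length from which the std stays within `tolerance` of its final value."""
    final = curve[-1][1]
    for i, (n, _) in enumerate(curve):
        if all(abs(s - final) <= tolerance * final for _, s in curve[i:]):
            return n
    return curve[-1][0]


def nearest_power_of_two(n):
    """Power of 2 closest to n on a linear scale."""
    lower = 2 ** int(math.floor(math.log2(n)))
    upper = 2 * lower
    return lower if n - lower <= upper - n else upper


# Synthetic stationary noise (not EEG data), with a small step for speed.
rng = random.Random(1)
series = [rng.gauss(0.0, 1.0) for _ in range(5000)]
curve = std_by_window_length(series, step=500)
stationary_point = saturation_length(curve)
window = nearest_power_of_two(stationary_point)
```

Rounding to a power of 2 mirrors the dotted line in Fig 2; the tolerance controls how strictly "saturated" is interpreted and would need to be tuned for real recordings.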

Results and discussions
Positive skewness and kurtosis indicate deviation from a Gaussian distribution. Our statistical inferences demonstrate that the EEG data are strongly non-Gaussian (Table 1). To prepare the data for the estimation of the Hurst exponent, the data are segmented according to the saturation window length, as explained in Section 2.2. The saturation window length, which we call the stationary point, is determined for each EEG signal by calculating the signal standard deviation versus time for all 31 channels of all 12 subjects.
We compute the variation of the standard deviation over time for channel F3 for each of the 12 subjects (Fig 2). Each curve corresponds to one healthy or PTSD subject and comprises 49 windows of 1,000 data points each. The power of 2 closest to the stationary point is plotted as a dotted line. Our results show that, although for many EEG signals the standard deviation saturates over a few thousand data points, the largest saturation point, which is large enough for both the original and filtered data to be considered stationary, is 32,768 points (approximately 131 s at 250 samples/s).
Having determined the window length within which the EEG data can be considered stationary (32,768 points), we performed the Hurst exponent calculations within moving windows of this length for all EEG channels and subjects. At each iteration the window slides along the time series: the first 1,000 data points of the window are removed and the next 1,000 new data points are appended at its end. For the 50,000 data points considered in each EEG channel, there are almost 17,000 moving windows and hence almost 17,000 Hurst exponents.
The Hurst exponents calculated for the representative channel F3 from the preprocessed data are presented in Fig 3. The Hurst exponents for the other channels can be found in the supplementary material. Fig 3 shows that all subjects, healthy and PTSD, possess Hurst exponents with highly persistent behavior (H > 0.5). The high Hurst exponent values indicate strong correlation in the data, which leads to long-term memory. The Hurst exponent separation between healthy and PTSD subjects is small for channel F3, but the difference between the groups is statistically significant (Fig 3). We used a Mann-Whitney U test to investigate the null hypothesis of no difference in the Hurst exponent between the PTSD and control groups. The Hurst exponent of the PTSD group is found to be significantly smaller than that of the healthy controls (p < 0.0260; Table 2).
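As a brief illustration of the skewness and kurtosis statistics referred to at the start of this section, the moment-based definitions can be sketched as follows. The sketch assumes that the reported kurtosis is the excess kurtosis (zero for a Gaussian), which is consistent with reading a positive value as "more peaked than Gaussian"; the function name and the toy values are hypothetical and are not study data.

```python
import math


def skewness_and_excess_kurtosis(values):
    """Moment-based skewness and excess kurtosis (both 0 for a Gaussian)."""
    n = len(values)
    mean = math.fsum(values) / n
    m2 = math.fsum((v - mean) ** 2 for v in values) / n
    m3 = math.fsum((v - mean) ** 3 for v in values) / n
    m4 = math.fsum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


# Toy amplitudes with a long right tail: both statistics come out positive.
skew, kurt = skewness_and_excess_kurtosis([0.0, 0.0, 0.0, 0.0, 10.0])
```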
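The moving-window scheme can be sketched as below. The number of windows depends strongly on the step by which the window advances: for 50,000 points and a window of 32,768 points, advancing by one sample gives 17,233 windows, whereas advancing by 1,000 samples gives 18. The function names are hypothetical, and `statistic` stands in for any per-window estimator such as an R/S Hurst estimate.

```python
def window_starts(total_points, window, step):
    """Start index of every full window of length `window` advanced by `step`."""
    return list(range(0, total_points - window + 1, step))


def moving_statistic(series, window, step, statistic):
    """Apply `statistic` to each moving window and collect the results."""
    return [statistic(series[start:start + window])
            for start in window_starts(len(series), window, step)]


count_step_1 = len(window_starts(50_000, 32_768, 1))
count_step_1000 = len(window_starts(50_000, 32_768, 1_000))

# Tiny illustration on toy data: windows of 4 points advanced by 2 points.
toy_maxima = moving_statistic(list(range(10)), window=4, step=2, statistic=max)
```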
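For groups as small as six subjects each, the Mann-Whitney U test can be evaluated exactly by enumerating every way of assigning the pooled ranks to the two groups. The sketch below is an illustration under assumptions: it computes a two-sided exact p-value (the text does not state whether an exact or asymptotic, one-sided or two-sided version was used), the function names are hypothetical, and the example values are invented placeholders rather than the study's Hurst exponents.

```python
from itertools import combinations


def midranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def mann_whitney_exact(group_a, group_b):
    """Return (U of group_a, exact two-sided p-value) by full enumeration."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = midranks(list(group_a) + list(group_b))
    offset = n_a * (n_a + 1) / 2
    u_observed = sum(ranks[:n_a]) - offset
    center = n_a * n_b / 2               # expected U under the null hypothesis
    extreme = total = 0
    for subset in combinations(ranks, n_a):
        total += 1
        u = sum(subset) - offset
        if abs(u - center) >= abs(u_observed - center) - 1e-12:
            extreme += 1
    return u_observed, extreme / total


# Invented placeholder values for two groups of six (not study data).
ptsd = [0.61, 0.62, 0.63, 0.64, 0.65, 0.66]
controls = [0.71, 0.72, 0.73, 0.74, 0.75, 0.76]
u_stat, p_value = mann_whitney_exact(ptsd, controls)
```

With two groups of six there are 924 possible rank assignments, so the smallest attainable two-sided p-value is 2/924, reached when the groups are completely separated, as in this placeholder example.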
Our findings suggest that the F3 channel discriminates between PTSD and healthy controls based on the Hurst exponent. The relevance of channel F3 to PTSD is consistent with other reports [8,9]. Non-linear independence (NI) values of PTSD and healthy controls calculated by J. Kim and collaborators show that NI factors increase in channel F3 in PTSD patients [8]. In five case studies, Rutter (2014) identified F3 as one of the channels most strongly associated with the disorder [9].
The Hurst exponent characterizes the long-term memory and serial dependence of the data. In addition to its potential diagnostic value, the Hurst analysis uses more information from the dataset, which provides more stable estimates.
F3 is located over the frontal region of the brain, which is involved in emotion recognition. This region is also involved in judgment, planning, sustained attention, inhibition of responses, verbal episodic memory retrieval, problem solving, sequencing, and drawing conclusions from facts. Changes in the EEG alpha band have been investigated in multiple studies [23,24,25,26]; however, we did not find a significant difference between PTSD and healthy subjects in the Hurst exponent for the EEG alpha band.