A novel RSW&TST framework of MCPs detection for abnormal pattern recognition on large-scale time series and pathological signals in epilepsy

To recognize abnormal patterns in large-scale time series and pathological signals in epilepsy quickly and efficiently, this paper presents a preliminary RSW&TST framework for Multiple Change-Points (MCPs) detection based on the Random Slide Window (RSW) and Trigeminal Search Tree (TST) methods. To avoid being trapped in local optima, the proposed framework randomly selects the size of each slide window from a predefined collection chosen according to data features and experimental knowledge. For each data segment to be diagnosed in the current slide window, the TST method detects an optimal path towards a potential change point from the root to the leaf nodes in O(log_3(N)) steps. The resulting MCPs vector is then assembled from the TST-based single CP detections on the data segments within each slide window. In our experiments, the RSW&TST framework was tested on large-scale synthetic time series, and its performance was evaluated against the existing binary search tree (BST), Kolmogorov-Smirnov (KS) statistic, and T-test methods under the fixed slide window (FSW) approach, as well as the integrated method of wild binary segmentation and CUSUM test (WBS&CUSUM). The simulation results indicate that RSW&TST is both more efficient and more effective, with a higher hit rate, shorter computing time, and lower miss, error and redundancy rates. When the proposed framework is applied to MCPs detection on pathological ECG (electrocardiogram) and EEG (electroencephalogram) recordings of people in epileptic states, the abnormal patterns are roughly recognized in terms of the number and position of the resultant MCPs. Furthermore, the severity of epilepsy is roughly analyzed based on the strength and period of signal fluctuations among multiple change points during a sudden epileptic attack.
The purpose of our RSW&TST framework is to provide an encouraging platform for abnormal pattern recognition through MCPs detection on large-scale time series quickly and efficiently.


Introduction
Generally, epilepsy is a common chronic neurological disorder, and all epilepsies involve episodic abnormal electrical activity in the brain. Epilepsies, also called seizures, may be associated with cardiac arrhythmias, prominent arterial oxygen desaturations, and sudden death [1]. The authors reported that postictal heart rate oscillations are marked by the appearance of transient but prominent heart rate oscillations in a group of patients with partial epilepsy (PE). This finding may be a marker of neuroautonomic instability, and may imply some association between perturbations of the heart rate and partial seizures [1,2]. The cardiac state is generally reflected by the shape of the ECG waveform, and heart rate. Abrupt changes in heart rate can be informative and may be used as an extra clinical sign for predicting sudden epileptic attacks. In addition, sleep-related hyper-motor epilepsy (SHE), formerly known as nocturnal frontal lobe epilepsy (NFLE), is a focal epilepsy characterized by the occurrence of abrupt and typically sleep related seizures with motor patterns of variable complexity and duration [3,4]. The identification of recurrent, transient perturbations of pathological signals in brain activity during sleep, so called cyclic alternating patterns (CAP), is of significant interest as they have been linked to multiple pathologies [3,5].
Pathological signals can be recorded and processed using current signal processing techniques, such as time series analysis [6], fast Fourier transform (FFT) [7], power spectral density (PSD) [8], empirical mode decomposition (EMD) [9,10], and wavelet analysis [11]. However, the enormous volume of data usually makes the study tedious and time-consuming with these traditional methodologies. Abnormal patterns or change points in the signals can indicate that important events have occurred, or that a system has changed in critical ways [12,13]. Change point detection has been widely studied and applied in medical research, for example in predicting the onset of illness or increasing illness severity and in gene expression detection, as well as in other fields (e.g. climate science) [14][15][16][17][18][19][20].
Sliding window strategies are very useful tools for multiple change points (MCPs) detection in signal processing, and have been investigated in various fields [21][22][23][24][25][26]. However, a key factor in these strategies is how to select a suitable sliding window size, because the window needs to capture the necessary characteristics of a signal to achieve correct detection/classification [24]. If the window size is too small, a single pattern is split across multiple consecutive windows and cannot be recognized efficiently. On the other hand, if the window size is too large, the window might contain multiple patterns, which decreases the recognition performance [22,24]. In addition, the wild binary segmentation (WBS) method is a popular technique for multiple change-point detection [27,28]. By applying a CUSUM-like test in a stochastic manner, WBS avoids the problem of span or window selection by drawing intervals of different lengths [27,28]. However, WBS might redundantly select regular and rhythmic data fluctuations in the dataset, and might even miss or discard the actual target change points (tCPs), especially when the criterion for candidate change points is unsuitable.
In this paper, a novel RSW&TST framework for MCPs detection is presented based on the random slide window (RSW) and trigeminal search tree (TST) methods [18,20,29]. With the RSW approach, a series of slide windows of random sizes is selected according to the data features and experimental knowledge. The TST method is used to detect a single change point from each of the slide windows and to assemble these into a vector of MCPs. On synthetic time series, our RSW&TST method was evaluated by comparing it to the existing FSW and WBS&CUSUM approaches, as well as the BST, KS and T methods [15,18,20,23,30], in terms of computing time and the hit, miss, error, and redundancy rates. In real experiments on pathological recordings in epilepsy, our RSW&TST was applied not only to recognize the abnormal patterns in terms of the number and position of the resultant MCPs, but also to roughly estimate the severity of the epilepsy in accordance with the strength and period of signal fluctuations among the MCPs during a sudden epileptic attack.

Definition of MCPs
Suppose a time-series signal X = (X_1, …, X_i, …, X_N) can be observed as a trajectory of a multiple data distribution process, in which the segment X_i is defined as in [31][32][33], where t ∈ {t_{i−1}+1, …, t_i}, 0 < i ≤ M, and f_i ∈ {f_1, …, f_M} is a deterministic, piecewise function of one-dimensional signals with MCPs (satisfying f_i ≠ f_{i+1} for i = 1, …, M−1, which ensures that changes occur). M ∈ {1, 2, …, n} is the number of data segment regimes, so M−1 is the number of abrupt changes, with 0 = t_0 < t_1 < ⋯ < t_i < ⋯ < t_M = n. The number M−1 and the locations η_1, …, η_{M−1} of the MCPs in the process are assumed to be unknown. The sequence (ε_i)_{i∈N} is assumed to be random white noise such that E(ε_i) is exactly or approximately zero. In the simplest case (ε_i)_{i∈N} is modeled as i.i.d., but it can also follow more complex time series distributions [33].

Wild binary segmentation (WBS)
Initially, the wild binary segmentation (WBS) method randomly draws a number of subsamples (X_s, X_{s+1}, …, X_e) from the entire data sample (X_1, X_2, …, X_T), where s and e are integers such that 1 ≤ s < e ≤ T, and then computes the CUSUM-like test on each subsample [27]. WBS then chooses the largest statistic over the entire collection of tests and takes its location as the first change-point candidate, to be tested against a certain threshold. If the candidate is considered significant, the dataset is split into two sub-segments at that point, and the same procedure is repeated recursively to the left and to the right [27]. By using the CUSUM test in a stochastic manner, WBS avoids the problem of span or window selection by drawing intervals of different lengths [27,28]. However, WBS probably encounters an issue when the entire dataset contains regular and rhythmic data fluctuations. It seems reasonable to take the largest maximiser from the entire collection of CUSUM-like tests as the first change-point candidate against a certain threshold, but the remaining parts that satisfy the candidate criterion might be redundantly selected, and the actual target change points (tCPs) might even be missed or discarded in the process of MCPs detection, especially when the threshold for candidate change points in the collection of CUSUM tests is too low or too high.
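The WBS procedure described in this section can be illustrated with a minimal Python sketch. It is not the reference implementation from [27]: the function names `cusum_stat` and `wbs`, the number of random intervals, and the threshold value are illustrative assumptions, and the contrast used is the commonly cited CUSUM statistic for a change in mean.

```python
import numpy as np

def cusum_stat(x, s, e):
    """Absolute CUSUM contrast for every split of x[s..e] (inclusive, 0-based).

    Entry k refers to a split whose left part ends at index s + k.
    """
    seg = np.asarray(x[s:e + 1], dtype=float)
    n = seg.size
    left_len = np.arange(1, n)              # points on the left of each split
    right_len = n - left_len
    csum = np.cumsum(seg)
    left_sum = csum[:-1]
    right_sum = csum[-1] - left_sum
    stat = (np.sqrt(right_len / (n * left_len)) * left_sum
            - np.sqrt(left_len / (n * right_len)) * right_sum)
    return np.abs(stat)

def wbs(x, threshold, n_intervals=200, seed=0):
    """Sketch of wild binary segmentation; returns sorted change-point indices."""
    T = len(x)
    if T < 2:
        return []
    rng = np.random.default_rng(seed)
    starts = rng.integers(0, T - 1, size=n_intervals)
    # Each random interval [a, b] satisfies 0 <= a < b <= T - 1.
    intervals = [(int(a), int(rng.integers(a + 1, T))) for a in starts]
    found = []

    def recurse(s, e):
        if e - s < 1:
            return
        best_val, best_b = -1.0, None
        # Use the whole segment plus every drawn interval that lies inside it.
        inside = [(a, b) for a, b in intervals if a >= s and b <= e]
        for a, b in [(s, e)] + inside:
            stat = cusum_stat(x, a, b)
            k = int(np.argmax(stat))
            if stat[k] > best_val:
                best_val, best_b = float(stat[k]), a + k
        if best_val > threshold:            # candidate is significant
            found.append(best_b)
            recurse(s, best_b)              # repeat to the left
            recurse(best_b + 1, e)          # repeat to the right

    recurse(0, T - 1)
    return sorted(found)
```

In this sketch a returned index marks the last sample before a change. A threshold that is too low makes the recursion keep splitting on ordinary fluctuations, while one that is too high stops it before genuine changes are reached, which is the sensitivity discussed above.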

TST-based single CP detection
In the RSW approach, as shown in Fig 1, a diagnosed time series signal X′ = (X_s, …, X_i, …, X_e) is divided into multiple data segments by random slide windows. For an observed time series segment X^i = {X^i_a, …, X^i_c, …, X^i_b} in each slide window W_i, we apply the TST method to detect a potential change point from W_i within our RSW framework [22,29]. In this TST-based single CP detection method, as shown in Fig 2, the trigeminal search trees TSTcA and TSTcD are first constructed by adding virtual middle branches to existing binary trees [18,20]. Then, the search criteria for multi-channel detection are applied to find an optimal path towards a potential change point from the root to the bottom leaf level in TSTcA and TSTcD, respectively. Finally, a resultant change point is obtained from X^i in the current W_i after log Nw_i search steps, where Nw_i is the length of X^i.
TSTs construction. If the length Nw_i of X^i is k times divisible by 2, then X^i can be decomposed into an average signal vector A^k and a set of detail signal vectors D = {D^1, D^2, …, D^k}, and represented by a mapping H_k in terms of the k-level Haar Wavelet Transform (HWT) as follows: H_k: X^i → (A^k | D^k | … | D^2 | D^1), where 1 ≤ k ≤ Lk = log_2 Nw_i, and Nw_i = |b−a| = |We_i − Ws_i|. In terms of Multi-Resolution Analysis (MRA) [34,35]

PLOS ONE
D^k can be expressed as follows: where v^k_j is the j-th level signal of the scaling vector V^k, and w^k_j is the j-th level signal of the wavelet basis vector W^k; |v^k_j| = |w^k_j| = N; Lj = N/2^k, and k = 1, 2, …, log_2 Nw_i. The coefficient vectors in the average signal set A = {A^k | 0 ≤ k ≤ Lk} and the detail signal set D = {D^k | 0 ≤ k ≤ Lk} can be further presented by the following two matrices McA and McD: where 1 ≤ j ≤ Lj, 1 ≤ k ≤ Lk, and X^i = {X^i_a, …, X^i_c, …, X^i_b}. Based on both McA and McD, as shown in Fig 2, TSTcA and TSTcD are constructed by adding virtual middle branches at each non-leaf level of the existing TcA and TcD. Therefore, the time series segment X^i can be divided into three overlapping parts S_L = {X^i_a, …, X^i_c}, S_M = {X^i_aM, …, X^i_bM}, and S_R = {X^i_{c+1}, …, X^i_b}. If a current non-leaf node cA_{k,j}/cD_{k,j} is selected in TSTcA/TSTcD, the related variables are denoted by the following formulas: where 2 ≤ k ≤ Lk and 1 ≤ j ≤ Lj; a = 2^k (j−1)+1, b = 2^k·j, and c = 2^k (j−1)+2^(k−1). The implementation of the TSTcA/TSTcD construction is described in detail in Algorithm 1.
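As a rough illustration of the decomposition behind McA and McD, the sketch below computes the average and detail coefficients of a standard orthonormal Haar transform level by level, and evaluates the index formulas for a, b and c given above. The helper names `haar_levels` and `node_span` are hypothetical, and Algorithm 1 of the paper may organize the coefficients differently.

```python
import numpy as np

def haar_levels(x):
    """Return (cA, cD): per-level Haar average and detail coefficients.

    cA[0] is the input signal, cA[k] and cD[k] hold the level-k coefficients,
    and cD[0] is None because level 0 has no detail part.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    if n < 2 or n & (n - 1):
        raise ValueError("length must be a power of two")
    cA, cD = [x], [None]
    while cA[-1].size > 1:
        prev = cA[-1]
        # Orthonormal Haar step: scaled sums and differences of adjacent pairs.
        cA.append((prev[0::2] + prev[1::2]) / np.sqrt(2))
        cD.append((prev[0::2] - prev[1::2]) / np.sqrt(2))
    return cA, cD

def node_span(k, j):
    """1-based sample range (a, b) and midpoint c covered by node (k, j)."""
    a = 2**k * (j - 1) + 1
    b = 2**k * j
    c = 2**k * (j - 1) + 2**(k - 1)
    return a, b, c
```

For example, `node_span(2, 2)` gives `(5, 8, 6)`: the second node at level 2 covers samples 5 to 8, with its left half ending at sample 6.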

Trigeminal-branches search strategies.
To detect a potential change point from a time series segment in each of the slide windows quickly and efficiently, three search criteria are introduced on the basis of data features within existing TSTcA/TSTcD at different non-leaf levels.
TSTcD-based search criterion. Definition 2.1: Suppose the time series segment X^i = {X^i_a, …, X^i_c, …, X^i_b} in a current slide window W_i is selected from the whole observed sample X = {X_1, …, X_N}. Then the variance fluctuation (VF) within X^i is defined by: where 1 ≤ a < b ≤ N, a ≤ l ≤ c, c+1 ≤ r ≤ b, m = c−a+1, and n = b−c. If VF_mn(X^i_c) > C_1(α) holds, then an abrupt change occurs at the time point X^i_c between the two adjacent data segments X^i_L = {X^i_a, …, X^i_c} and X^i_R = {X^i_{c+1}, …, X^i_b} in X^i, where C_1(α) ∈ R is the threshold of the variance fluctuation between X^i_L and X^i_R when both obey an identical distribution in X^i. On the other hand, if VF_mn(X^i_c) ≤ C_1(α) holds, then no abrupt change occurs in the current segment X^i.
Definition 2.2: Given a time series segment X^i in the slide window W_i, suppose a sub-tree cD_{k,j} is selected from the trigeminal search tree TSTcD at a non-leaf level. Three variance fluctuations VF_{k,j:L}, VF_{k,j:R}, and VF_{k,j:M} are formulated in accordance with the sub-branches cD_{k,j:L}, cD_{k,j:R} and cD_{k,j:M} as follows: where 1 ≤ j ≤ N/2^k, m = n = 2^(k−2), and 2 ≤ k ≤ log_2 N; a = 2^k (j−1)+1, b = 2^k·j, and c = 2^k (j−1)+2^(k−1).
Criterion 2.1: Given the three measurements VF_{k,j:L}, VF_{k,j:R}, and VF_{k,j:M} in Definition 2.2, if max(VF_{k,j:L}, VF_{k,j:R}, VF_{k,j:M}) > C_1(α) and 2 ≤ k ≤ log_2 N hold, then the sub-branch with the maximal VF value is selected from cD_{k,j:L}, cD_{k,j:R}, and cD_{k,j:M} in TSTcD, and the two others are discarded.
Proof 2.1: Suppose a target CP X^i_c is contained in an observed segment X^i. Then, in terms of Definition 2.1, the VF value between the two adjacent segments before and after the target X^i_c is larger than that of any other part without X^i_c. For a current non-leaf node cD_{k,j} with trigeminal branches cD_{k,j:L}, cD_{k,j:R} and cD_{k,j:M} in TSTcD, Definition 2.2 implies that the sub-branch with the maximal VF value has a higher probability of containing the target CP than the other two. Therefore, it is reasonable to select the sub-branch with the maximal VF value as the current search path and discard the others.
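The three-branch descent of Criterion 2.1 can be sketched in Python under several simplifying assumptions. The sketch works directly on the raw samples instead of the TSTcD coefficient tree; it uses the absolute difference between the means of the two halves of a branch as a stand-in fluctuation measure, because the VF formula itself is not reproduced here; and it assumes the middle branch has the same length as the left and right branches and is centred on their boundary. The names `half_contrast` and `ternary_descent` are hypothetical.

```python
import numpy as np

def half_contrast(seg):
    """Stand-in fluctuation: |mean(first half) - mean(second half)|."""
    h = seg.size // 2
    return abs(seg[:h].mean() - seg[h:].mean())

def ternary_descent(x, threshold):
    """Follow the branch with the largest fluctuation down to a pair of samples.

    Returns the 0-based index of the last sample before the suspected change,
    or None when no branch exceeds the threshold at some level.
    """
    x = np.asarray(x, dtype=float)
    lo, n = 0, x.size
    if n < 4 or n & (n - 1):
        raise ValueError("length must be a power of two and at least 4")
    while n > 2:
        half, quarter = n // 2, n // 4
        # Left, right and (assumed) centred middle branches, each of length n/2.
        starts = {"L": lo, "R": lo + half, "M": lo + quarter}
        scores = {name: half_contrast(x[s:s + half]) for name, s in starts.items()}
        best = max(scores, key=scores.get)
        if scores[best] <= threshold:
            return None                     # no distinct fluctuation: stop
        lo, n = starts[best], half          # keep one branch, discard the others
    return lo
```

The middle branch matters when a change sits near the boundary between the left and right halves: in that case neither half shows a clear internal contrast, but the overlapping middle branch does. Each level halves the search range, so the loop runs about log_2 of the window length times.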

TSTcA-based search criterion. Definition 2.3: Suppose a data segment X^i = {X^i_a, …, X^i_c, …, X^i_b} in a current slide window W_i is selected to be diagnosed from the whole time series sample X = {X_1, …, X_N}. Then the statistic fluctuation (SF) is defined as follows: where 1 ≤ a < b ≤ N, a ≤ l ≤ c, c+1 ≤ r ≤ b, m = c−a+1, and n = b−c. F_m and G_n stand for the empirical cumulative distribution functions (e.c.d.f.) of the two adjacent data segments X^i_L = {X^i_a, …, X^i_c} and X^i_R = {X^i_{c+1}, …, X^i_b} respectively, and I is an indicator function. If SF_mn(X^i_c) > C_2(β) holds, there exists an abrupt change X^i_c within X^i, where C_2(β) ∈ R is the threshold of the SF value between X^i_L and X^i_R when both obey an identical distribution. On the other hand, if SF_mn(X^i_c) ≤ C_2(β) holds, then no abrupt change exists in the current segment X^i.
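Since the SF measure is built from the empirical distribution functions F_m and G_n of the two adjacent segments, its core can be illustrated by the standard two-sample Kolmogorov-Smirnov distance. The sketch below is an assumption about the general shape of the measure, not the paper's exact formula, which may include a scaling factor in m and n; `ks_distance` and `split_distance` are hypothetical helper names.

```python
import numpy as np

def ks_distance(left, right):
    """Largest gap between the empirical CDFs of two samples."""
    left = np.sort(np.asarray(left, dtype=float))
    right = np.sort(np.asarray(right, dtype=float))
    grid = np.concatenate([left, right])
    # ECDF value at t is the fraction of sample points <= t.
    F = np.searchsorted(left, grid, side="right") / left.size
    G = np.searchsorted(right, grid, side="right") / right.size
    return float(np.max(np.abs(F - G)))

def split_distance(segment, c):
    """Distance between segment[0..c] and segment[c+1..] (0-based split index c)."""
    segment = np.asarray(segment, dtype=float)
    return ks_distance(segment[:c + 1], segment[c + 1:])
```

A value near 1 means the two sides of the split barely overlap in distribution, while a value near 0 means they look alike; comparing this value with a threshold plays the role of the test against C_2(β).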

Definition 2.4: Consider the other tree TSTcA constructed from the same data segment X^i in W_i. Suppose a sub-tree cA_{k,j} is selected from one of the non-leaf nodes in TSTcA. Then the related variables X^i_{k,j:L}, X^i_{k,j:R} and X^i_{k,j:M} are introduced, and three statistic fluctuations SF_{k,j:L}, SF_{k,j:R}, and SF_{k,j:M} are presented according to the three sub-branches cA_{k,j:L}, cA_{k,j:R} and cA_{k,j:M} as follows: where 1 ≤ j ≤ N/2^k, and 2 ≤ k ≤ log_2 N.
Criterion 2.2: Consider the three variables SF_{k,j:L}, SF_{k,j:R}, and SF_{k,j:M} in Definition 2.4. If max(SF_{k,j:L}, SF_{k,j:R}, SF_{k,j:M}) > C_2(β) and 2 ≤ k ≤ log_2 N hold, then the sub-branch with the maximal SF value is selected from cA_{k,j:L}, cA_{k,j:R}, and cA_{k,j:M} in TSTcA, and the others are omitted.

Proof 2.2: Suppose a change point X^i_c exists in an observed segment X^i. Then, in terms of Definition 2.3, the SF value between two adjacent segments containing X^i_c will be larger than that of any part without X^i_c. Accordingly, by Definition 2.4, the sub-branch with the maximal SF value among cA_{k,j:L}, cA_{k,j:R}, and cA_{k,j:M} has a higher probability of including the target CP than the other parts. As a result, it is reasonable to choose the sub-branch with the maximal SF value as the current search path, and dismiss the others.
Leaf-node search criterion. To find a target CP from the bottom leaf nodes in TSTcA/TSTcD, another search criterion is introduced by using the revised KS statistics.
Definition 2.5: Given a sub-tree cA_{k,j}/cD_{k,j} selected from TSTcA/TSTcD at the last non-leaf level, where k = 1, two statistic variables S_L(X^i_cL) and S_R(X^i_cR) are defined in terms of the two leaf nodes cA_{0,2j−1}/cD_{0,2j−1} and cA_{0,2j}/cD_{0,2j} as follows: where the adjacent data segments are X^i_L and X^i_R = {X^i_{c+1}, …, X^i_b} respectively; cL = 2j−1, cR = 2j, m = 2j−1 or 2j, and n = N−m+1.
In addition, it is worth noting that the largest SF between F_m(x) and G_n(x) is achieved either before or after one of the signal jumps, i.e., an abrupt change.
Definition 2.6: In terms of formula (34) in Definition 2.5, we then define another two variables S^−_L and S^−_R. Thereafter, the maximal values of the two statistic measurements in accordance with the leaf nodes cA_{0,2j−1}/cD_{0,2j−1} and cA_{0,2j}/cD_{0,2j} can be obtained as S′_L and S′_R.
Criterion 2.3: If max(S′_L, S′_R) > C_3(r) holds, then the one of the two leaf nodes cA_{0,2j−1}/cD_{0,2j−1} and cA_{0,2j}/cD_{0,2j} with max(S′_L, S′_R) is chosen from TSTcA/TSTcD at the bottom level, and the other one is discarded. Accordingly, one of the two time points X^i_{2j−1} or X^i_{2j} is selected from the diagnosed segment X^i and treated as the final resultant CP. Otherwise, no abrupt change is detected from the current slide window W_i.
Proof 2.3: Provided that a change point X^i_c exists in an observed segment X^i, the values of S′_L and S′_R can be calculated precisely according to Definitions 2.5 and 2.6. Criterion 2.3 ensures that the leaf node with max(S′_L, S′_R) is selected as the resultant CP, because it has a higher probability of containing the target CP than the other one. Meanwhile, if max(S′_L, S′_R) > C_3(r) holds, then the statistic distance between the two adjacent parts X^i_L and X^i_R has exceeded the threshold value C_3(r) below which X^i_L and X^i_R are taken to belong to an identical distribution. Therefore, it can be guaranteed that the one of the two time points X^i_{2j−1} or X^i_{2j} with max(S′_L, S′_R) is chosen from the diagnosed time series segment X^i and treated as the final resultant CP detected from the observed slide window W_i, and the other one is ignored.

RSW&TST framework
In the proposed RSW&TST framework, the RSW approach randomly divides the diagnosed time series X′ = (X_s, …, X_i, …, X_e) mentioned above into multiple data segments. Then, the TST-based method is executed repeatedly to detect a potential single CP from each slide window. Our RSW&TST framework for MCPs detection is stated below.
1. First, given a slide window W_{i−1} as shown in Fig 1, the data segment is denoted as X^{i−1} = [X^{i−1}_{Ws_{i−1}}, …, X^{i−1}_{We_{i−1}}], and the total sample length TN_w_{i−1} from the first slide window W_1 to the current one W_{i−1} is denoted as TN_w_{i−1} = Nw_1 + … + Nw_{i−1}, where Nw_k = |We_k − Ws_k|, 1 ≤ k ≤ i−1, and N_X′ = length(X′).
2. For the successive slide window W_i, the candidate set of slide window sizes Set_Nw_i is defined as the set of sizes Nw_i with Cd ≤ Nw_i ≤ Cu, where Cd = Td_Nw_i and Cu = Tu_Nw_i are two predefined constants with 0 < Cd ≤ Cu < N_X′, referring to the lower and upper bounds of Nw_i respectively; 1 < i < n, and n is the number of data segments within X′.
3. Next, the remaining length N_R of the unprocessed part, from the beginning of W_i to the end of X′, is given by N_R = N_X′ − TN_w_{i−1}, where TN_w_{i−1} and N_X′ are defined as in step (1).
4. Then, the candidate set Set_Nw_i is reformulated to take N_R into account. In addition, if 0 < N_R < Cd holds, the process of MCPs detection jumps to step (8), and the RSW approach ends.
5. Next, the size of the slide window Nw_i is selected randomly from Set_Nw_i. This step is denoted by Nw_i = random(Set_Nw_i), where random(Set_Nw_i) is a pseudo-function that selects a random value of Nw_i from Set_Nw_i, and the two endpoints of slide window W_i are readjusted to Ws_i = We_{i−1}+1 and We_i = Ws_i + Nw_i, respectively.
6. Thereafter, the data segment within the current slide window W_i, X^i = [X^i_{Ws_i}, …, X^i_{We_i}], is processed by the TST-based approach to detect a potential CP. If there exists a change point CP_i in W_i, then CP_i is appended to the resultant MCPs vector in order.
7. Similarly, the procedure of TST-based single CP detection is executed repeatedly on the data segments in the successive slide windows, until 0 < N_R < Cd holds. That is, the TST-based approach for CP detection stops when the remaining part after the last slide window is shorter than the minimal window size Cd.
8. Finally, the vector of resultant MCPs is assembled from all the CPs detected in the slide windows, and the RSW&TST framework for MCPs detection comes to an end.
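The eight steps above can be summarized in a short Python sketch. It assumes that the predefined collection of window sizes is given as an explicit list (for instance powers of two, which the Haar-based trees require), uses half-open index ranges for the windows, and takes the single-window detector as a caller-supplied function. The names `random_slide_windows`, `detect_mcps` and `detect_single_cp` are hypothetical.

```python
import numpy as np

def random_slide_windows(n_total, candidate_sizes, seed=None):
    """Split range(n_total) into consecutive windows of randomly chosen sizes.

    Returns a list of half-open (start, stop) pairs. The loop stops when the
    remaining length is shorter than every candidate size.
    """
    rng = np.random.default_rng(seed)
    windows, start = [], 0
    while True:
        remaining = n_total - start
        usable = [c for c in candidate_sizes if c <= remaining]
        if not usable:
            break
        size = int(rng.choice(usable))      # random window size for this step
        windows.append((start, start + size))
        start += size                       # next window starts where this ends
    return windows

def detect_mcps(x, candidate_sizes, detect_single_cp, seed=None):
    """Run a single-CP detector on each random window and collect the results.

    detect_single_cp(segment) must return a 0-based index inside the segment,
    or None when it finds no change point.
    """
    mcps = []
    for start, stop in random_slide_windows(len(x), candidate_sizes, seed):
        cp = detect_single_cp(x[start:stop])
        if cp is not None:
            mcps.append(start + cp)         # convert to a global index
    return mcps
```

Because each window contributes at most one change point, the spacing of the candidate sizes relative to the expected distance between changes determines how many true changes can be recovered, which is why the collection is chosen from data features and experimental knowledge.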

Implementations of RSW&TST framework
In the implementation of our RSW&TST framework, the proposed RSW approach divides the whole time series into a series of data segments by random slide windows. For each slide window, the procedure of TST-based single CP detection is executed to discern an optimal path towards a potential change point in TSTcD/TSTcA. In this multi-channel search process, Criterion 2.1 is used to select the abnormal part from the trigeminal branches in TSTcD at each non-leaf level. Criterion 2.2 is used to discern the abnormal part from the trigeminal branches in TSTcA at each non-leaf level if Criterion 2.1 is not applicable because the VF value is too indistinct to be detected. Criterion 2.3 is executed to estimate a potential CP from the left and right nodes at the last leaf level. Finally, the vector of resultant MCPs is assembled in order from the CPs detected in each of the slide windows. The related algorithms, including the procedures of TSTs construction and TST-based single CP detection, as well as the integrated RSW&TST framework for MCPs detection, are described in detail in Algorithms 1-3.

Performance evaluation on MCPs detection
To evaluate the performance of the proposed RSW&TST for MCPs detection, we introduce the hit, error, miss, and redundancy rates, as well as the search time. For a current slide window W_i to be diagnosed, as shown in Fig 3, some related variables are defined in terms of the distance between the target CP (tCP) and the estimated CP (eCP) as follows:
1. Hit area: If a target CP named tCP_i is located within the current slide window W_i, the hit area HA_tCPi is formulated by HA_tCPi = [tCP_i − hd_i, tCP_i + hd_i], where hd_i is a distance constant.
6. Redundancy: If a resultant eCP_i is falsely detected from W_i when no target CP exists, then eCP_i is identified as a redundancy, and recorded as Redund(eCP_i) = 1.
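A per-window tally in the spirit of these definitions can be sketched as follows. Only the hit area and the redundancy rule are taken from the text; the rules used here for a hit (an eCP inside the hit area of a tCP in the same window), a miss (a window with a tCP but no eCP) and an error (an eCP outside every hit area in a window that contains a tCP) are assumptions made for illustration, as is the use of a single distance constant `hd` for all windows. The function name `classify_windows` is hypothetical, and the sketch returns raw counts rather than rates.

```python
def classify_windows(windows, tcps, ecps, hd):
    """Count hits, misses, errors and redundancies window by window.

    windows: list of half-open (start, stop) index pairs
    tcps:    list of target change-point indices
    ecps:    one estimated index (or None) per window, aligned with windows
    hd:      half-width of the hit area around each target change point
    """
    counts = {"hit": 0, "miss": 0, "error": 0, "redundancy": 0}
    for (start, stop), ecp in zip(windows, ecps):
        targets = [t for t in tcps if start <= t < stop]
        if not targets:
            if ecp is not None:
                counts["redundancy"] += 1   # detection where no target exists
        elif ecp is None:
            counts["miss"] += 1             # assumed rule: target but no estimate
        elif min(abs(ecp - t) for t in targets) <= hd:
            counts["hit"] += 1              # assumed rule: estimate in a hit area
        else:
            counts["error"] += 1            # assumed rule: estimate too far away
    return counts
```

For example, with windows `[(0, 100), (100, 200), (200, 300), (300, 400)]`, targets `[50, 150, 250]`, estimates `[52, 190, None, 310]` and `hd = 5`, the four windows are counted as one hit, one error, one miss and one redundancy respectively.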
On the basis of these definitions above, the hit, miss, error, and redundancy rates, as well as search time, are then introduced as follows:

Results
In our synthetic simulation experiments, we evaluated the proposed RSW&TST framework by comparing it with the fixed slide window (FSW) approach and the CUSUM-based wild binary segmentation (WBS&CUSUM) method using the hit, miss, error and redundancy rates. Specifically, our TST method was verified against the existing BST, KS and T methods under the RSW and FSW approaches. When our RSW&TST was applied for MCPs detection on different pathological signals in the clinical databases on PhysioNet [1,5,31], the abnormal patterns were recognized in terms of the data features among the resultant MCPs within abnormal data segments of epilepsy patients.

MCPs detection on synthetic time series
In the synthetic experiments, a synthetic time series sample X = {X^1, …, X^i, …, X^N} was assembled from N data segments, in which X^i = {X^i_1, …, X^i_j, …, X^i_m} was composed of m random numbers drawn from N(μ, σ), and the parameters μ and σ were taken randomly from two sets μ = {μ_1, μ_2, …, μ_N} and σ = {σ_1, σ_2, …, σ_N} respectively. Therefore, a total of N−1 target MCPs were assigned in the whole time series X.
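A synthetic sample of this kind can be generated with a few lines of Python. The sketch below is an illustration rather than the authors' generator: the function name `synth_series`, the example parameter sets and the convention of reporting each target change point as the last index of a segment are assumptions. Because the parameters are drawn independently, two adjacent segments can receive the same (μ, σ) pair, in which case the boundary between them is not a genuine change.

```python
import numpy as np

def synth_series(n_segments, seg_len, mus, sigmas, seed=None):
    """Concatenate n_segments Gaussian segments of length seg_len.

    Each segment draws its mean from mus and its standard deviation from
    sigmas at random. Returns the series and the n_segments - 1 boundary
    indices (last index of each segment except the final one).
    """
    rng = np.random.default_rng(seed)
    parts = [
        rng.normal(rng.choice(mus), rng.choice(sigmas), seg_len)
        for _ in range(n_segments)
    ]
    x = np.concatenate(parts)
    tcps = [seg_len * i - 1 for i in range(1, n_segments)]
    return x, tcps
```

Calling `synth_series(31, 1024, [0, 2, 4], [0.5, 1.0], seed=0)` would produce a series with 30 target boundaries, which matches the smallest setting used in the experiments below in count only; the actual segment lengths and parameter sets of the paper are not stated in this section.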
First, a series of time series samples was synthesized with different numbers of target MCPs ranging from 30 to 120, and the proposed RSW&TST framework was tested by comparing it to the BST, KS and T methods, respectively. As shown in Table 1, the results indicate that our TST has the highest hit rate and the shortest computing time of all four methods, as well as relatively lower miss, error and redundancy rates than most of the others. Specifically, the trend analyses in Fig 4 indicate that all tracks of the proposed TST stay at more satisfactory levels and show more stable dynamics, without drastic oscillations, than those of the BST, KS and T methods as the number of MCPs changes, especially when the number of MCPs is much higher or lower.
The second simulations focus on testing the four methods above in the FSW framework, by using the sample of the fixed 30 tMCPs with different slide window sizes. Generally, the mean analyses listed in Table 2 reveal that our TST has relatively better performance due to having the shortest time and highest hit rate, as well as the lowest values of error and redundancy rates of all four methods. Unfortunately, all four methods have unsatisfactory and much lower efficiency than the former simulation results in the RSW approach. Furthermore, as for Nw ranging from 2^6 to 2^15, the trend analyses in Fig 5 show that our TST has much lower and more stable tracks of error and redundancy rates, but unstable hit and miss rates with more drastic fluctuations than the other three methods, especially when the size of Nw is much larger or smaller.
In addition, based on the synthetic sample with 30 predefined tMCPs under the TST-based FSW framework, some representative simulations were selected from the experiments in Fig 5, and the results were plotted for Nw = 2^6, 2^11, and 2^15 respectively. The results shown in Table 3 and Fig 6 indicate that the TST-based FSW framework has the best efficiency when Nw takes a suitable value of 2^11, but is much more sensitive and performs worse when Nw takes the other values of 2^6 and 2^15. Therefore, a suitable slide window size is very important for the efficiency of MCPs detection methods under the FSW framework.
In the final synthetic experiments, MCPs detection with different numbers of tMCPs from 5 to 30 was carried out using WBS&CUSUM and the proposed RSW&TST respectively. Generally, the results in Fig 7 and Table 4 indicate that WBS&CUSUM tends to be unstable and inefficient as the number of tMCPs increases, with a lower hit rate, higher error and redundancy rates, and a longer search time. In particular, as shown in Fig 7(F), a missed area appears in the last half of the time series sample, and none of the consecutive tMCPs in it is detected. These results suggest that WBS&CUSUM is inefficient for MCPs detection on these synthetic datasets, probably because this 'greedy' method has difficulty distinguishing transient and drastic data fluctuations from regular and rhythmic oscillations, especially when the threshold for candidate change points in the collection of CUSUM tests is unsuitable.
Meanwhile, the proposed RSW&TST was also executed for MCPs detection on these synthetic datasets. Compared with the WBS&CUSUM method above, as shown in Fig 8 and Table 5, the simulations indicate that our RSW&TST remains more stable and efficient as the number of tMCPs increases from 5 to 30, with a higher hit rate, lower miss, error and redundancy rates, and a shorter search time. These results suggest that our RSW&TST can successfully detect the target MCPs on these synthetic time series. A plausible reason is that it uses a global threshold of data fluctuation and estimates the candidate change points in order from the data segment in each random slide window, without any discarding or eliding operation in the MCPs detection procedure.
In summary, these simulation results indicate that our RSW&TST framework performs much better than both the FSW approach and the WBS&CUSUM method. It is an encouraging and efficient method for MCPs detection on large-scale time series.

Abnormal pattern recognition on pathological signals
In the real data experiments, the proposed RSW&TST framework was used for MCPs detection on pathological recordings in the CAP sleep [5] and Post-Ictal Heart Rate Oscillations in Partial Epilepsy databases on PhysioNet [1,31]. First, it was evaluated by comparing it to the existing FSW and WBS&CUSUM approaches, as well as the BST, KS and T methods, respectively. Second, in terms of the numbers and positions of the resultant MCPs, the abnormal patterns were roughly recognized during a sudden epileptic attack. Last, the severity of the attack was roughly discerned based on the data features among the MCPs within different abnormal areas.
In the first experiment, one ECG sample was selected from 22 pathological signals in nfle10m, which is one of 40 recordings of patients diagnosed with nocturnal frontal lobe epilepsy (NFLE) in the CAP sleep databases [5]. The diagnosed ECG sample shown in Fig 9 can be roughly divided into two normal segments near the left and right parts, and an abnormal region in the middle within a sudden attack area called Sa1. Under the FSW framework, our TST was executed for MCPs detection on this ECG segment during a transient period of an epileptic attack. For slide window sizes Nw from 2^10 to 2^14, the results in Fig 9(A)-9(E) show that the abnormal ECG region within Sa1 can be clearly discerned when Nw = 2^12, in spite of a redundant CP Rc1 near the normal left part. However, when Nw is below 2^12, the smaller Nw is, the greater the number of redundant CPs, and so Sa1 is harder to identify. Conversely, when Nw is larger than 2^12, the larger Nw is, the more CPs are missed, and Sa1 is again harder to discern. These results indicate that, when Nw takes a suitable value, our TST method can roughly distinguish the abnormal ECG segment area under a sudden epileptic attack, but it might fail when Nw is too large or too small. Unfortunately, it is very hard to determine an optimal value of Nw for the FSW framework, due to the complicated data features, especially for large-scale pathological bio-signals. Meanwhile, our RSW&TST framework was tested further on the same ECG sample with a sudden epileptic attack. Compared with the existing BST, KS and T methods, the results shown in Fig 10 illustrate that our TSTKS can clearly distinguish the target abnormal segment area ASa1 between the resultant MCPs, without any redundant or missed points. For the other three methods, however, ASa1 is hard to discern from the positions of the resultant MCPs, because of redundant points within the normal areas near the left and/or right parts and, in particular, because the beginning and/or end of the target MCPs is missing. These results suggest that the proposed RSW&TST framework can successfully distinguish the abnormal ECG segment under a sudden epileptic attack, and that it does so without requiring a suitable value of Nw as the FSW approach does.
Moreover, the proposed RSW&TST and existing WBS&CUSUM were executed for MCPs detection on different ECG signals from sz04m.mat in the Partial Epilepsy databases [1,31]. The ECG segments were selected from the sz04m recording with different start points Dstart = 3800, 24800, 415000, and 949000, and then the resultant MCPs were detected by using our RSW&TST and existing WBS&CUSUM, respectively. Generally, as shown in Fig 11, in terms of different locations of the resultant MCPs, our RSW&TST can efficiently discern the abnormal areas with drastic fluctuations including AZ-A1, AZ-B1,B2,B3, AZ-C1, and
AZ-D1, except for a few redundant eCPs in Fig 11(A) and 11(D). However, the WBS&CUSUM method seems insensitive to the drastic fluctuations and, in particular, detects some redundant eCPs in the regular and rhythmic parts of these ECG signals. These results suggest that our RSW&TST is more effective at recognizing abnormal patterns in pathological ECG recordings of a patient with Post-Ictal Heart Rate Oscillations.
In the second experiment, our RSW&TST was executed for MCPs detection on pathological signals including ROC-LOC, SX1-SX2, EMG1-EMG2, Pleth, and Ox status, all selected from the same nfle10m recording of a patient diagnosed with NFLE in the CAP sleep databases [5]; abnormal patterns were then roughly recognized according to the numbers and positions of the resultant MCPs obtained from the different signals. Of the two EEG signals shown in Fig 12(A) and 12(B), ROC-LOC seems more sensitive to a sudden NFLE attack, because it has the earliest start CP (EScp-a) as well as continuous and dramatic fluctuations of larger magnitude. SX1-SX2, on the other hand, initially responds more slowly, then shows intermittent and intensive oscillations, and has the latest end CP (LEcp-b). In addition, EMG1-EMG2 in Fig 12(C) is less sensitive, with weaker fluctuations and smaller swings, but has the largest number of resultant MCPs. For Pleth in Fig 12(D), the periodic trace is intermittently disrupted by irregular and moderate fluctuations. Fig 12(E) shows that the Ox status consists only of several square waves of different widths, and has the latest start CP (LScp-e) and the earliest end CP (EEcp-e) of all five signals. To
some extent, these experimental results suggest that the two EEG signals ROC-LOC and SX1-SX2 are the most strongly correlated with the NFLE attack, because they show more drastic fluctuations among the MCPs during the attack. Therefore, EScp-a in ROC-LOC and LEcp-b in SX1-SX2 can roughly serve as two indicators of the start and end of a sudden NFLE attack.
In the last experiment, our RSW&TST was applied for MCPs detection on five ECG recordings of different patients with Post-Ictal Heart Rate Oscillations, selected from sz01m to sz05m in the Partial Epilepsy databases [1,31]. For sz01m in Fig 13(A), the abnormality appears mainly as a larger and more intensive segment, AS-a2, with intermittent and sharp oscillations and eleven clustered change points, together with three smaller segments, AS-a1, AS-a3 and AS-a4, each locally scattered with a single change point. The sample sz02m in Fig 13(B) roughly contains one whole abnormal area, AS-b1, with ten change points in total; it has more persistent and intensive fluctuations than sz01m, and the longest duration of all five ECG samples. Like sz02m, sz04m in Fig 13(D) has one abnormal segment, AS-d1, containing five CPs in total, but it has a shorter onset time as well as more rapid and dramatic oscillations. For sz03m and sz05m in Fig 13(C) and 13(E) respectively, although both samples have three abnormal parts, sz05m closely resembles sz01m, except that sz01m has one additional segment, AS-a4, whereas sz03m looks milder and shows slighter fluctuations in response to the PE attack.
These experimental results suggest that abnormal segments in each of the ECG samples can generally be distinguished according to the positions and numbers of the detected MCPs, and that the severity of a patient's condition during a partial epilepsy attack can be further evaluated from the data features of the different abnormal areas. Generally, the greater the number of MCPs, the more abnormal zones there are. Specifically, the longer the duration and the stronger the data fluctuations, the greater the severity of the partial epilepsy attack.
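One way to make this qualitative reading concrete is to group the detected MCPs into abnormal zones and report, for each zone, the number of CPs, the duration, and a fluctuation strength. The following is a minimal sketch under stated assumptions, not the procedure used in the experiments: the function `summarize_zones`, the grouping rule based on a hypothetical `max_gap` parameter, and the use of the standard deviation as the strength measure are all illustrative choices.

```python
from statistics import pstdev
from typing import Dict, List, Sequence


def summarize_zones(signal: Sequence[float], cps: Sequence[int],
                    max_gap: int) -> List[Dict[str, float]]:
    """Group change points into zones and summarize each zone.

    Assumption: consecutive CPs no more than max_gap samples apart
    belong to the same abnormal zone.
    """
    groups: List[List[int]] = []
    for cp in sorted(cps):
        if groups and cp - groups[-1][-1] <= max_gap:
            groups[-1].append(cp)   # extend the current zone
        else:
            groups.append([cp])     # start a new zone

    summary = []
    for group in groups:
        start, end = group[0], group[-1]
        segment = signal[start:end + 1]
        summary.append({
            "start": start,
            "end": end,
            "n_cps": len(group),
            "duration": end - start,      # in samples
            "strength": pstdev(segment),  # population standard deviation
        })
    return summary


# Toy signal: flat, then oscillating, then flat again.
toy = [0.0] * 50 + [1.0, -1.0] * 25 + [0.0] * 50
for zone in summarize_zones(toy, cps=[50, 60, 99, 145], max_gap=40):
    print(zone)
```

In this toy case the three clustered CPs form one zone with a nonzero duration and strength, while the isolated CP forms a zone of zero duration, mirroring the distinction between gathered and scattered change points.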

Conclusions
In this paper, a novel RSW&TST framework was proposed for MCPs detection on large-scale time series. In our method, an observed data sample is first divided into a series of data segments by the random slide window strategy, in which the slide window size is chosen at random from a predefined collection based on data characteristics and experimental knowledge. Then, the data segment in each slide window is diagnosed by a TST-based CP detection procedure, in which a potential change point is estimated from the top root to the bottom leaf nodes of the target TST by using multi-channel search criteria. Finally, the resultant MCPs are assembled from the single change points detected in each slide window.
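The three steps above can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes consecutive, non-overlapping windows, and it replaces the TST-based procedure with a plain mean-shift scan (`mean_shift_cp`) as a stand-in single-CP detector. The function names, the window sizes, and the `threshold` value are all hypothetical.

```python
import math
import random
from typing import Callable, List, Optional, Sequence, Tuple


def random_slide_windows(n_samples: int, sizes: Sequence[int],
                         rng: random.Random) -> List[Tuple[int, int]]:
    """Cover [0, n_samples) with consecutive windows of randomly drawn sizes."""
    windows = []
    start = 0
    while start < n_samples:
        end = min(start + rng.choice(list(sizes)), n_samples)
        windows.append((start, end))
        start = end
    return windows


def mean_shift_cp(segment: Sequence[float],
                  threshold: float = 3.0) -> Optional[int]:
    """Stand-in detector (NOT the TST): scan every split point and keep the
    one with the largest scaled gap between left and right means."""
    n = len(segment)
    total = sum(segment)
    best_k, best_stat = None, 0.0
    left_sum = 0.0
    for k in range(1, n):
        left_sum += segment[k - 1]
        gap = abs(left_sum / k - (total - left_sum) / (n - k))
        stat = math.sqrt(k * (n - k) / n) * gap
        if stat > best_stat:
            best_k, best_stat = k, stat
    # Report a change point only if the evidence exceeds the threshold.
    return best_k if best_stat > threshold else None


def detect_mcps(data: Sequence[float], sizes: Sequence[int],
                detector: Callable[[Sequence[float]], Optional[int]],
                seed: int = 0) -> List[int]:
    """Assemble the MCPs vector from at most one CP per random window."""
    rng = random.Random(seed)
    cps = []
    for start, end in random_slide_windows(len(data), sizes, rng):
        local = detector(data[start:end])
        if local is not None:
            cps.append(start + local)  # convert to a global index
    return cps


# Toy series with one level shift at index 100.
series = [0.0] * 100 + [5.0] * 100
print(detect_mcps(series, sizes=[64, 128], detector=mean_shift_cp))
```

The stand-in detector scans every split point in a window, so it costs time linear in the window length; the TST-based search is described as reaching a candidate change point along a single root-to-leaf path instead. Passing the detector as an argument keeps the windowing step independent of the detection method.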
In our synthetic simulations, RSW&TST was evaluated against the existing FSW and WBS&CUSUM, as well as the BST, KS and T methods, in terms of computing time and the hit, missed, error and redundancy rates. The experimental results show that our RSW&TST performs better, with a higher hit rate, shorter computing time, and lower missed, error and redundancy rates than the BST, KS and T methods in the RSW or FSW frameworks and than the WBS&CUSUM method. Furthermore, the proposed RSW&TST was applied for MCPs detection on pathological recordings of patients with partial epilepsy and with nocturnal frontal lobe epilepsy (NFLE). The experimental results of MCPs detection on different ECG signals also indicate that our RSW&TST performs better than the existing FSW, WBS&CUSUM, and the BST, KS and T methods. In particular, for each of the pathological signals, the abnormal parts are distinguished by the resultant MCPs, and the abnormal patterns are roughly recognized from the numbers and positions of the resultant MCPs. Thus, the severity of a patient's epileptic state can be roughly analyzed based on the strength and duration of data fluctuations during sudden epileptic attacks.
Our RSW&TST framework, although preliminary and simple, provides a novel method for fast and efficient MCPs detection, as well as a very flexible platform for abnormal pattern recognition from large-scale pathological signals.