Determining the Stationarity Distance via a Reversible Stochastic Process

The problem of controlling stationarity involves an important aspect of forecasting, in which a time series is analyzed in terms of levels or differences. In the literature, non-parametric stationarity tests, such as the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test, have been shown to be very important; however, they are affected by problems with the reliability of lag and sample size selection. To date, no theoretical criterion has been proposed for lag-length selection in tests of the null hypothesis of stationarity, and their use should be avoided, even for the purpose of so-called 'confirmation'. The aim of this study is to introduce a new method that measures the distance of each numerical series from its own time-reversed series. This distance is based on a novel stationary ergodic process, in which the stationary series has reversible symmetric features, and is calculated using the Dynamic Time-warping (DTW) algorithm in a self-correlation procedure. Furthermore, to establish a stronger statistical foundation for this method, the F-test is used as a statistical control, and a suggestion for future statistical research on resolving the problem of limited sample size is introduced. Finally, as described in the theoretical and experimental documentation, this distance indicates the degree of non-stationarity of the time series.


Introduction
A time series is a sequence of data points collected over time, and time-series analysis is a very important issue in economic and engineering forecasting. In the engineering literature, state-space methods have been developed for sequential data analysis, and several researchers have attempted to bridge the gap between engineering methods and statistics [1,2]. Recently, two advanced research areas have been combined using this approach: the methods of nonlinear time-series analysis and the theory of complex networks. Multivariate time-series analysis is used to model and explain the interactions and co-movements among a group of time-series variables. Specifically, Gao et al. [3][4][5][6] recently developed several important multivariate time-series analysis methods, especially complex network-based methods, to uncover the complicated flow mechanisms underlying industrial multiphase flow systems. Time-frequency (TF) analysis is considered to be one of the most significant active fields with respect to this issue. Because no theoretical criterion has been proposed for lag-length selection in tests in which the null hypothesis is stationarity, their use should be avoided, even for so-called 'confirmation' [12].
In conclusion, non-parametric stationarity tests, such as the KPSS test, are very important but suffer from reliability problems in lag and sample size selection that limit their applicability.
The aim of this study is to introduce a new method that measures the distance in terms of the measurement error between mirror time series. This error corresponds to the theoretical distance that must be investigated to determine whether a time series is stationary. Using this approach eliminates the aforementioned problems with lag and sample size selection. Furthermore, the distance calculated via this method could be used as a specific property in a multiscale complex network.
This distance is based on a stationary ergodic process, which relies on the following three considerations: 1. Any reversible process is stationary, and any time reversal of a reversible process is also stationary [13][14][15].
2. If $\{X_n, n \in \mathbb{R}\}$ is stationary, then the time-reversed process $\{\tilde{X}_n, n \in \mathbb{R}\}$ is also stationary.
3. The possible metric deviation (distance) between the unpredictable series $Y_n$ and $\tilde{Y}_n$ can be used as a measure of the degree of irreversible progress; this is implemented using the Dynamic Time-warping (DTW) technique. DTW is adopted because it is considered the better of the two principal time-series dissimilarity measures [16], its metric being superior to the classic Euclidean distance metric. This metric, which corresponds to the degree of irreversibility, is called the "Stationarity Distance"; it is a measure of the non-stationary characteristics of a time series and is used in the process described above.
To corroborate these assertions, a series of statistical procedures is applied using the proposed method to analyze a specific data set. Additionally, a KPSS test (with a null hypothesis of stationarity) is also applied to the same data [17], and the reliability of the proposed method is evaluated on suitable (with stationary and non-stationary properties) time-series data. For further verification, the selected (non-)stationary data segments are visually inspected by plotting. More details on this procedure are presented in the Experimental Procedure section.
Finally, this study can be broadly divided into four sections. In the first section, Methodology, the definitions of the reversible stochastic process (RSP) and KPSS methods are given. In the second section, Experimental Procedure, the experimental methods and the results obtained with both methods are presented. In the third section, the statistical evaluation of the RSP method is analyzed. Finally, in the fourth section, the conclusions and plans for future research on the RSP method are described.

The RSP Method
This method is based on the following formulation: a discrete-time stationary process $\{X_i, i = 1, \dots, N\}$ is time reversible if, for every positive integer $N$ [14],

$(X_1, X_2, \dots, X_N) \stackrel{d}{=} (X_N, X_{N-1}, \dots, X_1) \qquad (1)$

and a discrete time series $\vec{X}_N = \{x_i; i = 0, 1, \dots, N\}$ produces a corresponding mirror time series

$\vec{\tilde{X}}_N = \{x_{N-i}; i = 0, 1, \dots, N\} \qquad (2)$

Thus, given Eq 2, the proposed algorithm is based on the following: if the error is zero (error = 0), then the time series $\vec{X}_N$ consists of a stationary process, as defined in Eq 1, where the estimated error is based on the dissimilarity measure between the discrete time series $\vec{X}_N$ and the reversed time series $\vec{\tilde{X}}_N$. The physical meaning of this distance is the process's degree of stationarity.
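To make the mirror construction of Eq 2 concrete, the following minimal Python sketch (our illustration; the function name is arbitrary, not from the paper) builds the time-reversed series and shows that a series symmetric about its midpoint coincides with its own mirror:

```python
import numpy as np

def mirror_series(x: np.ndarray) -> np.ndarray:
    """Return the mirror (time-reversed) series {x_{N-i}} of Eq 2."""
    return x[::-1]

x = np.array([1.0, 2.0, 3.0, 2.0, 1.0])      # a toy series, symmetric in time
print(np.array_equal(x, mirror_series(x)))   # True: zero deviation from its mirror
```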

Error calculation using DTW
Numerous time-series dissimilarity measures have been proposed [18]. However, the most common measures, which were proposed long ago, remain the most competitive ones. The most-used metric distance function is the Euclidean distance, which obeys the metric properties: triangle inequality, non-negativity, identity, and symmetry. This function remains amazingly competitive [19] with other, more complicated metrics, particularly for large data sets [20]. However, the Euclidean distance and its variants have several drawbacks that make their use inappropriate in certain applications, such as sensitivity to noise, shifting, and scaling [19,21]. DTW lends robustness to the similarity computation. Using this method, time series of different lengths can be compared because DTW replaces the one-to-one point comparison used in the Euclidean distance with a many-to-one (and vice versa) comparison [19]. Thus, one of the main conclusions of comparison studies [22,23] is that, despite newly proposed methods, the Euclidean and DTW [20,24] dissimilarity measures remain two of the most robust, simple, generic, and capable measures. Additionally, to measure shape similarity in the sign data set, alignment-based distances (such as DTW) are more suitable; in contrast, the Euclidean distance is not robust to distortions in time and other noise [25]. Because the DTW algorithm is more efficient than the Euclidean distance with respect to noise sensitivity [20,24,25], DTW is selected for use in the error calculation based on Eq 3 to obtain more reliable measurements than would be possible using other, similar techniques.
Assume that a local dissimilarity function $f$ is defined between any pair of elements $x_i$ and $y_j$ under the constraint

$f(x_i, y_j) = d(i, j) \geq 0 \qquad (3)$

Then, if a given path is the lowest-cost path between two series, the corresponding technique (DTW) [26] lays the warping curve $\varphi(k)$, $\forall k = 1, 2, \dots, T$:

$\varphi(k) = (\varphi_x(k), \varphi_y(k)) \qquad (4)$

The warping functions $\varphi_x$ and $\varphi_y$ remap the time indices of $X$ and $Y$, respectively. Given $\varphi$, the average accumulated distortion between the warped time series $X$ and $Y$ can be calculated [26] as follows:

$d_\varphi(X, Y) = \sum_{k=1}^{T} \frac{d(\varphi_x(k), \varphi_y(k)) \, m_\varphi(k)}{M_\varphi} \qquad (5)$

where $m_\varphi(k)$ is a per-step weighting coefficient, and $M_\varphi$ is the corresponding normalization constant, which ensures that the accumulated distortions are comparable along different paths. To ensure reasonable warps, constraints are usually imposed on $\varphi$. The idea underlying DTW is to find the optimal alignment $\varphi$ such that

$D(X, Y) = \min_{\varphi} d_\varphi(X, Y) \qquad (6)$

Therefore, one chooses the distortion of the time axes of $X$ and $Y$ that brings the two time series as near to each other as possible.
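The optimal alignment of Eq (6) is conventionally computed by dynamic programming. The sketch below is a standard textbook implementation with unit step weights and an absolute-difference local cost (our illustration, not the paper's Matlab code); it exhibits the many-to-one matching that distinguishes DTW from the Euclidean distance:

```python
import numpy as np

def dtw_distance(x: np.ndarray, y: np.ndarray) -> float:
    """Unnormalized DTW cost between two 1-D series."""
    n, m = len(x), len(y)
    D = np.full((n + 1, m + 1), np.inf)   # accumulated-cost matrix
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])        # local dissimilarity f(x_i, y_j)
            D[i, j] = cost + min(D[i - 1, j],      # stretch y (many-to-one)
                                 D[i, j - 1],      # stretch x (one-to-many)
                                 D[i - 1, j - 1])  # one-to-one match
    return float(D[n, m])
```

Because the recursion allows repeated matches, series of different lengths and locally shifted features can still be aligned; normalizing by a path constant $M_\varphi$, as in Eq (5), makes costs comparable across different paths.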

Procedure formulation
The measure of the degree of non-stationarity is calculated according to the scaling of $\min_{\varphi}\{d_\varphi(\vec{X}_N, \vec{Y}_N)\} > 0$, keeping in mind that a time series $\{X_N\}$ is stationary when

$\min_{\varphi}\{d_\varphi(\vec{X}_N, \vec{Y}_N)\} = 0$

where $\vec{Y}_N$ is the mirror series of $\vec{X}_N$ (Eq 2). Then, according to these definitions, the implementation of this method is organized as depicted in Fig 1.
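Combining the two sketches above gives a hedged reading of the whole RSP procedure (reverse the series, then score it against its mirror with DTW); the example series below are ours, not the paper's:

```python
import numpy as np

def rsp_distance(x: np.ndarray) -> float:
    """Stationarity distance: DTW cost between a series and its mirror (Eq 2)."""
    return dtw_distance(x, x[::-1])   # dtw_distance as sketched above

rng = np.random.default_rng(0)
noise = rng.standard_normal(200)      # an (approximately) stationary series
trend = np.linspace(0.0, 10.0, 200)   # a clearly non-stationary series
print(rsp_distance(noise), rsp_distance(trend))   # the trend should score much higher
```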

Comparison with the adaptive KPSS test
According to the KPSS model [8,9], the null hypothesis of trend stationarity corresponds to the hypothesis that the variance of the random walk ($r_t$) equals zero, which is expressed by the following equations:

$y_t = \xi_t + r_t + \varepsilon_t \qquad (9)$

where

$r_t = r_{t-1} + u_t \qquad (10)$

and $t = 1, 2, 3, \dots, T$ indexes the observations. Thus, if we suppose that the null hypothesis is that $\xi_t = \xi_0$, i.e., constant, together with $H_0: \sigma_u^2 = 0$, this hypothesis defines the stationary process against the alternative hypothesis $H_A: \sigma_u^2 > 0$, which defines the non-stationary process. Assuming that $e_t$ are the estimated errors, which are computed as the residuals of the regression of $\{y_t\}$ and are given by $e_t = y_t - \bar{y}$, the value of the KPSS statistic is calculated using Eq (11):

$\mathrm{KPSS} = \frac{1}{T^2} \sum_{t=1}^{T} \frac{S_t^2}{s^2(p)} \qquad (11)$

where $p$ is the order (lag) of an autoregressive AR($p$) model; in other words, the partial autocorrelation at lag $p$ is equal to the estimated AR coefficient in an autoregressive model with $p$ coefficients [8]. The parameters of Eq 11 are defined as follows: $S_t = \sum_{i=1}^{t} e_i$ is the partial sum of the residuals, $\sigma^2$ is the long-run variance of $e_t$, and

$s^2(p) = \frac{1}{T} \sum_{t=1}^{T} e_t^2 + \frac{2}{T} \sum_{s=1}^{p} w(s, p) \sum_{t=s+1}^{T} e_t e_{t-s}$

is the consistent estimator of $\sigma^2$, which can be constructed from the residuals $e_t$, where $p$ is the truncation lag and $w(s, p)$ is the optional weighting function that corresponds to the specific choice of window [10].
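For concreteness, Eq (11) with the Bartlett window $w(s, p) = 1 - s/(p+1)$ can be computed as follows (our sketch of the level-stationarity KPSS statistic, not the authors' code):

```python
import numpy as np

def kpss_statistic(y: np.ndarray, p: int) -> float:
    """KPSS level-stationarity statistic of Eq (11) with truncation lag p."""
    T = len(y)
    e = y - y.mean()       # residuals e_t = y_t - ybar
    S = np.cumsum(e)       # partial sums S_t
    s2 = np.sum(e**2) / T  # long-run variance s^2(p) with Bartlett weights
    for s in range(1, p + 1):
        w = 1.0 - s / (p + 1.0)
        s2 += 2.0 * w * np.sum(e[s:] * e[:-s]) / T
    return float(np.sum(S**2) / (T**2 * s2))
```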

Experimental Results
For the input data, a long data set $y_t$ of sample size T = 1560 was selected from the Time Series Data Library [17,27]. This data set contains internet traffic data (in bits) from an internet service provider (ISP), aggregated over the backbone of the United Kingdom academic network. It was collected as hourly data between 19 November 2004 (09:30 hours) and 27 January 2005 (11:11 hours) [17,27]. This interval was chosen because the time series contains segments with both stationary and non-stationary features (Fig 2). Additionally, for calibration, we tested both methods on a classic stationary-shaped function [28,29] called the "Mexican Hat", which was implemented as a numerical series to investigate the reliability and sensitivity of the tested methods with respect to the detection of a classic simulated stationary series (Fig 3 and Table 1). These simulated data were generated with T = [20, 40, 60, 100, 500] values to investigate the abilities of both methods to recognize this classic simulated stationary series as stationary (Table 1). Similarly, the methods were also tested on a classic non-stationary function [28,29], the "sinusoidal shape", using simulated data with T = [32, 63, 126, 629, 1527] values (Fig 4 and Table 2).
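Since the paper does not give the exact parameterization of the two calibration curves, the following generators are assumed reconstructions (window, amplitude, and cycle count are our choices):

```python
import numpy as np

def mexican_hat(T: int) -> np.ndarray:
    """Ricker ('Mexican Hat') wavelet sampled at T points over an assumed window."""
    t = np.linspace(-5.0, 5.0, T)
    return (1.0 - t**2) * np.exp(-t**2 / 2.0)

def sinusoid(T: int, cycles: int = 4) -> np.ndarray:
    """Sinusoidal-shaped series with an assumed number of cycles."""
    t = np.arange(T)
    return np.sin(2.0 * np.pi * cycles * t / T)
```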
Next, the processing of these data is implemented using the two methods: RSP and KPSS. The above-described data set [17,27] is segmented into data sets of various sizes ranging from 42 to 301 (Tables 3 and 4). This segmentation was performed by applying an empirical criterion: the selected weak-stationary and non-stationary segments were obtained by visually inspecting the data using the plots. Specifically, the weak-stationary segments were selected using visual criteria, such as a stable mean and variance. These segments have sinusoidal shapes with visually apparent random phases. This selection was adopted because such data represent ergodic and weak-stationary processes [30].
Furthermore, this segmented time series was verified using the KPSS criterion, and the null hypothesis results of the first two lags (Tables 3 and 4) were investigated. Finally, the statistical results and the series diagrams are depicted in Table 1.

KPSS processing
The KPSS processing was implemented via the kpsstest.m function in Matlab. Via this procedure, we investigate the possible stationary positions of the candidate segment for two specific tests (0 and 1 autocovariance lags in the Newey-West estimator of the long-run variance), each conducted at a significance level of α = 0.01. Then, we calculate the KPSS value according to Eq (11) and the exact probability value (p-value). The p-value of a statistical hypothesis test indicates the probability of obtaining a value of the test statistic as extreme as or more extreme than that observed by chance alone. A low p-value from a KPSS test indicates evidence against the hypothesis of stationarity. In this case, the null hypothesis $H_0$ is accepted (h = 0) when the p-value, which is calculated using the kpsstest.m function of Matlab, is greater than 0.01, and rejected (h = 1) otherwise. For each candidate segment, two p-values are calculated, one for each lag (0 and 1) (Tables 3 and 4), whereas for the Mexican Hat, lags 0-9 are used (Table 1) [9].
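An approximately equivalent test is available outside Matlab; the sketch below uses the statsmodels library (whose kpss reports p-values only within [0.01, 0.1] and differs from kpsstest.m in its defaults), so results may not match the paper's exactly:

```python
from statsmodels.tsa.stattools import kpss

def kpss_decision(segment, nlags: int, alpha: float = 0.01):
    """Return (statistic, p-value, reject) for the stationarity null at level alpha."""
    stat, p_value, _, _ = kpss(segment, regression='c', nlags=nlags)
    return stat, p_value, p_value <= alpha   # reject -> evidence of non-stationarity
```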

RSP processing
The RSP processing is implemented in 3 steps according to the procedure depicted in Fig 1. Specifically, in the first step, the selected time series is processed using a reversible procedure, according to the first and second definitions (Eqs (1) and (2)). For example, for a given time series (series 14 and Fig 5), using the reversible procedure, the reverse time series can be produced (series 15 and Fig 6).
In the second step, the DTW distance between the two time series is calculated using Eq (6), yielding DTW = 0.3661; this is implemented via the dtw.m function of Matlab (Fig 7). Finally, in the third step, the result of the characterization of the investigated time series $X_n$ is extracted according to the evaluation procedure; this result is analyzed in Section 4.

Results
As seen in Table 1, the KPSS decision hypothesis with h = 0, i.e., the acceptance decision for stationarity, depends on the sample size (Table 5) because the minimum-order lag increases dramatically, in a proportional relation. In contrast, the RSP decision does not depend on the sample size, consistently yielding a zero RSP distance. Furthermore, in Tables 2 and 6, the decision hypothesis with h = 1, i.e., the rejection decision for stationarity, is presented for all five cases from the first lag. However, the rejection is very strong for segments with sizes exceeding 100. In contrast, in the RSP method, the calculated distances are much greater than one, indicating clear differentiation between the two investigated series types (Mexican Hat and sinusoid). Furthermore, in the sinusoid case (Table 6), the RSP distance is half of the corresponding sample size.
Furthermore, the results shown in Tables 3 and 4 indicate that, in the non-stationary decision, KPSS is very sensitive to the calculation of the orders of the lags; a characteristic case is that of Table 3, referring to segment (200:300) of the time series in the ISP database. In contrast, the RSP distance is $<10^{-2}$ for stationary processes, whereas for non-stationary processes it is $>10^{-1}$. The statistical verification of this remark is provided in Section 4.

The most obvious method of estimating the similarity or dissimilarity between time series is to calculate a metric distance directly; in this case, the DTW between the original series is selected. For a small data set, this approach may be feasible. However, for large data sets, it is problematic because the time complexity is O(n × N), where n is the number of features that must be obtained for each time series, and N is the number of time series in the data set. To calculate the similarity and index efficiently, many techniques for dimensionality reduction, such as the discrete Fourier transform (DFT), the piecewise aggregate approximation (PAA), and the discrete wavelet transform (DWT), have been proposed. These techniques allow a time series X of arbitrary length n to be represented by a time series of length w, where w < n [18].
The PAA is a very simple dimensionality-reduction method for time-series mining. For the time series $X_N$ (Eq (1)), a new reduced time series $X_M$ of length $M < N$ is obtained, whose elements are calculated as the means of consecutive segments:

$\bar{x}_i = \frac{M}{N} \sum_{j = \frac{N}{M}(i-1)+1}^{\frac{N}{M} i} x_j, \qquad i = 1, \dots, M$

In this case, the proposed method is tested using the PAA method to examine its efficiency with respect to dimensionality reduction. Eight examples from Tables 3 and 4 are tested by reducing their dimensionality by ~50%, and the results are presented in Table 7. Furthermore, the results presented in Tables 1 and 2 demonstrate the efficiency of this dimensionality reduction because the stationary (Table 1) and non-stationary (Table 2) cases were tested using the same curves (Mexican Hat and sinusoidal) with different sizes (20-1257) and yielded stable decisions. In Table 1 (stationary case), the reduction of the dimensionality from 500 to 20 yielded a value of zero in every case, unlike the KPSS method, which showed sensitivity to this variation. The same stable results were obtained for the results shown in Table 2 (non-stationary case), with dimensionality reduction from 1257 to 32. In this case, the obtained distances are >15, and the KPSS method suffered from the same decision problems as mentioned for Table 1.
As shown by the results in Table 7, the error between the distances computed from the low-dimensional data and those computed from the original time series is very small (<0.005), and, in every case, the decisions on non-stationarity or weak stationarity remain the same as those indicated in Tables 3 and 4.
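A minimal PAA sketch consistent with the equation above is given below (our implementation; the paper does not state how segment boundaries were handled when N is not divisible by M, so trailing points are dropped here):

```python
import numpy as np

def paa(x: np.ndarray, M: int) -> np.ndarray:
    """Reduce a series of length N to M contiguous segment means."""
    N = len(x)
    return x[: N - N % M].reshape(M, -1).mean(axis=1)   # drop the ragged tail

reduced = paa(np.arange(100.0), 50)   # ~50% dimensionality reduction, as in Table 7
```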

Statistical processing
In this section, the populations of the variances, means, and medians of the calculated distances (Tables 3 and 4), which were extracted separately for the two classes (stationary and non-stationary), are investigated (Table 7). This investigation focuses on the following two queries: 1. Is the diversity between the two aforementioned populations statistically significant? 2. What sample size is required for the calculated diversity to achieve statistical accuracy?

Answering these queries could enable accurate classification of the time series, i.e., as stationary or non-stationary. To this end, the two-sample F-test for equal variances (homogeneity) is adopted because the hypothesis of the homogeneity of variance is known to be significant in the classification stage of discrimination inquiries. Implementing this process returns a test decision for the null hypothesis $H_0$ according to the data in vectors x and y, which states that x and y are drawn from normal distributions with the same variance [32]. The alternative hypothesis $H_A$ is that they come from normal distributions with different variances. The result h will be 1 if the test rejects the null hypothesis at the 5% significance level and 0 otherwise. Thus, by using this test, the possible diversity of variances for the aforementioned populations is obtained. The F-test statistic is calculated according to Eq (13):

$F = \frac{s_x^2}{s_y^2} \qquad (13)$

where $s_x^2$ and $s_y^2$ are the sample variances. Under the null hypothesis, the test statistic F has an F-distribution with $n_1 - 1$ degrees of freedom in the numerator and $n_2 - 1$ degrees of freedom in the denominator, where $n_1$ and $n_2$ are the sample sizes of the two data sets (x and y). Using the data in Table 4, the coefficient calculated via Eq (13) is F = 6.4240e-04 with p = 4.4234e-26; the confidence intervals for the x and y vectors are 0.0003 and 0.0016, respectively, and DF = 19. Thus, $H_0$ is strongly rejected because $F < F_{1-\alpha; n_1-1, n_2-1} = 2.53$ according to a two-tailed test.
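The description of the decision h matches Matlab's two-sample variance test; the function below is our illustration of the same computation, not the paper's code:

```python
import numpy as np
from scipy import stats

def f_test_equal_variances(x, y, alpha: float = 0.05):
    """Two-tailed F-test of H0: equal variances; returns (F, p, reject)."""
    F = np.var(x, ddof=1) / np.var(y, ddof=1)
    df1, df2 = len(x) - 1, len(y) - 1
    p = 2.0 * min(stats.f.cdf(F, df1, df2), stats.f.sf(F, df1, df2))
    return F, p, p < alpha   # reject -> non-homogeneous variances
```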
Answering the second query could solve crucial issues regarding the collection of difficult-to-collect sampling data, such as medical signals. One important aspect of designing an experiment is determining how many observations are needed to draw conclusions with sufficient accuracy and confidence. The required sample size depends on many factors, including the type of experiment being contemplated, how it will be conducted, the available resources, and the desired sensitivity and confidence. Thus, the required sample sizes for the aforementioned two populations (stationary and non-stationary) are identified via an iterative procedure, resulting in a series of successively improving estimates of the required n [32]. To this end, the following equation is used:

$n = \frac{2 s_p^2 \, t_{\alpha(2),\nu}^2}{d^2} \qquad (14)$

where

$s_p^2 = \frac{SS_1 + SS_2}{n_1 + n_2 - 2} \qquad (12)$

is the pooled variance and SS is the sum of the squares of the deviations from the mean, defined as

$SS = \sum_{i=1}^{n} (y_i - \bar{y})^2$

Then, if we set d equal to half the width of the confidence interval [8], the interval limits for the difference between the two population means can be estimated, with $\nu = 2(n-1)$ degrees of freedom. Suppose the difference between the population means is $\mu_1 - \mu_2$. For 95% confidence, the interval must be no wider than $d = \frac{|0.0028 - 0.0730|}{2} = 0.0351$. Therefore, $SS_1 = 6.1009 \times 10^{-5}$ and $SS_2 = 0.0950$, and the pooled variance is determined using Eq (12): $s_p^2 \approx 0.0025$.

Then, let us suppose that a sample size of 50 is necessary, with 2(50−1) = 98 degrees of freedom; thus, $t_{0.05(2),98} = 1.984$. According to Eq (14), the limiting sample size n can be calculated using an iterative procedure [16]. First, we calculate

$n = \frac{2 \times 0.0025 \times 1.984^2}{0.0351^2} = 15.9983$

Next, the estimate is refined using n = 16, for which $t_{0.05(2),30} = 2.042$, and the iteration is repeated until, according to the proposed convergence procedure [8], the change between successive estimates is less than 0.1. Therefore, a sample of size at least 17 (i.e., more than 16) should be obtained from each of the two populations to achieve the specified confidence interval (see Table 8).
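Finally, the iterative sample-size procedure can be sketched as follows (our reading of Eq (14); the constants are those quoted in the text):

```python
from scipy import stats

def required_n(ss1, ss2, n1, n2, d, alpha=0.05, n_start=50, tol=0.1):
    """Iterate Eq (14) until successive estimates of n change by less than tol."""
    sp2 = (ss1 + ss2) / (n1 + n2 - 2)   # pooled variance s_p^2 (Eq 12)
    n = float(n_start)
    while True:
        t = stats.t.ppf(1.0 - alpha / 2.0, 2 * (round(n) - 1))  # t_{alpha(2), 2(n-1)}
        n_new = 2.0 * sp2 * t**2 / d**2
        if abs(n_new - n) < tol:
            return n_new
        n = n_new

print(required_n(6.1009e-05, 0.0950, 20, 20, 0.0351))   # ~16.9 -> at least 17 per group
```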

Conclusion
This paper investigates the measurement of the stationarity distance of a stochastic series. The proposed method is based on the assumption that each series deviates from a stationary state and that the series would itself be stationary if it satisfied a particular condition. This condition is expressed by a reversibility property, which a stationary series possesses. The estimated deviation from this condition is called the stationarity distance, or the measurement error between mirror time series. This distance is based on a novel stationary ergodic process, in which the stationary series has reversible symmetric features, and is calculated using the DTW algorithm via a self-correlation procedure.
Additionally, for verification, this method was compared with the KPSS test. The results of this comparison indicate that the proposed method solves several problems associated with the KPSS test, including those relating to the sample size and the selection of the lag order on which the null hypothesis is based.
Furthermore, to obtain additional statistical evidence supporting the utility of this method, the F-test was used as a statistical control; resolving the problem of limited sample size is a topic for future research. The testing results for both methods on the simulated stationary (Mexican Hat) and non-stationary (sinusoidal) series revealed the superiority of the RSP method, particularly for large sample sizes (>100) (Tables 1, 2, 5 and 6). Additionally, the test performed using real data verified the weaknesses of KPSS, particularly in certain cases (e.g., cases 4 and 17; Table 4) in which the null hypothesis is accepted for a visually verified non-stationary time series. Furthermore, in both tests, the RSP showed good agreement with the expected results. Additionally, the results of the F-test showed that the RSP-distance testing groups (weak-stationary in Table 3 and non-stationary in Table 4) have non-homogeneous properties, i.e., the distances of each group are drawn from normal distributions with different variances. This finding indicates that the selected distance populations can be differentiated from each other at the 95% confidence level. Furthermore, the selected sample size of each group (20) was found to be sufficient for this statistical analysis because it is greater than 17, the value calculated in Section 4.

Future Research
This method could be applied to additional time-series difference data to investigate the variation in the stationarity distance over time. This research could be valuable for making predictions using diagnostic medical data. For example, electroencephalography (EEG) is a noninvasive and accessible method that is widely used to measure brain function and make inferences about regional brain activity. The stationarity of EEG has been studied by many researchers, but the stationarity of EEG segments with event-related potentials (ERPs) remains a concern in many abnormal cases. Thus, the proposed method could provide a practical solution for measuring the stationarity of EEG segments quickly and accurately.
Furthermore, this method could be used in directed weighted complex networks. For example, in a typical study [33], the weighted complex network is constructed using a phase-space distance calculation. This model could be modified to use the RSP distance instead of the phase-space distance to spatially depict the stationarity properties of the investigated time series, which could have very important applications, such as in EEG and seismography.