Structural change detection in ordinal time series

Change-point detection in health care data has recently received considerable attention owing to the increased availability of complex data in real time. In many applications, the observed data form an ordinal time series. Two test statistics are proposed to detect a structural change in the cumulative logistic regression model, which is often used to analyze ordinal time series. The first is the standardized efficient score vector; the second is a quadratic form of the efficient score vector with a weight function. Under the null hypothesis, we derive the asymptotic distributions of the two test statistics, and we prove their consistency under the alternative hypothesis. We also study the consistency of the change-point estimator and suggest a binary segmentation procedure for estimating the locations of possible multiple change-points. Simulation results show that the former statistic performs better when the change-point occurs near the centre of the data, whereas the latter is preferable when the change-point occurs near the beginning or end of the data. Furthermore, the former statistic can identify which parameter is responsible for rejecting the null hypothesis. Finally, we apply the two test statistics to a set of sleep data; the results indicate a structural change in the data.


Introduction
In categorical data analysis, ordinal categorical variables are frequently encountered, for example health status (very good, good, so-so, bad, very bad) and blood pressure (low, normal, high). Data of this kind observed hourly or daily constitute an ordinal time series. The cumulative logistic regression model is often used to analyze ordinal time series [1]. Sometimes the model changes at unknown time points (change-points) while remaining stable between them. Structural stability is of prime importance in statistical modeling and inference: if the parameters change within the observed sample, inferences can be severely biased and forecasts lose accuracy. It is therefore necessary to test for structural change. Structural change detection has been a popular research subject in statistics; see Csörgö and Horváth [2], Bai and Perron [3], Lee et al. [4], Perron [5], Gombay [6], Wang et al. [7], Chen et al. [8], Baranowski et al. [9], Wang et al. [10], Chen [11] and Liu et al. [12] for reviews of the field.
Structural change detection in categorical data has been considered as well. Höhle [13] proposed a prospective CUSUM change-point detection procedure to detect a structural change in categorical time series; Wang et al. [10] described a procedure based on a high-dimensional homogeneity test to detect and estimate multiple change-points in multinomial data; Plasse and Adams [14] presented a multiple change-point detection method for categorical data streams that adaptively monitors the category probabilities. As generalized linear regression models for categorical time series allow for parsimonious modeling and the incorporation of random time-dependent covariates, Fokianos and Kedem [15] suggested the generalized linear model for categorical time series modeling. For change-point detection in the generalized linear model, Xia et al. [16] introduced two procedures to sequentially detect a structural change in generalized linear models under an independence assumption; Hudecová [17] investigated the detection of changes in autoregressive models for binary time series; Fokianos et al. [18] provided a statistical procedure based on the partial likelihood score process to detect a structural change in the binary logistic regression model; Gombay et al. [19] and Li et al. [20] discussed retrospective and sequential change detection in the multinomial logistic regression model.
Score tests for detecting changes in time series models have been studied by Gombay and Serban [21] and Gombay et al. [22]. The score test statistic is usually computationally less demanding than the likelihood ratio test statistic. In this paper, we first propose a test statistic based on the efficient score vector to detect a structural change in the cumulative logistic regression model, which extends the change-point detection of Gombay et al. [19]. Simulation shows that the empirical power of this statistic is low when the change-point occurs near the beginning or end of the data. To address this, we propose a second statistic, a quadratic form of the efficient score vector with a weight function. Under the null hypothesis of no change, we derive the asymptotic distributions of the two statistics, and we prove their consistency under the alternative hypothesis. We also study the consistency of the change-point estimator and suggest a binary segmentation procedure for estimating the locations of possible multiple change-points. Simulation results show that the empirical size of the two statistics is close to the significance level 0.05, and that the empirical power approaches 1 when the sample size is large. The empirical power of the former statistic is higher when the change-point is located near the centre of the data, whereas the latter performs better when the change-point is located near the beginning or end of the data. Furthermore, the former statistic can identify which parameter is responsible for rejecting the null hypothesis. Finally, we apply the two statistics to a set of sleep data and find a structural change in the data.
Let $\theta = (\alpha_1, \ldots, \alpha_q, \beta')'$ be a $p$-dimensional parameter vector, $p = q + d$. In this paper we wish to test whether there exists a structural change in the parameter $\theta$, that is,

$$H_0: \theta = \theta_0, \quad t = 1, \ldots, n, \qquad \text{versus} \qquad H_A: \theta = \theta_0, \quad t = 1, \ldots, k^*; \quad \theta = \theta_0^*, \quad t = k^* + 1, \ldots, n,$$

where $\theta_0$ is the true value of $\theta$, $k^*$ denotes the change-point, which affects some of the components of $\theta$, $\theta_0 \neq \theta_0^*$, and $\theta_0$, $\theta_0^*$ and $k^*$ are unknown. Next, we estimate the parameter vector $\theta$ by the partial likelihood method (Fokianos et al. [18]). The partial likelihood function and the partial log-likelihood function are defined in Gombay et al. [19]. Denote the partial score vector by $S_n(\theta)$. To obtain the existence, consistency and asymptotic normality of the maximum partial likelihood estimator, we make a few assumptions on the covariate matrices $\{Z_t\}$ and the parameter vector $\theta$.

Assumption 3 The covariate matrix Z t−1 lies almost surely in a non-random compact subset
Assumptions 1 and 2 ensure that the second derivative of $l(\theta)$ is continuous; $\det(\partial h(\eta_t)/\partial \eta_t) \neq 0$ implies that $U_t(\theta)$ is not singular (Fokianos and Kedem [24]). From Assumption 3, [...] is positive definite with probability one [24]. Since the likelihood estimation employs an assumption regarding ergodicity of the joint process $(Y_t^T, z_t^T)^T$ (Fokianos and Truquet [25]), let $\{Y_t\}$ be a time series taking values in a finite set $E$ with cardinality $m$, and such that [...], where $q$ is a transition kernel. We assume that the mappings $(\omega, y, x) \mapsto q(\omega \mid y, x^{-}_{t-1})$ are measurable, as mappings from [...]. For $y, y' \in E^m$ and a positive integer $s$, we write $y =$ [...] ([26]).
Assumption 4 The $d$-dimensional covariate vector $\{z_{t-1}\}$ is stationary and ergodic.
Assumptions 4 and 5 guarantee that $(Y_t^T, z_t^T)^T$ is stationary and ergodic [26]. Assumptions 1-5 are required to obtain consistency and asymptotic normality of the maximum likelihood estimator. However, existence of moments for the covariate process is still required to study large sample properties of the maximum likelihood estimator [25]. So we have

The proposed testing procedure
Based on the partial likelihood score process, a test statistic is defined by [...], where $\hat{T}_n = \frac{1}{n} S_n(\hat{\theta}_n) S_n'(\hat{\theta}_n)$ and $\hat{\theta}_n$ is the maximum partial likelihood estimator of $\theta$, which can be obtained by maximizing the partial log-likelihood function (1) (see Fokianos and Kedem [23]).
Under the null hypothesis of no change, we derive the asymptotic distribution of the proposed test statistic.
Theorem 1 If Assumptions 1-6 and $H_0$ hold, then we have [...]
Proof. Let $S_k^{(i)}$ denote the $i$-th element of $S_k$, $i = 1, 2, \ldots, p$, and let $\theta_0$ be the true value of $\theta$; then we have [...]. Next, similarly to the proof of Proposition 3 in Gombay et al. [19], we can prove that [...]. By Theorem 4.1 of Fokianos and Kedem [23], we get [...]. The error terms $E_{kn}^{(i)}$ involve higher-order products of $(\hat{\theta}_n^{(i)} - \theta_0^{(i)})$, and it can be shown that $E_{kn}^{(i)} = o_P(1)$. According to Proposition 1 (Gombay et al. [19]) and Slutsky's theorem, we get [...] $B(t)$ as $n \to \infty$.
Remark 1 When using the above test, if there exists some $i$ such that [...], the null hypothesis is rejected and a change-point occurs, where $\alpha^* = 1 - (1 - \alpha)^{1/p}$. Let $B(u)$ be a one-dimensional Brownian bridge; Csörgö and Révész [27] suggested that $C(\alpha^*)$ can be obtained from

$$P\left(\sup_{0 \le u \le 1} |B(u)| \ge x\right) = 2 \sum_{k=1}^{\infty} (-1)^{k+1} \exp(-2k^2x^2).$$

Simulation shows that $W_1$ performs poorly at the boundaries. In particular, the limiting Brownian bridge is tied down at $t = 0$ and $t = 1$ (that is, $B(0) = B(1) = 0$), which hampers the ability of the test to detect a structural change occurring near the beginning or end of the data. Many authors address this problem by adding a weight function [28]. Therefore, we construct a new test statistic, which is a quadratic form of the efficient score vector with a weight function, [...] where [...]
Theorem 2 If Assumptions 1-6 and $H_0$ hold, then we have [...]
The conclusion of Theorem 2 can be deduced directly from Theorem 1. To obtain the critical values of the asymptotic distribution, Csörgö and Horváth [2] used a result of Vostrikova [29] to show that [...]
Under the alternative hypothesis, there exists a structural change in the model; we now prove the consistency of the two statistics.
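As a numerical illustration of Remark 1, the critical value $C(\alpha^*)$ can be found by inverting the Brownian bridge tail series $2\sum_{k \ge 1} (-1)^{k+1} \exp(-2k^2x^2)$ at level $\alpha^* = 1 - (1 - \alpha)^{1/p}$. The following is a minimal sketch only; the function names, the truncation at 100 terms and the bisection bounds are our own choices and are not taken from the paper.

```python
import math


def brownian_bridge_sup_tail(x, terms=100):
    """Tail P(sup_{0<=u<=1} |B(u)| >= x) via the alternating series
    2 * sum_{k>=1} (-1)^(k+1) * exp(-2 k^2 x^2), truncated at `terms`."""
    return 2.0 * sum(
        (-1) ** (k + 1) * math.exp(-2.0 * k * k * x * x)
        for k in range(1, terms + 1)
    )


def critical_value(alpha=0.05, p=1, lo=0.2, hi=6.0, tol=1e-10):
    """Solve tail(C) = alpha_star by bisection, where
    alpha_star = 1 - (1 - alpha)^(1/p) is the per-component level."""
    alpha_star = 1.0 - (1.0 - alpha) ** (1.0 / p)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if brownian_bridge_sup_tail(mid) > alpha_star:
            lo = mid  # tail probability still too large: move right
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for p in (1, 2, 5):
        print(p, round(critical_value(0.05, p), 3))
```

For $p = 1$ and $\alpha = 0.05$ the routine returns approximately 1.358. Because the tail is decreasing in $x$, larger $p$ (smaller $\alpha^*$) gives a larger critical value.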

Theorem 3 Suppose Assumptions 1-6 and H A hold, if the coefficient changes from
(ii).
where $0 < l < h < 1$, $\|\cdot\|$ denotes the Euclidean norm of a vector, and $\xrightarrow{P}$ means convergence in probability.
Once the null hypothesis is rejected, indicating that a change-point may exist, we locate the change-point position by [...] The following theorem shows that the change-point estimator $\hat{k}^*$ is consistent for the true change-point $k^*$ as $n \to \infty$.
By the proof of Theorem 1, we have [...]. By Theorem 2 of Gombay [6], to prove (2) it is enough to show that [...], where $f_k = n^{-1/2}\left(S_k^{(i)}(\hat{\theta}_n) - \frac{k}{n} S_n^{(i)}(\hat{\theta}_n)\right)$. To prove (3), assume that there exists a constant $K$, $K$ [...]. By Theorem 1, choosing $\delta > 0$ arbitrarily, [...] if $K$ is large enough, so (3) is proven. The proof of (4) is the same by symmetry.
To detect multiple structural changes in the sequence, we can employ the binary segmentation method [30]. First, apply the single change-point test. If $H_0$ is rejected, find $\hat{k}^*(1)$ by (2). Next, divide the sample into the two subsamples $\{Y_t, 1 \le t \le \hat{k}^*(1)\}$ and $\{Y_t, \hat{k}^*(1) < t \le n\}$, and test both subsamples for further changes. Continue this segmentation procedure until no subsample contains a further change-point.
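The binary segmentation recursion described above can be sketched as follows. The sketch is generic: it accepts any single change-point test that returns a reject flag and an estimated position. For illustration only, a simple CUSUM test for a mean shift is plugged in; it is a stand-in and not the statistic $W_1$ or $W_2$ of this paper. The names `cusum_mean_test`, `binary_segmentation` and the minimum segment length `min_len` are hypothetical.

```python
import math


def cusum_mean_test(y, crit=1.358):
    """Stand-in single-change test (NOT the paper's W_1): CUSUM for a mean shift.
    Returns (reject, k_hat); k_hat is the 1-based last index of the first segment."""
    n = len(y)
    if n < 2:
        return False, None
    mean = sum(y) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in y) / n)
    if sd == 0.0:
        return False, None
    best, k_hat, partial = 0.0, None, 0.0
    for k in range(1, n):
        partial += y[k - 1] - mean
        stat = abs(partial) / (sd * math.sqrt(n))
        if stat > best:
            best, k_hat = stat, k
    return best > crit, k_hat


def binary_segmentation(y, test=cusum_mean_test, min_len=30, offset=0):
    """Recursively split y at detected change-points.
    Returns the sorted change-point positions on the original time scale."""
    if len(y) < 2 * min_len:
        return []  # segment too short to test reliably (assumed cut-off)
    reject, k = test(y)
    if not reject:
        return []
    # Subsamples {Y_t, 1 <= t <= k} and {Y_t, k < t <= n}, as in the text.
    left = binary_segmentation(y[:k], test, min_len, offset)
    right = binary_segmentation(y[k:], test, min_len, offset + k)
    return left + [offset + k] + right


if __name__ == "__main__":
    # Toy series with level shifts after t = 100 and t = 200.
    y = [(0.0 if t < 100 or t >= 200 else 5.0) + (0.1 if t % 2 else -0.1)
         for t in range(300)]
    print(binary_segmentation(y))
```

Passing the test as an argument keeps the recursion independent of the particular statistic, so a score-based test could be substituted without changing the segmentation logic.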

Simulation
To evaluate the finite sample performance of the two proposed test statistics ($W_1$ and $W_2$), we first simulate an ordinal time series $\{Y_t\}$ with $m = 3$ categories and length $n = 100, 200, 500, 1000$. The data are generated by [...], where $\alpha_1 = -0.5$, $\alpha_2 = 0.2$ and $(\beta_1, \beta_2, \beta_3)' = (2, 0.5, 1)'$, so that the parameter vector is $\theta = (\alpha_1, \alpha_2, \beta_1, \beta_2, \beta_3)'$. All simulation results are based on 1000 replications at the 0.05 significance level. Suppose that we are only interested in $\alpha_1$ and $\beta_1$; the others are nuisance parameters. Table 1 shows the empirical size of the two statistics under the null hypothesis $H_0$. $W_1^1$ and $W_1^2$ denote the empirical size of $W_1$ when testing for a change in $\alpha_1$ alone and in $\beta_1$ alone, respectively. $W_1$ and $W_2$ denote the empirical sizes of the two statistics when testing for a change in both $\alpha_1$ and $\beta_1$. It can be seen from Table 1 that the empirical size increases as the sample size $n$ increases. When $n = 1000$, the empirical size of $W_1$ and $W_2$ is close to the significance level 0.05. In addition, consider the relation between the probabilities of type I error when testing $\alpha_1$ or $\beta_1$ individually and the overall probability of type I error, that is, between $W_1^1$, $W_1^2$ and $W_1$. The results show that $1 - W_1 \approx (1 - W_1^1)(1 - W_1^2)$, which confirms the above inference. Under the alternative hypothesis $H_A$, we consider the following three different situations:

$H_A^{(1)}$: $\alpha_1$ changes from $-0.5$ to $-1$ at $k^*$, and $\beta_1$ changes from 2 to 3 at $k^*$;
$H_A^{(2)}$: $\alpha_1$ changes from $-0.5$ to $-1$ at $k^*$;
$H_A^{(3)}$: $\beta_1$ changes from 2 to 3 at $k^*$,
where $k^* = 0.1n, \ldots, 0.9n$. Tables 2-4 summarize the empirical power of $W_1$ and $W_2$ under the alternative hypotheses $H_A^{(1)}$, $H_A^{(2)}$ and $H_A^{(3)}$ when $k^* = 0.1n, 0.5n, 0.8n$. $W_1^1$ and $W_1^2$ denote the empirical power of $W_1$ when testing for a change in $\alpha_1$ alone and in $\beta_1$ alone, respectively. $W_1$ and $W_2$ denote the empirical power of the two statistics when testing for a change in both $\alpha_1$ and $\beta_1$. The simulation results show that the empirical power of the two statistics increases with the sample size $n$ and is close to 1 when $n = 1000$. In addition, the empirical power of the two statistics varies with the change-point location and reaches its maximum when $k^* = 0.5n$. Fig 1 shows the empirical power of the two statistics for $k^* = 0.1n, \ldots, 0.9n$. The empirical power of $W_1$ is higher than that of $W_2$ when the change-point is located near the centre of the data, but $W_2$ performs better when the change-point is located near the beginning or end of the data. In the simulation for Table 2 both $\alpha_1$ and $\beta_1$ change, whereas in Tables 3 and 4 only $\alpha_1$ or only $\beta_1$ changes, respectively. Tables 3 and 4 indicate that most of the power stems from the parameter that changes, which means that $W_1$ can not only detect a change in the parameters but also identify which parameter is responsible for rejecting the null hypothesis.
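To make the simulation design concrete, the following sketch generates an ordinal series with $m = 3$ categories from a cumulative logistic model, optionally with a parameter change after $k^*$ as in $H_A^{(1)}$. Several ingredients are assumptions made for illustration and are not specified in the text above: the link is written as $P(Y_t \le j \mid \mathcal{F}_{t-1}) = \mathrm{logistic}(\alpha_j + \beta' z_{t-1})$, the covariate vector $z_{t-1}$ (a cosine term plus two lagged category indicators) is hypothetical, and the initial state is set arbitrarily to 1.

```python
import math
import random


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def simulate_ordinal_series(n, alpha, beta, change_at=None,
                            alpha_new=None, beta_new=None, seed=0):
    """Simulate Y_1..Y_n in {1,2,3} from an assumed cumulative logit model.
    If change_at = k is given, (alpha_new, beta_new) apply for t > k."""
    rng = random.Random(seed)
    y_prev = 1  # arbitrary starting state (assumption)
    series = []
    for t in range(1, n + 1):
        a, b = alpha, beta
        if change_at is not None and t > change_at:
            a, b = alpha_new, beta_new
        # Hypothetical covariates: seasonal term and lagged indicators.
        z = (math.cos(2.0 * math.pi * t / 12.0),
             float(y_prev == 1), float(y_prev == 2))
        eta = sum(bi * zi for bi, zi in zip(b, z))
        # Cumulative probabilities P(Y_t <= j), j = 1, 2, 3.
        cum = [logistic(aj + eta) for aj in a] + [1.0]
        u = rng.random()
        y = next(j + 1 for j, c in enumerate(cum) if u <= c)
        series.append(y)
        y_prev = y
    return series


if __name__ == "__main__":
    theta0 = ((-0.5, 0.2), (2.0, 0.5, 1.0))
    theta1 = ((-1.0, 0.2), (3.0, 0.5, 1.0))  # H_A^(1): alpha_1 and beta_1 change
    y = simulate_ordinal_series(200, *theta0, change_at=100,
                                alpha_new=theta1[0], beta_new=theta1[1])
    print({c: y.count(c) for c in (1, 2, 3)})
```

Repeating such draws over many seeds and recording how often a test rejects would give Monte Carlo estimates of empirical size (no change) or empirical power (with a change).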

Application to real data
To illustrate the applicability of our results, we use 1000 observations $\{Y_t\}$ of the sleep state of a newborn infant, sampled every 30 seconds (Fokianos and Kedem [23]). The sleep states are classified as follows: (1) quiet sleep, (2) indeterminate sleep, (3) active sleep, (4) awake (Fig 2). According to the newborn's sleep pattern, the sleep states have the following order: "(4)" < "(1)" < "(2)" < "(3)", which means that $\{Y_t\}$ is an ordinal time series. One goal of analyzing these data is to establish a correct model and to predict the sleep state from the covariate information. Following Example 6.3 of [23], $Y_{t-1} = (Y_{(t-1)1}, Y_{(t-1)2}, Y_{(t-1)3})'$ is a significant predictor and can be used as a covariate. These data can then be modeled by a cumulative logistic regression model [...], where $\alpha_1 = -14.722$, $\alpha_2 = -10.389$, $\alpha_3 = -4.078$, $\beta_1 = 18.663$, $\beta_2 = 12.173$ and $\beta_3 = 7.566$. Let $\theta = (\alpha_1, \alpha_2, \alpha_3, \beta_1, \beta_2, \beta_3)'$. Testing for a structural change in $\theta$ with the statistics $W_1$ and $W_2$, we find that a structural change occurs in $\theta$. We then use $W_1$ to check which parameter undergoes the change; the result shows a structural change in $\alpha_2$ at $t = 596$. Specifically, the maximum of $W_1$ is 3.446, while the critical value is 1.35 when $p = 1$ and $\alpha = 0.05$, which gives a significant result (Fig 3). Re-estimating the parameters from the first 596 observations and from the last 404 observations, we obtain $\hat{\alpha}_2 = -10.799$ for the former and $\hat{\alpha}_2 = -8.57$ for the latter. We obtain AIC = 1646.65 for the adjusted model and AIC = 1652.89 when assuming no change-point. The lower AIC indicates that allowing for the change-point improves the model to some extent, which should lead to more accurate predictions.
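The ordering of the sleep states and the use of the lagged indicator vector $Y_{t-1}$ as a covariate can be illustrated with a short sketch. It assumes, hypothetically, that the fitted model has the form $P(Y_t \le j \mid Y_{t-1}) = \mathrm{logistic}(\alpha_j + \beta' Y_{t-1})$ with $Y_{(t-1)j}$ the indicator of the $j$-th ordered category and the fourth category as reference; neither the link convention nor the coding is stated explicitly above. The names `ORDER`, `lagged_indicators` and `category_probs` are ours.

```python
import math

# Raw sleep-state label -> rank under the order (4) < (1) < (2) < (3).
ORDER = {4: 1, 1: 2, 2: 3, 3: 4}

# Estimates quoted in the text (no change-point model).
ALPHA = (-14.722, -10.389, -4.078)
BETA = (18.663, 12.173, 7.566)


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def lagged_indicators(prev_rank):
    """Indicator vector (Y_(t-1)1, Y_(t-1)2, Y_(t-1)3); rank 4 is the reference."""
    return tuple(1.0 if prev_rank == j else 0.0 for j in (1, 2, 3))


def category_probs(prev_rank, alpha=ALPHA, beta=BETA):
    """P(Y_t = j | previous rank), j = 1..4, under the assumed cumulative logit form."""
    z = lagged_indicators(prev_rank)
    eta = sum(b * v for b, v in zip(beta, z))
    cum = [logistic(a + eta) for a in alpha] + [1.0]
    return [cum[0]] + [cum[j] - cum[j - 1] for j in range(1, 4)]


if __name__ == "__main__":
    raw = [4, 4, 1, 1, 2, 3]           # toy sequence of raw labels
    ranks = [ORDER[s] for s in raw]    # recoded ordinal series
    print(ranks)
    for prev in (1, 2, 3, 4):
        print(prev, [round(p, 3) for p in category_probs(prev)])
    # AIC gain from allowing the change-point, using the values in the text.
    print(round(1652.89 - 1646.65, 2))
```

Differencing the cumulative probabilities yields the one-step-ahead category probabilities, which is how such a model would be used to predict the next sleep state.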

Concluding remark
The cumulative logistic regression model is a generalized linear model with wide application in health care. In this paper, two test statistics based on the efficient score vector are proposed to detect a structural change in the cumulative logistic regression model. Under the null hypothesis of no change, we derive the asymptotic distributions of the two test statistics, and we prove their consistency under the alternative hypothesis. Furthermore, we prove the consistency of the change-point estimator and provide a binary segmentation procedure for estimating the locations of possible multiple change-points. The finite sample performance is investigated by a Monte Carlo simulation; the results show that the empirical size of the two statistics is close to the significance level 0.05 and that the empirical power approaches 1 when the sample size is large. In terms of empirical power, the two test statistics have different advantages depending on where the change-point occurs. Furthermore, the proposed statistic $W_1$ can identify which parameter is responsible for rejecting the null hypothesis. Finally, we apply the two test statistics to 1000 observations of the sleep state of a newborn infant sampled every 30 seconds; the results show that there is a structural change in the model.