Abstract
Change-point detection in health care data has recently attracted considerable attention due to the increased availability of complex data in real time. In many applications, the observed data form an ordinal time series. We propose two test statistics for detecting a structural change in the cumulative logistic regression model, which is often used to analyze ordinal time series. The first is the standardized efficient score vector; the second is a quadratic form of the efficient score vector with a weight function. We derive the asymptotic distribution of both test statistics under the null hypothesis and prove their consistency under the alternative hypothesis. We also study the consistency of the change-point estimator and suggest a binary segmentation procedure for estimating the locations of possible multiple change-points. Simulation results show that the former statistic performs better when the change-point occurs near the centre of the data, whereas the latter is preferable when the change-point occurs near the beginning or end of the data. Furthermore, the former statistic can identify which parameter is responsible for rejecting the null hypothesis. Finally, we apply the two test statistics to a set of sleep data; the results show that there is a structural change in the data.
Citation: Li F, Hao M, Yang L (2021) Structural change detection in ordinal time series. PLoS ONE 16(8): e0256128. https://doi.org/10.1371/journal.pone.0256128
Editor: Christophe Letellier, Normandie Universite, FRANCE
Received: February 26, 2021; Accepted: July 29, 2021; Published: August 16, 2021
Copyright: © 2021 Li et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the manuscript and its Supporting information files.
Funding: This work was supported by National Nature Science Foundation of China (No.: 11801438, URL:https://isisn.nsfc.gov.cn/egrantindex/funcinde/prjsearch-list, Fuxiao Li, Xi’an University of Technology), Innovation Capability Support Program of Shaanxi (No.: 2020PT-023, URL:http://ywgl.sstrc.com/egrantweb/#, Xiaoping Xu, Xi’an University of Technology), National Nature Science Foundation of China (No.: 11702214, URL:https://isisn.nsfc.gov.cn/egrantindex/funcindex/prjsearch-list, Mengli Hao, Xi’an University of Technology) Fundamental Research Funds for the Central Universities, (No.: 300102129107, URL:http://www.edu.cn/zheng_ce_fa_gu…090820_400962.shtml, Lijuan Yang, Chang’an University). Natural Science Basic Research Plan in Shaanxi Province of China (No.: 2018JQ1089, URL:http://ywgl.sstrc.com/egrantweb/#, Fuxiao Li, Xi’an University of Technology), The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Introduction
In categorical data analysis, ordinal categorical variables are encountered in many contexts, such as health status (very good, good, so-so, bad, very bad) and blood pressure (low, normal, high). Such data observed hourly or daily constitute an ordinal time series. The cumulative logistic regression model is often applied to analyze ordinal time series [1]. Sometimes the model changes at some unknown time points (change-points) while remaining stable between these points. Structural stability is of prime importance in statistical modeling and inference. If the parameters change within the observed sample, inferences can be severely biased and forecasts lose accuracy. Because parameter stability is so important, it is necessary to detect structural change. Structural change detection has been a popular research subject in statistics; see Csörgö and Horváth [2], Bai and Perron [3], Lee et al. [4], Perron [5], Gombay [6], Wang et al. [7], Chen et al. [8], Baranowski et al. [9], Wang et al. [10], Chen [11] and Liu et al. [12] for reviews of the field.
Structural change detection in categorical data has been considered as well. Höhle [13] proposed a prospective CUSUM change-point detection procedure to detect a structural change in categorical time series; Wang et al. [10] described a procedure based on a high-dimensional homogeneity test to detect and estimate multiple change-points in multinomial data; Plasse and Adams [14] presented a multiple change-point detection method for categorical data streams that adaptively monitors the category probabilities. Because generalized linear regression models for categorical time series allow for parsimonious modeling and the incorporation of random time-dependent covariates, Fokianos and Kedem [15] suggested the generalized linear model for categorical time series modeling. For change-point detection in the generalized linear model, Xia et al. [16] introduced two procedures to sequentially detect a structural change in generalized linear models under an independence assumption; Hudecová [17] investigated the detection of change in autoregressive models for binary time series; Fokianos et al. [18] provided a statistical procedure based on the partial likelihood score process to detect a structural change in the binary logistic regression model; Gombay et al. [19] and Li et al. [20] discussed retrospective and sequential change detection, respectively, in the multinomial logistic regression model.
Score tests for detecting changes in time series models have been studied by Gombay and Serban [21] and Gombay et al. [22]. The score test statistic is usually computationally less demanding than the likelihood ratio test statistic. In this paper, we first propose a test statistic based on the efficient score vector to detect a structural change in the cumulative logistic regression model, which extends the change-point detection of Gombay et al. [19]. Simulation shows that the empirical power of this statistic is low when the change-point occurs near the beginning or end of the data. To address this, we propose a second statistic, which is a quadratic form of the efficient score vector with a weight function. Under the null hypothesis of no change, we derive the asymptotic distribution of the two statistics, and we prove their consistency under the alternative hypothesis. We also study the consistency of the change-point estimator, and suggest a binary segmentation procedure for estimating the locations of possible multiple change-points. Simulation results show that the empirical size of the two statistics is close to the significance level 0.05, and the empirical power is close to 1 when the sample size is large. The empirical power of the former statistic is higher when the change-point is located near the centre of the data, but the latter performs better when the change-point is located near the beginning or end of the data. Furthermore, the former statistic can identify which parameter is responsible for rejecting the null hypothesis. Finally, we apply the two statistics to a set of sleep data and find a structural change in the data.
The model and hypotheses
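Consider a categorical time series {Yt} with m categories, Yt = (Yt1, …, Ytq)′, q = m − 1, where
$$Y_{tj} = \begin{cases} 1, & \text{if the } j\text{-th category is observed at time } t,\\ 0, & \text{otherwise,} \end{cases}$$
for t = 1, 2, …, n and j = 1, …, q, and $Y_{tm} = 1 - \sum_{j=1}^{q} Y_{tj}$. The vector of conditional probabilities πt = (πt1, …, πtq)′ is defined by
$$\pi_{tj} = E(Y_{tj} \mid \mathcal{F}_{t-1}) = P(Y_{tj} = 1 \mid \mathcal{F}_{t-1}), \quad j = 1, \ldots, q,$$
for every t, with $\pi_{tm} = 1 - \sum_{j=1}^{q} \pi_{tj}$, where $\mathcal{F}_{t-1}$ is the σ-field generated by the past observations and covariates available at time t − 1, and {Zt−1} denotes the p × q covariate matrices.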
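Define an ordinal time series {Yt}, where Yt = j is equivalent to Ytj = 1 for j = 1, 2, …, m, t = 1, 2, …, n. Let {Xt} be a latent variable time series, where Xt = −β′zt−1 + et, β ∈ Rd, zt−1 is a d-dimensional covariate vector, and et is a white noise process with continuous cumulative distribution function F. Suppose that −∞ = α0 < α1 < ⋯ < αm = ∞ are threshold parameters such that Yt satisfies
$$Y_t = j \iff \alpha_{j-1} \le X_t < \alpha_j$$
for j = 1, 2, …, m. According to the equivalence relation between Yt and Ytj, and conditioning on the information $\mathcal{F}_{t-1}$ available at time t − 1, we have
$$\pi_{tj} = P(\alpha_{j-1} \le X_t < \alpha_j \mid \mathcal{F}_{t-1}) = F(\alpha_j + \beta' z_{t-1}) - F(\alpha_{j-1} + \beta' z_{t-1}),$$
then
$$P(Y_t \le j \mid \mathcal{F}_{t-1}) = F(\alpha_j + \beta' z_{t-1}), \quad j = 1, \ldots, q.$$
If F(x) is the logistic distribution function, then F−1 is the logit link function logit(x), where logit(x) = ln(x/(1 − x)), 0 < x < 1. Thus we have
$$\operatorname{logit} P(Y_t \le j \mid \mathcal{F}_{t-1}) = \alpha_j + \beta' z_{t-1}, \quad j = 1, \ldots, q,$$
which is called the cumulative logistic regression model.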
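As a rough illustration of how the model maps thresholds and covariates to category probabilities, the following sketch computes P(Yt = j) as differences of cumulative logistic probabilities. The function names and the covariate values are hypothetical; the thresholds and slopes are the ones used in the simulation section.

```python
import math


def logistic_cdf(x):
    """Standard logistic distribution function F(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def category_probs(alpha, beta, z):
    """Return [P(Y_t = 1), ..., P(Y_t = m)] for thresholds alpha_1 < ... < alpha_q.

    Uses P(Y_t <= j) = F(alpha_j + beta' z_{t-1}) and takes differences of
    consecutive cumulative probabilities.
    """
    lin = sum(b * x for b, x in zip(beta, z))               # beta' z_{t-1}
    cum = [logistic_cdf(a + lin) for a in alpha] + [1.0]    # P(Y_t <= j), j = 1..m
    return [cum[0]] + [cum[j] - cum[j - 1] for j in range(1, len(cum))]


# Thresholds and slopes follow the simulation section; z is a made-up covariate vector.
probs = category_probs(alpha=[-0.5, 0.2], beta=[2.0, 0.5, 1.0], z=[0.1, -0.3, 0.4])
print(probs, sum(probs))
```

Because the thresholds are increasing, each difference is positive and the m probabilities sum to one.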
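Let θ = (α1, …, αq, β′)′ be a p-dimensional parameter vector, p = q + d. In this paper we wish to test whether there is a structural change in the parameter θ, that is,
$$H_0: \theta = \theta_0,\ t = 1, \ldots, n \quad \text{versus} \quad H_A: \theta = \theta_0,\ t = 1, \ldots, k^*; \quad \theta = \theta_A \ne \theta_0,\ t = k^* + 1, \ldots, n,$$
where θ0 is the true value of θ before any change, θA is the value after the change, and k* denotes the change-point, which may affect only some components of θ; θ0, θA and k* are unknown.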
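Next, we estimate the parameter vector θ by the partial likelihood method (Fokianos et al. [18]). The partial likelihood function
$$PL(\theta) = \prod_{t=1}^{n} \prod_{j=1}^{m} \pi_{tj}(\theta)^{Y_{tj}}$$
and the partial log-likelihood function
$$l(\theta) = \sum_{t=1}^{n} \sum_{j=1}^{m} Y_{tj} \log \pi_{tj}(\theta) \tag{1}$$
are defined in Gombay et al. [19]. Denote the partial score vector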
where
,
, h(ηt) = (h1(ηt), …, hq(ηt))′,
. h1(ηt), …, hq(ηt) satisfies
where
. Σt(θ) is the conditional covariance matrix of Yt, whose (i, j) entry is πti(θ)(1 − πti(θ)) if i = j and −πti(θ)πtj(θ) if i ≠ j,
for i, j = 1, …, q [23].
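A minimal sketch of two of the quantities above, assuming the standard form of the partial log-likelihood (the sum over time of the log-probability of the observed category) and the usual covariance of a one-hot indicator vector, diag(π) − ππ′. The function names are hypothetical and the inputs are toy values.

```python
import math


def partial_loglik(probs_by_time, y):
    """Sum over t of log pi_{t, y_t}, where y[t] is the observed category in 1..m."""
    return sum(math.log(p[yt - 1]) for p, yt in zip(probs_by_time, y))


def conditional_cov(pi):
    """q x q covariance of the indicator vector (Y_t1, ..., Y_tq): diag(pi) - pi pi'."""
    q = len(pi)
    return [
        [pi[i] * (1.0 - pi[i]) if i == j else -pi[i] * pi[j] for j in range(q)]
        for i in range(q)
    ]


# Toy inputs: two time points with m = 2 categories, then q = 2 probabilities.
ll = partial_loglik([[0.5, 0.5], [0.25, 0.75]], [1, 2])
cov = conditional_cov([0.2, 0.3])
print(ll, cov)
```

In practice the probabilities would come from the fitted model at each time point rather than being supplied by hand.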
To obtain the existence, consistency and asymptotic normality of the maximum partial likelihood estimator, we impose a few assumptions on the covariate matrices {Zt} and the parameter vector θ.
Assumption 1 The parameter vector θ ∈ Ω ⊆ Rp, where Ω is an open set.
Assumption 2 The link function h is twice continuously differentiable, and satisfies det(∂ h(ηt)/∂ ηt) ≠ 0, where .
Assumption 3 The covariate matrix Zt−1 lies almost surely in a non-random compact subset Φ of such that
,
lies almost surely in the domain H of h for all Zt−1 ∈ Φ and θ ∈ Ω, where
, λ ≠ 0.
Assumptions 1 and 2 ensure that the second derivative of l(θ) is continuous; the condition det(∂ h(ηt)/∂ ηt) ≠ 0 implies that Ut(θ) is nonsingular (Fokianos and Kedem [24]). From Assumption 3,
is positive definite with probability one [24]. Since the likelihood estimation employs an assumption regarding ergodicity of the joint process
(Fokianos and Truquet [25]), let {Yt} be a time series taking values in a finite set E of cardinality m, such that
where
,
,
, q is a transition kernel. We assume that the mappings
are measurable, as mappings from
to (0, 1), where
is a sequence,
,
is such that
. Assume that v1 and v2 are two probability measures on E, define
For y, y′ ∈ Em and a positive integer s, we write
if
, 0 ≤ i ≤ s − 1 (Truquet [26]).
Assumption 4 The d-dimensional covariate vector {zt−1} is stationary and ergodic.
Assumption 5 Setting for s ≥ 0,
we have b0 < 1 and
.
Assumptions 4 and 5 guarantee that the joint process is stationary and ergodic [26]. Assumptions 1–5 are required to obtain consistency and asymptotic normality of the maximum likelihood estimator. However, the existence of moments for the covariate process is also required to study large sample properties of the maximum likelihood estimator [25]. We therefore add
Assumption 6
, i = 1, 2, ⋯, d, where
, 1 ≤ i ≤ d are components of vector zt‒1.
The proposed testing procedure
Based on the partial likelihood score process, a test statistic is defined by
where
,
is the maximum partial likelihood estimator of θ, which can be obtained by maximizing the partial log-likelihood function (1) (see Fokianos and Kedem [23]).
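To make the idea of a standardized score process concrete, here is a deliberately simplified sketch for a single Bernoulli parameter rather than the cumulative logistic model. In that case the score evaluated at the estimate, standardized by the estimated information, reduces to a scaled partial sum of residuals. The function name and the toy data are hypothetical, and this is not the paper's statistic.

```python
import math


def score_cusum_bernoulli(y):
    """Return (max_k |sum_{t<=k} (y_t - p_hat)| / sqrt(n p_hat (1 - p_hat)), argmax k)."""
    n = len(y)
    p_hat = sum(y) / n                      # maximum likelihood estimate under no change
    if p_hat in (0.0, 1.0):
        raise ValueError("constant series: the score process is degenerate")
    scale = math.sqrt(n * p_hat * (1.0 - p_hat))
    partial, best, best_k = 0.0, 0.0, 0
    for k, yt in enumerate(y, start=1):
        partial += yt - p_hat               # partial sum of score contributions
        if abs(partial) > best:
            best, best_k = abs(partial), k
    return best / scale, best_k


y = [0] * 60 + [1] * 40                     # toy series whose mean shifts after t = 60
stat, k_hat = score_cusum_bernoulli(y)
print(stat, k_hat)
```

Since the full-sample score is zero at the estimate, the partial sums start and end at zero, which is why a Brownian bridge appears in the limit.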
Under the null hypothesis of no change, we derive the asymptotic distribution of the proposed test statistic.
Theorem 1 If Assumptions 1–6 and H0 hold, then we have where
, B(t) is a p-dimensional vector of independent Brownian bridges, and
$\xrightarrow{d}$ denotes convergence in distribution.
Proof: Since , we can write
let
denote the i-th element of Sk, i = 1, 2, …, p, θ0 is the true value of θ, then we have
Next, arguing as in the proof of Proposition 3 in Gombay et al. [19], we can prove that
By Theorem 4.1 of Fokianos and Kedem [23], we get
The error terms
have higher orders of products of
, it can be shown that
. According to Proposition 1 (Gombay et al. [19]) and Slutsky’s theorem, we get
as n → ∞.
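Remark 1 When using the above test, the null hypothesis is rejected, indicating that a change-point has occurred, if for some i, 1 ≤ i ≤ p, the i-th component of the test statistic exceeds the critical value C(α*), where α* = 1 − (1 − α)^(1/p). Let B(u) be a one-dimensional Brownian bridge. Csörgö and Révész [27] suggested that C(α*) could be obtained from
$$P\Big(\sup_{0 \le u \le 1} |B(u)| > x\Big) = 2 \sum_{k=1}^{\infty} (-1)^{k+1} e^{-2k^2x^2}.$$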
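The critical value C(α*) can be computed numerically. The sketch below assumes the classical series for the distribution of the supremum of the absolute value of a Brownian bridge and solves for the critical value by bisection. The function names and the bisection bracket are illustrative choices.

```python
import math


def sup_bridge_tail(x, terms=100):
    """P(sup_{0<=u<=1} |B(u)| > x) = 2 * sum_{k>=1} (-1)^(k+1) * exp(-2 k^2 x^2)."""
    return 2.0 * sum(
        (-1) ** (k + 1) * math.exp(-2.0 * k * k * x * x) for k in range(1, terms + 1)
    )


def critical_value(alpha, p=1):
    """Solve sup_bridge_tail(C) = alpha*, with alpha* = 1 - (1 - alpha)^(1/p)."""
    alpha_star = 1.0 - (1.0 - alpha) ** (1.0 / p)
    lo, hi = 0.3, 5.0                       # the tail is decreasing on this bracket
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if sup_bridge_tail(mid) > alpha_star:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


c1 = critical_value(0.05, p=1)
c5 = critical_value(0.05, p=5)
print(round(c1, 3), round(c5, 3))
```

For p = 1 and α = 0.05 this gives a value close to the 1.35 quoted in the application section; larger p gives a larger per-component critical value.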
Simulation shows that W1 has poor performance at the boundaries. In particular, the limiting Brownian bridge is tied down at t = 0 and t = 1 (meaning B(0) = B(1) = 0), which hampers the ability of the test to detect a structural change occurring near the beginning or end of the data. Many authors address this problem by adding a weight function [28]. Therefore, we construct a new test statistic
which is the quadratic form of the efficient score vector and has a weight function, where
{i1, i2, …, ij} ⊂ {1, 2, …, p}, j = 1, 2, …, p, 0 < l < h < 1.
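Theorem 2 If Assumptions 1–6 and H0 hold, then we have
$$W_2 \xrightarrow{d} \sup_{l \le t \le h} \sum_{i=1}^{j} \frac{B_i^2(t)}{t(1-t)}$$
for each 0 < l < h < 1, where Bi(t), i = 1, …, j are independent one-dimensional Brownian bridges.
The conclusion of Theorem 2 can be deduced directly from Theorem 1. To obtain the critical values of the asymptotic distribution, Csörgö and Horváth [2] used a result of Vostrikova [29] to show that
$$P\Bigg(\sup_{l \le t \le h} \Big(\sum_{i=1}^{j} \frac{B_i^2(t)}{t(1-t)}\Big)^{1/2} \ge x\Bigg) = \frac{x^{j} e^{-x^2/2}}{2^{j/2}\,\Gamma(j/2)} \Big\{\Big(1 - \frac{j}{x^2}\Big) \log\frac{(1-l)h}{l(1-h)} + \frac{4}{x^2} + O\Big(\frac{1}{x^4}\Big)\Big\}$$
as x → ∞. For example, when α = 0.05, l = 0.05, h = 0.95, j = 2, the critical value C(α) = 13.1.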
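The quoted critical value can be checked numerically. The sketch below assumes the leading terms of Vostrikova's tail approximation as given by Csörgö and Horváth [2], drops the O(1/x⁴) remainder, and solves for x by bisection. The critical value is reported on the squared scale, matching the quadratic form. Function names and the bracket are illustrative and suited to small j.

```python
import math


def vostrikova_tail(x, j, l, h):
    """Leading terms of P(sup_{l<=t<=h} sqrt(sum_i B_i(t)^2 / (t (1 - t))) >= x)."""
    lead = x ** j * math.exp(-0.5 * x * x) / (2.0 ** (j / 2.0) * math.gamma(j / 2.0))
    log_ratio = math.log((1.0 - l) * h / (l * (1.0 - h)))
    return lead * ((1.0 - j / (x * x)) * log_ratio + 4.0 / (x * x))


def critical_value_w2(alpha, j, l, h):
    """Return C = x^2 such that the approximate tail probability equals alpha."""
    lo, hi = math.sqrt(j + 2.0), 20.0       # assumed bracket; tail decreases here for small j
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if vostrikova_tail(mid, j, l, h) > alpha:
            lo = mid
        else:
            hi = mid
    x = 0.5 * (lo + hi)
    return x * x


c = critical_value_w2(alpha=0.05, j=2, l=0.05, h=0.95)
print(round(c, 2))
```

With α = 0.05, l = 0.05, h = 0.95 and j = 2, the result is close to the value 13.1 stated above.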
Under the alternative hypothesis there is a structural change in the model; we now prove the consistency of the two statistics.
Theorem 3 Suppose Assumptions 1–6 and HA hold, if the coefficient changes from θ0 to
at k*,
,
is the jth component of θ0, j ∈ {1, 2, …, p}, where δ is a constant, δ ≠ 0, then we have
where 0 < l < h < 1, ‖⋅‖ denotes the Euclidean norm of a vector, and $\xrightarrow{p}$ denotes convergence in probability.
Proof: Under the alternative hypothesis θ = θ0, t = 1, 2, …, k*, , t = k* + 1, …, n. Suppose that the coefficient changes from θ0 to
at k*,
,
is the j-th component of θ0, 1 < j < p, where δ is a constant, δ ≠ 0.
When k* < k < n,
where
,
. For the ith component of
, 1 < i < p, we have
where
has two orders of products of
,
has two orders of products of
. By Theorem 1 we have
as n → ∞. Following Assumptions 1–6, we conclude that
Since δ ≠ 0, we have
as n → ∞. When 1 < k < k*, the proof is similar. The proof of (ii) is similar to the proof of (i).
Once the null hypothesis is rejected, indicating that a change-point may exist, we locate the change-point by
(2)
The following theorem shows that the change-point estimator $\hat{k}$ is consistent for the true change-point k* as n → ∞.
Theorem 4 Let k* be the true position of the change-point under the alternative hypothesis HA and let $\hat{k}$ be the estimate of k* given by (2). Under Assumptions 1–6, $\hat{k}$ is consistent for k* as n → ∞.
Proof: First we note that
where i = 1, 2, …, p. Since
where
. And because
Therefore
increases for k = 1, 2, ⋯, k* and decreases for k = k* + 1, k* + 2, ⋯, n, so we take (2) as the change-point estimator.
By the proof of Theorem 1, we have
and
. By Theorem 2 of Gombay [6], to prove (2) it is enough to show that
(3)
and
(4)
where
. To prove (3), assume that there exists a constant K, K < k*,
where
By Theorem 1, choosing δ > 0 arbitrarily
if K is large enough, so (3) is proven. The proof of (4) is the same by symmetry.
To detect multiple structural changes in the sequence, we can employ the binary segmentation method [30]. First apply the single change test. If H0 is rejected, find $\hat{k}$ by (2). Next divide the sample into the two subsamples $\{Y_1, \ldots, Y_{\hat{k}}\}$ and $\{Y_{\hat{k}+1}, \ldots, Y_n\}$, and test both subsamples for further changes. This segmentation procedure continues until no subsample contains a further change-point.
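The segmentation procedure can be sketched generically. In the code below, `single_change_test` is a placeholder for any single change-point test returning a rejection flag and an estimated position. A toy Bernoulli CUSUM test stands in for the paper's statistics purely so that the example runs. The minimum segment length and the critical value 1.358 are illustrative assumptions.

```python
import math


def bernoulli_cusum_test(segment, critical_value=1.358):
    """Toy single-change test for a 0/1 segment; returns (reject, k_hat)."""
    n = len(segment)
    p_hat = sum(segment) / n
    if p_hat in (0.0, 1.0):                     # constant segment: nothing to detect
        return False, 0
    scale = math.sqrt(n * p_hat * (1.0 - p_hat))
    partial, best, k_hat = 0.0, 0.0, 0
    for k, y in enumerate(segment, start=1):
        partial += y - p_hat
        if abs(partial) > best:
            best, k_hat = abs(partial), k
    return best / scale > critical_value, k_hat


def binary_segmentation(series, single_change_test, min_len=20):
    """Recursively split `series` at detected change-points.

    Returns sorted positions k such that a change is placed between
    observation k and observation k + 1 (1-based).
    """
    found = []
    stack = [(0, len(series))]                  # half-open index ranges still to test
    while stack:
        start, end = stack.pop()
        if end - start < min_len:
            continue
        reject, k_hat = single_change_test(series[start:end])
        if not reject or not 0 < k_hat < end - start:
            continue
        found.append(start + k_hat)
        stack.append((start, start + k_hat))    # left subsample
        stack.append((start + k_hat, end))      # right subsample
    return sorted(found)


toy = [0] * 100 + [1] * 60 + [0] * 140          # two changes, after t = 100 and t = 160
change_points = binary_segmentation(toy, bernoulli_cusum_test)
print(change_points)
```

Repeated testing on subsamples inflates the overall type I error, so in practice the significance level of each test would need some care.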
Simulation
To evaluate the finite sample performance of the proposed two test statistics (W1 and W2), we first simulate an ordinal time series {Yt} with m = 3 categories and length n = 100, 200, 500, 1000. The data are generated by
where α1 = −0.5, α2 = 0.2, (β1, β2, β3)′ = (2, 0.5, 1)′, so that the parameter vector is θ = (α1, α2, β1, β2, β3)′. All simulation results are based on 1000 replications at the 0.05 significance level.
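A minimal sketch of how such a series could be simulated through the latent-variable form of the model, Xt = −β′zt−1 + et with logistic noise. The covariate distribution (independent standard normal components), the change-point position and the post-change thresholds are assumptions made only for illustration and are not taken from the paper.

```python
import math
import random


def simulate_ordinal_series(n, alpha, beta, k_star=None, alpha_after=None, seed=0):
    """Simulate Y_1..Y_n with Y_t = j iff alpha_{j-1} <= X_t < alpha_j.

    If k_star is given, the thresholds switch to alpha_after for t > k_star.
    """
    if k_star is not None and alpha_after is None:
        raise ValueError("alpha_after is required when k_star is set")
    rng = random.Random(seed)
    ys, zs = [], []
    for t in range(1, n + 1):
        a = alpha_after if (k_star is not None and t > k_star) else alpha
        z = [rng.gauss(0.0, 1.0) for _ in beta]          # ASSUMED covariate distribution
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)   # keep the logit finite
        e = math.log(u / (1.0 - u))                      # logistic noise via inverse CDF
        x = -sum(b * v for b, v in zip(beta, z)) + e     # latent variable X_t
        ys.append(1 + sum(1 for thr in a if x >= thr))   # category in 1..m
        zs.append(z)
    return ys, zs


ys, zs = simulate_ordinal_series(
    n=200, alpha=[-0.5, 0.2], beta=[2.0, 0.5, 1.0],
    k_star=100, alpha_after=[0.5, 1.2],                  # hypothetical threshold shift
)
print(ys[:10])
```

Fixing the seed makes each replication reproducible, which is convenient when estimating empirical size and power over many runs.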
Suppose that we are only interested in α1 and β1, the others are nuisance parameters. Table 1 shows the empirical size of the two statistics under the null hypothesis H0. and
denote the empirical size of W1 when testing for change in each of α1 and β1, respectively. W1 and W2 denote the empirical size of W1 and W2 when testing for change in both α1 and β1, respectively.
It can be seen from Table 1 that the empirical size increases as the historical sample size n increases. When the sample size n = 1000, the empirical size of W1 and W2 is close to the significance level 0.05. In addition, based on the relation between the probability of type I errors when detecting α1 or β1 and the overall probability of type I errors, that is ,
and W1 should satisfy
. The results show that
, which confirms the above inference.
Under the alternative hypothesis HA, we consider the following three different situations:
where k* = 0.1n, …, 0.9n. Tables 2–4 summarize the empirical power of W1 and W2 under the alternative hypotheses
,
and
when k* = 0.1n, 0.5n, 0.8n.
and
denote the empirical power when testing for change in each of α1 and β1, respectively. W1 and W2 denote the empirical power of W1 and W2 when testing for change in both α1 and β1. From the simulation results, it can be seen that the empirical power of the two statistics increases with the sample size n, and is close to 1 when n = 1000. In addition, the empirical power of the two statistics varies with the change-point location, and reaches its maximum when k* = 0.5n. Fig 1 shows the empirical power of the two statistics when k* = 0.1n, …, 0.9n. The empirical power of W1 is higher than that of W2 when the change-point is located near the centre of the data, but W2 performs better when the change-point is located near the beginning or end of the data.
In the simulation for Table 2 both α1 and β1 change, whereas in Tables 3 and 4 only one of α1 and β1 changes. Tables 3 and 4 indicate that most of the power stems from the parameter that changes, which means that W1 can not only detect a change in the parameters but also identify the reason for rejecting the null hypothesis.
Application to real data
To illustrate the applicability of our results, we use 1000 sleep-state observations {Yt} collected from a newborn infant, sampled every 30 seconds (Fokianos and Kedem [23]). The sleep states are classified as follows: (1) quiet sleep, (2) indeterminate sleep, (3) active sleep, (4) awake (Fig 2). According to the newborn’s sleep pattern, the sleep states have the following order: “(4)” < “(1)” < “(2)” < “(3)”, which means {Yt} is an ordinal time series. One goal of analyzing these data is to establish a correct model and predict the sleep state based on the covariate information. According to Example 6.3 of [23], Yt−1 = (Y(t−1)1, Y(t−1)2, Y(t−1)3)′ is a significant predictor, which can be used as a covariate. These data can then be modeled by a cumulative logistic regression model
where α1 = −14.722, α2 = −10.389, α3 = −4.078, β1 = 18.663, β2 = 12.173, β3 = 7.566.
Let θ = (α1, α2, α3, β1, β2, β3)′. Computing the test statistics W1 and W2 to test for a structural change in θ, we find that a structural change occurs in θ. We then use W1 to check which parameter undergoes a structural change; the result shows a structural change in α2 at t = 596. Specifically, the maximum of W1 is 3.446, and the critical value is 1.35 when p = 1, α = 0.05, which gives a significant result (Fig 3). Re-estimating the parameters based on the first 596 samples and the last 404 samples, we have for the former and
for the latter. We obtain AIC = 1646.65 for the adjusted model and AIC = 1652.89 when assuming there is no change-point, which indicates that accounting for the change-point improves the model to some extent and should support more accurate predictions.
Concluding remark
The cumulative logistic regression model is a generalized linear model with wide application in health care. In this paper, two test statistics based on the efficient score vector are proposed to detect a structural change in the cumulative logistic regression model. Under the null hypothesis of no change, we derive the asymptotic distribution of the two test statistics, and we prove their consistency under the alternative hypothesis. Furthermore, we prove the consistency of the change-point estimator, and a binary segmentation procedure is provided for estimating the locations of possible multiple change-points. The finite sample performance is investigated by a Monte Carlo simulation. The results show that the empirical size of the two statistics is close to the significance level 0.05, and the empirical power is close to 1 when the sample size is large. In terms of empirical power, the two test statistics have different advantages depending on where the change-point occurs. Furthermore, the proposed statistic W1 can identify the reason for rejecting the null hypothesis. Finally, we apply the two test statistics to 1000 sleep-state observations collected from a newborn infant, sampled every 30 seconds; the results show that there is a structural change in the model.
References
- 1. McCullagh P. Regression Models for Ordinal Data. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 1980; 42: 109–142.
- 2. Csörgö M, Horváth L. Limit theorems in change-point analysis. New York: John Wiley & Sons Inc, 1997.
- 3. Bai J, Perron P. Estimating and testing linear models with multiple structural changes. Econometrica. 1998; 66: 47–78.
- 4. Lee S, Ha J, Na O, Na S. The Cusum Test for Parameter Change in Time Series Models. Scandinavian Journal of Statistics. 2003; 30: 781–796.
- 5. Perron P. Dealing with structural breaks, in: Mills TC, Patterson K, Palgrave handbook of econometrics econometric theory: vol 1. Springer, New York, 2006; pp. 278–352.
- 6. Gombay E. Change detection in autoregressive time series. Journal of Multivariate Analysis. 2008; 99(3): 451–464.
- 7. Wang Y, Wu C, Ji Z, Wang B, Liang Y. Non-Parametric Change-Point Method for Differential Gene Expression Detection. PLoS ONE. 2011; 6(5): e20060.
- 8. Chen Z, Jin Z, Tian Z, et al. Bootstrap testing multiple changes in persistence for a heavy-tailed sequence. Computational Statistics & Data Analysis. 2012; 55(7): 2303–2316.
- 9. Baranowski R, Chen Y, Fryzlewicz P. Narrowest-over-threshold detection of multiple change points and change-point-like features. Journal of the Royal Statistical Society: Series B (Statistical Methodology). 2016; 81: 649–672.
- 10. Wang G, Zou C, Yin G. Change-point detection in multinomial data with a large number of categories. Annals of Statistics. 2018; 46(5): 2020–2044.
- 11. Chen H. Sequential change-point detection based on nearest neighbors. The Annals of Statistics. 2019; 47(3): 1381–1407.
- 12. Liu B, Zhou C, Zhang X, Liu Y. A unified data-adaptive framework for high dimensional change point detection. Journal of the Royal Statistical Society: Series B (Statistical Methodology). 2020; 82(4): 933–963.
- 13. Höhle M. Online change-point detection in categorical time series, In: Thomas K, Gerhard T, Statistical modelling and regression structures. Heidelberg: Physica-Verlag. 2010; pp. 377–397.
- 14. Plasse J, Adams NM. Multiple change point detection in categorical data streams. Statistics and Computing. 2019; 29: 1109–1125.
- 15. Fokianos K, Kedem B. Regression model for time series analysis. Hoboken: John Wiley & Sons. 2002.
- 16. Xia Z, Guo P, Zhao W. Monitoring Structural Changes in Generalized Linear Models. Communication in Statistics- Theory & Methods. 2009; 38(11): 1927–1947.
- 17. Hudecová Š. Structural changes in autoregressive models for binary time series. Journal of Statistical Planning and Inference. 2013; 143(10): 1744–1752.
- 18. Fokianos K, Gombay E, Hussein A. Retrospective change detection for binary time series models. Journal of Statistical Planning and Inference. 2014; 145: 102–112.
- 19. Gombay E, Li F, Yu H. Retrospective change detection in categorical time series. Communications in Statistics-Simulation and Computation. 2017; 46 (14): 6831–6845.
- 20. Li F, Chen Z, Xiao Y. Sequential change-point detection in a multinomial logistic regression model. Open Mathematics, 2020; 18(1):807–819.
- 21. Gombay E, Serban D. Monitoring parameter change in AR(p) time series models. Journal of Multivariate Analysis. 2009; 100(4): 715–725.
- 22. Gombay E, Hussein AA, Steiner SH. Monitoring binary outcomes using risk-adjusted charts: a comparative study. Statistics in Medicine. 2011; 30(23): 2815–2826.
- 23. Fokianos K, Kedem B. Regression theory for categorical time series. Statistical Science. 2003; 18(3): 357–376.
- 24. Fokianos K, Kedem B. Prediction and classification of non-stationary categorical time series. Journal of Multivariate Analysis. 1998; 67: 277–296.
- 25. Fokianos K, Truquet L. On categorical time series models with covariates. Stochastic Process & Their Applications. 2019; 129(9): 3446–3462.
- 26. Truquet L. Coupling and perturbation techniques for categorical time series. Bernoulli. 2020; 26(4): 3249–3279.
- 27. Csörgö M, Révész P. Strong Approximations in Probability and Statistics. New York: Academic Press, 1981.
- 28. Robbins M, Gallagher C, Lund R, et al. Mean shift testing in correlated data. Journal of Time Series Analysis, 2011; 32(5): 498–511.
- 29. Vostrikova LY. Detection of Disorder in a Wiener Process. Theory of Probability and Its Applications. 1981; 26: 356–362.
- 30. Xu M, Zhong P S, Wang W. Detecting variance change-points for blocked time series and dependent panel data. Journal of Business & Economic Statistics. 2016, 34(2): 213–226.