Abstract
This study proposes a control chart that monitors conditionally heteroscedastic time series by integrating the Huber support vector regression (HSVR) and the one-class classification (OCC) method. For this task, we consider the model that incorporates nonlinearity into the generalized autoregressive conditionally heteroscedastic (GARCH) time series, named HSVR-GARCH, to robustly estimate the conditional volatility when the structure of the time series is not specified with parameters. Using the squared residuals, we construct an OCC-based control chart that, unlike previous studies, does not require any posterior modifications of the residuals. Monte Carlo simulations reveal that deploying squared residuals from the HSVR-GARCH model in control charts can be immensely beneficial when the underlying model becomes more complicated and contaminated with noise. Moreover, a real data analysis of the Nasdaq composite index and Korea Composite Stock Price Index (KOSPI) datasets further demonstrates the validity of using the bootstrap method in constructing control charts.
Citation: Kim CK, Yoon MH, Lee S (2024) Robust control chart for nonlinear conditionally heteroscedastic time series based on Huber support vector regression. PLoS ONE 19(2): e0299120. https://doi.org/10.1371/journal.pone.0299120
Editor: Petre Caraiani, Institute for Economic Forecasting, Romanian Academy, ROMANIA
Received: October 24, 2023; Accepted: February 5, 2024; Published: February 23, 2024
Copyright: © 2024 Kim et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All data files are available from https://www.kaggle.com/datasets/jpgrlee/dataset-for-robust-control-chart-for-time-series.
Funding: This research is supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT and Future Planning (No. 2021R1A2C1004009). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
1 Introduction
In this study, we develop a statistical process control (SPC) method for a nonlinear generalized autoregressive conditionally heteroscedastic (GARCH) time series whose underlying structure is unspecified and possibly contaminated with noise. SPC has been developed during the past decades to improve the quality of products through the monitoring and control of manufacturing processes [1]. Generally, control charts are appreciated as predominant devices that utilize graphical descriptions of the characteristics of a process, designed to promptly signal the out-of-control state of processes. Conventional charts were originally built for independent or uncorrelated observations, but they do not work adequately when observations are correlated over time, as the correlations cause a high rate of false alarms; see [2–7]. As a remedy, control charts that monitor dependent observations have been proposed in the literature, see [8–11], and further refer to [12] for more recent developments.
In practice, however, constructing control charts to monitor GARCH-type time series can be quite challenging, especially when financial time series are targeted. This is so because in financial time series, finding an elongated period of in-control observations is often infeasible, as the time series occasionally suffers from structural instabilities attributed to external socioeconomic factors. This implies that financial time series can become nonstationary if observed for an extended amount of time, and it is difficult to single out a specific model due to the unpredictability of such economic affairs. Although variants of parametric GARCH models have been devised to estimate time-varying volatilities, they commonly violate the stationarity assumptions needed for precisely estimating model parameters [13]. In practice, accurately calculating residuals based on the presumed models is critical as they are directly harnessed in the construction of the control charts, and therefore, the misspecification of underlying models can be a serious issue in designing control charts. To alleviate this problem, practitioners have developed nonparametric machine learning-based control charts. For an overview, we refer to [14, 15]. Among those, we pay special attention to one-class classification (OCC)-based control charts, which are designed to identify the boundary of a specific class among all data points via learning from a training dataset. For relevant works, we refer to [16–18].
To design a flexible model that meticulously captures the structure of the conditional volatility while avoiding restrictive model assumptions, we root our method in the support vector regression (SVR), a variant of the support vector machine (SVM) adapted for regression. SVR can approximate nonlinear relationships between variables without knowing their underlying dynamic structure a priori, while concurrently implementing the structural risk minimization principle [19] that balances the model complexity against the empirical risk [20]; hence, SVR has been accepted as a promising tool to estimate the volatility of time series. For a general overview of SVR, we refer to [21–23], and further refer to [24–26] for incorporating SVR in estimating time series models. Specifically, [27, 28] used the extended GARCH model, named SVR-GARCH, to obtain residuals and construct both retrospective change point tests and monitoring methods.
However, as GARCH-type time series inherently suffer from instability because observations can become wildly explosive, monitoring methods that accommodate variants of parametric- or SVR-GARCH models are often inefficient for constructing a reliable test. For instance, [29] fitted an AR(p) model to the obtained residuals and truncated excessively large residuals to prevent them from undermining the Type I error rate of the test. This phenomenon arises primarily because standard SVRs can still be susceptible to outliers in the training dataset [30], which generally makes the residuals behave in a more correlated manner while rendering some of them excessively larger than the others. This deficiency can be severely detrimental to the accuracy of monitoring, which requires "well-behaved" in-control observations or residuals as a primary ingredient of time series monitoring.
To circumvent the problem, we integrate a robust variant of SVR to estimate nonlinear GARCH-type time series models. Among the robust SVR variants in the literature, such as [31, 32], we specifically employ the Huber SVR (HSVR) of [33] when estimating nonlinear GARCH models. HSVR not only allows one to design more flexible models by incorporating asymmetry, but is also suitable for efficiently fitting GARCH models as it suppresses the effects of explosive observations. We therefore adopt the nonlinear GARCH model using HSVR, called the HSVR-GARCH model, to accommodate a broad class of nonlinear GARCH models.
Moreover, to facilitate the control chart to effectively reflect the sophisticated structure of time series, we propose a control chart that harnesses a nonparametric data description method. In particular, we utilize the support vector data description (SVDD) method, a version of SVM reconstructed to solve OCC problems [34, 35]. SVDD provides a hypersphere boundary on the data points, aimed to contain the desired proportion of observations within a hypersphere of minimum radius, and detects data points outside the hypersphere boundary as anomalies. This approach has been taken by several authors and proved to be useful in practice [16, 36–39]. More recently, [40] considered a control chart based on a variant of the one-class SVM that detects mean shifts of a given process well. Herein, we utilize squared residuals to train SVDD and determine the decision boundary of the in-control time series as a control limit.
The rest of this paper is organized as follows. Section 2 summarizes two variants of SVR, namely, HSVR and SVDD. Section 3.1 elaborates the specification and fitting of the HSVR-GARCH model along with its advantages, while Section 3.2 enumerates the procedure to train the OCC-based control chart using SVDD. Section 4 conducts Monte Carlo simulations to evaluate the performance of control charts that utilize HSVR-GARCH residuals in various circumstances, including the case that observations contain innovational or additive outliers. Our simulation experiments also evaluate the performance of control charts with HSVR-GARCH residuals when they are trained with samples generated through a bootstrap method. Section 5 performs a real data analysis using the Nasdaq composite index and the Korea Composite Stock Price Index (KOSPI) to further solidify the validity of using bootstrap samples obtained from HSVR-GARCH models. Finally, Section 6 provides the concluding remarks.
2 Machine learning methods
This section introduces two machine learning methods, namely, the HSVR-GARCH model and the support vector data description (SVDD) method that comprise the control chart for monitoring nonlinear GARCH-type time series. We first provide a concise overview of these learning methods and then elaborate the procedure to apply those in constructing our monitoring schemes in the subsequent sections.
2.1 Huber SVR
Support vector regression (SVR) is a statistical learning method that originated from [41], which aims to precisely estimate the structure of a given dataset nonparametrically. Unlike many traditional parametric time series models, the strength of estimating time series models using SVR arises from its innate versatility to model datasets whose underlying structure is unknown or contains a severe degree of nonlinearity. This strength is particularly beneficial when estimating the structure of time series with heteroscedastic volatility, such as financial indices, as conditional volatility is unobservable in general and can be complex due to the interdependent nature of the current global economy. Moreover, utilizing SVR to estimate the structure of the time series can help avoid deploying a sophisticated parametric model, such as the family of exponential GARCH (EGARCH) models [42].
In this study, we utilize the Huber SVR (HSVR) when modeling nonlinear GARCH models to escalate the robustness. HSVR is advantageous over the standard SVR as it can effectively offset the effects of sporadic outliers, frequently observed in time series. Although many versions of HSVR exist in the literature, we utilize the ϵ-insensitive HSVR proposed by [33, 43].
Given a set of training observations {(xi, yi)} (i = 1, …, n), HSVR seeks to find a function of the form:

f(x) = K1(x, AT)w + b, (1)

where K1(x, AT) ≔ (K1(x, x1), …, K1(x, xn)), A is a matrix whose j-th row consists of xjT (j = 1, …, n), w = (w1, …, wn)T and b are parameters to be estimated, and K1(x1, x2) = 〈ϕ1(x1), ϕ1(x2)〉 for some predefined kernel K1. Here, 〈⋅, ⋅〉 and ϕ1(x) respectively denote the inner product of the feature space and the implicit kernel operator corresponding to the kernel K1.
At first glance, (1) seems to deviate from the function form of the standard SVR, which is given as

f(x) = 〈w, ϕ(x)〉 + b, (2)

where K and ϕ are defined analogously to K1 and ϕ1, respectively. However, we can see that (1) and (2) are actually identical, which implies that HSVR can be viewed as a direct extension of the standard SVR. This is because, by the definition of K1(x, AT), (1) can also be rewritten as

f(x) = Σj=1n wj K1(x, xj) + b = 〈Σj=1n wj ϕ1(xj), ϕ1(x)〉 + b,

where wj is the j-th element of the vector w. Therefore, analogous to SVR, estimating (1) using HSVR can be understood as implicitly performing the following two-stage procedure: (i) first map the given observations (x1, ⋯, xn) onto some feature space using the prescribed mapping function ϕ1 to produce (z1, ⋯, zn) ≔ (ϕ1(x1), ⋯, ϕ1(xn)); (ii) then estimate the linear function on that feature space that best describes the mapped observations zi. For more details, we refer to [20, 28] for the standard SVR and [33] for HSVR.
To estimate the model parameters w and b in (1), we solve

min over w, b of (1/2)(‖w‖2 + b2) + C1 Σi=1n L(yi − f(xi)), (3)

where C1 is a tuning parameter that balances between the model risk and the flatness of the estimated function, and L is an ϵ-insensitive Huber loss function defined as follows:

L(x) = ((|x| − ϵ)+)2/2 if |x| ≤ γ, and L(x) = (γ − ϵ)(|x| − ϵ) − (γ − ϵ)2/2 otherwise, (4)

where x+ ≔ max(x, 0) for x ∈ ℝ, and ϵ, γ ≥ 0 are tuning parameters that determine the degree of robustness; see Fig 1 for an illustration.
As depicted in Fig 1, the ϵ-insensitive Huber loss function returns zero when x ∈ [−ϵ, ϵ], increases quadratically in (−γ, −ϵ] and (ϵ, γ], then increases linearly otherwise. Moreover, from a statistical perspective, the optimization problem (3) attains the shrinkage estimator of both w and b rather than only w, as seen in traditional SVR models. This alteration is employed to derive a unique solution of w and b in solving (3) while simultaneously maintaining the strong convexity; refer to [44].
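As a concrete check of this piecewise definition, the loss can be sketched in a few lines of Python; the continuity-matching constants of the quadratic and linear pieces are our reconstruction, and the function name is ours:

```python
def eps_huber_loss(x, eps=0.1, gamma=1.0):
    """epsilon-insensitive Huber loss: zero on [-eps, eps], quadratic on
    (eps, gamma], linear beyond gamma, with pieces joined continuously."""
    a = abs(x)
    if a <= eps:
        return 0.0
    if a <= gamma:
        return 0.5 * (a - eps) ** 2
    # linear piece with matching value and slope at |x| = gamma
    return (gamma - eps) * (a - eps) - 0.5 * (gamma - eps) ** 2
```

The quadratic piece limits the influence of moderate errors, while the linear tail keeps the penalty of explosive residuals bounded in slope, which is the source of the robustness discussed above.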
Despite the similarities between HSVR and SVR, the former deviates from the latter in the way of solving the given optimization problem. Indeed, HSVR directly solves the primal problem by taking an iterative approach. To attain the solution, we first explicitly expand (3) by substituting (4), then reparametrize the problem with respect to z ≔ (wT, b)T:

min over z of (1/2)‖z‖2 + C1 Σi=1n L(yi − (Bz)i), (5)

where y = (y1, …, yn)T, B = [K1 1n], K1 is an n × n matrix whose (i, j)-th element is K1,ij = K1(xi, xj), A is a matrix of explanatory variables whose i-th row is defined as xiT, and 1n is a vector of ones. As the objective function is strongly convex with respect to z, we attain the optimality by solving the first-order condition, that is,

z = C1 BT L′(y − Bz),

where the derivative L′ is applied elementwise. Therefore, (5) reduces to the following iterative method that computes the updated solution zi+1 from the previous iteration zi:

zi+1 = C1 BT L′(y − B zi)

for i = 0, 1, …. The general process of using the aforementioned HSVR for GARCH models is presented in Section 3.
2.2 Support vector data description algorithm
To formulate the control chart that integrates the one-class classification algorithm, we use the support vector data description (SVDD) model in this study, which combines the support vector machine (SVM) with the data description algorithm proposed by [35, 45]. SVDD is a method to find a boundary that encompasses given observations with a single hypersphere in the feature space. Because the proportion of observations inside the hypersphere can be settled in advance, we can readily employ SVDD as an anomaly detection method. For a general overview, we refer to [36].
The fundamental concept of SVDD stems from constructing a hypersphere of minimum possible volume that includes the desired number of observations within its boundary. Namely, given the training dataset {xi} (i = 1, …, n), SVDD solves the following optimization problem with respect to the center a and the radius R ≥ 0 of the hypersphere:

min over R, a, ξ of R2 + C2 Σi=1n ξi subject to ‖ϕ2(xi) − a‖2 ≤ R2 + ξi, ξi ≥ 0 (i = 1, …, n), (6)

where ξi (i = 1, ⋯, n) denotes a slack variable that penalizes observations lying outside the hypersphere, C2 ≥ 0 is a tuning parameter which controls the balance between the misclassification errors and the volume of the hypersphere, and ϕ2(x) is an implicit kernel operator that is defined analogously to ϕ1 in Section 2.1, namely, K2(x1, x2) = 〈ϕ2(x1), ϕ2(x2)〉.
In practice, the following Gaussian kernel function is widely selected:

K2(x1, x2) = exp(−‖x1 − x2‖2/κ), (7)

where κ > 0 is a tuning parameter that determines the complexity of the decision boundary; the decision boundary becomes more nonlinear and sophisticated when κ is small.
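A minimal sketch of the Gaussian kernel matrix in Python; the exp(−‖x1 − x2‖2/κ) parameterization follows (7), and the function name is ours:

```python
import numpy as np

def gaussian_kernel(X1, X2, kappa=1.0):
    """Gaussian kernel matrix K[i, j] = exp(-||x1_i - x2_j||^2 / kappa).
    Smaller kappa gives a more localized, i.e. more nonlinear, kernel."""
    X1, X2 = np.atleast_2d(X1), np.atleast_2d(X2)
    sq_dist = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dist / kappa)
```

Note that shrinking kappa drives off-diagonal entries toward zero, which is why small κ yields a more sophisticated decision boundary.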
The general procedure of solving (6) is similar to that of the classical SVM. To elaborate, we first construct the unconstrained problem as below:

L(R, a, ξ; α, γ) = R2 + C2 Σi ξi − Σi αi(R2 + ξi − ‖ϕ2(xi) − a‖2) − Σi γi ξi, (8)

where αi ≥ 0 and γi ≥ 0 (i = 1, …, n) are Lagrange multipliers. Then, by using the Karush-Kuhn-Tucker conditions, we obtain the following equivalent dual problem with respect to αi:

max over α of Σi αi K2(xi, xi) − Σi Σj αi αj K2(xi, xj) subject to 0 ≤ αi ≤ C2, Σi αi = 1. (9)

Notice that γi vanishes due to the relationship αi = C2 − γi (i = 1, …, n), which is obtainable using the first-order conditions. Using the optimal αi of the problem (9), the radius R2 can be calculated by the formula

R2 = K2(xsv, xsv) − 2 Σi αi K2(xi, xsv) + Σi Σj αi αj K2(xi, xj),

where xsv denotes one of the observations precisely located on the boundary of the constructed hypersphere. Similarly, we measure the kernel distance between a new observation z and the center a by computing

D2 = K2(z, z) − 2 Σi αi K2(xi, z) + Σi Σj αi αj K2(xi, xj), (10)

and z is classified as an anomaly when D2 is larger than R2.
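The distance computation in (10) can be sketched as follows, assuming the kernel is given as a plain two-argument function (all names are ours); for the Gaussian kernel, K2(z, z) = 1:

```python
import numpy as np

def svdd_distance_sq(z, X_train, alpha, kernel):
    """Kernel distance D^2 = K(z, z) - 2 * sum_i alpha_i K(x_i, z)
    + sum_{i,j} alpha_i alpha_j K(x_i, x_j) from z to the SVDD center."""
    k_zz = kernel(z, z)
    k_xz = np.array([kernel(x, z) for x in X_train])
    K_xx = np.array([[kernel(a, b) for b in X_train] for a in X_train])
    return k_zz - 2.0 * alpha @ k_xz + alpha @ K_xx @ alpha
```

A point is then flagged as an anomaly whenever this quantity exceeds the squared radius R2 computed from a support vector.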
A primary reason why SVDD can be deployed in constructing a control chart is the usage of kernels, which enables practitioners to design a tailored control chart suitable to their needs. Indeed, the control limit of the chart can be determined by the tuning parameter C2 in (6), as it can be exploited to precisely specify the rate of the misclassified datasets. To illustrate, by adjusting C2, we can reset the upper bound ρ > 0 of the misclassification error rate through the following relationship:

C2 = 1/(nρ), (11)

and it is known that ρ matches the misclassification rate when the number of observations is sufficiently large and the distribution of the dataset is continuous [45]. Control charts using SVDD can be remarkably favorable when the underlying structure of the dataset is sophisticated, as SVDD seeks the boundary in a higher-dimensional feature space that can be further adjusted using κ in (7). Therefore, tuning C2 is analogous to adjusting the control limit of control charts, see Sections 3 and 4 for more details.
3 OCC-based monitoring scheme
In this section, we summarize the general procedure of constructing the OCC-based control chart. The procedure is divided into two parts: first fitting the HSVR-GARCH model to obtain residuals, then deploying SVDD to determine the control limit.
3.1 HSVR-GARCH model
We extend the GARCH model to capture the nonlinear structure of the conditional volatility, specified as follows:

yt = σt ϵt, σt2 = f(yt−1, σt−12), (12)

where f is an unknown nonlinear function with the form in (1), which is obtained by solving (3), σt2 is the conditional volatility at time t = 1, …, n, and ϵt are independent and identically distributed (iid) errors with E(ϵ1) = 0 and Var(ϵ1) = 1. Observe that, as σt2 is a (conditional) variance, both σt2 and σt must be nonnegative by definition.
Model (12) includes a broad class of stationary GARCH time series models. For a class of parametric GARCH models, [46] studied the estimation of their parameters and presented a method of conducting the change point test as an application. Their research later offered a theoretical foundation for the OCC control charts of [47].
Although it seems plausible to directly deploy HSVR to obtain the function f in (12), this is actually impractical because HSVR does not automatically guarantee the conditional variance to be nonnegative. To rectify the problem, we modify the above model by taking the logarithm to ensure the nonnegativity of σt2 as follows:

yt = σt ϵt, log σt2 = g(yt−1, log σt−12). (13)
As σt2 is not observable in most practical circumstances, when estimating g in (13) we substitute σt2 in (12) with the moving-average proxy

σ̂t2 ≔ (1/s) Σi=1s yt−i2, (14)

for t = 1, …, n, where s ≥ 1 denotes a degree of smoothness. When t < s, (14) is defined as the average of all available past squared observations. The proxy σ̂t2 is one of the most widely accepted proxies of σt2 and can be readily deployed when predicting the conditional volatility [48] and constructing monitoring schemes [28]. In this study, we set s = 20, as it renders σ̂t2 to be stabilized around the level of the unconditional variance, thereby enhancing the quality of the residuals utilized in the formulation of control charts.
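The proxy in (14) amounts to a one-sided moving average of squared observations; a minimal sketch follows, where the handling of the initial t < s stretch (averaging all available past values) is our assumption and the function name is ours:

```python
import numpy as np

def volatility_proxy(y, s=20):
    """Moving-average proxy sigma2_hat[t] = (1/s) * sum_{i=1..s} y[t-i]^2,
    averaging over all available past values while t < s."""
    y = np.asarray(y, dtype=float)
    out = np.full(len(y), np.nan)  # no past value exists at t = 0
    for t in range(1, len(y)):
        window = y[max(0, t - s):t] ** 2
        out[t] = window.mean()
    return out
```

With s = 20 the proxy smooths out single extreme observations, which is why it hovers near the unconditional variance level.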
Another repercussion of the unknown σt2 is that an adequate reference, purposed to compare with the estimated volatility for computing the empirical loss, is absent when aiming to determine the optimal set of tuning parameters, as the conditional mean of yt is always estimated to be zero for pure GARCH-type models. This definitely prohibits practitioners from using standard loss functions, such as the mean absolute error (MAE), when performing the tuning parameter optimization. We overcome this problem by introducing a likelihood-based loss ℓ(g) for a function g with a regularization term regarding w. Specifically, by temporarily assuming that the error term ϵt is normally distributed, we consider the negative log-likelihood of g in (13) as follows:

ℓ(g) = (1/n) Σt=1n { gt + yt2/exp(gt) },

with gt ≔ g(yt−1, log σ̂t−12). Then, by appending the regularization term regarding w to prevent overfitting, we obtain the following loss function [49]:

ℓδ(g) = ℓ(g) + δ‖w‖2, (15)

where δ ≥ 0 is a tuning parameter that balances between the model's flatness and the goodness of fit. This approach is conceptually similar to that of the quasi maximum-likelihood estimator (QMLE), which is the standard method of estimation in parametric GARCH models [13].
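A sketch of this likelihood-based loss, dropping additive and multiplicative constants that do not affect the minimizer (function name is ours):

```python
import numpy as np

def likelihood_loss(g_vals, y, w, delta=0.1):
    """Gaussian negative log-likelihood of a log-variance fit g_t,
    with sigma2_t = exp(g_t), plus a ridge penalty delta * ||w||^2."""
    g_vals = np.asarray(g_vals, dtype=float)
    y = np.asarray(y, dtype=float)
    nll = np.mean(g_vals + y ** 2 / np.exp(g_vals))
    return nll + delta * np.dot(w, w)
```

For a single observation the unpenalized term g + y2 exp(−g) is minimized at g = log y2, so the loss rewards fits whose exponentiated output tracks the squared observations, exactly the QMLE-style behavior described above.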
The general procedure of fitting a HSVR-GARCH model is based on the sequential k-fold cross-validation method for time series. To elaborate, we follow the steps below:
- Step 1. Partition the time series (xt, yt) into k ≥ 2 chunks, and denote the time series of the m-th chunk as Ym (m = 1, ⋯, k). Moreover, let nm be the number of observations in chunk m.
- Step 2. Use the first m − 1 chunks to obtain the estimated function ĝ and the estimated conditional volatility in (13).
- Step 3. Using the m-th chunk, compute the loss in (15) for the m-th chunk and denote the result as ℓδ(m).
- Step 4. Repeat Steps 2 and 3 for m = 2, ⋯, k.
- Step 5. The loss function that determines the optimal tuning parameters is finally defined as the sum of ℓδ(m) over m = 2, ⋯, k, and the best set of tuning parameters is then set to those that minimize this sum.
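The expanding-window splits behind Steps 1–4 can be sketched as index ranges; equal-sized chunks and the function name are our assumptions:

```python
def sequential_cv_splits(n, k):
    """Expanding-window splits for sequential k-fold CV: fold m trains on
    chunks 1..m-1 and validates on chunk m (m = 2..k)."""
    bounds = [round(i * n / k) for i in range(k + 1)]  # chunk boundaries
    splits = []
    for m in range(2, k + 1):
        train_idx = range(0, bounds[m - 1])
        valid_idx = range(bounds[m - 1], bounds[m])
        splits.append((train_idx, valid_idx))
    return splits
```

Unlike ordinary k-fold cross-validation, every validation chunk lies strictly after its training data, so the temporal ordering of the series is never violated.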
This process of obtaining σ̂t2 via HSVR-GARCH models is encapsulated in Algorithm 1 below.
The set of tuning parameters for the HSVR-GARCH model consists of four elements: (i) the regularization parameter C1 in (3); (ii) the two parameters ϵ and γ that comprise the ϵ-insensitive Huber loss function; (iii) the kernel tuning parameter s2 for the Gaussian kernel in Section 2.1,

K1(x1, x2) = exp(−‖x1 − x2‖2/s2), (16)

and (iv) the regularization parameter δ for the likelihood-based loss (15). In this study, we initially fix δ, then seek the remaining tuning parameters that minimize the cross-validation loss via particle swarm optimization. The specifications regarding the tuning parameters are presented in Sections 4 and 5 below.
One of the prominent advantages of the HSVR-GARCH model over the SVR-GARCH model of [27] is that it significantly stabilizes the estimated volatility, thus freeing practitioners from any posterior treatment of residuals when constructing the control chart. Due to the unpredictable nature of GARCH models, conditional volatility is prone to abrupt anomalies, even when no structural breaks exist. As such, one of the innate drawbacks of the SVR-GARCH model is that it occasionally returns an incorrectly estimated conditional variance, as witnessed in Fig 2, which results in the underestimation of σt2 in regions where the time series is relatively stable. This may prevent practitioners from directly utilizing the residuals based on the SVR-GARCH model because some residuals are computed to be explosively large. This phenomenon might have occurred because the SVR-GARCH method optimizes tuning parameters with a mean absolute error-type loss that compares the estimated volatility σ̃t2 with the proxy σ̂t2 in (14):

(1/(m − k)) Σt=k+1m |σ̃t2 − σ̂t2|,

where k and m respectively denote the length of the training series and the total length of the time series. In fact, Fig 2 depicts that σ̃t2 estimated from the SVR-GARCH model does not heavily depart from σ̂t2 and fails to resemble the true σt2 on some occasions. On the other hand, by employing the likelihood-based loss in (15) with an optimal set of tuning parameters, the HSVR-GARCH method avoids comparing its volatility estimates against the proxy variables that were already deployed in obtaining them. This alteration substantially stabilizes the estimation process, producing estimates that more closely resemble the true σt2, as reflected in Fig 2, and thereby results in well-behaved residuals.
Fig 2. Plot of the estimated conditional volatility obtained from fitting the HSVR-GARCH model with s = 20 (top) and SVR-GARCH models with s = 5 (center) and s = 20 (bottom), when the underlying model is GARCH(1,1) in (17) with (ω, α, β) = (0.1, 0.1, 0.8). Orange and black lines respectively denote the estimated and true volatility, while the translucent blue line depicts the proxy σ̂t2 with the specified s.
Algorithm 1 HSVR-GARCH
Input: A time series {(xt, yt)}t=1,…,n, a kernel K1, a collection of tuning parameter candidates Θ
Output: A conditional volatility estimator σ̂t2
1: Set chunk boundaries 1 ≤ n1 < ⋯ < nk ≡ n with n0 ≡ 0
2: Ym ← {(xt, yt) : nm−1 < t ≤ nm} (m = 1, …, k)
3: for each θ ∈ Θ do
4:  for 2 ≤ m ≤ k do
5:   Ytrain ← Y1 ∪ ⋯ ∪ Ym−1
6:   With Ytrain, iteratively solve (5) to obtain wθ and bθ
7:   ĝθ(x) ← K1(x, AT)wθ + bθ
8:   With Ym and ĝθ, compute the loss ℓδ(m) in (15)
9:  end for
10:  CV(θ) ← Σm=2k ℓδ(m)
11: end for
12: θ* ← argminθ∈Θ CV(θ)
13: With θ*, refit on the full time series to obtain ĝθ* and σ̂t2 ← exp(ĝθ*(xt))
return σ̂t2
3.2 Constructing OCC-based control chart
In this study, the OCC-based control chart refers to the control chart that exploits the property of ρ ∈ (0, 1) in (11), mentioned in Section 2.2, to prescribe a desired level of the control limit. Given model residuals ϵ̂t ≔ yt/σ̂t, where σ̂t is estimated by any suitable estimation method, and a set of tuning parameters (ρ, κ), we fit SVDD with the Gaussian kernel presented in (16) to the squared residuals ϵ̂t2 obtained from the training time series, and regard R2 as the control limit. Moreover, when a new observation yn+k (k ≥ 1) arrives, we obtain its squared residual ϵ̂n+k2, measure the distance Dn+k2 defined in (10) from the center of the hypersphere for each k, and declare the process out-of-control if it exceeds R2.
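The online monitoring step then reduces to a run-length computation over the stream of kernel distances; a minimal sketch (names are ours):

```python
def run_length(d_sq_stream, r_sq):
    """First time index (1-based) at which the kernel distance D^2
    exceeds the control limit R^2; None if no alarm is raised."""
    for t, d_sq in enumerate(d_sq_stream, start=1):
        if d_sq > r_sq:
            return t
    return None
```

Averaging this quantity over many in-control streams yields ARL0, and over streams containing a structural change yields ARL1, the two performance measures used in Section 4.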
The control limit is determined by leveraging two tuning parameters, namely, the inclusion rate ρ and the kernel tuning parameter κ, which settle the in-control average run length (ARL) to a desired level. To find the optimal tuning parameters, we use a method that hybridizes the standard grid searching method and quasi-Newton methods, such as the limited-memory BFGS algorithm [50]. This is because solely relying on the latter method or other first-order optimization methods may render tuning parameters not to converge when the provided length of the training time series is insufficient.
Although the general procedure to construct the OCC-based control chart in this study resembles that of [47], our OCC-based control chart is designed to preserve the original structure of the given time series. In implementation, the OCC charts of [47] must sequentially apply the Yeo-Johnson transformation and moving-average smoothing to the residuals to alleviate the instability caused by inaccurate estimation of the model parameters. Despite these efforts, those OCC charts had relatively inferior performance in terms of ARL1 compared to other charts, such as CUSUM or EWMA charts, when monitoring the conditional volatility using squared residuals; see Section 4 of [47]. Furthermore, as our simulation study indicates, smoothing residuals undermines the overall detection ability when the underlying model is contaminated with noise; see Section 4 of this paper for more detail. We suspect that such posterior manipulations of the residuals cause this phenomenon by distorting the information that the time series originally contained. In contrast, as our OCC-based chart directly utilizes the squared residuals without any alterations, it not only allows the model residuals to retain the original structure, but also enhances the detection ability, as witnessed in the simulation results in Section 4.
The gist of constructing the OCC-based control chart, given a time series and some adequate models, is summarized in Algorithms 2 and 3 below. Specifically, Algorithm 2 describes the procedure of implementing SVDD for building an OCC-based control chart, given two tuning parameters ρ and κ. Meanwhile, Algorithm 3 provides an instruction to formulate OCC-based control charts using SVDD.
Remark 1 Under the circumstance in which obtaining copies of in-control time series is infeasible, we can empirically optimize the tuning parameters by performing a wild-bootstrap method as specified below:
- Estimate ĝ from the training time series y1, …, ym, then recursively compute σ̂t2 using (13). When parametric models are employed, we instead estimate the model parameters;
- Generate an iid sequence of standard normal random variables of length n, namely, ϵt*(b), t = 1, …, n, b = 1, …, B, and construct a sequence of bootstrap samples yt*(b) ≔ σ̂t ϵt*(b);
- Obtain the residuals of the bootstrap samples by computing ϵ̂t*(b) ≔ yt*(b)/σ̂t*(b), where σ̂t*(b) is recursively computed from the bootstrap samples using (13);
- Based on the squared bootstrap residuals, fit SVDD and adjust its tuning parameters accordingly to obtain the control limit R2 that sets the in-control ARL to the desired level.
The validity of this method is discussed in the simulation experiments in Section 4. Also, this method is adopted when analyzing financial time series in Section 5, where only a small amount of training sample is available.
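The resampling step of the wild bootstrap above can be sketched as follows; this covers only the generation of bootstrap series from fitted volatilities, not the recursive re-estimation of the bootstrap volatilities, and the function name and seeding are ours:

```python
import numpy as np

def wild_bootstrap_samples(sigma2_hat, B, seed=0):
    """Generate B bootstrap series y*_t = sigma_hat_t * eps*_t with
    iid standard normal multipliers, from the fitted volatilities."""
    rng = np.random.default_rng(seed)
    sigma = np.sqrt(np.asarray(sigma2_hat, dtype=float))
    return sigma[None, :] * rng.standard_normal((B, len(sigma)))
```

Each row is one bootstrap replicate of the in-control series, which then serves as a surrogate in-control stream for calibrating the control limit.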
Algorithm 2 SVDDb for OCC-based control charts
Input: A sequence of squared residuals {ϵ̂t2}t=1,…,n, a kernel K2, a vector of tuning parameters b ≔ (ρ, κ)
Output: Dual variables α = (α1, ⋯, αn), a control limit candidate R2
1: C2 ← 1/(nρ)
2: Given {ϵ̂t2} and b, solve the dual problem (9) to obtain α = (α1, ⋯, αn)
3: a ← Σt=1n αt ϕ2(ϵ̂t2)
4: R2 ← K2(xsv, xsv) − 2 Σt αt K2(ϵ̂t2, xsv) + Σs Σt αs αt K2(ϵ̂s2, ϵ̂t2) (xsv: a support vector)
return α, R2
Algorithm 3 OCC-based control chart for nonlinear GARCH models
Input: A training set {yt}t=1,…,n, an estimated model ĝ, N streams of in-control time series {ztj}t≥1 (j = 1, …, N), a kernel K2, s ≥ 1, a desired level of in-control ARL c*, a tolerance ϵtol > 0, a collection of tuning parameter candidates B for SVDD
Output: A control limit R2
1: for 1 ≤ t ≤ n do
2:  σ̂t2 ← exp(ĝ(yt−1, log σ̂t−12));
3:  ϵ̂t2 ← yt2/σ̂t2
4: end for
5: for each pair b = (ρ, κ) ∈ B do
6:  With {ϵ̂t2}t=1,…,n, run Algorithm 2 to obtain α and R2(b)
7:  for 1 ≤ j ≤ N do
8:   for t ≥ 1 do
9:    Compute the squared residual of ztj and its kernel distance Dt2 in (10)
10:    if Dt2 > R2(b) then
11:     RLj ← t; break
12:    end if
13:   end for
14:  end for
15:  ARL(b) ← N−1 Σj=1N RLj
16: end for
17: b* ← argminb∈B |ARL(b) − c*| subject to |ARL(b*) − c*| < ϵtol
return R2(b*)
4 Simulation results
This section assesses the performance of control charts that utilize HSVR residuals in various nonlinear GARCH models. The first subsection describes the settings of the experiment in more detail, and the subsequent subsection reports the performance measured in terms of the average run length (ARL).
4.1 Specifications of the experiment
For the experiment, we consider the cases where the underlying models are GJR-GARCH(p, q) [51] and log-GARCH(p, q) specified below:

GJR-GARCH(p, q): yt = σt ϵt, σt2 = ω + Σi=1p (α1,i + α2,i I(yt−i < 0)) yt−i2 + Σj=1q βj σt−j2;
log-GARCH(p, q): yt = σt ϵt, log σt2 = ω + Σi=1p αi log yt−i2 + Σj=1q βj log σt−j2,

where ω ≥ 0 and {ϵt} is an iid random process. Here, the log-GARCH(1,1) model is a variation of the exponential GARCH model (i.e., EGARCH(p, q); see [42, 52]), which has a high degree of nonlinearity. In this experiment, we set p = q = 1 with (ω, α1, α2, β) = (0.3, 0.1, 0.5, 0.3) for the GJR-GARCH model and (ω, α, β) = (0.3, 0.3, 0.3) for the log-GARCH model.
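For illustration, a GJR-GARCH(1,1) series yt = σt ϵt with σt2 = ω + (α1 + α2·1{yt−1 < 0}) yt−12 + β σt−12 can be simulated as follows; the standard GJR recursion is assumed, and the initialization at a rough stationary level is our choice:

```python
import numpy as np

def simulate_gjr_garch(n, omega=0.3, a1=0.1, a2=0.5, beta=0.3, seed=0):
    """GJR-GARCH(1,1): the leverage term a2 kicks in only after negative
    returns, producing the asymmetry targeted by the experiments."""
    rng = np.random.default_rng(seed)
    y = np.zeros(n)
    # rough stationary level omega / (1 - a1 - a2/2 - beta) as a start
    sigma2 = np.full(n, omega / (1.0 - a1 - a2 / 2.0 - beta))
    for t in range(1, n):
        sigma2[t] = omega + (a1 + a2 * (y[t - 1] < 0)) * y[t - 1] ** 2 \
                    + beta * sigma2[t - 1]
        y[t] = np.sqrt(sigma2[t]) * rng.standard_normal()
    return y, sigma2
```

Swapping `rng.standard_normal()` for draws from the heavy-tailed or mixture distributions below reproduces the contaminated scenarios.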
Moreover, to further reflect the circumstance where the underlying model is highly volatile and unstable, we additionally consider the cases where the initial distribution of ϵt in the training time series is specified as follows:
- Case 1. ϵt ∼ N(0, 1), the standard normal distribution;
- Case 2. ϵt ∼ t(5), t-distribution with 5 degrees of freedom;
- Case 3. ϵt ∼ Zk = 0.9X + 0.1Y, X ∼ N(0, 1), Y ∼ N(0, k2).
- Case 4. ϵt ∼ Zl, m = 0.9X + 0.1W, X ∼ N(0, 1), W ∼ N(l, m2).
The latter three distributions symbolize the scenarios in which the observed time series is either inherently heavy-tailed or occasionally unstable. In particular, the latter two distributions are variants of a normal mixture distribution that represents the underlying model being highly volatile due to the innovational outliers. Specifically, Zk and Zl, m respectively denote that the underlying structure of time series is systematically unstable and asymmetric, which are some traits that can be witnessed in various financial time series. In this study, we consider the case of k = 3 and (l, m) = (1, 2).
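Sampling from the contaminated distribution Zk, interpreted as the usual two-component normal mixture (our reading of the 0.9/0.1 weights; the function name is ours), can be sketched as:

```python
import numpy as np

def sample_zk(n, k=3, seed=0):
    """Draw n variates from Z_k: with probability 0.9 from N(0, 1),
    with probability 0.1 from N(0, k^2)."""
    rng = np.random.default_rng(seed)
    pick_clean = rng.random(n) < 0.9
    return np.where(pick_clean, rng.standard_normal(n),
                    k * rng.standard_normal(n))
```

With k = 3 the mixture has variance 0.9 + 0.1·9 = 1.8 and pronounced tails, mimicking the sporadic innovational outliers described above; the asymmetric case Zl,m replaces the second component with N(l, m2).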
In addition to considering innovational outliers, we also contaminate the observed time series with additive outliers in some experiments. Namely, we consider the case where we observe yt + ηt rather than directly observing yt, where ηt follows some prescribed distribution. For this experiment, we respectively set ηt analogous to Cases 1 and 4 for the GJR-GARCH model, and to Cases 3 and 4 for the log-GARCH model, when additive outliers are taken into consideration in the model.
When comparing the performance, we evaluate the ARLs of three control charts that use squared residuals, namely, the proposed OCC-based control chart in Section 3, the CUSUM chart, and the EWMA chart. Here, the CUSUM chart refers to the control chart that utilizes

Ct+ = max(0, Ct−1+ + (ϵ̂t2 − μ) − K),  Ct− = max(0, Ct−1− − (ϵ̂t2 − μ) − K),  C0+ = C0− = 0,

and signals a change when either statistic exceeds H, where K = kσ, H = hσ, σ denotes the standard deviation of the squared residuals ϵ̂t2 obtained from the training time series, and μ denotes the mean of ϵ̂t2. On the other hand, the EWMA chart denotes the control chart that uses

Zt = λϵ̂t2 + (1 − λ)Zt−1,

with the upper and lower control limits computed as

μ ± Leσ √(λ(1 − (1 − λ)2t)/(2 − λ)),

where λ ∈ (0, 1) is a smoothing parameter, and Z0 is defined to be the sample mean of ϵ̂t2. Notice that the control limits of the CUSUM and EWMA charts are respectively controlled by tuning k ≥ 0 and h ≥ 0, or Le ≥ 0, so that the control charts achieve the desired level of ARL0. For an overview of these control charts, we refer to [53–55].
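In their common two-sided textbook forms, the two residual-based charts can be sketched as below; the statistic definitions are the standard CUSUM and EWMA ones, and the argument names (e2 for the squared residuals, mu and sigma for their in-control mean and standard deviation) are our own.

```python
import numpy as np

def cusum_chart(e2, mu, sigma, k=0.5, h=5.0):
    """Two-sided CUSUM on the squared residuals e2, with reference value
    K = k * sigma and decision interval H = h * sigma.  Returns the index
    of the first signal, or None if the chart never signals."""
    K, H = k * sigma, h * sigma
    c_plus = c_minus = 0.0
    for t, x in enumerate(e2):
        c_plus = max(0.0, c_plus + (x - mu) - K)
        c_minus = max(0.0, c_minus - (x - mu) - K)
        if c_plus > H or c_minus > H:
            return t
    return None

def ewma_chart(e2, mu, sigma, lam=0.2, L=3.0):
    """EWMA chart Z_t = lam * e2_t + (1 - lam) * Z_{t-1} on the squared
    residuals, with Z_0 = mu and time-varying control limits
    mu +/- L * sigma * sqrt(lam * (1 - (1 - lam)^(2t)) / (2 - lam))."""
    z = mu
    for t, x in enumerate(e2, start=1):
        z = lam * x + (1.0 - lam) * z
        w = L * sigma * np.sqrt(lam * (1.0 - (1.0 - lam) ** (2 * t)) / (2.0 - lam))
        if not (mu - w < z < mu + w):
            return t - 1  # 0-based index of the signaling observation
    return None
```

In the experiments, k, h, and L (Le in the text) are tuned so that both charts attain the target ARL0 on in-control streams.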
Moreover, as we regard the underlying structure of the time series as unknown, we compare the detection ability of control charts using HSVR-GARCH against those that utilize the standard GARCH(1,1) model specified below:
yt = σtϵt,  σt2 = ω + αyt−12 + βσt−12,  (17)
where ω, α, and β are all nonnegative, and α + β < 1 is fulfilled to ensure stationarity. Notice that fitting GARCH(1,1) models to other conditionally heteroscedastic time series is customarily accepted in the presence of model uncertainty; see [56].
The general procedure of the experiment is summarized as follows. We first generate a time series of length n for training either the HSVR-GARCH model stated in Section 3 or the standard GARCH(1,1) model to obtain the residuals. We subsequently use the squared residuals to formulate the control chart, as the charts are solely designed to target a change of variance. We then sequentially generate 1,000 independent streams of time series without any structural changes in order to find adequate tuning parameters that fix ARL0 to the desired level, for example, the frequently used 370, though this number must be reduced significantly in a practical situation, as mentioned later in the real data analysis. To evaluate the detection ability, we generate another 1,000 independent streams of time series that include a structural change, then apply the control charts to obtain ARL1.
We set the length of the training time series as n = 2,000 and examine ARL1 of the control charts when some parameters, the innovational distribution of ϵt, or the magnitude of the additive noise experience a change, respectively. In particular, we specify the way in which the structural change occurs in Tables 1–8 below, along with the respective tuning parameters for each control chart used. When fitting HSVR-GARCH to time series, we initially fix δ = 0.1 and the number of chunks k to 5, then obtain the other tuning parameters by employing the particle swarm optimization method.
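The calibrate-then-evaluate loop described above can be illustrated with a deliberately simplified Shewhart-type chart on squared observations; the chart, the bisection search, and the variance shift are illustrative assumptions of ours, not the paper's actual charts or models.

```python
import numpy as np

def run_length(stream, threshold):
    """1-based index of the first observation exceeding the control limit,
    or len(stream) if the chart never signals (truncated run length)."""
    stream = np.asarray(stream)
    hits = np.nonzero(stream > threshold)[0]
    return int(hits[0]) + 1 if hits.size else len(stream)

def calibrate_limit(in_control, target_arl0=370.0, lo=0.0, hi=50.0, iters=30):
    """Bisect on the control limit so that the average (truncated) run
    length over the in-control streams matches the target ARL0; a larger
    limit yields fewer false alarms and hence a larger ARL0."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        arl = np.mean([run_length(s, mid) for s in in_control])
        lo, hi = (mid, hi) if arl < target_arl0 else (lo, mid)
    return 0.5 * (lo + hi)

rng = np.random.default_rng(0)
# 1,000 in-control streams of squared iid N(0, 1) observations
in_control = [rng.standard_normal(2000) ** 2 for _ in range(1000)]
limit = calibrate_limit(in_control)
# 1,000 out-of-control streams whose variance has doubled from the start
out_of_control = [2.0 * rng.standard_normal(2000) ** 2 for _ in range(1000)]
arl1 = np.mean([run_length(s, limit) for s in out_of_control])
```

The same loop applies unchanged when the streams are squared residuals from a fitted HSVR-GARCH or GARCH(1,1) model and the chart is OCC, CUSUM, or EWMA.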
Results of the first two columns directly compare our OCC-based control chart with that of [47].
“HSVR” and “GARCH” denote that the chart is constructed using residuals obtained from fitting the HSVR and GARCH(1,1) models, respectively.
4.2 Experiment results
In this subsection, we report the results of three experiments as follows. First, we compare the performance of our proposed OCC-based control chart with that of [47] by comparing control charts constructed from either the raw squared residuals or their posteriorly modified counterparts, where the latter denotes the residuals employed in [47]. For fairness, we consider the control charts constructed from HSVR-GARCH residuals for both methods to avoid the model misspecification issue in this particular experiment. Second, we compare the control charts constructed from HSVR-GARCH residuals against those constructed from the parametric GARCH(1,1) residuals. Finally, we present the performance of control charts constructed from wild-bootstrap samples introduced in Remark 1 of Section 3.2. For the latter experiment, we set the number of bootstrap samples as B = 1000.
Table 1 reports the comparison of OCC, CUSUM, and EWMA control charts that utilize the raw squared residuals against those that use the modified residuals of [47], where the underlying model is GJR-GARCH(1,1) with only innovational outliers of Z1,2 being present. Note that for OCC-based control charts, directly using the squared residuals yields significantly enhanced results compared to using the modified residuals. Moreover, in most circumstances, a small improvement is also observed regarding the detection ability of the latter two control charts. We speculate that the posterior modification diluted the information contained in the residuals, thereby undermining the overall detection ability. Indeed, as the HSVR-GARCH model robustly estimates the conditional volatility, the squared residuals are already stabilized, which allows us to circumvent heuristic modifications that struggle to suppress numerical instabilities. The result also implies that our proposed OCC-based control chart is superior to that of [47] in terms of the ARL1 performance.
Tables 2–4 depict the ARLs of control charts when the underlying model is the GJR-GARCH model, where the experiments of the latter two tables additionally contain additive outliers of Cases 1 and 4, respectively. The overall performance of charts with HSVR-GARCH residuals is relatively superior, especially when the dataset is contaminated with either innovational outliers or additive noises, although the detection ability of HSVR-GARCH-based control charts somewhat recedes when neither is present. Specifically, the CUSUM and EWMA charts with HSVR-GARCH residuals consistently detect a change 5-10 percent faster than their GARCH(1,1) counterparts. In particular, control charts with HSVR-GARCH residuals commonly exhibit stellar detection ability over their GARCH(1,1) residual counterparts when αj (j = 1, 2) experiences a change. This finding validates that HSVR-GARCH models are highly effective in capturing a structural change when the time series is systematically more convoluted.
Furthermore, unlike [47], our OCC-based chart is shown to compare well with the CUSUM and EWMA charts that use identical squared residuals in most cases, and to benefit the most from using HSVR-GARCH residuals among the three control charts. Most notably, all results regarding the GJR-GARCH model, excluding the case of no outliers, indicate a loss of detection ability of the OCC-based control chart when GARCH(1,1) residuals are used. This phenomenon most likely results from the ill-behaved residuals due to the model bias. Therefore, it is practically advised to jointly deploy the OCC-based control chart with the HSVR-GARCH model to avoid this model bias problem.
Tables 5–7 portray the ARL0 and ARL1 of control charts when the time series follows the log-GARCH model. Note that the latter two tables specifically depict the dataset contaminated with additive outliers of Cases 3 and 4, respectively. Here, HSVR-GARCH model-based control charts are observed to be more competent, as they detect a structural change up to twice as fast, especially when only innovational outliers are present. Indeed, the effect of model misspecification becomes more apparent as control charts with the parametric GARCH(1,1) residuals lose detection ability in some circumstances, such as the case of α changing from 0.3 to 0.5. Although the HSVR-GARCH residual-based charts do not outperform when the innovations become structurally heterogeneous, HSVR-GARCH residuals still remain a superior choice for OCC-based control charts, as those residuals significantly shorten ARL1 compared to the other two charts while still providing a stable chart.
In addition, unlike the results of the GJR-GARCH models, control charts in log-GARCH models without any external additive outliers are shown to be much more stable and immensely powerful, especially when ω or the innovational distribution of ϵt experiences a change. In particular, when either α or β increases to make the time series nearly nonstationary, most control charts with GARCH(1,1) residuals are observed to respond relatively poorly or even fail to respond. In contrast, control charts with HSVR-GARCH residuals not only successfully capture a change under those circumstances but also, on average, detect it twice as fast in the case of parameter changes. In a nutshell, these results all fortify the strength of constructing control charts with HSVR-GARCH residuals when the underlying model is nonlinear and possibly unknown.
Finally, Table 8 reports the performance of control charts when they are trained with samples generated by the wild-bootstrap method for both HSVR-GARCH and GARCH(1,1) models, when no outliers exist. Although some degree of performance drop is noticeable in certain cases, the three control charts with HSVR residuals retain strong detection ability across all settings. Bootstrap methods applied to HSVR-GARCH models can be advantageous particularly when ample samples of in-control time series are unavailable. Note, however, that the bootstrap method can be less effective, for example, when the observed time series is impaired with innovational outliers, as the bootstrap samples are generated from iid standard normal innovations. Therefore, in practice, bootstrap methods can be especially beneficial in real-world scenarios where the source of observations does not possess any serious structural noises.
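Since Remark 1 itself is not reproduced in this section, the resampling step can only be sketched from the remark that the bootstrap samples are generated from iid standard normal innovations; the function below is our reading of that idea, with sigma2_hat standing for the conditional variances fitted on the training series.

```python
import numpy as np

def wild_bootstrap_streams(sigma2_hat, B=1000, rng=None):
    """Generate B bootstrap copies of the in-control series by pairing the
    fitted conditional variances with fresh iid standard normal innovations:
    y_t^(b) = sigma_hat_t * z_t^(b),  z_t^(b) ~ N(0, 1)."""
    rng = rng or np.random.default_rng()
    sd = np.sqrt(np.asarray(sigma2_hat, dtype=float))
    return [sd * rng.standard_normal(sd.size) for _ in range(B)]
```

Control limits and tuning parameters can then be chosen so that the charts achieve the desired ARL0 over these B streams, which is how both the simulations with B = 1000 and the real data analysis proceed.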
5 Real data analysis
This section demonstrates the real-world performance of the control charts that utilize HSVR-GARCH residuals by analyzing financial indices, namely, the log-returns of the Nasdaq composite index and the Korea Composite Stock Price Index (KOSPI). We regard the indices from June 2, 2014 to October 31, 2017 (863 observations) for the former index and from March 2, 2013 to October 31, 2019 (1,512 observations) for the latter as the pre-observed training time series, and regard the subsequent time series as the one to be monitored afterwards. For simplicity, we denote the respective time series “Nasdaq” and “KOSPI”.
One of the prominent characteristics of both datasets is that their training sets contain a number of abrupt fluctuations and are thus intrinsically unstable. This behavior can also be examined through the indices illustrated in Figs 3 and 4 and Table 9, where the skewness and the excess kurtosis depart from those of a standard normal distribution. Meanwhile, Fig 5 illustrates that neither dataset contains significant autocorrelations up to lag 12, which validates fitting a pure GARCH-type time series model to these datasets. Despite the instability, however, the estimated conditional volatility of the HSVR-GARCH model in Fig 6 is observed to resemble the estimates of the GARCH(1,1) model, and the residuals obtained from the former model are revealed to be stable and relatively normally distributed. Moreover, the Ljung-Box test conducted on the HSVR-GARCH residuals indicates no significant autocorrelations up to lag 12 on both datasets at the nominal level of 0.05, with p-values of 0.0724 and 0.1118, respectively. Most notably, although the training set of Nasdaq seemingly contains a period of increased volatility, the Ljung-Box test does not reject the null for the residuals of Nasdaq, which is a strong indication that the HSVR-GARCH model can successfully capture the conditional volatility in these highly volatile real-world circumstances.
Orange and blue lines respectively denote the conditional volatility estimated via HSVR-GARCH and GARCH(1,1) model.
The gist of the procedure regarding the analysis is as follows. We fit the HSVR-GARCH model to the given dataset to obtain the estimated conditional volatility, then compute 1,000 independent copies of bootstrap samples as presented in Remark 1 of Section 3.2. Afterwards, we sequentially observe the indices by simultaneously using the OCC, CUSUM, and EWMA control charts with HSVR-GARCH residuals to monitor a structural change. The decision boundary of the OCC-based control chart and the tuning parameters necessary for the CUSUM and EWMA charts are all computed using bootstrap samples and optimized to have an ARL0 of 200, rather than 370, because it is highly infeasible for a financial time series to maintain its structural integrity for a long period of time due to various external or international socioeconomic affairs. This implies that it is generally ill-advised to set a large value of ARL0 when constructing control charts for real-world financial time series. Practitioners are therefore suggested to construct plural control charts to avoid a possibly biased result when monitoring financial time series. All remaining unmentioned settings required for constructing the control charts, such as δ in (15), are identical to those of Section 4.
Table 9, along with Figs 7 and 8, reports the detected location of the change for Nasdaq and KOSPI. For Nasdaq, the optimal sets of tuning parameters for the OCC, CUSUM, and EWMA charts are respectively computed as (ρ, κ) = (0.9995, 0.05), (k, h) = (0.5, 5.3), and (λ, Le) = (0.2, 3.4). Moreover, the detected locations of a change for all three charts appear to be similar, namely on February 2, 2018 for the OCC-based chart, and three days later for the CUSUM and EWMA charts.
Left and right plots respectively depict the results on the raw index and the log-returns of Nasdaq, while solid, dotted, and dashed lines respectively denote the location of a change for OCC-based, CUSUM, and EWMA charts.
The period of the potential structural break is around the commencement of the global trade dispute between the United States and China, which started in late January 2018. This incident can be considered a decisive factor triggering the abrupt upsurge of the conditional volatility, as both nations continuously imposed retaliatory tariffs over the following months. In particular, the OCC-based control chart signaled the change three days before the other two charts, illuminating that OCC-based control charts, combined with HSVR-GARCH residuals, can be functional in circumstances where promptly notifying a change is critical.
On the other hand, for KOSPI, the optimal sets of tuning parameters for the OCC, CUSUM, and EWMA charts are obtained as (ρ, κ) = (0.9995, 0.025), (k, h) = (0.5, 5.0), and (λ, Le) = (0.2, 3.1), respectively. Analogous to the preceding analysis, all three control charts detected a volatility change on January 28, 2020. Indeed, Fig 4 depicts a noticeable decline of the log-returns after the change point. The date of change is located precisely around when the severity of the COVID-19 outbreak was rapidly escalating in China, while the first patient was concurrently reported in South Korea. This finding illuminates that the South Korean economy suffered doubly because of COVID-19, as this incident predates the crash of the global stock market due to COVID-19 that occurred in early March 2020. Moreover, this result further connotes that the global economy actually received a warning preceding the significant collapse in the following months.
6 Concluding remarks
This study demonstrated the merits of using residuals obtained from HSVR-GARCH models when formulating control charts, and additionally proposed a significantly improved variant of the OCC-based control chart. Unlike other hybridized GARCH models in the literature, our HSVR-GARCH model uses a likelihood-based loss function to overcome the problem arising from the unknownness of the true volatility. It was observed that the squared residuals computed from the HSVR-GARCH model significantly bolstered the overall detection ability of all control charts, including the OCC-based one, and were proven to outperform those using residuals from the parametric GARCH model, especially in circumstances where the underlying model is nonlinear, sophisticated, or contaminated with innovational or additive outliers. The monitoring method combining the squared residuals of the HSVR-GARCH model and the OCC-based control chart consistently and promptly detected a structural change even when the observed time series was heavily contaminated and unstable. We also verified the validity of using bootstrap samples obtained from the HSVR-GARCH model when constructing the OCC-based chart, which can be of crucial practical value in training control charts when a large amount of in-control time series is unavailable, a usual case in financial time series analysis.
Despite the number of improvements made to the OCC-based control chart, it could be rendered even more powerful by providing a more informative training dataset to the SVDD, for instance by embedding the time series or its residual process into an adequate structure-preserving feature space. Due to its importance, this task is worth further investigation and remains our future research project.
Acknowledgments
This research is supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT and Future Planning (No. 2021R1A2C1004009).
References
- 1.
Montgomery DC. Introduction to statistical quality control, 7th Ed. John Wiley & Sons; 2012.
- 2. Berthouex P, Hunter W, Pallesen L. Monitoring sewage treatment plants: some quality control aspects. Journal of Quality Technology. 1978;10(4):139–149.
- 3. Alwan LC, Roberts HV. Time-series modeling for statistical process control. Journal of Business & Economic Statistics. 1988;6(1):87–95.
- 4. Harris TJ, Ross WH. Statistical process control procedures for correlated observations. The Canadian Journal of Chemical Engineering. 1991;69(1):48–57.
- 5. Montgomery DC, Mastrangelo CM. Some statistical process control methods for autocorrelated data. Journal of Quality Technology. 1991;23(3):179–193.
- 6. Alwan LC. Effects of autocorrelation on control chart performance. Communications in Statistics-Theory and Methods. 1992;21(4):1025–1049.
- 7. Lu CW, Reynolds MR. Control charts for monitoring the mean and variance of autocorrelated processes. Journal of Quality Technology. 1999;31(3):259–274.
- 8. Loredo EN, Jearkpaporn D, Borror CM. Model-based control chart for autoregressive and correlated data. Quality and Reliability Engineering International. 2002;18(6):489–496.
- 9. Dyer J, Conerly M, Adams BM. A simulation study and evaluation of multivariate forecast based control charts applied to ARMA processes. Journal of Statistical Computation and Simulation. 2003;73(10):709–724.
- 10. Noorossana R, Vaghefi SJM. Effect of autocorrelation on performance of the MCUSUM control chart. Quality and Reliability Engineering International. 2006;22(2):191–197.
- 11. Chang SI, Zhang K. Statistical process control for variance shift detections of multivariate autocorrelated processes. Quality Technology & Quantitative Management. 2007;4(3):413–435.
- 12. Osei-Aning R, Abbasi SA, Riaz M. Mixed EWMA-CUSUM and mixed CUSUM-EWMA modified control charts for monitoring first order autoregressive processes. Quality Technology & Quantitative Management. 2017;14(4):429–453.
- 13. Francq C, Zakoian JM. Maximum likelihood estimation of pure GARCH and ARMA-GARCH processes. Bernoulli. 2004;10(4):605–637.
- 14. Issam BK, Mohamed L. Support vector regression based residual MCUSUM control chart for autocorrelated process. Applied Mathematics and Computation. 2008;201(1-2):565–574.
- 15. Cuentas S, Peñabaena-Niebles R, Garcia E. Support vector machine in statistical process monitoring: a methodological and analytical review. The International Journal of Advanced Manufacturing Technology. 2017;91(1):485–500.
- 16. Zhang H, Albin S. Determining the number of operational modes in baseline multivariate SPC data. IIE Transactions. 2007;39(12):1103–1110.
- 17. Maboudou-Tchao EM. Monitoring the mean with least-squares support vector data description. Gestão & Produção. 2021;28.
- 18. Maboudou-Tchao E, Harrison CW, Sen S. A comparison study of penalized likelihood via regularization and support vector-based control charts. Quality Technology & Quantitative Management. 2023;20(2):147–167.
- 19.
Vapnik VN. The Nature Of Statistical Learning Theory. New York: Springer; 2000.
- 20. Smola A, Schölkopf B. A tutorial on support vector regression. Statistics and Computing. 2004;14:199–222.
- 21. Fernandez-Rodriguez F, Gonzalez-Martel C, Sosvilla-Rivero S. On the profitability of technical trading rules based on artificial neural networks: evidence from the Madrid stock market. Economics Letters. 2000;69:89–94.
- 22. Cao L, Tay F. Financial forecasting using support vector machines. Neural Computation and Application. 2001;10:184–192.
- 23. Pérez-Cruz F, Afonso-Rodriguez J, Giner J. Estimating GARCH models using SVM. Quantitative Finance. 2003;3:163–172.
- 24. Chen S, Härdle WK, Jeong K. Forecasting volatility with support vector machine-based GARCH model. Journal of Forecasting. 2010;29(4):406–433.
- 25. Bezerra PCS, Albuquerque PHM. Volatility forecasting via SVR–GARCH with mixture of Gaussian kernels. Computational Management Science. 2017;14:179–196.
- 26. Lee S, Lee S, Moon M. Hybrid change point detection for time series via support vector regression and CUSUM method. Applied Soft Computing. 2020;89:106101.
- 27. Lee S, Kim C, Lee S. Hybrid CUSUM change point test for time series with time-varying volatilities based on support vector regression. Entropy. 2020;22:578. pmid:33286350
- 28. Lee S, Kim CK, Kim D. Monitoring volatility change for time series based on support vector regression. Entropy. 2020;22(11):1312. pmid:33287077
- 29. Kim CK, Lee S. Conditional quantile change test for time series based on support vector regression. Communications in Statistics-Simulation and Computation. 2021; p. 1–18.
- 30.
Zhang X. Using class-center vectors to build support vector machines. In: Neural Networks for Signal Processing IX: Proceedings of the 1999 IEEE Signal Processing Society Workshop (Cat. No. 98TH8468). IEEE; 1999. p. 3–11.
- 31. Mangasarian OL, Musicant DR. Robust linear and support vector regression. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2000;22(9):950–955.
- 32. Zhao Y, Sun J. Robust support vector regression in the primal. Neural Networks. 2008;21(10):1548–1555. pmid:18829255
- 33. Balasundaram S, Meena Y. Robust support vector regression in primal with asymmetric Huber loss. Neural Processing Letters. 2019;49(3):1399–1431.
- 34. Sun R, Tsung F. A kernel-distance-based multivariate control chart using support vector methods. International Journal of Production Research. 2003;41(13):2975–2989.
- 35. Tax DM, Duin RP. Support vector data description. Machine Learning. 2004;54(1):45–66.
- 36. Sukchotrat T, Kim SB, Tsung F. One-class classification-based control charts for multivariate process monitoring. IIE Transactions. 2009;42(2):107–120.
- 37. Kim SB, Jitpitaklert W, Sukchotrat T. One-class classification-based control charts for monitoring autocorrelated multivariate processes. Communications in Statistics—Simulation and Computation. 2010;39(3):461–474.
- 38. Gani W, Limam M. Performance evaluation of one-class classification-based control charts through an industrial application. Quality and Reliability Engineering International. 2013;29(6):841–854.
- 39. Gani W, Limam M. A one-class classification-based control chart using the k-means data description algorithm. Journal of Quality and Reliability Engineering. 2014;2014.
- 40. Maboudou-Tchao EM. Change detection using least squares one-class classification control chart. Quality Technology & Quantitative Management. 2020;17(5):609–626.
- 41.
Vapnik V. Statistical learning theory. New York: John Wiley and Sons; 1998.
- 42. Nelson DB. Conditional heteroskedasticity in asset returns: A new approach. Econometrica: Journal of the Econometric Society. 1991; p. 347–370.
- 43.
Fung G, Mangasarian OL. Proximal support vector machine classifiers. In: Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining; 2001. p. 77–86.
- 44. Huang GB, Zhou H, Ding X, Zhang R. Extreme learning machine for regression and multiclass classification. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics). 2011;42(2):513–529. pmid:21984515
- 45. Schölkopf B, Platt JC, Shawe-Taylor J, Smola AJ, Williamson RC. Estimating the support of a high-dimensional distribution. Neural Computation. 2001;13(7):1443–1471. pmid:11440593
- 46. Oh H, Lee S. Modified residual CUSUM test for location-scale time series models with heteroscedasticity. Annals of the Institute of Statistical Mathematics. 2019;71:1059–1091.
- 47. Lee S, Lee S, Kim CK. One-class classification-based monitoring for the mean and variance of time series. Quality and Reliability Engineering International. 2022;38(5):2548–2565.
- 48. Chen S, Härdle WK, Jeong K. Forecasting volatility with support vector machine-based GARCH model. Journal of Forecasting. 2010;29(4):406–433.
- 49. Hwang CH, Shin SI. Estimating GARCH models using kernel machine learning. Journal of the Korean Data and Information Science Society. 2010;21(3):419–425.
- 50. Liu DC, Nocedal J. On the limited memory BFGS method for large scale optimization. Mathematical Programming. 1989;45(1):503–528.
- 51. Glosten LR, Jagannathan R, Runkle DE. On the relation between the expected value and the volatility of the nominal excess return on stocks. The Journal of Finance. 1993;48:1779–1801.
- 52. Sucarrat G, Grønneberg S, Escribano A. Estimation and inference in univariate and multivariate log-GARCH-X models when the conditional density is unknown. Computational Statistics & Data Analysis. 2016;100:582–594.
- 53. Lu CW, Reynolds MR. CUSUM charts for monitoring an autocorrelated process. Journal of Quality Technology. 2001;33(3):316–334.
- 54. Knoth S, Schmid W. Control charts for time series: A review. Frontiers in statistical quality control 7. 2004; p. 210–236.
- 55. Testik MC. Model inadequacy and residuals control charts for autocorrelated processes. Quality and Reliability Engineering International. 2005;21(2):115–130.
- 56. Hansen PR, Lunde A. A forecast comparison of volatility models: does anything beat a GARCH(1,1)? Journal of Applied Econometrics. 2005;20(7):873–889.