A robust cusum control chart for median absolute deviation based on trimming and winsorization

Umair Khalil; Tahira Saeed Khan; Walaa Ahmad Hamdi; Dost Muhammad Khan; Muhammad Hamraz

doi:10.1371/journal.pone.0297544

Abstract

Statistical quality control is concerned with the analysis of production and manufacturing processes. Control charts are process control techniques commonly applied to observe and control deviations. Shewhart control charts are sensitive to large shifts and rest on the basic assumption of normality. Cumulative Sum (CUSUM) control charts are effective for identifying changes that may have special causes, such as outliers or excessive variability in subgroup means. This study uses the CUSUM control chart structure to evaluate the performance of robust dispersion estimators. We investigate the design structure of various control charts based on existing estimators and on some new robust scale estimators that use trimming and winsorization in different scenarios. The Median Absolute Deviation based on trimming and winsorization is introduced. The effectiveness of CUSUM control charts based on these estimators is evaluated in terms of average run length (ARL) and Standard Deviation of the Run Length (SDRL) using a simulation study. The results show the robustness of the CUSUM chart in detecting small changes in magnitude for both normal and contaminated data. In general, CUSUM charts based on the robust estimators MADTM and MADWM outperform the alternatives in all environments.

Citation: Khalil U, Khan TS, Hamdi WA, Khan DM, Hamraz M (2024) A robust cusum control chart for median absolute deviation based on trimming and winsorization. PLoS ONE 19(5): e0297544. https://doi.org/10.1371/journal.pone.0297544

Editor: Kok Haur Ng, Universiti Malaya, MALAYSIA

Received: January 16, 2023; Accepted: January 9, 2024; Published: May 29, 2024

Copyright: © 2024 Khalil et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All data are generated by simulation.

Funding: The author(s) received no specific funding for this work.

Competing interests: The authors have declared that no competing interests exist.

1 Introduction

Statistical process control (SPC) is a method used in quality control to apply statistical techniques for monitoring and managing a system. The initiation of SPC occurs during the planning phase of a product or service when the relevant attributes are specified. In 1931, Shewhart introduced the concept of control charts, a pivotal technique in SPC. However, the effectiveness of these control charts diminishes when the assumption of normality is violated, and outliers are present in the data.

For enhanced robustness, it is desirable to have control charts that are less influenced by violations of fundamental assumptions. The selection of control charts depends on the process attribute under consideration and the type of change or shift quantity to be evaluated. Control charts are broadly classified into two categories: memoryless control charts and memory control charts.

Memoryless control charts, often referred to as Shewhart-type control charts, are less sensitive to small and moderate parameter changes in location and dispersion. Memory control charts, on the other hand, such as CUSUM control charts [1–3] and exponentially weighted moving average (EWMA) control charts [4–7], are designed to address issues related to outliers and deviations from normality.

The CUSUM charts have gained popularity in quality control due to their simplicity and efficiency, initially used for monitoring mean levels of processes [8, 9]. However, their application for measuring process variability has received less attention. Hawkins suggested a robust chart for individual observations based on winsorization, while Lucas and Crosier explored methods to enhance the robustness of standard CUSUM charts [10–12].

Lee [13] proposed CUSUM charts for systematically correlated data, Wang et al. [14] introduced a nonparametric CUSUM chart based on the Mann-Whitney statistic, and Wang and Huang [15] suggested an adaptive multivariate CUSUM chart. Moustafa [16] introduced modified Shewhart charts for the median and the median absolute deviation as robust location and dispersion estimators.

Ou et al. [17, 18] conducted a comparison study on the performance of various control charts, including standard charts, CUSUM charts, and sequential probability ratio test (SPRT) control charts, considering special situations such as trimmed and winsorized means. Wang et al. [19] introduced trimmed and winsorized means for transformed data based on scaled deviation, which proved to be more robust.

The Maxwell CUSUM control chart, proposed by Hossain et al. [20], efficiently monitors failure rates in boring processes. The VCUSUM chart, based on a Maxwell distribution, has been developed to detect tiny changes in a process. Castagliola et al. [21] used the CUSUM median chart, and Moustafa et al. [22] suggested MTSD-TCC, a robust control chart based on the modified trimmed standard deviation (MTSD) as an alternative to Tukey’s control chart (TCC).

This paper aims to enhance the efficiency of CUSUM control charts by modifying the use of dispersion parameters and comparing the efficiency of robust estimators in different environments. The investigation includes the performance of CUSUM control charts in uncontaminated and contaminated environments with symmetric and asymmetric variance disturbances, as well as non-normal environments, using Average Run Length (ARL) and Standard Deviation of the Run Length (SDRL).

To facilitate interpretation, the discussion focuses on the upper side of the CUSUM control charts; two-sided CUSUM control charts behave in a qualitatively similar way. The remaining sections of the paper are organized as follows: Section 2 describes dispersion estimators, Section 3 presents the proposed estimators, and Section 4 outlines the proposed CUSUM control chart (Fig 1) with different robust dispersion estimators based on trimming and winsorization. Section 5 evaluates the performance of the charts, and the major conclusions are summarized in the closing section.

Fig 1. Flowchart procedure for the proposed method of CUSUM control chart.

https://doi.org/10.1371/journal.pone.0297544.g001

2 Description of process dispersion estimators

Let ϑ be the process dispersion parameter that needs to be monitored by control charts, and let ϑ̂ be its estimator based on a sample of size n. There are several choices for ϑ̂. David [23] gives a clear description of standard deviation estimators. Typical estimators are the average of the sample standard deviations, the pooled sample standard deviation, and the average of sample ranges. Mahmoud et al. [24] investigated the relative ability of estimators for different k samples of size n. Schoonhoven et al. [25] considered various estimators of the population standard deviation and presented a detailed overview of their efficiency and use for different stages in the control chart.

The following estimators are used in this paper.

The first estimator of ϑ is the sample standard deviation S, defined as:

$$S = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(Y_i - \bar{Y}\right)^2} \tag{1}$$

where Y_i denotes the i-th observation in a sample of size n and Ȳ denotes the sample mean. In a normally distributed environment, the sample standard deviation S is the most efficient estimator, but it is strongly influenced by outliers. The breakdown point of the sample standard deviation (the proportion of outlying observations that an estimator can tolerate) is zero.

The next estimator used with the CUSUM-ϑ̂ charts is the sample interquartile range (IQR), defined by:

$$IQR = Q_3 - Q_1 \tag{2}$$

where Q₁ and Q₃ are the first and the third quartiles of the sample, respectively. The sample interquartile range is more stable than the sample standard deviation [26]. The breakdown point of the IQR is 25%.

The median absolute deviation from the sample median (MADM) is a much more robust dispersion estimator than the sample standard deviation. It is based on the absolute differences of the data from the median of the sample. The MADM is defined as:

$$MADM = 1.4826\,\operatorname{median}_{i}\left|Y_i - \tilde{Y}\right| \tag{3}$$

where Ỹ is the sample median. The constant 1.4826 is required to make the estimator consistent for the parameter of interest: when a random sample is taken from a normal distribution, multiplying the median of the absolute differences of the individual values from the median of the dataset by 1.4826 yields an estimator of the normal parameter σ (Supporting Data).
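The three estimators of this section can be illustrated with a minimal Python sketch. It is not the authors' code (their programs are in R); the function names are hypothetical, the IQR is taken as the plain difference Q₃ − Q₁, and the quartile convention (the "inclusive" method of the standard library) is an assumption, since the text does not say which quartile definition is used.

```python
import statistics

def sample_sd(y):
    """Sample standard deviation S with the n - 1 divisor."""
    return statistics.stdev(y)

def sample_iqr(y):
    """Interquartile range Q3 - Q1 (quartile convention is an assumption)."""
    q1, _, q3 = statistics.quantiles(y, n=4, method="inclusive")
    return q3 - q1

def madm(y, c=1.4826):
    """c times the median absolute deviation from the sample median."""
    med = statistics.median(y)
    return c * statistics.median([abs(v - med) for v in y])

# One gross outlier inflates S but barely moves IQR and MADM.
clean = [1.9, 2.0, 2.1, 2.2, 2.3]
dirty = [1.9, 2.0, 2.1, 2.2, 9.0]
for name, est in [("S", sample_sd), ("IQR", sample_iqr), ("MADM", madm)]:
    print(name, round(est(clean), 4), round(est(dirty), 4))
```

The comparison of the two small samples shows the breakdown behaviour described above: a single extreme value is enough to distort S, while the quartile-based and median-based estimators change little.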

3 Proposed estimators based on trimming and winsorization

The trimmed mean is a relatively robust estimate of the centre, which decreases the effect of outliers or heavy tails by eliminating the observations at the ends of the distribution. Let Y₁, Y₂, ⋯, Y_n represent observations on a variable from a random sample of size n. We begin by arranging the Y values from smallest to largest, Y_(1) ≤ Y_(2) ≤ ⋯ ≤ Y_(n), and determining the number of observations to trim. The k-times symmetrically trimmed sample is obtained by removing the k smallest and k largest values. Here k = ⌊αn⌋ is the largest integer not exceeding αn, and α (0 ≤ α ≤ ½) is the proportion of n that is trimmed. The trimmed mean is defined as:

$$\bar{Y}_{T} = \frac{1}{n-2k}\sum_{i=k+1}^{n-k} Y_{(i)} \tag{4}$$

The breakdown point is determined by the amount of trimming, thus BDP = α. A basic rule of thumb is to deduct from each tail of the distribution 10% of the observations (i.e., set α = 0.2). The mean deviation from the trimmed mean (MDTM) is defined as: (5)

The next proposed estimator is the median of the absolute deviations from the trimmed mean, MADTM is defined as: (6)

The method of substituting a given number of extreme values with less extreme values is known as winsorizing the data, or winsorization. Let Y₁, Y₂, ⋯, Y_n represent observations on a variable from a random sample of size n. The Y values are sorted from smallest to largest, i.e. Y_(1) ≤ Y_(2) ≤ ⋯ ≤ Y_(n), and the smallest k values are replaced with the (k+1)-st smallest value. The same process is applied to the largest values, substituting the largest k values with the (k+1)-st largest value. The mean of this new set of numbers is known as the winsorized mean. The winsorized mean is a robust, unbiased estimator of the population mean if the data are from a symmetric population. The k-times winsorized mean is defined as:

$$\bar{Y}_{W} = \frac{1}{n}\left[(k+1)\,Y_{(k+1)} + \sum_{i=k+2}^{n-k-1} Y_{(i)} + (k+1)\,Y_{(n-k)}\right] \tag{7}$$

The mean deviation from the winsorized mean MDWM is (8)

The next proposed estimator is the median of the absolute deviations from the winsorized mean, MADWM is defined as: (9)
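The trimming and winsorizing steps described in this section can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the deviations are taken over all n observations, and no consistency constant is applied, because the exact forms of the MDTM, MADTM, MDWM, and MADWM equations are not recoverable from the text.

```python
import math
import statistics

def _k(n, alpha):
    """Number of observations affected in each tail, k = floor(alpha * n)."""
    k = math.floor(alpha * n)
    if n - 2 * k < 1:
        raise ValueError("alpha leaves no observations")
    return k

def trimmed_mean(y, alpha):
    """Mean after dropping the k smallest and k largest values."""
    ys, k = sorted(y), _k(len(y), alpha)
    kept = ys[k:len(ys) - k]
    return sum(kept) / len(kept)

def winsorized_mean(y, alpha):
    """Mean after pulling the k extreme values in each tail inwards."""
    ys, k = sorted(y), _k(len(y), alpha)
    n = len(ys)
    if k > 0:
        ys[:k] = [ys[k]] * k              # k smallest -> (k+1)-st smallest
        ys[n - k:] = [ys[n - k - 1]] * k  # k largest  -> (k+1)-st largest
    return sum(ys) / n

def mean_abs_dev(y, center):
    return sum(abs(v - center) for v in y) / len(y)

def median_abs_dev(y, center):
    return statistics.median([abs(v - center) for v in y])

# Assumed forms of the four dispersion estimators (see lead-in).
def mdtm(y, alpha):
    return mean_abs_dev(y, trimmed_mean(y, alpha))

def madtm(y, alpha):
    return median_abs_dev(y, trimmed_mean(y, alpha))

def mdwm(y, alpha):
    return mean_abs_dev(y, winsorized_mean(y, alpha))

def madwm(y, alpha):
    return median_abs_dev(y, winsorized_mean(y, alpha))

y = [1, 2, 3, 4, 100]
print(trimmed_mean(y, 0.2), winsorized_mean(y, 0.2))  # both 3.0
print(mdtm(y, 0.2), madtm(y, 0.2))
```

With the outlier 100 in the sample, the mean-based deviation is pulled far upwards while the median-based deviation stays at the scale of the bulk of the data, which is the motivation for the median versions.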

For comparison and to determine the precision of the dispersion robust estimators used in this analysis, the standardized variances of the estimators as proposed by Rousseeuw and Croux [27] and relative efficiencies of the estimators as suggested by Abbasi and Miller [28] are calculated.

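The standardized variance of a dispersion estimator ϑ̂ is measured as:

$$SV_{\hat{\vartheta}} = \frac{n\,\operatorname{Var}(\hat{\vartheta})}{\left[E(\hat{\vartheta})\right]^{2}} \tag{10}$$

The denominator of the standardized variance is needed to obtain a normalized measure of the precision of a scale estimator [29]. The estimator’s relative efficiency is calculated as: (11)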
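As a rough illustration of how a standardized variance can be estimated by simulation, the sketch below assumes the Rousseeuw and Croux form n·Var(ϑ̂)/[E(ϑ̂)]². The function name, the seed handling, and the choice of S under N(0,1) are illustrative assumptions; the relative efficiency is not computed because its formula is not recoverable from the text.

```python
import random
import statistics

def standardized_variance(estimator, sampler, n, reps=20000, seed=1):
    """Monte Carlo estimate of n * Var(est) / E(est)^2 (assumed form)."""
    rng = random.Random(seed)
    values = [estimator([sampler(rng) for _ in range(n)]) for _ in range(reps)]
    mean = statistics.fmean(values)
    return n * statistics.pvariance(values) / mean ** 2

def normal(rng):
    return rng.gauss(0.0, 1.0)

def wide_normal(rng):
    return rng.gauss(0.0, 3.0)

sv_unit = standardized_variance(statistics.stdev, normal, n=5, reps=2000)
sv_wide = standardized_variance(statistics.stdev, wide_normal, n=5, reps=2000)
print(sv_unit, sv_wide)
```

Dividing by the squared mean makes the measure scale free: with the same seed, the two calls differ only by floating-point rounding even though the second process has three times the standard deviation.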

First, the standardized variance and relative efficiency values for all robust estimators are computed and compared. A simulation study is used to check the performance of robust estimators based on CUSUM-ϑ̂ charts. The simulation is run 20,000 times, and both measures are determined from samples of size n = 5, 6 and 9 under the following conditions: uncontaminated normal, contaminated normal, gamma, and logistic scenarios. The results are listed in Tables 1 and 2 for the dispersion estimators under the different scenarios.

The tabulated results suggest that, under the uncontaminated normal scenario, S has the largest value but the smallest under the Logistic distribution with sample size n = 9. The smallest value among the dispersion estimators is that of MADWM (with 25% winsorizing) for the small sample size n = 5 under the 15% contaminated normal scenario. The efficiency of the other dispersion estimators lies between that of these two estimators, MADWM (with 10%, 20%, and 25% winsorizing) and S. The IQR has the smallest value under the logistic and gamma distributions. Under the 5%, 10%, and 15% symmetric variance contaminated normal scenarios, the MADTM (at 10% and 25% trimming) with sample size n = 6 has the smallest value. The MADWM has the smallest value under the 5% symmetric variance contaminated normal scenario (at 10% winsorizing, n = 5 and 9), under the 10% contaminated normal scenario (at 20% and 25% winsorizing, n = 5 and 9), and under the 15% contaminated normal scenario (at 25% winsorizing, n = 5). The MADTM (at 10%, 20%, and 25% trimming) obtains the smallest value for the small 1% contamination with sample sizes n = 5, 6 and 9. For the Gamma distribution, the MADTM (at 25% trimming) with sample size n = 5 has the lowest value.

For the non-normal scenarios in particular, the MADTM (at 10%, 20%, and 25% trimming) and MADWM (at 10%, 20%, and 25% winsorizing) perform best compared with all other estimators. The dispersion estimators MADM, MDTM (at 10%, 20%, and 25% trimming) and MDWM (at 10%, 20%, and 25% winsorizing) are highly affected by contamination and non-normal environments. This shows that the proposed estimators MADTM (at 10%, 20%, and 25% trimming) and MADWM (at 10%, 20%, and 25% winsorizing) are more efficient than the other estimators.

Table 1. Standardized variance of robust estimators in different scenarios.

https://doi.org/10.1371/journal.pone.0297544.t001

Table 2. Relative efficiency of robust estimators in different scenarios.

https://doi.org/10.1371/journal.pone.0297544.t002

4 The proposed method of CUSUM charts for different robust dispersion estimators

The CUSUM procedure considered here is designed to detect an increase in the process dispersion parameter ϑ. Let ϑ̂_t be an estimator from Section 2 of the process dispersion parameter ϑ, computed from the t-th random sample of size n taken at regular intervals from a continuous production process. The CUSUM-ϑ̂ chart is defined as:

$$Y_t = \max\left[0,\; Y_{t-1} + \left(\hat{\vartheta}_t - K\right)\right] \tag{12}$$

Following Tuprah and Ncube [30], Y₀ = 0 and K is the reference value of the scheme. Y_t is plotted against the sample number t. The process is declared out of control if Y_t > H (where H defines the decision interval) for any value of t, and it is then concluded that the dispersion of the process has increased. The run length is the random variable giving the sample number at which Y_t > H first occurs, and the average run length is its expected value. The values of K and H are selected such that changes in the process dispersion parameter are easily identified. For all the scenarios considered in this analysis, H is selected, together with K, to give a fixed value of the in-control ARL, denoted by ARL₀. ARL₁ stands for the out-of-control ARL, which should be as small as possible. The reference value is based on Tuprah and Ncube [30], Ewan and Kemp [31], and E.S. Page [32]: K is taken as half of the sum of the expected value of ϑ̂ given ϑ₀ = 1 and the expected value of ϑ̂ given ϑ₁ = 1.4, where ϑ₀ is the target value and ϑ₁ is the value of the process dispersion that needs to be easily detected. E.S. Page [32], in Table 1, presented the reference values for noticing a change (that is ϑ₁ = 1.40 to ϑ₁ = 2.23) easily in the dispersion of the process using the sample range.

$$K = \frac{1}{2}\left[E\left(\hat{\vartheta}\mid\vartheta_0\right) + E\left(\hat{\vartheta}\mid\vartheta_1\right)\right] \tag{13}$$

Accordingly, for a given estimator ϑ̂ it is difficult to find these expected values analytically. For this purpose simulation is used: random samples are generated from normal distributions with ϑ₀ = 1 and ϑ₁ = 1.40, respectively, and the expected values are calculated from them.
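The upper CUSUM recursion and the simulated reference value can be sketched in Python as follows. The sketch assumes the usual Tuprah and Ncube form Y_t = max[0, Y_{t−1} + (ϑ̂_t − K)] and takes K as the midpoint of the two simulated expected values; the symbols K and H, the function names, and the use of S as the estimator are assumptions made for illustration.

```python
import random
import statistics

def upper_cusum(estimates, k, y0=0.0):
    """Path of Y_t = max(0, Y_{t-1} + estimate_t - k), starting from y0."""
    path, y = [], y0
    for est in estimates:
        y = max(0.0, y + est - k)
        path.append(y)
    return path

def expected_estimate(estimator, sigma, n, reps=20000, seed=1):
    """Monte Carlo mean of the estimator for N(0, sigma^2) samples."""
    rng = random.Random(seed)
    values = [estimator([rng.gauss(0.0, sigma) for _ in range(n)])
              for _ in range(reps)]
    return statistics.fmean(values)

def reference_value(estimator, n, sigma0=1.0, sigma1=1.4, reps=20000, seed=1):
    """Midpoint of the in-control and out-of-control expected values."""
    e0 = expected_estimate(estimator, sigma0, n, reps, seed)
    e1 = expected_estimate(estimator, sigma1, n, reps, seed)
    return 0.5 * (e0 + e1)

print(upper_cusum([1.0, 2.0, 0.5, 3.0], k=1.5))  # [0.0, 0.5, 0.0, 1.5]
print(reference_value(statistics.stdev, n=5, reps=2000))
```

The reset to zero in the recursion is what makes the chart one-sided: estimates below K cannot build up credit against a later increase in dispersion.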

The results of the CUSUM-ϑ̂ charts are obtained in the following scenarios, based on Tatum [33] and Schoonhoven et al. [25].

A model in which all observations are from N(0,1) (i.e., uncontaminated scenario).
A symmetric variance disturbances model, in which each observation has a 99% probability from the distribution N(0,1) and a 1% probability from N(0,9).
A model of asymmetric variance disturbances, in which each observation is taken from N(0,1) and, with 1% probability, has a multiple of a chi-square variable with one degree of freedom added to it, with a multiplier equal to 4.
Two non-normal models, used to examine the impact of non-normal distributions: the first disturbs the kurtosis, and the second disturbs the symmetry of the distribution. We use the logistic distribution Logistic(2,1) for the disturbance of kurtosis and the gamma distribution G(2,1) for the disturbance of symmetry.
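The three normal-based data models listed above can be sketched as simple samplers. This is an illustrative sketch with hypothetical function names: N(0,9) is read as variance 9 (standard deviation 3), and the added variable in the asymmetric model is taken to be the chi-square variable with one degree of freedom named in Section 5.1.3, generated as the square of a standard normal draw.

```python
import random
import statistics

def uncontaminated(rng):
    """Every observation comes from N(0, 1)."""
    return rng.gauss(0.0, 1.0)

def symmetric_variance(rng, p=0.01, sd_out=3.0):
    """N(0, 1) with probability 1 - p, N(0, 9) with probability p."""
    sd = sd_out if rng.random() < p else 1.0
    return rng.gauss(0.0, sd)

def asymmetric_variance(rng, p=0.01, multiplier=4.0):
    """N(0, 1) plus, with probability p, multiplier times a chi-square(1)."""
    x = rng.gauss(0.0, 1.0)
    if rng.random() < p:
        x += multiplier * rng.gauss(0.0, 1.0) ** 2  # chi-square(1) = Z^2
    return x

def draw_sample(model, n, rng):
    """One subgroup of size n from the chosen model."""
    return [model(rng) for _ in range(n)]

rng = random.Random(2024)
for model in (uncontaminated, symmetric_variance, asymmetric_variance):
    print(model.__name__, statistics.stdev(draw_sample(model, 5, rng)))
```

Passing the contamination probability as an argument makes it easy to rerun the same sketch at the 5%, 10%, and 15% contamination levels mentioned in Section 3.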

Tables 3 and 4 show the values of the reference value K and the decision interval H for different robust estimators based on CUSUM-ϑ̂ charts under different scenarios (i.e. normal and non-normal).

Table 3. Values for CUSUM-ϑ̂ charts in different scenarios with ARL₀ = 500.

https://doi.org/10.1371/journal.pone.0297544.t003

Table 4. Values for CUSUM-ϑ̂ charts in different scenarios with ARL₀ = 500.

https://doi.org/10.1371/journal.pone.0297544.t004

In the different scenarios (normal and non-normal), the decision interval values are searched for by drawing random samples separately from each of the environments described until the desired value of ARL₀ is obtained in each case. An iterative method is used to reach the desired ARL for the given reference value. Table 4 is given for ARL₀ = 500 with the corresponding values. Similarly, alternative values can be found for other values of ARL₀. Since the ARL₀ of the CUSUM-ϑ̂ chart is sensitive to these values, the reference value and the decision interval must be carefully selected.
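One way to carry out such an iterative search is bisection on the decision interval, sketched below. This is an assumption about the search, not the authors' procedure: the names are hypothetical, S under N(0,1) stands in for a general estimator, run lengths are censored at max_t to bound the cost, and the small target ARL in the usage lines is chosen only so that the sketch runs quickly.

```python
import math
import random

def sample_sd(y):
    m = sum(y) / len(y)
    return math.sqrt(sum((v - m) ** 2 for v in y) / (len(y) - 1))

def run_length(k, h, n, rng, max_t):
    """First t with Y_t > h for in-control N(0,1) data (censored at max_t)."""
    y = 0.0
    for t in range(1, max_t + 1):
        est = sample_sd([rng.gauss(0.0, 1.0) for _ in range(n)])
        y = max(0.0, y + est - k)
        if y > h:
            return t
    return max_t

def in_control_arl(k, h, n, reps, seed, max_t):
    """Average run length; one fixed seed per replicate, so every h
    is evaluated on the same data streams (common random numbers)."""
    total = 0
    for r in range(reps):
        total += run_length(k, h, n, random.Random(seed + r), max_t)
    return total / reps

def find_h(k, n, target_arl, lo=0.0, hi=3.0, reps=200, seed=1, iters=12):
    """Bisection on h until the simulated in-control ARL meets the target."""
    max_t = int(20 * target_arl)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if in_control_arl(k, mid, n, reps, seed, max_t) < target_arl:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Illustrative values only: K = 1.128 and a target ARL of 20.
print(find_h(k=1.128, n=5, target_arl=20, reps=100))
```

Reusing the same random streams for every candidate value makes the simulated ARL non-decreasing in the decision interval, which is what lets bisection converge despite simulation noise.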

5 Evaluation of CUSUM-ϑ̂ chart performance

The ARL is used to evaluate the performance of the suggested CUSUM-ϑ̂ charts. The ARLs of the in-control and out-of-control processes are calculated using Monte Carlo simulation, as follows. First, 20,000 random samples of size n are generated from the different scenarios (i.e. normal, contaminated normal, or non-normal), and the dispersion estimators are computed: some existing estimators (i.e. S, IQR, and MADM) as well as the suggested robust estimators (i.e. MDTM, MDWM, MADTM, and MADWM) based on trimming and winsorization at 10%, 20%, and 25%. Tables 3 and 4 are used to set the corresponding limits of the control chart. The sample number at which the statistic Y_t first falls beyond the control limit is recorded; this sample number is known as the run length, and it is a random variable. To determine the run length distribution, the same process is repeated 12,000 times. The ARL is the average of the run length distribution and the SDRL is its standard deviation. The run lengths were computed with code written in the R language.
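The run-length simulation described above can be sketched in Python (the authors' own code is in R and is not reproduced here). The function names, the scale shift implemented as multiplication by δ, and the values K = 1.128 and H = 0.45 are illustrative assumptions, not entries from Tables 3 and 4.

```python
import math
import random
import statistics

def sample_sd(y):
    m = sum(y) / len(y)
    return math.sqrt(sum((v - m) ** 2 for v in y) / (len(y) - 1))

def run_length(estimator, k, h, n, delta, rng, max_t=100000):
    """Sample number at which the upper CUSUM first exceeds h."""
    y = 0.0
    for t in range(1, max_t + 1):
        sample = [delta * rng.gauss(0.0, 1.0) for _ in range(n)]
        y = max(0.0, y + estimator(sample) - k)
        if y > h:
            return t
    return max_t  # censored; only reached for very large h

def arl_sdrl(estimator, k, h, n, delta, reps=12000, seed=1):
    """Mean (ARL) and standard deviation (SDRL) of simulated run lengths."""
    lengths = [run_length(estimator, k, h, n, delta, random.Random(seed + r))
               for r in range(reps)]
    return statistics.fmean(lengths), statistics.stdev(lengths)

for delta in (1.0, 1.2, 1.4):
    arl, sdrl = arl_sdrl(sample_sd, k=1.128, h=0.45, n=5, delta=delta, reps=500)
    print(delta, round(arl, 2), round(sdrl, 2))
```

Any of the dispersion estimators can be passed in place of sample_sd, and the N(0,1) draw can be swapped for a contaminated or non-normal sampler to mimic the other environments.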

5.1 Results and discussions

The ARL₁ and SDRL₁ are used in different environments to evaluate the performance and efficiency of the CUSUM-ϑ̂ charts. Shifts are expressed in terms of ϑ (i.e. δϑ), so that the shifted dispersion parameter is ϑ₁ = δϑ₀. Here δ = 1 indicates that there is no shift in ϑ and the dispersion of the process is constant, and δ > 1 indicates that the process ϑ has increased. ARL₁ increases as the size of the process shift decreases, and SDRL decreases as the size of the process shift increases. When the process is in control, the ARL and SDRL should be close to the target value of 500. In all environments, the CUSUM charts based on the robust MADTM and MADWM estimators work well.

5.1.1 Uncontaminated environment.

In the uncontaminated environment, all observations are normally distributed as N(0,1). This environment is the fundamental assumption behind the design structure of each chart. It provides a baseline for comparing the various types of control charts and the suggested CUSUM-ϑ̂ chart. Table 5 shows the ARL results.

Table 5. ARL values of robust estimators based on CUSUM-ϑ̂ charts in uncontaminated environment N(0,1) when ARL₀ = 500.

https://doi.org/10.1371/journal.pone.0297544.t005

A large ARL is desired when the process is stable or in control. In Table 5, bold type marks the highest ARL among the robust estimators at the different levels of trimming and winsorization for sample sizes n = 5 and n = 9. The highlighted values show that, for n = 5, the CUSUM-ϑ̂ chart based on the standard deviation S performs better than those based on IQR and MADM. The proposed estimator MDTM (at 10%, 20%, and 25% trimming) performs better than S, IQR, and MADM for both sample sizes (n = 5 and 9). For both sample sizes, when the shift δ > 1.25, the MADTM (at 10%, 20%, and 25% trimming) and the MADWM (at 10%, 20%, and 25% winsorizing) perform better than S, IQR, and MADM. The ARLs of the proposed estimators MADTM (at 10%, 20%, and 25% trimming) and MADWM (at 10%, 20%, and 25% winsorizing) are larger than those of all other estimators for both sample sizes (n = 5 and 9), which indicates that both proposed estimators perform best.

To further clarify the distribution of run lengths in the uncontaminated case, the SDRL of the CUSUM-ϑ̂ charts is also recorded as a measure of run-length performance, as proposed by Antzoulakos and Rakitzis [34]. Table 6 shows the details. When the process is in control, the SDRL should be close to its target value of 500. Table 6 shows that the SDRL is significantly lower than the target value for certain CUSUM-ϑ̂ charts and that it decreases for all charts as δ increases.

Table 6. SDRL values of robust estimators based on CUSUM-ϑ̂ charts in uncontaminated environment N(0,1) when ARL₀ = 500.

https://doi.org/10.1371/journal.pone.0297544.t006

5.1.2 Symmetric variance environment.

A symmetric variance distribution is used when the spread parameter has been disturbed. In such an environment, we examined the performance of the suggested estimators with their corresponding CUSUM charts in which each observation has a 99% probability that is derived from normal distribution N(0,1) and 1% probability taken from normal distribution N(0,9). Tables 7 and 8 present the ARL and SDRL results of symmetric variance for sample sizes n = 5 and n = 9.

Table 7. ARL values of robust estimators based on CUSUM-ϑ̂ charts under symmetric variance contaminated environment when ARL₀ = 500.

https://doi.org/10.1371/journal.pone.0297544.t007

Table 8. SDRL values of robust estimators based on CUSUM-ϑ̂ charts under symmetric variance contaminated environment when ARL₀ = 500.

https://doi.org/10.1371/journal.pone.0297544.t008

Tables 7 and 8 show that, in terms of ARL and SDRL, the CUSUM-ϑ̂ charts based on S and IQR are better than the chart based on MADM for sample size n = 5 but less efficient for the larger sample size (n = 9). The larger values of ARL are highlighted. The MDTM (at 10%, 20%, and 25% trimming) and MDWM (at 10%, 20%, and 25% winsorizing) perform reasonably well at sample size n = 9, and they are more efficient than S, IQR, and MADM. The proposed estimator MADTM (at 10%, 20%, and 25% trimming) shows the best overall performance among the estimators for both sample sizes n = 5 and n = 9 and for all shifts of the process dispersion. The MADWM (at 10%, 20%, and 25% winsorizing) is very sensitive when the sample size is small (n = 5), but as the sample size increases (n = 9) it performs well compared with S, IQR, and MADM. For shifts δ > 1.25, the IQR, the MADTM (at 20% and 25% trimming), and the MADWM (at 10% winsorizing) perform well for the small sample size n = 5; for the large sample size n = 9, the MADTM (at 10%, 20%, and 25% trimming) and MADWM (at 10%, 20%, and 25% winsorizing) perform best compared with the other estimators as the shift in process dispersion increases.

5.1.3 Asymmetric variance environment.

In an asymmetric variance environment, each observation is taken from normal distribution N(0,1) and has a 1% probability of adding a multiple of Chi-Square with one degree of freedom to it with a multiplier equal to 4. Tables 9 and 10 show the results of ARL and SDRL respectively for sample sizes n = 5 and n = 9.

Table 9. ARL values of robust estimators based on CUSUM-ϑ̂ charts under asymmetric variance contaminated environment when ARL₀ = 500.

https://doi.org/10.1371/journal.pone.0297544.t009

Table 10. SDRL values of robust estimators based on CUSUM-ϑ̂ charts under asymmetric variance contaminated environment when ARL₀ = 500.

https://doi.org/10.1371/journal.pone.0297544.t010

Table 9 shows that, for the small sample size n = 5, S and MADM are better than MADWM (at 20% and 25% winsorizing) but less efficient than the other estimators. When the sample size is small (n = 5), the CUSUM-ϑ̂ chart based on IQR performs well compared with S, MADM, MDTM (at 10% trimming), and MADWM (at 20% and 25% winsorizing). The larger values of ARL are highlighted. For the small sample size n = 5, MDTM (at 20% and 25% trimming) is better than S, IQR, and MADM. For the large sample size n = 9, it is better than IQR and MADM. MDTM (at 10%, 20%, and 25% trimming), MDWM (at 10%, 20%, and 25% winsorizing), and MADWM (at 10%, 20%, and 25% winsorizing) perform best for the large sample size n = 9 and are more efficient than S, IQR, and MADM. The MADTM (at 10%, 20%, and 25% trimming) shows superior performance to the other estimators for all increasing shifts of the process dispersion and for both sample sizes n = 5 and n = 9. When δ > 1.25, IQR, MADM, and MADTM (at 20% and 25% trimming) outperform all other estimators for both sample sizes n = 5 and n = 9.

5.1.4 Non-normal environment.

Without loss of generality, the samples drawn from the non-normal environments are transformed so that the resulting sample has zero mean and unit variance. To this end, the mean of the non-normal distribution is subtracted from each observation, and the result is divided by the standard deviation of the non-normal distribution, so that the results are correct and the performance is comparable across environments.
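The standardization step can be sketched as follows. The sketch assumes that G(2,1) denotes shape 2 and scale 1 (mean 2, variance 2) and that Logistic(2,1) denotes location 2 and scale 1 (mean 2, variance π²/3); the function names are hypothetical, and the logistic draw uses the inverse-CDF method because the Python standard library has no logistic sampler.

```python
import math
import random
import statistics

def standardized_gamma(rng, shape=2.0, scale=1.0):
    """Gamma draw shifted and scaled to mean 0 and variance 1."""
    x = rng.gammavariate(shape, scale)
    return (x - shape * scale) / (math.sqrt(shape) * scale)

def standardized_logistic(rng, loc=2.0, scale=1.0):
    """Logistic draw (inverse CDF) shifted and scaled to mean 0, variance 1."""
    u = rng.random()
    while u == 0.0:  # avoid log(0)
        u = rng.random()
    x = loc + scale * math.log(u / (1.0 - u))
    return (x - loc) / (scale * math.pi / math.sqrt(3.0))

rng = random.Random(7)
g = [standardized_gamma(rng) for _ in range(20000)]
l = [standardized_logistic(rng) for _ in range(20000)]
print(statistics.fmean(g), statistics.pvariance(g))
print(statistics.fmean(l), statistics.pvariance(l))
```

After this transformation the in-control dispersion is 1 in every environment, so the same shift sizes δ can be applied and compared across the normal and non-normal cases.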

Tables 11 and 12 present the ARL and SDRL values of the different estimators for detecting increases of different magnitudes in the process dispersion, with in-control ARL₀ = 500 and sample size n = 5, when the underlying process distributions are Gamma and Logistic. The following are some important outcomes from the ARL and SDRL values for the Gamma distribution G(2,1).

Table 11. ARL values of robust estimators based on CUSUM-ϑ̂ charts under G(2,1) environment when ARL₀ = 500.

https://doi.org/10.1371/journal.pone.0297544.t011

Table 12. SDRL values of robust estimators based on CUSUM-ϑ̂ charts under G(2,1) environment when ARL₀ = 500.

https://doi.org/10.1371/journal.pone.0297544.t012

The MADM and the MDTM (at 10%, 20%, and 25% trimming) perform well compared with S and IQR. The MDWM (at 10%, 20%, and 25% winsorizing) performs better than S, IQR, MADM, MDTM (at 10%, 20%, and 25% trimming), and MADTM (at 10% trimming). The proposed estimator MADTM (at 10%, 20%, and 25% trimming) performs best compared with all other estimators as the shift in the process dispersion increases. The larger values of ARL are highlighted. The MADWM (at 10%, 20%, and 25% winsorizing) performs well compared with S, IQR, MADM, MDTM (at 10%, 20%, and 25% trimming), and MDWM (at 10%, 20%, and 25% winsorizing). For δ > 1.25, the IQR, MADM, MADTM (at 20% and 25% trimming), and MADWM (at 20% and 25% winsorizing) perform well compared with the other estimators.

Tables 13 and 14 present the ARL and SDRL values of the CUSUM-ϑ̂ charts for the Logistic distribution.

Table 13. ARL values of robust estimators based on CUSUM-ϑ̂ charts under Logistic(2,1) environment when ARL₀ = 500.

https://doi.org/10.1371/journal.pone.0297544.t013

Table 14. SDRL values of robust estimators based on CUSUM-ϑ̂ charts under Logistic(2,1) environment when ARL₀ = 500.

https://doi.org/10.1371/journal.pone.0297544.t014

For the logistic distribution, the MADM performs well compared with S, IQR, MDTM (at 10% and 20% trimming), and MADWM (at 10%, 20%, and 25% winsorizing). The MDWM (at 10%, 20%, and 25% winsorizing) performs better than S, IQR, MADM, MDTM (at 10%, 20%, and 25% trimming), MADTM (at 10% trimming), and MADWM (at 10%, 20%, and 25% winsorizing). The proposed estimator MADTM (at 10%, 20%, and 25% trimming) performs excellently compared with all other estimators as the shift in the process dispersion increases. The IQR, MADM, MADTM (at 10%, 20%, and 25% trimming), and MADWM (at 20% winsorizing) perform well for δ > 1.25.

6 Conclusion

In this paper, several estimators of dispersion parameters are considered for use in the development of Phase II control limits. These include some widely used estimators as well as robust estimators that are uncommon in the control chart literature. The process dispersion parameter was monitored using the CUSUM-ϑ̂ control chart structure with these estimators. The results for these robust estimators are evaluated in different environments: the uncontaminated environment, contaminated environments with symmetric and asymmetric variance disturbances, and non-normal environments. All charts perform well under the uncontaminated environment, but the CUSUM-ϑ̂ control charts based on the MADTM (at 20% and 25% trimming) and MADWM (at 10%, 20%, and 25% winsorizing) outperform those based on all other estimators under normality for the large sample size n = 9. The suggested estimators MDTM (at 10%, 20%, and 25% trimming) and MADTM (at 10%, 20%, and 25% trimming) perform well for both sample sizes n = 5 and n = 9 in the symmetric variance and asymmetric variance environments. When the environment is non-normal, the estimators MDTM (at 25% trimming), MDWM (at 10%, 20%, and 25% winsorizing), MADTM (at 10%, 20%, and 25% trimming), and MADWM (at 10%, 20%, and 25% winsorizing) perform best under the Gamma distribution. For the Logistic distribution, the MDTM (at 25% trimming), MDWM (at 10%, 20%, and 25% winsorizing), and MADTM (at 10%, 20%, and 25% trimming) perform better than the other estimators. In general, CUSUM charts based on the robust estimators MADTM (at 10%, 20%, and 25% trimming) and MADWM (at 10%, 20%, and 25% winsorizing) perform best in all environments: the uncontaminated environment, the contaminated environments with symmetric and asymmetric variance disturbances, and the non-normal environments.

Supporting information

S1 Table. Data generating R program, Table 1 standard deviation code for N(0,1).

https://doi.org/10.1371/journal.pone.0297544.s001

(DOCX)

S1 File. Data 2 generating R program, S. D codes for G (2,1).

https://doi.org/10.1371/journal.pone.0297544.s002

(DOCX)

S2 File. Data 3 generating R program, S.D codes for ARL & SDRL.

https://doi.org/10.1371/journal.pone.0297544.s003

(DOCX)

References

1. Page E.S., “Continuous Inspection Schemes,” Biometrika, vol. 41, pp. 100–15, 1954.
- View Article
- Google Scholar
2. Abujiya Mu’azu Ramat, Riaz Muhammad, Lee Muhammad Hisyam, Enhanced Cumulative Sum Charts for Monitoring Process Dispersion, published 22 Apr 2015, PLOS ONE, pmid:25901356
- View Article
- PubMed/NCBI
- Google Scholar
3. Aslam Muhammad, Shafqat Ambreen, Albassam Mohammed, Malela-Majika Jean-Claude, Shongwe Sandile C., A new CUSUM control chart under uncertainty with applications in petroleum and meteorology, published 04 Feb 2021, PLOS ONE, pmid:33539442
- View Article
- PubMed/NCBI
- Google Scholar
4. Roberts S.W., “Control Chart Tests Based on Geometric Moving Average,” Technometrics, vol. 1, pp. 239–250, 1959.
- View Article
- Google Scholar
5. Sukparungsee Saowanit, Areepong Yupaporn, Taboran Rattikarn. Exponentially weighted moving average—Moving average charts for monitoring the process mean. published 14 Feb 2020 PLOS ONE pmid:32059001
- View Article
- PubMed/NCBI
- Google Scholar
6. Riaz Muhammad, Abid Muhammad, Nazir Hafiz Zafar, Abbasi Saddam Akber. An enhanced nonparametric EWMA sign control chart using sequential mechanism. published 21 Nov 2019, PLOS ONE, pmid:31751403
7. Talordphop Khanittha, Sukparungsee Saowanit, Areepong Yupaporn. Performance of new nonparametric Tukey modified exponentially weighted moving average—Moving average control chart. published 29 Sep 2022 PLOS ONE pmid:36174049
8. Montgomery D.C., Introduction to Statistical Quality Control, Sixth Edition. New York, John Wiley & Sons, 2009.
9. Siegmund D., “Sequential Analysis: Tests and Confidence Intervals,” New York: Springer-Verlag, 1985.
10. Reynolds M.R., Stoumbos Z.G., “Robust CUSUM charts for monitoring the process mean and variance,” Quality and Reliability Engineering International, vol. 26, no. 5, pp. 453–473, 2010.
11. Hawkins D., “Robustification of cumulative sum charts by Winsorization,” Journal of Quality Technology, vol.25, no. 4, pp. 248–261, 1993.
12. Lucas J.M., Crosier R.B., “Fast initial response for CUSUM quality control scheme,” Technometrics, vol. 24, pp. 199–205, 1982.
13. Lee M.H., “Economic design of cumulative sum control chart for non-normally correlated data,” Matematika, vol. 27, no. 1, pp. 79–96, 2011.
14. Wang D., Zhang L., Xiong Q., “A non-parametric CUSUM control chart based on the Mann-Whitney statistic,” Communications in Statistics—Theory and Methods, vol. 46, no. 10, pp. 4713–4725, 2017.
15. Wang T., Huang S., “An adaptive multivariate CUSUM control chart for signaling a range of location shifts,” Communications in Statistics—Theory and Methods, vol. 45, no. 16, pp. 4673–4691, 2016.
16. Moustafa A., “A control chart based on robust estimators for monitoring the process mean of a quality characteristic,” International Journal of Quality and Reliability Management, vol. 26, no. 5, pp. 480–496, 2009.
17. Ou Y., Wu Z., Tsung F., “A comparison study of effectiveness and robustness of control charts for monitoring process mean,” International Journal of Production Economics, vol. 135, pp. 479–490, 2011.
18. Ou Y., Wen D., Wu Z., Khoo M.B.C., “A comparison study on effectiveness and robustness of control charts for monitoring process mean and variance,” Quality and Reliability Engineering International, vol. 28, pp. 3–17, 2012.
19. Wang Si-yang, Cui Heng-jian, “Trimmed and Winsorized Transformed Means Based on a Scaled Deviation,” Acta Mathematicae Applicatae Sinica, English Series, vol. 31, no. 2, pp. 475–492, 2015.
20. Pear Hossain M., Ridwan Sanusi A., Hafiz Omar M., and Muhammad Riaz. “On designing Maxwell CUSUM control chart: an efficient way to monitor failure rates in boring processes,” International Journal of Advanced Manufacturing Technology, 2018.
21. Hu X.L., Castagliola P., Tang A.A., “Conditional design of the CUSUM median chart for the process position when process parameters are unknown,” Journal of Statistical Computation and Simulation, vol. 89, no. 13 pp. 2468–2488, 2019.
22. Abu-Shawiesh Moustafa Omar Ahmed, Riaz Muhammad, Khaliq Qurat-Ul-Ain. “MTSD-TCC: A Robust Alternative to Tukey’s Control Chart (TCC) Based on the Modified Trimmed Standard Deviation (MTSD),” Mathematics and Statistics, vol. 8, no. 3 pp. 262–277, 2020.
23. David H.A., “Early Sample Measures of Variability,” Statistical Science, vol. 13, pp. 368–377, 1998. https://www.jstor.org/stable/2676819
24. Mahmoud M.A., Henderson G.R., Epprecht E.K., and Woodall W.H., “Estimating the Standard Deviation in Quality-Control Applications,” Journal of Quality Technology, vol. 42, pp. 348–357, 2010.
25. Schoonhoven M., Nazir H.Z., Riaz M., Does R.J.M.M., “Robust location estimators for the X̄ control chart,” Journal of Quality Technology, vol. 43, pp. 363–379, 2011.
26. Abbasi S.A., Riaz M., Miller A., Ahmad S., Nazir H.Z., “EWMA Dispersion Control Charts for Normal and Non-normal processes,” Quality and Reliability Engineering International, 2014.
27. Rousseeuw P.J., Croux C., “Alternatives to the Median Absolute Deviation,” Journal of the American Statistical Association, vol. 88, pp. 1273–1283, 1993. https://www.jstor.org/stable/2291267
28. Abbasi S., Miller A., “On the proper choice of variability control chart for normal and non-normal processes,” Quality and Reliability Engineering International, vol. 28, pp. 279–296, 2012.
29. Bickel P.J., Lehmann E.L., “Descriptive Statistics for Non-Parametric Models III: Dispersion,” Annals of Statistics, vol. 4, pp. 1139–1158, 1976. https://www.jstor.org/stable/2958585
30. Tuprah K., Ncube M., “A Comparison of Dispersion Quality Control Charts,” Sequential Analysis, vol. 6, pp. 155–163, 1987.
31. Ewan W.D., Kemp K.W., “Sampling Inspection of Continuous Processes with no Autocorrelation Between Successive Results,” Biometrika, vol. 47, pp. 363–380, 1960. https://www.jstor.org/stable/2333307
32. Page E.S., “Controlling the Standard Deviation by CUSUM and Warning Lines,” Technometrics, vol. 5, pp. 307–315, 1963. https://www.jstor.org/stable/1266335
33. Tatum L.G., “Robust Estimation of the Process Standard Deviation for Control Charts,” Technometrics, vol. 39, pp. 127–141, 1997.
34. Antzoulakos D.L., Rakitzis A.C., “The Modified r Out of m Control Chart,” Communications in Statistics - Simulation and Computation, vol. 37, pp. 396–408, 2008.

