Progressive censoring schemes for Marshall-Olkin Pareto distribution with applications: Estimation and prediction

In this paper, two prediction methods are used to predict the non-observed (censored) units under progressive Type-II censored samples, where the lifetimes of the units follow the Marshall-Olkin Pareto distribution. We derive the posterior predictive density of the non-observed units and construct predictive intervals as well. Furthermore, we provide inference on the unknown parameters of the Marshall-Olkin model: point and interval estimates are obtained using both maximum likelihood and Bayesian estimation methods. The Bayes estimates are obtained under the quadratic loss function. The EM algorithm is used to compute the maximum likelihood estimates numerically, and Gibbs sampling with Markov chain Monte Carlo (MCMC) techniques is utilized for the Bayesian calculations. A simulation study is performed to evaluate the performance of the estimators with respect to mean square errors and biases. Finally, we identify the better prediction method by analyzing a real data example under progressive Type-II censoring schemes.


Introduction
Studying new lifetime models has become necessary as many applications have appeared in the natural sciences. Over the last four decades, many authors have focused on generating new lifetime distributions that can fit experimental data in, for example, medicine, engineering, the social sciences, reliability analysis, and other fields. In the literature, some of these new models possess good properties, and others prove superior to the original ones. Generalizations of well-known distributions have been applied to describe various phenomena; one may refer to [1], and later [2], among others. The method proposed by Marshall and Olkin obtains a new distribution by adding a parameter to the original one. The generated family of distributions is more flexible and contains the original distribution as a particular case.
In this paper, we will consider the Marshall-Olkin family with the Pareto distribution as a baseline. The Marshall-Olkin Pareto (MOP) distribution was studied by [13]. The authors used several estimation methods and found that it has good properties and behaves well as a generalization of the well-known Pareto distribution. In reliability and lifetime experiments, units under test can be lost or taken out from the experiment before failure. The loss may not be planned, such as the accidental failure of some units under experiment, or a unit dropping out. Sometimes, the experiment must stop due to the unavailability of testing facilities. Most often, however, the removal of units from an experiment is pre-planned and is made due to cost and time limitations. The benefit of progressive censoring lies in its efficient utilization of the available resources: when some of the surviving units are removed early, they can be used in other tests. In reliability and life testing experiments, one of the primary purposes is to draw inference about the unknown parameters of an underlying lifetime distribution based on censored observations; see [14]. Mainly in such studies, the interest is either to provide estimates for the unknown parameters or to draw prediction inference about future observations. The most frequently used censoring schemes are progressive Type-I and progressive Type-II censoring. One can refer to [15] for progressive censoring schemes and their related issues; see also [16,17]. Recently, several authors have studied parameter inference for different lifetime distributions under the progressive Type-II censoring scheme; see, for example, [18][19][20][21][22][23].
The progressive Type-II censoring scheme can be described as follows: Let X1, X2, . . ., Xn denote the lifetimes of n independent and identically distributed (i.i.d.) units in a life-test experiment. Suppose that R1, R2, . . ., Rm (m < n) are fixed nonnegative integers such that R1 + R2 + . . . + Rm = n − m; hence only m units will be observed, and the remaining n − m units will be censored progressively according to the censoring scheme R = (R1, R2, . . ., Rm). The censoring occurs progressively in m stages, and the failure times of the m observed units are obtained at these stages. When the first failure (the first stage) X 1:m:n occurs, R1 of the n − 1 surviving units are randomly censored from the experiment. When the second failure (the second stage) X 2:m:n occurs, R2 of the n − 2 − R1 surviving units are randomly censored from the experiment. Finally, at the time of the m-th failure (the m-th stage) X m:m:n , all the remaining Rm = n − m − (R1 + R2 + . . . + Rm−1) units are withdrawn from the experiment. It is obvious that Type-II right censoring and complete sampling are special cases of the progressive Type-II censoring scheme, obtained by choosing (R1 = R2 = . . . = Rm−1 = 0, Rm = n − m) and (R1 = R2 = . . . = Rm = 0, n = m), respectively.
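The staged removals described above can be sketched in code. The following is a minimal simulation (the helper name `progressive_type2_sample` is hypothetical, and exponential lifetimes are used purely for illustration):

```python
import random

def progressive_type2_sample(lifetimes, R):
    """Draw a progressive Type-II censored sample.

    lifetimes : list of n simulated i.i.d. lifetimes
    R         : censoring scheme (R1,...,Rm) with sum(R) = n - m
    Returns the m observed failure times x_{1:m:n} <= ... <= x_{m:m:n}.
    """
    pool = list(lifetimes)
    observed = []
    for r in R:
        x = min(pool)                  # next failure among the surviving units
        pool.remove(x)
        observed.append(x)
        # randomly withdraw r of the surviving units
        for unit in random.sample(pool, r):
            pool.remove(unit)
    return observed

random.seed(1)
n, m = 10, 4
R = [2, 1, 0, 3]                       # sum(R) = n - m = 6
data = [random.expovariate(1.0) for _ in range(n)]
x = progressive_type2_sample(data, R)
print(len(x), x == sorted(x))          # -> 4 True
```

Because every removal deletes only units still surviving past the current failure, the observed failure times come out ordered automatically.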
From our literature survey we realized that most of the work on the Marshall-Olkin Pareto distribution has been based on complete samples, and the idea of predicting censored units has not received much attention. This motivates this article, which has two main objectives. The first objective is to provide statistical inference about the unknown parameters of the Marshall-Olkin Pareto distribution when the lifetime data are observed under progressive Type-II censoring. The second objective is to provide statistical inference about the censored observations. We consider both the estimation and prediction problems under classical as well as Bayesian approaches. Further, predictive interval estimates are also constructed.
The probability density function (pdf) and the cumulative distribution function (cdf) of the MOP distribution are given respectively as:

f(x; α, θ, β) = αθβ^θ x^−(θ+1) / [1 − (1 − α)(β/x)^θ]^2, x ≥ β, (1)

F(x; α, θ, β) = [1 − (β/x)^θ] / [1 − (1 − α)(β/x)^θ], x ≥ β, (2)

where α > 0 and θ > 0 are shape parameters and β > 0 is a scale parameter.
Prediction is very important in statistical inference, and several prediction problems have been discussed in the literature; see for example [24][25][26][27][28][29][30]. The main idea is to predict future values of the order statistics based on the observed sample. [24] used Bayesian prediction to predict future values of a progressively censored sample under the flexible Weibull distribution. [25] compared two prediction methods to predict the unobserved units from a new Pareto model under a progressive Type-II censoring scheme. In [30], prediction of the remaining time for the generalized Pareto distribution under a progressively censored sample was considered.
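As a sketch, the MOP density and distribution functions under the standard Marshall-Olkin construction with a Pareto baseline can be coded directly (the exact parameterization is an assumption here), with a numerical check that the pdf is the derivative of the cdf and that F(β) = 0:

```python
def mop_cdf(x, alpha, theta, beta):
    """Marshall-Olkin Pareto cdf: F(x) = (1 - t) / (1 - (1-alpha)*t), t = (beta/x)^theta."""
    t = (beta / x) ** theta
    return (1.0 - t) / (1.0 - (1.0 - alpha) * t)

def mop_pdf(x, alpha, theta, beta):
    """Marshall-Olkin Pareto pdf, the derivative of mop_cdf with respect to x."""
    t = (beta / x) ** theta
    return alpha * theta * beta ** theta * x ** (-(theta + 1)) / (1.0 - (1.0 - alpha) * t) ** 2

a, th, b = 2.0, 1.5, 0.5                  # illustrative parameter values
x0, h = 1.2, 1e-6
num_deriv = (mop_cdf(x0 + h, a, th, b) - mop_cdf(x0 - h, a, th, b)) / (2 * h)
print(abs(num_deriv - mop_pdf(x0, a, th, b)) < 1e-6)   # pdf matches dF/dx
print(mop_cdf(b, a, th, b) == 0.0)                     # F(beta) = 0
```

Setting α = 1 collapses both functions to the ordinary Pareto baseline, which is the "particular case" property of the Marshall-Olkin family noted in the Introduction.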
The methodology in this paper is divided into two main objectives. First, we find the maximum likelihood estimators of the MOP parameters using the EM algorithm, and we use the Bayes method to estimate the MOP parameters; the sample under consideration follows a progressive Type-II censoring scheme. For Bayes estimation, gamma priors and the quadratic loss function are considered. We compare the performance of the two estimation methods numerically through a simulation analysis coded in R. Second, we consider the prediction problem for future unobserved data based on the available observations. Two prediction methods are performed: (i) the best unbiased predictor (BUP) and (ii) the Bayes predictor (BP); predictive intervals (PIs) for the future censored data are also constructed. Numerical analysis and simulation are used to compare the efficiency of the two prediction methods, and finally an illustrative real data example of failure times is presented.
The rest of this paper is organized as follows. In Section 2, the estimation methods, including the MLE and the Bayesian estimation method for the three parameters of the MOP distribution, are computed using the EM algorithm and MCMC numerical methods; we also discuss the prediction methods for the non-observed units of the censored sample using the BUP and BP. In Section 3, a real data set is given for illustrative purposes, and a numerical simulation study is performed and summarized in tables and figures. Discussions are reported in Section 4, while conclusions are given in Section 5.

Maximum likelihood inference
A well-known classical method of point estimation, the maximum likelihood method (MLE), is used for estimating the three unknown parameters of the MOP distribution under the progressive Type-II censoring scheme. Let X = (x 1:m:n , x 2:m:n , . . ., x m:m:n ) with x 1:m:n ≤ x 2:m:n ≤ . . . ≤ x m:m:n be the observed progressive Type-II censored sample of size m drawn from a sample of size n under the MOP distribution with the pdf and cdf given by Eq (1) and Eq (2), respectively. The likelihood function under a progressive Type-II censored sample X is given by:

L(x | α, θ, β) = A ∏_{i=1}^{m} f(x i:m:n ) [1 − F(x i:m:n )]^{R_i},

where A = n(n−1−R1)(n−2−R1−R2) . . . (n−m+1−R1−. . .−Rm−1). See [15].
Using Eq (1) and Eq (2), we obtain the likelihood and hence the log-likelihood function of the MOP distribution. From the log-likelihood we can compute the derivatives with respect to the parameters; since x ≥ β, the MLE of the parameter β is x 1:m:n , the first progressively censored order statistic. We then need to solve the normal equations obtained by equating the remaining derivatives to zero. Since a closed-form solution for these equations cannot be obtained explicitly, one needs to employ a numerical method. The most commonly used method in the literature is Newton-Raphson (N-R), but its main drawback is that it requires the second derivatives of the log-likelihood function at every iteration, which may be computationally cumbersome due to the complicated form of the likelihood function. Alternatively, numerical routines in various programming languages can be used to solve the equations. One good algorithm is the Expectation-Maximization (EM) algorithm, which has been used by several authors to obtain maximum likelihood estimates. This algorithm is very useful compared to the N-R method, especially when the data are not completely observed under some censoring scheme.
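The progressive Type-II log-likelihood can be sketched as follows. This is a minimal pure-Python version in which β is profiled out as the first observed failure, as noted above; the data, scheme, and function names are hypothetical, and the constant A is omitted since it does not affect maximization:

```python
import math

def mop_cdf(x, a, th, b):
    t = (b / x) ** th
    return (1 - t) / (1 - (1 - a) * t)

def mop_loglik(a, th, x_obs, R):
    """Progressive Type-II log-likelihood: sum over observed failures of
    log f(x_i) + R_i * log(1 - F(x_i)), with beta-hat = x_{1:m:n}."""
    b = x_obs[0]                         # MLE of beta is the first failure
    ll = 0.0
    for xi, Ri in zip(x_obs, R):
        t = (b / xi) ** th
        denom = 1 - (1 - a) * t
        log_f = (math.log(a * th) + th * math.log(b)
                 - (th + 1) * math.log(xi) - 2 * math.log(denom))
        log_sf = math.log(a * t / denom)  # log(1 - F(x_i)) for the MOP model
        ll += log_f + Ri * log_sf
    return ll

x_obs = [0.52, 0.61, 0.80, 1.05, 1.40]   # hypothetical observed failures (m = 5)
R = [1, 0, 2, 0, 2]                      # censoring scheme, so n = 10
print(math.isfinite(mop_loglik(2.0, 1.5, x_obs, R)))
```

A function of this form can be handed to any numerical optimizer over (α, θ); only the survival and density expressions would change for other lifetime models.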
The normal approximation of the MLE can be used to construct approximate confidence intervals for the parameters α, β, and θ. From the asymptotic properties of the MLE, the vector of estimates is approximately normally distributed around the true parameter values with variance-covariance matrix I^−1(F), where I(F) is the Fisher information matrix built from the second partial derivatives of the log-likelihood. The expected values of these second partial derivatives are obtained numerically using R. The variances of the MLEs are the diagonal elements of I^−1(F), which can be computed using Det(I(F)), the determinant of the information matrix I. The (1−z)100% approximate confidence intervals for the MLEs of α, β, and θ then take the usual Wald form, namely the estimate plus or minus the corresponding normal quantile times the estimated standard error.
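The Wald-type construction can be illustrated in a one-parameter case where the information is known in closed form. The exponential rate model and the toy data below are illustrative stand-ins for the three-parameter MOP case; the point is that a numerically differentiated observed information reproduces the analytic one, and the interval follows:

```python
import math

def loglik(lam, xs):
    # exponential log-likelihood: n*log(lam) - lam*sum(x)
    return len(xs) * math.log(lam) - lam * sum(xs)

xs = [0.3, 1.2, 0.7, 2.1, 0.5, 0.9]      # made-up data
lam_hat = len(xs) / sum(xs)               # exponential MLE (= 1 / mean)

h = 1e-4                                  # central second difference step
obs_info = -(loglik(lam_hat + h, xs) - 2 * loglik(lam_hat, xs)
             + loglik(lam_hat - h, xs)) / h ** 2
se = 1 / math.sqrt(obs_info)
ci = (lam_hat - 1.96 * se, lam_hat + 1.96 * se)   # 95% Wald interval

print(abs(obs_info - len(xs) / lam_hat ** 2) < 1e-3)  # matches n / lam^2
print(ci[0] < lam_hat < ci[1])
```

For the MOP model the same recipe applies with a 3×3 numerical Hessian; the standard errors come from the diagonal of its inverse.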

EM algorithm
The basic idea of the EM algorithm begins with writing the log-likelihood function given the complete sample W. We have the observed data X = (x 1:m:n , x 2:m:n , . . ., x m:m:n ), while n − m units are removed or censored. Suppose that the lifetimes of the censored observations are Z = (z 1 , z 2 , . . ., z n−m ); hence the complete sample W is the combination of the observed data X and the censored data Z, that is, W = (X, Z). Differentiating the complete-data log-likelihood partially with respect to the parameters α and θ and equating the derivatives to zero yields the normal equations. When applying the EM algorithm, we should keep in mind that the MLE of β is x 1:m:n , so it is treated as known in these normal equations. The EM algorithm consists of an expectation step (E-step) and a maximization step (M-step). The E-step replaces the censored lifetimes by their conditional expectations, whereas the M-step maximizes the resulting complete-data log-likelihood at each iteration.
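A minimal sketch of the E-step/M-step cycle follows. To keep the expectations closed-form, it uses an exponential lifetime instead of the MOP model, where a unit censored at c has conditional expected lifetime E[Z | Z > c] = c + 1/λ; the data and scheme are hypothetical:

```python
# assumptions: exponential lifetimes, made-up progressive sample
x = [0.4, 0.9, 1.3, 2.0]        # m = 4 observed failures
R = [1, 0, 2, 1]                # censored counts at each failure, n = 8
n, m = sum(R) + len(x), len(x)

lam = 1.0                       # initial guess
for _ in range(200):
    # E-step: a unit censored at x_i has expected lifetime x_i + 1/lam
    total = sum(x) + sum(Ri * (xi + 1 / lam) for xi, Ri in zip(x, R))
    # M-step: complete-data exponential MLE, n / (expected total lifetime)
    lam = n / total

T = sum(x) + sum(Ri * xi for xi, Ri in zip(x, R))   # total time on test
print(abs(lam - m / T) < 1e-8)  # EM converges to the closed-form MLE m/T
```

The same cycle applies to the MOP model; there the E-step expectations have no closed form and are evaluated numerically at each iteration.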

Bayes estimation
In the Bayesian method, all parameters are considered random variables with a certain distribution called the prior distribution. If prior information is not available, which is usually the case, a prior must be selected. Since the choice of prior distribution plays an important role in the estimation of the parameters, we take as priors for α, β, and θ the independent gamma distributions G(a 1 , b 1 ), G(a 2 , b 2 ), and G(a 3 , b 3 ), respectively. The gamma prior is chosen for its flexible nature; in particular, it approaches a non-informative prior as the hyperparameter values tend to zero. Here a 1 , a 2 , a 3 , b 1 , b 2 , and b 3 are the hyperparameters of the prior distributions, and all are positive real constants. The joint prior of α, β, and θ is the product of these densities, and the joint posterior of α, β, and θ is proportional to L(x|α, θ, β)g(α, β, θ), where L(x|α, θ, β) is the likelihood function of the MOP distribution under progressively Type-II censored samples as in Eq (4). Substituting L(x|α, θ, β) and g(α, β, θ) for the MOP model under the progressive Type-II censoring scheme yields the joint posterior density, where G(., .) denotes the probability density of the gamma distribution. Therefore, the Bayes estimate of any function of α, β, and θ, say h(α, β, θ), under the quadratic loss function is its posterior expected value. Since it is difficult to compute this expected value analytically, we opt to use the Markov chain Monte Carlo (MCMC) technique; see [31] and [32].
The Gibbs sampling method will be used to generate a sample from the posterior density function p(α, β, θ|x) and compute the Bayes estimates. To generate a sample from the posterior distribution, the prior densities are assumed to be as described in Eq (6). The full conditional posterior densities of α, β, and θ given the data then follow. Since these full conditional distributions cannot be reduced to well-known distributions, we cannot generate α, β, and θ from them directly by standard methods; therefore we sample from them using the M-H algorithm, see [33] and [34]. The idea here is to keep the rejection rate as low as possible. The algorithm below uses the M-H algorithm with the normal distribution as a proposal distribution, which is used to find the Bayes estimators and also to construct the credible intervals for α, β, and θ. To apply the Gibbs technique, we use the following algorithm:
1. Start with initial values (α 0 , β 0 , θ 0 ).
2. Use the M-H algorithm to generate a posterior sample for α, β, and θ from Eq (8).
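A minimal random-walk Metropolis-Hastings sketch with a normal proposal is shown below. The target here is a stand-in unnormalized log-posterior (a Gamma(2, 1) density, chosen only because its mean is known); in the paper's setting, the target would be each full conditional from Eq (8):

```python
import math, random

def log_target(x):
    """Stand-in unnormalized log-posterior: Gamma(2,1), i.e. log(x) - x for x > 0."""
    return math.log(x) - x if x > 0 else float("-inf")

random.seed(42)
cur, chain = 1.0, []
for _ in range(20000):
    prop = cur + random.gauss(0.0, 1.0)        # random-walk normal proposal
    # accept with probability min(1, target(prop)/target(cur))
    if math.log(random.random()) < log_target(prop) - log_target(cur):
        cur = prop
    chain.append(cur)

post = chain[5000:]                            # discard burn-in
est = sum(post) / len(post)                    # posterior-mean (SEL) estimate
print(abs(est - 2.0) < 0.2)                    # Gamma(2,1) has mean 2
```

Proposals falling outside the support get log-density −∞ and are always rejected, which is how the positivity constraint on the parameters is respected; tuning the proposal standard deviation controls the rejection rate mentioned above.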

Prediction
In many fields of life testing and reliability studies, predicting the unobserved or censored observations from the observed sample data has received great attention; see for example [24][25][26][27][28][29][30]. In this section, we perform two prediction methods, namely the best unbiased predictor (BUP) and the Bayes predictor (BP).
where c* = r_j! / ((s−1)!(r_j−s)!). Now, substituting the pdf and the cdf of the MOP distribution into Eq (11) and simplifying, we obtain the conditional density f(y_{s:r_j} | x; α, β, θ) in Eq (12). The conditional density in Eq (12) can be rewritten using the well-known binomial expansion to give Eq (13). The best unbiased predictor (BUP) of Y_{s:r_j} is the conditional expectation E(Y_{s:r_j} | X_{j:m:n}), given in Eq (14). Using integration techniques and the binomial expansion, Eq (14) reduces to an expression in the censored data, with y ≥ x_{j:m:n} and u = k + r_j − s + 1. If the parameters α, β, and θ are all unknown, then the BUP of Y_{s:r_j} is obtained by plugging in the MLEs of α, β, and θ, with x_{j:m:n} ≥ the MLE of β.
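At its core the BUP is a conditional expectation given survival past the last observed failure. As an illustration with a plain Pareto baseline rather than the full MOP conditional density (and with made-up values), Monte Carlo sampling from the truncated distribution recovers the closed-form conditional mean cθ/(θ−1), valid for θ > 1:

```python
import random

theta, c = 3.0, 2.5      # c plays the role of the last observed failure x_{j:m:n}
random.seed(0)

# sampling from X | X > c: for Pareto, the conditional law is again
# Pareto with scale c, so inverse-cdf sampling gives c * (1-u)^(-1/theta)
draws = [c * (1 - random.random()) ** (-1 / theta) for _ in range(200000)]
bup_mc = sum(draws) / len(draws)

bup_exact = c * theta / (theta - 1)   # closed-form conditional mean
print(abs(bup_mc - bup_exact) < 0.03)
```

For the MOP model the conditional density in Eq (13) replaces the truncated Pareto, and the expectation is evaluated by the integral reductions described above rather than in closed form.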

Bayesian prediction.
The Bayes prediction of the non-observed units from the future sample depends on the current observed sample, which is called the informative sample. For that purpose, we obtain the posterior predictive density of the s-th order statistic Y_{s:r_j}. The posterior predictive density of Y_{s:r_j} given the observed censored data X is obtained by integrating the conditional density of Y_{s:r_j} given α, β, θ, and the data X, which is given in Eq (13), against the joint posterior p(α, β, θ|x) given in Eq (8). Using the squared error loss (SEL) function, the Bayes predictor (BP) of Y = Y_{s:r_j} is the mean of this posterior predictive density. The form of the posterior predictive density in Eq (16) is not easy to compute, so evaluating the predictive Bayes estimate E_π(Y|data) analytically is not an easy task; hence we use the MCMC technique described earlier to generate samples from the predictive distribution and find the Bayes predictor.
Based on the MCMC samples {(α ℓ , β ℓ , θ ℓ ): ℓ = 1, 2, . . ., M} obtained from the Gibbs sampling and M-H methods, the Bayes predictor of Y_{s:r_j} is computed as the average of the conditional predictors over the draws. From the posterior predictive density one can also obtain a two-sided predictive interval for Y = Y_{s:r_j}, (s = 1, 2, . . ., r_j; j = 1, 2, . . ., m). For that purpose, we need the predictive survival function of Y = Y_{s:r_j} at any point y > x_{j:m:n}. Under the SEL function, the predictive survival function of Y = Y_{s:r_j} is given by Eq (17). Since Eq (17) cannot be easily evaluated analytically, a numerical approximation is preferable: the MCMC samples {(α ℓ , β ℓ , θ ℓ ): ℓ = 1, 2, . . ., M} yield a simulated estimator of the predictive survival function. The (1−ξ)100% predictive interval of Y = Y_{s:r_j} is then evaluated by solving the resulting non-linear equations for the lower bound (L) and the upper bound (U) using a suitable numerical technique.
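Given MCMC draws {(α ℓ , β ℓ , θ ℓ )}, an equal-tailed predictive interval can also be approximated by composition sampling: draw one future value per posterior draw and take empirical quantiles, which mirrors averaging the predictive survival function over the chain. The sketch below uses a stand-in one-parameter posterior; all the distributions here are illustrative assumptions:

```python
import random

random.seed(3)
# stand-in posterior draws for a rate parameter (hypothetical Gamma posterior)
post = [random.gammavariate(5.0, 1 / 4.0) for _ in range(10000)]

# composition sampling: one predictive draw of Y per posterior draw
pred = sorted(random.expovariate(lam) for lam in post)

# empirical 2.5% and 97.5% quantiles give a 95% predictive interval
lo = pred[int(0.025 * len(pred))]
hi = pred[int(0.975 * len(pred))]
print(lo < hi)
```

For the MOP problem, the predictive draw per iteration would come from the conditional density in Eq (13) evaluated at (α ℓ , β ℓ , θ ℓ ), but the quantile step is unchanged.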

Real data
We consider a progressively censored real data set from [35]; it consists of the failure times of 20 mechanical components, see Table 1.
We examine the behavior of the estimators and predictors based on censored samples drawn from the data in Table 1. From the given data set we consider three different progressive Type-II censoring schemes:
1. Scheme 1: (0 × (0.9n − 1), 0.1n)
2. Scheme 2: (0.1n, 0 × (0.9n − 1))
3. Scheme 3:
Based on these censored samples, we estimate the parameters of the MOP distribution using the MLE and Bayesian approaches. Table 2 shows the MLEs computed using the Newton-Raphson method and the Bayes estimates under SEL obtained using the MCMC technique; it also contains the 95% asymptotic confidence and credible interval estimates.
To check the validity of the Marshall-Olkin Pareto (MOP) distribution for this data set, the Kolmogorov-Smirnov (K-S) test is applied, which measures the distance between the fitted and empirical distribution functions. The K-S distance and its p-value are computed to be K − S = 0.1304 and p-value = 0.8441, respectively. Therefore, it is quite reasonable to conclude that the MOP distribution fits these data well. Table 3 reports some true values of Y_{s:r_j} compared with predicted values obtained under censoring schemes 1, 2, and 3 using the BUP and BP prediction methods. It can be seen clearly that the predicted values of Y_{s:r_j} under the BP method are closer to the real values than those under the BUP method, and this holds for all censoring schemes under consideration. Hence, we recommend the BP method for predicting the unobserved units in this real data example.

Simulation study
In this subsection, we first use simulation analysis to compare the performance of the Bayes estimators with the classical estimators obtained via the MLE approach under progressively Type-II censored MOP samples. Second, we find the best unbiased predictor and the Bayes predictor for the non-observed data based on the observed ones. The squared error loss (SEL) function is used for the Bayesian estimation. For the comparisons, we use the mean square errors (MSEs) of the different estimators based on 10000 replications, computed in R.
Samples from the MOP distribution are generated via the quantile function Q(p), where p is a uniform deviate on (0, 1). We obtain the point predictors and the 95% prediction intervals for the non-observed order statistics Y_{s:r_j}, (s = 1, 2, . . ., r_j; j = 1, 2, . . ., m).
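Under the Marshall-Olkin construction with a Pareto baseline (the parameterization is an assumption here), inverting the cdf gives the quantile function Q(p) = β[(1 − (1 − α)p)/(1 − p)]^(1/θ), which turns uniform deviates into MOP deviates:

```python
import random

def mop_q(p, alpha, theta, beta):
    """Quantile (inverse cdf) of MOP under the assumed parameterization."""
    return beta * ((1 - (1 - alpha) * p) / (1 - p)) ** (1 / theta)

def mop_cdf(x, alpha, theta, beta):
    t = (beta / x) ** theta
    return (1 - t) / (1 - (1 - alpha) * t)

a, th, b = 2.0, 1.5, 0.5                 # illustrative parameter values
# round-trip check: F(Q(p)) should recover p
ok = all(abs(mop_cdf(mop_q(p, a, th, b), a, th, b) - p) < 1e-9
         for p in (0.1, 0.5, 0.9))
print(ok)

random.seed(11)
sample = [mop_q(random.random(), a, th, b) for _ in range(5)]  # 5 MOP deviates
```

Every generated value satisfies x ≥ β, matching the support of the distribution, so a simulated progressive sample can be built directly from such deviates.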
The MLE and Bayes estimates of the MOP parameters under the three censoring schemes are given in Tables 4 and 5, with initial values of the parameters (α, β, θ) taken as (2, 0.5, 1.5) and (2.5, 0.5, 1.5). The bias and the MSE are used to assess the performance of the two estimation methods. It can be seen from Tables 4 and 5 that the Bayesian estimation method performs better than the MLE with respect to bias and MSE for estimating α, β, and θ, since both the MSE and the bias are smaller under the Bayesian method than under the MLE.

The corresponding 95% prediction intervals (PIs) for the non-observed order statistics Y_{s:r_j} are reported in Tables 6-8. In Table 6, censoring scheme 1 is used, and the interval lengths indicate the best predictive interval estimate for the unobserved Y_{s:r_j}. Since the BUP has a shorter interval length than the BP, we conclude that the BUP is preferable for predicting Y_{s:r_j} under this scheme. For the other censoring schemes we obtain different results, as shown in Tables 7 and 8, where the best interval estimation is given by the BP.

Discussion
The MLEs based on the EM algorithm and the Bayes estimates under the SEL function using the MCMC method are obtained and displayed in Table 2. Also, the approximate asymptotic confidence intervals and the credible intervals are computed and tabulated in Table 2. From Table 2, we notice that the MLEs and the Bayes estimates nearly coincide, and that the Bayes credible intervals are shorter than the approximate confidence intervals. The analysis of the proposed data set demonstrates the applicability and importance of the proposed model.
The relative histogram and the fitted MOP distribution for the data set are shown in Fig 3, together with a plot of the maximum distance between the empirical and fitted CDF curves for the MOP distribution. This further indicates that the MOP distribution fits the data set.
In addition to the histogram plots, the approximate marginal posterior densities and the MCMC convergence of α, β, and θ for the data set are presented in Fig 4. Suppose the life test ends when the 15th observation is observed, i.e., we observe a Type-II censored sample with n = 20 and m = 18. Based on the prediction methods presented in Section 2, we computed the point predictors; these results are presented in Table 3. It is clearly evident that the point predictors are very close to the true values for the different schemes. Moreover, the predicted values under BP are better than those under BUP for the different schemes.
In the simulation algorithm, Monte Carlo experiments were carried out on data generated from the MOP distribution for different parameter vectors Ω = (α, β, θ), with the initial parameter values stated above.

PLOS ONE
In Tables 4 and 5, we present the MLEs of α, β, and θ as well as their Bayes estimates under the SEL function. Numerical results for the Bayes estimates of α, β, and θ, together with the corresponding bias and MSE values, are reported for cases 1 and 2, respectively, for different sample sizes (n) under the different schemes. We observe that the Bayes estimates perform better than the MLEs in terms of bias and MSE. For fixed n, as the number of observed failures increases, the bias and MSE decrease in all cases and for all estimators. Comparing the three censoring schemes, in most of cases 1 and 2, Scheme 3 performs better than Schemes 1 and 2 in terms of bias and MSE. In Tables 6-8, we report the best unbiased predictor (BUP) and the Bayes predictive estimates (BP) for both sets of parameters, along with the predictive intervals based on the classical approach, for case 3 under the different schemes. The lifetimes of the higher-order units increase as n increases, and the lengths of the predictive intervals become wider as well.
In Table 6, we present the BUP and BP prediction estimates along with the pivotal and CI prediction intervals for case 3 based on Scheme 1; Table 7 presents the corresponding results based on Scheme 2, while Table 8 addresses those based on Scheme 3. In general, the BP estimates are slightly smaller than the BUP estimates, and the pivotal intervals are slightly wider than the 95% CI prediction intervals.

Conclusions
In this article, we used two prediction methods to predict the unobserved units under progressive Type-II censored sampling, where the lifetimes followed the Marshall-Olkin Pareto model. Point and interval estimates of the unknown parameters of the Marshall-Olkin Pareto distribution were obtained using classical (MLE) and non-classical (Bayesian) estimation. The EM algorithm was used to find the numerical solution of the MLE, and the MCMC method was used for the Bayesian calculations. A simulation study was conducted to assess the performance of these estimation methods; based on its results, we showed the advantage of the Bayesian method over the classical method of estimation. A real data example was used to determine the better prediction method, and we observed the advantage of the Bayesian prediction method over the BUP method.