Bayesian estimation for Dagum distribution based on progressive type I interval censoring

In this paper, we consider the Dagum distribution, which is capable of modeling various shapes of failure rates and aging criteria. Based on progressively type-I interval censored data, we first obtain the maximum likelihood estimators and the approximate confidence intervals of the unknown parameters of the Dagum distribution. Next, we obtain the Bayes estimators of the parameters under the squared error loss (SEL) and balanced squared error loss (BSEL) functions, using independent informative gamma and non-informative uniform priors for the scale parameter and the two shape parameters. A Monte Carlo simulation study is performed to compare the performance of the proposed Bayes estimators with that of the maximum likelihood estimators. We also compute credible intervals and symmetric 100(1 − τ)% two-sided Bayes probability intervals under the respective approaches. In addition, based on observed samples, Bayes predictive estimates and intervals are obtained using one- and two-sample schemes. Simulation results reveal that the Bayes estimates based on SEL and BSEL perform better than the maximum likelihood estimates in terms of bias and MSE. Moreover, the credible intervals are shorter than the confidence intervals. Predictive estimates based on SEL with an informative prior perform better than those with a non-informative prior for both one- and two-sample schemes. Further, an optimal censoring scheme is suggested using an optimality criterion. Finally, we analyze a data set to illustrate the results derived.

1 Introduction
[1] introduced a heavy-tailed distribution, called the Dagum distribution, for modeling income distributions as an alternative to the log-normal and Pareto models. Since then, researchers have employed the Dagum distribution to study the distribution of wealth and income, reliability and survival data, etc. Further, the Dagum distribution admits a mixture representation in terms of generalized gamma and inverse Weibull distributions. The distribution can also be obtained as a compound generalized gamma (GG) distribution whose scale parameter follows an inverse Weibull (IW) distribution with identical shape parameters. Lately, researchers have also adopted the Dagum distribution in the context of reliability and survival analysis (see [2][3][4]). [5,6] carried out a detailed study on the origin and applications of the Dagum distribution. Recently, [7] studied the properties of the Dagum distribution and different classical methods of estimating it.
A random variable t has the Dagum distribution with three parameters θ, β and λ if it has cumulative distribution function (CDF) (for t > 0) given by
$$F(t;\theta,\beta,\lambda) = \left(1+\lambda t^{-\theta}\right)^{-\beta}. \tag{1}$$
The probability density function (pdf) corresponding to (1) reduces to
$$f(t;\theta,\beta,\lambda) = \theta\beta\lambda\, t^{-\theta-1}\left(1+\lambda t^{-\theta}\right)^{-\beta-1}, \tag{2}$$
where θ and β are the shape parameters and λ is the scale parameter. The reliability (survival) function is given by
$$R(t;\theta,\beta,\lambda) = 1-\left(1+\lambda t^{-\theta}\right)^{-\beta}, \tag{3}$$
and the hazard rate function is
$$h(t;\theta,\beta,\lambda) = \frac{\theta\beta\lambda\, t^{-\theta-1}\left(1+\lambda t^{-\theta}\right)^{-\beta-1}}{1-\left(1+\lambda t^{-\theta}\right)^{-\beta}}. \tag{4}$$
In many life testing and survival analysis studies, a test unit is withdrawn before failure because of restrictions on time or budget, or because of accidental breakage. The data obtained in such cases are incomplete and are called a censored sample. Several censoring methodologies have been developed over the last several decades for studying different observable physical phenomena. In the literature, type-I and type-II are the two most common censoring schemes and are widely used in reliability and life testing experiments (see [8]). However, neither of these schemes allows the removal of experimental units during the experiment. This limitation led to the development of the progressive censoring scheme, wherein test items are withdrawn before the termination of the experiment (see [9] for more details). In many life testing experiments, items put on test are only observed within intervals of time, which is called interval censoring. However, this censoring scheme also does not allow removal of units during the experiment. [10] introduced the progressive type-I interval censoring scheme for the exponential distribution. In this type of censoring, items can be withdrawn between two prescheduled consecutive time points. In the recent past, the progressive type-I interval censoring scheme has gained wide popularity due to its applicability in many practical problems.
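The Dagum functions (1) to (4) introduced above can be sketched in a few lines of code. The following is a minimal illustration only, written in plain Python rather than the R used for the simulations in this paper; the function names and the parameter values in the example calls are assumptions made for illustration.

```python
import math


def dagum_cdf(t, theta, beta, lam):
    """CDF (1): F(t) = (1 + lam * t**(-theta))**(-beta) for t > 0."""
    if t <= 0:
        return 0.0
    return (1.0 + lam * t ** (-theta)) ** (-beta)


def dagum_pdf(t, theta, beta, lam):
    """Density (2), the derivative of the CDF with respect to t."""
    if t <= 0:
        return 0.0
    base = 1.0 + lam * t ** (-theta)
    return theta * beta * lam * t ** (-theta - 1.0) * base ** (-beta - 1.0)


def dagum_survival(t, theta, beta, lam):
    """Reliability function (3): R(t) = 1 - F(t)."""
    return 1.0 - dagum_cdf(t, theta, beta, lam)


def dagum_hazard(t, theta, beta, lam):
    """Hazard rate (4): h(t) = f(t) / R(t)."""
    return dagum_pdf(t, theta, beta, lam) / dagum_survival(t, theta, beta, lam)


# Illustrative parameter values (not taken from the paper).
print(dagum_cdf(1.0, 1.0, 1.0, 1.0))     # 0.5
print(dagum_hazard(1.0, 1.0, 1.0, 1.0))  # 0.5
```

With θ = β = λ = 1 the CDF reduces to t/(1 + t), which makes the printed values easy to check by hand.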
The first objective of this paper is to estimate the unknown parameters of the model using maximum likelihood estimators (MLEs) and to obtain the corresponding asymptotic confidence intervals. The second objective is to obtain Bayes estimators under the SEL and BSEL functions using independent gamma priors for both the scale and shape parameters of the model. We also obtain Bayes credible intervals for the unknown parameters using the Gibbs sampling technique. The third objective is to obtain Bayes predictive estimates and intervals based on the observed samples using one-sample and two-sample schemes. Finally, the fourth objective is to obtain optimal censoring schemes. As far as we know, no study has considered a three-parameter distribution together with one- and two-sample prediction intervals in the Bayesian framework and optimal censoring schemes under progressive type-I interval censoring, perhaps because of the computational complexity involved.
The remainder of this article is organized as follows. In Section 3, a brief discussion of progressive type-I interval censored sampling is presented. Section 4 deals with maximum likelihood estimation and the associated approximate confidence intervals. Section 5 discusses Bayesian estimation and credible intervals. Section 6 discusses Bayesian prediction based on one-sample and two-sample schemes. A simulation study is conducted in Section 7 to compare the performance of the various estimates developed in this paper. Section 8 deals with optimal censoring schemes. In Section 9, the estimation procedures developed in the previous sections are illustrated with a real data set. Finally, conclusions are presented in Section 10.

Progressive type-I interval censoring
Assume that $n$ identical items are put on a life test at time $t_0 = 0$ and are inspected at prespecified times $0 < t_1 < t_2 < \ldots < t_m$, where $t_m$ is the scheduled time to terminate the experiment and $m$ is fixed in advance. At time $t_j$, the number of failures observed in the interval $(t_{j-1}, t_j]$ is $D_j$. Further, assume that $R_j$ surviving items are removed at random from the life test at time $t_j$, $j = 1, 2, \ldots, m$. The number of surviving units at time $t_j$, say $S_j$, is a random variable, and therefore $R_j \le S_j$. Thus, $R_j$ can be determined from the pre-fixed percentage values $p_1, p_2, \ldots, p_{m-1}$, with $p_m = 1$, as $R_j = \lfloor S_j p_j \rfloor$ for $j = 1, 2, \ldots, m-1$. The data in progressive type-I interval censoring are therefore given by $T_j = (D_j, R_j, t_j)$, $j = 1, 2, \ldots, m$. If $R_j = 0$ for $j = 1, 2, \ldots, m-1$ and $R_m = n - \sum_{j=1}^{m} D_j$, then the progressive type-I interval censoring scheme reduces to the conventional interval censored sample.

Maximum likelihood estimators
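Given the progressively type-I interval censored data $T_j = (D_j, R_j, t_j)$, $j = 1, 2, \ldots, m$, of size $n$ from a continuous lifetime distribution with CDF as defined in (1), the likelihood function can be written as ([10])
$$L(\phi) \propto \prod_{j=1}^{m}\left[F(t_j;\phi)-F(t_{j-1};\phi)\right]^{D_j}\left[1-F(t_j;\phi)\right]^{R_j}, \tag{5}$$
where $t_0 = 0$ and $\phi \equiv (\theta, \beta, \lambda)$ is the parameter vector. For the CDF of the Dagum distribution defined in (1), the likelihood function (5) can be specified as follows:
$$L(\phi) \propto \prod_{j=1}^{m}\left[\left(1+\lambda t_j^{-\theta}\right)^{-\beta}-\left(1+\lambda t_{j-1}^{-\theta}\right)^{-\beta}\right]^{D_j}\left[1-\left(1+\lambda t_j^{-\theta}\right)^{-\beta}\right]^{R_j}, \tag{6}$$
where the term involving $t_0 = 0$ is taken as $F(0) = 0$. The log-likelihood function, after ignoring the constant of proportionality, is given by
$$\ell(\phi) = \sum_{j=1}^{m} D_j \ln\left[\left(1+\lambda t_j^{-\theta}\right)^{-\beta}-\left(1+\lambda t_{j-1}^{-\theta}\right)^{-\beta}\right] + \sum_{j=1}^{m} R_j \ln\left[1-\left(1+\lambda t_j^{-\theta}\right)^{-\beta}\right]. \tag{7}$$
The corresponding normal equations are
$$\frac{\partial \ell}{\partial \theta} = 0, \tag{8}$$
$$\frac{\partial \ell}{\partial \beta} = 0, \tag{9}$$
$$\frac{\partial \ell}{\partial \lambda} = 0, \tag{10}$$
whose explicit forms involve derivative terms such as
$$(U_j)_\lambda \equiv \frac{\partial U_j(\theta,\beta,\lambda)}{\partial \lambda} = \frac{t_{j-1}^{-\theta}}{\lambda t_{j-1}^{-\theta}+1}.$$
The maximum likelihood estimates $\hat\theta_{MSL}$, $\hat\beta_{MSL}$ and $\hat\lambda_{MSL}$ of θ, β and λ, respectively, can be obtained by solving Eqs (8), (9) and (10) simultaneously. For more details about maximum likelihood estimates see, for example, Dong et al. [24], Chen et al. [25] and Chen et al. [26]. For interval estimation of the model parameters, we require the observed information matrix. Applying the usual large sample approximation, the asymptotic distribution of the MLE $\hat\phi$ is $(\hat\phi - \phi) \to N(0, I^{-1}(\phi))$, see [27], where $I(\phi)$, the expected Fisher information matrix of the unknown parameters $\phi = (\theta, \beta, \lambda)$, is
$$I(\phi) = -E\left[\frac{\partial^2 \ell(\phi)}{\partial \phi_i\, \partial \phi_k}\right]_{i,k=1,2,3}, \qquad (\phi_1,\phi_2,\phi_3) = (\theta,\beta,\lambda).$$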
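Because the normal equations have no closed-form solution, the log-likelihood is usually evaluated numerically and passed to an optimizer. The sketch below evaluates the progressive type-I interval censored log-likelihood for the Dagum CDF (1), up to an additive constant. It is an illustration under stated assumptions: the function names and the small data set in the usage note are hypothetical, and the routine is not the code used to produce the results in this paper.

```python
import math


def dagum_cdf(t, theta, beta, lam):
    """Dagum CDF (1), with F(0) = 0 by convention."""
    if t <= 0:
        return 0.0
    return (1.0 + lam * t ** (-theta)) ** (-beta)


def log_likelihood(params, times, failures, removals):
    """Log-likelihood of (theta, beta, lam), up to an additive constant.

    times    -- inspection times t_1 < ... < t_m
    failures -- D_j, failures observed in (t_{j-1}, t_j]
    removals -- R_j, surviving units withdrawn at t_j
    """
    theta, beta, lam = params
    if theta <= 0 or beta <= 0 or lam <= 0:
        return -math.inf
    total = 0.0
    prev_cdf = 0.0  # F(t_0) with t_0 = 0
    for t_j, d_j, r_j in zip(times, failures, removals):
        cdf = dagum_cdf(t_j, theta, beta, lam)
        # Each pair is (count, probability); zero counts contribute nothing.
        for count, prob in ((d_j, cdf - prev_cdf), (r_j, 1.0 - cdf)):
            if count > 0:
                if prob <= 0.0:
                    return -math.inf
                total += count * math.log(prob)
        prev_cdf = cdf
    return total
```

For example, with hypothetical data `times=[1.0, 3.0]`, `failures=[2, 1]`, `removals=[1, 1]` and θ = β = λ = 1, the interval probabilities are 0.5 and 0.25 and the survival probabilities are 0.5 and 0.25, so the function returns 7 ln(0.5). Maximizing this function over (θ, β, λ) with any general-purpose optimizer gives the MLEs.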

Bayesian analysis
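This section considers Bayes estimation of the parameters. When all the parameters are unknown, no joint conjugate prior is available. In such a situation, there are a number of ways to choose the priors. Here, we consider piecewise independent gamma priors (see [28][29][30][31]). It is assumed that θ, β and λ have independent gamma prior distributions with pdfs given by
$$\pi_1(\theta) \propto \theta^{a_1-1}e^{-b_1\theta}, \qquad \pi_2(\beta) \propto \beta^{a_2-1}e^{-b_2\beta}, \qquad \pi_3(\lambda) \propto \lambda^{a_3-1}e^{-b_3\lambda}, \qquad \theta, \beta, \lambda > 0.$$
Here all the hyperparameters $a_1, b_1, a_2, b_2, a_3$ and $b_3$ are assumed to be known. Thus, the joint prior distribution is given by
$$\pi(\theta,\beta,\lambda) \propto \theta^{a_1-1}\beta^{a_2-1}\lambda^{a_3-1}\exp\{-(b_1\theta+b_2\beta+b_3\lambda)\}. \tag{21}$$
Combining (21) with (6), the joint posterior distribution is derived as
$$\pi(\theta,\beta,\lambda \mid t) \propto \pi(\theta,\beta,\lambda)\, L(\phi), \tag{22}$$
where $L(\phi)$ is the likelihood (6). The marginal posterior probability density functions of θ, β and λ are given respectively as
$$\pi(\theta \mid t) = \int_0^\infty\!\!\int_0^\infty \pi(\theta,\beta,\lambda \mid t)\, d\beta\, d\lambda, \qquad \pi(\beta \mid t) = \int_0^\infty\!\!\int_0^\infty \pi(\theta,\beta,\lambda \mid t)\, d\theta\, d\lambda, \qquad \pi(\lambda \mid t) = \int_0^\infty\!\!\int_0^\infty \pi(\theta,\beta,\lambda \mid t)\, d\theta\, d\beta.$$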

Bayes estimators under squared error loss function
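The Bayes estimator of a parameter under the SEL function is the posterior mean of that parameter. Hence the Bayes estimator $\hat\theta_{SEL}$ of θ under SEL can be expressed as
$$\hat\theta_{SEL} = E(\theta \mid t) = \int_0^\infty \theta\, \pi(\theta \mid t)\, d\theta.$$
In a similar way, we can obtain the estimators $\hat\beta_{SEL}$ and $\hat\lambda_{SEL}$ of β and λ, respectively, as given below:
$$\hat\beta_{SEL} = E(\beta \mid t) = \int_0^\infty \beta\, \pi(\beta \mid t)\, d\beta, \qquad \hat\lambda_{SEL} = E(\lambda \mid t) = \int_0^\infty \lambda\, \pi(\lambda \mid t)\, d\lambda.$$
As these estimators cannot be obtained in closed form, the estimates are computed numerically.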
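One way to approximate these posterior means numerically is to draw from the joint posterior by Markov chain Monte Carlo and average the draws. The sketch below uses a random-walk Metropolis sampler on the log scale, which is a substitute chosen for brevity and is not the Gibbs sampler used in this paper. The gamma(a, b) priors are assumed to be parameterized by shape a and rate b; all function names, tuning constants, hyperparameters and data in the usage note are hypothetical.

```python
import math
import random


def dagum_cdf(t, theta, beta, lam):
    """Dagum CDF, with F(0) = 0 by convention."""
    if t <= 0:
        return 0.0
    return (1.0 + lam * t ** (-theta)) ** (-beta)


def log_likelihood(params, times, failures, removals):
    """Interval censored log-likelihood, up to an additive constant."""
    theta, beta, lam = params
    total, prev_cdf = 0.0, 0.0
    for t_j, d_j, r_j in zip(times, failures, removals):
        cdf = dagum_cdf(t_j, theta, beta, lam)
        for count, prob in ((d_j, cdf - prev_cdf), (r_j, 1.0 - cdf)):
            if count > 0:
                if prob <= 0.0:
                    return -math.inf
                total += count * math.log(prob)
        prev_cdf = cdf
    return total


def log_prior(params, hyper):
    """Independent gamma(a, b) priors; b is ASSUMED to be a rate parameter."""
    return sum((a - 1.0) * math.log(x) - b * x for x, (a, b) in zip(params, hyper))


def sample_posterior(times, failures, removals, hyper,
                     n_iter=5000, burn_in=1000, step=0.2, seed=1):
    """Random-walk Metropolis on the log scale (not Gibbs sampling)."""
    rng = random.Random(seed)

    def log_post(p):
        try:
            return log_likelihood(p, times, failures, removals) + log_prior(p, hyper)
        except (OverflowError, ValueError):
            return -math.inf  # treat numerically invalid proposals as impossible

    current = [1.0, 1.0, 1.0]  # arbitrary positive starting point
    current_lp = log_post(current)
    draws = []
    for i in range(n_iter):
        # Multiplicative proposal keeps all three parameters positive.
        proposal = [x * math.exp(step * rng.gauss(0.0, 1.0)) for x in current]
        proposal_lp = log_post(proposal)
        # The sum of log ratios is the Jacobian correction for the log-scale walk.
        log_ratio = (proposal_lp - current_lp
                     + sum(math.log(p / c) for p, c in zip(proposal, current)))
        if rng.random() < math.exp(min(0.0, log_ratio)):
            current, current_lp = proposal, proposal_lp
        if i >= burn_in:
            draws.append(tuple(current))
    return draws


def posterior_means(draws):
    """Monte Carlo estimate of the posterior means, i.e. the SEL estimates."""
    n = len(draws)
    return tuple(sum(d[k] for d in draws) / n for k in range(3))
```

A call such as `posterior_means(sample_posterior([1, 2, 3, 4, 5], [8, 6, 4, 3, 2], [0, 2, 3, 4, 18], [(2.0, 2.0)] * 3))` returns approximate SEL estimates of (θ, β, λ) for that hypothetical data set. In practice the step size, chain length and convergence diagnostics would need tuning.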

Bayes estimators based on balanced squared error loss function
Instead of using the well-known symmetric SEL function, one can use the balanced squared error loss (BSEL) function, which was first proposed by [32]. An extended class of balanced loss functions was subsequently introduced by [33] and can be expressed as
$$L_{\omega,\delta_0}(\phi,\delta) = \omega\, L(\delta_0,\delta) + (1-\omega)\, L(\phi,\delta), \tag{26}$$
where $L(\cdot,\cdot)$ is an arbitrary loss function, $\delta_0$ is a chosen target estimator of ϕ, and the weight $\omega \in [0, 1]$. This class is general: it covers balanced versions of the absolute error, entropy and LINEX loss functions, and it generalizes the SEL function. [34] suggested the use of the BSEL function. If $L(\phi,\delta) = (\delta-\phi)^2$ is substituted in (26), the BSEL function is obtained in the form
$$L_{\omega,\delta_0}(\phi,\delta) = \omega(\delta-\delta_0)^2 + (1-\omega)(\delta-\phi)^2,$$
and the corresponding Bayes estimator $\hat\phi_{BSEL}$ of ϕ under BSEL is given by
$$\hat\phi_{BSEL} = \omega\,\hat\phi_{ML} + (1-\omega)\,\hat\phi_{SEL},$$
where $\hat\phi_{ML}$ is the MLE of ϕ and $\hat\phi_{SEL}$ is the Bayes estimator of ϕ under the SEL function, with $\phi = (\theta, \beta, \lambda)$.
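The form of the BSEL estimator follows from minimizing the posterior expected loss. A short derivation is sketched below; the symbols $\delta$ (a candidate estimate) and $\delta_0$ (the target estimator, here taken to be the MLE) are notation assumed for this sketch, and the numbers in the worked example are hypothetical.

```latex
\begin{aligned}
\rho(\delta) &= E\left[\omega(\delta-\delta_0)^2 + (1-\omega)(\delta-\phi)^2 \,\middle|\, t\right]
  = \omega(\delta-\delta_0)^2 + (1-\omega)\,E\left[(\delta-\phi)^2 \mid t\right], \\
\frac{d\rho}{d\delta} &= 2\omega(\delta-\delta_0) + 2(1-\omega)\bigl(\delta - E[\phi \mid t]\bigr) = 0
  \;\Longrightarrow\;
  \delta = \omega\,\delta_0 + (1-\omega)\,E[\phi \mid t], \\
\frac{d^2\rho}{d\delta^2} &= 2 > 0 \quad \text{(so the stationary point is a minimum).} \\[4pt]
\text{Example: } & \omega = 0.3,\ \delta_0 = \hat\phi_{ML} = 2.0,\ E[\phi \mid t] = \hat\phi_{SEL} = 1.5
  \;\Longrightarrow\; \hat\phi_{BSEL} = 0.3(2.0) + 0.7(1.5) = 1.65 .
\end{aligned}
```

Setting $\omega = 0$ recovers the SEL estimator and $\omega = 1$ recovers the MLE, so ω controls how far the Bayes estimate is pulled toward the classical one.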

Credible intervals
Bayesian interval estimates can be derived more directly than frequentist confidence interval estimates. The Bayesian analog of the confidence interval is called a credible interval. After obtaining the marginal posterior distribution of ϕ, a symmetric 100(1 − τ)% two-sided Bayes probability interval estimate of ϕ, denoted by $[L_\phi, U_\phi]$, can be obtained by solving the following equations for the limits $L_\phi$ and $U_\phi$:
$$\int_0^{L_\phi} \pi(\phi \mid t)\, d\phi = \frac{\tau}{2}, \qquad \int_{U_\phi}^{\infty} \pi(\phi \mid t)\, d\phi = \frac{\tau}{2}.$$
A suitable numerical method is needed to compute these intervals.
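When posterior draws of a parameter are available (for instance from an MCMC run), the two tail equations can be solved approximately by taking empirical quantiles of the draws. The following sketch illustrates this; the function name and the linear-interpolation rule for quantiles are choices made for the example, not a description of the computations in this paper.

```python
import math


def equal_tail_interval(draws, tau=0.05):
    """Approximate symmetric 100(1 - tau)% credible interval from posterior draws.

    The limits are the empirical tau/2 and 1 - tau/2 quantiles,
    computed with linear interpolation between order statistics.
    """
    xs = sorted(draws)
    n = len(xs)
    if n == 0:
        raise ValueError("at least one posterior draw is required")

    def quantile(q):
        pos = q * (n - 1)
        lower = math.floor(pos)
        upper = min(lower + 1, n - 1)
        frac = pos - lower
        return xs[lower] + frac * (xs[upper] - xs[lower])

    return quantile(tau / 2.0), quantile(1.0 - tau / 2.0)
```

For example, applied to the 101 equally spaced values 0, 1, ..., 100 with τ = 0.1, the function returns limits of approximately 5 and 95.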

Prediction of future values
In the following two subsections, we investigate one- and two-sample prediction intervals in the Bayesian framework. [35] obtained 100(1 − τ)% predictive intervals based on the two-sample scheme. Let $t_1 < t_2 < \ldots < t_r$ be the ordered informative sample from a distribution whose CDF is $F(x)$.

Point estimation.
The future sample consists of the remaining order statistics $t_{r+1} < t_{r+2} < \ldots < t_n$.

Two samples prediction
where $\pi(\Theta \mid t)$ is defined in (22). Since $f^{*}_{r}(y_r \mid t)$ cannot be expressed in closed form, it cannot be evaluated analytically. A symmetric 100γ% predictive interval for $Y_{(r)}$ with lower bound $L$ and upper bound $U$ can be obtained by solving the following non-linear Eqs (37) and (38):
$$P(Y_{(r)} > L \mid t) = \frac{1+\gamma}{2}, \tag{37}$$
$$P(Y_{(r)} > U \mid t) = \frac{1-\gamma}{2}. \tag{38}$$
A suitable numerical method is needed to compute these intervals.

Simulation study
In this section, we obtain the maximum likelihood estimates (MLEs) and the Bayes estimates of the unknown parameters of the Dagum distribution under the SEL and BSEL functions through a simulation study carried out in the R language. We also obtain one- and two-sample Bayes predictive estimates based on observed samples. The MLEs of the model parameters ϕ and the approximate 95% confidence intervals are obtained using progressively type-I interval censored data. We consider two sample sizes, n = 50 and n = 100. In Table 2, we report the mean square error (MSE) values for the maximum likelihood estimators of the Dagum model parameters ϕ. We also consider the Bayesian approach to estimating ϕ, assuming independent gamma priors $\theta \sim \text{gamma}(a_1, b_1)$, $\beta \sim \text{gamma}(a_2, b_2)$ and $\lambda \sim \text{gamma}(a_3, b_3)$ with known, non-negative hyperparameters $a_1, a_2, a_3, b_1, b_2$ and $b_3$, using the SEL and BSEL functions with both informative and non-informative priors.
We generate the progressive type-I interval censored data, $H_j = \{D_j, R_j, t_j\}$, from the Dagum distribution by first generating $n$ random variables $U_1, U_2, \ldots, U_n$, $n \ge m$, from a U(0, 1) distribution. The data $t_1, t_2, \ldots, t_n$ are then calculated by inverting the CDF (1),
$$t_i = \left[\frac{U_i^{-1/\beta}-1}{\lambda}\right]^{-1/\theta}, \qquad i = 1, 2, \ldots, n,$$
where the number $D_i$ of failures within $(t_{i-1}, t_i]$ is generated and $R_i$ surviving items are randomly removed from the test, based on the pre-specified inspection times $t_1 \le \ldots \le t_m$ and the pre-specified percentages $p = (p_1, p_2, \ldots, p_{m-1}, 1)$, respectively. Each sample of size $n$ is divided into $m = 5$ intervals. The progressive type-I interval censored data are then generated by setting $D_0 = 0$ and $R_0 = 0$ and, for $i = 1, 2, \ldots, m$,
$$D_i \mid D_{i-1}, \ldots, D_0, R_{i-1}, \ldots, R_0 \sim \text{rbinom}\!\left(n-\sum_{j=0}^{i-1}(D_j+R_j),\ \frac{F(t_i)-F(t_{i-1})}{1-F(t_{i-1})}\right),$$
$$R_i = \text{floor}\!\left(p_i \times \left(n-\sum_{j=0}^{i-1}(D_j+R_j)-D_i\right)\right),$$
where rbinom(n, p) generates a random variable from the binomial distribution with parameters n and p, floor() denotes the largest integer not greater than its argument, $0 \le p_i \le 1$ for $i = 1, 2, \ldots, m-1$, $p_m = 1$, and $F$ is given by (1). The following two schemes are used in progressive type-I interval censoring:
• scheme 1: $p^{(1)}$ = (0, 0.2, 0.50, 0.75, 1),
• scheme 2: $p^{(2)}$ = (0, 0.4, 0.7, 0.9, 1).
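The binomial recursion described above for generating the failure counts and removals can be sketched as follows. This is a plain-Python illustration rather than the R code used in the study: the binomial draw is built from Bernoulli trials in place of R's rbinom, and the function names, inspection times and parameter values in the usage note are assumptions.

```python
import math
import random


def dagum_cdf(t, theta, beta, lam):
    """Dagum CDF (1), with F(0) = 0 by convention."""
    if t <= 0:
        return 0.0
    return (1.0 + lam * t ** (-theta)) ** (-beta)


def simulate_censored_data(n, times, percents, theta, beta, lam, seed=None):
    """Generate (D_j, R_j) for progressive type-I interval censoring.

    times    -- inspection times t_1 < ... < t_m
    percents -- removal proportions p_1, ..., p_m (with p_m = 1)
    """
    rng = random.Random(seed)
    failures, removals = [], []
    at_risk = n      # units still on test at the start of the interval
    prev_cdf = 0.0   # F(t_0) with t_0 = 0
    for t_i, p_i in zip(times, percents):
        cdf = dagum_cdf(t_i, theta, beta, lam)
        if at_risk == 0 or prev_cdf >= 1.0:
            prob = 0.0
        else:
            # Conditional probability of failing in (t_{i-1}, t_i] given survival to t_{i-1}.
            prob = (cdf - prev_cdf) / (1.0 - prev_cdf)
        # Binomial(at_risk, prob) draw as a sum of Bernoulli trials.
        d_i = sum(1 for _ in range(at_risk) if rng.random() < prob)
        r_i = math.floor(p_i * (at_risk - d_i))
        failures.append(d_i)
        removals.append(r_i)
        at_risk -= d_i + r_i
        prev_cdf = cdf
    return failures, removals
```

For instance, `simulate_censored_data(50, [0.5, 1.0, 1.5, 2.0, 2.5], [0, 0.2, 0.5, 0.75, 1], 1.0, 1.0, 1.0, seed=1)` returns two lists of length five. Because the last removal proportion is 1, all units are accounted for, so the failures and removals sum to n.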
In the simulation study, both informative and non-informative priors are used. For the informative case, we choose the priors θ ~ gamma(10, 10), β ~ gamma(0.5, 10) and λ ~ gamma(0.1, 10), while for the non-informative case we choose the uniform distribution ϕ ~ U(0, 5). We run 1000 iterations for each of the two schemes. In Tables 2 and 3, we report results for the two schemes: scheme 1, with p (1) = (0, 0.2, 0.50, 0.75, 1), and scheme 2, with p (2) = (0, 0.4, 0.7, 0.9, 1), respectively. For each censoring scheme, we compute the estimated mean (Mean), the bias, and the lower and upper confidence (credible) limits for both the classical and the Bayesian estimates under the SEL and BSEL functions, using both informative and non-informative priors. In Table 4, we also present the posterior predictive estimates of the censored observations and the one- and two-sample predictive intervals for both schemes, using both informative and non-informative priors.

Optimal censoring scheme
Among all censoring schemes, the optimal censoring scheme provides the maximum information about the unknown parameters. The literature on optimal progressive type-I interval censoring schemes is rather limited. [13] provided optimal progressive type-I interval censoring schemes using A- and D-optimal design criteria. [12] studied optimum reliability sampling plans under the progressive type-I interval censoring scheme. [36] presented a method for choosing inspection times and optimal censoring for the Burr XII distribution under the progressive type-I interval censoring scheme. Recently, [21] presented a method for optimal censoring for the inverse Weibull distribution under the progressive type-I interval censoring scheme; see also the references cited therein. In this paper, we consider the optimal censoring scheme with respect to the minimum trace criterion based on [37,38]; see also [39]. For illustrative purposes, we provide a small table indicating the optimal censoring scheme with respect to the minimum trace criterion. Consider the p-th quantile of the Dagum distribution:
$$t_p = \left[\frac{p^{-1/\beta}-1}{\lambda}\right]^{-1/\theta}. \tag{39}$$
Using the idea of [40], the following information measure for a given censoring scheme $\{t_1, \ldots, t_m\}$ has been used:
$$\int_0^1 V(t_1,\ldots,t_m)_p\, w(p)\, dp,$$
where $V(t_1, \ldots, t_m)_p$ denotes the asymptotic variance of $\hat t_p$, the MLE of $t_p$, based on the censoring scheme $(t_1, \ldots, t_m)$. Moreover, $w(\cdot)$ is a non-negative weight function such that
$$\int_0^1 w(p)\, dp = 1,$$
and, by the delta method,
$$V(t_1,\ldots,t_m)_p = \left(\nabla_\phi t_p\right)^{T} I^{-1}(\phi)\, \left(\nabla_\phi t_p\right),$$
where $t_p$ is as given by (39) and $I^{-1}(\phi)$, the asymptotic variance-covariance matrix of the MLEs of (θ, β, λ), is given by (17) (see Appendix).
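The quantile of the Dagum distribution and a delta-method approximation to the asymptotic variance of its MLE can be computed as sketched below. This is an illustration under assumptions: the gradient is approximated by central finite differences instead of analytic derivatives, the covariance matrix is supplied by the user (for example, an estimate of the inverse Fisher information), and all names and numerical values are hypothetical.

```python
import math


def dagum_quantile(p, theta, beta, lam):
    """p-th quantile: the solution t_p of F(t_p) = p for the Dagum CDF (1)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return ((p ** (-1.0 / beta) - 1.0) / lam) ** (-1.0 / theta)


def quantile_gradient(p, theta, beta, lam, h=1e-6):
    """Central finite-difference gradient of t_p with respect to (theta, beta, lam)."""
    base = (theta, beta, lam)
    grad = []
    for k in range(3):
        up, down = list(base), list(base)
        up[k] += h
        down[k] -= h
        grad.append((dagum_quantile(p, *up) - dagum_quantile(p, *down)) / (2.0 * h))
    return grad


def quantile_variance(p, params, cov):
    """Delta-method variance g' C g, where C is a 3x3 covariance matrix of the MLEs."""
    g = quantile_gradient(p, *params)
    return sum(g[i] * cov[i][j] * g[j] for i in range(3) for j in range(3))
```

With θ = β = λ = 1 the quantile is p/(1 − p), so `dagum_quantile(0.5, 1.0, 1.0, 1.0)` is 1. Averaging `quantile_variance` over a grid of p values with weights that sum to one gives a numerical version of the weighted information measure for a candidate censoring scheme.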

Applications
This data set illustrates the potential of the Dagum model for UK quarterly gas consumption. The data correspond to the quarterly UK gas consumption from the first quarter of 1960 to the last quarter of 1986. The data in Table 1 are reported in [41].

9 Results and discussion
The parameters of the Dagum distribution are estimated using the MLE and Bayesian approaches, the latter under both the SEL and BSEL functions with informative and non-informative priors, for n = 50 and n = 100. Table 2 gives the parameter estimates when w = 0.3 and the proportions are p (1) = (0, 0.4, 0.7, 0.9, 1). Table 3 gives the parameter estimates when w = 0.3 and the proportions are p (2) = (0, 0.2, 0.5, 0.75, 1). Table 4 gives the predicted values Y (s) for the one- and two-sample schemes using the proportions p (1) = (0, 0.4, 0.7, 0.9, 1) and p (2) = (0, 0.2, 0.50, 0.75, 1), for sample sizes n = 50 and n = 100. From Tables 2 and 3 we observe that, for the MLEs, the MSEs under the two censoring percentages p (1) and p (2) are the same for the scale parameter λ but differ for the two shape parameters θ and β. It is also observed that the Bayes estimates based on SEL and BSEL perform better than the maximum likelihood estimates in terms of bias and MSE. In all cases, the MSEs decrease as the sample size increases, which is consistent with the consistency of the estimators. The results also show that the performance of SEL with an informative prior is more or less the same as with a non-informative prior, in terms of both bias and MSE. However, in terms of the MSEs of the parameters θ and λ, the censoring percentage p (1) performs better than p (2), while for the parameter β, p (2) performs better than p (1). Further, we observe that the performance of BSEL, for both informative and non-informative priors, is no better than that of SEL. It should be noted that under the non-informative prior with the BSEL function (BSEL-non), the MSEs under p (2) are smaller than under p (1) for all the parameters θ, β and λ.
The performance of the Bayes estimates under SEL improves in almost all cases as the sample size n increases. This behaviour holds under both censoring schemes and for the different choices of n. Further, we observe that as the sample size gets larger, the average lengths of the asymptotic confidence intervals and of the credible intervals decrease for the parameters β and λ. Finally, the credible intervals have smaller interval lengths than the confidence intervals. From Table 4, we observe that predictive estimates based on SEL with an informative prior perform better than those with a non-informative prior for both one- and two-sample schemes. We note that the censoring percentage in p (1) is higher than in p (2), so the experiment finishes more rapidly under p (1) than under p (2); thus the censoring percentage p (1) saves more time and money than p (2). In Table 5, we report the decision variables D 0 and p 0 for different values of m for the optimal censoring scheme. From Table 5, we observe that the optimal value of D 0 decreases as the number of inspection times m increases, which implies that as the number of inspection times increases, the time between two successive inspections gets shorter. Again, with the increase in the number of inspections, the optimal values increase. For the real data analysis, the results are presented in Table 6.
In the real data analysis, we did not observe any difference in the results between the two censoring percentages (i.e., p (1) and p (2)), so we report only one of them (i.e., p (2)). In addition, there is no difference between the SEL and BSEL results, as presented in Table 7. In the following tables, LCL and UCL denote the lower and upper confidence limits, respectively.

Conclusions
In this paper, we discussed parameter estimation for the Dagum distribution based on progressive type-I interval censored data. In the classical setup, we obtained maximum likelihood estimates and confidence intervals. In the Bayesian framework, we obtained Bayes estimates under the SEL and BSEL functions using independent informative gamma and non-informative uniform priors for the scale parameter and the two shape parameters. Moreover, Bayes predictive estimates and intervals were obtained using one- and two-sample schemes. Further, the optimal censoring scheme based on the minimum trace criterion was discussed. Finding an optimum censoring scheme remains an open problem from the computational point of view, and more work is needed in that direction. The present work may help practitioners in industry to analyze progressive type-I interval censored lifetime data with optimal censoring. One direction for future work is to develop estimation procedures for the stress-strength reliability of the Dagum distribution. Another is to study and compare Bayesian estimation based on maximum likelihood with Bayesian estimation based on the maximum product of spacings for estimating the stress-strength reliability of the Dagum distribution.