A 3-Component Mixture of Rayleigh Distributions: Properties and Estimation in Bayesian Framework

To study lifetimes of certain engineering processes, a lifetime model which can accommodate the nature of such processes is desired. Mixture models of the underlying lifetime distributions are intuitively more appropriate and appealing than simple models for capturing the heterogeneous nature of a process. This paper studies a 3-component mixture of Rayleigh distributions from a Bayesian perspective. The censored sampling environment is considered due to its popularity in reliability theory and survival analysis. The expressions for the Bayes estimators and their posterior risks are derived under different scenarios. For the case in which little or no prior information is available, elicitation of hyperparameters is given. To examine numerically the performance of the Bayes estimators using non-informative and informative priors under different loss functions, we have simulated their statistical properties for different sample sizes and test termination times. In addition, to highlight the practical significance, an illustrative example based on real-life engineering data is also given.


Introduction
The Rayleigh distribution has many real-life applications in testing the lifetime of an object whose lifetime depends upon its age. It is often used in different fields of physics to model processes such as wave heights (Rattanapitikon [1] and Van Vledder et al. [2]), sound and light radiation (Siddiqui [3]), radio signals and wind power (Ahmed and Mahammed [4]), and ultrasound image modeling (Chivers [5] and Burekhardt [6]). It is also used to model the lifetime in hours of tubes, resistors, networks, crystals, knobs, transformers, relays and capacitors in aircraft radar sets, as well as the wind speeds over a year at wind turbine sites and the daily average wind speed. In all of the above-mentioned applications, it is not uncommon to assume that the life of a particular piece of equipment depends upon its age. This distribution has also received considerable attention in reliability theory and survival analysis, probability theory and operations research. Thus, the Rayleigh distribution may be a suitable candidate for modeling the age-dependent lifetimes of devices and equipment.
When the data are available only from the overall mixture distribution, modeling these data as a mixture of some component distributions is known as direct application of the mixture models. Li [7] and Li and Sedransk [8,9] discussed different features of two types of mixture models. If the component distributions of a mixture belong to the same family, their mixture is known as a type-I mixture model; otherwise, it is called a type-II mixture model. Mixture models have been successfully applied in many areas such as engineering, physical sciences, chemical sciences and biological sciences. To understand the need for mixture models, imagine a practical situation of modeling lifetimes of certain electrical elements where the population of lifetimes may be divided into a number of components depending upon the possible reasons of failure. Several authors have used mixture modeling in different practical problems. For example, Harris [10] fitted mixture distributions to model crime and justice data, Kanji [11] described wind shear data using mixture distributions, and Jones and McLachlan [12] applied the mixture of normal and Laplace distributions to wind shear data.
Most researchers have worked on the classical and the Bayesian analysis of 2-component mixture models. McCullagh [13] derived some conditions under which quadratic and polynomial Exponential models can be generated as mixtures of Exponential models. Sinha [14] used the Bayesian counterpart of the maximum likelihood estimates of the 2-component mixture model considered by Mendenhall and Hader [15]. Hebert and Scariano [16] compared the location estimators for Exponential mixtures under Pitman's measure of closeness. Sultan et al. [17] investigated the properties of the 2-component mixture of inverse Weibull distributions. Saleem and Aslam [18] discussed the use of the informative and the non-informative priors for Bayesian analysis of the 2-component mixture of Rayleigh distributions. Also, Saleem et al. [19] presented the Bayesian analysis of the 2-component mixture of the Power distributions using complete and censored samples. Kazmi et al. [20] described the Bayesian analysis for the 2-component mixture of Maxwell distributions.
In daily life, many types of data are encountered, including simple data, grouped data, truncated data, censored data and progressively censored data. Censoring is an important and valuable aspect of lifetime data. It is a form of missing-data problem that arises routinely in lifetime studies. A valuable account of censoring is given in Romeu [21], Gijbels [22] and Kalbfleisch and Prentice [23].
Motivated by the above-mentioned applications of mixtures of Rayleigh distributions, we develop a Bayesian analysis of a 3-component mixture of Rayleigh distributions with unknown mixing proportions. The parameters of the component distributions are also assumed to be unknown. Four different priors and three different loss functions are used for the Bayesian analysis. In addition, we assume an ordinary type-I right censored sampling scheme.
The rest of the paper is organized as follows. The 3-component mixture of Rayleigh distributions is defined in Section 2. The expressions for the posterior distributions using the non-informative and the informative priors are derived in Section 3. The elicitation of hyperparameters, if unknown, is given in Section 4. In Section 5, the Bayes estimators and posterior risks using the uniform, the Jeffreys', the inverted chi-square and the square root inverted gamma priors under the squared error loss function (SELF), the precautionary loss function (PLF) and the DeGroot loss function (DLF) are presented. The limiting expressions of the Bayes estimators and their posterior risks are derived in Section 6. The simulation study and the real data application are presented in Sections 7 and 8, respectively. Finally, the conclusion of this study is given in Section 9.

3-Component mixture of the Rayleigh distributions
The probability density function (p.d.f.) and the cumulative distribution function (c.d.f.) of the Rayleigh distribution for a random variable Y are given by:

where λ_m is the parameter of the Rayleigh distribution. A finite 3-component mixture model with the unknown mixing proportions p_1 and p_2 is defined as:

For different values of the component and mixing proportion parameters, the behavior of a 3-component mixture of the Rayleigh distributions is depicted in Fig 1. The cumulative distribution function of the 3-component mixture of the Rayleigh distributions is given by:

The posterior distribution using the non-informative and the informative priors
In this section, the likelihood and the posterior distributions of the parameters given data, say y, are derived using the non-informative (uniform and Jeffreys') and the informative (inverted chi-square and square root inverted gamma) priors.
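The component and mixture distributions of Section 2 can be mirrored numerically. The following is a minimal Python sketch, not the authors' code. It assumes the scale parameterization f(y; λ) = (y/λ²) exp(−y²/(2λ²)) for each Rayleigh component, which is an assumption because the displayed formulas are not reproduced in this text; the function names are hypothetical.

```python
import numpy as np

def rayleigh_pdf(y, lam):
    """Rayleigh density under the ASSUMED form (y / lam^2) * exp(-y^2 / (2 lam^2)), y >= 0."""
    y = np.asarray(y, dtype=float)
    return (y / lam**2) * np.exp(-y**2 / (2.0 * lam**2))

def rayleigh_cdf(y, lam):
    """Rayleigh c.d.f. matching rayleigh_pdf: 1 - exp(-y^2 / (2 lam^2)), y >= 0."""
    y = np.asarray(y, dtype=float)
    return 1.0 - np.exp(-y**2 / (2.0 * lam**2))

def mixture_pdf(y, lams, p1, p2):
    """Density of the 3-component mixture with weights p1, p2 and 1 - p1 - p2."""
    weights = (p1, p2, 1.0 - p1 - p2)
    return sum(w * rayleigh_pdf(y, lam) for w, lam in zip(weights, lams))

def mixture_cdf(y, lams, p1, p2):
    """C.d.f. of the 3-component mixture with the same weights."""
    weights = (p1, p2, 1.0 - p1 - p2)
    return sum(w * rayleigh_cdf(y, lam) for w, lam in zip(weights, lams))
```

For example, with λ = (11, 13, 15), p_1 = 0.3 and p_2 = 0.5, `1 - mixture_cdf(25.0, (11.0, 13.0, 15.0), 0.3, 0.5)` gives the probability, under this assumed form, that a unit is still working at a test termination time of 25.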

The likelihood function
Suppose n units from the 3-component mixture of Rayleigh distributions are used in a life testing experiment with fixed test termination time t. Suppose further that, when the experiment is performed, r out of the n units fail by time t and the remaining n − r units are still working. Out of the r failures, r_1, r_2 and r_3 failures can be categorized as belonging to subpopulation-I, subpopulation-II and subpopulation-III, respectively, depending upon the reason of failure. So, the number of uncensored observations is r = r_1 + r_2 + r_3, and the remaining n − r observations are censored. Now let y_{lk}, 0 < y_{lk} ≤ t, be the failure time of the k-th unit belonging to the l-th subpopulation, where l = 1, 2, 3 and k = 1, 2, ..., r_l. For a 3-component mixture model, the likelihood function can be written as:

After simplification (see S1 File), the likelihood function of the 3-component mixture of Rayleigh distributions is given by:

where y = (y_{11}, y_{12}, ..., y_{1r_1}, y_{21}, y_{22}, ..., y_{2r_2}, y_{31}, y_{32}, ..., y_{3r_3}) are the observed failure times for the uncensored observations and ϕ = (λ_1, λ_2, λ_3, p_1, p_2).
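The sampling scheme just described can be turned into a log-likelihood in the usual way for type-I right censored mixtures with labelled failures: each observed failure in subpopulation l contributes p_l f_l(y_{lk}), and each of the n − r censored units contributes the mixture survival probability 1 − F(t). The sketch below is a hedged illustration of that construction, not the authors' simplified expression from S1 File. It assumes the Rayleigh form f(y; λ) = (y/λ²) exp(−y²/(2λ²)), and the function name is hypothetical.

```python
import numpy as np

def censored_mixture_loglik(y1, y2, y3, n, t, lams, p1, p2):
    """Log-likelihood, up to an additive constant, of a type-I right censored
    sample from a 3-component Rayleigh mixture (ASSUMED scale parameterization).

    y1, y2, y3 : failure times (<= t) attributed to subpopulations I, II, III
    n          : total number of units on test
    t          : fixed test termination time
    """
    p3 = 1.0 - p1 - p2
    if min(p1, p2, p3) <= 0.0 or min(lams) <= 0.0:
        return -np.inf  # outside the parameter space
    weights = (p1, p2, p3)
    groups = [np.asarray(g, dtype=float) for g in (y1, y2, y3)]
    r = sum(g.size for g in groups)  # number of uncensored observations

    loglik = 0.0
    for w, lam, y in zip(weights, lams, groups):
        # log{ p_l * f_l(y_lk) } summed over the failures of subpopulation l
        loglik += np.sum(np.log(w) + np.log(y) - 2.0 * np.log(lam)
                         - y**2 / (2.0 * lam**2))

    # log{ 1 - F(t) } for each of the n - r units still working at time t
    survival_t = sum(w * np.exp(-t**2 / (2.0 * lam**2))
                     for w, lam in zip(weights, lams))
    loglik += (n - r) * np.log(survival_t)
    return float(loglik)
```

Multiplying the exponential of this quantity by a joint prior density gives an unnormalized posterior, which is one way to check closed-form posterior expressions numerically.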

The posterior distribution using the Jeffreys' prior
According to Jeffreys [27,28], Bernardo [29] and Berger [30], the Jeffreys' prior (JP) for λ_m is:

It is interesting to note that the JP for the proportion parameters p_1 and p_2 cannot be assumed under the current settings. Therefore, again, the uniform distribution over the interval (0,1) is assumed for both p_1 and p_2, i.e., p_1 ~ U(0,1) and p_2 ~ U(0,1). Under the assumption of independence of all the parameters, the joint prior distribution of the parameters λ_1, λ_2, λ_3, p_1 and p_2 is given by:

Now, the joint posterior distribution of the parameters λ_1, λ_2, λ_3, p_1 and p_2 given data y is given by (see S1 File):

where A_12 = r_1,

The posterior distribution using the inverted chi-square prior
As an informative prior, we take the inverted chi-square prior (ICP) for the component parameters λ_1, λ_2, λ_3 and the bivariate beta prior for the proportion parameters p_1, p_2. Symbolically, it can be written as: λ_m ~ Inverted Chi-square(a_m, b_m), m = 1, 2, 3, and p_1, p_2 ~ Bivariate Beta(a, b, c). Again, assuming the independence of the parameters, the joint prior distribution of the parameters λ_1, λ_2, λ_3, p_1 and p_2 is given by:

The joint posterior distribution of the parameters λ_1, λ_2, λ_3, p_1 and p_2 given data y is given by (see S1 File):

where

Elicitation of hyperparameters
Elicitation is a tool used to quantify a person's prior belief and knowledge. From a Bayesian perspective, elicitation most often arises as a method of specifying the prior distribution of the random parameter(s). Elicitation is simply the quantification of prior knowledge about the random parameter(s) so that it can be combined with the likelihood to obtain the posterior distribution for further statistical analysis. Elicitation has remained a challenging problem for the statistician. Authors who have discussed this problem include Kadane et al. [31], Gavasakar [32], Al-Awadhi and Gartwaite [33], Aslam [34], Hahn [35] and Saleem and Aslam [18]. In this study, we adopt the prior predictive method based on predictive probabilities given by Aslam [34].

Elicitation of hyperparameters using the ICP
For eliciting the hyperparameters, the prior predictive distribution (PPD) is used. The PPD using the ICP for a random variable Y is defined as:

On substituting (4) and (15) in (21) and then simplifying, we get:

Using the prior predictive distribution given in (22), the following nine equations in (23) are solved simultaneously in the Mathematica package to elicit the hyperparameters a_1, b_1, a_2, b_2, a_3, b_3, a, b and c.

Elicitation of hyperparameters using the SRIGP
The PPD using the SRIGP for a random variable Y is given by:

Using (4), (18) and (24), we get:

Using the criteria defined in Subsection 4.1, the values of the hyperparameters a_1, b_1, a_2, b_2, a_3, b_3, a, b and c are obtained as 5.74419, 4.97886, 5.65643, 5.43122, 4.93333, 4.93038, 11.8838, 6.41829 and 7.0491, respectively.

Bayes estimators and posterior risks using the UP, the JP, the ICP and the SRIGP under SELF, PLF and DLF
If d is a Bayes estimator, then ρ(d) is called the posterior risk and is defined as ρ(d) = E_{λ|y}{L(λ, d)}. Our purpose in this study is to look for efficient Bayes estimators of the different parameters. For this purpose, three different loss functions, namely SELF, PLF and DLF, are used to obtain the Bayes estimators and their posterior risks. The SELF, defined as L(λ, d) = (λ − d)^2, was introduced by Legendre [36] to develop the least squares theory. Norstrom [37] discussed an asymmetric PLF and also introduced a special case of a general class of PLFs, which is defined as L(λ, d) = (λ − d)^2 / d. The PLF approaches infinity near the origin to prevent underestimation, thus yielding conservative estimators when underestimation may lead to serious consequences. The DLF was presented by DeGroot [38] and is defined as L(λ, d) = {(λ − d)/d}^2. For a given prior, the Bayes estimators under SELF, PLF and DLF are d = E_{λ|y}(λ), d = {E_{λ|y}(λ^2)}^{1/2} and d = E_{λ|y}(λ^2) / E_{λ|y}(λ), with posterior risks ρ(d) = E_{λ|y}(λ^2) − {E_{λ|y}(λ)}^2, ρ(d) = 2[{E_{λ|y}(λ^2)}^{1/2} − E_{λ|y}(λ)] and ρ(d) = 1 − {E_{λ|y}(λ)}^2 / E_{λ|y}(λ^2), respectively. The Bayes estimators and posterior risks using the UP, the JP, the ICP and the SRIGP for the parameters λ_1, λ_2, λ_3, p_1 and p_2 under SELF, PLF and DLF are obtained as:

where v = 1 for the UP, v = 2 for the JP, v = 3 for the ICP and v = 4 for the SRIGP. The Bayes estimators and posterior risks using the UP, the JP, the ICP and the SRIGP under PLF and DLF can also be derived in a similar way and are presented as supporting information in S1 File.
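The three loss functions lead to estimators that depend on the posterior only through its first two moments, so they are easy to approximate from posterior draws. The sketch below is an illustration under stated assumptions, not the closed-form expressions of the paper: it uses the standard results that the Bayes estimator is E(λ|y) under SELF, {E(λ²|y)}^{1/2} under PLF and E(λ²|y)/E(λ|y) under DLF, with posterior risks Var(λ|y), 2[{E(λ²|y)}^{1/2} − E(λ|y)] and 1 − {E(λ|y)}²/E(λ²|y). The function name and the idea of working from simulated draws are hypothetical additions.

```python
import numpy as np

def bayes_summaries(draws):
    """Approximate Bayes estimators and posterior risks from posterior draws.

    Returns a dict mapping each loss function to (estimator, posterior risk).
    Moments are simple Monte Carlo averages over the supplied draws.
    """
    draws = np.asarray(draws, dtype=float)
    m1 = np.mean(draws)        # E(lambda | y)
    m2 = np.mean(draws**2)     # E(lambda^2 | y)
    root_m2 = np.sqrt(m2)
    return {
        "SELF": (m1, m2 - m1**2),                  # posterior mean, posterior variance
        "PLF": (root_m2, 2.0 * (root_m2 - m1)),    # precautionary loss
        "DLF": (m2 / m1, 1.0 - m1**2 / m2),        # DeGroot loss
    }
```

Because E(λ²|y) ≥ {E(λ|y)}², the PLF and DLF estimators are never smaller than the SELF estimator, which is consistent with their role in guarding against underestimation. Note also that the DLF risk is a relative quantity while the SELF risk is on the squared scale of λ, so risks under different losses are not measured in the same units.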

Limiting expressions
When the test termination time t → ∞, the number of uncensored observations r tends to the sample size n and r_l tends to n_l, l = 1, 2, 3. Consequently, all the observations which were censored become uncensored and the information contained in the sample is increased. As a result, the posterior risks of the Bayes estimators diminish and the efficiency of the Bayes estimators increases, because all the observations are incorporated in the sample. The limiting expressions for the Bayes estimators and posterior risks using the UP, the JP, the ICP and the SRIGP under SELF are given in Tables A-D.

Simulation study
For a fixed sample size, test termination time and set of parameters, p_1 n, p_2 n and (1 − p_1 − p_2) n observations are randomly taken from the first, second and third component densities, respectively. The observations which are greater than a fixed t are declared as censored observations. For each t, only failures are identified as members of subpopulation-I, subpopulation-II or subpopulation-III. Based on such a sample, the Bayes estimates (BEs) and posterior risks (PRs) are computed using the UP, the JP, the ICP and the SRIGP under SELF, PLF and DLF. In order to evaluate the impact of the test termination time on the Bayes estimators, the type-I right censoring scheme is used for fixed test termination times t = 25 and 30. The whole procedure is repeated 1000 times using Mathematica software. The results are then averaged over the 1000 samples and are arranged in S1-S12 Tables.
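The data-generation step of the simulation can be sketched as follows. This is a hedged Python illustration rather than the Mathematica code used in the study. It assumes the Rayleigh scale parameterization f(y; λ) = (y/λ²) exp(−y²/(2λ²)), which is the form sampled by NumPy's `Generator.rayleigh(scale=...)`, and it rounds p_1 n and p_2 n to integers; the function name and the rounding rule are assumptions.

```python
import numpy as np

def simulate_censored_mixture(n, t, lams, p1, p2, seed=None):
    """Draw one type-I right censored sample from a 3-component Rayleigh mixture.

    Component sizes are fixed at about p1*n, p2*n and the remainder, following
    the simulation description. Returns the sorted failure times (<= t) of each
    subpopulation and the number of censored units, whose labels are unknown.
    """
    rng = np.random.default_rng(seed)
    n1 = int(round(p1 * n))
    n2 = int(round(p2 * n))
    n3 = n - n1 - n2

    failures = []
    for size, lam in zip((n1, n2, n3), lams):
        y = rng.rayleigh(scale=lam, size=size)
        # lifetimes beyond t are censored; only failures keep their label
        failures.append(np.sort(y[y <= t]))

    r = sum(f.size for f in failures)
    return failures, n - r
```

A call such as `simulate_censored_mixture(100, 25.0, (11.0, 13.0, 15.0), 0.3, 0.5, seed=1)` produces one replicate; repeating the call with different seeds and averaging the resulting estimates mirrors the 1000-replicate design described above.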
From S1-S12 Tables, it can be seen that the extent of over-estimation (under-estimation) of the component and proportion parameters (through the Bayes estimators) using all considered priors and loss functions is greater for a small sample size (test termination time) than for a large sample size (test termination time) at different test termination times (sample sizes). Similarly, the extent of over-estimation (under-estimation) of the component and proportion parameters is smaller for smaller values of the component parameters than for larger values, at varying test termination times and sample sizes. It is observed that the differences between the BEs and the assumed parameter values shrink toward zero as the sample size increases, for different test termination times. The same observation can be made for a larger test termination time as compared to a smaller one, for varying sample sizes.
It is observed that the PRs of the Bayes estimators using the different priors and loss functions decrease with an increase in sample size at different test termination times. For a smaller test termination time, the PRs of the Bayes estimators are larger than the PRs for a larger test termination time, irrespective of the prior, loss function and sample size. Also, the PRs of the Bayes estimators are smaller (larger) for smaller (larger) component parametric values for each sample size and test termination time considered in the simulation study.
As far as the problem of selecting a suitable prior is concerned, it can be seen that the SRIGP emerges as the best prior amongst the different non-informative and informative priors considered in this study. On the other hand, the DLF is observed to perform better than PLF and SELF for estimating the component parameters, whereas for estimating the proportion parameters, SELF is observed to be superior to PLF and DLF. It is to be noted that the selection of the best prior (loss function) for a given loss function (prior) is made based on the PRs associated with it. Also, the selection of the best prior and loss function does not depend on sample size and test termination time.

Real data application
The real mixture data, z = (z_{11}, z_{12}, ..., z_{1r_1}, z_{21}, z_{22}, ..., z_{2r_2}, z_{31}, z_{32}, ..., z_{3r_3}), are taken from Davis [39]. These data represent hours to failure of a V805 Transmitter Tube, a Transmitter Tube and a V600 Indicator Tube used in aircraft radar sets. Davis [39] showed that the data z can be modeled by a mixture of exponential distributions. The transformation y = √(2z) of exponential random data (z) yields Rayleigh random data (y). This transformation allows us to use the Davis mixture data for applying the proposed Bayesian analysis. To obtain type-I right censored data, we fix t = 600 hours. The tests are conducted 1340 times. Thus, we have type-I right censored data at t = 600 hours on n = 1340 radar sets. The data summary required to evaluate the BEs and PRs is given by: Σ_k z_{3k} = 32500, n = 1340, r_1 = 866, r_2 = 337, r_3 = 83, r = r_1 + r_2 + r_3 = 1286, n − r = 54. The BEs and the PRs using the UP, the JP, the ICP and the SRIGP under SELF, PLF and DLF are presented in S13 Table. From S13 Table, it is observed that the results based on the real data are compatible with the simulation results. The results about the best prior and the best loss function are also the same as those discussed in Section 7.
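The transformation used for the real data is straightforward to apply in code. If Z is exponential with mean θ, then P(√(2Z) > y) = P(Z > y²/2) = exp(−y²/(2θ)), which is a Rayleigh survival function. The Python sketch below illustrates the mapping; it is not taken from the paper, and the function name is hypothetical.

```python
import numpy as np

def exponential_to_rayleigh(z):
    """Map exponential lifetimes z to Rayleigh lifetimes via y = sqrt(2 z)."""
    return np.sqrt(2.0 * np.asarray(z, dtype=float))

# Illustrative values only (NOT the Davis data): three lifetimes on the z scale.
z_example = np.array([2.0, 8.0, 50.0])
y_example = exponential_to_rayleigh(z_example)   # 2, 4 and 10 on the y scale

# Because y^2 = 2 z, sums of squares on the y scale are twice the sums on the z scale.
sum_of_squares_y = np.sum(y_example**2)          # equals 2 * z_example.sum()
```

Since the map is increasing, a censoring time on the z scale transforms in the same way, so 600 hours corresponds to √1200 ≈ 34.64 on the y scale, and a reported sum of z values converts to a sum of y² by doubling.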

Concluding remarks
In this study, we have considered the Bayesian analysis of a 3-component mixture of Rayleigh distributions using the non-informative (uniform and Jeffreys') and the informative (IC and SRIG) priors under SELF, PLF and DLF to model lifetimes of objects. We conducted a comprehensive simulation study and a real-life application to judge the relative performance of the Bayes estimators and also to deal with the problem of selecting the priors and loss functions at different sample sizes and test termination times. From the simulated results, we observed that an increase in sample size or test termination time provides improved Bayes estimators. The extent of over-estimation (under-estimation) of the Bayes estimators is larger (smaller) for relatively smaller (larger) sample sizes (test termination times) at different test termination times (sample sizes). Furthermore, as the sample size (test termination time) increases (decreases), the PRs of the Bayes estimators decrease (increase) for a fixed test termination time (sample size). However, the PRs of the Bayes estimators are large when the component parameters are relatively large, and vice versa. Also, the DLF (SELF) is observed to be a suitable choice for estimating the component (proportion) parameters. Finally, we conclude that the SRIGP is the more suitable prior under DLF for estimating the component parameters. When SELF is used, the SRIGP is the preferable prior for the proportion parameters. Moreover, the same pattern is observed for the JP when only the non-informative priors (UP and JP) are considered.

Table. The BEs and the PRs using the UP with λ_1 = 11, λ_2 = 13, λ_3 = 15, p_1 = 0.3, p_2 = 0.5 and t = 25, 30. (DOCX)
S10 Table. The BEs and the PRs using the JP with λ_1 = 11, λ_2 = 13, λ_3 = 15, p_1 = 0.3, p_2 = 0.5 and t = 25, 30. (DOCX)
S11 Table. The BEs and the PRs using the ICP with λ_1 = 11, λ_2 = 13, λ_3 = 15, p_1 = 0.3, p_2 = 0.5 and t = 25, 30. (DOCX)
S12 Table. The BEs and the PRs using the SRIGP with λ_1 = 11, λ_2 = 13, λ_3 = 15, p_1 = 0.3, p_2 = 0.5 and t = 25, 30. (DOCX)
S13 Table. The BEs and the PRs using the UP, the JP, the ICP and the SRIGP under SELF, PLF and DLF. (DOCX)