Abstract
The objective of this study is to construct a new distribution, the weighted Burr–Hatke distribution (WBHD). The PDF and CDF of the WBHD are derived in closed form, along with its moments, incomplete moments, and quantile function. Eleven estimation techniques for the distribution parameters are discussed, and numerical simulations are used to compare them through partial and overall rankings. According to the overall rank table, the maximum product of spacing estimator (MPSE) is the recommended estimator for the WBHD. Actuarial measures are derived for the proposed distribution. By comparing the WBHD with competing distributions on two real data sets of COVID-19 mortality rates, we show the importance and flexibility of the WBHD.
Citation: Aldallal R, Gemeay AM, Hussam E, Kilai M (2022) Statistical modeling for COVID 19 infected patient’s data in Kingdom of Saudi Arabia. PLoS ONE 17(10): e0276688. https://doi.org/10.1371/journal.pone.0276688
Editor: Anoop Kumar, Amity University - Lucknow Campus, INDIA
Received: September 7, 2022; Accepted: October 11, 2022; Published: October 28, 2022
Copyright: © 2022 Aldallal et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All data are available in the paper.
Funding: This project was supported by the Deanship of Scientific Research at Prince Sattam Bin Abdulaziz University under the research project (PSAU-2022/02/20114).
Competing interests: The authors have declared that no competing interests exist.
1 Introduction
The creation of effective statistical models for natural and real-life occurrences that may be represented by established probability distributions is one of the fundamental goals of statistics. Probability distributions are used to model the uncertain, and potentially dangerous, real-life events of interest to the researcher. Because standard distributions often struggle to model real-life events, a large number of other probability distributions have been devised.
Sometimes the known probability distributions still fail to describe the data for particular natural events adequately. Existing distributions are then extended and modified, which gives rise to generalised probability distributions. For further reading see [1].
Adding one or more parameters to well-known probability distributions improves their applicability to data on natural events and the accuracy with which they capture the tail shape. There are various methods for increasing the flexibility of traditional statistical distributions. One of them is to include an extra parameter in the distribution, for example through the power (P) transformation.
In this article, the two-parameter WBHD, which has a variety of interesting traits, is obtained by building on the distributions discussed above. Its PDF is flexible: it may be right-skewed (positively skewed), left-skewed (negatively skewed), or symmetric, which provides extra tail flexibility. Its hazard rate can be decreasing, increasing, bathtub-shaped, or reverse-J shaped, among other forms. In addition, the proposed distribution has an exact closed-form CDF and PDF that are easy to handle. Because of these benefits, the distribution has promising potential for applications in a wide range of fields, such as biotechnological life testing, durability, and econometric data. For further reading see [2–7].
Recent years have seen an uptick in the number of authors interested in developing novel lifetime distributions for fitting real lifetime data; see, for example, [8–16]. Order statistics are also widely used to study the properties and applications of random variables and of functions associated with them; see [17–19].
Whether this distribution is needed is the most important question. To answer it, we briefly summarise the relevance of the WBHD: (i) the statistical functions of the WBHD can be expressed in a simple closed form; (ii) the properties of the WBHD can be derived without any special mathematical functions; (iii) the proposed WBHD provides more flexibility than existing distributions in terms of the shape of the hazard rate function; and (iv) the proposed model can fit different kinds of data, such as medical, engineering, and actuarial data, which makes it useful in many fields of science.
The rest of this article is organised as follows. Section 2 introduces the WBHD along with its PDF and CDF, and presents plots of the PDF and HRF. Section 3 establishes several statistical properties of the WBHD. Section 4 discusses eleven classical estimation methods and reports the simulation study and its numerical findings. Section 5 derives risk measures for the proposed model. Section 6 presents the real data analysis. Section 7 gives concluding remarks.
2 Formulation of the WBHD
In this section we define the proposed model. Using the cumulative distribution function (CDF) of the WD-G [20], we define the CDF of the two-parameter WBHD as follows
(1)
and its probability density function (PDF) is defined as follows
(2)
The hazard rate function (HRF) of WBHD is defined as follows
(3)
We now plot the possible shapes of the PDF and HRF of the WBHD. Fig 1 shows three possible shapes of the PDF of the WBHD: increasing, decreasing, and unimodal. Fig 2 shows the possible shapes of the HRF of the WBHD.
3 Statistical properties
Establishing the mathematical properties of the proposed model is essential for studying its behaviour and for simplifying computation; generating data from the proposed distribution, for instance, depends on the quantile function. This section presents the statistical properties of the WBHD.
3.1 Quantile function
To obtain the quantile function (QF) of the WBHD, we invert the CDF (1), which gives
(4)
where 0 < p < 1 and W(⋅) is the Lambert W function. The QF is used to find the quartiles of the WBHD and to generate random data sets by evaluating it at uniform random numbers on (0, 1).
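Inverse-transform sampling from a quantile function can be sketched in a few lines of Python. The WBHD CDF (1) and QF (4) are not reproduced here, so the sketch uses an exponential CDF as a stand-in and inverts it numerically by bisection; `placeholder_cdf`, `quantile`, and `sample` are hypothetical names introduced only for this illustration. With the closed-form QF (4) available, the bisection step would not be needed.

```python
import math
import random

def placeholder_cdf(x, rate=1.5):
    """Stand-in CDF (exponential). This is NOT the WBHD CDF."""
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0

def quantile(cdf, p, lo=0.0, hi=1.0, tol=1e-10):
    """Numerically invert a continuous, increasing CDF on [0, inf) by bisection."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    # Expand the upper bracket until it contains the p-th quantile.
    while cdf(hi) < p:
        hi *= 2.0
    # Halve the bracket until it is narrower than the tolerance.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def sample(cdf, n, seed=0):
    """Draw n values by evaluating the quantile function at uniform random numbers."""
    rng = random.Random(seed)
    # rng.random() lies in [0, 1); guard against an exact 0.
    return [quantile(cdf, max(rng.random(), 1e-12)) for _ in range(n)]

# Quartiles and a random sample from the stand-in distribution.
quartiles = [quantile(placeholder_cdf, p) for p in (0.25, 0.5, 0.75)]
draws = sample(placeholder_cdf, 1000)
```

Replacing `placeholder_cdf` with an implementation of (1) would give quartiles and random samples for the WBHD itself.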
3.2 Linear representation
The CDF (1) and the PDF (2) of the proposed model can be represented linearly by using the following expansion
where
and
follows the exponentiated Burr-Hatke distribution (ExBHD).
3.3 Moments
The qth moment of the WBHD has the form
where
. Setting q = 1, 2, 3, and 4, respectively, we obtain the first four moments about the origin of the WBHD.
The nth central moment of X, say μn, follows as
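As a numerical check of the step from raw to central moments, the following minimal sketch applies the standard binomial identity μn = Σk C(n, k) (−μ′1)^k μ′(n−k), which holds for any distribution with finite moments. The function name `central_moment` is hypothetical, and the raw moments used in the example are those of a unit exponential distribution (μ′k = k!), not of the WBHD.

```python
from math import comb

def central_moment(raw, n):
    """n-th central moment from raw moments, where raw[k] = E[X^k] and raw[0] = 1."""
    mean = raw[1]
    # Binomial expansion of E[(X - mean)^n].
    return sum(comb(n, k) * (-mean) ** k * raw[n - k] for k in range(n + 1))

# Raw moments of a unit exponential distribution (illustrative stand-in only).
raw_exp = [1, 1, 2, 6, 24]
variance = central_moment(raw_exp, 2)
third = central_moment(raw_exp, 3)
fourth = central_moment(raw_exp, 4)
```

Feeding the first four raw moments of the WBHD into the same function would give its variance and the central moments needed for skewness and kurtosis.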
3.4 Incomplete moments
The dth incomplete moment of WBHD is calculated as follows
The Lorenz curve, which is useful in many fields, can be obtained from the incomplete moments, where xp is the quantile function. The first incomplete moment is also used to calculate the mean residual life and the mean waiting time, which are given by m1(t) = [1 − T1(t)]/S(t) − t and M1(t) = t − T1(t)/F(t), respectively.
4 Methods of estimation
This section discusses eleven techniques for estimating the WBHD’s parameters, θ = (a, α)⊤, and compares them using Monte Carlo simulations. To compute the estimates of θ in the following approaches, we use the AdequacyModel package for the R software, which offers a general meta-heuristic optimization method for maximizing or minimizing an arbitrary objective function. Visit https://rdrr.io/cran/AdequacyModel/ for more information.
4.1 Classical methods of estimation
- i. With respect to the WBHD parameters, the maximum likelihood estimation (MLE) is calculated by maximizing the log-likelihood function, which is described as follows (x1, …, xn is a random sample from WBHD)
- ii. The Anderson-Darling estimation (ADE) is used to calculate the WBHD estimated parameters by minimizing the following equation (x(1) ≤ x(2) ≤ … ≤ x(n))
- iii. The right-tailed Anderson-Darling estimation (RADE) is used to calculate the WBHD estimated parameters by minimizing the following equation (x(1) ≤ x(2) ≤ … ≤ x(n))
- iv. The left-tailed Anderson-Darling estimation (LTADE) is used to calculate the WBHD estimated parameters by minimizing the following equation (x(1) ≤ x(2) ≤ … ≤ x(n))
- v. The Cramér-von Mises estimation (CVME) is used to calculate the WBHD estimated parameters by minimizing the following equation (x(1) ≤ x(2) ≤ … ≤ x(n))
- vi. The least-squares estimation (LSE) is used to calculate the WBHD estimated parameters by minimizing the following equation (x(1) ≤ x(2) ≤ … ≤ x(n))
- vii. The weighted least-squares estimation (WLSE) is used to calculate the WBHD estimated parameters by minimizing the following equation (x(1) ≤ x(2) ≤ … ≤ x(n))
- viii. The maximum product of spacing estimation (MPSE) is used to calculate the WBHD estimated parameters by maximizing the following equation (x(1) ≤ x(2) ≤ … ≤ x(n))
where
- ix. The minimum spacing absolute distance estimation (MSADE) is used to calculate the WBHD estimated parameters by minimizing the following equation (x(1) ≤ x(2) ≤ … ≤ x(n))
- x. The minimum spacing absolute-log distance estimation (MSALDE) is used to calculate the WBHD estimated parameters by minimizing the following equation (x(1) ≤ x(2) ≤ … ≤ x(n))
- xi. The percentile estimation (PE) is used to calculate the WBHD estimated parameters by minimizing the following equation (x(1) ≤ x(2) ≤ … ≤ x(n))
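Several of the estimators listed above share one pattern: build an objective from the model CDF evaluated at the ordered sample, then optimize it over the parameters. The sketch below illustrates that pattern for three of them (LSE, CVME, and MPSE) using their standard textbook objectives, which are assumed, not confirmed, to match the forms used in this paper. A one-parameter exponential CDF stands in for the WBHD CDF, and a simple golden-section search replaces the optimizer of the AdequacyModel package; all function names are hypothetical.

```python
import math
import random

def cdf(x, rate):
    """Placeholder model CDF (exponential), standing in for the WBHD CDF."""
    return 1.0 - math.exp(-rate * x)

def lse_objective(rate, xs):
    """Least squares: sum of (F(x_(i)) - i/(n+1))^2 over the ordered sample."""
    n = len(xs)
    return sum((cdf(x, rate) - i / (n + 1)) ** 2 for i, x in enumerate(xs, start=1))

def cvm_objective(rate, xs):
    """Cramer-von Mises: 1/(12n) + sum of (F(x_(i)) - (2i-1)/(2n))^2."""
    n = len(xs)
    return 1.0 / (12 * n) + sum(
        (cdf(x, rate) - (2 * i - 1) / (2 * n)) ** 2 for i, x in enumerate(xs, start=1)
    )

def neg_mps_objective(rate, xs):
    """Negative mean log-spacing; minimizing it maximizes the product of spacings."""
    n = len(xs)
    u = [0.0] + [cdf(x, rate) for x in xs] + [1.0]
    # Floor each spacing to avoid log(0) if tied observations occur.
    return -sum(math.log(max(u[i] - u[i - 1], 1e-300)) for i in range(1, n + 2)) / (n + 1)

def minimize_1d(f, lo, hi, tol=1e-8):
    """Golden-section search for the minimizer of a unimodal function on [lo, hi]."""
    g = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c, d = b - g * (b - a), a + g * (b - a)
    while b - a > tol:
        if f(c) < f(d):
            b, d = d, c
            c = b - g * (b - a)
        else:
            a, c = c, d
            d = a + g * (b - a)
    return 0.5 * (a + b)

# Simulated ordered sample from the placeholder model with true rate 2.
rng = random.Random(1)
xs = sorted(rng.expovariate(2.0) for _ in range(500))

estimates = {
    "LSE": minimize_1d(lambda r: lse_objective(r, xs), 0.05, 10.0),
    "CVME": minimize_1d(lambda r: cvm_objective(r, xs), 0.05, 10.0),
    "MPSE": minimize_1d(lambda r: neg_mps_objective(r, xs), 0.05, 10.0),
}
```

For the two-parameter WBHD the one-dimensional search would have to be replaced by a multivariate optimizer, but the objective functions keep the same structure.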
4.2 Monte Carlo simulations
Using simulations, we investigate how closely the estimates from each method approach the true values of the WBHD parameters. We use the sample sizes n = 25, 60, 100, 200, 300, and 500 and several parameter values. For each setting we generate N = 1,000 random samples from the WBHD and use the R software to compute the average absolute biases (ABBs), mean square errors (MSEs), and mean relative errors (MREs).
We find that the MPSE and MLE methods give the best estimates for data sets generated from the WBHD, followed by the ADE method. Tables 1–5 give the numerical results of our simulation, while Table 6 reports the ranks of each estimation method.
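The simulation design can be illustrated with a minimal Python sketch. It assumes the usual definitions of the three accuracy measures: the average of |θ̂ − θ|, the average of (θ̂ − θ)², and the average of |θ̂ − θ|/θ over the N replications. Because the WBHD sampler is not reproduced here, an exponential model with its closed-form MLE serves as a stand-in, and the names `accuracy_measures` and `simulate` are hypothetical.

```python
import random

def accuracy_measures(estimates, true_value):
    """Average absolute bias, mean square error, and mean relative error."""
    n = len(estimates)
    abs_bias = sum(abs(e - true_value) for e in estimates) / n
    mse = sum((e - true_value) ** 2 for e in estimates) / n
    mre = sum(abs(e - true_value) / true_value for e in estimates) / n
    return abs_bias, mse, mre

def simulate(n, true_rate=2.0, replications=1000, seed=0):
    """Repeat: draw a sample of size n, estimate the parameter, collect the estimate."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(replications):
        xs = [rng.expovariate(true_rate) for _ in range(n)]
        # Closed-form MLE of an exponential rate (stand-in for a WBHD estimator).
        estimates.append(n / sum(xs))
    return accuracy_measures(estimates, true_rate)

# Accuracy measures for a few of the sample sizes considered in the study.
results = {n: simulate(n) for n in (25, 100, 500)}
```

In this stand-in setting all three measures shrink as n grows, which is the consistency pattern the simulation tables are meant to check for each WBHD estimator.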
4.3 Concluding remarks on simulation results
- As the sample size gets larger, the MSE decreases gradually.
- As the sample size gets larger, the MRE decreases gradually.
- As the sample size gets larger, the bias decreases gradually.
- Table 6 shows that the best estimation method is the MPSE, as it has the lowest overall rank.
- Table 6 shows that the second-best estimation method is the MLE, as it has the second-lowest overall rank.
5 Risk measures
In this section we study some risk measures for the WBHD. One of these measures is the value at risk (VR), which is a quantile of the cumulative loss distribution (see Artzner [21]). It is defined for the WBHD as follows
The second risk measure is the tail value at risk, which quantifies the expected loss given that an event outside the specified probability level occurs. It is defined for the WBHD as follows
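Both measures can be computed numerically from a quantile function alone. The sketch below assumes the standard definitions, in which the value at risk at level p is the quantile Q(p) and the tail value at risk is (1/(1 − p)) ∫ Q(u) du over u in (p, 1). A unit-rate exponential quantile function is used as a stand-in for the WBHD quantile function (4), and all function names are hypothetical.

```python
import math

def placeholder_quantile(u, rate=1.0):
    """Stand-in quantile function (exponential). This is NOT the WBHD QF."""
    return -math.log(1.0 - u) / rate

def value_at_risk(quantile, p):
    """Value at risk at level p: the p-th quantile of the loss distribution."""
    return quantile(p)

def tail_value_at_risk(quantile, p, steps=20000):
    """Tail value at risk: average of Q(u) over u in (p, 1), by the midpoint rule."""
    h = (1.0 - p) / steps
    total = sum(quantile(p + (j + 0.5) * h) for j in range(steps))
    return total * h / (1.0 - p)

levels = (0.90, 0.95, 0.99)
var_values = [value_at_risk(placeholder_quantile, p) for p in levels]
tvar_values = [tail_value_at_risk(placeholder_quantile, p) for p in levels]
```

For the exponential stand-in the tail value at risk exceeds the value at risk by exactly the mean, which gives a convenient check on the numerical integration before it is applied to a heavier-tailed quantile function.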
5.1 Numerical simulations for risk measures
In this subsection, some numerical results on the risk measures of the WBHD and BHD are discussed. Tables 7 and 8 present numerical values of the two risk measures for both the WBHD and the BHD; these results are also shown graphically in Figs 3 and 4. From these tables, we conclude that the proposed model has larger values of the two measures than the BHD, so the WBHD has a heavier tail than the BHD and can be used for modeling insurance data and other heavy-tailed real data sets.
6 Analysis of COVID-19 real data sets
The use of real-world COVID-19 data sets in this section demonstrates the distribution’s adaptability. The first data set gives the COVID-19 mortality rate in Saudi Arabia over forty days, from 22 July to 30 August 2021. The second data set gives the COVID-19 mortality rate in Saudi Arabia over 32 days, from 15 September to 16 October 2020. Both data sets are available at https://covid19.who.int/. To show how flexible the WBHD is, we compare it with a variety of well-known models: the Burr-Hatke distribution (BHD) [22], inverse power Burr-Hatke distribution (IPBHD) [23], logarithmic Burr-Hatke exponential distribution (LBHED) [24], alpha power exponential distribution (APED) [25], Frechet distribution (FD), exponential distribution (ED), Lindley distribution (LND), Lomax distribution (LD), Frechet Weibull distribution (FWD) [15], and Maxwell distribution (MD).
We use several criteria to identify the model best suited to the COVID-19 data sets: the Akaike information criterion (A1), the corrected Akaike information criterion (A2), the Bayesian information criterion (A3), and the Hannan-Quinn information criterion (A4). We also consider goodness-of-fit statistics: Anderson-Darling (F1), Cramér-von Mises (F2), and Kolmogorov-Smirnov (F3) with its p-value (F3(p)). The best model for fitting the COVID-19 real data sets is the one with the smallest values of these measures, except for F3(p), for which a larger value indicates a better fit.
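The four information criteria depend only on the maximized log-likelihood, the number of parameters k, and the sample size n. The following minimal sketch uses their standard definitions; the function name and the log-likelihood value in the example are hypothetical and are not taken from Tables 9 and 10.

```python
import math

def information_criteria(loglik, k, n):
    """Standard AIC, corrected AIC, BIC, and Hannan-Quinn criteria."""
    aic = 2 * k - 2 * loglik
    aicc = aic + 2 * k * (k + 1) / (n - k - 1)
    bic = k * math.log(n) - 2 * loglik
    hqic = 2 * k * math.log(math.log(n)) - 2 * loglik
    return {"AIC": aic, "AICc": aicc, "BIC": bic, "HQIC": hqic}

# Hypothetical example: a two-parameter model fitted to 40 observations.
criteria = information_criteria(loglik=-50.0, k=2, n=40)
```

Because every criterion adds a penalty that grows with k, a two-parameter model such as the WBHD has to improve the log-likelihood enough to offset that penalty when it is compared with one-parameter competitors.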
Tables 9 and 10 report, for the two COVID-19 data sets respectively, the analytical measures together with the MLEs and their standard errors (SE). From these results we conclude that the WBHD performs far better than the competing models. Figs 5 and 6 show, for the two data sets respectively, the P-P plot and the fitted PDF, CDF, and SF of the WBHD. Figs 7 and 8 show, respectively, the behavior of the log-likelihood function around the estimated parameters; it is unimodal in each parameter, which indicates that the estimates are global maxima.
6.1 Concluding remarks on the application results
- The WBHD is more flexible than the baseline model of the family (BHD) for fitting the two COVID-19 real data sets.
- Tables 9 and 10 report the analytical measures together with the MLEs and their standard errors (SE) for the two COVID-19 data sets; they show that the proposed model gives the best fit to the analyzed data sets.
- From the analysis of the two COVID-19 data sets, we conclude that the WBHD performs far better than the competing models.
7 Conclusion
A new lifetime distribution, named the WBHD, was presented in this paper, and its statistical properties were derived. Eleven classical estimation approaches were considered for obtaining point estimates of the unknown WBHD parameters a and α. A simulation study was carried out in R to compare the effectiveness of the estimation approaches. Two COVID-19 data sets were used to illustrate the benefits of the proposed distribution, and the WBHD was found to fit the data best among the distributions under consideration. In addition, Figs 7 and 8 show that the log-likelihood function attains a global maximum at the estimated parameters.
8 Future work
In future work, the proposed model may be used to model real data sets in other areas, such as reliability engineering and survival analysis. The WBHD may also be extended to a bivariate WBHD, which can then be applied to real data sets. Bayesian estimation of the model parameters, based on complete and various kinds of censored samples and under a variety of loss functions, could also be studied.
References
- 1. Shahzad U., Ahmad I., Almanjahie I., Hanif M., and Al-Noor N. H. L-moments and calibration based variance estimators under double stratified random sampling scheme: an application of covid-19 pandemic. Scientia Iranica, 2021.
- 2. Lone S. A., Subzar M., and Sharma A. Enhanced estimators of population variance with the use of supplementary information in survey sampling. Mathematical Problems in Engineering, 2021, 2021.
- 3. Bhushan S., Kumar A., Akhtar M. T., and Lone S. A. Logarithmic type predictive estimators under simple random sampling. AIMS Mathematics, 7(7):11992–12010, 2022.
- 4. Bhushan S., Kumar A., and Lone S. A. On some novel classes of estimators using ranked set sampling. Alexandria Engineering Journal, 61(7):5465–5474, 2022.
- 5. Shahzad U., Ahmad I., Al-Noor N. H., and Benedict T. J. Use of calibration constraints and linear moments for variance estimation under stratified adaptive cluster sampling. Soft Computing, pages 1–12, 2022.
- 6. Bhushan S., Kumar A., Shahzad U., Al-Omari A. I., and Almanjahie I. M. On some improved class of estimators by using stratified ranked set sampling. Mathematics, 10(18):3283, 2022.
- 7. Shahzad U., Ahmad I., Almanjahie I., and Al-Noor N. H. Utilizing l-moments and calibration method to estimate the variance based on covid-19 data. Fresenius Environmental Bulletin, 30(7A):8988–8994, 2021.
- 8. Teamah A. A. M., Elbanna A. A., and Gemeay A. M. Fréchet-Weibull mixture distribution: Properties and applications. Applied Mathematical Sciences, 14(2):75–86, 2020.
- 9. Aljohani H. M., Almetwally E. M., Alghamdi A. S., and Hafez E. H. Ranked set sampling with application of modified kies exponential distribution. Alexandria Engineering Journal, 60(4):4041–4046, 2021.
- 10. Wang W., Ahmad Z., Kharazmi O., Ampadu C. B., Hafez E. H., and Mohie El-Din M. M. New generalized-X family: Modeling the reliability engineering applications. PLoS ONE, 16(3):e0248312, 2021. pmid:33788850
- 11. Afify A. Z., Gemeay A. M., and Ibrahim N. A. The heavy-tailed exponential distribution: Risk measures, estimation, and application to actuarial data. Mathematics, 8(8):1276, 2020.
- 12. Abd El-Raheem A. M., Almetwally E. M., Mohamed M. S., and Hafez E. H. Accelerated life tests for modified kies exponential lifetime distribution: binomial removal, transformers turn insulation application and numerical results. AIMS Mathematics, 6(5):5222–5255, 2021.
- 13. Almongy H. M., Almetwally E. M., Alharbi R., Alnagar D., Hafez E. H., and Mohie El-Din M. M. The weibull generalized exponential distribution with censored sample: estimation and application on real data. Complexity, 2021, 2021.
- 14. Tung Y. L., Ahmad Z., Kharazmi O., Ampadu C. B., Hafez E. H., and Mubarak Sh. A. M. On a new modification of the weibull model with classical and bayesian analysis. Complexity, 2021, 2021.
- 15. Teamah A. A. M., Elbanna A. A., and Gemeay A. M. Fréchet-Weibull distribution with applications to earthquakes data sets. Pakistan Journal of Statistics, 36(2), 2020.
- 16. Teamah A. A. M., Elbanna A. A., and Gemeay A. M. Right truncated Fréchet-Weibull distribution: Statistical properties and application. Delta Journal of Science, 41(1):20–29, 2020.
- 17. Almongy H. M., Almetwally E. M., Aljohani H. M., Alghamdi A. S., and Hafez E. H. A new extended rayleigh distribution with applications of covid-19 data. Results in Physics, 23:104012, 2021. pmid:33728260
- 18. Al-Babtain A. A., Elbatal I., Al-Mofleh H., Gemeay A. M., Afify A. Z., and Sarg A. M. The flexible burr xg family: Properties, inference, and applications in engineering science. Symmetry, 13(3):474, 2021.
- 19. Alfaer N. M., Gemeay A. M., Aljohani H. M., and Afify A. Z. The extended log-logistic distribution: Inference and actuarial applications. Mathematics, 9(12):1386, 2021.
- 20. Bakouch H., Chesneau C., and Enany M. A weighted general family of distributions: Theory and practice. Computational and Mathematical Methods, page e1135, 2020.
- 21. Artzner P. Application of coherent risk measures to capital requirements in insurance. North Am. Actuar. J., 3(2):11–25, 1999.
- 22. Isaic-Maniu A. and Voda V. G. h. Generalized Burr-Hatke equation as generator of a homographic failure rate. Journal of applied quantitative methods, 3(3), 2008.
- 23. Afify A. Z., Aljohani H. M., Alghamdi A. S., Gemeay A. M., and Sarg A. M. A new two-parameter burr-hatke distribution: Properties and bayesian and non-bayesian inference with applications. Journal of Mathematics, 2021, 2021.
- 24. Abouelmagd T. H. M. The logarithmic Burr-Hatke exponential distribution for modeling reliability and medical data. International Journal of Statistics and Probability, 7(5):1, 2018.
- 25. Mahdavi A. and Kundu D. A new method for generating distributions with an application to exponential distribution. Communications in Statistics-Theory and Methods, 46(13):6543–6557, 2017.