General two-parameter distribution: Statistical properties, estimation, and application on COVID-19

In this paper, we introduce a novel general two-parameter statistical distribution that can be presented as a mixture of exponential and gamma distributions. Some statistical properties of the general model are derived mathematically. Several estimation methods are used to estimate the parameters of the proposed model. A new statistical model is presented as a particular case of the general two-parameter model and is used to study the performance of the different estimation methods on randomly generated data sets. Finally, a COVID-19 data set is used to show that this particular case fits real-world data better than other well-known models.


Introduction
The spread of COVID-19 has caused international harm and economic instability in recent months, and scientists are studying it in great detail. Accurate facts and figures are essential to any effort to stop COVID-19. In the study and use of big data, it is always important to give the best possible description of the data under study. Recent research has shown how statistical distributions can be used to model data in applied sciences, especially in medical science. Statisticians often explore new statistical models to suit data sets in diverse domains, because such models are very useful in describing and predicting real phenomena. Many distributions have been widely used for data modeling in several domains during the last decades. Recent developments focus on defining new families that extend well-known distributions and, at the same time, provide great flexibility in modeling data in practice. Thus, several distributions for modeling lifetime data have been proposed in the statistical literature.

The cumulative distribution function (CDF) and hazard rate function (HRF) of the general two-parameter distribution (GTPD) are, respectively, defined as follows

$$F(x;\theta,k) = 1 - \frac{\theta^{k}\left[b(\theta)e^{-\theta x} + \theta^{-k}\,\Gamma(k+1, x\theta)\right]}{\theta^{k}b(\theta) + k!}, \tag{3}$$

$$h(x;\theta,k) = \frac{\theta\left(b(\theta) + x^{k}\right)}{b(\theta) + \theta^{-k}e^{\theta x}\,\Gamma(k+1, x\theta)}. \tag{4}$$

Statistical properties

Behavior of PDF and HRF of GTPD
This subsection discusses the behavior and possible shapes of the PDF (2) and HRF (4), and determines the mode of the GTPD. The PDF behavior is described as follows

$$\lim_{x \to 0} f(x) = \frac{\theta^{k+1}\,b(\theta)}{\theta^{k}b(\theta) + k!}$$
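The relationship between the PDF, CDF, and HRF can be checked numerically. The sketch below is illustrative only and rests on stated assumptions: the PDF (2) is not reproduced in this text, so the density used here is the form implied by the HRF (4) and the limit at x = 0; k is taken to be a non-negative integer, so that the incomplete gamma function has a closed form; and the function names, together with the argument `b` (the numerical value of b(θ)), are hypothetical.

```python
import math


def upper_gamma_int(k, z):
    """Upper incomplete gamma Gamma(k+1, z) for integer k >= 0 (closed form)."""
    partial = sum(z**j / math.factorial(j) for j in range(k + 1))
    return math.factorial(k) * math.exp(-z) * partial


def gtpd_pdf(x, theta, k, b):
    """Assumed density, chosen so that f(x)/S(x) reproduces the HRF (4)."""
    norm = theta**k * b + math.factorial(k)
    return theta ** (k + 1) * (b + x**k) * math.exp(-theta * x) / norm


def gtpd_cdf(x, theta, k, b):
    """CDF written as one minus the survival function."""
    norm = theta**k * b + math.factorial(k)
    surv = (theta**k * b * math.exp(-theta * x) + upper_gamma_int(k, theta * x)) / norm
    return 1.0 - surv


def gtpd_hrf(x, theta, k, b):
    """Hazard rate function in the form of Eq (4)."""
    tail = theta ** (-k) * math.exp(theta * x) * upper_gamma_int(k, theta * x)
    return theta * (b + x**k) / (b + tail)


# Hypothetical parameter values, chosen only for illustration.
theta, k = 1.5, 3
b = theta ** (-2)
x = 2.0
print("pdf:", gtpd_pdf(x, theta, k, b))
print("cdf:", gtpd_cdf(x, theta, k, b))
print("hrf:", gtpd_hrf(x, theta, k, b))
print("pdf / survival:", gtpd_pdf(x, theta, k, b) / (1.0 - gtpd_cdf(x, theta, k, b)))
```

The last two printed values should agree, since the hazard rate is the density divided by the survival function.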

Moments and related measures of GTPD
Let X ∼ GTPD. The ith moment of X is given by Eq (5). Hence, the first four moments of the GTPD random variable can be found by substituting i = 1, 2, 3, 4, respectively, in Eq (5). They are used to determine the variance, skewness, kurtosis, and coefficient of variation of the GTPD.
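As a worked illustration, the ith moment can be derived directly under one assumption: that the PDF has the form $f(x) = \theta^{k+1}\left(b(\theta) + x^{k}\right)e^{-\theta x} / \left(\theta^{k}b(\theta) + k!\right)$, which is the form implied by the HRF (4) and by the limit of f(x) at x = 0. The result may be arranged differently from the paper's Eq (5).

```latex
\begin{aligned}
E\left(X^{i}\right)
&= \frac{\theta^{k+1}}{\theta^{k}b(\theta) + k!}
   \int_{0}^{\infty} x^{i}\left(b(\theta) + x^{k}\right)e^{-\theta x}\,dx \\
&= \frac{\theta^{k+1}}{\theta^{k}b(\theta) + k!}
   \left[\frac{b(\theta)\,i!}{\theta^{i+1}} + \frac{(k+i)!}{\theta^{k+i+1}}\right] \\
&= \frac{\theta^{k}b(\theta)\,i! + (k+i)!}{\theta^{i}\left(\theta^{k}b(\theta) + k!\right)} .
\end{aligned}
```

Both integrals are gamma integrals, $\int_{0}^{\infty} x^{m}e^{-\theta x}\,dx = m!/\theta^{m+1}$ for integer m. Setting i = 1 gives the mean $\left[\theta^{k}b(\theta) + (k+1)!\right] / \left[\theta\left(\theta^{k}b(\theta) + k!\right)\right]$.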
The ith incomplete moments of the GTPD are determined as follows, where $\Gamma(a, z) = \int_{z}^{\infty} t^{a-1}e^{-t}\,dt$. Setting i = 1 in the last equation gives the first incomplete moment $T_1(t)$, which is used to calculate the mean residual life and the mean waiting time, which are, respectively, defined as follows: $c(t) = [1 - T_1(t)]/S(t) - t$; Another use of $T_1(t)$ is to calculate the Bonferroni and Lorenz curves, which are, respectively, defined as follows.

Entropy
It is commonly understood that entropy and information can be used to quantify the degree of uncertainty in a probability distribution. Many relations have been developed based on the properties of entropy.
The entropy of a random variable X is a measure of the variation of its uncertainty. The Rényi entropy [18] is defined as follows, where s > 0 is an integer and s ≠ 1. For the GTPD, the entropy is obtained by using the expansion of $(b(\theta) + x^{k})^{s}$, and Shannon's entropy is obtained as s → 1.
Proof. We have For simplification, we use ln

Estimation of GTPD parameters
This section presents several traditional estimation techniques that can be used to estimate the GTPD parameters. In each case, the estimates are obtained by either maximizing or minimizing an objective function, as discussed in detail below.
• Using the maximum likelihood estimation (MLE) approach, the estimated GTPD parameters are obtained by maximizing the log-likelihood function of the sample.
• Using the Anderson-Darling estimation (ADE) approach, the estimated GTPD parameters are obtained by minimizing the Anderson-Darling objective function.

PLOS ONE
• Using the right-tailed Anderson-Darling estimation (RADE) approach, the estimated GTPD parameters are obtained by minimizing the right-tailed Anderson-Darling objective function of the ordered sample ($x_{(1)} \le x_{(2)} \le \dots \le x_{(n)}$).
• Using the left-tailed Anderson-Darling estimation (LTADE) approach, the estimated GTPD parameters are obtained by minimizing the left-tailed Anderson-Darling objective function.
• Using the Cramér-von Mises estimation (CVME) approach, the estimated GTPD parameters are obtained by minimizing the Cramér-von Mises objective function.
• Using the least-squares estimation (LSE) approach, the estimated GTPD parameters are obtained by minimizing the least-squares objective function.
• Using the weighted least-squares estimation (WLSE) approach, the estimated GTPD parameters are obtained by minimizing the weighted least-squares objective function.
• Using the maximum product of spacing estimation (MPSE) approach, the estimated GTPD parameters are obtained by maximizing the product-of-spacings objective function of the ordered sample ($x_{(1)} \le x_{(2)} \le \dots \le x_{(n)}$).
• Using the minimum spacing absolute distance estimation (MSADE) approach, the estimated GTPD parameters are obtained by minimizing the absolute distance between the spacings $D_i$ and $\frac{1}{n+1}$.
• Using the minimum spacing absolute-log distance estimation (MSALDE) approach, the estimated GTPD parameters are obtained by minimizing an objective function whose summand is $\left|\log D_i - \log\frac{1}{n+1}\right|$.
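Several of these objective functions can be sketched in a few lines. The code below is a hedged illustration, not the authors' implementation: it uses the textbook forms of the Anderson-Darling, Cramér-von Mises, least-squares, weighted least-squares, and spacing-based criteria, which may differ in notation from the paper's equations; it assumes the special case b(θ) = θ^{-2} with integer k; and `fit_theta` is a hypothetical helper that holds k fixed and replaces a proper numerical optimizer with a crude grid search. The sample `xs` is made up for demonstration.

```python
import math


def gd_cdf(x, theta, k):
    """CDF for the special case b(theta) = theta**-2, integer k >= 0."""
    z = theta * x
    # Lower incomplete gamma function gamma(k+1, z), closed form for integer k.
    partial = sum(z**j / math.factorial(j) for j in range(k + 1))
    lower = math.factorial(k) * (1.0 - math.exp(-z) * partial)
    num = theta**k * (1.0 - math.exp(-z)) + theta**2 * lower
    return num / (theta**2 * math.factorial(k) + theta**k)


def fitted_cdf(xs, theta, k, eps=1e-12):
    """CDF at the ordered sample, clipped away from 0 and 1 so logs stay finite."""
    return [min(max(gd_cdf(x, theta, k), eps), 1.0 - eps) for x in sorted(xs)]


def spacings(F):
    """D_i = F(x_(i)) - F(x_(i-1)), with F(x_(0)) = 0 and F(x_(n+1)) = 1."""
    edges = [0.0] + list(F) + [1.0]
    return [max(hi - lo, 1e-300) for lo, hi in zip(edges, edges[1:])]


def ade(F):
    n = len(F)
    s = sum((2 * i - 1) * (math.log(F[i - 1]) + math.log(1.0 - F[n - i]))
            for i in range(1, n + 1))
    return -n - s / n


def cvme(F):
    n = len(F)
    return 1.0 / (12 * n) + sum((F[i - 1] - (2 * i - 1) / (2 * n)) ** 2
                                for i in range(1, n + 1))


def lse(F):
    n = len(F)
    return sum((F[i - 1] - i / (n + 1)) ** 2 for i in range(1, n + 1))


def wlse(F):
    n = len(F)
    return sum((n + 1) ** 2 * (n + 2) / (i * (n - i + 1)) * (F[i - 1] - i / (n + 1)) ** 2
               for i in range(1, n + 1))


def neg_mpse(F):
    """Negative mean log-spacing, so maximizing the MPS criterion is a minimization."""
    D = spacings(F)
    return -sum(math.log(d) for d in D) / len(D)


def msade(F):
    D = spacings(F)
    return sum(abs(d - 1.0 / len(D)) for d in D)


def msalde(F):
    D = spacings(F)
    return sum(abs(math.log(d) - math.log(1.0 / len(D))) for d in D)


def fit_theta(objective, xs, k, grid):
    """Crude grid search over theta with k held fixed (illustration only)."""
    return min(grid, key=lambda t: objective(fitted_cdf(xs, t, k)))


# Made-up sample and grid, for demonstration only.
xs = [0.21, 0.48, 0.77, 1.05, 1.32, 1.66, 2.01, 2.47, 3.10, 4.25]
grid = [i / 100 for i in range(20, 501, 2)]
for name, obj in [("ADE", ade), ("CVME", cvme), ("LSE", lse), ("WLSE", wlse),
                  ("MPSE", neg_mpse), ("MSADE", msade), ("MSALDE", msalde)]:
    print(name, fit_theta(obj, xs, 2, grid))
```

Every criterion takes the fitted CDF at the ordered sample as its only input, so a different model can be plugged in by replacing `gd_cdf`.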

Special case of GTPD
In this section, we present a new statistical distribution called the Gemeay distribution (GD), which is the special case of the GTPD obtained by taking $b(\theta) = \theta^{-2}$. The PDF, CDF, and HRF of the GD are, respectively, defined as follows

$$F(x) = \frac{\theta^{k}\left(1 - e^{-\theta x}\right) + \theta^{2}\,\gamma(k+1, x\theta)}{\theta^{2}k! + \theta^{k}}, \tag{8}$$

where $\gamma(k+1, x\theta) = k! - \Gamma(k+1, x\theta)$ is the lower incomplete gamma function. Plots of the PDF (7) are presented in Fig 1; they illustrate the behavior of the GTPD studied in Subsection 3.1 when b(θ) is replaced by $1/\theta^{2}$. The HRF (9) of the GD is presented graphically in Fig 2, which illustrates the results proved in Proposition 1 when b(θ) is replaced by $1/\theta^{2}$.

Numerical simulation
In this section, we use all of the estimation techniques discussed in Section 4, with $1/\theta^{2}$ substituted for b(θ), and investigate how well they perform when estimating the parameters of the GD. We compare the approaches through numerical measures, including the average absolute bias (BIAS), $|\mathrm{Bias}(\Omega)|$. The simulation results may be used to choose the best estimation method for the model parameters. R software is used to generate M = 10000 random samples from the GD for sample sizes n = 30, 75, 150, 250, 400, and 600.
The numerical results of the simulations are shown in Tables 1-5, where the superscript on each value gives its rank among the estimation techniques in the same row. Table 6 shows the partial and total rankings of our estimators. We find that MPSE is the best approach for estimating the parameters of the proposed model when using random samples from our model.
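A scaled-down version of such a simulation can be sketched as follows. This is an illustration under stated assumptions, not the paper's R code: samples are drawn through the exponential/gamma mixture form of the GD, with component weights derived from the assumed density; k is an integer held fixed, so only θ is estimated by maximum likelihood; the log-likelihood is assumed to be unimodal in θ on the search interval, which is what makes golden-section search valid; and the average absolute bias is taken to be the mean of $|\hat\theta - \theta|$ over replications. All function names and parameter values are hypothetical.

```python
import math
import random


def gd_sample(n, theta, k, rng):
    """Draw n values from the GD via its exponential/gamma mixture form."""
    w = theta ** (k - 2)                     # theta^k * b(theta) with b = theta^-2
    p_exp = w / (w + math.factorial(k))      # weight of the Exp(theta) component
    return [rng.expovariate(theta) if rng.random() < p_exp
            else rng.gammavariate(k + 1, 1.0 / theta)   # shape k+1, scale 1/theta
            for _ in range(n)]


def gd_loglik(theta, xs, k):
    """Log-likelihood of theta under the assumed GD density, k fixed."""
    const = (k + 1) * math.log(theta) - math.log(theta ** (k - 2) + math.factorial(k))
    return len(xs) * const + sum(math.log(theta ** -2 + x**k) - theta * x for x in xs)


def golden_max(f, lo, hi, tol=1e-6):
    """Golden-section search for the maximizer of a unimodal function."""
    invphi = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = hi - invphi * (hi - lo), lo + invphi * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc > fd:                          # maximizer lies in [lo, d]
            hi, d, fd = d, c, fc
            c = hi - invphi * (hi - lo)
            fc = f(c)
        else:                                # maximizer lies in [c, hi]
            lo, c, fc = c, d, fd
            d = lo + invphi * (hi - lo)
            fd = f(d)
    return (lo + hi) / 2.0


def mle_theta(xs, k, lo=0.01, hi=50.0):
    return golden_max(lambda t: gd_loglik(t, xs, k), lo, hi)


def simulate(theta, k, n, reps, seed=0):
    """Average absolute bias and mean squared error of the MLE of theta."""
    rng = random.Random(seed)
    errs = [mle_theta(gd_sample(n, theta, k, rng), k) - theta for _ in range(reps)]
    abs_bias = sum(abs(e) for e in errs) / reps
    mse = sum(e * e for e in errs) / reps
    return abs_bias, mse


# Far fewer replications than the M = 10000 used in the paper, to keep this quick.
for n in (30, 150):
    print(n, simulate(theta=1.5, k=2, n=n, reps=200))
```

Swapping `mle_theta` for a minimizer of another criterion would allow the same loop to compare estimation methods.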


Real data analysis
The flexibility of the distribution is shown in this section using real-world data. The data set is a COVID-19 data set of 30 days of mortality rates for the Netherlands, recorded from the 31st of March to the 30th of April 2020. It is based on the death rate in the general population and is available at https://covid19.who.int/. To illustrate how flexible the GD is, we compare it with a number of well-known models: the exponential distribution (ED), Frechet distribution (FD), Lindley distribution (LD), modified Kies exponential distribution (MKED) [19], Lomax distribution (L0D), Weibull Frechet distribution (WFD) [20], Frechet Weibull distribution (FWD) [21], Burr-Hatke distribution (BHD) [22], inverse log-logistic distribution (ILLD) [23], inversely weighted Lindley distribution (IWLD) [24], type I generalized half logistic distribution (TIGHLD) [25], half-logistic distribution (HLD), and Maxwell distribution (MD). To determine the most appropriate model for the COVID-19 data set, we use several analytical criteria: the Akaike information criterion (Cr 1 ), the corrected Akaike information criterion (Cr 2 ), the Bayesian information criterion (Cr 3 ), and the Hannan-Quinn information criterion (Cr 4 ). In addition, we base our choice on several goodness-of-fit statistics: Anderson-Darling (G 1 ), Cramér-von Mises (G 2 ), and Kolmogorov-Smirnov (G 3 ) with its p-value (G 3 (p)). The model with the minimum values of these measures, and the largest p-value, is the best model for fitting the COVID-19 data set.
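The four information criteria depend only on the maximized log-likelihood, the number of estimated parameters, and the sample size. The sketch below uses their standard definitions, which are assumed to match Cr 1 to Cr 4; the function name and the numerical inputs are hypothetical and are not values from Table 7.

```python
import math


def information_criteria(loglik, n_params, n_obs):
    """Standard AIC, corrected AIC, BIC, and Hannan-Quinn criteria."""
    p, n = n_params, n_obs
    aic = 2 * p - 2 * loglik
    caic = aic + 2 * p * (p + 1) / (n - p - 1)      # small-sample correction
    bic = p * math.log(n) - 2 * loglik
    hqic = 2 * p * math.log(math.log(n)) - 2 * loglik
    return {"Cr1": aic, "Cr2": caic, "Cr3": bic, "Cr4": hqic}


# Hypothetical log-likelihood for a two-parameter model fitted to 30 observations.
print(information_criteria(loglik=-50.0, n_params=2, n_obs=30))
```

Smaller values indicate a better trade-off between fit and model complexity, so candidate models fitted to the same data can be ranked by each criterion.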
The analytical measures, together with the MLE estimates and their standard errors (SE), are reported in Table 7 for the COVID-19 data set. According to these measures, the GD performs better than the competing models. The P-P plot and the fitted PDF, CDF, and survival function (SF) plots of the GD for the COVID-19 data set are shown in Fig 3 and indicate that the GD is a good fit. The total time on test (TTT) plot and the estimated HRF of the GD are shown in Fig 4 for the COVID-19 data set. The behavior of the log-likelihood function around the estimated parameters is shown in Fig 5 for the COVID-19 data set; the function is unimodal.

Conclusion
In this paper, we derived a general statistical model, called the general two-parameter distribution, as a mixture of exponential and gamma distributions. The PDF of the general model was derived in detail, together with its CDF and HRF. The behavior of the PDF and HRF of the GTPD at x = 0 and as x → ∞ was calculated, and the shapes of both functions were determined mathematically. Many statistical properties of the GTPD were determined, such as moments and their related measures, incomplete moments and their related measures, entropy, and stochastic orders. Ten different estimation methods were used to estimate the unknown parameters of the GTPD. The Gemeay distribution was presented as a special case of the GTPD. Randomly generated data sets from the GD were used to check the performance of the different estimation methods. The flexibility of the GD was illustrated using a real COVID-19 mortality rate data set, which the GD fitted better than the other well-known models compared.