Bivariate power Lomax distribution with medical applications

In this paper, a bivariate power Lomax distribution based on Farlie-Gumbel-Morgenstern (FGM) copulas and univariate power Lomax distribution is proposed, which is referred to as BFGMPLx. It is a significant lifetime distribution for modeling bivariate lifetime data. The statistical properties of the proposed distribution, such as conditional distributions, conditional expectations, marginal distributions, moment-generating functions, product moments, positive quadrant dependence property, and Pearson’s correlation, have been studied. The reliability measures, such as the survival function, hazard rate function, mean residual life function, and vitality function, have also been discussed. The parameters of the model can be estimated through maximum likelihood and Bayesian estimation. Additionally, asymptotic confidence intervals and credible intervals of Bayesian’s highest posterior density are computed for the parameter model. Monte Carlo simulation analysis is used to estimate both the maximum likelihood and Bayesian estimators.


Introduction
The Lomax (Lx) [1], or Pareto II, distribution presented primarily for modeling business failure data. In statistical literature, many authors used this distribution to model reliability data set and life testing [2], income and wealth data sets [3], biological sciences [4], and data set from receiver operating characteristic (ROC) curves analysis [5].
Rady et al. [6] proposed power lomax (PLx) model as a new extension of the Lx distribution with an extra shape parameter and applied it to medical databases. The cumulative distribution function (CDF) and probability density function (PDF) of power lomax are Fðx; g; b; lÞ ¼ 1 À l g ðl þ x b Þ À g ; g; b; l > 0; x > 0; ð1Þ and f ðx; g; b; lÞ ¼ gbl g x bÀ 1 ðl þ x b Þ À gÀ 1 ; g; b; l > 0; where γ, β is shape parameter and λ is scale parameter. In statistical literature, there are various ways to construct bivariate distributions, and copulas are one of them (see Lai [7] and Nelsen [8]). Copulas are a useful tool for describing a bivariate distribution with dependence structure. They are defined by Nelsen [8] as a function that joins bivariate distribution functions with uniform [0, 1] margins. Sklar [9] presents the joint PDF and joint CDF for two marginal univariate distributions as follows: If F(x i ) is the univariate CDF of X i , i = 1, 2, the joint CDF and PDF, denoted by F(x 1 , x 2 ) and f(x 1 , x 2 ), are defined by the copula function, given as where θ is the dependence measures between X 1 and X 2 , C is copula's cdf, and c is copula's pdf.
Bivariate distribution studies can be advanced through the use of copulas, which model the relationship between two random variables. Popular copulas include Gaussian, Clayton, Farlie-Gumbel-Morgenstern, Gumbel, Frank, and Archimedean copulas. Further details can be found in the references, [10][11][12][13][14][15][16][17]. The choice of copula function depends on the type of dependence structure between the two random variables. For example, the Gaussian copula is used to model linear dependence while the Clayton copula is used to model positive dependence.
The Farlie-Gumbel-Morgenstern (FGM) have been characterized using Eqs (3) and (4). Morgenstern [18] proposed a simple method for constructing a bivariate family of distributions using marginals. Farlie [19] proposed a generalization of Morgenstern's method that is known as the Farlie-Gumbel-Morgenstern (FGM) family of distributions. The FGM copula offers several advantages in modeling bivariate distributions. One of the main advantages is its flexibility in capturing a wide range of dependence structures, from complete independence to perfect dependence [14]. Additionally, the FGM copula is capable of handling asymmetrical dependence, making it well-suited for modeling data with skewed or heavy-tailed distributions [18]. Moreover, the FGM copula allows for the construction of bivariate distributions with a wide range of marginals, including continuous and discrete marginals [8]. Furthermore, the FGM copula has a simple form, which makes it computationally efficient and easy to implement in practice [10]. A new bivariate model based on adaptive progressive hybrid censored has been introduced by [11]. Bivariate Chen distribution based on FGM copula has been obtained by [12]. The bivariate models based on copula function with application of accelerated life testing (ALT) has been suggested by [13]. These features make the FGM copula a useful tool for many real-world applications, particularly in the fields of finance, insurance, and medical research [20].
A lot of work has been done in bivariate distributions based on Morgenstern-type distributions. Gupta et al. [21] derived three-and five-parameter bivariate beta distributions from the Morgenstern system of curves and studied the distributions of the product and quotient of variates. Vaidyanathan et al. [22] proposed a bivariate Lindley distribution using the Morgenstern method and presented some properties. Almetwally et al. [23] proposed bivariate distributions called the FGM Bivariate Fréchet (FGMBF) and AMH Bivariate Fréchet (AMHBF) distributions using Farlie-Gumbel-Morgenstern (FGM) and Ali-Mikhail-Haq (AMH) copulas and univariate Fréchet distributions. El-Sherpieny et al. [24] proposed the bivariate FGM Weibull-G family, which is a new flexible bivariate generalized family of distributions based on the FGM copula. Muhammed et al. [25] presented the bivariate inverted Topp-Leone (BITL) distribution, which is derived from Farlie-Gumbel-Morgenstern, Ali-Mikhail-Haq, Plackett, and Clayton copulas. Abulebda et al. [26] introduced a new bivariate XGamma (BXG) distribution and investigated its statistical properties through examination of real data. Hassan et al. [27] proposed the bivariate generalized half-logistic distribution using the FGM copula to asses household financial affordability in the Kingdom of Saudi Arabia.
Our motivation for conducting this research is to 1. propose a new bivariate called bivariate Farlie-Gumbel-Morgenstern power Lomax (BFGMPLx) distribution based on the Farlie-Gumbel-Morgenstern approach that overcomes limitations of existing distributions in modeling complex data sets.
2. Improve the modeling of complex data. The proposed distribution provides a more flexible and sophisticated method for modeling complex data sets, leading to improved predictions and decision-making.
3. Robustly handle heavy-tailed and skewed data. The new bivariate power Lomax distribution offers improved robustness and accuracy in handling such data, making it an ideal tool for various real-world applications.
4. gain new insights into the relationship between variables in medical data and the potential for improved patient outcomes.
The novelty of this research lies in the combination of the power Lomax distribution and the FGM copula. This new distribution addresses the limitations of prior work by offering improved robustness and accuracy in modeling heavy-tailed or skewed data. Additionally, the use of the FGM copula provides more flexibility in modeling of a variety of different types of dependence structures, including both positive and negative dependence between the two variables compared to traditional copula functions. This makes it a good choice for modeling data with complex relationships The paper is structured as follows: In section 2, the description and notation of BFGMPLx distribution is introduced. Section 3 discusses the statistical properties of BFGMPLx. In section 4, we study the positive (negative) quadrant dependence property of BFGMPLx. In section 5, different reliability measures for BFGMPLx are obtained. The maximum likelihood (ML) and Bayesian methods are used to estimate the parameters in BFGMPLx in section 6. Section 7 covers asymptotic and credible intervals. Simulation and an application to real data are provided in sections 8 and 9, respectively. Finally, the paper is concluded in section 10.
Sreelakshmi [29] introduced the relationship between copula and reliability copula which is defined as follows: According to (9), The FGM reliability is The following is the reliability function for BFGMPLx distribution:

Properties of BFGMPLx distribution
In this section, some important statistical properties of the BFGMPLx distribution are introduced such as marginal distributions, conditional distributions, conditional expectations, product moments and moment generating function.

The marginal distributions
The functions of marginal density for X 1 and X 2 , respectively, which are Power Lomax distributed as shown in Eqs (11) and (12).

Conditional distribution
The distribution of conditional probability for X 2 provided X 1 is obtained as follows The conditional CDF of X 2 given X 1 is as follows: The conditional probability distribution of X 1 given X 2 is derived as follows: The conditional CDF of X 1 given X 2 is as follows: The conditional expectation of X 1 given X 2 = x 2 in BFGMPLx using the conditional density of X 1 given X 2 = x 2 in (15) is calculated as where L j ¼ b j Þ , j = 1, 2 and Bða; bÞ is beta function with a and b which are two real numbers greater than 0.
The above conditional expectation is non-linear in X 2 , as can be seen. In a similar manner it can be demonstrated that the conditional expectation of X 2 given X 1 = x 1 is also non-linear in x 1 .

Generating random variables
By (14), a bivariate sample of the Power Lomax distribution based on the conditional method can be generated: 1. Generate U and V independently from a Uniform(0, 1) distribution. (14) to obtain x 2 using numerical analysis such as Newton Raphson, and etc.

Moment generating function
Let (X 1 , X 2 ) represent a random variable that its PDF defined in Eq (8). The moment generating function of (X 1 , X 2 ) is then obtained by where and j = 1, 2. Proof of moment generating function is given in S1 Appendix.

Product moments
If distribution of the random variable (X 1 , X 2 ) is BFGMPLx, then its r 1 th and r 2 th joint moments around zero denoted by m 0 r 1 r 2 can be presented as follows: where and j = 1, 2. Proof the product moments is given in S1 Appendix.
From (45) in S1 Appendix, the covariance and correlation (ρ) between X 1 and X 2 are calculated as follows: and rðX 1 ; X 2 Þ ¼ yð1 À L 1 À L 2 þ L 1 L 2 Þ ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi We notice that ρ = 0 when θ = 0, this implies that X 1 and X 2 are independent.

Positive quadrant dependence
Positive quadrant dependence property of BFGMPLx is presented in this section. Positive quadrant dependence, a type of random variable dependence, was introduced by Lehmann [30]. Two random variables X 1 and X 2 are considered positive quadrant dependent (PQD) if Theorem 1: BFGMPLx is PQD (NQD) for positive (negative) value of θ. Proof: Consider where xðx 1 ; which for all values of x 1 and x 2 is always nonnegative, because cdf and reliability function takes values ranging from zero to one. As a result, for positive values of θ, θξ(x 1 , x 2 ) � 0 8x 1 , x 2 . This demonstrates the condition stated in (22). Therefore, BFGMPLx is PQD for positive values of θ. Likewise, for negative values of θ, θξ(x 1 , As a result, inequality in (22) is reversed, hence for negative values of θ, BFGMPLx is NQD. Thus BFGMPLx has both positive and negative quadrant dependence.

Reliability measures
In this section, we derive reliability measures such as hazard rate, mean residual life, and vitality function in the context of BFGMPLx.

Hazard rate function
Using the definition of bivariate hazard rate function introduced by Basu [31], the hazard rate function of BFGMPLx is obtained as In Figs 4-6, we show the 3D plots of the joint hazard of a BFGMPLx distribution for various parameter values.
Basu has a main constraint, which its definition is from is not a vector quantity. To overcome this constraint, Johnson et al. [32] and Sreelakshmi [29] introduced the bivariate hazard rate function in vector form as follows: where R(.) denotes the bivariate reliability function. For FGM copula, Vaidyanathan [22] introduced À @ ln Rðx 1 ;x 2 Þ @x 1 as follows: From (10), we get The vector hazard rate function of BFGMPLx is obtained by substituting the above expressions in (25). Eq (27) consists of two terms: the first term is the hazard rate of the power lomax distribution, which is an inverted bathtub (IBT) for β > 1 and is a decreasing hazard rate (DHR) for 0 < β � 1, according to [6]. The second term ½1 À A À 1 l g 1

Mean residual life
The average remaining life of a unit after it has survived for a particular time t is denoted by mean residual life (MRL). Shanbag and Kotz [33] defined MRL for vector-valued random variables as: where and The expressions for m 1 (x 1 , x 2 ) and m 2 (x 1 , x 2 ) in BFGMPLx are obtained as: ð30Þ Substituting (30) and (31) in (29) yields BFGMPLx's MRL.

Vitality function
Sankaran and Nair [34] defined a bivariate vitality function for a two-component system as: where and Here υ 1 (x 1 , x 2 ) calculates the expected life time of first component as the sum of current age x 1 and the average lifetime remaining to it, assuming the second component has survived past age x 2 . υ 2 (x 1 , x 2 ) has a similar interpretation. Using Eqs (30) and (31) in (33), we obtain υ 1 (x 1 , x 2 ) and υ 2 (x 1 , x 2 ) of BFGMPLx as: ð34Þ From (34) and (35), the vitality function of BFGMPLx can be obtained using (32).

Methods of estimation
In this section, we present two estimation methods for estimating the unknown parameters of the BFGMPLx distribution: maximum likelihood estimation (MLE) and Bayesian estimation.

Maximum likelihood estimation
Let (x i1 , x i2 ), i = 1, 2, . . ., n denote random samples from BFGMPLx with parameters Θ = (β 1 , . the log likelihood function ln L is obtained by using the density function given in (8), We obtain the following likelihood equations by partially differentiating lnL with respect to the vector of parameters Θ and equating them to zero. The following are the first derivatives: @LðYÞ @y ¼ where j = (1, 2), l = (1, 2); j 6 ¼ l, (for example j = 1 then l = 2).

Bayesian estimation
Here, we employ the symmetric loss functions to derive the Bayes estimators of the BFGMPLx distribution's parameters. We must select an acceptable prior density function and hyperparameter values that reflect our belief regarding the data. We use the symmetric square error loss function (SSELF) to get the estimates based on a complete sample and assume that the parameters of BFGMPLx distribution are independent. For the parameters γ l , β l , and λ l , we select gamma-independent priors, specifically, while the copula parameter θ has uniform prior distribution where −1 < θ < 1 The joint prior Eq (41) as follows: The likelihood method's estimate and variance-covariance matrix can be used to determine how to elicit the independent joint prior's hyper-parameters. Gamma priors' mean and variance can be used to represent the derived hyper-parameters. For more information see [35][36][37]. The parameters γ l , β l , and λ l where l = 1, 2, of BFGMPLx distribution should be wellknown and positive. The likelihood function, Eq (43) is as follows: and the joint posterior distribution can be expressed using the joint prior function Eq (41) and likelihood function Eq (43). Consequently, the function of the Θ joint's posterior density is The symmetric loss function is the squared-error loss function, abbreviated as SELF. The Bayesian estimator of Θ under SELF is then the averageỸ ¼ E Y ðYÞ.
The expectation of loss functions are difficult to analyse by mathematical integration, hence the Markov Chain Monte Carlo (MCMC) approach will be utilised. Gibbs sampling and the broader Metropolis-within-Gibbs samplers are the two most significant MCMC algorithm sub-classes. Robert et al. [38] covered this algorithm. Similar to acceptance-rejection sampling, the Metropolis-Hastings (MH) method treats a candidate value derived from a proposal distribution as normal for each iteration of the process. The MH approach computes an appropriate transition in two phases, starting with Y i ¼Ŷ i : • Take π(Θ � |Θ) from a proposal distribution while Θ � is a constant.
a Y � jY ¼ min 1; GðY � jxÞ pðYÞ GðYjxÞ pðY � jYÞ � � : In addition to ensuring that the goal density stays invariant, Θ, this well-stated transition density guarantees that the chain converges to its specific invariant density starting from any initial condition.
Using the posterior conditional density functions of the relevant parameters, this method's basic idea is to provide posterior samples of the relevant parameters. The posterior density function of the relevant parameters is provided by Eq. (5.3). The posterior conditional density functions of γ l , β l , and λ l where l = 1, 2 can be constructed as follows from this equation: where l is 1 and 2.

Confidence intervals
In this section, we present two different methods to construct confidence intervals (CI) for the unknown parameters of BFGMPLx distribution, which are asymptotic confidence intervals (ACI) and highest posterior density Bayesian estimation.

Asymptotic confidence intervals
The asymptotic normal distribution of the MLE is the most widely used technique for establishing confidence bounds for the parameters. With respect to the Fisher information matrix I(Θ), which is comprised of the negative second derivatives of the natural logarithm of the likelihood function evaluated atŶ ¼ ðĝ 1 ;b 1 ;l 1 ;ĝ 2 ;b 2 ;l 2 ;ŷÞ, the asymptotic variance-covariance matrix of the MLE of the parameters, suppose that the asymptotic variance-covariance matrix of the parameter vector is , where l = 1, 2 and Z 0.025 is the percentile of the standard normal distribution with right tail probability a 2 . The second derivatives of the likelihood function with respect to the parameters are as follows > > = > > ; ; > > = > > ; ;

Highest posterior density Bayesian estimation
The approach of Chen and Shao [39] is frequently used to construct highest posterior density (HPD) intervals for unknown distribution parameters in Bayesian estimation. For instance, two endpoints from the MCMC sampling outputs, the lower 5% and upper 95% percentiles, can be used to calculate a 90% HPD interval. The following is how reliable Bayesian intervals of the parameters are obtained: l and θ [1] < θ [2] < . . . < θ [L] where l = 1, 2, 3, and L is the length of MCMC generated.

Simulation
The performance of MLE using the Newton-Raphson (NR) and Bayesian estimates using the Metropolis-Hastings methods are contrasted numerically in this section. For the parameters of the BFGMPLx distribution, the performance of the various techniques and the analytically deduced results may be evaluated exactly. The "maxLik" package was installed in R 4.2.2 programming, and after 1000 samples from BFGMPLx distribution had been gathered, the MLEs and 95% ACI estimations of the parameter values were assessed. To acquire the Bayes point estimates and their HPD interval estimates of the same unknown parameters using the "coda" tool in the R programming language. For simulation study of BFGMPLx distribution, the values of the parameter can be defined as follows: In Table 1: γ 1 = 0.6;β 1 = 3, λ 1 = 2.8, γ 2 = 0.9, β 2 = 2.5, λ 2 = 0.7. In Table 2: γ 1 = 1.3, β 1 = 5, λ 1 = 1.5, γ 2 = 0.9, β 2 = 4, λ 2 = 1.3. In Table 3: γ 1 = 0.8;β 1 = 1.3, λ 1 = 1.5, γ 2 = 1.2, β 2 = 1.5, λ 2 = 2. While θ is a copula parameter is changed in all tables as 0.4, and -0.5. The sample-sizes (n) are as follows 35, 50, 100, and 150. The simulation results of bias, mean squared error (MSE), and length of CI (LCI) (For MLE asymptotic CI (LACI), while Bayesian credible CI (LCCI)) on 5000 iteration of Monte Carlo simulation are shown in Tables 1-3. Nelsen [8] has explored the construction of a sample from a defined joint distribution and utilising a conditional technique to generate random variables. The generated samples are used to calculate the Bayesian estimates, together with the corresponding posterior symmetric loss function and Bayesian HPD intervals.
The following conclusions can be drawn from Tables 1-3: • It can be shown that MLE and Bayesian estimates of unknown parameters are fairly good in terms of Bias, MSE, and LCI.
• With an increase in sample size, it is shown that the posterior risks decrease and the estimated values of the parameters approach the nominal values of the parameters.
• Since Bayesian estimates incorporate prior knowledge based on gamma informative priors, they are superior to MLE in terms of bias, MSE, and LCI.
• Additionally, the HPD credible intervals beat the ACI in terms of LCI outcomes when the sample size n increases.

Application
The fit models were compared using the conventional value of criteria (VC), which included the Akaike information criterion (AIVC), consistent AIVC (CAIVC), Bayesian information criterion (BIVC), Hannan-Quinn information criterion (HQIVC), Anderson-Darling value (ADV), Cramer-von Mises value (CMV), Kolmogorov-Smirnov distance (KSD), p-value of Kolmogorov-S (PVKS), and standard error (SE). Our main statistical objective was to analyse a genuine dataset that is important in various domains using a fitting approach model. In this regard, we contrasted the suggested BFGMPLx distribution's fit with the BFGMWE and BFGMWITL mentioned by El-Sherpieny et al. [24] and Bivariate FGM Lomax-Claim (BFGMLC) distribution which was mentioned by Zhao et al. [40]. We have thought about the length of diabetes and serum creatinine in this part (SrCr). Since the patients' diabetes was already known, we are predicting the complications that may result from it (using the values of SrCr, the data has been divided into two groups: groups with diabetic nephropathy (DN) (SrCr 1.4mg/dl) and groups with nondiabetic nephropathy (SrCr 1.4mg/dl)). SrCr reports were provided for each patient from the 200 patients whose reports were available. From January 2012 to August 2013, the pathology reports of these patients were gathered from the path lab of Dr. Lal. This data has been discussed by Grover et al. [41]. Duration of diabetes: 7.4, 9, 10, 11, 12, 13, 13.75, 14.92, 15.8286, 16.9333, 18, 19, 20, 21, 22, 23, 24, 26, 26.6 Table 4 lists the MLE, SE, KS distance and associated p-value, CVM and AD for the marginal distributions. The fitted pdf with histogram plot, fitted cdf with empirical cdf and P-P plot of PL distribution in Figs 8 and 9 for Duration of diabetes and Serum Creatinine respectively support the findings (KS-test) in Table 4 of our study. show the fitted CDF with empirical CDF, fitted PDF with histogram plot, and PP-plot for the duration of diabetes and serum creatinine, respectively. In Table 5 goodness of fit measures for FGM copula have been obtained as kendall, sperman, tau, rho, statistics, and copula parameter θ with P-Value is 0.1339 > 0.05, then we accept the null hypothesises as the bivariate date is fit for FGM copula. The MLE with SE, and coefficient of variation (CV) for  Table 6. Table 7 obtained AIC, BIC, HQIC, and CAIC for bivariate models to select the best fit bivariate model of this data. MCMC summarized results for parameters of BFGMPLx model has been obtained in Table 8.

Conclusions
The bivariate power Lomax model, a new and more flexible extension of the bivariate distribution based on the FGM copula, is introduced in this paper. The suggested distribution is a significant and novel contribution to the field of bivariate modeling. It provides a robust and accurate tool for modeling heavy-tailed or skewed data and offers an improved way to model the dependence structure between two variables compared to traditional methods. Its fundamental mathematical characteristics are studied. Its hazard rate might be J-HR, IHR, Reversed J-HR, or Bathtube. It is demonstrated that the suggested model fits the positive (negative) quadrant dependence property. The parameter estimators, using likelihood and Bayesian methods, are derived, with the general conclusion that Bayesian estimation is better than its counterpart. A Monte Carlo simulation study is additionally offered to assess the  behavior of the estimators. Asymptotic confidence intervals of the likelihood estimation and highest posterior density of the Bayesian estimation are derived for the parameters of this model. Finally, the significance and adaptability of the BFGMPLx distribution are discussed by analyzing medical data related to the duration of diabetes and serum creatinine. Analytical evaluations revealed that our BFGMPLx model had a satisfactory match when compared to other distributions. Supporting information S1 Appendix. (PDF)