Abstract
In this paper, from a practical point of view, we focus on modeling traumatic brain injury data across different stages of hospitalization, related to the survival of patients following traumatic brain injury caused by traffic accidents. From a statistical point of view, the primary objective is to overcome the limited number of traumatic brain injury patients available for study by considering different estimation methods and obtaining improved estimators of the model parameters that can be recommended for use with small samples. To keep the methodology general, at least in principle, we consider the very flexible Generalized Gamma distribution. We compare various estimation methods using extensive numerical simulations. The results reveal that the penalized maximum likelihood estimators have the smallest mean square errors and biases, proving to be the most efficient among the investigated methods, mainly for use in the presence of small samples. The Simulated Annealing technique is used to avoid numerical problems during the optimization process, as well as the need for good initial values. Overall, we consider three real data sets related to traumatic brain injury caused by traffic accidents to demonstrate that the Generalized Gamma distribution is a simple alternative for this type of application under different occurrence rates and risks, and in the presence of small samples.
Citation: Ramos PL, Nascimento DC, Ferreira PH, Weber KT, Santos TEG, Louzada F (2019) Modeling traumatic brain injury lifetime data: Improved estimators for the Generalized Gamma distribution under small samples. PLoS ONE 14(8): e0221332. https://doi.org/10.1371/journal.pone.0221332
Editor: Feng Chen, Tongji University, CHINA
Received: November 13, 2018; Accepted: August 6, 2019; Published: August 30, 2019
Copyright: © 2019 Ramos et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the paper and its Supporting Information file.
Funding: The research was partially supported by the Brazilian organizations CNPq, FAPESP, and CAPES, and was carried out using the computational resources of the Center for Mathematical Sciences Applied to Industry (CeMEAI) funded by FAPESP.
Competing interests: The authors have declared that no competing interests exist.
1 Introduction
The Gamma distribution plays an important role in statistics as one of the most used generalizations of the Exponential distribution, due to its various special cases (such as the Exponential and Chi-square). This distribution has been used in different scenarios, such as reliability engineering, environmental modeling, and health research, to list a few (see Louzada and Ramos [1] and the references therein). Stacy [2] proposed an important generalization of the Gamma distribution that unifies other relevant distributions, e.g., the Weibull and Lognormal. This generalization, called the Generalized Gamma (GG) distribution, has been successfully applied in diverse areas, such as reliability, data processing, and meteorology, among others (see Cox et al. [3] and the references therein). Moreover, it keeps the characteristic of having the support of a positive random variable T, and its probability density function (pdf) is given by

f(t | ϕ, μ, α) = (α / Γ(ϕ)) μ^{αϕ} t^{αϕ−1} exp{−(μt)^α},   (1)

where t > 0, Γ(ϕ) = ∫_0^∞ x^{ϕ−1} e^{−x} dx is the gamma function, α > 0 and ϕ > 0 are the shape parameters, and μ > 0 is the scale parameter. The GG distribution includes various sub-models as special cases, such as the Log-Normal, Weibull, Gamma, Half-Normal, Nakagami-m, Rayleigh, Maxwell-Boltzmann, and Chi distributions.
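For concreteness, a minimal R sketch of the density in (1) is given below; the function name dgg and the numerical sanity check are ours and are not part of the original analysis.

```r
# Minimal sketch of the GG density in Eq (1); function name and check are illustrative.
dgg <- function(t, phi, mu, alpha) {
  alpha / gamma(phi) * mu^(alpha * phi) * t^(alpha * phi - 1) * exp(-(mu * t)^alpha)
}
# Sanity check: the density should integrate to one for any valid parameter choice.
integrate(dgg, lower = 0, upper = Inf, phi = 0.5, mu = 0.5, alpha = 3)$value
```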
Frequentist inference for the GG distribution has been widely considered in the literature. Stacy and Mihram [4] derived the maximum likelihood estimators (MLEs). However, Hager and Bain [5] later showed that the nonlinear equations obtained by the maximum likelihood approach are unstable. DiCiccio [6] discussed approximate conditional inference methods for this distribution. Huang and Hwang [7] used the method of moments to perform inference for the GG distribution. Furthermore, Khodabin and Ahmadabadi [8] compared the method of moments estimators and the MLEs, and their results revealed that the MLEs generally performed better, despite some estimation limitations. Recently, Noufaily and Jones [9] discussed different approaches to maximize the likelihood function; the proposed numerical technique returned smaller proportions of errors during the maximization process, but still failed for a significant number of samples, which is undesirable.
1.1 Overview of the TBI problem
Trauma is a multisystem health condition that represents the third leading cause of death worldwide, surpassed only by cerebrovascular diseases and cancer [10]. It is estimated that over sixty million people suffer traumatic injuries each year, and nearly 16,000 people die every day after some traumatic injury. Traumatic Brain Injury (TBI) is one of the major causes of death and disability in the trauma epidemiology data [10, 11]. Most patients with TBI are young, economically active adults, and they are more likely to have been involved in a traffic accident [12–16]. Therefore, TBI is considered a public health concern that leads to high hospitalization costs with various economic and social burdens.
The limited data available on TBI, usually involving a small number of patients, motivated the current study, which relied on a data set from a longitudinal observational investigation of patients admitted to a Brazilian Emergency Department after TBI due to traffic accidents. Investigating optimal statistical analyses for this population is essential for providing impactful information not only to patients but also to their families, caregivers, and society in general [17].
1.2 Current statistical methods and their limitations
In the literature, there are various classical methods for estimating the unknown parameters of probability distributions. Under the frequentist approach, our primary interest is to compare the maximum likelihood estimation method with other estimation procedures. Related studies for different distributions have also been presented in the literature [18–22]. In this work, we consider several of these estimation procedures, such as the ordinary least squares, weighted least squares, maximum product of spacings, and Anderson-Darling maximum goodness-of-fit estimators.
An intensive simulation study is conducted to compare these estimation methods. However, we observe two significant problems. The first is that, for some methods, the estimation procedures fail to find the target estimates, i.e., they report convergence problems in the maximization/minimization process. The second is the occurrence of a significant bias in the obtained estimates for small samples. In order to overcome these problems, we propose a penalized maximum likelihood estimator which, combined with a very useful practical algorithm called Simulated Annealing (SANN) (for further details, see Kirkpatrick et al. [23]), achieves convergence regardless of the initial values, even when the objective function has several local extrema. Prentice [24] argued that the approximately normal distribution for ϕ, based on maximum likelihood theory, may not be achieved even for sample sizes equal to or larger than 400. Due to the asymptotic relationship between the maximum likelihood estimator and the penalized maximum likelihood estimator, the same problem may be observed. Therefore, we consider a bootstrap approach to build accurate confidence intervals (see DiCiccio and Efron [25]) for small and moderate samples. Finally, by combining all these approaches, one can perform inference for the flexible GG distribution with good precision even for small sample sizes.
The paper is organized as follows. Section 2 presents some properties of the GG distribution, including its cumulative distribution, survival and hazard functions, and its moments. Additionally, the SANN algorithm is discussed in detail, and implementation procedures are presented. Section 3 discusses the eight estimation methods considered in this paper. Section 4 presents a simulation study, using synthetic data, designed to identify the most efficient estimation procedure. In Section 5, we apply the proposed methodology to three new real data sets provided by a medical school, which contain the TBI patients' lifetime data at different hospitalization stages. Finally, some concluding remarks are given in Section 6.
2 Background
Let T be a random variable with GG distribution, i.e., T ~ GG(ϕ, μ, α). Then, its cumulative distribution function (cdf) is given by

F(t | ϕ, μ, α) = γ(ϕ, (μt)^α) / Γ(ϕ),   (2)

where γ(y, x) = ∫_0^x w^{y−1} e^{−w} dw is the lower incomplete gamma function. The survival function is given by

S(t | ϕ, μ, α) = Γ(ϕ, (μt)^α) / Γ(ϕ),   (3)

where Γ(y, x) = ∫_x^∞ w^{y−1} e^{−w} dw is the upper incomplete gamma function. The lower and upper incomplete gamma functions are standard functions in many pieces of software, such as R, SAS and Ox. Finally, the hazard function is given by

h(t | ϕ, μ, α) = f(t | ϕ, μ, α) / S(t | ϕ, μ, α) = α μ^{αϕ} t^{αϕ−1} exp{−(μt)^α} / Γ(ϕ, (μt)^α).   (4)
Glaser [26] showed that the hazard function (4) of the GG distribution can capture basic shapes, such as constant, increasing, decreasing, bathtub and unimodal. Fig 1 presents some examples of the shapes of the pdf and hazard function, considering different values of ϕ, μ and α.
Fig 1. (A) Pdf of the GG distribution. (B) Hazard function of the GG distribution.
The r-th moment of T about the origin can be obtained as

E[T^r] = Γ(ϕ + r/α) / (μ^r Γ(ϕ)),  r = 1, 2, ….   (5)

Then, the mean and variance of the GG distribution are given, respectively, by

E[T] = Γ(ϕ + 1/α) / (μ Γ(ϕ))  and  Var[T] = (1/μ²) [ Γ(ϕ + 2/α)/Γ(ϕ) − (Γ(ϕ + 1/α)/Γ(ϕ))² ].   (6)
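These quantities are straightforward to evaluate in R, since pgamma() returns the regularized incomplete gamma ratio appearing in (2) and (3); the sketch below is ours and all function names are illustrative.

```r
# Sketch of the cdf (2), survival (3), hazard (4) and the moments in (5)-(6).
dgg <- function(t, phi, mu, alpha)                     # pdf, Eq (1)
  alpha / gamma(phi) * mu^(alpha * phi) * t^(alpha * phi - 1) * exp(-(mu * t)^alpha)
pgg <- function(t, phi, mu, alpha)                     # cdf, Eq (2)
  pgamma((mu * t)^alpha, shape = phi)
sgg <- function(t, phi, mu, alpha)                     # survival, Eq (3)
  pgamma((mu * t)^alpha, shape = phi, lower.tail = FALSE)
hgg <- function(t, phi, mu, alpha)                     # hazard, Eq (4)
  dgg(t, phi, mu, alpha) / sgg(t, phi, mu, alpha)

gg_mean <- function(phi, mu, alpha)                    # E[T], Eq (6)
  gamma(phi + 1 / alpha) / (mu * gamma(phi))
gg_var <- function(phi, mu, alpha)                     # Var[T], Eq (6)
  (gamma(phi + 2 / alpha) / gamma(phi) - (gamma(phi + 1 / alpha) / gamma(phi))^2) / mu^2

gg_mean(0.5, 0.5, 3); gg_var(0.5, 0.5, 3)              # example values
```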
2.1 Simulated annealing algorithm
The SANN algorithm was developed as a generalization of the Metropolis algorithm (Metropolis et al. [27]), which simulates the changes in the energy of molten metal as its temperature is slowly lowered. The purpose of the cooling process is to reach a globally minimum energy state, i.e., to obtain a solid that is in its ground state. However, as pointed out by Salter and Pearl [28], if the temperature is lowered too fast, the resulting solid can become trapped in a metastable state that is not its ground state. Some authors, such as Kirkpatrick et al. [23], noticed the analogy between the cooling of a substance to its minimum energy state and the minimization of a function by a stochastic search strategy. In this analogy, the metastable state represents a local minimum, the ground state represents the global minimum, and the cooling rate corresponds to parameters that control the solutions considered by the search algorithm.
Let g(⋅) be the function to be minimized (objective function), let x0 be the initial solution (initial point), and let k0 be the initial value of the control parameter (initial temperature). The SANN algorithm can be described in a formal and general way as follows.
1. Set i = 0.
2. From the current solution, xi, generate a potential solution, xj, according to a specific generation scheme.
3. If Δg = g(xj) − g(xi) ≤ 0, then set xi+1 = xj with probability p = 1. Otherwise, set xi+1 = xj with probability p = exp{−Δg / ki}.
4. Update the value of the control parameter, ki, set i = i + 1, and go to step 2.
Steps 2-4 are repeated, e.g., until the value of the control parameter, ki, is sufficiently small, or the same solution is repeatedly generated in many successive iterations.
Next, some useful remarks about the above algorithm are given.
- For some choices of k0, including k0 = log (g (x0)), see, e.g., Aarts and Korst [29].
- In step 2, the potential solutions, xj's, are randomly chosen within a range. For instance, through xj = xi + rV, where r is a uniformly distributed random number in the interval (−1, 1), i.e., r ~ U(−1, 1), and V is a vector (of length d) of step sizes. After s iterations, the SANN adjusts its search bounds for each variable so that about 50% of all moves are accepted, either enlarging these bounds to explore new ground or shrinking them to a minimum.
- In step 3, for the cases where Δg > 0, we generate u ~ U(0, 1) and move to xj only if u < p. Thus, accepting worse solutions may prevent the process from becoming stuck at local minima.
- In step 4, after m iterations, the temperature (control parameter) k drops as k′ = rk × k, where 0 ≤ rk ≤ 1 is the temperature reduction rate of the annealing/cooling schedule. Usually, rk = 0.95.
- As pointed out by Salter and Pearl [28], by Markov chain theory, the SANN algorithm can be shown to converge to a stationary distribution for which the set of optimal solutions has probability 1, under certain conditions on both the sequence of control parameters and the generation scheme (see, e.g., Aarts and Korst [29], Haario and Saksman [30]).
Although presented above as a minimization procedure, the SANN algorithm can be easily adapted to cases where the interest lies in maximizing the function g(⋅).
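As an illustration, the sketch below uses base R's optim() with method = "SANN" (a simulated-annealing variant) to maximize the GG log-likelihood of Section 3.2 by minimizing its negative; the log-scale reparameterization, the random starting values and the tuning constants (temp, tmax, maxit) are our own illustrative choices, not the authors' settings.

```r
set.seed(2019)
t_obs <- rgamma(30, shape = 2, rate = 1.5)   # illustrative data (alpha = 1 is a special case of the GG)

# Negative GG log-likelihood; parameters are optimized on the log scale to keep them positive.
neg_loglik <- function(par, t) {
  phi <- exp(par[1]); mu <- exp(par[2]); alpha <- exp(par[3])
  -sum(log(alpha) - lgamma(phi) + alpha * phi * log(mu) +
         (alpha * phi - 1) * log(t) - (mu * t)^alpha)
}

# SANN does not require good starting values; random ones are used, as in the simulations of Section 4.
fit <- optim(par = log(runif(3, 0.1, 4)), fn = neg_loglik, t = t_obs,
             method = "SANN", control = list(maxit = 20000, temp = 10, tmax = 10))
exp(fit$par)   # estimates of (phi, mu, alpha)
```

In practice, the SANN solution can also be passed as a starting value to a gradient-based optimizer for a final local refinement.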
Several variants of the SANN algorithm have been proposed in the recent literature. Among them, we can mention the relevant works of Torres-Jimenez and Rodriguez-Tello [31], Torres-Jimenez et al. [32], and Izquierdo-Marquez et al. [33].
3 Inference
In this section, we present different frequentist estimation methods to obtain the estimates for the parameters ϕ, μ, and α of the GG distribution.
3.1 Common estimators
The method of moments (MM) is one of the oldest procedures used for estimating parameters in statistical models. It is still widely used mainly because of its simplicity. For instance, for the two-parameter Gamma distribution, MM estimators have closed-form expressions.
Huang and Hwang [7] derived the MM estimators for the GG distribution, pointing out the need to solve two nonlinear equations to find such estimators. Let t1, …, tn be a random sample of size n from T ∼ GG(ϕ, μ, α). The moments estimators are obtained by solving the system of nonlinear equations in (7), which involves the sample mean and standard deviation, while the estimate of ϕ can then be obtained from (8), using the solutions of (7). The MM estimators of the GG model parameters do not have closed-form expressions, which is undesirable. Another disadvantage of this approach is that the authors did not discuss the asymptotic properties of the MM estimators. Therefore, no interval estimates for ϕ, μ and α can be constructed without further research.
Another common procedure is to consider the ordinary least squares (OLS) estimators. The OLS estimates of ϕ, μ and α can be obtained by minimizing, with respect to ϕ, μ and α, the function

Σ_{i=1}^{n} [ F(t(i) | ϕ, μ, α) − i/(n + 1) ]²,   (9)

where t(1) ≤ t(2) ≤ ⋯ ≤ t(n) are the order statistics of a random sample of size n. Equivalently, these estimates can be obtained by solving the nonlinear equations

Σ_{i=1}^{n} [ F(t(i) | ϕ, μ, α) − i/(n + 1) ] Δ_j(t(i) | ϕ, μ, α) = 0,  j = 1, 2, 3,   (10)

where

Δ_1(t(i) | ϕ, μ, α) = ∂F(t(i) | ϕ, μ, α)/∂ϕ,  Δ_2(t(i) | ϕ, μ, α) = ∂F(t(i) | ϕ, μ, α)/∂μ  and  Δ_3(t(i) | ϕ, μ, α) = ∂F(t(i) | ϕ, μ, α)/∂α.   (11)
Note that Δ1(t(i) | ϕ, μ, α) involves a non-trivial partial derivative of the lower incomplete gamma function. However, it can be easily computed numerically with high precision.
The weighted least squares (WLS) estimates of ϕ, μ and α can be obtained by minimizing, with respect to ϕ, μ and α, the function

Σ_{i=1}^{n} [ (n + 1)² (n + 2) / ( i (n − i + 1) ) ] [ F(t(i) | ϕ, μ, α) − i/(n + 1) ]².   (12)

These estimates can also be obtained by solving the nonlinear equations

Σ_{i=1}^{n} [ (n + 1)² (n + 2) / ( i (n − i + 1) ) ] [ F(t(i) | ϕ, μ, α) − i/(n + 1) ] Δ_j(t(i) | ϕ, μ, α) = 0,   (13)

for j = 1, 2, 3.
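A compact sketch of how the criteria (9) and (12) can be minimized numerically is shown below; the data, starting values and the use of SANN as the optimizer are illustrative choices of ours.

```r
set.seed(2019)
t_obs <- rgamma(30, shape = 2, rate = 1.5)                                   # illustrative data
F_gg  <- function(t, phi, mu, alpha) pgamma((mu * t)^alpha, shape = phi)     # GG cdf, Eq (2)

# OLS criterion, Eq (9): squared distances between the fitted cdf and i/(n + 1).
ols_obj <- function(par, t) {
  phi <- exp(par[1]); mu <- exp(par[2]); alpha <- exp(par[3])
  ts <- sort(t); n <- length(ts); i <- seq_len(n)
  sum((F_gg(ts, phi, mu, alpha) - i / (n + 1))^2)
}
# WLS criterion, Eq (12): the same distances weighted by (n + 1)^2 (n + 2) / (i (n - i + 1)).
wls_obj <- function(par, t) {
  phi <- exp(par[1]); mu <- exp(par[2]); alpha <- exp(par[3])
  ts <- sort(t); n <- length(ts); i <- seq_len(n)
  w  <- (n + 1)^2 * (n + 2) / (i * (n - i + 1))
  sum(w * (F_gg(ts, phi, mu, alpha) - i / (n + 1))^2)
}
exp(optim(log(c(1, 1, 1)), ols_obj, t = t_obs, method = "SANN")$par)          # OLS estimates
exp(optim(log(c(1, 1, 1)), wls_obj, t = t_obs, method = "SANN")$par)          # WLS estimates
```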
3.2 Maximum likelihood estimators
The maximum likelihood (ML) estimation method is widely used for the GG distribution due to the invariance and asymptotic properties of the obtained estimators. Let t = (t1, …, tn)′ be a random sample of size n from a GG(ϕ, μ, α) population. Then, the likelihood function obtained from (1) is given by

L(ϕ, μ, α | t) = ( α^n / Γ(ϕ)^n ) μ^{nαϕ} ( Π_{i=1}^{n} t_i^{αϕ−1} ) exp{ −Σ_{i=1}^{n} (μ t_i)^α }.   (14)

The log-likelihood function obtained from (14) is given by

ℓ(ϕ, μ, α | t) = n log α − n log Γ(ϕ) + n α ϕ log μ + (αϕ − 1) Σ_{i=1}^{n} log t_i − Σ_{i=1}^{n} (μ t_i)^α.   (15)
By solving ∂ℓ/∂ϕ = 0, ∂ℓ/∂μ = 0 and ∂ℓ/∂α = 0, the following nonlinear equations are obtained, respectively:

n ψ(ϕ) = α Σ_{i=1}^{n} log(μ t_i),   (16)

n ϕ = Σ_{i=1}^{n} (μ t_i)^α,   (17)

n/α + ϕ Σ_{i=1}^{n} log(μ t_i) = Σ_{i=1}^{n} (μ t_i)^α log(μ t_i),   (18)

where ψ(k) = ∂ log Γ(k)/∂k is the digamma function. The simultaneous solution of Eqs (16)–(18) yields the MLEs. After some algebraic manipulations, two of the parameters can be expressed as functions of the remaining one (Eqs (19) and (20)), and the MLE of α is obtained by solving a single nonlinear equation, (21).
Although only one nonlinear equation has to be solved, there are usually several local maxima, which may lead to estimates different from the expected ones. On the other hand, under mild conditions, the MLEs are asymptotically normally distributed with a joint trivariate normal distribution given by

(ϕ̂, μ̂, α̂) ~ N₃[ (ϕ, μ, α), I^{−1}(ϕ, μ, α) ]  as n → ∞,   (22)

where I(ϕ, μ, α) is the Fisher information matrix in (23) (see Hager and Bain [5] for its explicit form and a detailed discussion), which involves the trigamma function ψ′(ϕ) = ∂² log Γ(ϕ)/∂ϕ².
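A sketch of how the asymptotic result (22) is typically used in practice is given below: the MLEs are computed numerically and Wald-type intervals are formed from the inverse of a numerically differentiated observed information matrix. The data and all implementation details are illustrative and not taken from the paper.

```r
set.seed(2019)
t_obs <- rgamma(30, shape = 2, rate = 1.5)                  # illustrative data

neg_loglik <- function(par, t) {                            # negative of Eq (15), par = (phi, mu, alpha)
  phi <- par[1]; mu <- par[2]; alpha <- par[3]
  -sum(log(alpha) - lgamma(phi) + alpha * phi * log(mu) +
         (alpha * phi - 1) * log(t) - (mu * t)^alpha)
}

# Optimize on the log scale for stability, then evaluate the Hessian on the original scale.
fit <- optim(log(c(1, 1, 1)), function(p, t) neg_loglik(exp(p), t), t = t_obs, method = "BFGS")
est <- exp(fit$par)
H   <- optimHess(est, neg_loglik, t = t_obs)                # observed information matrix
se  <- sqrt(diag(solve(H)))
cbind(estimate = est, lower = est - qnorm(0.975) * se, upper = est + qnorm(0.975) * se)
```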
3.3 Penalized maximum likelihood estimators
Firth [34] proved that the bias of the MLEs can be reduced by considering a penalization of the likelihood function. Moreover, the author showed that, in exponential families with canonical parameterization, the first-order bias term is removed by using the Jeffreys prior [35] as the penalization term. The Jeffreys prior for the GG distribution is computed as π_J(ϕ, μ, α) ∝ |I(ϕ, μ, α)|^{1/2}, where |⋅| stands for the determinant of the Fisher information matrix (23), which results in the prior given in (24).
As previously stated, the first-order bias term is removed for distributions that belong to the exponential family. The GG distribution, however, is not a member of the exponential family. Even so, this penalization still allows us to improve the estimates, although the improvement is not guaranteed to be of first order. Note that when ϕ = 1 the GG distribution reduces to the Weibull distribution, for which the Jeffreys prior is given by π_J(μ, α) ∝ (μα)^{−1}. The extra α^{−1} helps to decrease the bias of α; therefore, since π_J(ϕ, μ, α) is not a function of α, we consider the following penalization:

π(ϕ, μ, α) ∝ π_J(ϕ, μ, α) α^{−1}.   (25)
The penalized likelihood function of ϕ, μ and α, using the penalization term (25), is given by

L_P(ϕ, μ, α | t) = π(ϕ, μ, α) × L(ϕ, μ, α | t),   (26)

and the corresponding penalized log-likelihood function, obtained from (26), is

ℓ_P(ϕ, μ, α | t) = log π(ϕ, μ, α) + ℓ(ϕ, μ, α | t).   (27)
By solving ∂ℓ_P/∂ϕ = 0, ∂ℓ_P/∂μ = 0 and ∂ℓ_P/∂α = 0, the nonlinear equations (28), (29) and (30) are obtained, respectively. Note that one of the parameters can be isolated in order to reduce the system to two nonlinear equations; the three possible expressions are given in (31)–(33). From Eqs (31)–(33), we observe that, by considering (32), the penalized maximum likelihood (PML) estimators are achieved with the least computational effort. Therefore, we work with the nonlinear Eqs (28) and (30), with the remaining parameter substituted from (32).
Although Firth [34] proved that the PML estimators obtained from the penalized likelihood (or log-likelihood) function in the exponential family of distributions always exist and are always finite, the same cannot be shown for the GG distribution. For this model, the MLEs may have no solution or several solutions (see Wingo [36]). This problem is observed computationally, since it is very complex to prove analytically; it is also complicated to establish analytical results for the PML estimators. Note that our main goals here are to propose a method that circumvents these computational difficulties and achieves improved estimates for the parameters.
The Fisher information matrix I_P(ϕ, μ, α) associated with the penalized likelihood is given in (34), where the additional terms introduced by the penalization are given in (35).

It can be easily noted that I_P(ϕ, μ, α) → I(ϕ, μ, α) as n → ∞. Additionally, using the result in (36), the PML estimators of ϕ, μ and α converge to the MLEs. Hence, under the same mild conditions as for the MLEs, the PML estimators are asymptotically normally distributed with a joint trivariate normal distribution given by

(ϕ̂_P, μ̂_P, α̂_P) ~ N₃[ (ϕ, μ, α), I_P^{−1}(ϕ, μ, α) ]  as n → ∞.   (37)
It is important to point out that it is not simple to check the regularity conditions necessary to ensure the asymptotic normal distribution (see Lehmann and Casella [37], Theorem 5.1, page 463). In fact, Prentice [24] showed that the approximate normal distribution for ϕ, based on ML theory, may not be achieved even for sample sizes equal to or larger than 400. This result also extends to the PML theory, since the MLEs and the PML estimators are asymptotically equivalent. In order to overcome this problem for small sample sizes, we considered the bootstrap approach presented by DiCiccio and Efron [25] to construct improved confidence intervals based on the PML estimates.
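The sketch below illustrates a plain percentile bootstrap for such intervals; DiCiccio and Efron [25] discuss more refined constructions (e.g., BCa). The number of replications, the data, and the use of the unpenalized likelihood (the penalized criterion of this section can be substituted in the same way) are illustrative choices of ours.

```r
set.seed(2019)
t_obs <- rgamma(20, shape = 2, rate = 1.5)                  # small illustrative sample

neg_loglik <- function(par, t) {                            # negative GG log-likelihood (log scale)
  phi <- exp(par[1]); mu <- exp(par[2]); alpha <- exp(par[3])
  -sum(log(alpha) - lgamma(phi) + alpha * phi * log(mu) +
         (alpha * phi - 1) * log(t) - (mu * t)^alpha)
}
fit_gg <- function(t)                                       # SANN fit with random starting values
  exp(optim(log(runif(3, 0.1, 4)), neg_loglik, t = t,
            method = "SANN", control = list(maxit = 5000))$par)

B    <- 500                                                 # number of bootstrap resamples (illustrative)
boot <- t(replicate(B, fit_gg(sample(t_obs, replace = TRUE))))
apply(boot, 2, quantile, probs = c(0.025, 0.975))           # 95% percentile intervals for (phi, mu, alpha)
```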
3.4 Maximum product of spacings estimators
As an alternative to the ML estimation method, the maximum product of spacings (MPS) is a robust method for estimating the unknown parameters of continuous univariate distributions. Cheng and Amin [38, 39] introduced this method, which was independently developed by Ranneby [40] as an approximation to the Kullback-Leibler information measure. Moreover, Cheng and Amin [39] proved some desirable properties of the MPS estimators, such as asymptotic efficiency and invariance; more importantly, the consistency of these estimators holds under more general conditions than for the MLEs.
The uniform spacings of a random sample from the GG distribution are defined as

D_i(ϕ, μ, α) = F(t(i) | ϕ, μ, α) − F(t(i−1) | ϕ, μ, α),   (38)

for i = 1, 2, …, n + 1, where t(i) denotes the i-th order statistic, F(t(0) | ϕ, μ, α) = 0 and F(t(n+1) | ϕ, μ, α) = 1. This implies that Σ_{i=1}^{n+1} D_i(ϕ, μ, α) = 1.
The MPS estimates of ϕ, μ and α are obtained by maximizing the geometric mean of the spacings,

G(ϕ, μ, α) = [ Π_{i=1}^{n+1} D_i(ϕ, μ, α) ]^{1/(n+1)},   (39)

with respect to ϕ, μ and α, or, equivalently, by maximizing the logarithm of the geometric mean of the sample spacings,

H(ϕ, μ, α) = (1/(n + 1)) Σ_{i=1}^{n+1} log D_i(ϕ, μ, α).   (40)

Thus, the MPS estimates can be obtained by solving the nonlinear equations

(1/(n + 1)) Σ_{i=1}^{n+1} [ Δ_j(t(i) | ϕ, μ, α) − Δ_j(t(i−1) | ϕ, μ, α) ] / D_i(ϕ, μ, α) = 0,  j = 1, 2, 3,   (41)

where Δ_j(⋅ | ϕ, μ, α) are given in (11).
In practice, one problem that may occur is the presence of ties due to multiple observations with the same value. In this case, if t(i) = t(i−1) for some i ∈ {1, 2, …, n + 1}, then D_i(ϕ, μ, α) = 0. Thus, the MPS estimators are sensitive to closely-spaced observations, especially ties. Notice that

D_i(ϕ, μ, α) = ∫_{t(i−1)}^{t(i)} f(x | ϕ, μ, α) dx.   (42)

Hence, D_i(ϕ, μ, α) should be replaced by the corresponding likelihood contribution L(ϕ, μ, α | t(i)) = f(t(i) | ϕ, μ, α) when t(i) = t(i−1).
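A sketch of the MPS criterion (40), including the density substitution for zero spacings just described, is given below; the implementation choices are ours.

```r
set.seed(2019)
t_obs <- rgamma(30, shape = 2, rate = 1.5)                  # illustrative data

pgg <- function(t, phi, mu, alpha) pgamma((mu * t)^alpha, shape = phi)        # cdf, Eq (2)
dgg <- function(t, phi, mu, alpha)                                            # pdf, Eq (1)
  alpha / gamma(phi) * mu^(alpha * phi) * t^(alpha * phi - 1) * exp(-(mu * t)^alpha)

# Negative log geometric mean of the spacings (Eq (40)); zero spacings caused by ties
# are replaced by the corresponding density values, as discussed above.
neg_mps <- function(par, t) {
  phi <- exp(par[1]); mu <- exp(par[2]); alpha <- exp(par[3])
  ts <- sort(t)
  D  <- diff(c(0, pgg(ts, phi, mu, alpha), 1))              # spacings D_1, ..., D_{n+1}
  zero <- which(D <= 0)
  if (length(zero) > 0) D[zero] <- dgg(ts[pmin(zero, length(ts))], phi, mu, alpha)
  -mean(log(D))
}
exp(optim(log(c(1, 1, 1)), neg_mps, t = t_obs, method = "SANN")$par)          # MPS estimates
```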
Cheng and Amin [38] presented a useful comparison between the MLEs and the MPS estimators: the log-spacing log D_i(ϕ, μ, α) differs from log f(t(i) | ϕ, μ, α) by a remainder term R(t(i), t(i−1) | ϕ, μ, α) (Eq (43)), which is essentially of order O(|t(i) − t(i−1)|), where |t(i) − t(i−1)| → 0 in probability as n increases. For standard situations, log D_i(ϕ, μ, α) therefore behaves essentially like log f(t(i) | ϕ, μ, α) as a function of ϕ, μ and α, except for a negligible number of terms. Therefore, the MLEs and the MPS estimators are asymptotically equal and share the same asymptotic properties, i.e., the MPS estimators also follow the trivariate normal distribution in (22) (Eq (44)).
3.5 Anderson-Darling estimators
Here, we present one type of minimum distance estimator (also referred to as a maximum goodness-of-fit estimator), which is based on the Anderson-Darling statistic and, for this reason, is known as the Anderson-Darling (AD) estimator. The AD estimates of the GG model parameters ϕ, μ and α are obtained by minimizing, with respect to ϕ, μ and α, the function

A(ϕ, μ, α) = −n − (1/n) Σ_{i=1}^{n} (2i − 1) [ log F(t(i) | ϕ, μ, α) + log S(t(n+1−i) | ϕ, μ, α) ].   (45)

These estimates can also be obtained by solving the nonlinear equations

Σ_{i=1}^{n} (2i − 1) [ Δ_j(t(i) | ϕ, μ, α) / F(t(i) | ϕ, μ, α) − Δ_j(t(n+1−i) | ϕ, μ, α) / S(t(n+1−i) | ϕ, μ, α) ] = 0,  j = 1, 2, 3.   (46)
Turning now to a modified version of the AD statistic, the right-tail Anderson-Darling (RAD) estimates of the parameters ϕ, μ and α are obtained by minimizing, with respect to ϕ, μ and α, the function

R(ϕ, μ, α) = n/2 − 2 Σ_{i=1}^{n} F(t(i) | ϕ, μ, α) − (1/n) Σ_{i=1}^{n} (2i − 1) log S(t(n+1−i) | ϕ, μ, α).   (47)

These estimates can also be obtained by solving the nonlinear equations

−2 Σ_{i=1}^{n} Δ_j(t(i) | ϕ, μ, α) + (1/n) Σ_{i=1}^{n} (2i − 1) Δ_j(t(n+1−i) | ϕ, μ, α) / S(t(n+1−i) | ϕ, μ, α) = 0,  j = 1, 2, 3,   (48)

where Δ_j(⋅ | ϕ, μ, α), j = 1, 2, 3, are given in (11).
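A sketch of the AD criterion (45) minimized numerically is given below (the RAD criterion (47) is handled in exactly the same way); the data and optimizer settings are illustrative.

```r
set.seed(2019)
t_obs <- rgamma(30, shape = 2, rate = 1.5)                                     # illustrative data
F_gg  <- function(t, phi, mu, alpha) pgamma((mu * t)^alpha, shape = phi)       # cdf, Eq (2)
S_gg  <- function(t, phi, mu, alpha) pgamma((mu * t)^alpha, shape = phi,
                                             lower.tail = FALSE)               # survival, Eq (3)

# Anderson-Darling criterion, Eq (45), to be minimized over (phi, mu, alpha).
ad_obj <- function(par, t) {
  phi <- exp(par[1]); mu <- exp(par[2]); alpha <- exp(par[3])
  ts <- sort(t); n <- length(ts); i <- seq_len(n)
  -n - mean((2 * i - 1) * (log(F_gg(ts, phi, mu, alpha)) +
                             log(S_gg(rev(ts), phi, mu, alpha))))
}
exp(optim(log(c(1, 1, 1)), ad_obj, t = t_obs, method = "SANN")$par)            # AD estimates
```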
4 Simulation
In this section, we show the results of a simulation study carried out to compare the efficiency of the different frequentist methods used for estimating the three parameters of the GG distribution. Considering the proposed estimators, the following procedure was adopted:
- Generate N samples of size n from the GG(ϕ, μ, α) distribution and compute the estimates using the MM, OLS, WLS, ML, PML, MPS, AD and RAD methods;
- Using the estimates θ̂ = (θ̂1, θ̂2, θ̂3) and the true values θ = (θ1, θ2, θ3) = (ϕ, μ, α), compute the bias, Bias(θ̂_j) = (1/N) Σ_{k=1}^{N} (θ̂_{j,k} − θ_j), and the mean square error (MSE), MSE(θ̂_j) = (1/N) Σ_{k=1}^{N} (θ̂_{j,k} − θ_j)², where θ̂_{j,k} denotes the estimate of θ_j obtained from sample k, for k = 1, 2, …, N and j = 1, 2, 3.
With this approach, the most efficient estimation method is the one whose bias and MSE are closest to zero. The simulations were conducted using the R software [41]. For numerical optimization, we used the SANN algorithm described in Section 2.1. Finally, the chosen values of the simulation parameters were: N = 20,000, n ∈ {20, 30, …, 300} and θ ∈ {(0.5, 0.5, 3), (0.4, 1.5, 4)}. It is important to point out that the results of this simulation study were similar for other choices of θ. Since in real applications it is difficult to obtain good initial values, we assumed that the initial values are random, generated from a uniform distribution on the interval (0, 4). Therefore, we also expect to obtain good estimates regardless of the initial values.
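A scaled-down sketch of this Monte Carlo procedure is shown below for the ML criterion; the GG sampler follows from the cdf (2) (if U ~ Gamma(ϕ, 1), then T = U^{1/α}/μ ~ GG(ϕ, μ, α)), and the reduced values of N, n and the SANN budget are only meant to keep the example fast.

```r
set.seed(2019)
rgg <- function(n, phi, mu, alpha) rgamma(n, shape = phi)^(1 / alpha) / mu     # GG sampler via Eq (2)

neg_loglik <- function(par, t) {                                               # negative GG log-likelihood
  phi <- exp(par[1]); mu <- exp(par[2]); alpha <- exp(par[3])
  -sum(log(alpha) - lgamma(phi) + alpha * phi * log(mu) +
         (alpha * phi - 1) * log(t) - (mu * t)^alpha)
}
fit_gg <- function(t)                                                          # SANN fit, random start
  exp(optim(log(runif(3, 0.1, 4)), neg_loglik, t = t,
            method = "SANN", control = list(maxit = 5000))$par)

theta <- c(phi = 0.5, mu = 0.5, alpha = 3)
N <- 200; n <- 50                                                              # scaled-down settings
est  <- t(replicate(N, fit_gg(rgg(n, theta[1], theta[2], theta[3]))))
bias <- colMeans(sweep(est, 2, theta))                                         # average (estimate - true)
mse  <- colMeans(sweep(est, 2, theta)^2)                                       # average squared error
rbind(bias = bias, mse = mse)
```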
The estimation procedures needed to be performed under the same conditions to make the comparison meaningful. However, for some particular samples and estimation methods, the numerical techniques failed to find the parameter estimates. Thus, we present the observed proportion of failures/errors of each method in Tables 1 and 2.
As can be seen in these tables, there are high proportions of failures in the optimization process for the MM, OLS, and WLS methods, even for moderate sample sizes. Therefore, such estimation procedures should be avoided when estimating the parameters of the GG distribution. To continue the simulation study without biasing the comparison, these methods were removed. Hereafter, we consider the PML, ML, MPS, AD, and RAD estimators. Figs 2 and 3 present the bias and MSE of the estimates of ϕ, μ and α.
Fig 2. Bias and MSE of the estimates of ϕ = 0.5, μ = 0.5 and α = 3, for N = 20,000 simulated samples of size n, using the following methods: 1-ML, 2-PML, 3-MPS, 4-AD, 5-RAD. The horizontal lines correspond to bias and MSE equal to zero. See text for explanations, definitions and notation.
Fig 3. Bias and MSE of the estimates of ϕ = 0.4, μ = 1.5 and α = 4, for N = 20,000 simulated samples of size n, using the following methods: 1-ML, 2-PML, 3-MPS, 4-AD, 5-RAD. The horizontal lines correspond to bias and MSE equal to zero. See text for explanations, definitions and notation.
In Figs 2 and 3, we observe that both the bias and the MSE of all estimators tend to zero as n increases, i.e., the estimators are asymptotically unbiased and consistent for the parameters. The PML method returned improved estimates for the GG distribution when compared with the ML method. Moreover, the SANN algorithm allowed us to successfully find the estimates regardless of the initial values used to start the optimization process. In this case, under the PML method, all the generated samples returned satisfactory estimates, even for small sample sizes. Therefore, combining all simulation results with the useful properties of the PML estimators, such as asymptotic efficiency, normality, consistency, and invariance, we conclude that the PML estimators should be used for estimating the parameters of the GG distribution.
5 Applications
We applied the proposed statistical methods to a data set of male patients admitted to the Emergency Department of the Ribeirão Preto Medical School, University of São Paulo, Brazil, diagnosed with TBI due to car accidents (excluding patients who were less than 18 years old or who had other neurological conditions). Only patients who were admitted and discharged alive were considered in this study; thus, we did not consider censored data. The length of stay in hospital was taken as the lifetime variable in the survival analysis. Our main aim here was to assess the expected remaining time that a patient stays in hospital, given that some time has already passed. For example, if a patient has been in hospital for ten days, how much longer do we expect him/her to take to be discharged? This problem is discussed in this section.
The TBI data set was first studied by Tavares [42]. The data acquisition period was from May 2004 to June 2005, and the male patients were analyzed using three different lifetime variables: the total amount of time (in days) spent in hospital (hereafter, data set D1); the amount of time spent in the Neurology Inpatient Department (data set D2); and the amount of time spent under Mechanical Ventilation (data set D3).
We compared the results obtained using the GG distribution with the corresponding ones achieved using other three-parameter lifetime distributions: the Generalized Weibull (GW) distribution (Mudholkar et al. [43]), whose pdf (49) has two shape parameters (including α > 0) and a scale parameter σ > 0; the Exponentiated Weibull (EW) distribution (Mudholkar et al. [44]), whose pdf (50) has shape parameters θ > 0 and α > 0 and scale parameter σ > 0; the Marshall-Olkin Weibull (MOW) distribution (Marshall and Olkin [45]), whose pdf (51) has shape parameters α > 0 and β > 0 and scale parameter λ > 0; and, finally, the Extended Poisson-Weibull (EPW) distribution (Ramos et al. [46]), whose pdf (52) has two shape parameters (including α > 0) and a scale parameter β > 0.
The goodness of fit of the models was checked using the Kolmogorov-Smirnov (KS) test, which is based on the KS statistic Dn = sup_t |Fn(t) − F(t|θ)|, where sup_t is the supremum of the set of distances, Fn(t) is the empirical cdf, and F(t|θ) is the cdf of the reference distribution. The KS hypothesis test was conducted at the 5% significance level to reveal whether or not the data came from F(t|θ). In this case, the null hypothesis (i.e., the data came from F(t|θ)) is rejected if the returned p-value is smaller than 0.05.
To carry out the model selection, the following discrimination criteria were adopted: the AIC (Akaike Information Criterion) (Akaike [47]) and the AICc (Corrected Akaike Information Criterion) (Sugiura [48]; Hurvich and Tsai [49]), computed as AIC = −2 ℓ(θ̂; t) + 2c and AICc = AIC + 2c(c + 1)/(n − c − 1), where c is the number of model parameters and θ̂ is the estimate of θ. Given a set of candidate models for the data at hand, the best-fitting model is the one that presents the minimum values of these criteria. Furthermore, in order to distinguish between two almost equally well-fitting models, Burnham and Anderson [50], page 70, give a rough rule of thumb for comparing AICs (as well as AIC variations, including the AICc), based on the AIC differences Δw = AICw − AICmin, where AICw denotes the AIC value of the candidate model w and AICmin is the minimum of the AIC values. Thus, models with Δw < 2 are all plausible, i.e., they have substantial support and should receive consideration when making inferences; models with 4 < Δw < 7 have considerably less support; and, finally, models with Δw > 10 have either essentially no support and might be omitted from further consideration, or at least fail to explain some substantial explainable variation in the data.
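The sketch below shows how these quantities can be obtained for a fitted GG model in R; the data are illustrative, and ks.test() is called with the fitted cdf (in practice, using estimated parameters in the KS test makes the resulting p-value approximate).

```r
set.seed(2019)
t_obs <- rgamma(20, shape = 2, rate = 1.5)                   # illustrative small sample

neg_loglik <- function(par, t) {                             # negative GG log-likelihood (log scale)
  phi <- exp(par[1]); mu <- exp(par[2]); alpha <- exp(par[3])
  -sum(log(alpha) - lgamma(phi) + alpha * phi * log(mu) +
         (alpha * phi - 1) * log(t) - (mu * t)^alpha)
}
fit <- optim(log(c(1, 1, 1)), neg_loglik, t = t_obs, method = "SANN")
est <- exp(fit$par)                                          # (phi, mu, alpha)
c_np <- 3; n <- length(t_obs)

aic  <- 2 * fit$value + 2 * c_np                             # AIC  = -2 log L + 2c
aicc <- aic + 2 * c_np * (c_np + 1) / (n - c_np - 1)         # AICc = AIC + 2c(c + 1)/(n - c - 1)
ks   <- ks.test(t_obs, function(q) pgamma((est[2] * q)^est[3], shape = est[1]))
c(AIC = aic, AICc = aicc, KS_p_value = ks$p.value)
```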
Table 3 shows some summary statistics for these lifetime variables/data sets. According to all statistics, patients spent less time (in days) under Mechanical Ventilation than in the Neurology Inpatient Department.
SD = Standard deviation, Min = Minimum, Max = Maximum.
As can be seen in Table 3, we have a small number of observations in each data set, which constitutes a situation where the ML method may return highly biased estimates. However, such a problem can be easily overcome by considering the PML estimators, as shown in Section 4. In TBI studies specifically, patient follow-up is time-consuming and requires dedication in order to acquire the data; thus, small data sets are recurrent and suitable solutions should be provided. Consequently, the candidate models were fitted and, according to the model selection criteria, the best-fitting one was indicated.
Table 4 provides the AIC and AICc values, as well as the p-values obtained from the KS test, for all five distributions (GG, GW, EW, MOW, and EPW) fitted using the PML estimation method. Observe that, for all three data sets, both criteria provide empirical evidence in favor of the GG distribution. However, the difference between the AIC (AICc) value of the EW model and that of the GG model (the "best" model according to both criteria) is less than two; therefore, we can also consider the EW model as a plausible one for describing all the data sets. Fig 4 presents the fitted survival functions superimposed on the empirical survival function, from which it can be observed that the GG distribution gives a good fit to all data sets.
Fig 4. Survival functions superimposed on the empirical survival function, considering (A) D1, (B) D2 and (C) D3, related to patients' TBI caused by traffic accidents.
The obtained results provide elements for the development of a decision-making system that uses the fitted probabilistic model as evidence to help the expert infer patients' risks. In addition to the point estimates of the parameters, we computed confidence intervals, as well as the mean residual lifetime under the fitted GG model.
To construct such confidence intervals, one could use the asymptotic properties of the PML estimators. However, for the considered data sets, we have sample sizes smaller than 20, and Prentice [24] showed that the approximate normal distribution for ϕ, based on ML theory, may not be achieved even for sample sizes equal to or larger than 400. Therefore, we considered a bootstrap approach to build such intervals (see DiCiccio and Efron [25]). It is essential to point out that the bootstrap intervals rely on repeatedly resampling small samples and re-estimating the related parameters; therefore, the proposed approach, which does not fail to find such estimates, also plays an important role in computing the intervals. The PML estimates and the 95% bootstrap confidence intervals (CI) for the parameters ϕ, μ and α of the GG distribution, for all three data sets, are given in Table 5.
Following the interpretation of Table 5, data set D1 showed a higher estimate for the parameter ϕ than the others, whereas the estimates of μ and α behave in the opposite way. It is worth mentioning that it is hard to give a direct biological interpretation to the parameters, since they simultaneously influence the moments (e.g., the mean and variance) of the distribution. Moreover, the obtained bootstrap confidence intervals provided accurate evidence even considering the small sample sizes.
With the proposed methodology, the PML estimates were obtained with a satisfactory goodness of fit. With the fitted parameters, one can address the problem of the expected time remaining until a patient is discharged. In order to do so, we consider the mean residual lifetime of the GG distribution, r(t) = E[T − t | T > t], which is given by

r(t | ϕ, μ, α) = Γ(ϕ + 1/α, (μt)^α) / [ μ Γ(ϕ, (μt)^α) ] − t.   (53)
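A sketch of how r(t) can be evaluated is given below; it uses the equivalent representation r(t) = ∫_t^∞ S(x) dx / S(t) and numerical integration of the survival function (3), with illustrative parameter values rather than the fitted ones.

```r
# GG survival function, Eq (3).
S_gg <- function(x, phi, mu, alpha) pgamma((mu * x)^alpha, shape = phi, lower.tail = FALSE)

# Mean residual lifetime r(t) = E[T - t | T > t] = integral of S over (t, Inf) divided by S(t).
mrl <- function(t, phi, mu, alpha)
  sapply(t, function(ti)
    integrate(S_gg, lower = ti, upper = Inf, phi = phi, mu = mu, alpha = alpha)$value /
      S_gg(ti, phi, mu, alpha))

mrl(c(5, 10, 20), phi = 0.5, mu = 0.1, alpha = 1.2)          # expected additional days, illustrative
```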
Fig 5 shows the mean residual lifetime for the three data sets related to patients' TBI caused by traffic accidents. The plotted curves give the conditional expectation r(t) as a function of the time t the patient has already spent in hospital.
Fig 5. Estimated mean residual lifetime for the three data sets related to patients' TBI caused by traffic accidents.
From this graph, we can see the different expected remaining times given how long the patient has already been in hospital. For instance, considering the patients from data set D1, given that a patient has been in hospital for ten days, we expect him/her to be discharged after seventeen more days. On the other hand, if the patient is from data set D3 and has been hospitalized for ten days, then we expect him/her to leave hospital after four more days.
6 Conclusions
In this paper, we derived and compared, through an extensive simulation study, different frequentist estimation methods for the parameters of the GG distribution. From our simulations, we observed that the OLS, WLS, and MM methods failed to find the parameter estimates for a significant number of samples. On the other hand, considering the SANN algorithm with the PML estimation method, we were able to find the solutions (i.e., the parameter estimates) for all samples, regardless of the initial values used to initiate the iterative procedure. Moreover, the PML method provided better estimates for all three parameters regardless of the sample size. Thus, the PML method is the most efficient estimation procedure among the ones considered in this study and should be used for all practical purposes.
Finally, our proposed methodology was illustrated with three real data sets related to patients' traumatic brain injury caused by traffic accidents, demonstrating that the GG distribution is a simple alternative for such applications under different occurrence rates and risks, even in the presence of small samples.
Supporting information
S1 File. Data are available by downloading this file.
https://doi.org/10.1371/journal.pone.0221332.s001
(XLS)
Acknowledgments
We are grateful to three anonymous referees and the associate editor for very useful comments and suggestions, which greatly improved this paper.
References
- 1. Louzada F, Ramos PL. Efficient closed-form maximum a posteriori estimators for the gamma distribution. Journal of Statistical Computation and Simulation. 2018;88(6):1134–1146.
- 2. Stacy EW. A generalization of the gamma distribution. The Annals of Mathematical Statistics. 1962;33(3):1187–1192.
- 3. Cox C, Chu H, Schneider MF, Muñoz A. Parametric survival analysis and taxonomy of hazard functions for the generalized gamma distribution. Statistics in Medicine. 2007;26(23):4352–4374. pmid:17342754
- 4. Stacy EW, Mihram GA. Parameter estimation for a generalized gamma distribution. Technometrics. 1965;7(3):349–358.
- 5. Hager HW, Bain LJ. Inferential procedures for the generalized gamma distribution. Journal of the American Statistical Association. 1970;65(332):1601–1609.
- 6. DiCiccio T. Approximate inference for the generalized gamma distribution. Technometrics. 1987;29(1):33–40.
- 7. Huang PH, Hwang TY. On new moment estimation of parameters of the generalized gamma distribution using its characterization. Taiwanese Journal of Mathematics. 2006;10(4):1083–1093.
- 8. Khodabin M, Ahmadabadi A. Some properties of generalized gamma distribution. Mathematical Sciences. 2010;4(1):9–28.
- 9. Noufaily A, Jones M. On maximization of the likelihood for the generalized gamma distribution. Computational Statistics. 2013;28(2):505–517.
- 10. Mock C, Lormand J, Goosen J, Joshipura M, Peden M. Guidelines for essential trauma care. Geneva: World Health Organization; 2004.
- 11. Bonow RH, Barber J, Temkin NR, Videtta W, Rondina C, Petroni G, et al. The outcome of severe traumatic brain injury in Latin America. World Neurosurgery. 2017. pmid:29229352
- 12. Owens PW, Lynch NP, O’Leary DP, Lowery AJ, Kerin MJ. Six-year review of traumatic brain injury in a regional trauma unit: demographics, contributing factors and service provision in Ireland. Brain Injury. 2018;32(7):900–906. pmid:29683734
- 13. Weber KT, Guimarães VA, Pontes Neto OM, Leite JP, Takayanagui OM, Santos-Pontelli TE. Predictors of quality of life after moderate to severe traumatic brain injury. Arquivos de Neuro-Psiquiatria. 2016;74(5):409–415. pmid:27191238
- 14. Chen F, Chen S. Injury severities of truck drivers in single-and multi-vehicle accidents on rural highways. Accident Analysis & Prevention. 2011;43(5):1677–1688.
- 15. Kim JK, Ulfarsson GF, Kim S, Shankar VN. Driver-injury severity in single-vehicle crashes in California: a mixed logit analysis of heterogeneity due to age and gender. Accident Analysis & Prevention. 2013;50:1073–1081.
- 16. Chen F, Chen S, Ma X. Analysis of hourly crash likelihood using unbalanced panel data mixed logit model and real-time driving environmental big data. Journal of safety research. 2018;65:153–159. pmid:29776524
- 17. Maas AI, Stocchetti N, Bullock R. Moderate and severe traumatic brain injury in adults. The Lancet Neurology. 2008;7(8):728–741. pmid:18635021
- 18. Louzada F, Ramos PL, Perdoná GS. Different estimation procedures for the parameters of the extended exponential geometric distribution for medical data. Computational and mathematical methods in medicine. 2016;2016. pmid:27579052
- 19. Teimouri M, Hoseini SM, Nadarajah S. Comparison of estimation methods for the Weibull distribution. Statistics. 2013;47(1):93–109.
- 20. Bagheri S, Alizadeh M, Baloui Jamkhaneh E, Nadarajah S. Evaluation and comparison of estimations in the generalized exponential-Poisson distribution. Journal of Statistical Computation and Simulation. 2014;84(11):2345–2360.
- 21. Louzada F, Ramos PL, Ferreira PH. Exponential-Poisson distribution: estimation and applications to rainfall and aircraft data with zero occurrence. Communications in Statistics-Simulation and Computation. 2018; p. 1–20.
- 22. Rodrigues GC, Louzada F, Ramos PL. Poisson–exponential distribution: different methods of estimation. Journal of Applied Statistics. 2018;45(1):128–144.
- 23. Kirkpatrick S, Gelatt CD, Vecchi MP. Optimization by simulated annealing. Science. 1983;220(4598):671–680. pmid:17813860
- 24. Prentice RL. A log gamma model and its maximum likelihood estimation. Biometrika. 1974;61(3):539–544.
- 25. DiCiccio TJ, Efron B. Bootstrap confidence intervals. Statistical Science. 1996;11(3):189–228.
- 26. Glaser RE. Bathtub and related failure rate characterizations. Journal of the American Statistical Association. 1980;75(371):667–672.
- 27. Metropolis N, Rosenbluth AW, Rosenbluth MN, Teller AH, Teller E. Equation of state calculations by fast computing machines. Journal of Chemical Physics. 1953;21(6):1087–1092.
- 28. Salter LA, Pearl DK. Stochastic search strategy for estimation of maximum likelihood phylogenetic trees. Systematic Biology. 2001;50(1):7–17. pmid:12116596
- 29. Aarts E, Korst J. Simulated annealing and Boltzmann machines. Wiley & Sons; 1989.
- 30. Haario H, Saksman E. Simulated annealing process in general state space. Advances in Applied Probability. 1991;23(4):866–893.
- 31. Torres-Jimenez J, Rodriguez-Tello E. New bounds for binary covering arrays using simulated annealing. Information Sciences. 2012;185(1):137–152.
- 32. Torres-Jimenez J, Izquierdo-Marquez I, Garcia-Robledo A, Gonzalez-Gomez A, Bernal J, Kacker RN. A dual representation simulated annealing algorithm for the bandwidth minimization problem on graphs. Information Sciences. 2015;303(10):33–49.
- 33. Izquierdo-Marquez I, Torres-Jimenez J, Acevedo-Juárez B, Avila-George H. A greedy-metaheuristic 3-stage approach to construct covering arrays. Information Sciences. 2018;460–461:172–189.
- 34. Firth D. Bias reduction of maximum likelihood estimates. Biometrika. 1993;80(1):27–38.
- 35. Jeffreys H. The theory of probability. OUP Oxford; 1998.
- 36. Wingo DR. Computing maximum-likelihood parameter estimates of the generalized gamma distribution by numerical root isolation. IEEE Transactions on Reliability. 1987;36(5):586–590.
- 37. Lehmann EL, Casella G. Theory of point estimation. Springer Science & Business Media; 2006.
- 38. Cheng R, Amin N. Maximum product of spacings estimation with application to the lognormal distribution. Mathematical Report 79-1; 1979.
- 39. Cheng R, Amin N. Estimating parameters in continuous univariate distributions with a shifted origin. Journal of the Royal Statistical Society Series B (Methodological). 1983;45(3):394–403.
- 40. Ranneby B. The maximum spacing method. An estimation method related to the maximum likelihood method. Scandinavian Journal of Statistics. 1984;11(2):93–112.
- 41. R Core Team. R: A Language and Environment for Statistical Computing; 2018. Available from: https://www.R-project.org/.
- 42. Tavares K. Avaliação da qualidade de vida em pacientes com traumatismo crânio encefálico: correlações funcionais, sociais e cognitivas na fase hospitalar e tardia. University of São Paulo; 2006.
- 43. Mudholkar GS, Srivastava DK, Kollia GD. A generalization of the Weibull distribution with application to the analysis of survival data. Journal of the American Statistical Association. 1996;91(436):1575–1583.
- 44. Mudholkar GS, Srivastava DK, Freimer M. The exponentiated Weibull family: A reanalysis of the bus-motor-failure data. Technometrics. 1995;37(4):436–445.
- 45. Marshall AW, Olkin I. A new method for adding a parameter to a family of distributions with application to the exponential and Weibull families. Biometrika. 1997;84(3):641–652.
- 46. Ramos PL, Dey DK, Louzada F, Lachos VH. An extended poisson family of life distribution: A unified approach in competitive and complementary risks. Journal of Applied Statistics. 2019; p. 1–18.
- 47. Akaike H. On entropy maximization principle. Applications of Statistics. 1977; p. 27–41.
- 48. Sugiura N. Further analysts of the data by akaike’s information criterion and the finite corrections: Further analysts of the data by akaike’s. Communications in Statistics-Theory and Methods. 1978;7(1):13–26.
- 49. Hurvich CM, Tsai CL. Regression and time series model selection in small samples. Biometrika. 1989;76(2):297–307.
- 50. Burnham KP, Anderson DR. Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach. 2nd ed. New York: Springer-Verlag; 2002.