Abstract
We suggest a novel method for detecting mortality deceleration by adding a penalty to the log-likelihood function in a gamma-Gompertz setting. This is an alternative to traditional likelihood inference and hypothesis testing. The main advantage of the proposed method is that it does not rely on p-values, hypothesis testing, or asymptotic distributions. We evaluate the performance of our approach by comparing it with traditional likelihood inference on both simulated and real mortality data. The results show that our method is more accurate in detecting mortality deceleration and provides more reliable estimates of the underlying parameters. The proposed method is a significant contribution to the literature as it offers a powerful tool for analyzing mortality patterns.
Citation: C. Patricio S, Missov TI (2023) Using a penalized likelihood to detect mortality deceleration. PLoS ONE 18(11): e0294428. https://doi.org/10.1371/journal.pone.0294428
Editor: Mohamed R. Abonazel, Cairo University, EGYPT
Received: May 11, 2023; Accepted: November 1, 2023; Published: November 16, 2023
Copyright: © 2023 Patricio, Missov. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: The data used is from the Human Mortality Database (HMD), and it can be accessed at: https://www.mortality.org.
Funding: The research leading to this publication is a part of a project that has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (Grant agreement No. 884328 – Unequal Lifespans). Silvio C. Patricio gratefully acknowledges the support provided from AXA Research Fund, through the funding for the “AXA Chair in Longevity Research”. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
1 Introduction
Human death-rate patterns are astoundingly log-linear over a wide range of adult ages. The Gompertz distribution [1] with an exponentially increasing hazard function captures this accurately. The theory of unobserved heterogeneity and the associated frailty model [2] predicts a downward deviation at the oldest ages, to which only the most robust individuals in the population survive. Detecting such a deceleration in real data is not always successful [3, 4], even though the vast majority of studies indicate that death rates at older ages increase at lower rates and can even level off [5–15]. In a frailty model setting, testing for mortality deceleration is equivalent to testing whether the non-negative frailty parameter is strictly positive.
Formally, denote by X a non-negative continuous random variable that describes individual human lifespans (complete or after a given adult age). If X has a Gompertz distribution with parameters a and b, where a is the mortality level at the initial age and b is the rate of aging, the associated hazard function (force of mortality) at time x is μ(x) = ae^{bx}. The Gompertz hazard [1] captures adequately the log-linear acceleration observed in death rates in the adult age range [16, 17]. The use of the Gompertz model is also justified by extreme value and evolutionary theories [18, 19]. It is, therefore, appropriate to assume that the general schedule for the individual risk of dying follows a Gompertz law.
Deviations from the log-linear mortality pattern, especially at the oldest ages, can be attributed to the intrinsic diversity among individuals in the study population. While standard survival models capture the variability due to measurable risk factors, it is their extension, the frailty models, that also incorporate the effect of unobserved heterogeneity. The latter can reflect, among others, genetic predispositions to certain diseases [20], mental health, or the general quality of life. The unobserved heterogeneity among individuals affects their endowment for longevity [21, 22] and, to take this into account, standard frailty models introduce a random effect called frailty. This random effect, considered in general to be non-negative, is incorporated into the individual hazard as a multiplicative factor that reflects one’s unobserved susceptibility to death [2].
The force of mortality for an individual with frailty Z = z is
$$\mu(x \mid Z = z) = z\,\mu(x) = z\,a e^{bx}.$$
Frail individuals have high values of z and, therefore, tend to die first. The connection between frailty and baseline mortality, operating in a multiplicative manner, is justified by the observed death-rate patterns of elderly humans [10, 12, 18, 23].
The estimation of frailty models requires specifying a distribution for the random and unobserved frailty Z and studying the resulting marginal distribution [2, 18]. The main property of frailty’s distribution is the regular variation at zero of its density [18, 23]. Distributions that meet this condition are the gamma, beta, truncated normal, log-logistic, and even Weibull distribution [23].
Among all admissible distributions, we opt for the gamma distribution as it has a well-defined mathematical structure that facilitates both analytical and numerical computations, and it is also flexible and capable of capturing a wide range of shapes for the frailty component [2, 18]. For a gamma-distributed frailty Z with E(Z) = 1 and Var(Z) = σ², the force of mortality of the population, i.e., the marginal hazard, is
$$\bar{\mu}(x) = \frac{a e^{bx}}{1 + \sigma^2 \frac{a}{b}\left(e^{bx} - 1\right)} \qquad (1)$$
(see [2, 24] for all technicalities). Note that the variance of Z is often denoted by γ because it is also equal to the squared coefficient of variation of the distribution of frailty among survivors to any age x. If σ² > 0, the force of mortality for the population starts deviating from the exponential pattern with increasing x and reaches an asymptote b/σ². When σ² = 0, i.e., when there is no unobserved heterogeneity, the model for the population reduces to the (Gompertz) model for individuals with an exponentially increasing hazard function μ(x) = ae^{bx}.
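The behaviour of the marginal hazard described above can be checked numerically. The following is a minimal sketch, assuming the standard gamma-Gompertz form of Eq 1 with unit-mean gamma frailty; the function names and the illustrative parameter values are not taken from the paper.

```python
import math

def gompertz_hazard(x, a, b):
    """Individual (baseline) Gompertz hazard a * exp(b * x)."""
    return a * math.exp(b * x)

def gamma_gompertz_hazard(x, a, b, sigma2):
    """Population hazard under gamma frailty with mean 1 and variance sigma2.

    Assumed form: a e^{bx} / (1 + sigma2 * (a / b) * (e^{bx} - 1)).
    With sigma2 = 0 the denominator is 1 and the Gompertz hazard is recovered.
    """
    baseline = a * math.exp(b * x)
    return baseline / (1.0 + sigma2 * (a / b) * math.expm1(b * x))

# Illustrative values only (hypothetical, chosen for readability).
a, b, sigma2 = 0.0001, 0.1, 0.1
for age in (0, 40, 80, 120, 200):
    print(age, gompertz_hazard(age, a, b), gamma_gompertz_hazard(age, a, b, sigma2))
print("asymptote b / sigma2 =", b / sigma2)
```

At very high x the population hazard flattens towards b/σ², whereas the Gompertz hazard keeps growing exponentially.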
Testing for mortality deceleration in this setting reduces to statistical testing whether σ² = 0 given the alternative σ² > 0. The frailty parameter σ² can take a value on the boundary of the parameter space (σ² = 0). This violates the standard underlying assumptions about the asymptotic properties of likelihood-based inference and statistical hypothesis testing [25]. As a result, the asymptotic distribution of the maximum likelihood estimator may not be Gaussian.
In this paper, we treat the problem of identifying whether σ² > 0 or σ² = 0 as a model misspecification problem, i.e., we consider the gamma-Gompertz model when it is the Gompertz model that actually holds. In this setting, we suggest adding a penalty to the log-likelihood function. This penalty will be responsible for shrinking σ² to zero when there is no heterogeneity, as well as for adding a small bias to the Maximum Likelihood Estimator (MLE) when the effect of unobserved heterogeneity is non-negligible. We carry out Monte Carlo simulation experiments to evaluate the accuracy and precision of the estimates obtained by maximizing the likelihood function, on the one hand, and the penalized likelihood function, on the other.
In Section 2, we formulate the model misspecification problem and introduce inference methodology taking advantage of the maximum a posteriori probability (MAP). Then we carry out a Monte Carlo simulation study to compare the performance of maximizing a standard and a penalized likelihood. In Section 3, we compare the latter on mortality data for France, Japan and the USA. Section 4 discusses the advantages and drawbacks of applying our method to detect heterogeneity (deceleration) in mortality patterns.
2 Methodology
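Suppose X_1, …, X_n is a random sample with a cumulative distribution function G(x), and we fit the incorrect family of densities {f(x; θ): θ ∈ Θ} to the data using MLE. The misspecified log-likelihood is
$$\ell(\theta) = \sum_{i=1}^{n} \log f(X_i; \theta).$$
Applying the law of large numbers, we get in the limit what the misspecified log-likelihood function looks like for each θ ∈ Θ (see the right-hand side below):
$$\frac{1}{n}\,\ell(\theta) \longrightarrow \int \log f(x; \theta)\, dG(x) \quad \text{as } n \to \infty. \qquad (2)$$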
Assume there is no heterogeneity in the data (σ² = 0), and we fit a gamma-Gompertz model. In other words, we observe an exponential death-rate increase in the data, but we estimate a model that implies a downward deviation from the exponential at the oldest ages. As shown in Eq 2, we will estimate σ² close to but never equal to zero.
In this model setting, the standard technique is to estimate both the Gompertz and the gamma-Gompertz models and compare their goodness of fit. However, minor changes in the data can result in different models being selected, which can reduce prediction accuracy and lead to misinterpretations about the mortality deceleration and the mortality plateau. Böhnstedt and Gampe [25] derive the asymptotic distribution of the likelihood ratio test statistic to detect heterogeneity. Here, we suggest an alternative that does not involve hypothesis testing. The use of hypothesis testing has been widely discussed and rethought in the Statistics community [26–29], especially concerning the arbitrary choice of the α-level (most often 0.1, 0.05, or 0.01) and sample size issues.
Maximum likelihood estimators, obtained by maximizing the log-likelihood function, often have low bias and large variance. Estimation accuracy can sometimes be improved by shrinking some parameters to zero [30]. The associated shrinkage estimator improves the overall accuracy, promotes parsimony and makes the parameter estimates more stable by reducing their sensitivity to minor changes in the data, at the expense of introducing a small bias to reduce the variance of the parameters. This class of estimators is implicit in Bayesian inference and penalized likelihood inference. Shrinkage estimators can serve as an alternative to hypothesis testing. Lasso, Ridge, and Stein-type estimators are the most widely used examples of penalizing methods [31].
2.1 Inference
Let D_x be the number of deaths in a given age interval [x, x + 1) for x = 0, …, m, and E_x denote the number of person-years lived in the same interval [32, 33]. Define D = (D_0, …, D_m)⊤ and E = (E_0, …, E_m)⊤. In addition, let θ = (a, b, σ²)⊤ ∈ Θ be the parameter vector that characterizes the force of mortality at age x of the gamma-Gompertz model given by Eq 1. Finally, we assume that the number of deaths and the number of person-years exposed to the risk of dying can be observed.
Assume D_x are Poisson-distributed with mean E_x μ̄(x; θ) for x = 0, …, m [32]. Under this assumption, the log-likelihood function for θ = (a, b, σ²)⊤ is given, up to an additive constant that does not depend on θ, by
$$\ell(\theta) = \sum_{x=0}^{m} \left[ D_x \log \bar{\mu}(x; \theta) - E_x\, \bar{\mu}(x; \theta) \right]. \qquad (3)$$
Maximizing ℓ(θ) with respect to θ = (a, b, σ²)⊤ yields the maximum-likelihood (ML) estimate θ̂. Suppose the data come from a Gompertz distribution, and we estimate a gamma-Gompertz model, i.e., the true value of σ² is 0. Then, for each set of fixed model parameters (a, b), we can use the limit on the right-hand side of (2) to calculate the expected log-likelihood as a function of σ².
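A Poisson log-likelihood of this kind is straightforward to evaluate from death counts and exposures. The sketch below is only an illustration under stated assumptions: it uses the kernel D_x log μ̄(x) − E_x μ̄(x) (constants dropped), evaluates the hazard at the start of each age interval (a mid-interval evaluation would be an equally reasonable choice), and uses made-up toy data in which the "observed" deaths equal their expected values.

```python
import math

def gamma_gompertz_hazard(x, a, b, sigma2):
    """Assumed gamma-Gompertz marginal hazard (Gompertz when sigma2 = 0)."""
    return a * math.exp(b * x) / (1.0 + sigma2 * (a / b) * math.expm1(b * x))

def poisson_loglik(theta, deaths, exposures):
    """Poisson log-likelihood kernel: sum_x D_x * log(mu_x) - E_x * mu_x."""
    a, b, sigma2 = theta
    total = 0.0
    for x, (d, e) in enumerate(zip(deaths, exposures)):
        mu = gamma_gompertz_hazard(x, a, b, sigma2)
        total += d * math.log(mu) - e * mu
    return total

# Toy data (hypothetical): 51 ages, constant exposure, deaths set to their mean.
true_theta = (0.0001, 0.1, 0.1)
exposures = [1000.0] * 51
deaths = [e * gamma_gompertz_hazard(x, *true_theta) for x, e in enumerate(exposures)]

at_truth = poisson_loglik(true_theta, deaths, exposures)
perturbed = poisson_loglik((0.0001, 0.12, 0.1), deaths, exposures)
print(at_truth, perturbed)
```

Because each term of the kernel is maximized when the hazard equals D_x/E_x, the log-likelihood at the generating parameters exceeds the value at the perturbed ones in this noise-free toy example. In practice the function would be passed to a numerical optimizer.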
Fig 1 shows the expected log-likelihood function for four pairs of fixed values for a and b. We can see that the likelihood (dashed line) might not be “concave enough,” especially when the true value of σ² is close to 0, to allow direct optimization. Indeed, the function is almost flat when σ² ≤ 0.005, which means that these values are (almost) equally likely. In such cases, using a penalty function, also known as a regularization term or a prior distribution, can be instrumental in increasing concavity at the expense of introducing some constraints or biases into the estimation process [31, 34].
Let us now define a penalized log-likelihood function as
$$\ell_p(\theta) = \ell(\theta) + p(\sigma^2), \qquad (4)$$
where ℓ(θ) is the standard log-likelihood (Eq 3), while p(σ²) is a penalty function. The penalized maximum-likelihood estimate is obtained by maximizing ℓ_p(θ) with respect to θ = (a, b, σ²)⊤. If the effect of unobserved heterogeneity is negligible, i.e., when there is no mortality deceleration, we aim to estimate σ² equal to 0. For that, the penalty p(σ²) must be a non-increasing monotonic continuous function and p(0) ≥ p(σ²) for all σ² > 0, i.e., the penalty function reaches its maximum for σ² = 0. The last condition ensures that when there is no unobserved heterogeneity, maximizing (4) yields a frailty parameter exactly equal to 0. An example is shown in Figs 1 and 2.
In the first row, we used synthetic data from a gamma-Gompertz model with parameters a = 0.0001, b = 0.1 and σ² = 0.1; in the second row, we used synthetic data from a Gompertz model with parameters a = 0.0001 and b = 0.1.
In a Bayesian framework, maximizing (4) is equivalent to maximizing a posterior distribution in a setting in which π(σ²) ∝ exp{p(σ²)}, σ² ≥ 0, is taken as a prior distribution of σ². This procedure yields the maximum a posteriori probability (MAP) estimator, widely used in image and video processing [35–37].
Given that σ² characterizes the variance of frailty, it is standard to assign an inverse gamma prior distribution to it [38]. The inverse gamma distribution, known for its heavy tail, effectively maintains a greater probability mass away from zero than the gamma distribution. Note that the mode of the inverse gamma distribution is consistently positive, whereas the mode of the gamma distribution can potentially be zero [39]. As we aim to test whether σ² = 0 or σ² > 0, we will use the log-kernel of the gamma distribution to define the penalty function as
$$p(\sigma^2) = -\lambda \left( \log \sigma^2 + \sigma^2 \right) \qquad (5)$$
for some non-negative λ. When λ < 1, using (5) is equivalent to specifying a gamma prior distribution for σ² with parameters α = 1 − λ and β = λ (maximized at σ² = 0). When m → ∞, the effect of the penalty diminishes regardless of the size of λ. For human life table data m is finite.
The penalty parameter λ ≥ 0 is a constant that controls the relative impact of the penalty function on the estimates. When λ = 0, the penalty term has no effect, and maximizing the penalized likelihood will produce the standard maximum likelihood estimator (MLE). However, as λ → ∞, the impact of the penalty grows, and the maximum penalized likelihood estimates for σ² will approach zero, providing high precision, but low accuracy.
The choice of λ is a sensitive issue in a wide range of applications [40, 41]. Therefore, we carry out a pilot simulation study, in which we find that choosing λ = 1/2 provides precision similar to that of the MLE when σ² > 0, but better accuracy and precision when σ² = 0 (simulation results are presented in the next subsection). As a result, the final expression for the penalized log-likelihood we propose is
$$\ell_p(\theta) = \ell(\theta) - \frac{1}{2} \left( \log \sigma^2 + \sigma^2 \right). \qquad (6)$$
The expected penalized log-likelihood function for σ² is shown in Fig 1 (solid line). As the penalty function is maximized at σ² = 0 and the log-likelihood function is almost flat for σ² < 0.005, the penalized log-likelihood function has a distinct maximum at σ² = 0, reflecting that zero is the most likely value for σ².
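The effect of the penalty on a nearly flat likelihood can be illustrated with a small numerical sketch. It assumes that the penalty is the log-kernel of a gamma density with shape 1 − λ and rate λ, i.e., −λ(log σ² + σ²), with λ = 1/2; the numerical floor `FLOOR`, the toy data and all function names are hypothetical choices made for this illustration only.

```python
import math

LAMBDA = 0.5    # assumed penalty weight
FLOOR = 1e-15   # hypothetical floor that keeps log(sigma2) finite at the boundary

def hazard(x, a, b, sigma2):
    """Assumed gamma-Gompertz marginal hazard (Gompertz when sigma2 = 0)."""
    return a * math.exp(b * x) / (1.0 + sigma2 * (a / b) * math.expm1(b * x))

def loglik(theta, deaths, exposures):
    """Poisson log-likelihood kernel: sum_x D_x * log(mu_x) - E_x * mu_x."""
    a, b, sigma2 = theta
    total = 0.0
    for x, (d, e) in enumerate(zip(deaths, exposures)):
        mu = hazard(x, a, b, sigma2)
        total += d * math.log(mu) - e * mu
    return total

def penalty(sigma2, lam=LAMBDA):
    """Gamma log-kernel penalty -lam * (log(sigma2) + sigma2), floored near 0."""
    s = max(sigma2, FLOOR)
    return -lam * (math.log(s) + s)

def penalized_loglik(theta, deaths, exposures):
    return loglik(theta, deaths, exposures) + penalty(theta[2])

# Noise-free toy data from a pure Gompertz model (no heterogeneity).
a, b = 0.0001, 0.1
exposures = [1000.0] * 41
deaths = [e * a * math.exp(b * x) for x, e in enumerate(exposures)]

grid = [1e-15, 1e-4, 1e-3, 1e-2, 1e-1]
plain = [loglik((a, b, s), deaths, exposures) for s in grid]
penalized = [penalized_loglik((a, b, s), deaths, exposures) for s in grid]
best = grid[penalized.index(max(penalized))]
for s, l0, l1 in zip(grid, plain, penalized):
    print(s, l0, l1)
print("sigma2 with the largest penalized value:", best)
```

On this grid the unpenalized values barely change for small σ², while the penalized values separate clearly in favour of the smallest σ², which is the shrinkage behaviour described in the text.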
From a Bayesian perspective, choosing λ = 1/2 provides an informative prior distribution for σ². Since for human populations we are likely to estimate σ² < 1 [42], the specified prior gives σ² a distribution with a mode equal to zero, a median equal to 0.4549, and a mean equal to 1. Furthermore, the prior places a probability mass of 0.6826 on the interval (0, 1].
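The quoted summary figures of the prior can be checked with a few lines of code. This sketch assumes λ = 1/2, so that the prior is a gamma distribution with shape 1/2 and rate 1/2 (a chi-square distribution with one degree of freedom), whose distribution function can be written with the error function as F(x) = erf(√(x/2)). The helper names are illustrative.

```python
import math

def prior_cdf(x):
    """CDF of a gamma(shape=1/2, rate=1/2) variable, i.e. chi-square with 1 df."""
    return math.erf(math.sqrt(x / 2.0))

def prior_median(lo=0.0, hi=10.0, tol=1e-12):
    """Median found by bisection on the CDF."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if prior_cdf(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

shape, rate = 0.5, 0.5
mean = shape / rate          # mean of a gamma distribution is shape / rate
median = prior_median()
mass_0_1 = prior_cdf(1.0)    # probability mass on (0, 1]
print(mean, median, mass_0_1)
```

Under these assumptions the mean is 1, the median is close to 0.4549, and the mass on (0, 1] is approximately 0.68, in line with the values given in the text.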
Fig 2 shows the log-likelihood and penalized log-likelihood functions for all parameters when σ² > 0 (first row) and σ² = 0 (second row). When σ² > 0, the penalty function affects neither the shape of the log-likelihood nor the location of its maximum. However, when σ² = 0, adding a penalty yields a higher maximum at 0. Moreover, when σ² = 0, the first and second derivatives of the penalized log-likelihood are higher than their respective counterparts of the log-likelihood. As a result, derivative-based optimization methods may reach the maximum point faster, and the estimator may have a smaller variance.
2.2 Monte Carlo simulations
We carry out Monte Carlo simulations to explore the performance of the MAP and ML methods in estimating the gamma-Gompertz model parameters. We use the R software [43] to maximize the log-likelihood and the penalized log-likelihood functions via the optim function, with differential evolution applied as a preliminary step [44, 45]. The performance of the ML and MAP estimators is evaluated by calculating two measures: the bias and the standard deviation.
We generate random samples of size 10,000 from this model for some parameter values (scenarios with sample sizes of 2,000 and 5,000 were also considered, and are presented in the S1 Appendix). From these samples, we generate life tables and use them to estimate model parameters via the MAP and MLE methods. This process was repeated 2,000 times. In the presence of unobserved heterogeneity, the true parameter values are a1 = 0.0001 and a2 = 0.00001 for a, b1 = 0.1 and b2 = 0.15 for b, and two values for σ². When there is no heterogeneity (σ² = 0), the true parameter values are a1 = 0.0001, a2 = 0.0003 and a3 = 0.0005 for a, and b1 = 0.09, b2 = 0.10 and b3 = 0.11 for b.
The simulation results are presented in Table 1. In the presence of unobserved heterogeneity, both methods underestimate b and σ². They also introduce a small positive bias to a, the one provided by the ML estimator being slightly smaller. However, in general the ML and MAP estimators perform equally well, with a similar bias and standard deviation.
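One way to generate such samples and turn them into life-table inputs is inverse-transform sampling followed by tabulation of deaths and person-years by single year of age. The sketch below is written in Python rather than R and is not the authors' simulation code; the inversion formulas follow from the standard Gompertz and gamma-Gompertz survival functions, and the seed, the value σ² = 0.1, and the function names are illustrative assumptions.

```python
import math
import random

def sample_lifespan(rng, a, b, sigma2):
    """Draw one lifespan by inverting the survival function at a uniform draw."""
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
    if sigma2 == 0.0:
        # Gompertz: S(x) = exp(-(a / b) * (e^{bx} - 1))
        return math.log(1.0 - (b / a) * math.log(u)) / b
    # Gamma-Gompertz: S(x) = (1 + sigma2 * (a / b) * (e^{bx} - 1)) ** (-1 / sigma2)
    return math.log(1.0 + (b / (a * sigma2)) * (u ** (-sigma2) - 1.0)) / b

def life_table(lifespans):
    """Deaths D_x and person-years E_x for the intervals [x, x + 1)."""
    m = int(max(lifespans))
    deaths = [0] * (m + 1)
    exposures = [0.0] * (m + 1)
    for t in lifespans:
        last = int(t)
        for x in range(last):
            exposures[x] += 1.0          # survived the whole interval
        deaths[last] += 1
        exposures[last] += t - last      # fraction lived in the final interval
    return deaths, exposures

rng = random.Random(2023)  # arbitrary seed for reproducibility
n = 10_000
lifespans = [sample_lifespan(rng, 0.0001, 0.1, 0.1) for _ in range(n)]
deaths, exposures = life_table(lifespans)
print(len(deaths), sum(deaths), round(sum(exposures), 2))
```

The resulting `deaths` and `exposures` vectors have the form required by the Poisson likelihood; repeating the procedure many times gives the replicates from which bias and standard deviation can be computed.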
In the absence of unobserved heterogeneity, the ML estimator provides again a smaller bias for a and b than the MAP estimator. However, in this case, the MAP method estimates more precisely the frailty parameter σ², with a bias and a standard deviation close to zero (of the order of 10⁻¹⁵). The MAP estimator also provides a slight reduction in the standard deviation of parameter b. Similar results were found for smaller samples: Tables 3 and 4 in S1 Appendix present the simulation results for sample sizes 2,000 and 5,000, respectively.
From the Monte Carlo simulations we also calculate the proportion of trials in which MAP estimates σ² > 0 when the true value is σ² = 0 (type I error), as well as the proportion of trials in which MAP estimates σ² = 0 when the true value is σ² > 0 (type II error). Based on our simulations, the type I error equals 0.001502, while the type II error is 0.001126.
The Monte Carlo simulations show that using a penalized likelihood function (6) is an alternative to hypothesis testing, the latter being dependent on the asymptotic distribution of the ML estimator, sample size and the arbitrary choice of the α-level [25].
3 Performance of MAP and ML estimators on HMD data
In this section, we estimate the gamma-Gompertz model via ML and MAP using mortality data from the Human Mortality Database [46]. We take exposures and raw death counts for the female population of France, Japan and the USA in the years 1960, 1980, 2000, and 2020, after age 70. We apply again R [43] to compute the ML and MAP estimates of θ = (a, b, σ²)⊤ by using differential evolution. We use the mean squared error (MSE) to assess the goodness of fit.
Table 2 shows the results of applying ML and MAP methods to the datasets described above. The MAP estimator provides lower MSEs in 8 of the 12 datasets. When the standard ML method estimates σ² < 10⁻⁴, our novel method estimates σ² = 0 and provides a smaller MSE. This suggests that the MAP provides a slightly better fit to the data. Overall, MAP performs better than ML when unobserved heterogeneity is not detected, while ML has a slight advantage when it is.
The results from the real-data application back up the results from the Monte Carlo simulations in Section 2. In the presence of unobserved heterogeneity, the MLE method provides the most precise and accurate estimates. The MAP method, though, has just slightly lower precision. On the other hand, in the absence of unobserved heterogeneity, the MAP provides a smaller bias and variance in its estimates compared to MLE.
3.1 Examples when MAP and ML estimators yield different outcomes
Using MAP and ML estimators does not always lead to the same statistical inference. One of them can detect heterogeneity in cases when the other does not. We will illustrate this on HMD data for the Japanese female population in 2009 and the French female population born in 1848, ages 70+. To assess the goodness of fit, we will use MSE again.
For Japanese females in 2009, ML yields estimates with standard errors SE(a) = 0.000188, SE(b) = 0.002263 and SE(σ²) = 0.021156. The 95% confidence interval for σ² is (0.029047, 0.111978), indicating statistically significant unobserved heterogeneity, i.e., the existence of mortality deceleration. On the other hand, the MAP method estimates σ² = 0, indicating the absence of unobserved heterogeneity. Comparing the goodness of fit of both methods speaks in favor of the MAP outcome: MAP’s MSE is 37% lower than ML’s MSE (0.018691 for MAP vs 0.029958 for ML). This indicates that unobserved heterogeneity is negligible and that the gamma-Gompertz model is misspecified.
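The reported interval and standard error for σ² can be cross-checked with simple arithmetic. The sketch below assumes that the interval is a symmetric Wald interval (estimate ± z × SE); this is an assumption made here, and the implied midpoint is derived from the reported bounds rather than being a value quoted in the text.

```python
# Reported values for Japanese females, 2009 (ML fit).
se_sigma2 = 0.021156
lower, upper = 0.029047, 0.111978

# Assumption: symmetric Wald interval, estimate +/- z * SE.
midpoint = (lower + upper) / 2.0               # implied point estimate of sigma2
z_implied = (upper - lower) / (2.0 * se_sigma2)

# Relative MSE reduction of MAP with respect to ML, from the reported MSEs.
mse_map, mse_ml = 0.018691, 0.029958
reduction = 1.0 - mse_map / mse_ml

print(midpoint, z_implied, reduction)
```

The implied multiplier is close to 1.96, which is consistent with a 95% Wald interval, and the MSE ratio gives a reduction of roughly 37 to 38%.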
The left panel of Fig 3 shows that both methods estimate a similar logarithmic force of mortality at most ages. However, after age 100, the MLE deviates downward from the observed logarithmic death rates.
The MAP also provides a better fit and a different conclusion for the cohort of French females born in 1848. While the ML fit, with SE(a) = 0.000317, SE(b) = 0.001273 and SE(σ²) = 0.007562, yields an MSE equal to 0.046222, the MAP fit yields MSE = 0.034226, i.e., MAP’s MSE is 26% smaller than ML’s MSE.
Furthermore, while the MAP estimate of σ² suggests that there is non-negligible unobserved heterogeneity, the ML estimate and standard error for σ² indicate the opposite: the amount of unobserved heterogeneity is not statistically significant. The right panel of Fig 3 shows the difference between these estimates. MAP’s estimate shows a leveling-off in the force of mortality, while the MLE shows a log-linear increase in the hazard function.
3.2 Comparison between MAP and ML estimators for different populations
To evaluate and compare empirically the performance of the ML and MAP methods, we apply them to estimate the force of mortality for the male and female populations of France, Denmark, Sweden, Italy, Japan, Czechia, and the United States of America from 1950 to 2019, 980 populations overall. To assess the goodness of fit, we use the MSE.
Fig 4 presents the method that provides a better fit (which has the smallest MSE). Overall both methods provide similar goodness of fit, with MAP providing a slightly lower MSE (on average 0.5% smaller than the ML method). Over the 980 populations, the MAP provides a better fit for 502 of them. For Czechia, Sweden, and France, the MAP provides a slightly better fit than the ML method. In general, within each country, the MAP-ML differences in MSE are small.
We also estimate the standard error for the ML estimates of σ² to test if the parameter is not statistically significantly different from zero. We compare the results with the ones for the MAP estimate. Fig 5 presents for which populations σ² is zero through the ML (top panel) and MAP method (bottom panel). From both methods, it is clear that the unobserved heterogeneity is statistically negligible only for some male populations, especially for Danes. This may result from the small number of males surviving to the oldest ages, which leads to data fluctuations.
In about 96% of the populations, the methods agree on the statistical significance or non-significance of σ². However, when the MAP estimates σ² = 0, the ML provides positive confidence intervals, i.e., non-zero σ²-estimates, in 5.1948% of the cases, which corresponds to the chosen α-level (probability of Type I error) of 5% for those intervals.
4 Concluding remarks
Böhnstedt and Gampe introduced a formal procedure to identify whether σ² > 0 or σ² = 0 in a hypothesis testing setting: they studied the asymptotic properties of the maximum likelihood estimator and the likelihood ratio test (LRT) for H0: σ² = 0 vs. H1: σ² > 0 for the gamma-Gompertz model [25]. However, LRTs are based on the asymptotic distribution of the maximum likelihood estimator; hence their convergence depends on the sample size. Moreover, conclusions drawn from hypothesis tests are dependent on the arbitrary choice of the significance level or p-value.
We suggest an alternative method by considering the problem as model misspecification based on the Poisson likelihood function. We add a penalty function to the likelihood so that we make sure that the estimate of σ² is exactly 0 when there is no heterogeneity, and we present its Bayesian interpretation (MAP). We assume death counts to be Poisson-distributed [32], but alternative specifications might yield even higher accuracy. Examples of alternatives for the death-count distribution are the negative binomial and the one-parameter Bell distributions [47–49]. In these cases, the penalty parameter λ may not be equal to 1/2. Instead, it requires careful adjustment, a process best carried out through a comprehensive simulation study as outlined by Li et al. [40], which ensures a robust calibration of the parameter for accurate results.
We take advantage of robust Monte Carlo simulations to measure the bias and standard deviation of the ML and MAP methods in scenarios with and without unobserved heterogeneity. We also compare the performance of both methods for estimating the gamma-Gompertz model parameters using actual mortality data from the Human Mortality Database. The two methods work almost equally well, the ML having a slight advantage, in the presence of unobserved heterogeneity. However, in the absence of the latter, the MAP method provides an estimate closer to 0 (of the order of 10⁻¹⁵) and a better fit to the model in comparison to ML. As a result, the method we propose here can be used as an alternative to likelihood ratio testing for the gamma-Gompertz model with H0: σ² = 0 vs. H1: σ² > 0. On the one hand, the MAP method does not depend on any asymptotic distribution, its performance is not strongly affected by sample size, and it also does not depend on the arbitrary choice of the significance level. On the other hand, MAP provides similar estimates to the ones by ML when σ² > 0 and more accurate estimates when σ² = 0.
References
- 1. Gompertz B. XXIV. On the nature of the function expressive of the law of human mortality, and on a new mode of determining the value of life contingencies. In a letter to Francis Baily, Esq. FRS &c. Philosophical transactions of the Royal Society of London. 1825;(115):513–583.
- 2. Vaupel JW, Manton KG, Stallard E. The impact of heterogeneity in individual frailty on the dynamics of mortality. Demography. 1979;16(3):439–454. pmid:510638
- 3. Gavrilova NS, Gavrilov LA. Biodemography of old-age mortality in humans and rodents. Journals of Gerontology Series A: Biomedical Sciences and Medical Sciences. 2015;70(1):1–9. pmid:24534516
- 4. Newman SJ. Errors as a primary cause of late-life mortality deceleration and plateaus. PLoS biology. 2018;16(12):e2006776. pmid:30571676
- 5. Curtsinger JW, Fukui HH, Townsend DR, Vaupel JW. Demography of Genotypes: Failure of the Limited Life-Span Paradigm in Drosophila melanogaster. Science. 1992;258:461–463. pmid:1411541
- 6. Fukui HH, Curtsinger JW, Xiu L. Slowing of age-specific mortality rates in Drosophila melanogaster. Experimental Gerontology. 1993;28:585–599. pmid:8137895
- 7. Fukui HH, Ackert L, Curtsinger JW. Deceleration of age-specific mortality rates in chromosomal homozygotes and heterozygotes of Drosophila melanogaster. Experimental Gerontology. 1996;36(4):517–531. pmid:9415108
- 8. Carey JR, Liedo P, Vaupel JW. Mortality dynamics of density in the Mediterranean fruit fly. Experimental Gerontology. 1995;30(6):605–629. pmid:8867529
- 9. Khazaeli AA, Pletcher SD, Curtsinger JW. The fractionation experiment: reducing heterogeneity to investigate age-specific mortality in Drosophila. Mechanisms of Aging and Development. 1998;105:301–317. pmid:9862237
- 10. Gampe J. Human mortality beyond age 110. In: Supercentenarians. Springer; 2010. p. 219–230.
- 11. Gampe J. Mortality of supercentenarians: Estimates from the updated IDL. In: Exceptional Lifespans. Springer, Cham; 2021. p. 29–35.
- 12. Rootzén H, Zholud D. Human life is unlimited–but short. Extremes. 2017;20(4):713–728.
- 13. Alvarez JA, Villavicencio F, Strozza C, Camarda CG. Regularities in human mortality after age 105. PloS one. 2021;16(7):e0253940. pmid:34260647
- 14. Camarda CG. The curse of the plateau. Measuring confidence in human mortality estimates at extreme ages. Theoretical Population Biology. 2022;144:24–36. pmid:35101435
- 15. Belzile LR, Davison AC, Gampe J, Rootzen H, Zholud D. Is There a Cap on Longevity? A Statistical Review. Annual Review of Statistics and Its Application. 2022;9:21–45.
- 16. Makeham WM. On the law of mortality and the construction of annuity tables. The Assurance Magazine, and Journal of the Institute of Actuaries. 1860;8(6):301–310.
- 17. Feehan DM. Separating the signal from the noise: Evidence for deceleration in old-age death rates. Demography. 2018;55(6):2025–2044. pmid:30390230
- 18. Missov TI, Vaupel JW. Mortality implications of mortality plateaus. SIAM Review. 2015;57(1):61–70.
- 19. Burger O, Missov TI. Evolutionary theory of ageing and the problem of correlated Gompertz parameters. Journal of Theoretical Biology. 2016;408:34–41. pmid:27503574
- 20. Colosimo EA, Giolo SR. Análise de sobrevivência aplicada. Editora Blucher; 2021.
- 21. Strehler B. Time, cells, and aging. Elsevier; 2012.
- 22. Keyfitz N. Improving life expectancy: an uphill road ahead. 1978.
- 23. Missov TI, Finkelstein M. Admissible mixing distributions for a general class of mixture survival models with known asymptotics. Theoretical population biology. 2011;80(1):64–70. pmid:21600234
- 24. Vaupel JW, Missov TI. Unobserved Population Heterogeneity: A Review of Formal Relationships. Demographic Research. 2014;31(22):659–686.
- 25. Böhnstedt M, Gampe J. Detecting mortality deceleration: Likelihood inference and model selection in the gamma-Gompertz model. Statistics & Probability Letters. 2019;150:68–73.
- 26. Berk R, Brown L, Zhao L. Statistical inference after model selection. Journal of Quantitative Criminology. 2010;26(2):217–236.
- 27. Head ML, Holman L, Lanfear R, Kahn AT, Jennions MD. The extent and consequences of p-hacking in science. PLoS biology. 2015;13(3):e1002106. pmid:25768323
- 28. Vidgen B, Yasseri T. P-values: misunderstood and misused. Frontiers in Physics. 2016;4:6.
- 29. Bruns SB, Ioannidis JP. P-curve and p-hacking in observational research. PloS one. 2016;11(2):e0149144. pmid:26886098
- 30. Tibshirani R. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological). 1996;58(1):267–288.
- 31. Hastie T, Tibshirani R, Friedman JH. The elements of statistical learning: data mining, inference, and prediction. vol. 2. Springer; 2009.
- 32. Brillinger DR. A biometrics invited paper with discussion: the natural variability of vital rates and associated statistics. Biometrics. 1986; p. 693–734.
- 33. Macdonald AS, Richards SJ, Currie ID. Modelling mortality with actuarial applications. Cambridge University Press; 2018.
- 34. Heinze G, Schemper M. A solution to the problem of separation in logistic regression. Statistics in medicine. 2002;21(16):2409–2419. pmid:12210625
- 35. Greig DM, Porteous BT, Seheult AH. Exact maximum a posteriori estimation for binary images. Journal of the Royal Statistical Society: Series B (Methodological). 1989;51(2):271–279.
- 36. Afonso MV, Bioucas-Dias JM, Figueiredo MA. An augmented Lagrangian approach to the constrained optimization formulation of imaging inverse problems. IEEE transactions on image processing. 2010;20(3):681–695. pmid:20840899
- 37. Belekos SP, Galatsanos NP, Katsaggelos AK. Maximum a posteriori video super-resolution using a new multichannel image prior. IEEE Transactions on Image Processing. 2010;19(6):1451–1464. pmid:20129860
- 38. Gelman A, Carlin JB, Stern HS, Rubin DB. Bayesian data analysis. Chapman and Hall/CRC; 1995.
- 39. Llera A, Beckmann C. Estimating an inverse gamma distribution. arXiv preprint arXiv:1605.01019. 2016.
- 40. Li P, Chen J, Marriott P. Non-finite Fisher information and homogeneity: an EM approach. Biometrika. 2009;96(2):411–426.
- 41. Bhattacharya S, McNicholas PD. A LASSO-penalized BIC for mixture model selection. Advances in Data Analysis and Classification. 2014;8(1):45–61.
- 42. Missov TI. Gamma-Gompertz life expectancy at birth. Demographic Research. 2013;28:259–270.
- 43. R Core Team. R: A language and environment for statistical computing. 2022.
- 44. Storn R, Price K. Differential evolution—a simple and efficient heuristic for global optimization over continuous spaces. Journal of Global Optimization. 1997;11:341–359.
- 45. Ardia D, Boudt K, Carl P, Mullen K, Peterson BG. Differential evolution with DEoptim: an application to non-convex portfolio optimization. The R Journal. 2011;3(1):27–34.
- 46. HMD. The Human Mortality Database; 2022. http://www.mortality.org/.
- 47. Castellares F, Ferrari SL, Lemonte AJ. On the Bell distribution and its associated regression model for count data. Applied Mathematical Modelling. 2018;56:172–185.
- 48. Castellares F, Patrício S, Lemonte AJ. On the Gompertz–Makeham law: A useful mortality model to deal with human mortality. Brazilian Journal of Probability and Statistics. 2022;36(3):613–639.
- 49. Patrício SC, et al. Modelagem de mortalidade. 2020.