Abstract
The cure rate in clinical trials can be estimated using the Kaplan–Meier (KM) estimator. However, when the follow-up period of a clinical trial is insufficient, the KM estimator may overestimate the proportion of cured patients. Although Escobar-Bach and Keilegom proposed a correction method based on bootstrap sampling, it can also lead to bias when the bootstrap distribution is skewed. We propose a median-based approach for bootstrap samples to address these issues. Simulation results showed that the proposed method was less affected by outlying estimates than the existing correction method and enabled more stable estimation. The method was successfully applied to real clinical trial data from a D-penicillamine study on primary biliary cirrhosis.
Citation: Ibi Y, Omori T (2026) Cure rate estimation with insufficient follow-up: A median-based bootstrap correction approach. PLoS One 21(3): e0344669. https://doi.org/10.1371/journal.pone.0344669
Editor: Suyan Tian, The First Hospital of Jilin University, CHINA
Received: June 23, 2025; Accepted: February 24, 2026; Published: March 12, 2026
Copyright: © 2026 Ibi, Omori. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: This study used publicly available and fully anonymized data, which were obtained from the following source, PBC data in chapter 6 from this URL: https://regression.ucsf.edu/second-edition/data-examples-and-problems. This data is used in the book: Regression Methods in Biostatistics: Linear, Logistic, Survival, and Repeated Measures Models. Springer Science + Business Media. 2012. We contacted Springer Nature, the publisher, to request permission for its use, and authorization was granted.
Funding: This study was supported by the Japan Agency for Medical Research and Development (grant number JP21lk0201702). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
1. Introduction
A cure for an illness is defined as improvement from the illness or injury. In clinical trials with overall survival or progression-free survival as endpoints and in which the follow-up period is limited, participants who do not experience an event of interest during the follow-up period are often regarded as cured patients [1]. They are censored at the end of the trial, and the Kaplan–Meier survival curve shows a long and stable plateau with heavy censoring at the tail [2].
Cure models are used to estimate the cure rate, which is the proportion of cured patients. There are two types of cure models: mixture and non-mixture cure models [3]. The mixture cure model assumes that trial participants are a mixture of cured and uncured patients. The mixture cure model used in this study is the standard model first introduced by Boag (1949) [4] and later modified by Berkson and Gage (1952) [5]; it was chosen because of its directness and simplicity in interpretation [3]. Estimation approaches for mixture cure models include non-parametric [6,7], semiparametric [8,9], and parametric [10,11] estimation. Among semiparametric models, previous studies have proposed transformation models for survival data with cured patients that account for measurement error [12,13].
When estimating the cure rate nonparametrically, the Kaplan–Meier estimate at the largest observed time can be used. This can be done without bias if the follow-up period is long enough, i.e., if the censoring time is longer than the survival time of uncured patients [14]. In practice, however, the follow-up period of clinical trials is not always long enough for the event of interest to occur [14], and in such situations, the cure rate may be overestimated [15]. To address this overestimation by the KM estimator when the follow-up period is insufficient and short, Escobar-Bach and Keilegom (2019) [14] proposed adding a correction term to the KM estimator. The correction term is computed using bootstrap sample means.
Although their simulation study showed that the bias of their method was smaller than that of the KM estimator, the approach can be affected when the bootstrap distribution is skewed, which may still lead to bias and unstable estimates.
The current study proposes an alternative correction method for estimating cure rates in clinical trials with insufficient or short follow-up periods. Section 2 presents the proposed method and other cure rate estimation methods, Section 3 examines the behavior of the proposed method through simulations, Section 4 applies the proposed method to primary biliary cirrhosis data, and Section 5 provides a discussion.
2. Method
2.1 Mixture cure model
Let the survival time of the patients be $T$ and the censoring time be $C$. $T$ and $C$ are independent, and $C$ is finite. Under the assumption of random right censoring, the observed time is $Y = \min(T, C)$. Let $\tau_T$ be the right endpoint of the survival function for $T$ and $\tau_C$ that for $C$. Here, $F_X$ denotes the cumulative distribution function of $X$, and $\tau_X$ is defined as $\inf\{t : F_X(t) = 1\}$, the right endpoint of the support of $F_X$. Similarly, $\tau_u$ is the right endpoint of the survival time distribution of uncured patients.
A parametric mixture cure model was used in this study. In the model, the survival curve, $S(t)$, is expressed in terms of the cure rate, $p$, and the survival curve of uncured patients, $S_u(t)$, as follows:

$$S(t) = p + (1 - p)\,S_u(t), \qquad 0 < p < 1. \tag{1}$$
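To make the role of the cure rate in the mixture model concrete, the following worked step may help. It is a sketch that assumes the mixture form $S(t) = p + (1-p)S_u(t)$ with a proper uncured survival curve (so that $S_u(t) \to 0$); the symbols $S_u$ and $t_0$ are notational assumptions for this illustration.

```latex
\begin{align*}
\lim_{t\to\infty} S(t) &= p + (1-p)\lim_{t\to\infty} S_u(t) = p, \\
S(t_0) - p &= (1-p)\,S_u(t_0) \;\ge\; 0 \quad \text{for any finite } t_0 .
\end{align*}
```

The second line is the amount by which the survival curve read off at a finite follow-up time $t_0$ exceeds the cure rate; it vanishes only when no uncured patient survives past $t_0$.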
2.2 Cure rate estimation methods
The KM estimator may be used to estimate p in Equation (1).
The observed times are ordered as $Y_{(1)} \le \cdots \le Y_{(n)}$, where $n$ is the sample size. Let $Y_{(n)} = \max_{1 \le i \le n} Y_i$ be the largest observed time among the clinical study participants. One method for estimating the cure rate $p$ is to evaluate the KM estimator at the largest observed time:

$$\hat p_n = \hat S_n(Y_{(n)}), \tag{2}$$

where $\hat S_n$ denotes the KM estimator of the survival function $S$ based on a sample of size $n$. Escobar-Bach and Keilegom (2019) [14] proposed a method that corrects the KM estimator, because the cure rate calculated by the estimator in Equation (2) may be overestimated when the follow-up period is insufficient and short. The estimator in Equation (2) is an estimate of $S(\tau_C)$, with $\tau_C$ the right endpoint of the censoring distribution, whereas the true cure rate is $p = S(\infty)$, so this can lead to bias. Their method uses ideas from extreme value theory to construct the correction term, which is computed using bootstrap samples. It assumes a survival curve after the censoring time; i.e., it considers what the survival curve would have looked like had follow-up continued beyond that time. They regarded the largest observed time in a sample as a random variable following an extreme value distribution with parameter $y$, which leads to Equation (3), where $y \in (0,1)$ is a tuning parameter.
In Equation (3), the term added to the KM estimator is the correction term. With this correction, the survival function is in effect evaluated beyond the largest observed time, since $\hat S_n(t)$ is close to $S(t)$ when $n \to \infty$. The corrected estimator cannot be calculated until $y$ is determined. As $y$ is not a fixed value and cannot be estimated from the data alone, it must be chosen as the value that best fits the tail of $S(t)$. They used the bootstrap method to select a value for $y$ between 0.6 and 0.98, and the correction term was calculated as in Equation (4), which is explained in detail in Sections 2.2.2 and 2.2.3.
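The way a tuning parameter $y$ can enter a correction of this kind may be illustrated with a short derivation. This is only a sketch under assumed notation ($t$ for the largest observed time, $b_y$ for a ratio of successive tail differences) and under the assumption that the excess $S(t) - p$ decays like a power of $t$; it is not necessarily the exact form of Equation (3) in [14].

```latex
\begin{align*}
b_y &= \frac{S(y^{2}t) - S(yt)}{S(yt) - S(t)}, \qquad 0 < y < 1, \\
S(t) - p &\approx \frac{S(yt) - S(t)}{b_y - 1}
\quad\Longrightarrow\quad
p \approx S(t) - \frac{S(yt) - S(t)}{b_y - 1}.
\end{align*}
```

For example, if $S(t) - p = c\,t^{-\gamma}$ with $c, \gamma > 0$, then $S(yt) - S(t) = c\,t^{-\gamma}(y^{-\gamma} - 1)$ and $b_y = y^{-\gamma}$, so the relation holds exactly. The subtracted term plays the role of a correction to the plateau value $S(t)$, and it depends on the data only through survival estimates at $t$, $yt$, and $y^{2}t$.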
2.2.1 KM method.
We refer to the estimation method, i.e., the KM estimator at the largest observed time, as the KM method. This was performed without adding a correction term, as estimated in Equation (2).
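The KM method can be sketched in a few lines. The following is a minimal illustration rather than the SAS code used in this study; the function names `kaplan_meier` and `km_cure_rate` and the toy data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return (distinct observed times, KM survival estimate just after each time).

    times  : observed times Y_i = min(T_i, C_i)
    events : 1 if the event was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    out_times, out_surv = [], []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same observed time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
        n_at_risk -= removed
        out_times.append(t)
        out_surv.append(surv)
    return out_times, out_surv


def km_cure_rate(times, events):
    """KM method: the KM estimate at the largest observed time."""
    _, surv = kaplan_meier(times, events)
    return surv[-1]


# Toy data (hypothetical): two events, three censored observations.
toy_times = [1, 2, 3, 4, 5]
toy_events = [1, 0, 1, 0, 0]
print(km_cure_rate(toy_times, toy_events))
```

In the toy data the curve drops at times 1 and 3 and then stays flat, so the value at the largest observed time is simply the height of the final plateau.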
The following theorem concerns the KM method. Let $\tau_u$ and $\tau_C$ denote the right endpoints of the survival time distribution of uncured patients and of the censoring distribution, respectively. Assume that $0 < p < 1$ and that $S$ is continuous at $\tau_C$ in the case $\tau_C < \infty$. Maller and Zhou (1992) [16] state that the estimator $\hat p_n$ is consistent if and only if $\tau_u \le \tau_C$, meaning that, with probability 1, no uncured patients can survive beyond the largest possible censoring time. This condition ensures that sufficient information is available across the entire support of the survival time, so that no observation is almost surely censored. This scenario, commonly referred to as sufficient follow-up, represents the standard paradigm in the analysis of censored data. However, such a condition is not always satisfied in practical studies, particularly when follow-up periods are short or event times are long. When $\tau_u > \tau_C$, the follow-up period is insufficient, and $\hat p_n$ inevitably overestimates the true cure rate $p$ [14,16].
2.2.2 Escobar and Keilegom method (EK method).
We call the method developed by Escobar-Bach and Keilegom (2019) [14] the “EK method”. Since they proposed a correction method based on the KM estimator at the largest observed time, we focus on that estimator in this section. In the EK method, $y$ is calculated using the mean value of the bootstrap samples. Equation (3) is estimated using Equation (4), with 200 bootstrap extractions and $y$ chosen from the grid $H = \{0.6, 0.62, \ldots, 0.98\}$, following Escobar-Bach and Keilegom (2019) [14].
2.2.3 Proposed method (EC method).
The proposed method is a correction to the EK method, which we call the EK's correction method (“EC method”). It differs from the EK method in how $y$ is estimated. Instead of using the mean of the bootstrap samples, the proposed method uses the median, so that the mean in Equation (4) is replaced by the median. The reason is that the bootstrap samples are proportions taking values between 0 and 1; their distribution can therefore be skewed, and the resulting variation can affect the cure rate estimate. Using the median, rather than the mean, may be a more appropriate choice for $y$ because it reduces the effects of variation and outliers.
3. Simulation
3.1 Aim and settings
The performance of the proposed cure rate estimation method was evaluated. The purpose was to compare the proposed method with the KM and EK methods through simulations.
The mixture cure model in Equation (1) was used to estimate the cure rate and generate the survival time. The cure rate $p$ was assumed to be constant, and four values were considered: 0.2, 0.3, 0.4, and 0.5. As the distribution of the survival time of uncured patients, we used the exponential distribution with rate parameter $\lambda$. For the survival curves, we considered the following three situations for the follow-up period: (A) insufficient and short, when the plateau of the survival curve was short; (B) insufficient and too short, when the plateau of the survival curve was shorter than in (A); and (C) sufficiently long, when the plateau of the survival curve was sufficiently long.
The follow-up period was assumed to be 3000 days, and those who survived beyond 3000 days were censored. Fig 1 shows the case in which the true cure rate is 0.2. In (A), the parameter of the exponential distribution is $\lambda = 0.0013$, and 3000 days corresponds to the upper 1.83 percentile of the distribution; in (B), $\lambda = 0.0005$ and the upper 22.31 percentile; and in (C), $\lambda = 0.0033$ and the upper 0.005 percentile.
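The three follow-up scenarios can be explored with a short script. This is a sketch under stated assumptions: the exponential parameter is treated as a rate per day, censoring occurs only at the end of follow-up (3000 days), and the function names, sample size, and seed are hypothetical choices for illustration.

```python
import math
import random


def tail_probability(rate, follow_up):
    """P(T_u > follow_up) for an exponential survival time with the given rate."""
    return math.exp(-rate * follow_up)


def simulate_mixture_cure(n, cure_rate, rate, follow_up, rng):
    """Draw (observed_time, event) pairs from a mixture cure model.

    Cured patients never experience the event and are censored at follow_up;
    uncured patients have exponential survival times censored at follow_up.
    """
    sample = []
    for _ in range(n):
        if rng.random() < cure_rate:
            sample.append((follow_up, 0))
        else:
            t = rng.expovariate(rate)
            sample.append((min(t, follow_up), 1 if t <= follow_up else 0))
    return sample


# Percentage of uncured patients still event-free at 3000 days in each scenario.
for label, rate in [("A", 0.0013), ("B", 0.0005), ("C", 0.0033)]:
    print(label, round(100 * tail_probability(rate, 3000), 3))

# One simulated data set for scenario (A) with a true cure rate of 0.2.
rng = random.Random(2026)  # arbitrary seed
data = simulate_mixture_cure(200, 0.2, 0.0013, 3000, rng)
```

Under these assumptions, the tail probabilities for (B) and (C) agree with the stated 22.31 and 0.005 percentiles. For (A), the rate 0.0013 gives about 2.0%, whereas 1.83% corresponds to a rate of 4/3000 (about 0.00133), which suggests that the printed rate is a rounded value.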
When implementing the EK and proposed methods, the following rules were applied: if the correction term could not be calculated, it was set to 0; if the correction term was less than 0, it was set to 0; if the correction term exceeded 1, it was set to 1; and if the estimated cure rate was less than 0, it was set to 0. The number of simulation repetitions was 1000. The number of bootstrap extractions used in Equation (4) was 200.
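The substitution rules for the correction term and the estimated cure rate can be written as a small helper. This is a minimal sketch: the function names are hypothetical, a missing or non-numeric correction is represented by `None` or NaN, and the correction is assumed to be subtracted from the KM estimate (a sign convention chosen for this illustration).

```python
import math
from typing import Optional


def clean_correction(term: Optional[float]) -> float:
    """Apply the substitution rules to a raw correction term."""
    if term is None or math.isnan(term):
        return 0.0  # correction could not be calculated
    return min(max(term, 0.0), 1.0)  # below 0 becomes 0, above 1 becomes 1


def corrected_cure_rate(km_estimate: float, term: Optional[float]) -> float:
    """KM estimate minus the cleaned correction, floored at 0.

    Assumption: the correction lowers the KM estimate.
    """
    return max(km_estimate - clean_correction(term), 0.0)


print(corrected_cure_rate(0.34, 0.05))
print(corrected_cure_rate(0.34, None))
print(corrected_cure_rate(0.34, 1.7))
```

With these rules, a failed correction falls back to the KM estimate, and an excessively large correction yields an estimated cure rate of 0 rather than a negative value.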
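Bias, standard deviation (SD), and mean squared error (MSE) were used as indicators to evaluate the simulation. Bias represents the average estimation error, SD represents the variability of the estimates, and MSE represents the mean squared error with respect to the true value. With $\hat p_r$ denoting the estimate in repetition $r = 1, \ldots, R$ ($R = 1000$), $\bar p = R^{-1}\sum_{r=1}^{R} \hat p_r$, and $p$ the true cure rate, each indicator was calculated as:

$$\mathrm{Bias} = \bar p - p, \qquad \mathrm{SD} = \sqrt{\frac{1}{R-1}\sum_{r=1}^{R}(\hat p_r - \bar p)^2}, \qquad \mathrm{MSE} = \frac{1}{R}\sum_{r=1}^{R}(\hat p_r - p)^2.$$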
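The three indicators can be computed from a list of estimates as follows. This is a sketch under the assumption that SD is the sample standard deviation with divisor R − 1 and that MSE uses divisor R; the function names and the toy estimates are hypothetical.

```python
import math


def bias(estimates, truth):
    """Mean of the estimates minus the true value."""
    return sum(estimates) / len(estimates) - truth


def sd(estimates):
    """Sample standard deviation of the estimates (divisor R - 1, assumed)."""
    m = sum(estimates) / len(estimates)
    return math.sqrt(sum((e - m) ** 2 for e in estimates) / (len(estimates) - 1))


def mse(estimates, truth):
    """Mean squared error of the estimates around the true value (divisor R)."""
    return sum((e - truth) ** 2 for e in estimates) / len(estimates)


# Toy estimates (hypothetical) for a true cure rate of 0.2.
toy = [0.18, 0.22, 0.25, 0.15]
print(bias(toy, 0.2), sd(toy), mse(toy, 0.2))
```

With these divisors, MSE equals the squared bias plus (R − 1)/R times the squared SD, so a method with larger SD can still have smaller MSE if its bias is sufficiently reduced.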
We also examined the distribution of the 200 bootstrap samples in Equation (4) for four typical patterns out of the 1000 simulation replications. The four patterns, at a true cure rate of 0.4, are illustrated in Fig 3. In (a), the absolute difference between the mean and median of the bootstrap samples was small, and the difference between the EK method and the proposed method was small; in (b), the absolute difference was small, and the difference between the EK method and the proposed method was large; in (c), the absolute difference was large, but the difference between the EK and the proposed methods was small; and in (d), the absolute difference was large and the difference between the EK and the proposed methods was large. The KM estimates at $y$ times the largest observed time, for $y$ between 0.6 and 0.98, were also examined to determine which of the three survival curves, (A) to (C), was best suited to the proposed method.
EC, EK, and KM are the proposed method, and the existing EK and Kaplan–Meier methods, respectively.
All data analyses were performed using SAS version 9.4 (SAS Institute Inc., Cary, NC, USA).
3.2 Results
3.2.1 Characteristics of the estimated cure rate.
The box plots in Fig 2 show the estimates for each setting under the three estimation methods, and the estimated cure rate, bias, SD, and MSE are summarized in Table 1.
From the box plots in Fig 2, the EK method, which is based on the mean, shows considerable variability. In contrast, the proposed method suppresses the variability of the points below the lower whisker to a greater extent than the EK method. Across all scenarios, the bias was comparable between the EK and proposed methods, and the SD was smallest for the KM method. The MSE was smallest for the proposed method in scenario (A).
3.2.2 Distributions of the four bootstrap samples.
Fig 3 shows histograms of the bootstrap samples. In Fig 3, the mean and median of the bootstrap samples, the skewness of those samples, and the estimated cure rates by EK and EC are shown. Even when the distribution was skewed, there were cases where there was a difference in the estimated cure rate between the EK and the proposed methods, as well as cases where there was no difference.
4. Real data application
4.1 Aim and settings
The objective was to apply the proposed method to real clinical trial data and examine the behavior of the model. The estimates were calculated using the proposed and EK methods, and the bootstrap sample histogram used for each estimate was examined. The data used were obtained from the D-penicillamine (DPCA) study [17].
This study included a cohort of 312 participants from a placebo-controlled clinical trial of DPCA for primary biliary cirrhosis (PBC) [17,18]. PBC destroys bile ducts in the liver, causing bile accumulation. Progressive tissue damage ultimately leads to liver failure. The time from diagnosis to end-stage liver disease ranges from a few months to 20 years. During the approximately ten-year follow-up period, 125 participants died [17]. Dickson et al. (1989) [18] developed a model to predict survival in patients with PBC using Cox regression analysis and comprehensive data from 312 patients. As previous studies reported no therapeutic differences between control and DPCA-treated patients [18], we estimated cure rates using the pooled dataset across the treatment groups.
The outcome of interest was the time to death (years), with death treated as the event of interest. The Kaplan–Meier curve for the PBC data is presented in Fig 4.
4.2 Results
The estimated cure rates and a histogram of the bootstrap samples are shown in Fig 5. The cure rate estimated by the EK method was 0.215, that by the proposed method was 0.294, and that by the KM method was 0.341. The histogram and the skewness of −0.714 indicate mild skewness in the distribution, and the EK and EC estimates differed.
EK is the existing EK method and EC is the proposed method. The absolute value represents the difference between the mean and the median of the bootstrap samples.
As suggested by the simulation results, when the bootstrap distribution is skewed, as shown in Fig 5, the cure rate may be overestimated when using the KM estimator, and underestimated when applying the existing mean-based correction method. Although this finding is based on a single data example, it is reasonable to consider the cure rate estimated by the proposed method as a plausible option for evaluating the cure rate. This example further illustrates that, in real applications, estimates of the cure rate can vary depending on the method used.
For practical implementation, it is preferable to examine the histogram of the bootstrap distribution and to assess characteristics such as its mean, median, and skewness. Such evaluations can help determine which estimation method is most appropriate for obtaining the cure rate in each dataset.
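The diagnostics mentioned above (mean, median, and skewness of the bootstrap distribution) can be computed with the standard library. This is a minimal sketch: the function names and the replicate values are hypothetical, and the skewness shown is the simple moment-based version, whereas statistical packages often report a bias-adjusted variant that may differ slightly.

```python
from statistics import mean, median


def skewness(values):
    """Moment-based skewness m3 / m2**1.5; requires non-constant values."""
    n = len(values)
    m = mean(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def describe(values):
    """Summaries to inspect before choosing between mean- and median-based corrections."""
    m, med = mean(values), median(values)
    return {
        "mean": m,
        "median": med,
        "abs_diff": abs(m - med),
        "skewness": skewness(values),
    }


# Hypothetical bootstrap replicates with a tail toward small values.
replicates = [0.30, 0.31, 0.29, 0.32, 0.28, 0.30, 0.31, 0.05, 0.10, 0.29]
print(describe(replicates))
```

A negative skewness together with a mean below the median, as in this toy sample, indicates that a few small replicates pull the mean downward while leaving the median largely unchanged.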
5. Discussion
In this study, we proposed a cure rate estimation method with a median-based correction term for clinical trials in which the follow-up period is insufficient and short.
The EC method constructs the correction term based on the median of the bootstrap samples of the estimated survival function. For summarizing the location of asymmetric distributions, the advantages of the median over the mean are established in robust statistics [19]. The median can be interpreted as an M-estimator defined as the minimizer over $\theta$ of the following function:

$$\sum_{i=1}^{n} |x_i - \theta|.$$

Furthermore, when $x_1, \ldots, x_n$ is regarded as a random sample from a distribution $F$, the breakdown point, defined as the maximum proportion of contamination that an estimator can tolerate, equals $1/2$ for the median, in contrast to $0$ for the mean. This provides a theoretical justification for adopting the median rather than the mean when the bootstrap distribution used for bias correction is skewed, a situation that may occur when the bootstrap samples are constructed from estimated survival functions. Moreover, the bootstrap distribution of the mean is also positively skewed, correctly suggesting that the sampling distribution of the mean is asymmetric [20]. In addition, the median is more appropriate than the mean when the data are not normally distributed [21].
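Both properties of the median can be checked numerically. The sketch below assumes that the M-estimation criterion is the sum of absolute deviations; the function names, grid, and data are hypothetical.

```python
from statistics import mean, median


def abs_loss(theta, xs):
    """Sum of absolute deviations of xs from theta."""
    return sum(abs(x - theta) for x in xs)


def argmin_on_grid(xs, grid):
    """Grid value with the smallest sum of absolute deviations."""
    return min(grid, key=lambda theta: abs_loss(theta, xs))


xs = [0.1, 0.2, 0.3, 0.4, 0.9]
grid = [i / 100 for i in range(101)]

# The minimizer of the absolute-deviation criterion coincides with the median.
best = argmin_on_grid(xs, grid)
print(best, median(xs))

# Replace one observation by an arbitrarily extreme value (contamination).
contaminated = xs[:-1] + [1000.0]
print(median(contaminated), mean(contaminated))
```

Replacing a single observation by an extreme value leaves the median of this sample unchanged but moves the mean without bound, which is the practical meaning of the difference in breakdown points.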
The proposed method was more stable as a cure rate estimator than the EK method, as seen from the point estimates below the lower whisker in the box plots in Fig 2 and from the MSE in Table 1. As shown in Table 1 for scenario (A), its bias was similar to that of the EK method; however, both methods showed a smaller bias than the KM method because they add a correction to it.
The SD was larger than that of the KM method because of the variation in the bootstrap samples. Regarding the bootstrap samples used in the proposed method, Section 3.2.2 showed that, even when the distribution was skewed, there were some cases in which the EK and proposed methods did not differ.
Determining whether the follow-up period in clinical trials is sufficient tends to be subjective and ambiguous [22]. The results of this study showed that the corrections led to large variations in (B) in Fig 2 and to an over-reduction of the estimated cure rates in (C). Although the proposed method was developed under the assumption of (A), the circumstances in which it is appropriate must be carefully considered. Table 2 summarizes the Kaplan–Meier estimates of the survival function at several fractions $y$ of the largest observed time ($y = 1.0$), with $y$ between 0.6 and 0.98, using the data from Section 3.2.1 for one of the 1000 simulation repetitions and a true cure rate of 0.2. In Table 2(A), the Kaplan–Meier estimates showed minimal change: a small difference between $y = 0.6$ and $y = 0.9$ and a negligible difference between $y = 0.9$ and $y = 0.98$. In (B), which represents a follow-up period that is insufficient and too short, the difference between $y = 0.6$ and $y = 0.9$ is larger than in (A). In (C), there was not much difference between them. In Table 2, a single repetition indicated differences between $y = 0.6$ and $y = 0.9$, and between $y = 0.9$ and $y = 0.98$. To assess the replicability of these findings, we showed the differences in the values at each $y$ from the results of Table 2 (S1 Table) and calculated the proportion of the 1000 repetitions to which those differences applied (S2 Table).
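Reading a Kaplan–Meier curve at fractions of the largest observed time amounts to evaluating a step function. The sketch below assumes that the curve is supplied as its jump times and the survival values just after each jump; the function names and the toy curve are hypothetical and are not taken from Table 2.

```python
from bisect import bisect_right


def step_value(jump_times, surv_values, t):
    """Value of a right-continuous survival step function at time t.

    Returns 1.0 before the first jump, otherwise the value after the
    last jump at or before t.
    """
    k = bisect_right(jump_times, t)
    return 1.0 if k == 0 else surv_values[k - 1]


def km_at_fractions(jump_times, surv_values, t_max, fractions):
    """Survival estimates at y * t_max for each fraction y."""
    return {y: step_value(jump_times, surv_values, y * t_max) for y in fractions}


# Toy curve (hypothetical) with follow-up ending at 3000 days.
jumps = [500, 1200, 2000, 2800]
values = [0.80, 0.60, 0.45, 0.40]
table = km_at_fractions(jumps, values, 3000, [0.6, 0.9, 0.98, 1.0])
print(table)
```

A large drop between y = 0.6 and y = 0.9 in such a table indicates that the curve is still falling late in follow-up, whereas nearly equal values suggest that a plateau has been reached.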
The y, as derived from Equation (4), indicates the point, as a fraction of the largest observed time, at which the Kaplan–Meier survival estimate is evaluated. Follow-up periods (A) and (B) were deemed insufficient, with (B) exhibiting a severely truncated follow-up. Period (C) provided adequate follow-up. The case y = 1.0 corresponds to the KM method.
The limitations of this study are as follows. The proposed approach, similar to existing methods, relies on a bootstrap-based correction for an unknown parameter. However, summarizing the location of the bootstrap distribution using either the mean or the median may still yield unstable estimates when extreme values occur, even when the median is applied (Fig 2). Because the aim of the correction is to estimate the unknown parameter appearing in the adjustment term, it may be possible to obtain a more stable and still robust correction by refining how the time point at y times the largest observed time, or the censoring time, is incorporated into the estimation procedure. A detailed investigation of such points is beyond the scope of this study and should be addressed in future work. Furthermore, the subjective classification of follow-up periods underscores the need for more objective criteria for follow-up adequacy in future research. Finally, investigations into optimizing the bootstrap resampling process would further strengthen the method's applicability. Exploring the integration of this method with other survival analysis techniques could also broaden its utility in clinical research across diverse disease types and clinical trial designs. The median-based bootstrap correction can enhance the reliability of clinical trial outcomes, particularly in scenarios where prolonged follow-up is impractical, although future studies should further explore the method's performance. Code is available in S1 File.
Acknowledgments
We would like to thank Professor Satoshi Morita (Kyoto University Graduate School of Medicine, Japan) for supporting this research, and Tosiya Sato (The Institute of Statistical Mathematics Japan), Shiro Tanaka, Masatomo Omiya, and Yumi Takagi (Kyoto University Graduate School of Medicine, Japan) for providing inputs on the research. We would like to thank Editage (www.editage.jp) for English language editing.
References
- 1. Haybittle J. Cancer clinical trials: methods and practice. Oxford University Press; 1984.
- 2. He H, Han D, Song X, Sun L. Mixture proportional hazards cure model with latent variables. Stat Med. 2021;40(29):6590–604. pmid:34528248
- 3. Sun S, Liu G, Lyu T, Xue F, Yeh T-M, Rao S. Design considerations in clinical trials with cure rate survival data: a case study in oncology. Pharm Stat. 2018;17(2):94–104. pmid:29159922
- 4. Boag JW. Maximum likelihood estimates of the proportion of patients cured by cancer therapy. J R Stat Soc B. 1949;11:15–44.
- 5. Berkson J, Gage RP. Survival curve for cancer patients following treatment. J Am Stat Assoc. 1952;47:501–15.
- 6. Peng Y, Dear KB. A nonparametric mixture model for cure rate estimation. Biometrics. 2000;56(1):237–43. pmid:10783801
- 7. Xu J, Peng Y. Non-parametric cure rate estimation with covariates. Can J Stat. 2014;42(1):1–17.
- 8. Niu Y, Peng Y. A semiparametric marginal mixture cure model for clustered survival data. Stat Med. 2013;32(14):2364–73. pmid:23203908
- 9. Gu E, Zhang J, Lu W, Wang L, Felizzi F. Semiparametric estimation of the cure fraction in population-based cancer survival analysis. Stat Med. 2020;39(26):3787–805. pmid:32721045
- 10. Sposto R. Cure model analysis in cancer: an application to data from the Children’s Cancer Group. Stat Med. 2002;21(2):293–312. pmid:11782066
- 11. Gallardo DI, Brandão M, Leão J, Bourguignon M, Calsavara V. A new mixture model with cure rate applied to breast cancer data. Biom J. 2024;66(6):e202300257. pmid:39104134
- 12. Chen L-P. Analysis of receiver operating characteristic curves for cure survival data and mismeasured biomarkers. Mathematics. 2025;13(3):424.
- 13. Chen LP. Semiparametric estimation for cure survival model with left-truncated and right-censored data and covariate measurement error. Stat Probab Lett. 2019;154:108547.
- 14. Escobar-Bach M, van Keilegom I. Non-parametric cure rate estimation under insufficient follow-up by using extremes. J R Stat Soc B. 2019;81:861–80.
- 15. Escobar-Bach M, Van Keilegom I. Nonparametric estimation of conditional cure models for heavy-tailed distributions and under insufficient follow-up. Comp Stat Data Anal. 2023;183:107728.
- 16. Maller RA, Zhou S. Estimating the proportion of immunes in a censored sample. Biometrika. 1992;79(4):731–9.
- 17. Vittinghoff E, Glidden DV, Shiboski SC, McCulloch CE. Regression methods in biostatistics: linear, logistic, survival, and repeated measures models. Springer Science + Business Media; 2012.
- 18. Dickson ER, Grambsch PM, Fleming TR, Fisher LD, Langworthy A. Prognosis in primary biliary cirrhosis: model for decision making. Hepatology. 1989;10(1):1–7. pmid:2737595
- 19. Garthwaite PH, Jolliffe IT, Jones B. Statistical inference. Oxford University Press; 2002.
- 20. Andrews DF. Robust likelihood inference for public policy†. Can J Stat. 2007;35:341–50.
- 21. Wright DB, London K, Field AP. Using bootstrap estimation and the plug-in principle for clinical psychology data. J Exp Psychopathol. 2011;2(2):252–70.
- 22. Xie P, Escobar-Bach M, Van Keilegom I. Testing for sufficient follow-up in censored survival data by using extremes. Biom J. 2024;66(7):e202400033. pmid:39377280