¶ Membership of the Interval Cancer (INCA) Study Group is provided in the Acknowledgments.
The authors have declared that no competing interests exist.
Analyzed the data: EV CF MR. Wrote the paper: MR EV CF. Co-developed the project that includes this study: MR EV MC RP. Developed costs estimations, contributed to the cost-effectiveness analysis: MC. Developed the code for the data analysis: EV CF. Coordinated the RAFP and the INCA projects: MS XC. Participated in the design and analysis of the INCA study: LD. Revised and approved the manuscript: EV CF MC MS RP XC LD MR.
The one-size-fits-all paradigm in organized screening of breast cancer is shifting towards a personalized approach. The present study has two objectives: 1) To perform an economic evaluation and to assess the harm-benefit ratios of screening strategies that vary in their intensity and age intervals based on breast cancer risk; and 2) To estimate the gain in terms of cost and harm reductions using risk-based screening with respect to the usual practice. We used a probabilistic model and input data from Spanish population registries and screening programs, as well as from clinical studies, to estimate the benefit, harm, and costs over time of 2,624 screening strategies, uniform or risk-based. We defined four risk groups, low, moderate-low, moderate-high and high, based on breast density, family history of breast cancer and personal history of breast biopsy. The risk-based strategies were obtained combining the exam periodicity (annual, biennial, triennial and quinquennial), the starting ages (40, 45 and 50 years) and the ending ages (69 and 74 years) in the four risk groups. Incremental cost-effectiveness and harm-benefit ratios were used to select the optimal strategies. Compared to risk-based strategies, the uniform ones result in a much lower benefit for a specific cost. Reductions close to 10% in costs and higher than 20% in false-positive results and overdiagnosed cases were obtained for risk-based strategies. Optimal screening is characterized by quinquennial or triennial periodicities for the low or moderate risk-groups and annual periodicity for the high-risk group. Risk-based strategies can reduce harm and costs. It is necessary to develop accurate measures of individual risk and to work on how to implement risk-based screening strategies.
Early detection of breast cancer (BC) reduces mortality and may improve quality of life for most of the women diagnosed early by mammographic exams
Organized screening programs for early detection of BC provide screening services where all eligible women are treated as equal risk. For instance, the European guidelines recommend offering mammography screening to women aged 50–69 every two years
In a previous study, we performed an economic evaluation of uniform screening strategies that had different periodicities and varied in the ages at starting or ending the screening exams
We used the probabilistic model developed by Lee and Zelen (LZ), which has been described elsewhere
The model requires input data that was obtained from different sources. BC incidence and survival, and mortality from other causes refer to cohorts born in Catalonia (Spain) in the period 1948–1952
The research protocol was approved by the institutional review board and ethics committee of the Hospital Universitari Arnau de Vilanova de Lleida (Spain), which waived the need for informed consent.
We started estimating the age-specific risk of invasive BC for our study cohort, using the model published elsewhere by Martinez-Alonso
We obtained four aggregated risk groups that combined the profiles of women who had similar levels of BC incidence over time: 1) Low (L) risk, which included Category 1 breast density with at most one risk factor (family history or breast biopsy) and Category 2 breast density with no risk factors; 2) Medium-Low (ML) risk, which included Category 1 breast density with two risk factors, Category 2 breast density with one risk factor, and Categories 3 or 4 breast density with no risk factors; 3) Medium-High (MH) risk, which included Category 2 breast density with two risk factors and Categories 3 or 4 breast density with one risk factor; and 4) High (H) risk, which included Categories 3 or 4 breast density with two risk factors. The frequency distribution of the risk groups was 39.6%, 42.8%, 15.6% and 2.0% for L, ML, MH and H, respectively.
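The grouping rule above can be written as a small lookup. The following Python sketch is only an illustration of the rule as stated in the text; the function name `risk_group` and its argument names are hypothetical and do not come from the study code.

```python
def risk_group(density_category: int, n_risk_factors: int) -> str:
    """Return the aggregated risk group (L, ML, MH or H).

    density_category: breast density category, 1 to 4.
    n_risk_factors: how many of the two risk factors (family history,
    previous breast biopsy) are present: 0, 1 or 2.
    """
    if density_category not in (1, 2, 3, 4):
        raise ValueError("density_category must be 1, 2, 3 or 4")
    if n_risk_factors not in (0, 1, 2):
        raise ValueError("n_risk_factors must be 0, 1 or 2")

    if density_category == 1:
        # Category 1: low risk with at most one risk factor, otherwise ML.
        return "L" if n_risk_factors <= 1 else "ML"
    if density_category == 2:
        # Category 2: one step up in risk for each additional risk factor.
        return ("L", "ML", "MH")[n_risk_factors]
    # Categories 3 and 4 are treated identically.
    return ("ML", "MH", "H")[n_risk_factors]


# Example: dense breasts (Category 3) with family history and a prior biopsy.
example_group = risk_group(3, 2)
```

Writing the rule as a function makes it easy to check that every combination of density category and risk factors falls into exactly one group.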
The incidence rates of the four aggregated risk groups were estimated as weighted sums of detailed incidence curves (see Section B.2, Tables S2, S3, and Figure S1 in
We analyzed 2,625 screening strategies, 24 of them uniform and 2,601 risk-based. The risk-based strategies were obtained combining the exam periodicity (annual (A), biennial (B), triennial (T), and quinquennial (Q, [every five years])), the starting ages (40, 45 and 50 years) and the ending ages (69 and 74 years) in the four risk groups, L, ML, MH and H. In the following sections, uniform strategies are abbreviated as B5069 or B4574, for biennial exams in the 50–69 or in the 45–74 age groups, respectively. Risk-based strategies are abbreviated with four strings, e.g. Q5069-Q4574-T4574-A4074, that correspond to the L, ML, MH and H risk groups, respectively. A sample of the studied screening strategies is presented in Table S4 in
For each screening strategy and for the background, we measured the benefit of screening with two outcomes: the number of lives extended, LE, and the number of quality-adjusted life years gained, QALY. Because of the lack of Spanish data, the QALYs were estimated using the work of Lidgren
We used the FP rates for non-invasive and invasive tests obtained from the Cumulative Risk of False Positive (RAFP) study, which included 74 distinct radiology units in eight regions of Spain, from March 1990 to December 2006
In our model, true interval tumors correspond to those that appear between exams and were not in the pre-clinical state when the previous exam was performed. FN cases are tumors that were not detected in the previous exams due to lack of sensitivity of the screening test. We considered that all tumors in pre-clinical state in the previous exam were FN.
Screening may cause overdiagnosis when it detects tumors which would never have been diagnosed during a lifetime without screening because of the lack of progressive potential or death from other causes. To estimate overdiagnosis we made some additional assumptions. We differentiated between overdiagnosis of invasive BC and ductal carcinoma in situ (DCIS). For both types of tumors we assumed that: 1) overdiagnosis only happens when a mammographic exam is performed, 2) a woman with an overdiagnosed tumor would not die of breast cancer, and 3) QALYs and costs of treatment (initial and follow-up) for women with overdiagnosed tumors are the same as for Stage I BC.
Estimates of overdiagnosis show high variation depending on the study design and the method used
Using the incidence model described in
To estimate the impact of screening on detection of DCIS we obtained the incidence and Census data from the Girona and Tarragona Cancer Registries in the period 1983–2008. Data on mammography use was obtained, for the Girona and Tarragona provinces, from three health surveys performed in the years 1994, 2002 and 2006
Because DCIS is treated when detected, it is not possible to accurately estimate the fraction of detected DCIS that would progress to invasive disease. A review of the literature showed that between 14% and 53% of DCIS may progress to invasive cancer over a period of 10 or more years
We have adopted the perspective of the national health system and considered only direct healthcare costs. We have partitioned the estimation of costs into four parts: screening and diagnosis confirmation, initial treatment, follow-up and advanced care costs. All costs were valued in 2012 euros and both costs and outcomes have been discounted at an annual rate of 3%, according to the economic evaluation guidelines of the Spanish Ministry of Health
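As an illustration of the discounting step, the Python sketch below computes the present value of a stream of yearly amounts at a 3% annual rate. It is a minimal example, not the study's implementation: the function name `present_value` is hypothetical, and the convention that the first year is left undiscounted is an assumption, since the text does not state the base-year convention.

```python
from typing import Sequence


def present_value(yearly_amounts: Sequence[float], annual_rate: float = 0.03) -> float:
    """Discount a sequence of yearly amounts (costs or outcomes) to year 0.

    Assumption: the amount at index t occurs t years after the base year,
    so the first amount (t = 0) is not discounted.
    """
    return sum(amount / (1.0 + annual_rate) ** t
               for t, amount in enumerate(yearly_amounts))


# Example: a cost of 100 euros incurred in each of three consecutive years.
three_year_cost = present_value([100.0, 100.0, 100.0])
```

The same function applies to outcomes such as QALYs, which the text discounts at the same rate as costs.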
The costs of screening mammograms, complementary tests and administrative expenses were obtained from the Early Detection Program of
To compare the relative costs and outcomes of the different strategies, we calculated the incremental cost-effectiveness ratio (ICER). The ICER is defined as the ratio of the change in costs to the change in effects of a specific intervention compared to an alternative. The ICER indicates the additional cost of obtaining one additional unit of outcome. We obtained the cost-effectiveness frontier, also called the Pareto frontier, which contains the efficient alternatives for which no alternative policy exists that results in better effects for lower costs.
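The ICER and the dominance rule behind the frontier can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used in the study: the names `icer` and `efficient_frontier` are hypothetical, only simple dominance is applied (a strategy is dropped when another one costs no more and yields at least as much effect), and the strategies `S1` to `S3` are invented toy values. The ICER between the two uniform strategies is computed purely as an arithmetic example from the cost (million euros) and LE values reported for B5069 and B4574.

```python
from typing import List, Tuple

Strategy = Tuple[str, float, float]  # (label, cost, effect)


def icer(cost_ref: float, effect_ref: float,
         cost_alt: float, effect_alt: float) -> float:
    """Additional cost per additional unit of effect, alternative vs. reference."""
    delta_effect = effect_alt - effect_ref
    if delta_effect == 0:
        raise ValueError("ICER is undefined when both effects are equal")
    return (cost_alt - cost_ref) / delta_effect


def efficient_frontier(strategies: List[Strategy]) -> List[Strategy]:
    """Keep strategies that are not dominated (simple dominance only)."""
    # Sort by increasing cost; among equal costs, the larger effect comes first.
    ordered = sorted(strategies, key=lambda s: (s[1], -s[2]))
    frontier: List[Strategy] = []
    best_effect = float("-inf")
    for label, cost, effect in ordered:
        # A strategy is efficient only if it improves on every cheaper one.
        if effect > best_effect:
            frontier.append((label, cost, effect))
            best_effect = effect
    return frontier


# Arithmetic example with the reported values for B5069 and B4574
# (cost in million euros, effect in lives extended).
b5069_to_b4574 = icer(139.6, 201.9, 154.5, 264.7)

# Toy data (hypothetical): S2 costs more than S1 and yields less effect.
toy = [("S1", 100.0, 10.0), ("S2", 120.0, 9.0), ("S3", 130.0, 15.0)]
toy_frontier = efficient_frontier(toy)
```

In the toy data, `S2` is removed because `S1` is both cheaper and more effective, so only `S1` and `S3` remain on the frontier.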
To perform a harm-benefit analysis, we ordered the studied strategies from fewest to most adverse effects and obtained the incremental harm-benefit ratio of each strategy in relation to the previous one. We also obtained the harm-benefit frontier.
To search for optimal strategies taking into account benefit, costs and harms, we selected as reference strategies the most recommended uniform strategy in Europe, biennial exams in the 50–69 age interval (B5069), and the alternative towards which some countries are moving, biennial exams in the 45–74 age interval (B4574). Then, for each reference strategy, we obtained the intersection of the subsets that contained strategies with a benefit similar to that of the reference strategy (between 1 and 1.05 times) and with lower cost and harms in terms of FP results and overdiagnosed cases (invasive and DCIS). The resulting strategies were located at or near the cost-effectiveness and harm-benefit frontiers, with values on the x-axis near the B5069 or B4574 benefit values. We did not include the FN results in the intersection, but we assessed them in the resulting optimal subset.
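The selection rule described above amounts to a filter on four quantities. The Python sketch below illustrates it under stated assumptions: the class `Outcome` and the functions `from_percent_change` and `select_candidates` are hypothetical names, and the risk-based candidate is rebuilt from the percentage changes reported in the results table for B5069, so its absolute values are approximate.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Outcome:
    label: str
    benefit: float          # lives extended (LE)
    cost: float             # million euros
    false_positives: float
    overdiagnosis: float


def from_percent_change(label: str, ref: Outcome, d_benefit: float,
                        d_cost: float, d_fp: float, d_od: float) -> Outcome:
    """Build absolute values from percentage changes relative to a reference."""
    return Outcome(label,
                   ref.benefit * (1 + d_benefit / 100),
                   ref.cost * (1 + d_cost / 100),
                   ref.false_positives * (1 + d_fp / 100),
                   ref.overdiagnosis * (1 + d_od / 100))


def select_candidates(ref: Outcome, candidates: List[Outcome],
                      max_ratio: float = 1.05) -> List[Outcome]:
    """Keep strategies with similar benefit and lower cost, FP and overdiagnosis."""
    selected = []
    for c in candidates:
        similar_benefit = ref.benefit <= c.benefit <= max_ratio * ref.benefit
        lower_cost = c.cost < ref.cost
        lower_harm = (c.false_positives < ref.false_positives
                      and c.overdiagnosis < ref.overdiagnosis)
        if similar_benefit and lower_cost and lower_harm:
            selected.append(c)
    return selected


# Reported values for the two uniform strategies (benefit measured as LE).
b5069 = Outcome("B5069", 201.9, 139.6, 19256.3, 347.6)
b4574 = Outcome("B4574", 264.7, 154.5, 26578.5, 493.1)
# Risk-based candidate rebuilt from its reported percentage changes.
risk_based = from_percent_change("Q5074-Q5074-Q4574-A4574", b5069,
                                 0.6, -9.3, -25.1, -25.9)

selected = select_candidates(b5069, [b4574, risk_based])
```

Here B4574 is rejected because its benefit exceeds 1.05 times that of B5069 and its cost and harms are higher, while the risk-based candidate passes all four conditions.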
We have compared our results with the results of three published reviews, the Cochrane systematic review
We have compared the following summary indicators in the INCA study and the uniform B4569 strategy of our model: 1) frequencies of screen-detected and interval cancer, by age-group, 2) sensitivity of the program, defined as the ratio of the number of tumors detected in the screening exams to the number of all detected tumors, 3) distribution of true interval cases and FN, by time since last mammogram, and 4) distribution of stages at diagnosis, by type of detection (screening or symptomatic).
There is uncertainty associated with the model inputs and there is also uncertainty associated with the model structure. It is complex and computationally intensive to obtain the variance of the model estimates. Instead, we performed univariate sensitivity analyses to study the impact on our conclusions when some of the inputs were modified. First, we changed the distribution of the four risk groups, assuming that 20% of women in the L, ML, and MH groups migrated to the next higher risk group. The new risk group distribution was 31.7%, 42.1%, 21.1% and 5.1%, for L, ML, MH and H, respectively. Second, we changed the amount of overdiagnosis of invasive tumors to 0%, 5% and 25%. Third, we changed the excess of DCIS to 0.1 and 0.26 per 1,000 mammograms. Fourth, we tested the effect of changing the costs of cancer treatment to two-fold and five-fold the costs of the main analysis. Fifth, we assessed the effect on QALYs of changes in the disutility of a false-positive result. We used zero and two times the disutility of the main analysis.
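The first sensitivity scenario can be reproduced with simple arithmetic. The Python sketch below is an illustration only: the function name `migrate_up` is hypothetical, and it assumes that the 20% is taken from each group's original share (all migrations happen simultaneously). Under that assumption the result agrees with the reported distribution to within about 0.1 percentage points; the small differences presumably reflect rounding of the published shares.

```python
from typing import Dict


def migrate_up(distribution: Dict[str, float], fraction: float = 0.20) -> Dict[str, float]:
    """Move a fraction of each group to the next higher risk group.

    `distribution` must be ordered from lowest to highest risk. The
    highest group only receives women; it has no higher group to move to.
    """
    groups = list(distribution)
    shifted = dict(distribution)
    for lower, higher in zip(groups, groups[1:]):
        moved = distribution[lower] * fraction  # based on the original share
        shifted[lower] -= moved
        shifted[higher] += moved
    return shifted


# Risk group shares (%) used in the main analysis.
baseline = {"L": 39.6, "ML": 42.8, "MH": 15.6, "H": 2.0}
scenario = migrate_up(baseline)
```

Because women are only moved between groups, the shares in `scenario` still add up to 100%.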
All the input data will be available to researchers upon request.
Benefits, harms, and costs of each screening strategy were obtained as a function of the risk-groups' incidence and the screening characteristics (periodicity and age-interval of exams by risk group).
Effect measured in lives extended. Dots represent specific screening strategies.
Effect measured in quality-adjusted life years. Dots represent specific screening strategies.
A) Effect measured in lives extended (LE)
Schedule | LE | Cost (×10⁶ €) | False positive | Overdiagnosis | False negative |
Uniform B5069 | 201.9 | 139.6 | 19,256.3 | 347.6 | 223.9 |
Risk-based strategies (percentage of change, compared to fixed B5069) | | | | | |
Q5074-Q5074-Q4574-A4574 | 0.6 | −9.3 | −25.1 | −25.9 | 22.7 |
Q5074-Q5074-T5074-A5074 | 3.8 | −8.9 | −25.1 | −20.6 | 20.8 |
Uniform B4574 | 264.7 | 154.5 | 26,578.5 | 493.1 | 298.2 |
Risk-based strategies (percentage of change, compared to fixed B4574) | | | | | |
T5069-B5074-A5074-A5074 | 0.5 | −7.7 | −23.0 | −12.4 | −21.6 |
T5074-T5074-A4574-A4574 | 5.0 | −6.8 | −21.9 | −10.1 | −9.7 |
B) Effect measured in quality-adjusted life years (QALY)
Schedule | QALY | Cost (×10⁶ €) | False positive | Overdiagnosis | False negative |
Uniform B5069 | 2,333.3 | 139.6 | 19,256.3 | 347.6 | 223.9 |
Risk-based strategies (percentage of change, compared to fixed B5069) | | | | | |
Q5069-Q4574-Q4574-A4574 | 0.3 | −8.3 | −18.3 | −25.9 | 24.9 |
Q5069-Q4574-Q4574-A4074 | 1.5 | −8.0 | −17.2 | −25.0 | 26.2 |
Uniform B4574 | 2,848.8 | 154.5 | 26,578.5 | 493.1 | 298.2 |
Risk-based strategies (percentage of change, compared to fixed B4574) | | | | | |
Q5074-Q5074-A4074-A4074 | 0.4 | −9.2 | −25.3 | −23.4 | −10.5 |
Q4574-Q4574-A4574-A4074 | 4.0 | −9.2 | −20.4 | −23.0 | −7.2 |
Data correspond to a cohort of 100,000 women at birth assessed in the age-interval 40–79 years.
All the absolute values have been discounted at an annual rate of 3%.
False positive includes both non-invasive and invasive procedures.
Overdiagnosis of invasive and DCIS cases.
Periodicity and age-interval for Low, Medium-Low, Medium-High and High risk groups, respectively.
Exams periodicities: A = annual, B = biennial, T = triennial, Q = quinquennial. The first two numbers refer to the age at starting the exams and the last two numbers refer to the age at the last exam.
We have analyzed the incremental ratios of FN results per unit of benefit separately from the other cost-effectiveness or harm-benefit ratios because the pattern of changes in FN results is affected differently by the periodicity of the exams and the age-interval of screening. For instance, moving from uniform B5069 to uniform A5069 reduces the amount of FN by 29%, but moving from uniform B5069 to uniform B4574 increases the amount of FN by 33%.
The last column of
When all the risk-based strategies that are at or near the Pareto frontier are considered and benefit is measured as LE, the risk-based strategies that provide a benefit similar to that of the B5069 strategy are characterized by quinquennial periodicity for the L and ML, triennial for the MH, and triennial, biennial or annual periodicities for the H risk groups. When benefit is measured as QALYs, the risk-based strategies are characterized by quinquennial periodicities for the L, ML and MH and annual for the H risk groups. When the standard of comparison is the uniform strategy B4574, the risk-based strategies that provide similar benefits, either LE or QALY, are characterized by quinquennial periodicity for the L, triennial for the ML, and annual periodicities for the MH and the H risk groups.
Figures S4 and S5 in section G of
When we assumed a scenario without screening, for the age interval 0 to 74 years, we obtained a cumulative incidence of BC equal to 5.8% and a mortality rate from BC equal to 1.5%. These values were consistent with the literature
Time since last mammogram (months) | Interval cancer, N | Interval cancer, % | True interval and minimal signs, N | True interval and minimal signs, % | False negative and occult tumors, N | False negative and occult tumors, % |
The INCA study | | | | | | |
0–11 | 420 | 32.4 | 142 | 26.2 | 117 | 38.7 |
12–23 | 876 | 67.6 | 399 | 73.8 | 185 | 61.3 |
Probabilistic model, biennial screening | | | | | | |
0–11 | 529 | 35.3 | 287 | 26.8 | 242 | 56.5 |
12–23 | 971 | 64.7 | 785 | 73.2 | 186 | 43.5 |
The total number of interval cases in the INCA study is higher than the sum of true interval, minimal signs, FN and occult cases, because only 60.3% of all the interval cases were reviewed.
Outcome | Our study, B5069 | Our study, B4574 | Independent UK Panel on Breast Cancer Screening review | Cochrane systematic review | Euroscreen review |
Mortality reduction (%) | 14.4 | 19.6 | 20.0 | 15.0 | 23.0–30.0 |
Deaths averted | 4.3 | 5.8 | 4.3 | 0.5 | 7–9 |
Overdiagnosis | 5.5 | 8.1 | 12.9 | 5.0 | 4 |
Non-invasive FP | 265.5 | 347.8 | - | >100 | 200 |
Invasive FP | 24.9 | 28.7 | - | - | 30 |
Number needed to screen to extend 1 life | 233 | 172 | 235 | 2000 | 111–143 |
Benefits and harms per 1,000 women screened.
time horizon 40–79 years.
10 years of follow-up.
time horizon 50–79 years.
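The last row of the table is related to the "Deaths averted" row by simple arithmetic: with benefits expressed per 1,000 women screened, the number needed to screen is 1,000 divided by the deaths averted. The Python sketch below illustrates this; the function name `number_needed_to_screen` is hypothetical. For the Independent UK Panel review the table reports 235 rather than the 233 obtained here, presumably because the published deaths-averted figure is rounded.

```python
def number_needed_to_screen(deaths_averted_per_1000: float) -> float:
    """Women who must be screened to extend one life, given deaths averted per 1,000."""
    if deaths_averted_per_1000 <= 0:
        raise ValueError("deaths averted must be positive")
    return 1000.0 / deaths_averted_per_1000


# Values taken from the "Deaths averted" row of the table.
nns_b5069 = round(number_needed_to_screen(4.3))
nns_b4574 = round(number_needed_to_screen(5.8))
nns_cochrane = round(number_needed_to_screen(0.5))
# Euroscreen reports a range of 7 to 9 deaths averted.
nns_euroscreen = (round(number_needed_to_screen(9)),
                  round(number_needed_to_screen(7)))
```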
Figures S6 and S7 in
Tables S14 and S15 in
Our analysis aimed to be a global assessment of the impact that a new paradigm of screening would have on benefit, costs and harms rather than a detailed guideline of how personalized screening should be done.
Using probabilistic models, we have found that risk-based screening strategies are more efficient and have lower harm-benefit ratios than uniform strategies. If, instead of screening biennially all women 50 to 69 years old, we combined quinquennial, triennial and annual exam periodicities for women at L or ML, MH, and H risk, respectively, in the age interval 50 to 74, we would avert the same number of deaths. Similarly, strategies that combine quinquennial exams for women at L or ML risk with annual exams for women at MH or H risk, in the age interval 45 to 74, result in a gain in QALYs similar to that of the uniform biennial strategy in the age interval 45 to 74. The important result, however, is that in both cases the risk-based strategies would result in remarkable reductions of costs, FP results and overdiagnosis.
It is important to note that the risk-based screening strategy Q5074-Q5074-Q4574-A4574 has similar benefits and lower costs and harms than the uniform B5069. This does not mean that Q5074-Q5074-Q4574-A4574 should be recommended, only that the same benefits as B5069 can be achieved more efficiently and safely. In fact, in terms of LE, Q5074-Q5074-T5074-A5074 improves on the uniform B5069 and has similar costs and harms to Q5074-Q5074-Q4574-A4574. The cost-effectiveness and harm-benefit analyses show the trade-offs when moving along the Pareto frontier. Drawing horizontal lines at the level of uniform strategies, one can estimate the improvement in benefit for a specific cost or harm. Drawing vertical lines allows estimation of the reduction in costs or harms for a specific benefit.
Some recent works have proposed personalized recommendations for BC screening based on cost-effectiveness or cost-utility analyses
van Ravesteyn
Ayer
We have used a very detailed model that allowed us to thoroughly assess the cost-effectiveness and harm-benefit of 2,625 different screening scenarios, either risk-based or not. However, our study has several limitations.
First, our model relies on data and assumptions that may not be correct. When available, we have used Catalan or Spanish data from population-based registries or BC screening programs. If the input data were not available at the region or country level, we used data that the Cancer Intervention and Surveillance Modeling Network (CISNET) had prepared for BC mortality modeling research groups in the USA, like the distribution of disease stages at diagnosis
Second, we have assumed that BC risk influenced only the incidence of the disease and not the distribution of stages at diagnosis, the sensitivity and specificity of mammography, the sojourn time in the preclinical state or the mortality from other causes. It could happen that tumors for women at MH or H risk groups had a less favorable stage distribution at diagnosis and the benefit of screening for these groups was lower than estimated. Also, it is known that mammography performance is associated with the considered risk factors
Third, we have assumed that there are no changes in the risk factors after the age at which screening exams start. We considered that the proportion of women in the risk groups remained constant over time and it was the overall sample estimate for the BCSC data. This assumption may not be correct, because as women get older breast density tends to decrease and personal history of biopsy and family history of breast cancer have more chances to be present. We think that our results are robust to changes in the risk group weights over time, as the sensitivity analysis has shown to be the case for changes in the risk group distributions. However, when considering personalized screening, BC risk should be updated when new information on risk factors or their trends is available.
Fourth, our model used age-specific sensitivities of the screening exam that correspond to a more prevalent use of film mammography than digital mammography. We did not assess the impact of changing the mammography performance in this study. van Ravesteyn
Fifth, our probabilistic model assumes that screening results in a stage-shift at BC diagnosis, but does not consider DCIS as one of the BC stages. Therefore, the fraction of DCIS tumors that would have progressed and been diagnosed as invasive in the absence of screening is redistributed under screening to more favorable stages at diagnosis, but not to DCIS. This may have produced an underestimation of the benefit of the screening strategies, both uniform and risk-based. If this bias affected uniform and risk-based strategies similarly, the cost-effectiveness and harm-benefit analyses would remain valid.
We agree with Mandelblatt
In conclusion, risk-based screening strategies seem to be more efficient and have better harm-benefit ratios than the standard uniform strategies. We have proposed a reduced number of risk-based screening strategies that combine quinquennial or triennial exams for women in low or moderate-low risk groups and annual exams for women in the moderate-high or high risk groups, for the consideration of researchers, decision makers and policy planners. Now, it is necessary to develop accurate measures of individual risk of BC and to work on how to organize risk-based screening programs.
We thank Sandra Lee, Marvin Zelen and Hui Huang for their support and for providing the initial code of the probabilistic models used, Jordi Blanch for data analysis of the INCA study and revising the manuscript and Montserrat Martinez-Alonso for developing the breast cancer incidence and mortality models and revising the manuscript. We are also grateful to JP Glutting for review and editing, to the Cumulative False Positive Research Group (RAFP) project researchers, and to Rebecca Hubbard and two anonymous reviewers for their valuable comments on a preliminary version of the manuscript.