Flexible transition probability model for assessing cost-effectiveness of breast cancer screening extension to include women aged 45-49 and 70-74

Breast cancer is the most common cancer among Western women. Fortunately, organized screening has reduced breast cancer mortality. A new recommendation by the European Union suggests extending mammography screening from women aged 50–69 to women aged 45–74. However, before extending screening to new age groups, it is essential to carefully consider the benefits and costs locally, as circumstances vary between regions and countries. We propose a new approach to assess the cost-effectiveness of breast cancer screening for a long-ongoing program with incomplete historical screening data. The new model is called the flexible stage distribution model. It is based on estimating the breast cancer incidence and the stage distributions of breast cancer cases under different screening strategies. The model parameters, for each considered age group, include incidence rates with and without screening, the probability distribution over stages, survival by stage, and treatment costs. Of these parameters, survival rates and treatment costs are estimated from the available data, while incidence rates and stage distributions are modelled for screening policies for which data are not available. In the model, an ongoing screening strategy may be used as a baseline, and other screening strategies may be incorporated through changes in the incidence rates. The model is flexible in that it allows different approaches to be applied for estimating the altered stage distributions. We apply the proposed flexible stage distribution model to assess the incremental cost of extending the current biennial breast cancer screening to younger and older target ages in Finland.


Introduction
Breast cancer (BC) is the most common cancer among Western women [1]. Fortunately, mortality due to BC can be, and has been, successfully reduced by organized screening [2][3][4]. Subsequently, the EU has recommended mammography screening for women aged 50-69, and this has been adopted widely in Europe [5][6][7]. A new recommendation suggests widening screening to women aged 45-74 [6]. However, prior to any widening, it is essential to carefully consider the benefits and costs locally, as circumstances vary between regions and countries. The assessment of benefits (breast cancer mortality reduction or life-years gained, LYG) against costs should take into account country-specific conditions and support an optimal use of available health care resources. Cost-effectiveness modelling has been recommended to support policy-making for a new or a recently started screening program [8]. In a cost-effectiveness analysis, one calculates the additional cost per life year gained under one or more interventions. Usually, in cost-effectiveness modelling, alternative screening strategies are assumed to take place in the future over a cohort's lifetime, and no screening is used as the reference. From a modelling perspective, one would ideally rely on recent data in which individuals were randomly assigned to different screening strategy groups, including a no-screening group. In practice this is seldom the case, and lack of data may severely complicate cost-effectiveness modelling. The registration of BC screening may also have been incomplete, especially if screening started decades ago. Moreover, both incidence and treatment guidelines have changed significantly. Thus, lack of suitable data makes standard cost-effectiveness modelling unreliable, and the modelling is particularly challenging if a screening program has been ongoing for decades and its previous performance is not fully known.
The reliability of future predictions depends on the past data on which the predictions are based. If the underlying data are incomplete or heterogeneous, one cannot place high confidence in future predictions either. This is the situation when a nationwide screening program has been running for decades. No screening is then not an appropriate reference, as there are no representative data on the situation without screening. For example, environmental and lifestyle factors affect cancer incidence, which is why historical data do not provide reliable information about the current situation. Instead, it is reasonable to compare alternative screening strategies to the current screening strategy, which is exactly what we do in this article.
We propose a new approach to assess the cost-effectiveness of BC screening for a long-ongoing program with incomplete historical screening data. The new model is called the flexible stage distribution model. It is based on estimating the stage distributions of BC cases under different screening strategies. In the model, an ongoing screening strategy may be used as a baseline, and other screening strategies may be incorporated by changing the incidence rates and stage distributions. The model is flexible in that it allows different approaches to be applied for estimating the altered stage distributions affected by screening. If randomized data are available, one may rely on them. If randomized data are not available, the altered stage distributions may be estimated by extrapolating the stage distributions of the youngest and oldest screened/non-screened age groups. Moreover, the model allows the use of either the TNM (tumor (T), nodes (N), and metastases (M)) classification or some other cancer classification, depending on the availability of data.
We apply the proposed flexible stage distribution model for assessing incremental cost of extending the current biennial breast cancer screening to younger and older target ages in Finland.

Ethical considerations
A permit for this register-based study was granted by the Finnish Social and Health Data Permit Authority Findata (THL/504/14.06.00/2020). The permit covers secondary use of the anonymized registry data without additional consents. Only fully anonymized registry data were used in the analysis, and thus the requirement for separate informed consent was waived.

Breast cancer screening in Finland
In Finland, nationwide organized screening started in 1992 and was targeted biennially at women aged 50-59. Individual municipalities are responsible for offering screening to their permanent residents and have often offered breast screening to wider target ages as well [9]. Screening has therefore been ongoing with target ages that vary over time and between municipalities. The current nationwide target age, 50-69 years, was introduced gradually and adopted fully in 2017. Participation in screening is currently about 82% and varies very little within the target age [10]. Throughout the article, we assume that the participation rate is the same for all age groups. Registration of screening data has been complete since 2000.

Flexible transition probability model
The proposed flexible transition probability model is based on modeling the effect of screening on cancer incidence and its stage distributions at the time of the first diagnosis. This is done separately for different age groups. Costs of treatment and survival depend on the stage distribution and the age group. The model was built in collaboration with mathematicians and context experts. The assumptions in the analysis are based on discussions with context experts from Finnish Cancer Registry.
Age groups are indexed by $j = 1, 2, \ldots, J$, and a screening round takes place at every time instance $j$. In our study, we consider biennial screening, and the first age group $j = 1$ corresponds to 46-47-year-old females, the second to 48-49-year-old females, and the last one to 98-99-year-old females. For simplicity, we assume that no one survives beyond 100 years. The age groups and screening intervals in our study are chosen to match available data and current policies in Finland. We consider different policies $h = [h(1), h(2), \ldots, h(J)] \in \{S, NS\}^J$, where $S$ corresponds to screening and $NS$ to no screening. For example, with two age groups, i.e. $J = 2$, the policy $h = [S, S]$ corresponds to screening both age groups, $h = [NS, S]$ to screening the older age group only, and so on. In each screening round, individuals belong to one of the mutually exclusive categories: either having a breast cancer of stage $k$ or no cancer. These are indexed by $k = -1, 0, \ldots, K$, with $k = -1$ corresponding to no breast cancer. In our study, we have $K = 4$, where the stages correspond to 0-Unknown, 1-Localized, 2-Non localized/Regional lymph nodes metastases, 3-Metastasized farther than regional lymph nodes or invades adjacent tissues, and 4-In situ carcinoma. Alternatively, one could use, for example, TNM classifications here. That was not possible in our study, as the Finnish Cancer Registry does not register TNM classifications. The observed state of an individual $i$ of age $j$ is denoted by $X^i_{j,h}$. Note that the observed state depends on the policy $h$ through the value $h(j)$, which determines whether the age group $j$ is screened or not. We denote by $\mu^i_{j,h} = \mu_{j,h}$ the state distribution of $X$ observed at screening. That is, $\mu_{j,h}(k) = P(X^i_{j,h} = k)$. Note that here it is assumed that $\mu$ does not depend on the particular individual (only on the age), but it does depend on the policy through the value $h(j)$, which represents the choice of whether age group $j$ is screened or not. For this reason, we omit the superscript $i$ in the notation and simply write, e.g., $X_{j,h}$ whenever no confusion can arise.
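The policy space described above can be illustrated with a short Python sketch. The helper names `all_policies` and `policy_from_age_range` and the string labels are hypothetical and introduced only for illustration; the age grid follows the two-year groups 46-47, ..., 98-99 given in the text.

```python
from itertools import product

S, NS = "S", "NS"  # screening / no screening (labels are illustrative)


def all_policies(J):
    """Enumerate every policy h in {S, NS}^J as a tuple of length J."""
    return list(product((S, NS), repeat=J))


def policy_from_age_range(age_groups, first_age, last_age):
    """Screen exactly the age groups lying inside [first_age, last_age]."""
    return tuple(
        S if first_age <= lo and hi <= last_age else NS
        for lo, hi in age_groups
    )


# Two-year age groups 46-47, 48-49, ..., 98-99, as in the text.
age_groups = [(a, a + 1) for a in range(46, 100, 2)]

# The current Finnish policy (ages 50-69) expressed on this grid.
current = policy_from_age_range(age_groups, 50, 69)

if __name__ == "__main__":
    print(len(age_groups), "age groups")
    print(all_policies(2))
    print(current)
```

Enumerating all of $\{S, NS\}^J$ is only practical for small $J$; in an application one would typically build the few policies of interest directly, as `policy_from_age_range` does.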
Remark 1. For our purposes, we model the conditional distribution $P(X^i_{j,h} = k \mid X^i_{j,h} \neq -1)$ and the incidence rates separately. That is, we model separately the probabilities of the given stages conditioned on breast cancer being diagnosed, and the incidence rate $P(X^i_{j,h} \neq -1)$. The connection to the actual stage distribution $\mu_{j,h}$ is simply
$$\mu_{j,h}(k) = P(X^i_{j,h} = k \mid X^i_{j,h} \neq -1)\, P(X^i_{j,h} \neq -1), \qquad k \neq -1.$$
Once an individual $i$ of age $j$ is diagnosed with stage $k$, the time to death is denoted by $T^i_{j,k}$ and its distribution, which does not depend on $i$, is denoted by $\lambda_{j,k}$. That is, we have $\lambda_{j,k}(t) = P(T^i_{j,k} = t)$. Hence the expected number of years left, once the individual $i$ of age $j$ is at stage $k$, is
$$E T_{j,k} = \sum_{t} t\, \lambda_{j,k}(t). \qquad (1)$$
We are interested in the effect of screening only, and individuals are censored (removed) from the population once they either die (for any reason) or are diagnosed with breast cancer. We model a cohort of $N_0 = 100000$ individuals throughout their life span, starting at the first (possible) screening age. The dynamics of the screening population $N_j$ are hence given by [displayed equation not recoverable from the source]. Here $T_j$ denotes the time to death of an individual at age $j$, regardless of the stage. That is, at each screening step (age $j$) we invite $N_j$ individuals to screening, and the number $N_{j+1}$ invited to the next screening consists of the $N_j$ individuals from which we have removed those who died during the two-year screening interval, with proportion $P(T_j \in \{0, 1\})$, and those who survived at least two years until the next screening but were diagnosed, i.e. whose stage satisfies $X_{j,h} \neq -1$. The total number of life years left, measured at the beginning, of the entire population is given by the following result.
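As a minimal sketch of the two relations in Remark 1, the following Python functions compute the expected remaining years from a discrete time-to-death distribution and combine an incidence rate with a conditional stage distribution into the full state distribution. The function names and all numerical values are hypothetical and chosen only for illustration; they are not estimates from the registry data.

```python
def expected_years(lam):
    """E[T_{j,k}] = sum_t t * lambda_{j,k}(t), where lam[t] = P(T = t)."""
    if abs(sum(lam) - 1.0) > 1e-9:
        raise ValueError("time-to-death probabilities must sum to 1")
    return sum(t * p for t, p in enumerate(lam))


def stage_distribution(incidence, conditional):
    """Build mu_{j,h} from the incidence P(X != -1) and the conditional
    stage probabilities P(X = k | X != -1).

    Returns a dict mapping k to mu(k), with k = -1 meaning no breast cancer.
    """
    if abs(sum(conditional.values()) - 1.0) > 1e-9:
        raise ValueError("conditional stage probabilities must sum to 1")
    mu = {-1: 1.0 - incidence}
    for k, p in conditional.items():
        mu[k] = incidence * p
    return mu


if __name__ == "__main__":
    # Hypothetical toy inputs, not taken from the study data.
    lam = [0.1, 0.2, 0.3, 0.4]
    cond = {0: 0.10, 1: 0.50, 2: 0.25, 3: 0.05, 4: 0.10}
    print(expected_years(lam))
    print(stage_distribution(0.004, cond))
```

Keeping incidence and conditional stage probabilities as separate inputs mirrors the modelling choice in the remark: a change of policy can then alter either one without touching the other.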

Proposition 2. The total number of life years left $T_h$ with a given policy $h$ is given by
$$T_h = \sum_{j=1}^{J} \Big( \lambda_{j,-1}(1)\, \mu_{j,h}(-1)\, N_j + \sum_{k=0}^{K} \mu_{j,h}(k)\, E T_{j,k}\, N_j + 2 N_{j+1} \Big),$$
where $E T_{j,k}$ is given by (1).
Proof of Proposition 2. On each screening round, we invite $N_j$ individuals to screening. The proportion of those who are not diagnosed with breast cancer is $\mu_{j,h}(-1)$, and the proportion of individuals of age $j$ at stage $k = -1$ (no breast cancer) who live only one year is $\lambda_{j,-1}(1)$. These individuals have one remaining year, contributing the term $\lambda_{j,-1}(1)\, \mu_{j,h}(-1)\, N_j$. Similarly, the individuals diagnosed with breast cancer ($k \neq -1$) have expected remaining years $E T_{j,k}$, and the proportion of such individuals is $\mu_{j,h}(k)$. Summing over the different stages $k \neq -1$ contributes the term $\sum_{k=0}^{K} \mu_{j,h}(k)\, E T_{j,k}\, N_j$. Finally, the number of individuals who continue to the next screening round $j + 1$ is $N_{j+1}$, and at age $j + 1$ each of them has lived two more years, contributing $2 N_{j+1}$. Summing these contributions over the screening rounds (i.e., over the age groups $j$) gives the claimed expression. This completes the proof.
Consider next the related cancer costs. The cost of treatment and death of an individual $i$ of age $j$ depends on the stage $k$ (given by the variable $X^i_{j,h}$), the time to death $t$ (given by the variable $T^i_{j,k}$), the age $j$, and the cause of death. We denote by $d = 1, 2, \ldots, D$ the different causes of death, and for an individual $i$ of age $j$ in stage $k$, the random variable $D^i_{j,k} \in \{1, \ldots, D\}$ determines the cause of death. In our study, we use $D = 2$ and consider deaths due to breast cancer ($d = 1$) or due to other causes ($d = 2$). Given the quadruple $(j, k, t, d)$, the average cost $\bar{C}_{j,k,t,d}$ is considered constant and consists solely of treatment-related costs. That is, if $k = -1$, there are no costs, as no treatment is required. Hence, the treatment and death cost $C^i_{j,k}$ of an individual $i$ of age $j$ and at stage $k$ is given by
$$C^i_{j,k} = \sum_{t} \sum_{d=1}^{D} \bar{C}_{j,k,t,d}\, 1_{\{T^i_{j,k} = t,\, D^i_{j,k} = d\}},$$
where $1_{\{T^i_{j,k} = t,\, D^i_{j,k} = d\}}$ denotes the indicator taking value 1 if the individual $i$ at age $j$ lives $t$ years after being diagnosed with stage $k$ and dies of cause $d$.
We denote by $\pi_{j,k}$ the joint distribution of $T^i_{j,k}$ and $D^i_{j,k}$, which is assumed to be independent of the chosen individual (of age $j$ and at stage $k$). That is, we have
$$\pi_{j,k}(t, d) = P(T^i_{j,k} = t,\, D^i_{j,k} = d),$$
giving the probability that an individual at age $j$ and stage $k$ lives exactly $t$ years and dies of cause $d$. Obviously, we have the connection
$$\lambda_{j,k}(t) = \sum_{d=1}^{D} \pi_{j,k}(t, d).$$
We also denote by $\bar{C}_{h(j)}$ the average screening cost per individual, given the policy $h$ determining whether the group $j$ is screened or not. This gives us the following expected costs.
Proposition 3. The total expected costs related to age group $j$ are given by
$$N_j \Big( \bar{C}_{h(j)} + \sum_{k=0}^{K} \mu_{j,h}(k) \sum_{t} \sum_{d=1}^{D} \bar{C}_{j,k,t,d}\, \pi_{j,k}(t, d) \Big), \qquad (2)$$
and the total expected costs during the entire follow-up period are given by
$$\sum_{j=1}^{J} N_j \Big( \bar{C}_{h(j)} + \sum_{k=0}^{K} \mu_{j,h}(k) \sum_{t} \sum_{d=1}^{D} \bar{C}_{j,k,t,d}\, \pi_{j,k}(t, d) \Big). \qquad (3)$$
Proof of Proposition 3. The expected costs related to an individual $i$ of age $j$ and diagnosed with stage $k$ are given by
$$E C^i_{j,k} = \sum_{t} \sum_{d=1}^{D} \bar{C}_{j,k,t,d}\, \pi_{j,k}(t, d).$$
Hence the expected costs of the individual $i$ of age $j$ are obtained by conditioning on the stage $k$, leading to
$$\sum_{k=0}^{K} \mu_{j,h}(k) \sum_{t} \sum_{d=1}^{D} \bar{C}_{j,k,t,d}\, \pi_{j,k}(t, d).$$
Adding the average screening cost $\bar{C}_{h(j)}$ and multiplying by the number $N_j$ of individuals in age group $j$ leads to (2), from which the total costs (3) follow by summing over the age groups.
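The expected-cost expressions (2) and (3) can be sketched in a few lines of Python. The function names, the dictionary-based data layout, and all numbers in the example are assumptions made for illustration only; in particular, costs and joint probabilities are keyed by the pair (years lived t, cause of death d).

```python
def expected_treatment_cost(cost, pi):
    """Sum over (t, d) of Cbar_{j,k,t,d} * pi_{j,k}(t, d) for one pair (j, k).

    Both arguments are dicts keyed by (t, d).
    """
    return sum(cost[td] * p for td, p in pi.items())


def age_group_cost(N_j, screening_cost, mu, cost_by_stage, pi_by_stage):
    """Expected cost of one age group, following equation (2).

    mu maps stage k to mu_{j,h}(k); stages without an entry in
    cost_by_stage (such as k = -1, no cancer) incur no treatment cost.
    """
    treatment = sum(
        mu[k] * expected_treatment_cost(cost_by_stage[k], pi_by_stage[k])
        for k in cost_by_stage
    )
    return N_j * (screening_cost + treatment)


def total_cost(age_group_inputs):
    """Total expected cost, following equation (3): sum over age groups."""
    return sum(age_group_cost(**kwargs) for kwargs in age_group_inputs)


if __name__ == "__main__":
    # Hypothetical toy inputs: one cancer stage, two possible outcomes.
    group = dict(
        N_j=1000,
        screening_cost=30.0,
        mu={-1: 0.99, 1: 0.01},
        cost_by_stage={1: {(1, 1): 20000.0, (2, 2): 10000.0}},
        pi_by_stage={1: {(1, 1): 0.5, (2, 2): 0.5}},
    )
    print(age_group_cost(**group))
    print(total_cost([group, group]))
```

For an unscreened age group one would pass a screening cost of zero, so the same function covers both values of $h(j)$.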

Data and parameter estimation
Our analysis relies on registry data from the years 2000-2018. The data were created by linking the individual breast cancer data of Finnish women from the Finnish Cancer Registry (FCR data) with the breast cancer screening data from the Mass Screening Registry, a part of the FCR. We restricted the data to diagnoses of first primary cancer, invasive and in situ carcinomas of the breast (ICD-10 C50 & D05). We also restricted the data to subsequent screening rounds for those aged at least 60 years and therefore excluded 4.6% of the women (N = 4226). After this exclusion we were able to assess the effect of a natural, long-term (steady-state) screening on the stage distribution. In total, 88607 women were included in the analysis, together with their cause of death (breast cancer, other cause). In the data, the cancer stages used by the FCR were coded further into five separate classes: stage 0-Unknown, 1-Localized, 2-Non localized/Regional lymph nodes metastases, 3-Metastasized farther than regional lymph nodes or invades adjacent tissues, and 4-In situ carcinoma. The corresponding overall stage distributions were 9.6%, 52.2%, 25.4%, 0.85%, and 11.9%. Age-group-specific incidence rates of invasive breast cancer were extracted from NordCan [11] with some adjustments. For example, the incidence rate for 50-51-year-old females, the youngest screened age group in the current screening policy, was obtained from the Finnish Cancer Registry data. Note that NordCan provides data only on invasive cancer, whereas the Finnish Cancer Registry also collects data on the incidence of in situ carcinomas. The total age-specific incidence rates for our analysis were obtained by combining data from NordCan and the Finnish Cancer Registry. Stage-specific survival rates were computed separately for all the age groups. We followed all age groups until the age of 99.
We thus assumed that after 18 years of follow-up, patients did not have excess mortality due to their breast cancer and would survive as the general female population in 2019. General population mortality rates were obtained from Statistics Finland.
The costs of screening and the age- and stage-specific costs of treatment in specialized medical care were obtained from calculations for Lehtinen et al. (2019) [12]. In the data analysis, the unit cost of screening is 30 euros per invitee. The age- and stage-specific treatment costs of breast cancer for the first year after diagnosis, $C_1$, are displayed in Table 1. Those for years 2-5 after diagnosis, $C_2$, are displayed in Table 2. Those for the last year before breast cancer death, $C_3$, are displayed in Table 3. If the patient dies within $n$ years after the breast cancer diagnosis for some reason other than breast cancer, the overall treatment costs are equal to: [formula missing in the source]. If the patient dies from breast cancer within $n$ years after the breast cancer diagnosis, the overall treatment costs are equal to: $C_1 + (n - 2) C_2 + C_3$ for $n \in \{3, 4, 5\}$. Note that our approach to treatment costs is very conservative, as in general breast cancer treatment and follow-up last from five to ten years.
To assess the effect of extending screening to the younger age groups, we extrapolated from the closest screened age groups. That is, we set the stage distribution of the 46-47-year-old females equal to that of the 50-51-year-old females under the current policy, the stage distribution of the 48-49-year-old females equal to that of the 52-53-year-old females under the current policy, and the stage distribution of the 50-51-year-old females equal to that of the 52-53-year-old females under the current policy. To model the effect of the first screening, the incidence rate of the 46-47-year-old females was increased by 28%, the incidence rate of the 48-49-year-old females was increased by 24.7%, and the incidence rate of the 50-51-year-old females was decreased by 11.9%.
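The extrapolation for the younger age groups can be sketched as a simple remapping. In the following Python snippet, the function name, the string age-group labels, and all input values are hypothetical; only the mapping between age groups and the percentage changes (28%, 24.7%, 11.9%) are taken from the description above.

```python
def extend_to_younger(stage_dist, incidence):
    """Return modified copies of the stage distributions and incidence rates
    for the policy that starts screening at age 46.

    Both arguments are dicts keyed by age-group label, e.g. "46-47".
    The inputs themselves are left unchanged.
    """
    new_dist = dict(stage_dist)
    # Borrow stage distributions from the closest screened age groups.
    new_dist["46-47"] = stage_dist["50-51"]
    new_dist["48-49"] = stage_dist["52-53"]
    new_dist["50-51"] = stage_dist["52-53"]

    new_inc = dict(incidence)
    # Effect of the first screening round on incidence.
    new_inc["46-47"] = incidence["46-47"] * 1.28
    new_inc["48-49"] = incidence["48-49"] * 1.247
    new_inc["50-51"] = incidence["50-51"] * (1 - 0.119)
    return new_dist, new_inc


if __name__ == "__main__":
    # Hypothetical placeholder inputs, not registry estimates.
    dist = {"46-47": "A", "48-49": "B", "50-51": "C", "52-53": "D"}
    inc = {"46-47": 0.0030, "48-49": 0.0035, "50-51": 0.0050, "52-53": 0.0045}
    print(extend_to_younger(dist, inc))
```

Reading all borrowed distributions from the original `stage_dist`, rather than from the partially updated copy, keeps the three assignments independent of their order.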
When assessing the effect of extending screening to the older age groups, we assumed that the incidence with screening among elderly women would follow the steady-state pattern observed in the neighbouring country Sweden, where screening continues until the age of 74. We shifted the decline in incidence accordingly. Incidence rates in our model, under different policies, are displayed in Fig 1. The stage distributions of the age groups older than 69 years were changed as follows. The first two groups are assigned the same stage distribution as the 68-69-year-old females, and each older group is then assigned the stage distribution of the preceding groups.
In our results, we compare the expected difference in costs per expected life year gained. That is, we compare
$$\frac{\text{expected total costs under } h_i - \text{expected total costs under } h_1}{T_{h_i} - T_{h_1}},$$
where $h_1$ is the current screening policy and $h_i$, $i = 2, 3, 4$, is the modelled screening policy. A benefit of our model is that modelling of changes in the screening policy is affecting only the
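The comparison of expected cost difference per expected life year gained can be written as a small helper. This is a minimal Python sketch; the function name `icer` and the numbers in the example are hypothetical and are not results from the study.

```python
def icer(cost_new, cost_ref, life_years_new, life_years_ref):
    """Incremental cost per life year gained of a new policy
    relative to a reference policy (here, the current policy)."""
    delta_cost = cost_new - cost_ref
    delta_life_years = life_years_new - life_years_ref
    if delta_life_years == 0:
        raise ValueError("no difference in life years; ratio is undefined")
    return delta_cost / delta_life_years


if __name__ == "__main__":
    # Hypothetical totals for a cohort, for illustration only.
    print(icer(cost_new=1_050_000, cost_ref=1_000_000,
               life_years_new=2_000_050, life_years_ref=2_000_000))
```

A negative ratio needs care in interpretation: it can arise either from lower costs with more life years, or from higher costs with fewer life years, so the signs of the two differences should be inspected separately.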

Sensitivity analysis
In addition to the main analysis, we conducted a sensitivity analysis in order to assess the effect of the modelled incidence rates, the estimated costs, and the modelled stage distributions. The sensitivity analysis was conducted separately for the incidence rates, for the costs, and for the stage distributions. In the sensitivity analysis, all the other factors were kept as in the main model. The sensitivity analysis is motivated by uncertainty in expert opinions on our model assumptions. In the sensitivity analysis, we chose upper and lower limits and studied their effect on the cost-effectiveness.

Table 3. Age and stage specific breast cancer treatment costs corresponding to the last year before cancer death. Columns: Age group, $C_3$ (Stage 0), $C_3$ (Stage 1), $C_3$ (Stage 2), $C_3$ (Stage 3), $C_3$ (Stage 4).
In assessing the effect of incidence rates, we considered two separate scenarios. In the first one, we increased all the modelled incidence rates by 10%. That is, when considering extending screening to the younger age group, we increased the incidence rates for the 46-69-year-old women by 10%, while for women aged 70 and older the current incidence rates were used. When considering extending screening to the older age group, we increased the incidence rates by 10% for women older than 69 years, and the current incidence rates were used for 46-69-year-old women. When considering extending to both younger and older age groups, all incidence rates were increased. In the second scenario, the same was repeated using a 10% decrease.
In assessing the effect of treatment costs we again considered two separate scenarios. In the first one, all treatment costs were increased by 10%. In the second one, all treatment costs were increased by 50%.
In assessing the effect of the stage distribution, we considered a positive and a negative scenario. In the positive scenario, all the modelled stage distributions were modified as follows. When considering extending screening to the younger age group, we kept the incidence rates as in the main analysis, but for 46-51-year-old women the conditional probability (if diagnosed with cancer) of having localized cancer (stage 1) was increased by 0.02 and the conditional probability of having non-localized cancer (stage 2) was decreased by 0.02. When considering extending screening to the older age group, the conditional probability of having localized cancer was increased by 0.02 and the conditional probability of having non-localized cancer was decreased by 0.02 for women older than 69 years. When considering extending to both younger and older age groups, the modifications were made to the stage distributions of 46-51-year-old women and of women older than 69 years. In the negative scenario, the modifications were made in the opposite direction. That is, the conditional probability of having localized cancer (stage 1) was decreased by 0.02 and the conditional probability of having non-localized cancer (stage 2) was increased by 0.02 for the same age groups as in the positive scenario.
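The perturbations used in the sensitivity analysis can be sketched as two small Python helpers. The function names and input values are hypothetical; the shift of 0.02 between stages 1 and 2 and the 10% scaling of incidence rates follow the scenarios described above.

```python
def shift_stage_distribution(cond, delta=0.02):
    """Move probability mass `delta` from stage 2 (non-localized) to
    stage 1 (localized) in a conditional stage distribution.

    Use delta=0.02 for the positive scenario and delta=-0.02 for the
    negative one. The total probability is unchanged.
    """
    shifted = dict(cond)
    shifted[1] = cond[1] + delta
    shifted[2] = cond[2] - delta
    if not (0.0 <= shifted[1] <= 1.0 and 0.0 <= shifted[2] <= 1.0):
        raise ValueError("shift leads to a probability outside [0, 1]")
    return shifted


def scale_incidence(incidence, factor, groups):
    """Scale the incidence rates of the selected age groups by `factor`
    (e.g. 1.10 or 0.90); other age groups keep their current rates."""
    return {g: (r * factor if g in groups else r) for g, r in incidence.items()}


if __name__ == "__main__":
    # Hypothetical placeholder inputs, not registry estimates.
    cond = {0: 0.10, 1: 0.50, 2: 0.25, 3: 0.05, 4: 0.10}
    print(shift_stage_distribution(cond))         # positive scenario
    print(shift_stage_distribution(cond, -0.02))  # negative scenario
    inc = {"46-47": 0.003, "70-71": 0.004}
    print(scale_incidence(inc, 1.10, {"46-47"}))
```

Each helper returns a new dictionary, so the main-analysis inputs stay intact and every scenario can be run from the same baseline.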
Our sensitivity analysis highlights that cost-effectiveness depends heavily on treatment costs and incidence rates, see Tables 5-10. These and other factors can vary over time. Hence it would be a good idea to assess the cost-effectiveness of an ongoing screening program on a regular basis.

Table 8. Sensitivity analysis, the effect of increasing the treatment costs by 50%.

Results and discussion
Our analysis shows that one should consider very carefully before extending any effective screening program to new age groups. For a cohort of 100000 individuals, extending the current screening strategy in Finland does not seem to provide large benefits, see Table 1. The current strategy in Finland is to provide biennial breast cancer screening for all 50-69-year-old women. In our analysis, we considered extending screening to younger age groups starting from age 46 and to older age groups ending at age 74. In particular, based on our modelling, the overall costs of extending screening to older age groups in Finland would be 92736 euros per life year saved. In a cohort of 100000 individuals, the number of breast cancer deaths would decrease from 1686 to 1658. That is, the number of breast cancer deaths would decrease by about 30 over the cohort's lifetime of 3.8 million women-years. Although extending breast cancer screening to older age groups would reduce mortality, the costs are high, and screening may cause unnecessary worry and treatment-related morbidity. This is in line with [13], where screening the age group 50-74 was found to be suboptimal and extending screening to younger age groups was supported. Based on our modelling, the overall costs of extending screening to younger age groups in Finland would be 963 euros per life year saved. The main reason for the much lower cost per life year saved, compared to extending screening to older age groups, is simply that the expected remaining life years are naturally much higher in young age groups. Based on our modelling, the decrease in the number of breast cancer deaths in a cohort of 100000 individuals when extending screening to younger age groups would be the same as when extending to the older age groups. Note, however, that our model does not consider the screening-related radiation burden, which might be important at the population level, especially when screening starts in young age groups.
Based on our model, if screening were extended in both directions, the overall costs would be 7766 euros per life year saved, and in the cohort of 100000 individuals, the number of breast cancer deaths would decrease from 1686 to 1616. Note that incremental cost-effectiveness ratios depend strongly on incidence rates. In our analysis, we assumed that if screening is extended up to 74-year-old women, the incidence rates of women over 70 depend on the starting age of screening. If, when extending to older age groups only, we had used the same incidence rates for women over 70 as when extending in both directions, the incremental cost ratio for 50-74 would have been approximately 33 500 euros. Thus the small changes in incidence, displayed in Fig 1, have a large impact on the results. Note also that the estimates rely on extrapolation. The proposed model is also suitable for randomized data, and whenever recent randomized data are available, they should be used instead of extrapolation.
Our sensitivity analysis reveals that if extended screening yields much higher incidence rates, extending screening is harmful, see Table 5. In that case screening would increase the costs but would not lead to an increase in expected life years. This highlights the importance of discussing the effects of increased incidence due to screening, overdiagnosis, and the harmful effects of unnecessary cancer treatment. Note, however, that in our main model the increase in incidence rates is much smaller than in the sensitivity analysis. Assessing the effect of an unexpectedly high increase in incidence rates is nevertheless important before making new policies about screening target age groups. Conversely, if the incidence rates are unexpectedly low, the benefits of screening increase, see Table 6. Under that scenario, extending screening to the younger age group would lead both to an increase in expected life years and to a decrease in health care costs. Note that our sensitivity analysis related to incidence rates does not model the effect of changes in the actual incidence rates. Instead, it models the effect of incorrectly estimating the changes in incidence rates under the new screening policies.
Our sensitivity analysis related to the treatment costs reveals that if the treatment costs are increased by 10%, the overall costs of extending screening to younger age groups are only 308 euros per life year saved, and if the treatment costs are increased by 50%, extending screening to younger age groups actually saves money. The reason is that the costs of additional screening in a cohort of 100000 individuals then become lower than the savings in treatment costs from detecting cancers earlier. If the treatment costs are increased by 10%, the overall costs of extending screening to older age groups are still high, 96099 euros per life year saved, and if the treatment costs are increased by 50%, the overall costs of extending screening to older age groups are 109553 euros per life year saved.
Our sensitivity analysis related to the effect of stage distribution reveals that if, assuming that cancer is diagnosed, the conditional probability of localised cancer increases, then the benefits of screening increase. If, assuming that cancer is diagnosed, the conditional probability of localised cancer decreases, then the benefits of screening decrease as well. The larger positive effect screening has on the stage distribution, the larger are the benefits of screening (assuming that the incidence rate does not increase too much).
When we applied the screening strategies to younger and older target ages, we made several assumptions. Several sensitivity analyses were performed to assess their effect on the results. In addition, we implicitly assumed that the attendance rate of the younger and older age groups would be the same as the attendance rate of the closest screened age groups. Since attendance in screening varies little by age group [10], this assumption can be regarded as feasible and realistic, in that respect, for a Finnish cohort of 100,000 women. Often, the attendance rate is assumed to be 100% in modelling studies, and their results thus represent an ideal and unrealistic world. For example, the popular MISCAN-breast microsimulation model [14] assumes that the attendance rate is 100%. Our model considers the natural progression of breast cancer only implicitly, while the MISCAN model relies on it more comprehensively. While the MISCAN model provides more detailed results, it requires data on the situation without screening. In our case, recent and representative data of that kind are not available. In addition, our model is transparent and computationally very light compared to the MISCAN model. We also assumed that the probability of detecting breast cancer in screening at younger and older ages is the same as in the closest screened age group, which is the most feasible assumption. In addition, even though we used all Finnish breast cancer data, the estimated survival probabilities may be uncertain, which may affect our results. We are aware that cost-effectiveness models without natural history are likely to provide overly optimistic estimates. Nevertheless, the magnitude and order of the incremental cost-effectiveness ratios per life-year gained can be used to support policy-making.
Extending screening saves lives, but it is expensive for older ages. The best strategy is to extend screening to younger ages, with a low incremental cost per life-year gained. Even then, however, the number of averted breast cancer deaths is rather small compared to the current screening, which prevents 100 breast cancer deaths annually [15]. Therefore, one possibility would be, instead of extending the target age groups, to extend screening to targeted groups, that is, to screen individuals whose risk of being diagnosed with breast cancer is exceptionally high. Note that the modelling approach provided in this paper can be applied to assess the effects of risk-stratified screening, assuming that suitable data are available. One could consider the cost-effectiveness of screening individuals with known risk factors. In the model, the age groups would be further divided into separate groups based on risk factors.
In our study, we have restricted the analysis to the case of biennial screening due to data availability. An interesting topic for future research would be to assess the effects of changes in screening intervals. In addition, one could analyse the effects of possibly different attendance rates within different age groups. Indeed, although the attendance rate in Finland does not vary between the age groups, the same is not necessarily true elsewhere. The flexibility of our model allows these and other characteristics to be accommodated, provided that all the data are available. Note that the model would also allow the use of an age distribution similar to that of the underlying population, if one were more interested in the exact costs and gains than in comparing different strategies.
Writing - review & editing: Pauliina Ilmonen, Lauri Viitasaari, Tytti Sarkeala, Sirpa Heinävaara.