Novel Anticoagulants for Stroke Prevention in Atrial Fibrillation: A Systematic Review of Cost-Effectiveness Models

Objective: To conduct a systematic review of economic models of newer anticoagulants for stroke prevention in atrial fibrillation (SPAF). Patients and Methods: We searched the Medline, Embase, NHS EED and HTA databases and the Tufts Registry from January 1, 2008 through October 10, 2012 to identify economic (Markov or discrete event simulation) models of newer agents for SPAF. Results: Eighteen models were identified. Each was based on a lone randomized trial per new agent, and these trials were clinically and methodologically heterogeneous. Dabigatran 150 mg, 110 mg and sequentially-dosed were assessed in 9, 8, and 9 models, rivaroxaban in 4 and apixaban in 4. Warfarin was a first-line comparator in 94% of models. Models were conducted from United States (44%), European (39%) and Canadian (17%) perspectives. Models typically assumed that patients between 65 and 73 years old at moderate risk of stroke initiated anticoagulation for a lifetime or near-lifetime. All models reported cost/quality-adjusted life-year; 22% reported using a societal perspective, but none included indirect costs. Four models reported an incremental cost-effectiveness ratio (ICER) for a newer anticoagulant (dabigatran 110 mg (n = 4)/150 mg (n = 2); rivaroxaban (n = 1)) vs. warfarin above commonly reported willingness-to-pay thresholds. ICERs vs. warfarin ranged from $3,547–$86,000 for dabigatran 150 mg, $20,713–$150,000 for dabigatran 110 mg, $4,084–$21,466 for sequentially-dosed dabigatran and $23,065–$57,470 for rivaroxaban. Apixaban was found to be economically dominant over aspirin, and dominant or cost-effective ($11,400–$25,059) vs. warfarin. Indirect comparisons from 3 models suggested conflicting comparative cost-effectiveness results. Conclusions: Cost-effectiveness models frequently found newer anticoagulants cost-effective, but the lack of head-to-head trials and the heterogeneous characteristics of underlying trials and modeling methods make it difficult to determine the most cost-effective agent.


Introduction
Atrial fibrillation (AF) affects approximately 3 million people in the United States (U.S.), and this number may reach as high as 12 million by 2050 [1]. AF is associated with a significant financial burden, costing the U.S. healthcare system about $26 billion annually [2]. While hospitalizations are the primary driver of these costs (52%), the cost of pharmacologic management of AF is also noteworthy (23%) [3].
One of the primary concerns accompanying the diagnosis of AF is the associated 4- to 5-fold increase in ischemic stroke risk [4]. Guidelines for the management of AF recommend the use of pharmacologic agents for the prevention of stroke depending on baseline risk [5][6][7]. For patients at moderate-to-high risk of stroke, a vitamin K antagonist such as warfarin has traditionally been recommended. However, its use has been limited by its narrow therapeutic index and food and drug interactions [8,9]. Therefore, alternative anticoagulants have been evaluated in recent years. To date, two agents (dabigatran, rivaroxaban) have received approval by the United States Food and Drug Administration (FDA) for prevention of stroke and systemic embolism in patients with AF, with a third (apixaban) currently under consideration. Clinical trials have demonstrated that these agents have at least a similar impact on reducing stroke rates compared to warfarin, with comparable or improved safety profiles [10][11][12].

Data Sources and Searches
We searched the MEDLINE, EMBASE, National Health Service Economic Evaluation Database (NHS EED) and Health Technology Assessment (HTA) bibliographic databases along with the Tufts Cost-Effectiveness Analysis Registry. Searches were conducted for economic studies published between January 2008 and October 10, 2012. The start date of our search corresponded with the first published outcomes study of dabigatran. Our searches utilized Medical Subject Heading (MeSH) terms and keywords for AF, economic modeling and the newer anticoagulants (see Text S1). Finally, we also reviewed references from included models to identify additional relevant citations.

Study Selection
Two investigators independently reviewed all abstracts and screened all potentially relevant full-text articles for inclusion in a parallel manner using a priori-defined criteria. We included evaluations of the cost-effectiveness of pharmacologic agents for SPAF using a Markov or discrete event simulation model design. To be included, models had to evaluate both cost (in monetary units) and effectiveness outcomes (i.e., life-years or quality-adjusted life-years (QALYs)). Models had to be available as a full-text publication and be published in the English language. Manufacturers' models reported as part of government reports [i.e., National Institute for Health and Clinical Excellence (NICE) or Canadian Agency for Drugs and Technologies in Health (CADTH)] were also included in this review; however, models presented solely at professional meetings or available only in abstract form were excluded.

Data Extraction
Two investigators used a standardized data abstraction tool to independently extract data for each model, with disagreement resolved by discussion. We collected the following information from each model: 1) primary comparisons made; 2) characteristics of the base-case population; 3) model structure and assumptions (e.g., similarity to "progenitor" models, health states, study perspective, discount rate, time horizon, cycle length, types of sensitivity analysis, willingness-to-pay threshold(s) (WTP(s)) utilized, etc.); 4) characteristics related to both the internal and external validity of the models themselves and of the randomized trials underlying/driving the economic models (e.g., use of blinding, intention-to-treat methods, inclusion/exclusion criteria, CHADS2 scores, methods for dosing warfarin, time in the therapeutic international normalized ratio (INR) range, etc.); and 5) results, including base-case and sensitivity analyses. For the purpose of this review, a "progenitor" model was defined as the earliest published model using a distinct structure and serving as a template for future models.

Quality Assessment of Economic Models and Underlying Trials
We conducted a critical appraisal of the methodology and reporting of the included models (with the exception of the government reports) using the Quality of Health Economic Studies (QHES) rating scale [31,32]. The QHES is a validated assessment of quality for cost-effectiveness analyses and contains 16 evaluable items. Each item carries a weighted point value, with total possible scores ranging from 0 (lowest quality) to 100 (highest quality). An explanation of our QHES scoring of included models is available in Supporting Information: Text S2. In addition, we evaluated the internal validity of the models' "underlying" trials using the Jadad scale [33]. For the purpose of this review, "underlying" trial(s) were defined as those used as the principal sources for drug-specific safety and efficacy inputs in each of the economic analyses. The Jadad scale evaluates key safeguards against bias: randomization, double-blinding, and proper reporting of patient withdrawals. These individual components were assessed and an aggregate score was calculated for each included trial (0 = weakest, 5 = strongest). Two investigators performed all quality assessments independently, with disagreement resolved through discussion.

Data Synthesis
The current report provides summary statistics and qualitative (descriptive) synthesis of identified economic models in the form of tables and figures. Categorical data are reported as percentages, while continuous data are reported as means ± standard deviations. The authors have followed the PRISMA Statement in reporting this systematic review (see Checklist S1).
All of the analyses were Markov models except one [14], which was a discrete event simulation. The majority of Markov models appeared to be derivatives of one of 2 earlier models created to assess the cost-effectiveness of adjusted-dose warfarin [16,34]. Authors utilizing these "progenitor" models by Gage and Sorensen as templates made small modifications, such as the inclusion of myocardial infarction or dyspepsia as a health state [15,23] or the alteration of the method for handling recurrent strokes [15], but preserved the core design of the models. A noteworthy difference between the two basic model structures is Sorensen's inclusion of both ischemic stroke and systemic embolism as health states, which more closely matches the FDA-approved indication of the newer anticoagulants (see Supporting Information: Figure S1).
Included models reflected the healthcare systems of various countries, including eight from the U.S. [13,15,20,23-27], four from the United Kingdom [14,17,21,28], three from Canada [16,29,30], and one each from Denmark [22], Sweden [18] and Spain [19]. Patients, with a CHADS2 score generally between 2-3 (ranging from 0-6, often with percentages of the cohort at varying stroke risks to match the RE-LY [10] or ROCKET-AF [11] populations), initiated anticoagulant therapy between 65 and 73 years of age and were followed for as little as one year and up to a lifetime. Warfarin and dabigatran were the most common treatment arms, used in 94% and 78% of included models (see Figure S2), respectively, and dabigatran versus warfarin (56%) was the most frequent primary comparison (see Figure S3). More than two thirds of the warfarin-containing models tested the impact of varying INR control on the reported results. There was a lack of consensus regarding drug persistence after acute events. After experiencing an intracranial hemorrhage (ICH), patients typically permanently discontinued anticoagulation and may or may not have initiated aspirin monotherapy, whereas after a non-fatal extracranial bleed, patients either temporarily discontinued treatment for up to 3 months before restarting the initial anticoagulant or permanently discontinued therapy. Drug discontinuation rates were typically derived from the underlying randomized controlled trial (RCT). Just under a quarter of models reported using a societal perspective, though none included indirect costs due to lost productivity. Cycle lengths ranged from two weeks to one year, with the most common being three months (44%). Costs and health outcomes were generally discounted appropriately using country-specific guidance at rates ranging from 2%-5%.
Finally, just over one third of included models were funded or supported by pharmaceutical companies with other models receiving funding from government institutions and foundations.
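To make the mechanics shared by these models concrete, the following is a minimal sketch of a Markov cohort model with three-month cycles and annual discounting. The three health states, the transition probabilities, the costs and the utilities are hypothetical placeholders chosen for illustration; they are not inputs from any of the reviewed models, which used more health states and trial-derived event rates.

```python
import numpy as np

# Hypothetical three-state model: well with AF, post-stroke, dead.
# Every number below is an illustrative assumption, not a reviewed model input.
STATES = ["well", "post_stroke", "dead"]

# Per-cycle (3-month) transition probabilities; each row sums to 1.
P = np.array([
    [0.985, 0.005, 0.010],
    [0.000, 0.970, 0.030],
    [0.000, 0.000, 1.000],
])

cost_per_cycle = np.array([500.0, 3000.0, 0.0])  # cost of one cycle spent in each state
utility = np.array([0.80, 0.50, 0.0])            # annual utility weight of each state


def run_cohort(P, cost_per_cycle, utility, n_cycles=80,
               cycle_years=0.25, annual_rate=0.03):
    """Return discounted (cost, QALYs) per patient for a cohort starting in state 0."""
    dist = np.zeros(P.shape[0])
    dist[0] = 1.0  # the whole cohort starts in the first state
    total_cost = 0.0
    total_qaly = 0.0
    for cycle in range(n_cycles):
        # Discount factor evaluated at the start of the cycle.
        df = 1.0 / (1.0 + annual_rate) ** (cycle * cycle_years)
        total_cost += df * (dist @ cost_per_cycle)
        total_qaly += df * (dist @ utility) * cycle_years
        dist = dist @ P  # advance the cohort by one cycle
    return float(total_cost), float(total_qaly)


cost, qalys = run_cohort(P, cost_per_cycle, utility)
print(f"discounted cost: {cost:.0f}, discounted QALYs: {qalys:.2f}")
```

Running the same function once per treatment arm, with arm-specific transition probabilities and drug costs, yields the cost and QALY pairs from which incremental results are computed. Half-cycle correction and time-varying mortality are omitted to keep the sketch short.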
The quality of the included models, using the QHES tool, ranged from a low of 68 [22] to a high of 89 [21,23]. Thirteen of the 18 models (72%) had a QHES score >75 and were considered high quality. The most common reasons for lower quality scores on the QHES included incorrectly reporting the perspective used (i.e., claiming a societal perspective but not including indirect costs) or not justifying the chosen perspective; not conducting or describing a literature search to identify model inputs; failure to report or justify the discount rate used; not including health states such as minor bleeding or dyspepsia in the model (when relevant); and not providing information regarding model funding/sponsorship (see Supporting Information: Figure S4). All of the included models were strongly based upon/driven by at least one of 4 randomized controlled trials or, in the case of the few models comparing the cost-effectiveness of newer anticoagulants head-to-head, by an indirect statistical comparison of these same trials [10][11][12][35]. Table 2 includes detail from the clinical trials that "underlie" the reviewed models, including quality scoring for each. Of note, all but one trial [10], which utilized an open-label design to compare dabigatran vs. warfarin, scored a five on the Jadad scale.

Dabigatran Models
Of the 13 models that directly compared dabigatran to warfarin, 8 assessed dabigatran 150 mg, 7 assessed dabigatran 110 mg, and 8 assessed sequentially-dosed dabigatran. Seven models based on the "progenitor" model by Sorensen et al. [16] were very similar in terms of model characteristics, with slight adjustments pertaining to specific countries (e.g., country-specific costs, discount rates, life tables to model non-event death). On the other hand, the four models based on Gage et al. [34] had more variation in model properties and structure (e.g., time horizon, cycle length, population characteristics, health states modeled). Of note, one model based on Gage et al. included only patients with a prior stroke or transient ischemic attack (TIA) [20], while the other models included a mixed population of AF patients with or without a prior stroke or TIA (typically around 20%). Of the remaining two models, one employed discrete event simulation, and the other exhibited a unique model structure. All 13 dabigatran models included a myocardial infarction (MI) health state, 11 included a minor bleed health state, and 12 assessed the impact of INR control on the results. Eight of the 13 models included a systemic embolism health state (seven of which were derivatives of Sorensen et al.), but only two of 13 modeled a dyspepsia health state despite this adverse event significantly differing in incidence between treatment groups in RE-LY. All 13 models derived efficacy and safety data from the RE-LY trial. In total, 78% of dabigatran vs. warfarin ICERs were cost-effective at their respective WTP thresholds (four dabigatran 110 mg and two 150 mg comparisons vs. warfarin had ICERs above commonly reported WTPs) and ranged from $3,547-$86,000 for dabigatran 150 mg; $20,713-$150,000 for dabigatran 110 mg; and $4,084-$21,466 for sequentially-dosed dabigatran (Table 3, Figure 2). The model by Shah et al. [15] did not find dabigatran cost-effective, perhaps due to the chosen cost of dabigatran. The authors surveyed four retail pharmacies and used the median cost of USD$9 per day, whereas other models typically used a cost less than USD$5 per day. Freeman et al. [13] also utilized a higher cost for dabigatran, which may have pushed the ICER for dabigatran 110 mg above the WTP threshold. Though dabigatran 150 mg was cost-effective in their original analysis, the authors updated the results based on a lower cost of dabigatran 150 mg, which decreased the ICER from $43,372 to $12,386 compared to warfarin. Of the 13 models comparing dabigatran to warfarin, 9 performed probabilistic sensitivity analyses (PSA), which demonstrated dabigatran 150 mg to be cost-effective in 44.9%-93% of iterations; dabigatran 110 mg in 42%-67% of iterations; and sequentially-dosed dabigatran in 82%-100% of iterations at the lowest reported WTP threshold compared to warfarin. All 13 models performed one-way sensitivity analyses, and the results were often sensitive to baseline rates/relative risks of ischemic stroke or ICH on dabigatran/warfarin, time in therapeutic INR range, and costs of acute events and long-term disability care.
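For readers less familiar with how an ICER is judged against a WTP threshold, the sketch below shows the standard arithmetic. The function names and all cost and QALY figures are hypothetical and were chosen only to illustrate the calculation; they do not reproduce any reviewed model. The second function uses incremental net monetary benefit (WTP times incremental QALYs minus incremental cost), a common way of applying the same decision rule to every quadrant of the cost-effectiveness plane.

```python
def icer(cost_new, qaly_new, cost_ref, qaly_ref):
    """Classify a new strategy against a reference; return (label, ICER or None)."""
    d_cost = cost_new - cost_ref
    d_qaly = qaly_new - qaly_ref
    if d_cost <= 0 and d_qaly >= 0 and (d_cost, d_qaly) != (0, 0):
        return "dominant", None    # no costlier, no less effective, better on one
    if d_cost >= 0 and d_qaly <= 0:
        return "dominated", None   # an exact tie on both measures also lands here
    # Remaining cases: both differences positive, or both negative.
    label = ("more costly, more effective" if d_cost > 0
             else "less costly, less effective")
    return label, d_cost / d_qaly


def is_cost_effective(cost_new, qaly_new, cost_ref, qaly_ref, wtp):
    """True when incremental net monetary benefit at this WTP is positive."""
    return wtp * (qaly_new - qaly_ref) - (cost_new - cost_ref) > 0


# Hypothetical example: the new agent costs 5,000 more and adds 0.2 QALYs.
label, ratio = icer(15_000.0, 8.2, 10_000.0, 8.0)
print(label, round(ratio))
print(is_cost_effective(15_000.0, 8.2, 10_000.0, 8.0, wtp=50_000))
print(is_cost_effective(15_000.0, 8.2, 10_000.0, 8.0, wtp=20_000))
```

The example also illustrates why drug price matters so much in these models: with the incremental QALYs held fixed, any increase in incremental cost raises the ICER proportionally.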

Rivaroxaban Models
Of the four models directly comparing rivaroxaban to warfarin, three were derivatives of Sorensen et al. [16] and one of Gage et al. [34]. Similar to dabigatran, rivaroxaban models adapted from Sorensen et al. tended to be consistent in model structure and characteristics, adjusting as necessary for country-specific costs, discount rates and life tables. All four models used safety and efficacy data from the ROCKET-AF trial, though base-case population characteristics varied among the four models, with three models employing hypothetical cohorts with CHADS2 risks similar to or matching patients in ROCKET-AF, and one employing a typical patient profile from RE-LY. All four models included MI and minor bleed health states, whereas the three models based on Sorensen et al. also included a systemic embolism health state. Even though all four models compared rivaroxaban to warfarin, only two of the four measured the impact of INR control on their results. In total, 3 of the 4 rivaroxaban vs. warfarin ICERs were cost-effective at their respective WTP thresholds, and ICERs ranged from $23,065-$57,470 (Table 3, Figure 2). Regardless, upon PSA, rivaroxaban was found to be cost-effective in at least 75% (up to 80.1%) of iterations at the lowest reported WTP thresholds. Upon one-way sensitivity analysis, results were typically sensitive to baseline rates/hazard ratios of ischemic stroke or ICH on rivaroxaban/warfarin, time horizon and the percentage of time spent in a therapeutic INR range.
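The "percentage of iterations" figures reported from PSA can be read as follows: each iteration yields one incremental cost and one incremental QALY estimate, and the reported percentage is the share of iterations in which the new agent is cost-effective at the chosen WTP. The sketch below illustrates that summary step only. As a simplifying assumption, it samples incremental outcomes directly from hypothetical normal distributions, whereas an actual PSA would sample the model inputs and re-run the full model for every draw.

```python
import numpy as np


def prob_cost_effective(d_cost, d_qaly, wtp):
    """Share of PSA iterations with positive incremental net monetary benefit."""
    d_cost = np.asarray(d_cost, dtype=float)
    d_qaly = np.asarray(d_qaly, dtype=float)
    inmb = wtp * d_qaly - d_cost
    return float(np.mean(inmb > 0))


# Hypothetical PSA output: 10,000 draws of incremental cost and incremental QALYs.
# The means and standard deviations are illustrative assumptions only.
rng = np.random.default_rng(seed=0)
n_iter = 10_000
d_cost = rng.normal(loc=4_000.0, scale=1_500.0, size=n_iter)
d_qaly = rng.normal(loc=0.15, scale=0.08, size=n_iter)

# Evaluating several thresholds traces out a cost-effectiveness acceptability curve.
for wtp in (20_000, 50_000, 100_000):
    print(wtp, prob_cost_effective(d_cost, d_qaly, wtp))
```

Because the share depends on the threshold, percentages from different models are comparable only when they refer to similar WTP values.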

Apixaban Models
Four models included apixaban as a first-line therapy for SPAF, three of which compared it to warfarin and one of which compared it to aspirin in a cohort of patients deemed unsuitable for warfarin. Three of the four models were adapted from Gage et al. [34] and, as with the other drug models, varied in model characteristics and structure (e.g., time horizon, cycle length, health states modeled). Of note, one model based on Gage et al. modeled only patients with a prior stroke or TIA [20]. All four models included an MI health state; three modeled minor bleeding; and only one included systemic embolism as a possible health state. Of the three models comparing apixaban to warfarin, two assessed the impact of INR control on their results. In all the models comparing apixaban to warfarin, apixaban was shown to be at least a cost-effective strategy, with ICERs ranging from $11,400-$25,059, if not dominant (Table 3, Figure 2). Upon PSA, apixaban was deemed a cost-effective strategy in 62%-98% of iterations compared to warfarin. Results of these three models were typically sensitive to changes in the cost of apixaban, baseline rates of stroke/ICH and time horizon. One model directly compared apixaban to aspirin in a hypothetical cohort of patients unsuitable for warfarin therapy. The authors chose to run two base-case analyses: one assuming a trial-length follow-up (1 year, to match the mean follow-up of the AVERROES trial [35]), and one employing a longer-term (10-year) follow-up of patients. In the trial-length model, apixaban was dominated by aspirin and upon PSA was estimated to be cost-effective in only 11% of iterations. However, when a longer time horizon was utilized, apixaban was the dominant strategy over aspirin and was shown to be cost-effective in 96.7% of iterations at the reported WTP threshold. Upon one-way sensitivity analysis, results of this model were sensitive to the time horizon, the rate of stroke on apixaban/aspirin and the monthly cost of major stroke.

Models Based Upon Indirect Treatment Comparison Meta-Analyses
Three models indirectly compared newer anticoagulants; two compared rivaroxaban to dabigatran, and one compared rivaroxaban, dabigatran and apixaban. The models derived clinical event rates using methodologies of either a mixed or indirect treatment comparison meta-analysis with warfarin as a common comparator. Data for these indirect comparisons were taken from RE-LY and PETRO, ROCKET-AF and ARISTOTLE for dabigatran, rivaroxaban and apixaban, respectively [10][11][12][36]. Two models [28,29] compared dabigatran and rivaroxaban outcomes based consistently on the safety-on-treatment (SOT) populations, whereas Wells et al. [30] compared dabigatran and apixaban outcomes based on the intention-to-treat (ITT) population with rivaroxaban outcomes based on both SOT and ITT populations. All three models were derivatives of Sorensen et al. [16], though two modeled a cohort of patients similar to the ROCKET-AF trial, while the third more closely matched RE-LY. Rivaroxaban was the dominant strategy compared to both sequential dabigatran and a pooled dabigatran 110 mg/150 mg strategy in one model, whereas sequential dabigatran and dabigatran 150 mg were found to be dominant strategies compared to rivaroxaban in the remaining models. Apixaban was dominated by dabigatran 150 mg, dominant compared to dabigatran 110 mg (in one model) and dominant compared to rivaroxaban (in one model), while rivaroxaban was dominant in its lone comparison versus dabigatran 110 mg. One model did not report PSA results for the rivaroxaban to dabigatran comparison; another model showed dabigatran 150 mg to be the most cost-effective agent in 68.1% of iterations, followed by apixaban (29%), rivaroxaban (1.4%), warfarin (0.9%), and dabigatran 110 mg (0.6%); and the last model showed sequential dabigatran to be the most cost-effective agent in 98% of iterations compared with rivaroxaban and warfarin (Table 3, Figure 2). Results of the model by Edwards et al. were sensitive to the time spent in INR range upon one-way sensitivity analysis [28]. In the comparison of dabigatran, rivaroxaban, and apixaban by Wells et al., results were also sensitive to time spent in INR range, along with the cost of apixaban, time horizon, and baseline stroke risk [30]. Interestingly, in the model by Kansal et al., dabigatran remained the preferred treatment option in all one-way sensitivity analyses performed [29].
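The logic of an indirect comparison through a common comparator can be illustrated with the standard adjusted indirect comparison (often called the Bucher method): the log hazard ratio of agent A versus agent B is the difference between the two log hazard ratios versus warfarin, and the variances add. The sketch below is an illustration under stated assumptions; the hazard ratios and confidence intervals are hypothetical, and the reviewed models may have used different or more elaborate methods, such as mixed treatment comparisons.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Standard error of a log hazard ratio recovered from its 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


def indirect_comparison(hr_a, ci_a, hr_b, ci_b, z=1.96):
    """Indirect HR of A vs. B, given each agent's HR (and CI) vs. a common comparator."""
    log_hr = math.log(hr_a) - math.log(hr_b)
    # Variances of independent log hazard ratios add under subtraction.
    se = math.sqrt(se_from_ci(*ci_a, z) ** 2 + se_from_ci(*ci_b, z) ** 2)
    ci = (math.exp(log_hr - z * se), math.exp(log_hr + z * se))
    return math.exp(log_hr), ci


# Hypothetical inputs (not trial results): each agent vs. warfarin.
hr_ab, ci_ab = indirect_comparison(0.70, (0.55, 0.89), 0.85, (0.70, 1.03))
print(f"indirect HR A vs. B: {hr_ab:.2f} ({ci_ab[0]:.2f} to {ci_ab[1]:.2f})")
```

Because the variances add, the indirect estimate is always less precise than either direct estimate. The method also assumes the trials are similar enough in population and comparator quality for the shared comparator to cancel out.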
One of the challenges in attempting to evaluate the comparative cost-effectiveness of newer oral anticoagulants is the difficulty in making cross-model comparisons. This is likely true in the case of these newer SPAF models, even though a majority of them used the basic and common structures of Gage [34] or Sorensen [16]. This is because the models had some differences in health states included, made different assumptions and used varying inputs. In some instances, similar models were performed from the perspective of varying countries; this was necessary not only to address differences in costs, discount rates and average life spans (life tables), but also to address the varying approved dosing schemes from country to country (e.g., sequentially-dosed dabigatran is not an FDA-approved regimen). Three models used data from either adjusted indirect comparison meta-analyses or network meta-analyses [28][29][30]; however, even the results of these models must be interpreted with caution due to important differences in the studies that underlie the comparisons and in the conduct of the indirect comparisons themselves. Of importance, the 3 major clinical trials evaluating the newer oral anticoagulant agents vs. warfarin differ in notable ways [10][11][12]. The ROCKET-AF trial enrolled patients at higher baseline ischemic stroke risk than the RE-LY or ARISTOTLE trials, with mean CHADS2 scores of 3.5, 2.1, and 2.1, respectively. In addition, the quality of warfarin dosing was not consistent across studies, with patients spending less time within the therapeutic INR range in ROCKET-AF (55%) versus either RE-LY (64%) or ARISTOTLE (62%). In fact, methodological guidance documents would suggest this may be an inappropriate situation for indirect comparison due to the lack of comparability/heterogeneity of the trials to be pooled [37][38][39].
Also, as alluded to previously, endpoint data used both within and across the indirect comparisons were not always based on the same trial populations/analysis methods, some using ITT populations and others using SOT populations. Thus, it is not surprising that these indirect comparison meta-analyses had disparate effect size estimates for many of the key model inputs [29,30,40-42]. In 5 identified meta-analyses making indirect comparison of at least 2 of the newer agents, marked variation in relative effect size estimates can be observed. For example, odds ratios of dabigatran versus rivaroxaban ranged from 0.74-0.85 for stroke/systemic embolism, 0.95-1.06 for all-cause mortality, and 1.59-1.76 for acute MI. Similarly, hazard ratios ranged from 0.96-1.04 for all-cause mortality, 1.40-1.57 for acute MI and 0.48-0.63 for ICH.
Importantly, all of the identified models in this review utilized a lone RCT (or an indirect comparison in which only a lone study existed for a given direct comparison) to characterize the main efficacy and safety comparisons between treatments. Data from these short-term clinical trials had to be extrapolated to longer time horizons in order to estimate the cost-effectiveness of agents. While in theory, conducting a piggy-backed economic analysis alongside a substantially longer RCT would yield more rigorous results, this would be both time and cost prohibitive. Thus, this limitation of the underlying trials leads to the greatest asset of models; that is, they systematically allow for extrapolation of data to provide decision-makers with some, albeit not perfect, data to make necessary coverage decisions. In addition, while these extrapolations involve generalizations and assumptions, modeling provides a way of systematically managing uncertainty and assessing the impact of these assumptions on the results through sensitivity analyses [43,44].
The lack of standardized guidelines for conducting economic analyses makes it difficult to assess the validity of these analyses accurately, and therefore to interpret their results and conclusions. The use of outdated, non-drug-specific inputs may reduce the validity of some of these models. Variations in the inclusion of health states, even across models assessing similar drugs, also present difficulties in translating results, especially when the conclusions of those models disagree. Decision makers must be aware of these caveats when clinical and coverage decisions are formed on the basis of these economic analyses.

Conclusions
Many researchers have published cost-effectiveness models of the novel anticoagulants for SPAF. These models suggest that the novel anticoagulants are cost-effective, but do not provide adequate data for direct comparison of the individual agents. For now, it seems prudent to choose anticoagulation therapy on a patient-specific basis. Standardizing model structure and inputs, so that important health states are not ignored and the best and most recent inputs are utilized, would improve future comparisons between SPAF models. In addition, head-to-head trials of the newer oral anticoagulants would help health economists assess their comparative cost-effectiveness.