Less intensive antileukemic therapies (monotherapy and/or combination) for older adults with acute myeloid leukemia who are not candidates for intensive antileukemic therapy: A systematic review and meta-analysis

Introduction Elderly patients with acute myeloid leukemia (AML) who are not eligible for intensive antileukemic therapy are treated with less intensive therapies, but uncertainty remains regarding their relative merits. Objectives To compare the effectiveness and safety of less intensive antileukemic therapies for older adults with newly diagnosed AML who are not candidates for intensive therapy. Methods We included randomized controlled trials (RCTs) and non-randomized studies (NRS) comparing less intensive therapies in adults over 55 years with newly diagnosed AML. We searched MEDLINE and EMBASE from inception to August 2021. We assessed risk of bias of RCTs with a modified Cochrane Risk of Bias tool, and of NRS with the Non-Randomized Studies of Interventions tool (ROBINS-I). We calculated pooled hazard ratios (HRs), risk ratios (RRs), mean differences (MDs) and their 95% confidence intervals (CIs) using random-effects pairwise meta-analyses and assessed the certainty of evidence using the Grading of Recommendations Assessment, Development, and Evaluation (GRADE) approach. Results We included 27 studies (17 RCTs, 10 NRS; n = 5,698), which reported 9 comparisons. Patients were treated with azacitidine, decitabine, and low-dose cytarabine (LDAC), as monotherapies or in combination with other agents. Moderate certainty evidence suggests no convincing difference in overall survival (OS) between azacitidine monotherapy and LDAC monotherapy (HR 0.69; 95% CI, 0.31–1.53), fewer febrile neutropenia events with azacitidine monotherapy than with azacitidine combination (RR 0.45; 95% CI, 0.31–0.65), and fewer neutropenia events with LDAC monotherapy than with decitabine monotherapy (RR 0.62; 95% CI, 0.44–0.86). All other comparisons and outcomes had low or very low certainty of evidence. Conclusion There is no convincing superiority in OS when comparing less intensive therapies. Azacitidine monotherapy is likely to have fewer adverse events than azacitidine combination (febrile neutropenia), and LDAC monotherapy is likely to have fewer adverse events than decitabine monotherapy (neutropenia).


Introduction
Acute myeloid leukemia (AML) is a heterogeneous hematopoietic stem cell cancer with incomplete maturation of blood cells and reduced production of normal hematopoietic elements [1]. AML is more common in older adults, with a median age at diagnosis of 67 years; one-third of cases occur in patients older than 75 years [2]. Overall survival (OS) is strongly linked to clinical and biologic characteristics: age, performance status (PS), karyotype, mutational status and response to induction therapy [3]. For example, younger patients (2 to 30 years) have a much better 5-year OS than older patients (65 to >85 years) (57% to 42%, compared to 6.8% to 1.2%) [4,5].
Some older patients diagnosed with AML are not eligible for intensive treatment, limiting their therapeutic options [6]. Less intensive therapy, for example with hypomethylating agents or low-dose cytarabine, has been used to treat older AML patients who are not candidates for intensive therapy [7].
In their 2020 guidelines, the American Society of Hematology (ASH) provided recommendations for the treatment of older adults with newly diagnosed AML who are considered appropriate for antileukemic therapy, but not intensive antileukemic therapy [8]. When choosing between monotherapies, the guideline panel conditionally recommended the use of either hypomethylating agents (azacitidine or decitabine) or low-dose cytarabine and, when choosing between monotherapies and combinations, the guideline panel conditionally recommended using monotherapy [8].
To inform the recommendations provided by the ASH 2020 guideline for Treating Newly Diagnosed Acute Myeloid Leukemia in Older Adults [8], we conducted a systematic review to compare the effectiveness and safety of low-intensity antileukemic therapies (monotherapy and/or combination) in older adults with newly diagnosed AML who are not candidates for intensive therapy.

Eligibility criteria
We included randomized clinical trials (RCTs) and comparative non-randomized studies (NRS) of adults 55 years or older with newly diagnosed AML, published in any language, comparing the following less intensive therapies against each other, either as monotherapy or in combination with any secondary agent: gemtuzumab ozogamicin, low-dose cytarabine (LDAC), azacitidine (AZA) and decitabine (DEC). Outcomes of interest were mortality, quality of life, functional status, recurrence, morphologic complete remission, severe toxicity (CTC adverse effects grade 3 or higher), or burden on caregivers, measured in any way. We excluded studies that enrolled patients with acute promyelocytic leukemia or myeloid proliferations related to Down syndrome, and those in which researchers combined any of the interventions of interest with any agent considered a component of intensive antileukemic therapy regimens. A detailed description of the eligibility criteria (type of studies, participants, interventions and outcomes) is reported in S1 Appendix.

Information sources and search
We searched MEDLINE and EMBASE from inception to August 2021 without restrictions on language of publication. For informing the ASH recommendations, we searched for studies published through July 2019.
We conducted an umbrella search that encompassed all the questions addressed in the guideline [8]. The supporting information file describes the search strategies (S2 Appendix). We checked the reference lists of reviewed studies and contacted clinical experts for additional references.

Study selection and data collection process
Pairs of reviewers screened titles and abstracts obtained through the electronic searches and identified those potentially eligible. We then grouped studies according to the question they addressed and conducted full text screening specifically for our question. Four reviewers, independently working in pairs (BPR, NKF, AA, LECL) made eligibility decisions. If reviewers could not resolve disagreement through discussion, a third reviewer adjudicated (RBP).
Pairs of reviewers independently abstracted data on a standardized form. We extracted the following information: type of study, recruitment time frame, follow-up (months), sample size, participant characteristics (age in years, gender, cytogenetics [intermediate or poor], performance status [ECOG or WHO classification], white cell count), AML diagnosis criteria, trial location, source of funding, trial registry, interventions (main agent, dose and second agent for combination therapy groups), comparisons (main agent, dose and second agent for combination therapy groups), and outcomes (mortality, quality of life, functional status, recurrence, morphologic complete remission, severe toxicity [CTC adverse effects grade 3 or higher], or burden on caregivers) at any time point. If reviewers could not resolve disagreement through discussion, a third reviewer adjudicated (RBP).

Risk of bias in individual studies
Pairs of reviewers (BPR, NKF, AA, LECL) independently assessed risk of bias for each randomized controlled trial using a modified version of the Cochrane risk of bias tool for randomized trials [12] and, for non-randomized studies, the Risk Of Bias In Non-Randomized Studies of Interventions (ROBINS-I) tool [13].

Data analysis
We calculated the relative effect of less intensive therapies using hazard ratios (HRs) for time-to-event data, relative risks (RRs) for dichotomous outcomes, and mean differences for continuous outcomes, with their 95% confidence intervals (CIs). We used random-effects models with the DerSimonian-Laird estimate of heterogeneity to pool data across studies reporting the same comparison and outcome [10]. We used forest plots to display comparisons with two or more pooled studies. We carried out all statistical analyses using Review Manager 5.3 [14]. We planned to conduct a network meta-analysis to compare all interventions against each other, but there were insufficient data to conduct such an analysis (data not shown). We analyzed data from RCTs and NRS separately.
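As an illustration of the pooling step, the following is a minimal Python sketch of a DerSimonian-Laird random-effects model applied to ratio measures (HRs or RRs) reported with 95% CIs. It is not the Review Manager implementation used for the analyses; the function names and the example study values are hypothetical, and standard errors are assumed to be recoverable from the reported CIs on the log scale.

```python
import math

Z95 = 1.959963984540054  # two-sided 95% quantile of the standard normal


def log_effect_and_se(ratio, ci_low, ci_high):
    """Convert a ratio (HR or RR) and its 95% CI to a log effect and its SE."""
    log_effect = math.log(ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z95)
    return log_effect, se


def dersimonian_laird(studies):
    """Pool (ratio, ci_low, ci_high) tuples with a DerSimonian-Laird model."""
    effects, variances = [], []
    for ratio, lo, hi in studies:
        y, se = log_effect_and_se(ratio, lo, hi)
        effects.append(y)
        variances.append(se ** 2)

    # Inverse-variance (fixed-effect) weights and pooled log effect
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q, between-study variance (tau^2) and I^2 (%)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if df > 0 and c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    return {
        "ratio": math.exp(pooled),
        "ci_low": math.exp(pooled - Z95 * se_pooled),
        "ci_high": math.exp(pooled + Z95 * se_pooled),
        "tau2": tau2,
        "q": q,
        "i2": i2,
    }


# Hypothetical hazard ratios from three studies of the same comparison
example = [(0.80, 0.60, 1.07), (1.10, 0.85, 1.42), (0.65, 0.40, 1.06)]
print(dersimonian_laird(example))
```

When only one study contributes to a comparison, the sketch returns that study's estimate with a between-study variance of zero.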

Dealing with missing data
When details about study design or descriptive statistics for outcomes were not presented in original publications, we did not impute data but rather contacted authors for additional information.

Assessment of the certainty of evidence by outcome
We used the Grading of Recommendations Assessment, Development, and Evaluation (GRADE) methodology to rate the certainty of evidence (also known as quality of evidence) for each outcome as high, moderate, low, or very low [15]. The assessment included judgments addressing risk of bias, imprecision, inconsistency, indirectness, and publication bias [15]. In addition, we assessed the magnitude of the effect, the presence of dose-response relationships, and residual confounding when rating the certainty of evidence from NRS [16]. We estimated absolute effect measures to facilitate the decision-making process [17]. Using absolute effects that we calculated based on the baseline risk of the comparator arms in the included studies, we rated the certainty that there was any benefit or any harm using a minimally contextualized approach [18]. We rated down due to imprecision if the confidence intervals crossed the null effect, and if the effect estimate was obtained from a small number of participants or events [19]. We assessed inconsistency between studies by visual inspection of forest plots, in particular the extent of overlap of confidence intervals (CIs), the Q statistic (with a p value ≤ 0.1 as a suggestion of important statistical heterogeneity), and the I² value [20]. We planned, if ten or more studies were available for a particular outcome, to create a funnel plot to assess publication bias by visual inspection [21]. Because we had multiple comparisons, we created Summary of Findings tables for each comparison and outcome [22] using GRADEpro GDT (www.gradepro.org) [23].
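The absolute effects mentioned above can be illustrated with a short Python sketch that converts a relative risk and a comparator-arm baseline risk into a risk difference per 1,000 patients. This is an assumed, simplified calculation for dichotomous outcomes, not the GRADEpro GDT implementation; the function name is hypothetical, and the baseline risk of 30% is an illustrative value that is not taken from the included studies.

```python
def absolute_effect_per_1000(baseline_risk, rr, rr_low, rr_high):
    """Risk difference per 1,000 patients implied by an RR and a baseline risk.

    Returns (point estimate, lower bound, upper bound), rounded to whole
    patients. Negative values mean fewer events in the intervention arm.
    """
    def diff(ratio):
        return round(1000 * baseline_risk * (ratio - 1))

    return diff(rr), diff(rr_low), diff(rr_high)


# RR and 95% CI for febrile neutropenia (azacitidine monotherapy vs combination),
# applied to a HYPOTHETICAL baseline risk of 30% in the comparator arm
print(absolute_effect_per_1000(0.30, 0.45, 0.31, 0.65))
```

Under that assumed baseline risk, an RR of 0.45 (95% CI, 0.31–0.65) corresponds to 165 fewer events per 1,000 patients (from 207 fewer to 105 fewer).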

Subgroup and sensitivity analysis
We pooled and reported results from RCTs and NRS separately. We planned to conduct sensitivity analyses to explore the impact of the risk of bias on the effect estimates. When there were sufficient studies, we performed a subgroup analysis to explore the impact of the secondary agent (when comparing a combination therapy group) on the effect estimates. The number of studies per comparison did not allow us to explore any subgroup analysis based on patients' characteristics (e.g., gender).

Risk of bias of the included studies
We provide a detailed description of the risk of bias assessment per study and domain in S4 Appendix. All NRS had serious risk of bias due to confounding because patient baseline characteristics were different between the treatment groups [36,38–42,44,45,50,51]; two of the 10 studies had bias in the selection of participants into the study (serious [36] and moderate [51]); three of the studies had moderate risk of bias in the classification of the interventions [38,42,45]; and seven of the studies had bias due to deviations from the intended interventions (serious [42] and moderate [38,39,41,43–45]). None of the studies had risk of bias due to missing data, outcome measurement or selective reporting (S4 Appendix). All RCTs had low or probably low risk of bias in the sequence generation domain [24–35,37,46–49]; three of the 17 studies had high risk of bias in the allocation concealment domain [26–28]; all the studies had low or probably low risk of bias in the blinding domains (performance and outcome measurement), missing data and selective reporting (S4 Appendix).

Effects of the interventions
We summarize the effects of the interventions and the associated certainty of the evidence in one table per outcome. Table 2 summarizes the effect of the interventions on the overall survival of the participants, Table 3 summarizes the effect of the interventions on infectious severe adverse events (CTC adverse effects grade 3 or higher), and Table 4 summarizes the effect of the interventions on non-infectious severe adverse events (CTC adverse effects grade 3 or higher). S1 Table summarizes the effect of the interventions on 1-year mortality, 30-day mortality, complete remission and length of hospital stay, and S2 Table summarizes the certainty of evidence from the subgroup analyses.
DECC compared to AZAC may have little or no effect on mortality; however, we are very uncertain about this effect.
Baseline risk information came from the control groups of the included studies. HR, hazard ratio; RR, relative risk; RCT, randomized controlled trial; NRS, non-randomized study.

[42,45]). However, the certainty of the evidence was low, and very low, respectively, which means that we are not certain about the true effect of the interventions (S1 Table).

DECC compared to AZAC may have little or no effect on sepsis; however, we are very uncertain about this effect.
Baseline risk was obtained from the control groups of the included studies.
1. We decided to rate down two levels due to imprecision: the effect estimate is not consistent with benefit or harm and comes from a single study.
2. We decided to rate down two levels due to risk of bias and imprecision: allocation concealment was not described; adaptive randomization based on results increased the likelihood that allocation could be predicted; and the effect estimate comes from a single study and is not consistent with benefit or harm.
3. We decided to rate down by one level due to imprecision: the effect estimate is not consistent with benefit or harm.
4. We decided to rate down by one level due to imprecision: the effect estimate comes from a single study.
5. We decided to rate down two levels due to inconsistency and imprecision: I² 62% (p-value 0.05) and the effect estimate is not consistent with benefit or harm.
6. We decided to rate down two levels due to risk of bias and imprecision: some of the covariates were not equally distributed among the participants (e.g., hydroxyurea before study initiation); the interventions related to the second agent might influence the treatment in the comparisons; different proportions of patients in each group received granulocyte colony-stimulating factor or prophylactic non-azole antifungal agents; the venetoclax dose could be modified according to toxicity; and the effect estimate is not consistent with benefit or harm.
7. We decided to rate down two levels due to risk of bias and imprecision: performance status differed between the treatments under comparison (ECOG 3; 35.8% vs 0%); intervention status is well defined, but some aspects of the assignment of intervention status were determined retrospectively; it is not clear whether switches in treatment or co-interventions occurred, or whether this was adjusted for in the analysis; and the effect estimate comes from a single study.

LDACM compared to LDACC may have little or no effect on hypoxia/respiratory failure.
Baseline risk was obtained from the control groups of the included studies.
1. We decided to rate down one level due to imprecision: the effect estimate is not consistent with benefit or harm.
2. We decided to rate down two levels due to imprecision: the effect estimate comes from a single study and is not consistent with benefit or harm.
3. We decided to rate down two levels due to inconsistency and imprecision: I² 41% (p-value 0.18) and the effect estimate is not consistent with benefit or harm.
4. We decided to rate down two levels due to serious inconsistency and imprecision: the effect estimate is not consistent with benefit or harm and I² of 45%.
5. We decided to rate down two levels due to risk of bias and imprecision: some of the covariates were not equally distributed among the participants (e.g., hydroxyurea before study initiation); the interventions related to the second agent might influence the treatment in the comparisons; different proportions of patients in each group received granulocyte colony-stimulating factor or prophylactic non-azole antifungal agents; the venetoclax dose could be modified according to toxicity; and the effect estimate is not consistent with benefit or harm.
6. We decided to rate down one level due to imprecision: the effect estimate comes from a single study.
7. We decided to rate down two levels due to inconsistency and imprecision: I² 84% (p-value 0.01) and the effect estimate is not consistent with benefit or harm.
8. We decided to rate down by two levels due to risk of bias and imprecision: confounding is expected due to imbalance in the compared groups.

( 49]). The comparisons suggested little or no difference in patient mortality at 30 days. However, the certainty of the evidence was low (S1 Table).

Infectious adverse events (AEs)
Septic shock. Two RCTs (421 patients) addressing two comparisons reported septic shock (AZA monotherapy vs LDAC monotherapy [24], and AZA monotherapy vs AZA plus vorinostat [33]). The comparisons suggested little or no difference in the development of septic shock. However, the certainty of the evidence was low and very low, respectively (Table 3).
Hospitalization and hypoxia. One NRS (478 patients) [40] and one RCT (87 patients) [30] addressing two comparisons reported on hospitalization (very low certainty evidence) and hypoxia/respiratory failure (low certainty evidence). When comparing LDAC monotherapy vs LDAC combination no difference was found in hypoxia/respiratory failure development. When comparing AZA monotherapy vs DEC monotherapy, fewer hospitalizations occurred in favor of AZA monotherapy. However, we are very uncertain about this effect (Table 4).

Subgroup and sensitivity analysis
The included studies did not provide sufficient information to perform a sensitivity analysis based on the risk of bias. We observed important inconsistency in two comparisons from two outcomes: Overall survival (DEC monotherapy vs DEC combination [37,46,47] [46] has little or no effect on the overall survival of participants compared to DEC monotherapy. When comparing all-trans retinoic acid (HR 0.58, 95% CI 0.37-0.91, N = 1 RCT arm) and all-trans retinoic acid plus valproate (HR 0.62, 95% CI 0.40-0.96, N = 1 RCT arm) against DEC monotherapy, patients treated with the combination therapy showed higher overall survival (S1 Fig) [46]. However, we are uncertain about the true effect of these comparisons (S2 Table).
Complete remission. We identified four secondary agents from four RCTs (843 patients) reporting the 12-month relapse-free survival. All the comparisons had low certainty of evidence. Gemtuzumab ozogamicin plus LDAC against LDAC monotherapy (HR 1.11, 95% CI 0.73-1.69, N = 1 RCT, 494 participants) has little or no effect on the 12-month relapse-free sur-  [49] we found an improvement in the 12-month relapse-free survival of patients treated with LDAC combination therapy (S3 Fig). However, we are uncertain about the true effect of these comparisons (S3 Table).

Discussion
The elderly population diagnosed with AML who are not candidates for intensive antileukemic therapy poses an important challenge. In the last two decades, new therapeutic options have become available with reasonable effectiveness and an excellent toxicity profile. However, uncertainty remains about the comparative effectiveness and safety of the different available options. To help clinicians and patients during the decision-making process, we summarized the best available evidence by conducting a systematic review with several meta-analyses.

Summary of the evidence
Our systematic review identified three main drugs (azacitidine, decitabine and low-dose cytarabine), as monotherapies or in combination, addressing nine comparisons. We found information on patients' OS, 1-year mortality, 30-day mortality, infectious and non-infectious AEs, complete remission and length of hospital stay. We found no evidence regarding quality of life, functional status or caregiver burden for any comparison.
Most of the evidence comes from RCTs (3,902 patients). However, due to the small number of patients per comparison (imprecision), and inconsistency between the treatment effects reported by different studies, most of the evidence was judged as low or very low certainty. Evidence about the effects on OS was available for all nine comparisons, with no compelling evidence in favor of any of the available options. There is moderate certainty in one of the comparisons (AZA monotherapy vs LDAC monotherapy), showing little or no difference in OS between the patients treated with these drugs. We performed two subgroup analyses for this outcome (DEC monotherapy vs DEC combination, and LDAC monotherapy vs LDAC combination). We also performed another subgroup analysis for the complete remission outcome (LDAC monotherapy vs LDAC combination). Overall, we found single studies with favorable effects in combination therapy groups (LDAC combination and DEC combination). However, due to the number of studies, the sample size, and the inconsistency between the pooled estimates, we classified the evidence as low certainty (Table 2). The evidence for other outcomes and comparisons was scarce and we could not conduct more of these analyses.
Toxicity is a very important feature during the decision-making process. We observed a similar prevalence of severe adverse events (CTC grade 3 or higher), except for two. AZA combination therapy (venetoclax) had more febrile neutropenia events when compared against AZA monotherapy (Table 3), and DEC monotherapy had more neutropenia events when compared against LDAC monotherapy (Table 4).

Strengths and limitations
No prior systematic reviews addressed alternative chemotherapy for older patients with AML in whom intensive therapy was not an option. We conducted a comprehensive database search; specified explicit eligibility criteria; and conducted duplicate, independent study selection, data extraction and risk of bias assessment, resolving disagreement by discussion and third-party adjudication where necessary. We used the GRADE approach to assess the quality of the evidence from NRS and RCTs and, where informative, included both relative and absolute effects. We included all the relevant options that either RCTs or NRS had addressed.
We faced an important challenge when conducting our meta-analysis: the secondary agents varied across the studies within each comparison and, for most of the comparisons, the type of secondary agent was not the same. We decided to pool studies within the comparisons regardless of the secondary agent, and to explore whether the secondary agent was associated with the treatment effect when comparing monotherapies vs. combination therapies. During the clinical practice guideline development, we planned additional analyses based on the input from the panel members. Unfortunately, the number of studies within comparisons and outcomes was insufficient to conduct such analyses. With the evidence available when developing the recommendations, the panel believed that any extra analyses, including sensitivity analyses that would exclude specific studies (e.g., based on diagnostic criteria for AML), were unlikely to change their conclusions. We also planned to perform a network meta-analysis (NMA) to compare all interventions against each other. At the end of data extraction, we identified insufficient evidence to do so (data not shown). This decision made it challenging to summarize all the useful evidence across the nine comparisons; we provide a summary in the main text and extensive supplementary information in the appendices.

Implications
Treating older AML patients can be challenging, as clinicians and patients must balance the goal of increasing longevity with the risk that more aggressive treatment may increase adverse events and hospitalization. During the recommendation formulation process, with the evidence available at that time, the guideline panel found no compelling evidence of additional benefit with more aggressive treatment with more than one agent, and found instances in which such therapy did increase adverse events. After the meeting, however, some new studies (RCTs and NRS) reported benefits of combinations over monotherapy; for example, DEC combined with ATRA and VPA+ATRA may result in better survival than DEC monotherapy [Lubbert 2020] [46], and AZA combined with venetoclax may also result in better survival than AZA monotherapy [DiNardo 2020] [48]. Because these results were inconsistent with the previously identified studies, when including these new studies in the meta-analyses, the certainty of the overall evidence decreased. It is important to note, however, that the certainty of evidence for each of these specific comparisons is low.
Therapy selection for older adults with AML who are not candidates for intensive antileukemic therapy is based on patient fitness, patients' characteristics (cytogenetic and molecular profiles), the trade-off between drug safety and toxicity, and patients' values and preferences [52]. The scientific community agrees on offering therapies based on hypomethylating agents (HMAs; e.g., azacitidine, decitabine) with some exceptions: severe liver or kidney disease, prior HMA therapy, and the presence of an actionable mutation [52,53]. For these populations other options are available (e.g., low-dose cytarabine). Currently, combination therapy has become the standard of care for unfit older patients with AML. However, the choice of secondary agent depends on its availability in each setting and the presence of specific genetic mutations. Venetoclax (a BCL2 inhibitor) is the preferred secondary agent to add to HMA therapies; this is based on promising results from NRS and RCTs (mentioned previously). In our review, we identified benefits from combination therapy with venetoclax. However, the certainty of the effect was judged to be low after creating a pooled estimate (imprecision and inconsistency). The same situation was identified with other secondary agents. We are aware that creating pooled estimates without stratifying by the second agent may affect the effect estimate of a specific agent (e.g., venetoclax). In the comparisons with enough studies, we undertook a subgroup analysis to explore the effect of the second agent. However, the AZA monotherapy vs AZA combination comparison did not have sufficient studies to explore it.
Our evidence suggests HMA therapies are acceptable options with similar efficacy and safety to other less intensive treatment options. The certainty of the evidence was, however, low for most comparisons and outcomes, and there was no published evidence for several outcomes considered critical for decision-making. The limitations of the evidence also highlight the need for additional randomized trials including a wider range of patient-important outcomes, in particular quality of life, to definitively establish the relative merits of alternative regimens in older patients with AML in whom more aggressive therapy is not an option.