The Cost-Effectiveness of Monitoring Strategies for Antiretroviral Therapy of HIV Infected Patients in Resource-Limited Settings: Software Tool

Background The cost-effectiveness of routine viral load (VL) monitoring of HIV-infected patients on antiretroviral therapy (ART) depends on various factors that differ between settings and across time. Low-cost point-of-care (POC) tests for VL are in development and may make routine VL monitoring affordable in resource-limited settings. We developed a software tool to study the cost-effectiveness of switching to second-line ART with different monitoring strategies, and focused on POC-VL monitoring. Methods We used a mathematical model to simulate cohorts of patients from start of ART until death. We modeled 13 strategies (no 2nd-line, clinical, CD4 (with or without targeted VL), POC-VL, and laboratory-based VL monitoring, with different frequencies). We included a scenario with identical failure rates across strategies, and one in which routine VL monitoring reduces the risk of failure. We compared lifetime costs and averted disability-adjusted life-years (DALYs). We calculated incremental cost-effectiveness ratios (ICER). We developed an Excel tool to update the results of the model for varying unit costs and cohort characteristics, and conducted several sensitivity analyses varying the input costs. Results Introducing 2nd-line ART had an ICER of US$1651-1766/DALY averted. Compared with clinical monitoring, the ICER of CD4 monitoring was US$1896-US$5488/DALY averted and VL monitoring US$951-US$5813/DALY averted. We found no difference between POC- and laboratory-based VL monitoring, except for the highest measurement frequency (every 6 months), where laboratory-based testing was more effective. Targeted VL monitoring was on the cost-effectiveness frontier only if the difference between 1st- and 2nd-line costs remained large, and if we assumed that routine VL monitoring does not prevent failure. Conclusion Compared with the less expensive strategies, the cost-effectiveness of routine VL monitoring essentially depends on the cost of 2nd-line ART. 
Our Excel tool is useful for determining optimal monitoring strategies for specific settings, with specific sex- and age-distributions and unit costs.


Introduction
The latest World Health Organization (WHO) guidelines recommend routine viral load monitoring for HIV-infected patients on antiretroviral therapy (ART) in resource-limited settings [1]. Routine viral load (VL) monitoring is the gold standard for detecting treatment failure and deciding when patients should switch to 2nd-line ART. VL monitoring may also support adherence and prevent HIV transmission, thus offering advantages beyond patient survival [2]. Most ART programs in resource-limited settings currently rely on CD4 or clinical monitoring [3], and the debate over the long-term benefit of routine VL monitoring continues. It centers on the high cost of VL testing and on the logistical constraints that may make it infeasible to implement this recommendation [4][5][6][7][8].
We have shown that monitoring VL with a qualitative, moderately-sensitive POC VL test benefits the patient, and may be cost-effective compared with CD4 or clinical monitoring [5]. Rapid and affordable point-of-care (POC) tests for VL are already in development [9,10]. POC tests may improve on laboratory-based monitoring by simplifying the process. Patients could get same-day test results, adherence counseling, and/or make treatment decisions, all during the same visit. Though cheaper, POC tests may not be as accurate for diagnosis as sensitive and fully quantitative tests. Nevertheless, in settings where VL monitoring is still unavailable, POC tests may offer an affordable entry into VL monitoring.
The feasibility and cost-effectiveness of VL monitoring with a qualitative POC test depend on many factors, and these can vary substantially among settings. We built on our earlier mathematical simulation model of patients on ART to create a flexible and user-friendly tool that allows users to evaluate the cost-effectiveness of a wide range of monitoring and switching strategies with varying assumptions.

Simulation model
We modeled cohorts of HIV-infected patients from ART initiation until death. We used three indicators to define the progression of HIV infection qualitatively: virological, immunological, and clinical. All three indicators have two possible values: normal and failing. We assumed that all patients started in the normal virological, immunological and clinical stages. This represents the successful introduction of ART: VL decreases rapidly to undetectable values and remains suppressed, CD4 cell count increases, and the patient has no more clinical (WHO stage 3 or 4 defining) symptoms. The patient can proceed to virological failure at any time: this represents a rebound in VL to a detectable value of >1000 copies/ml (or, at the early stages of treatment, the failure to suppress). Progression to immunological failure represents a CD4 cell count decline to a level that meets the WHO immunological criteria of treatment failure [1]. Progression to clinical failure represents the occurrence of WHO stage 3 or 4 defining symptoms. Both immunological and clinical progression can occur either as a consequence of virological failure ("concordant"), or independent of the patient's virological stage ("discordant"). Clinical progression also depends on the patient's current immunological status. The parameterization of hazard functions related to disease progression was the same as in previous modeling studies [2,4,5], which were mostly based on two routine ART programs from the Cape Town area, Gugulethu and Khayelitsha. The characteristics of these cohorts were described in a previous publication [2]. The structure of the simulation model is presented in Fig. 1 and the key parameters are listed in Table 1.
The failing virological, immunological and clinical stages will persist, unless the patient switches to a 2nd-line ART regimen. When the patient switches, the failing virological as well as concordant failing immunological and clinical stages will return to normal. Failing discordant immunological and clinical stages will remain failing after switching. During 2nd-line therapy, the patient is again at risk of proceeding to the failing virological, immunological or clinical stage. The parameterization was the same as for 1st-line therapy, except that the risk of proceeding to virological failure was scaled up with a resistance penalty factor, which depended on the time the patient spent on virologically failing 1st-line therapy.
Mortality consists of two components: HIV-free background mortality and HIV-related mortality. We assumed the risk of mortality increased for patients in the failing virological, immunological or clinical stage. Although we assumed that all patients are retained in care from ART initiation until death, we accounted for the expected high mortality among patients lost to follow-up when we estimated the HIV-related mortality rates [4].
We considered a total of 13 monitoring and switching strategies (Table 2). With clinical monitoring, treatment failure is observed at the next regular appointment after the patient proceeds to the advanced clinical stage, and the patient switches 3 months later. With CD4 monitoring, the failure is observed at the next appointment when the CD4 cell count is measured (which depends on the measurement frequency), and the patient switches 3 months later. Both clinical and CD4 monitoring are assumed to be fully specific and sensitive for detecting clinical or immunological failure. With routine VL monitoring, observing the failure depends on the definition of failure and the test itself. With a laboratory VL test, a failure is observed at the first monitoring appointment after the failing virological stage begins, and we assumed that CD4 counts are measured simultaneously. With a qualitative POC VL test, a failure may be observed at any visit, and the probability of detecting a failure depends on the detection limit of the test: we assumed it was 5000 copies/ml throughout the study. The test is repeated three months after observing the failure; if the failure is confirmed, the patient switches to 2nd-line ART. With targeted VL monitoring, only CD4 counts are measured routinely. A POC VL test is performed immediately if an immunological failure is detected. If this test is also positive, a second VL test is performed 3 months later.
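The timing rule for the fully sensitive strategies (failure observed at the next scheduled measurement, switch 3 months later) can be illustrated with a minimal sketch. The function name `switch_month`, the use of whole months, and the convention that a failure occurring exactly on a visit date is observed at that visit are assumptions made for illustration; they are not taken from the model code.

```python
def switch_month(failure_month: int, visit_interval: int, delay: int = 3) -> int:
    """Month (since ART start) at which the patient switches to 2nd-line ART.

    Assumes a fully sensitive and specific test: the failure is observed at
    the first scheduled visit on or after `failure_month`, and the switch
    follows `delay` months later.
    """
    if visit_interval <= 0:
        raise ValueError("visit_interval must be positive")
    # Ceiling division with integers: index of the first visit at or after failure.
    visits_until_detection = -(-failure_month // visit_interval)
    detection_month = visits_until_detection * visit_interval
    return detection_month + delay


# Hypothetical patient whose failure begins 16 months after ART start.
six_monthly = switch_month(16, visit_interval=6)
yearly = switch_month(16, visit_interval=12)
print(six_monthly, yearly)
```

With a qualitative POC test the detection step would be probabilistic rather than deterministic, so this sketch covers only the clinical, CD4 and laboratory-based VL strategies.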
We modeled two scenarios. In Scenario A, we assumed that the risk of treatment failure did not depend on the monitoring and switching strategy, and thus, for all strategies, we used the rates estimated from the South African data (where VL is monitored regularly). In Scenario B, we assumed that VL monitoring can prevent treatment failure, and that the South African rates underestimate the risk in strategies where VL monitoring is unavailable. We assumed this because routine viral load monitoring can detect poor adherence. Many patients with a detectable viral load can re-suppress viral load on first-line ART, after an adherence intervention [11][12][13][14]. Since poor adherence is a major predictor of treatment failure [15,16], routine viral load monitoring may prevent treatment failures caused by poor adherence. For these strategies, we assumed that the hazard would be twice as high across the entire follow-up period.
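To make the Scenario B assumption concrete, the sketch below compares the cumulative probability of failure under a baseline hazard and under a doubled hazard. It assumes a constant hazard, which is a simplification for illustration only: the hazard functions of the model were taken from earlier studies and are not reproduced here, and the value 0.05 per year is hypothetical.

```python
import math


def failure_probability(hazard_per_year: float, years: float,
                        multiplier: float = 1.0) -> float:
    """Cumulative probability of failure by `years` under a constant hazard.

    `multiplier` scales the hazard (1.0 for strategies with routine VL
    monitoring, 2.0 for strategies without it in Scenario B).
    """
    return 1.0 - math.exp(-multiplier * hazard_per_year * years)


# Hypothetical baseline hazard of 0.05 failures per person-year.
with_vl = failure_probability(0.05, years=5)
without_vl = failure_probability(0.05, years=5, multiplier=2.0)
print(round(with_vl, 3), round(without_vl, 3))
```

Under this simplification, doubling the hazard increases the cumulative failure probability by less than a factor of two, because patients who have already failed are no longer at risk.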
The model was constructed in three steps. In the first step, we simulated cohorts of 100,000 patients for all 13 monitoring and switching strategies, without baseline characteristics or background mortality. We implemented the model using 'gems', an R package for generalized multistate simulation models [17,18]. 'Gems' models disease progression as a series of events (e.g., diagnosis, treatment and death) that can be displayed in a directed acyclic graph (DAG). The vertices correspond to disease states and the directed edges represent events. Similar models that use the package 'gems' or a similar algorithm have been published elsewhere [2,4,5,19]. For strategies without routine VL monitoring we modeled two cohorts with different failure rates (scenarios A and B). This resulted in a total of 20 cohorts.
In the second step, these cohorts were updated to account for differences in background mortality. Thirty-two copies of each of the 20 cohorts were created, for the different sexes (male and female), baseline age groups (15-24, 25-34, 35-44 and 45-54 years), and four different scenarios of background mortality. In the first three scenarios, the background mortality rates represent the overall HIV-free mortality in the general populations of Malawi and Zimbabwe [20], and in Africans in the Western Cape [21]. In the fourth scenario, we assumed that the HIV-free life expectancy from birth would be 75 years for all patients. Each patient was assigned an age, sampled from a uniform distribution within the given range. A time of HIV-unrelated death was sampled for each patient, based on the gender, baseline age and the background mortality rate.
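The stratification in the second step can be sketched as follows. The snippet only enumerates the 32 combinations of sex, age group and background mortality scenario and samples a baseline age; the variable names and the choice to treat an age group such as 25-34 as the continuous interval from 25 up to 35 are assumptions for illustration, and the sampling of HIV-unrelated death times is not shown.

```python
import itertools
import random

SEXES = ["male", "female"]
AGE_GROUPS = [(15, 24), (25, 34), (35, 44), (45, 54)]
MORTALITY_SCENARIOS = ["Malawi", "Zimbabwe", "Western Cape", "life expectancy 75"]
N_SIMULATED_COHORTS = 20  # 13 strategies, some simulated under two failure rates

# Every combination of sex, age group and background mortality scenario.
strata = list(itertools.product(SEXES, AGE_GROUPS, MORTALITY_SCENARIOS))


def sample_baseline_age(age_group, rng):
    """Draw a baseline age uniformly within the age group.

    Assumption: the group (low, high) is treated as the continuous
    interval from `low` up to `high + 1` years.
    """
    low, high = age_group
    return rng.uniform(low, high + 1)


rng = random.Random(2015)  # fixed seed so the sketch is reproducible
example_age = sample_baseline_age((25, 34), rng)
print(len(strata), N_SIMULATED_COHORTS * len(strata))
```

The two printed numbers are the number of strata per simulated cohort and the total number of stratified cohorts that result from copying each of the 20 cohorts into every stratum.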
In the third step, we analyzed the outcomes of interest. The main outcomes were disability-adjusted life-years (DALY) lost to HIV, total cost, and cost-effectiveness ratios of the intervention compared to current practice as well as to the next less expensive strategy (incremental cost-effectiveness ratio, ICER). Definitions are given in S1 Text.
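As an illustration of how ICERs relative to the next less expensive strategy are typically derived, the sketch below uses the standard approach of removing dominated and extendedly dominated strategies before computing ratios. The exact definitions used in this study are given in S1 Text and may differ in detail; the function name and all input numbers are hypothetical.

```python
def cost_effectiveness_frontier(strategies):
    """Return [(name, icer)] for strategies on the cost-effectiveness frontier.

    `strategies` is a list of (name, cost, dalys_averted). The cheapest
    frontier strategy has ICER None; every other ICER is computed against
    the next less expensive strategy on the frontier.
    """
    # Sort by cost; for equal cost, put the more effective strategy first.
    ordered = sorted(strategies, key=lambda s: (s[1], -s[2]))
    frontier = []
    for name, cost, effect in ordered:
        # Strong dominance: costs at least as much, averts no more DALYs.
        if frontier and effect <= frontier[-1][2]:
            continue
        # Extended dominance: drop earlier strategies whose ICER is at least
        # as high as that of the new strategy against the same comparator.
        while len(frontier) >= 2:
            _, prev_cost, prev_effect = frontier[-1]
            _, base_cost, base_effect = frontier[-2]
            icer_prev = (prev_cost - base_cost) / (prev_effect - base_effect)
            icer_new = (cost - base_cost) / (effect - base_effect)
            if icer_new <= icer_prev:
                frontier.pop()
            else:
                break
        frontier.append((name, cost, effect))
    result = [(frontier[0][0], None)]
    for prev, cur in zip(frontier, frontier[1:]):
        result.append((cur[0], (cur[1] - prev[1]) / (cur[2] - prev[2])))
    return result


# Hypothetical per-patient costs (US$) and DALYs averted.
example = [("A", 1000, 0.0), ("B", 1200, 0.10), ("C", 1300, 0.12), ("D", 1500, 0.40)]
frontier = cost_effectiveness_frontier(example)
print(frontier)
```

In this hypothetical example, strategies B and C are removed by extended dominance, because strategy D averts additional DALYs at a lower cost per DALY than either of them.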

Excel tool
We developed an Excel spreadsheet tool to adapt the model outputs to specific scenarios. The Excel table contains all outputs of the simulation and presents the results according to the scenario defined by the user. The user can vary the following input variables continuously (Table 3): size of cohort; unit costs for clinic visit, VL and/or CD4 test, one year of 1st- and 2nd-line ART; and disability weight of symptomatic and asymptomatic HIV. The user can specify the age and gender distribution of the cohort by giving proportions for each of the eight age and gender groups. Finally, the user can specify the failure rate scenario (A or B), background mortality (Malawi, Zimbabwe, Western Cape, or constant life expectancy 75 years) and discounting (0%, 1%, 2%, 3%, 4%, or 5%). The main results are then updated based on the assumptions. The Excel table also shows the ICERs, and graphically presents the costs and averted DALYs of each scenario.
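The kind of recalculation such a tool performs can be sketched in a few lines: stored resource use is multiplied by user-defined unit costs, and group-specific results are averaged with the user-defined age and gender proportions. This is an assumption about the general approach, not a description of the spreadsheet's internal formulas; the function names, the resource-use figures and the two-group example are hypothetical, while the unit costs are the default values used in this manuscript.

```python
def lifetime_cost(resource_use, unit_costs):
    """Total cost as the sum of resource quantities times their unit costs."""
    return sum(quantity * unit_costs[item] for item, quantity in resource_use.items())


def cohort_average(per_group_values, proportions):
    """Average a per-group outcome using user-supplied group proportions."""
    if abs(sum(proportions.values()) - 1.0) > 1e-9:
        raise ValueError("group proportions must sum to 1")
    return sum(per_group_values[group] * share for group, share in proportions.items())


# Default unit costs (US$): ART per year, VL test, CD4 test.
unit_costs = {"first_line_years": 99, "second_line_years": 280,
              "vl_tests": 10, "cd4_tests": 5}
# Hypothetical (undiscounted) resource use for one patient.
resource_use = {"first_line_years": 8.0, "second_line_years": 2.0,
                "vl_tests": 10, "cd4_tests": 0}
cost = lifetime_cost(resource_use, unit_costs)

# Hypothetical per-group costs and a two-group cohort composition.
average = cohort_average({"women 25-34": 1400.0, "men 25-34": 1500.0},
                         {"women 25-34": 0.6, "men 25-34": 0.4})
print(cost, average)
```

Because unit costs enter the total linearly, results can be updated for new prices without rerunning the simulation, which is consistent with the tool updating stored model outputs.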
We present the results in this manuscript for a set of input parameters (Table 3). We assumed the costs of 1st-line ART were US$99/year and 2nd-line ART were US$280/year. VL tests (both POC and laboratory-based) were assumed to cost US$10, and CD4 tests US$5. We did not consider the cost of clinic appointments. Disability coefficients were 0.135 for asymptomatic HIV and 0.369 for symptomatic HIV. The disability coefficient for symptomatic HIV was the product of the coefficients of asymptomatic HIV and tuberculosis, the most common opportunistic infection [22]. All results are presented per patient. Costs and DALYs are discounted annually by 3%. Two separate analyses were conducted: one in which virological failure rates were identical (Scenario A); and one in which the rate was twice as high in strategies without routine VL monitoring as in strategies with routine VL monitoring (Scenario B). We also present 10 sensitivity analyses in the S2 Text and S1 Table, in which the unit costs of tests and ART and the discounting rate were varied.
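Annual discounting at 3% can be illustrated with a short sketch. The convention that the first year is undiscounted and later years are divided by (1 + rate) raised to the year index is an assumption for illustration; the exact convention used in the model is defined in the supporting information. The function name is hypothetical.

```python
def discounted_total(annual_values, rate=0.03):
    """Sum a stream of annual costs or DALYs with annual discounting.

    Assumption: year 0 is undiscounted and year t is divided by (1 + rate)**t.
    """
    return sum(value / (1 + rate) ** year for year, value in enumerate(annual_values))


# Ten years of 1st-line ART at US$99/year, with and without discounting.
undiscounted = discounted_total([99] * 10, rate=0.0)
discounted = discounted_total([99] * 10, rate=0.03)
print(undiscounted, round(discounted, 2))
```

The same function applies to DALYs, so that costs and health outcomes occurring later in a patient's lifetime receive less weight than those occurring soon after ART initiation.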

Excel tool
The Excel tool is presented in S1 Excel File.

Scenario A: Failure rate is identical in all monitoring and switching strategies
In the absence of 2nd-line ART and monitoring, the average lifetime cost of ART was US$1419 per patient [...] (see Table 2 for the parameters) [...]. The total costs for strategies with VL monitoring ranged from US$1840 (4.1) to US$2216 (5.3). VL monitoring averted 0 to 0.05 DALYs more than the most effective CD4 monitoring strategy (3.5), and 0.09 to 0.14 more than clinical monitoring (2.1). There were no clear differences between POC and laboratory-based VL monitoring in the number of averted DALYs. The POC VL strategy with most frequent (every 6 months) monitoring (4.2) averted slightly fewer DALYs than POC VL monitoring every 12 months (4.3).

Scenario B: Failure rate is twice as high without VL monitoring as with VL monitoring
When only 1st-line ART was available (1.1), the average lifetime cost of ART was US$1401 per patient (Table 5). On average, each patient lost 7.5 DALYs to HIV. When 2nd-line ART with clinical monitoring was added (2.1), this increased the costs to US$1569 per patient and [...] (see Table 2 for a detailed description of the monitoring strategies). All performed sensitivity analyses are listed in S1 [...]

Sensitivity analyses
The results of the sensitivity analyses are presented in detail in Table 6 and S2 Text and S2 to S11 Tables. Of note is the dependence of the benefit of targeted routine VL monitoring on the ratio of the costs between 1st- and 2nd-line regimens. If the prices of 1st- and 2nd-line regimens were similar, targeted VL monitoring was dominated; if 2nd-line ART was assumed to be substantially more expensive, targeted VL monitoring was on the cost-effectiveness frontier.

Discussion
We simulated cohorts of patients who initiated ART under 13 different ART monitoring and switching strategies. VL monitoring was slightly more effective than CD4 monitoring (in particular if we assumed that VL monitoring also reduces the risk of treatment failure). CD4 monitoring was more effective than clinical monitoring, and clinical monitoring was more effective than 1st-line ART only. However, differences in the effectiveness of any two strategies were all below 1 DALY per patient. We observed no clear difference between monitoring strategies that measured at different intervals, or between VL monitoring that used fully quantitative, highly specific and sensitive laboratory tests and VL monitoring that used a qualitative POC test. The cost-effectiveness of POC VL monitoring clearly improved if we reduced the gap between prices of 1st- and 2nd-line ART (Table 6).
Across our analyses, 12-month routine POC VL monitoring was on the cost-effectiveness frontier. The cost-effectiveness ratio of this strategy compared to clinical monitoring varied between US$700 and US$3300 per DALY averted. Cost-effectiveness improved if we assumed that VL monitoring reduces the risk of failure, and when the prices of 2nd-line ART and 1st-line ART were close. The cost-effectiveness ratio of US$700 per DALY averted was reached if 2nd-line costs were reduced to the minimum and we assumed that routine VL monitoring prevents failure. If we define a cost-effective intervention as having a cost-effectiveness ratio of less than 3 times the local per-capita gross domestic product, a cost-effectiveness ratio of US$700 per DALY averted can be considered cost-effective in any country [23,24].
We found no major differences between fully quantitative laboratory-based VL monitoring with CD4 tests, and qualitative POC VL monitoring. The potential disadvantage of a POC test that we assumed was the possibility of "false positive" switches to 2nd-line ART, i.e. switching patients who do not have a persistent detectable viral load. For example, if a patient has two successive detectable viral load values, and if the exact values of the VLs and the CD4 cell count are known, the clinician may be able to distinguish between a patient failing therapy and a patient with blips, poor adherence, measurement errors, etc. The qualitative test only gives a positive or negative result, upon which a decision must be based. Although we did not find differences between laboratory-based and POC VL monitoring overall, increasing the frequency of POC VL monitoring from 12 to 6 months slightly decreased, rather than increased, life-years. This was not the case for laboratory-based VL monitoring. We think this is caused by false positive failures with POC tests: increasing the measurement frequency increases the number of patients who switch unnecessarily, and this may affect their future treatment options.
The cost-effectiveness of targeted VL monitoring varied with the input parameters. We found that if the price of 2nd-line ART is considerably higher than for 1st-line ART, and if we assume no additional benefits for routine VL monitoring, it may be cost-effective to conduct routine CD4 monitoring with targeted VL testing. This strategy uses routine CD4 tests to detect patients who may be on a failing 1st-line regimen. Patients are then given VL tests to confirm their status. This strategy reduces the total costs by restricting use of 2nd-line ART to patients who need it most (those with low CD4 cell counts and a high risk of mortality), and by not switching patients with suppressed VLs. However, if we assume that routine VL monitoring can also reduce the risk of failure, or if the cost difference between the regimens is small, routine VL monitoring is preferable.
The optimal monitoring strategy has been investigated in a number of mathematical modeling studies. Walensky et al. published a review of five modeling studies that assessed the cost-effectiveness of different strategies for monitoring HIV-infected patients on ART [25]. Four of the models investigated VL monitoring [6,26,27,28]. Since the review was published, similar modeling studies have appeared [29,30]. A recent article [31] systematically compared three models. All these studies suggest that VL monitoring may moderately improve the outcomes of ART programs, but cost-effectiveness estimates vary substantially between them. The variation may be caused by the different assumptions in the input values. We believe our study is the first to include a tool that allows users to easily vary input values.

Limitations
Our analysis has several limitations. First, we only included time on ART. Diagnostic tests, and, in particular, CD4 cell counts, are usually recommended for monitoring patients before ART is initiated. If assessment of ART eligibility continues to depend on CD4 cell counts, CD4 monitoring will continue to be important. However, there is a growing tendency towards simpler rules for ART initiation, such as "Option B+", in which all pregnant and breastfeeding women start lifelong ART [32], or universal "test and treat" [33]. We anticipate that, in many settings, there will be a decreased need to measure CD4 cell count to assess eligibility for ART. However, to estimate the true costs of CD4 testing, the use of CD4 cell measurements for purposes other than on-treatment monitoring must be taken into consideration. The usefulness of monitoring immune response by CD4 cell measurement during the first year of ART should also be evaluated.
We did not include loss to follow-up (LTFU). High rates of LTFU are a serious problem in most ART programs in resource-limited settings [34]. LTFU is a combination of unregistered deaths, unregistered transfers, and cases in which a patient has stopped ART [35]. We took the effect of LTFU on mortality into account in the parameterization of mortality. Patients who transferred to another ART clinic can be expected to take ART as recommended. We did not include the effect of patients who stopped ART. POC monitoring may offer advantages over laboratory-based monitoring by making it easier to manage patients and by shortening wait times. This may, in turn, improve retention [36], and is another argument for POC monitoring, but it is not studied in the current analysis. We did not model onward transmission. VL monitoring reduces the time the patient spends on a failing regimen. VL monitoring can prevent about 30% of transmissions from patients on treatment [2,5], and this may substantially improve the cost-effectiveness of VL monitoring. Therefore, the long-term population-level benefits of VL monitoring may be larger than we estimated.
We assumed that laboratory-based VL monitoring, together with CD4 tests, is 100% sensitive and specific to detect true treatment failure. Although this is a simplifying assumption, we wanted to include two distinct types of viral load test: one of these is as accurate as possible, and one has a lower sensitivity and specificity. The POC tests were assumed to be neither fully sensitive nor specific to detect true treatment failure. The users of the Excel tool can therefore choose either laboratory-based or POC viral load monitoring, depending on the diagnostic capacities of the viral load test they want to investigate. We also assumed CD4 tests are fully sensitive and specific. But since CD4 cell count poorly predicts true treatment failure, this assumption should not have much effect on the results.
The Excel tool allows users to vary input costs and several other input parameters. However, parameters related to disease progression cannot be changed in the Excel tool. Most key parameters for HIV progression are based on two large ART cohorts from the Cape Town area: Gugulethu and Khayelitsha. The results of the model reflect the characteristics of these routine ART programs, typical of southern Africa: the majority of patients are women, and most patients start ART with low CD4 cell counts and advanced clinical symptoms. We believe the results of our model can be generalized widely for ART programs in sub-Saharan Africa. However, there are also important differences between the cohorts we drew on, and others: the Cape Town cohorts had access to frequent laboratory measurements, and there is a continuous tendency to start ART earlier. Our model may not be able to catch all site-level characteristics of the different settings.

Conclusion
POC VL testing appears to be a promising alternative for routinely monitoring ART in resource-limited settings, especially if we assume that viral load monitoring can prevent treatment failure by improving adherence, and that the gap between prices of 1st- and 2nd-line ART can be decreased. Under these conditions, VL monitoring every 12 or 24 months, with an affordable qualitative test, does not considerably increase the costs above those of CD4 monitoring. POC VL testing may offer the same benefit as frequently monitoring VL with a quantitative test. Routine VL monitoring may also have benefits beyond more accurate detection of treatment failure; it may for example be able to prevent treatment failure by improving adherence. Targeted VL monitoring, based on routine CD4 monitoring, may be an option if these potential benefits are small, if 2nd-line ART remains substantially more expensive than 1st-line ART, and if it is expected that 2nd-line ART cannot be provided for everyone failing virologically. Special attention should be paid to the price of 2nd-line ART, which we expect will play a more substantial role than the price of the diagnostic tests. To make routine viral load monitoring cost-effective, efforts must be made to reduce the costs of 2nd-line antiretroviral drugs. The optimal monitoring and switching strategy for each setting depends substantially on factors such as the unit costs. Our Excel tool allows researchers and policy-makers to vary important cost and population structure parameters and may be a valuable tool for developing local monitoring guidelines.
Supporting Information
S1 Excel File. Excel tool for adapting the model outputs to specific scenarios. (XLSX)
S1 [...] Table. Model outcomes: sensitivity analysis SL2 assuming that the annual cost of 2nd-line ART is US$140. (DOCX)
S10 Table. Model outcomes: sensitivity analysis SL3 assuming that the annual cost of 2nd-line ART is US$350. (DOCX)
S11 Table. Model outcomes: sensitivity analysis DI1 assuming no discounting.