A Single CD4 Test with 250 Cells/mm3 Threshold Predicts Viral Suppression in HIV-Infected Adults Failing First-Line Therapy by Clinical Criteria

Background In low-income countries, viral load (VL) monitoring of antiretroviral therapy (ART) is rarely available in the public sector for HIV-infected adults or children. Using clinical failure alone to identify first-line ART failure and trigger regimen switch may result in unnecessary use of costly second-line therapy. Our objective was to identify CD4 threshold values to confirm clinically-determined ART failure when VL is unavailable. Methods 3316 HIV-infected Ugandan/Zimbabwean adults were randomised to first-line ART with Clinically-Driven Monitoring (CDM; CD4s measured but blinded) or routine Laboratory and Clinical Monitoring (LCM; 12-weekly CD4s) in the DART trial. CD4 at switch and ART failure criteria (new/recurrent WHO 4 event, single/multiple WHO 3 events; LCM: CD4<100 cells/mm3) were reviewed in 361 LCM and 314 CDM participants who switched over median 5 years follow-up. Retrospective VLs were available in 368 (55%) participants. Results Overall, 265/361 (73%) LCM participants failed with CD4<100 cells/mm3; only 7 (2%) switched with CD4≥250 cells/mm3, four switches triggered by WHO events. Without CD4 monitoring, 207/314 (66%) CDM participants failed with WHO 4 events, and 77 (25%)/30 (10%) with single/multiple WHO 3 events. Failure/switching with single WHO 3 events was more likely with CD4≥250 cells/mm3 (28/77; 36%) (p = 0.0002). CD4 monitoring reduced switching with viral suppression: 23/187 (12%) LCM versus 49/181 (27%) CDM participants had VL<400 copies/ml at failure/switch (p<0.0001). Amongst CDM participants with CD4<250 cells/mm3, only 11/133 (8%) had VL<400 copies/ml, compared with 38/48 (79%) with CD4≥250 cells/mm3 (p<0.0001). Conclusion Multiple, but not single, WHO 3 events predicted first-line ART failure. A CD4 threshold 'tiebreaker' of ≥250 cells/mm3 for clinically-monitored patients failing first-line could identify ~80% with VL<400 copies/ml, who are unlikely to benefit from second-line.
Targeting CD4s to single WHO stage 3 ‘clinical failures’ would particularly avoid premature, costly switch to second-line ART.


Introduction
Most HIV-infected individuals on antiretroviral therapy (ART) in low/middle-income countries are treated following the WHO public health approach [1]: the public sector provides one standard first-line regimen, with alternative drug substitutions for anti-tuberculosis co-therapy/toxicity; when first-line failure occurs, the patient switches to a standard boosted-protease inhibitor (bPI)-based second-line regimen. Current WHO guidelines [2] define failure by virological (>5,000 copies/ml), immunological (CD4 below pre-therapy baseline; 50% fall from on-treatment peak; persistently <100 cells/mm3) or clinical criteria after 6 months on ART. Low-income countries differ in their ability to provide laboratory tests to identify first-line failure and support routine follow-up; if available at all, CD4 testing is most common, with viral loads (VL) sometimes used to confirm clinical/immunological failure [2]. Routine virological monitoring is rarely available or feasible [3]. Such approaches differ markedly from individualised management in high-income countries, where routine VL monitoring is used to modify initial or subsequent therapy and many drugs are available.
The WHO 2010 definition of clinical failure includes WHO 4 and certain WHO stage 3 conditions. Without VLs, it is strongly recommended that immunological criteria confirm clinical failure (noting moderate quality of evidence), but no CD4 threshold value is proposed. We therefore evaluated switches from first- to second-line ART in the DART trial [4,5], particularly considering the unique group randomised to clinically-driven monitoring (CDM) and managed without CD4 counts, but for whom CD4s (and some VLs) were available for retrospective analysis. The aims were to investigate the characteristics of immunological/clinical failures determined without routine VL monitoring, and with or without routine real-time CD4 monitoring; and to identify optimal CD4 thresholds to confirm clinical failure and switch to second-line ART when VLs are unavailable.

Ethics statement
Informed written consent was obtained from each participant and the trial was approved by ethics committees in Uganda, Zimbabwe and the UK.

Procedures
All participants were reviewed 4-weekly by a nurse, saw a doctor, and had routine lymphocyte subsets (CD4, CD8) measured at screening, at weeks 4 and 12, and 12-weekly thereafter. All results from LCM participants were returned to clinicians, whereas for CDM participants CD4 counts were measured but never returned. For all participants, if clinically indicated, diagnostic and other tests could be requested (excluding CD4/total lymphocytes in CDM) and concomitant medications prescribed. VLs were not performed in real-time, but were measured retrospectively on stored plasma samples using Roche Amplicor v1.5.
Participants could substitute alternative antiretrovirals from the nucleoside reverse transcriptase inhibitor (NRTI)/non-NRTI classes in first-line regimens for toxicity, TB treatment or other reasons; these substitutions did not count as first-line failures. The decision to switch to a second-line bPI-containing regimen for first-line failure was based on clinical criteria in both groups (new/recurrent WHO 4 event (per-protocol); or single or multiple WHO 3 events at clinician discretion), or confirmed CD4 <100 cells/mm3 on ART (<50 cells/mm3 before July 2006) for LCM. Switching was discouraged before 48 weeks on first-line. Clinicians' decisions to switch also took account of adherence and social circumstances, following standard clinical practice. VL at switch was assayed in all participants enrolling in nested second-line studies from 2007 onwards [7,8], plus a random sample of other second-line switches (including switches during 2004-2006; 85% assayed on the day of switch, 93% within 4 weeks, and all within the previous 4 months). 336/459 (73%) participants switching after 1 January 2007 chose to join one or both studies. As per the DART protocol, all reported WHO 4 events (but not WHO 3 events) were reviewed against pre-specified criteria by an independent Endpoint Review Committee (ERC) blinded to randomised allocation. This was done retrospectively and did not impact clinical decision-making.

Statistical analysis
Switch from first- to second-line ART for clinical or immunological failure (the latter only in LCM participants, who were managed using routine CD4 counts) was the primary outcome of interest. Participants were followed under CDM/LCM strategies until 31 December 2008; this analysis includes only switches to this timepoint and is an exploratory analysis of trial data not specified in the original trial protocol. We used Kaplan-Meier methods to evaluate the delay between first meeting WHO 4 criteria for switch (in all patients with WHO 4 events after 44 weeks on ART, see below) and actual change of regimen. Although the protocol discouraged switching before 48 weeks, we chose 44 weeks as the cut-off for this analysis to include a small number of patients who switched just before 48 weeks because they returned early for their 48-week visit. The secondary outcome was mortality following switch to second-line. The main exposures considered in those who switched were the reported reason for switching, and CD4 and VL at switch. Analyses of VL included all available VLs, which had been measured retrospectively in a subset of participants (see above). Where LCM participants met both immunological (CD4 <100 cells/mm3) and clinical failure criteria, they were counted as immunological failures. Where CDM participants had both WHO 4 and WHO 3 events, they were counted as WHO 4 failures. Categorical variables were compared using chi-squared/exact tests, and continuous variables using t-tests/rank-sum tests. To inform clinical practice when 'tiebreaker' VL tests are not available to confirm that clinical/immunological failure has occurred with detectable VL [2], we used receiver-operating-characteristic (ROC) curves to identify the most sensitive and specific (equal weighting) CD4 threshold cut-off for detecting suppressed VL at the point of clinical/immunological first-line failure.
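The equal-weighting threshold search described above can be sketched as follows. This is an illustrative reconstruction on synthetic data, not the trial analysis code: each candidate CD4 cut-off is scored by the proportion of observations it classifies correctly when "CD4 at or above the cut-off" predicts suppressed VL.

```python
# Sketch of the ROC threshold search: for each candidate CD4 cut-off,
# classify "CD4 >= cutoff" as predicting suppressed VL (<400 copies/ml)
# and keep the cut-off classifying the largest proportion correctly.
# Data below are synthetic, for illustration only.

def best_cd4_threshold(records):
    """records: list of (cd4_count, vl_suppressed) pairs,
    where vl_suppressed is True if VL < 400 copies/ml."""
    candidates = sorted({cd4 for cd4, _ in records})
    best = None
    for cutoff in candidates:
        correct = sum(
            (cd4 >= cutoff) == suppressed for cd4, suppressed in records
        )
        frac = correct / len(records)
        if best is None or frac > best[1]:
            best = (cutoff, frac)
    return best  # (cd4 cut-off, proportion correctly classified)

# Synthetic example: low CD4 with detectable VL, higher CD4 suppressed.
data = [(30, False), (80, False), (150, False), (210, False),
        (230, True), (260, True), (300, True), (450, True)]
cutoff, accuracy = best_cd4_threshold(data)  # -> (230, 1.0)
```

In the trial analysis the same idea is applied to the retrospective CD4/VL pairs, yielding the ROC curve and the 220 cells/mm3 optimum reported in the Results.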

Results
Patients monitored clinically without CD4 (or VL) results
1660 CDM participants initiated ART with median (IQR) CD4 86 (31-139) cells/mm3 and were clinically monitored without CD4 counts for a median of 5 years. 314 (19%) switched to bPI-containing second-line for first-line failure after median (IQR) 3.4 (2.5-4.2) years on first-line (only 2 before 48 weeks, when switching was discouraged, both at 46 weeks). In those who switched, median (IQR) pre-ART CD4 was 47 (14-104) cells/mm3 and age at switch 39 (34-44) years; 193 (61%) were female. 223 (13%) CDM participants had new/recurrent WHO 4 events accepted by the ERC after 44 weeks on ART: 187/223 (84%) switched to second-line, 14 (6%) died on first-line before switching and 22 (10%) had not switched before trial closure. The Kaplan-Meier median (IQR) time to switch after meeting failure criteria was 7 (1-23) weeks. The most commonly reported reasons for delaying switch for >8 weeks, or not switching, were that the WHO 4 event was judged unrelated to ART failure by the clinician (45%) or drug-drug interactions between rifampicin and bPI (32%) (Table 1; n = 44). An additional 20 patients switched to second-line for WHO 4 events eventually judged not to meet predefined protocol criteria by the ERC, leading to a total of 207/314 (66%) switches in CDM being for WHO 4 events.
In the same period, 70 multiple (within 60 days) and 392 single WHO 3 events were reported in participants not switching for WHO 4 events. Clinicians used clinical judgement to assess which of these events were likely first-line ART failure, leading to 30 (10%) and 77 (25%) of the 314 CDM switches being for multiple or single WHO 3 events respectively (Table 2). More switches for multiple/single WHO 3 events occurred over calendar time (2004-2008), reflecting wider promotion of WHO 3 events as switch criteria in the WHO 2006 guidelines [9]; e.g. 85% of pre-2007 switches were due to WHO 4 events compared with 53% subsequently.
Most WHO 4 events triggering switch had relatively similar proportions with CD4 ≥250 cells/mm3 and <50 cells/mm3 (Table 3), with the exception of cryptococcal meningitis, where one third (11 events) triggered switch with CD4 ≥250 cells/mm3. Interestingly, CD4 was ≥250 cells/mm3 in three of the four switches to second-line triggered by lymphoma. Although generally considered a less severe WHO 4 event, median (IQR) CD4 at switch triggered by oesophageal candidiasis was only 30 (8-68) cells/mm3. Weight loss, severe bacterial infection (SBI) and diarrhoea were the main single WHO 3 events triggering switch with CD4 ≥250 cells/mm3 (44%, 43% and 100% respectively), likely reflecting their frequency in adults irrespective of HIV status or CD4. However, combinations of ≥2 WHO 3 events triggered switch at lower CD4 counts, similar to WHO 4 events.
Overall, 181 (58%) CDM participants had VL at switch to second-line assayed retrospectively. 49 (27%) had VL <400 copies/ml, with similar proportions across reasons for triggering switch (p = 0.29) (Table 2). Thus 3.7 'tie-breaker' VL tests at clinical failure would be needed to prevent one switch with suppressed VL. There was a very wide range of VLs in clinically-monitored participants failing and switching with CD4 <100 cells/mm3 (Figure 1(a)). In contrast, most with CD4 >250 cells/mm3 had suppressed VL.
To inform practice where VL testing is unavailable or performed off-site (when return of results may be delayed considerably), we evaluated the predictive ability of a single tiebreaker CD4 count at clinically-triggered switch to identify participants with VL <400 copies/ml, using data on VLs assayed retrospectively on stored samples and CD4 counts which had not been returned to clinicians. VL was <400 copies/ml in 38/48 (79%) with CD4 ≥250 cells/mm3 versus only 11/133 (8%) with CD4 <250 cells/mm3 (p<0.0001). The area under the receiver-operating-characteristic (ROC) curve (Figure 2(a)) was 0.91 (95% CI 0.86-0.96), with an optimal threshold, at which most observations (90%) were correctly classified, of CD4 ≥220 cells/mm3. This cut-off had 86% sensitivity, 92% specificity, 79% positive predictive value and 95% negative predictive value for identifying participants with VL <400 copies/ml (LR+ = 10.3, LR− = 0.16). ROC areas were similar according to reason for switch and in those joining and not joining second-line studies (Table 2). Therefore a threshold of ≥250 cells/mm3, a close but more practical cut-off for a CD4 point-of-care assay, would capture most individuals without virological failure, for whom switching could be premature and unnecessary.
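As an arithmetic check, the operating characteristics of the 250 cells/mm3 tiebreaker can be recomputed directly from the two-by-two counts reported above (38/48 suppressed with CD4 ≥250; 11/133 suppressed with CD4 <250). Note these values are for the 250 cut-off and therefore differ slightly from the quoted figures, which correspond to the ROC-optimal 220 cells/mm3 threshold. A minimal sketch:

```python
# Diagnostic-accuracy arithmetic for the 250 cells/mm3 tiebreaker,
# using the 2x2 counts reported in the text (CDM participants with
# retrospective VL at switch): 38/48 with CD4 >= 250 had VL < 400
# copies/ml, versus 11/133 with CD4 < 250.

tp = 38          # CD4 >= 250 and VL < 400 (correctly flagged as suppressed)
fp = 48 - 38     # CD4 >= 250 but VL >= 400
fn = 11          # CD4 < 250 but VL < 400
tn = 133 - 11    # CD4 < 250 and VL >= 400

sensitivity = tp / (tp + fn)   # suppressed VL caught by CD4 >= 250
specificity = tn / (tn + fp)   # detectable VL with CD4 < 250
ppv = tp / (tp + fp)           # CD4 >= 250 who are truly suppressed
npv = tn / (tn + fn)           # CD4 < 250 who are truly detectable
lr_pos = sensitivity / (1 - specificity)  # positive likelihood ratio
```

At this cut-off, sensitivity is 38/49 ≈ 78% and positive predictive value 38/48 ≈ 79%, consistent with ~80% of those switching with CD4 ≥250 cells/mm3 having suppressed VL.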
Patients monitored clinically with 12-weekly CD4 but no VL results
1656 LCM participants initiated ART with median (IQR) CD4 86 (32-140) cells/mm3 and were clinically monitored, together with 12-weekly CD4 counts, for a median of 5 years. 361 (22%) switched to bPI-containing second-line for first-line failure after a median (IQR) 2.8 (2.1-3.8) years on first-line (shorter than in CDM, p = 0.0001) (only 1 before 48 weeks). In those who switched, median (IQR) pre-ART CD4 was 42 (17-85) cells/mm3. The most commonly reported reasons for delaying switch for >8 weeks, or not switching, were high CD4 counts (71%) or drug-drug interactions between rifampicin and bPI (24%) (Table 1; n = 42). An additional 75 participants switched to second-line ART for WHO 4 events not judged to meet pre-defined protocol criteria by the ERC, single/multiple WHO 3 events, or other CD4-related reasons.

Discussion
Most public sector ART clinics in low-income settings have very limited laboratory capacity to monitor patients on therapy, so justifying and prioritising services provided to support clinical monitoring is critical. DART has already shown that routine CD4 monitoring alone from the second year of ART has a small but important impact on survival [4]. WHO recommends the use of VLs to confirm immunological/clinical failure [2]. Currently, neither test is widely accessible; e.g. in April-June 2011 there were only 50 functional CD4 machines in 449 ART clinics in Malawi [10], and in early 2012 only 4/59 ART centres in Malawi, Zimbabwe and Uganda, including hospitals, had the possibility to monitor VL, including off-site [3]; even if theoretically functional, lack of electricity/consumables/personnel may further reduce their availability in practice. Further analysis of the unique group of DART participants failing first-line ART who were clinically-monitored without routine CD4/VL tests, but with retrospective CD4 and VLs available, now shows the utility of multiple, but not single, WHO 3 events as clinical failure criteria in the absence of any CD4 monitoring; and that, where access to single CD4 tests is available, a CD4 tiebreaker at a 250 cells/mm3 threshold could identify ~80% of those failing clinically with VL <400 copies/ml, who may be unlikely to benefit from switching to more costly second-line therapy. Despite WHO recommendations to use a VL 'tie-breaker' test to confirm clinical/immunological failure [2], access to expensive HIV RNA testing is unlikely to improve soon given the current financial crisis. Point-of-care VL testing could dramatically change this, but is unlikely to become available for several years, and may still be relatively costly. Meanwhile, many public sector ART programmes will continue to monitor ART patients with negligible access to VL.
In contrast, CD4 testing is more widely accessible [3], and point-of-care devices already in evaluation will soon increase coverage [11]. However, the sheer volume of testing will remain challenging, as evidenced by stockouts even of simple HIV tests [3] and given the 6.6 million adults/children receiving ART in low/middle-income settings [12]. Making routine CD4 monitoring available to all would require significant additional investments in laboratory infrastructure, personnel and consumables, which may not be possible given the current financial situation, particularly as, at current costs, it is not cost-effective for most African countries [13,14], and a more pressing priority is to roll out ART to more who need it. Additional benefits of routine VL over CD4 monitoring are small [15,16] or negligible [17] and even less cost-effective [18]. It is therefore essential to consider parsimonious ways to use CD4 testing without VLs to support clinical monitoring in the critical decision of when to switch to second-line.

Figure 2. The straight line indicates performance no better than chance. The threshold with the greatest probability of correctly classifying each CD4 count according to whether it has VL <400 copies/ml or not is indicated, with sensitivity (proportion with VL <400 c/ml who have CD4 ≥threshold), specificity (proportion with VL ≥400 c/ml who have CD4 <threshold), positive predictive value (proportion of patients with CD4 ≥threshold who have VL <400 c/ml) and negative predictive value (proportion of patients with CD4 <threshold who have VL ≥400 c/ml). doi:10.1371/journal.pone.0057580.g002

As our analysis investigated the characteristics of those patients switched to second-line, we did not (and cannot) estimate the overall accuracy of CD4 (or clinical) criteria for identifying virological failure in all individuals on treatment.
However, our data clearly confirm that monitoring for clinical failure alone over-identifies treatment failure, potentially resulting in unnecessary and premature switching to more costly second-line ART [14,19]: 20% of clinical failures/switches had CD4 >250 cells/mm3. The low CD4 nadir in DART participants (10% had pre-ART CD4 <10 cells/mm3) may have contributed to this, with patients at long-term risk for events such as lymphoma despite immune reconstitution. Furthermore, 12% of CD4-monitored and 27% of clinically-monitored participants had suppressed VL at failure/switch, as previously reported [19-22]. Discordance between clinical, immunological and virological failure at any single timepoint is expected, as they track different processes. Nor is failure always absolute: e.g. 50% of patients with virological failure and genotypic NNRTI resistance in one South African cohort resuppressed while receiving an NNRTI [23]. In resource-limited settings with access to single confirmatory laboratory tests, the challenge of how to deal practically with discordance remains.
Whilst relatively short periods on ART leading to incomplete immune reconstitution may account for some discrepancies in previous studies [19,20], the 18 participants switching with CD4 <110 cells/mm3 and VL <400 copies/ml in DART had been on first-line ART for a median of 3.4 years. Some had variable CD4 responses on first-line, with periods of low CD4 without documented non-adherence; others had never responded immunologically, similar to Kantor et al., where 3/7 patients with persistent CD4 <100 cells/mm3 had undetectable VL [20]. Most DART participants with VL <400 copies/ml but very low CD4 at failure benefited considerably immunologically from second-line switch; however, 22% (4/18) died shortly after switching. Given increased mortality risks at CD4 <100 cells/mm3 even with suppressed VL in resource-rich [24,25] and resource-limited settings [26], there may be clinical benefits from switching this specific group of discordant responders with very low CD4 from an NRTI/NNRTI-based first-line to a bPI-based second-line regimen. Of interest, we also observed relatively high mortality in participants switching with high CD4 counts and WHO 4 events, possibly reflecting that developing such events despite apparently high absolute CD4 counts may indicate underlying functional immune deficits that themselves impact mortality risk.
Those 'failing ART' clinically with only single WHO 3 events can be viewed in two ways. The fact that nearly 40% had CD4 >250 cells/mm3 and 33% had VL <400 copies/ml demonstrates the lack of sensitivity of these events for ART failure, probably because of their frequency in the underlying population irrespective of HIV status. Whilst our results suggest single WHO 3 events should not trigger switch in isolation, conversely 64% of this group had CD4 ≤250 cells/mm3 and 67% had VL >400 copies/ml, greater than in the wider population on first-line. Targeting them for confirmatory tiebreaker CD4 testing could therefore identify additional ART failures whilst avoiding premature switching. Of note, participants switching with multiple WHO 3 events had lower CD4 and higher VL at switch than those with single WHO 3 events. Adding WHO 3 (single or multiple) events to the WHO 4 criteria for clinical failure increased the numbers identified by about 50%.
One potential limitation of our study is that clinical monitoring was conducted by nurses and clinicians in relatively well-supported, staffed and supervised sites with no ART stock-outs. DART had excellent retention (just 7% loss to follow-up over five years) and 5-year survival (87% CDM, 90% LCM). Among participants randomised to CDM, CD4s were measured but not returned, and health-workers remained blinded to CD4s throughout the trial. This enabled us to perform analyses impossible outside a trial design such as DART. Whilst several WHO 4 events triggering switch in DART were not considered to meet criteria for trial endpoints on independent review, we included them in these analyses. In ART programmes, more WHO 3/4 events might be over-diagnosed (with clinicians conservatively ascribing clinical episodes as WHO 3/4 events), but these would likely occur with high CD4 and VL <400 copies/ml, supporting the generalisability of our findings. DART participants were severely immunocompromised (median 86 cells/mm3) when initiating ART: generalisability to programmes initiating ART earlier is unknown. However, such patients would take longer to fail on first-line therapy, and therefore it is plausible that a greater (rather than smaller) proportion of clinical events on first-line would be single WHO 3 rather than multiple WHO 3 or 4 events, in whom we found greatest VL suppression. Our findings may therefore be more, rather than less, generalisable to such settings. The DART protocol only included one of the three WHO immunological failure criteria (confirmed <100 cells/mm3) for LCM participants (although a small number of participants switched for other CD4 concerns before this).
The other criteria require a series of CD4s (50% decline from peak) or a pre-ART CD4 (drop below pre-ART baseline) and were not included in the protocol because they were judged impractical in settings with limited access to CD4 testing and are in any case not validated; it was considered that switching should be determined only by known predictors of mortality on ART (i.e. overall CD4, not declines or drops below baseline). These other immunological failure criteria also generally lead to switch at higher CD4 counts and might be expected to be associated with higher rates of VL suppression at switch than demonstrated here, but we cannot assess this. In addition, as real-time VL monitoring was not performed, we cannot evaluate the WHO VL >5,000 copies/ml criterion for switching without immunological or clinical failure [2], nor investigate the performance of CD4 criteria in identifying this threshold value. Figure 1 shows that most participants switching for immunological/clinical failure with VL >400 copies/ml had values around or above this threshold.
The unique opportunity afforded by the DART trial design to evaluate the value of using CD4 testing parsimoniously to support clinical monitoring when VL monitoring is unavailable provides clear evidence that a single CD4 tie-breaker at clinical failure with a 250 cells/mm3 threshold can identify patients who have suppressed VL, providing the potential to reduce unnecessary switching to costly second-line ART. Adding a single CD4 count in those failing clinically could therefore improve the specificity (and positive predictive value) of clinically identified failure for virological failure: clearly a single count cannot improve the sensitivity for detecting virological failure in patients without clinical events. Our results further suggest that multiple (but not single) WHO 3 events should be considered, as well as WHO 4 events, in the definition of clinical failure when CD4 testing is unavailable. If limited CD4 testing is possible, targeting this to patients with a single WHO stage 3 event could identify additional failures with detectable VL whilst still avoiding premature, costly switching to second-line. Programmes in low-income countries that are considering how to scale up laboratory services to rationally support regular clinical follow-up can use these results to plan how to widen access to CD4 monitoring, taking advantage of new point-of-care technologies.