Simplified Clinical Prediction Scores to Target Viral Load Testing in Adults with Suspected First Line Treatment Failure in Phnom Penh, Cambodia

Background For settings with limited laboratory capacity, 2013 World Health Organization (WHO) guidelines recommend targeted HIV-1 viral load (VL) testing to identify virological failure. We previously developed and validated a clinical prediction score (CPS) for targeted VL testing, relying on clinical, adherence and laboratory data. While outperforming the WHO failure criteria, it required substantial calculation and review of all previous laboratory tests. In response, we developed four simplified, less error-prone and broadly applicable CPS versions that can be done ‘on the spot’.
Methodology/Principal Findings From May 2010 to June 2011, we validated the original CPS in a non-governmental hospital in Phnom Penh, Cambodia, applying the CPS to adults on first-line treatment >1 year. Virological failure was defined as a single VL >1000 copies/ml. The four simplified CPSs were CPS1, with ‘current CD4 count’ instead of %-decline-from-peak CD4; CPS2, with hemoglobin measurements removed; CPS3, with ‘decrease in CD4 count below baseline value’ removed; and CPS4, which was purely clinical. Score development relied on the Spiegelhalter/Knill-Jones method. Variables independently associated with virological failure with a likelihood ratio ≥1.5 or ≤0.67 were retained. CPS performance was evaluated based on the area-under-the-ROC-curve (AUROC) and 95% confidence intervals (CI). The CPSs were validated in an independent dataset. A total of 1490 individuals were included (56.6% female; median age 38 years (interquartile range (IQR) 33–44); median baseline CD4 count 94 cells/µL (IQR 28–205); median time on antiretroviral therapy 3.6 years (IQR 2.1–5.1)). Forty-five (3.0%) individuals had virological failure. CPS1 yielded an AUROC of 0.69 (95% CI: 0.62–0.75) in validation, CPS2 an AUROC of 0.68 (95% CI: 0.62–0.74), and CPS3 an AUROC of 0.67 (95% CI: 0.61–0.73). The purely clinical CPS4 performed poorly (AUROC 0.59; 95% CI: 0.53–0.65).
Conclusions Simplified CPSs retained acceptable accuracy as long as current CD4 count testing was included. Ease of field application and field accuracy remain to be defined.


Introduction
Scaling-up of antiretroviral treatment (ART) is currently ongoing in low and middle income countries (LMIC), aiming to initiate 15 million individuals on ART by 2015 [1]. One of the key challenges for program managers and policy makers in these countries is how to monitor these individuals for treatment failure, considering the often limited financial resources at hand [2].
Routine viral load (VL) testing is now recommended by the World Health Organization (WHO) [3] but, with currently available technologies, comes at a high cost and is technically demanding. Thus, it will still take many years before it is readily available in routine program settings in many LMIC [4,5]. Instead, for settings with limited VL testing capacity, the 2013 WHO guidelines recommend targeted VL testing [3].
Targeted VL testing, whereby VL testing is done only in individuals meeting failure criteria, aims to avoid unnecessary and costly switches to second-line treatment for patients with false-positive 'screening' tests [2]. Effective implementation of such a strategy requires accurate and evidence-based tools to target VL testing. Whereas most programs have been applying WHO clinical and immunological failure criteria [6], studies have consistently demonstrated the low sensitivity and specificity of these criteria [2,7].
We previously developed a clinical prediction score (CPS) for virological failure integrating clinical, adherence and laboratory data [8]. At the same time, we constructed an algorithm combining the CPS with targeted VL testing (in patients with a CPS ≥2). The score performed substantially better than the WHO failure criteria, and performed well in internal validation (Cambodia) and external validation (Uganda) [9,10]. Experience from Lesotho provided additional support for its use in patients who were identified based on WHO immunological and clinical criteria as treatment failure [11]. A few other CPSs have been developed for the same purpose, but they have either not been validated, or performed poorly during validation [9,12,13].
Two limitations of the original CPS were identified during the validation study in Cambodia [10]. Firstly, frequent errors were made when physicians applied the score, which affected the CPS performance. Errors in scoring of the individual items were most commonly seen for items like the percentage decrease from peak CD4 cell counts or the decline in hemoglobin values, since these items require calculation, and availability and review of all previous laboratory results. Secondly, the reliance on regular laboratory monitoring of CD4 count and hemoglobin values limits implementation of the original CPS in settings where such tests are not routinely performed. In response to these limitations, we derived and validated several simplified versions of the original clinical score, aiming to make the tool less error-prone, easier to apply and more broadly applicable.

Study Setting
Sihanouk Hospital Center of HOPE (SHCH) is a non-governmental hospital in Phnom Penh, Cambodia. Since 2003, the hospital has provided ART at no cost as part of the national program. Patients were initiated and treated according to WHO recommendations [6,14]. First-line treatment consisted of a generic combination of stavudine, lamivudine and nevirapine. Zidovudine and efavirenz were used in case of contraindications. The revised 2010 guidelines were implemented in May 2010. Patients were seen at regular intervals for clinical and laboratory monitoring and adherence assessment. All care was provided by physicians. More information on the program has been published before [15-17].

Validation Study of the Original CPS
Details of the original validation study have been previously reported [10]. In brief, we conducted a cross-sectional study within an established ART cohort in Cambodia between May 10, 2010 and June 3, 2011, including all consenting adults on a non-nucleoside-containing (standard) first-line regimen for a minimum of one year (n = 1490). The original score (derived in this hospital) contained eight items: 1) CD4 decline from peak >25%; 2) CD4 decline from peak >50%; 3) hemoglobin drop ≥1 g/dL over the last six months; 4) current CD4 below baseline; 5) CD4 count <100 cells/µL at 12 months of ART; 6) papular pruritic eruption (PPE); 7) treatment adherence <95% (measured using a visual analogue scale); 8) ART-experience [8]. Summing the predictor scores of the individual's risk factors yielded the total predictor score for each patient. Virological failure was defined as a VL above 1000 copies/ml at the cross-sectional survey. The overall test performance was assessed by calculating the area under the receiver-operating characteristic curve (AUROC).

Development and Validation of Simplified CPS
The following simplified CPSs were evaluated, with a progressive increase in simplification by removing: 1) the 'decline from peak CD4 count values' and only using 'current' CD4 count, i.e. the value of the CD4 count at the time of assessment for virological failure (CPS1); 2) hemoglobin as item in the CPS (CPS2); 3) 'decrease in CD4 count below baseline value' as item in the CPS (CPS3); and 4) all laboratory items from the CPS (CPS4).
The same methodology for score development was used as for the original score derivation [8]. In brief, we used the Spiegelhalter and Knill-Jones method adapted by Berkley et al [18-20]. Continuous variables were dichotomized as guided by ROC curves. Variables independently associated with virological failure with a likelihood ratio (LHR) ≥1.5 or ≤0.67 were retained. The score of each predictor was calculated as the natural logarithm of the adjusted LHR, rounded to the nearest integer. Summing the predictor scores of the individual's risk factors yielded the total predictor score for each patient. For simplification, the score was recoded by setting the reference level for each predictor to zero [20]. The performance of the different CPSs was assessed based on the AUROC and 95% confidence intervals (CIs). A CPS was considered clinically useful if, during validation, the lower limit of the 95% CI was 0.6 or above. In addition, we calculated the sensitivity, specificity, positive and negative predictive values, the positive and negative likelihood ratios and the proportion of individuals that would require VL testing at the different cut-off values of each CPS. The simplified scores were developed using the dataset of the study mentioned above, conducted in 2010-2011. The different CPSs were validated in an independent dataset from the same hospital (2005-2007), as reported before [8]. In this study, patients were enrolled following the same inclusion criteria as the 2010-2011 study mentioned above (Table 1). Confidence intervals for the validated AUROCs were calculated using robust standard errors.
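The scoring arithmetic described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's analysis code: the function names are hypothetical, and the likelihood ratios are invented values chosen only to show the rounding and summing steps. The recoding to a zero reference level is not shown.

```python
import math

def is_retained(adjusted_lhr):
    """Retention rule from the text: keep predictors with LHR >= 1.5 or <= 0.67."""
    return adjusted_lhr >= 1.5 or adjusted_lhr <= 0.67

def predictor_points(adjusted_lhr):
    """Points for one predictor: ln(adjusted LHR), rounded to the nearest integer."""
    return round(math.log(adjusted_lhr))

def total_score(present, points):
    """Sum the points of the predictors that are present in one patient."""
    return sum(points[name] for name in present if name in points)

# HYPOTHETICAL adjusted likelihood ratios, for illustration only (not study results).
hypothetical_lhrs = {
    "current_cd4_below_cutoff": 8.0,
    "papular_pruritic_eruption": 3.0,
    "adherence_below_95_percent": 2.5,
    "weak_predictor": 1.2,  # fails the retention rule and is dropped
}

points = {
    name: predictor_points(lhr)
    for name, lhr in hypothetical_lhrs.items()
    if is_retained(lhr)
}

# A hypothetical patient with two of the retained predictors present.
patient = {"current_cd4_below_cutoff", "adherence_below_95_percent"}
print(points)
print(total_score(patient, points))
```

Because each predictor contributes an integer number of points, the total can be added up by hand at the bedside, which is the property the simplified scores aim to preserve.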

Ethical Considerations
The VL validation study was approved by the National Ethics Committee for Health Research, Phnom Penh, Cambodia; the institutional review board (IRB) of the Institute of Tropical Medicine, Antwerp, Belgium; and the Ethics Committee of the University Hospital of Antwerp, Belgium. All study participants provided written consent.

Results
A total of 1490 adults on first-line ART for at least one year were included. As previously described, median time on ART at the time of evaluation of virological treatment response was 3.6 (IQR 2.5-5.1) years. The median age at study inclusion was 38 (IQR 33-44) years and the median CD4 count was 379 (IQR 265-507) cells/µL. More than half (56.7%) were female and >90% were on a zidovudine or stavudine-based regimen at the time of evaluation. A total of 45 patients (3.0%) were detected with virological failure.
In a first step (CPS1), we removed CD4 decline from peak value, since it requires substantial calculation and complete on-treatment CD4 count data, but retained 'current CD4 count'. The AUROC of CPS1 was 0.78 (95% CI 0.70-0.85) in derivation and 0.69 (95% CI 0.62-0.75) in validation (Table 2). With the WHO 2013 guidelines arguing against routine hemoglobin monitoring, even for patients using zidovudine, hemoglobin measurement was additionally removed in CPS2. The validated AUROC decreased to 0.68 (95% CI 0.62-0.74). To obtain a tool devoid of any requirement of previous laboratory results (hence only relying on 'current' lab tests), comparison with baseline CD4 count was removed in CPS3. Only a slight decrease in AUROC was observed. This provided a system essentially relying on basic clinical information collected during routine patient assessment: clinical evaluation (WHO T-stage, PPE), treatment information (ART-experience and ART adherence) and CD4 response at the time of assessment. No laboratory monitoring was included in CPS4. However, this tool performed poorly (AUROC 0.59 in validation). The sensitivity, specificity, positive predictive value and percentage of individuals requiring a VL at various score cut-off values of the different CPSs are given in Table 3, applying the CPSs in an independent dataset.
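For readers who want to reproduce this kind of evaluation on their own data, the sketch below shows one way to compute an AUROC and the cut-off measures named above. It is an assumption-laden illustration: the function names are hypothetical, the eight scores and outcomes are toy values rather than study data, and the AUROC is computed with the pairwise (Mann-Whitney) definition, which may differ from the software routine used in the study. Confidence intervals are not covered.

```python
def auroc(scores, failed):
    """Probability that a random failure scores higher than a random non-failure.

    Ties count as one half (pairwise Mann-Whitney definition of the AUROC).
    """
    pos = [s for s, f in zip(scores, failed) if f]
    neg = [s for s, f in zip(scores, failed) if not f]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def metrics_at_cutoff(scores, failed, cutoff):
    """Diagnostic measures when a VL is requested for every score >= cutoff.

    Assumes each cell of the 2x2 table needed as a denominator is non-empty.
    """
    tp = sum(1 for s, f in zip(scores, failed) if s >= cutoff and f)
    fp = sum(1 for s, f in zip(scores, failed) if s >= cutoff and not f)
    fn = sum(1 for s, f in zip(scores, failed) if s < cutoff and f)
    tn = sum(1 for s, f in zip(scores, failed) if s < cutoff and not f)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "proportion_tested": (tp + fp) / len(scores),
    }

# TOY data for illustration only: CPS totals and virological failure status.
toy_scores = [0, 0, 1, 1, 2, 2, 3, 3]
toy_failed = [False, False, False, True, False, True, False, True]

print(auroc(toy_scores, toy_failed))
print(metrics_at_cutoff(toy_scores, toy_failed, cutoff=2))
```

Raising the cut-off trades sensitivity for a smaller proportion of patients needing a VL test, which is the trade-off that a table of cut-off values lets a program weigh.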

Discussion
We developed four simplified CPSs, of which three performed relatively well and have a number of advantages. Compared to the original CPS, no regular laboratory monitoring while on ART is required. Moreover, with a choice of CPS, programs can decide which is most appropriate and feasible to apply in their program settings. Performing hemoglobin measurement as part of an ART monitoring strategy can be cumbersome in many settings, comes with an additional cost and is not required in the latest WHO guidelines. Moreover, the added value in terms of diagnostic performance was limited, especially at lower cut-off values. CPS2 and CPS3 seem most useful and practical. CPS2 could be used when a baseline CD4 cell count is available, while CPS3 would apply when patients have initiated ART without CD4 guidance. With only a limited amount of key clinical information and minimal calculation required, these CPSs are also substantially simpler and possibly more adapted for use by non-physicians. Some resource-constrained countries predominantly use clinical monitoring to detect treatment failure and CD4 cell counts are not always regularly done while on ART, especially in remote areas. However, we found that, in line with previous studies, CPS4, relying on clinical information only, performed poorly [2]. Adding a CD4 cell count test 'on the spot' would substantially improve the performance of the algorithm without requiring routine CD4 monitoring. Current CD4 count was also found to be a strong predictor in a Ugandan scoring system [9]. Of interest, the same CD4 count cut-off (250 cells/µL) was identified in a recent study using data from the DART trial [21]. Pending the development of an affordable and widely available point-of-care VL test, use of these CPSs combined with targeted VL testing should be considered.
Table 1. Datasets used for the development and validation of the simplified prediction score to identify virological failure.
If CD4 count testing can be done 'on site' or with a point of care test [22], application of the CPS could be done within one day. If not, patients might need two visits to apply the CPS. However, this creates an additional burden for both patients and health care staff, and increases chances for delays and missed appointments.
Further study is required to determine at what time points during ART these scores should be applied, and in which different health care settings and patient populations. Moreover, these algorithms were designed for patients on ART for at least one year. One approach could consist of a routine VL early after ART initiation (e.g. six months after ART initiation) followed by application of the score at predefined intervals, combined with targeted VL testing. Whereas applying the score every six months would equate to routine CD4 count monitoring, the current CPSs would still have the advantage of not requiring interpretation of a trend of laboratory tests. While these simplifications would likely render the scores easier to apply and less error-prone compared to the original CPS, this is subject to further study.
A number of issues remain to be assessed before implementing these CPSs in routine care. Validation in other study populations and different settings should be conducted. Additional studies on the clinical utility and impact are warranted [23,24]. The study population in this analysis consisted of patients on ART for several years. Possibly, the optimal 'current CD4 cell count' cut-off might differ in patient populations with short or very long treatment duration. Moreover, most patients in our cohort started ART with relatively advanced HIV disease. It remains to be assessed whether the performance of the CPSs would be different in patients starting early ART, as would occur within a test-and-treat strategy.
Additional limitations include the fact that treatment failure was based on a single VL measurement, in line with the previous derivation and validation study. Since a substantial proportion might have an undetectable VL on repeat measurement, only a proportion of those with virological failure in our study would be true treatment failures. On the other hand, the main purpose of targeted VL testing is early detection of active viral replication, since these patients need enhanced adherence counseling followed by a repeat VL. We did not evaluate whether individual opportunistic infections should be included as predictors in the CPS, but used 'pooled' WHO clinical staging groups. While this might be a more 'crude' approach, it has the advantage of being simpler. Finally, the failure rate in our study was clearly lower than what has been observed in most other programs. Consequently, positive predictive values will be higher and negative predictive values lower in settings with higher failure rates.
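The dependence of predictive values on the failure rate follows from Bayes' theorem and can be checked numerically. In the sketch below, the function name is hypothetical and the sensitivity and specificity of 0.70 are assumed values for illustration only. The 3% failure rate is the one observed in this study, while 10% is an arbitrary higher rate chosen for comparison.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Positive and negative predictive value from Bayes' theorem."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

# ASSUMED test characteristics, for illustration only (not study estimates).
SENSITIVITY = 0.70
SPECIFICITY = 0.70

# 0.03 is the failure rate observed in this study; 0.10 is a hypothetical higher rate.
ppv_low, npv_low = predictive_values(SENSITIVITY, SPECIFICITY, prevalence=0.03)
ppv_high, npv_high = predictive_values(SENSITIVITY, SPECIFICITY, prevalence=0.10)

print(ppv_low, npv_low)
print(ppv_high, npv_high)
```

With sensitivity and specificity held fixed, the positive predictive value rises and the negative predictive value falls as the failure rate increases, consistent with the statement above.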
We acknowledge that the present algorithms are unlikely to be universally applicable and generalizable across a wide range of geographical regions, populations and health care settings. Derivation and validation of algorithms for local use (e.g. at the country or regional level) might be a reasonable alternative, and might also be valuable for a wider range of diagnostic tests or health care interventions. Targeted evaluation using evidence-based tools might be a rational way to optimize the use of often scarce resources available for health delivery in LMIC [2]. At the same time, such an approach also appreciates that patient populations are not homogeneous, but consist of different 'risk groups' with different needs and benefits of testing and other interventions. Especially as long as cheap point-of-care tests are lacking, such approaches merit further study.
In conclusion, we have developed three simplified CPSs for targeted VL testing that retained an acceptable diagnostic performance as long as 'spot' CD4 testing was included. They have the advantages of not requiring regular CD4 cell count or hemoglobin testing, or complete and reliable data collection while on ART, or the need to interpret trends in laboratory tests. They should, therefore, be less prone to error and more readily adopted in field conditions. Their use would enable more accurate targeted VL testing in contexts where it is too expensive to be done routinely. We suggest that they be tested and tailored to other health care settings.
Table 3. Diagnostic performance at different cut-offs of the clinical prediction scores to identify virological failure using an independent dataset.

Author Contributions
Conceived and designed the experiments: JvG. Performed the experiments: VP. Analyzed the data: JvG OK. Wrote the paper: JvG OK LL. Assistance in data interpretation: VP ST LL. Improvement of the intellectual concept of the paper: VP ST.