Diagnostic accuracy of cervical cancer screening and screening–triage strategies among women living with HIV-1 in Burkina Faso and South Africa: A cohort study

Background
Cervical cancer screening strategies using visual inspection or cytology may have suboptimal diagnostic accuracy for detection of precancer in women living with HIV (WLHIV). The optimal screen and screen–triage strategy, age to initiate, and frequency of screening for WLHIV remain unclear. This study evaluated the sensitivity, specificity, and positive predictive value of different cervical cancer screening strategies in WLHIV in Africa.

Methods and findings
WLHIV aged 25–50 years attending HIV treatment centres in Burkina Faso (BF) and South Africa (SA) from 5 December 2011 to 30 October 2012 were enrolled in a prospective evaluation study of visual inspection using acetic acid (VIA) or visual inspection using Lugol’s iodine (VILI), high-risk human papillomavirus DNA test (Hybrid Capture 2 [HC2] or careHPV), and cytology for histology-verified high-grade cervical intraepithelial neoplasia (CIN2+/CIN3+) at baseline and endline, a median 16 months later. Among 1,238 women (BF: 615; SA: 623), median age was 36 and 34 years (p < 0.001), 28.6% and 49.6% ever had prior cervical cancer screening (p < 0.001), and 69.9% and 64.2% were taking ART at enrolment (p = 0.045) in BF and SA, respectively. CIN2+ prevalence was 5.8% and 22.4% in BF and SA (p < 0.001), respectively. VIA had low sensitivity for CIN2+ (44.7%, 95% confidence interval [CI] 36.9%–52.7%) and CIN3+ (56.1%, 95% CI 43.3%–68.3%) in both countries, with specificity for ≤CIN1 of 78.7% (95% CI 76.0%–81.3%). HC2 had sensitivity of 88.8% (95% CI 82.9%–93.2%) for CIN2+ and 86.4% (95% CI 75.7%–93.6%) for CIN3+. Specificity for ≤CIN1 was 55.4% (95% CI 52.2%–58.6%), and screen positivity was 51.3%. Specificity was higher with a restricted genotype (HPV16/18/31/33/35/45/52/58) approach (73.5%, 95% CI 70.6%–76.2%), with lower screen positivity (33.7%), although there was lower sensitivity for CIN3+ (77.3%, 95% CI 65.3%–86.7%).
In BF, HC2 was more sensitive for CIN2+/CIN3+ compared to VIA/VILI (relative sensitivity for CIN2+ = 1.72, 95% CI 1.28–2.32; CIN3+: 1.18, 95% CI 0.94–1.49). Triage of HC2-positive women with VIA/VILI reduced the number of colposcopy referrals, with a loss in sensitivity for CIN2+ (58.1%) but not for CIN3+ (84.6%). In SA, cytology high-grade squamous intraepithelial lesion or greater (HSIL+) had the best combination of sensitivity (CIN2+: 70.1%, 95% CI 61.3%–77.9%; CIN3+: 80.8%, 95% CI 67.5%–90.4%) and specificity (81.6%, 95% CI 77.6%–85.1%). HC2 had similar sensitivity for CIN3+ (83.0%, 95% CI 70.2%–91.9%) but lower specificity compared to HSIL+ (42.7%, 95% CI 38.4%–47.1%; relative specificity = 0.57, 95% CI 0.52–0.63), resulting in almost twice as many referrals. Compared to HC2, triage of HC2-positive women with HSIL+ resulted in a 40% reduction in colposcopy referrals but was associated with some loss in sensitivity. CIN2+ incidence over a median 16 months was highest among VIA baseline screen-negative women (2.2%, 95% CI 1.3%–3.7%) and women who were baseline double-negative with HC2 and VIA (2.1%, 95% CI 1.3%–3.5%) and lowest among HC2 baseline screen-negative women (0.5%, 95% CI 0.1%–1.8%). Limitations of our study are that WLHIV included in the study may not reflect a contemporary cohort of WLHIV initiating ART in the universal ART era and that we did not evaluate HPV tests available in study settings today.

Conclusions
In this cohort study among WLHIV in Africa, a human papillomavirus (HPV) test targeting 14 high-risk (HR) types had higher sensitivity to detect CIN2+ compared to visual inspection but had low specificity, although a restricted genotype approach targeting 8 HR types decreased the number of unnecessary colposcopy referrals. Cytology HSIL+ had optimal performance for CIN2+/CIN3+ detection in SA. Triage of HPV-positive women with HSIL+ maintained high specificity but with some loss in sensitivity compared to HC2 alone.



Author summary
Why was this study done?
• Invasive cervical cancer is the second most common cancer among women in low- and middle-income countries and a leading cause of cancer-related death in women in sub-Saharan Africa.
• Women living with human immunodeficiency virus (WLHIV) have an increased risk of cervical cancer and precancer. The majority of WLHIV live in low- and middle-income countries, where access to cervical cancer screening and treatment of precancerous cervical lesions is limited.
• Screening approaches most commonly used in sub-Saharan Africa, including visual inspection of the cervix and cervical cytology, which checks for cervical cell abnormalities, have shown variable diagnostic accuracy to detect precancerous lesions. Molecular-based screening approaches, such as human papillomavirus (HPV) DNA testing, which screens for oncogenic HPV infection, have shown high sensitivity for cervical precancer but can result in over-referral to colposcopy, a procedure to determine eligibility for treatment.
• The optimal screening test, age to begin screening, and frequency of screening for WLHIV remain uncertain.
What did the researchers do and find?
• We evaluated several cervical cancer screening strategies in over 1,200 WLHIV in sub-Saharan Africa.
• We found that an HPV DNA test identified a greater number of women with precancer compared to visual inspection and cytology methods. However, there was a greater proportion of women without precancer who had a positive HPV DNA test, meaning a triage test using cytology or visual inspection was required to determine treatment eligibility.
• Simple user-applied modifications to the HPV-DNA-based test resulted in fewer women without precancer testing positive and fewer women needing triage.
• In settings with adequate infrastructure, cervical cytology was a useful triage test for HPV-positive women.
What do these findings mean?
• Our data contribute to the evidence on choice of screening strategies for detection of cervical precancer among WLHIV in low- and middle-income settings.

Introduction
In May 2018, the Director-General of the World Health Organization (WHO) announced a global call for action towards the elimination of invasive cervical cancer (ICC) as a public health problem, calling for more innovative technologies for detection of high-grade cervical intraepithelial neoplasia (CIN) grade 2 and higher (CIN2+) and better strategies to increase screening coverage and uptake [1]. The 2030 cervical cancer elimination targets include vaccinating 90% of eligible girls against human papillomavirus (HPV), screening 70% of eligible women for cervical cancer, and effectively treating 90% of those with a positive lesion [2].
Since the introduction of HPV vaccination, cervical cancer screening in high-income settings has shifted from the identification of cellular changes in cytology towards the molecular detection of high-risk HPV (HR-HPV) types as the form of primary screening, allowing for increased automation and standardisation of diagnostic procedures. Studies among regularly screened women in Europe have shown that HPV-based screening reduces the risk of ICC by up to 70% compared to cervical cytology, also allowing extension of screening intervals due to the higher negative predictive value (NPV) [3]. Approaches using HPV DNA tests can be easily adapted to resource-limited settings, allow for self-collected samples, and are less observer dependent than visual inspection methods, which have variable sensitivity and specificity among women living with HIV (WLHIV) [4][5][6]. However, HPV DNA tests can detect many transient infections, meaning that their specificity for high-grade CIN is low, especially in populations with high prevalence of HR-HPV [7]. This is problematic among WLHIV, who are more likely to have multiple HR-HPV co-infections with a broader range of HR-HPV genotypes [8] and have a higher risk of HR-HPV incidence and persistence compared to HIV-negative women [9]. However, there is increasing evidence that WLHIV on effective ART with sustained HIV viral suppression have lower prevalence of HR-HPV [10], which may impact on diagnostic accuracy of HPV-DNA-based testing in screening and screening-triage for CIN2+ detection. 
Current WHO guidelines recommend that cervical cancer screening should be started in sexually active girls and women as soon as they have tested positive for HIV and that, if the screening test is negative, a repeat test should be done within 3 years [11]. More recent evidence from cross-sectional and prospective studies is being considered in the revision of these recommendations, including optimal screen and screen-triage modalities, age to initiate screening, and screening intervals.
The majority of WLHIV live in low- and middle-income countries (LMICs), where cervical cancer incidence is high [12] but where cervical cancer screening coverage and linkage to care are low [13,14] and largely unknown for WLHIV [15], as infrastructure and personnel requirements for screening and management put high demands on the health systems. Furthermore, cervical cancer screening approaches more commonly utilised in LMICs, including visual inspection methods and cervical cytology, have variable and often suboptimal sensitivity and specificity for CIN2+ detection and can lack reproducibility both in women in the general population and among WLHIV. Screening and screening-triage strategies that can be feasibly implemented and that have high diagnostic accuracy to detect CIN2+/CIN3+ are needed. We previously evaluated the association of HIV-related factors with the natural history of HPV infection and CIN2+/CIN3+ in a prospective cohort of WLHIV followed over a median 16 months in Burkina Faso (BF) and South Africa (SA) [16,17]. The primary objective of the current study was to evaluate the diagnostic accuracy of 3 screening approaches (index tests): HR-HPV DNA tests (Hybrid Capture 2 [HC2] and careHPV), visual inspection (standard of care in BF), and cervical cytology (standard of care in SA) for the detection of prevalent histology-confirmed CIN2+/CIN3+ (reference method) in screening and in triage (Analysis 1). Secondary objectives were to evaluate the diagnostic accuracy of those test strategies by ART status and age (Analysis 2). To inform on frequency of screening, we evaluated CIN2+ incidence over a median 16 months among baseline screen-negative WLHIV (Analysis 3).

Study design and participants
We enrolled WLHIV recruited from the Hôpital de Jour (the HIV outpatient clinic of the Internal Medicine Department at Centre Hospitalier Universitaire Yalgado [CHU-Yalgado]), Ouagadougou, BF, and the Esselen Street Clinic (a primary health clinic) and Ward 21 of Hillbrow Community Health Centre (an ART initiation site) in Johannesburg, SA, from 5 December 2011 to 30 October 2012 in a prospective evaluation study of cervical cancer screening strategies, as previously described [17]. In brief, women were enrolled consecutively in the HARP (HPV in Africa Research Partnership) study if they were HIV-1 seropositive, aged 25-50 years, and resident in the study city. Women were excluded if they had a history of prior treatment for cervical cancer, had previous hysterectomy, or were pregnant or less than 8 weeks postpartum. Enrolment was stratified in a 2:1 ratio of ART users:ART-naïve WLHIV. Participants were followed up every 6 months for CD4+ T lymphocyte cell count monitoring and up to month 18, when procedures similar to baseline were repeated (median 16 months after baseline). All women provided written informed consent. Ethical approval was granted by the Ministry of Health in BF (no. 2012-12-089), the University of the Witwatersrand in SA (no. 110707), and the London School of Hygiene & Tropical Medicine (no. 7400).

Procedures
At baseline and endline (median 16 months later) visits, cervical samples were collected from all women using a Digene cervical sampler (Qiagen, Courtaboeuf, France) for HPV DNA testing and genotyping, a cytobrush for Papanicolaou smear cytology, and swabs from the ecto/endocervix and vagina to detect sexually transmitted infections (STIs). All participants had a visual inspection using acetic acid (VIA) and a visual inspection using Lugol's iodine (VILI) performed by trained nurses following the International Agency for Research on Cancer (IARC) guidelines [18]. All participants were referred for colposcopy at a median of 12 weeks (interquartile range [IQR] 8-15) following the baseline visit, performed by trained colposcopists applying the Swede score for clinical severity grading [19]. Colposcopists were aware of VIA/VILI, cytology, and HPV DNA test results. Systematic 4-quadrant cervical biopsy, including directed biopsy of any suspicious lesions, was performed for participants who had abnormalities detected by cytology (atypical squamous cells of undetermined significance, or greater [ASCUS+]) or VIA/VILI or during colposcopy, or who were HR-HPV DNA positive. A venous blood sample was collected to confirm HIV-1 serostatus if needed, and to obtain HIV-1 RNA plasma viral load and CD4+ T cell count.
HR-HPV testing using the qualitative Digene HC2, which detects 13 HR-HPV types (HPV16, -18, -31, -33, -35, -39, -45, -51, -52, -56, -58, -59, and -68), at baseline was performed centrally at the University of Montpellier (UM) virology laboratory by trained laboratory technicians in France as previously described [20]. The qualitative careHPV (Qiagen, Gaithersburg, MD), which detects 14 HR-HPV types (HPV16, -18, -31, -33, -35, -39, -45, -51, -52, -56, -58, -59, -66, and -68), was performed at endline by trained laboratory technicians at the Centre de Recherche Biomoléculaire Pietro Annigoni (CERBA), Ouagadougou, BF, and the National Health Laboratory Service (NHLS), Johannesburg, SA. A high level of agreement between HC2 and careHPV was reported in a nested study [21]. Quality assurance was performed by the UM virology laboratory. Results were displayed by the careHPV test controller without additional specification of the luminescent signal intensity. Genotyping with the INNO-LiPA HPV Genotyping Extra assay (Innogenetics, Courtaboeuf, France) was conducted at UM as previously described [20]. Conventional cytological reading was based on the Papanicolaou method and performed at the pathology department at CHU-Yalgado in Ouagadougou and the NHLS in Johannesburg according to the Bethesda classification system [22], with a quality assurance scheme organised by the UM virology laboratory for both countries. The NHLS lab was also subscribed to the Cytopathology Quality Assurance Program of the Royal College of Pathologists of Australasia Quality Assurance Program.
Cervical biopsies were processed at the local pathology laboratories and read using the 3-tier CIN classification system [23]. The reference standard of histology was classified as 'negative' (≤CIN1) or 'positive' (CIN2+) based on the highest reading across all findings from the 4-quadrant biopsies and endocervical curettage if collected. The histopathologist was blind to VIA/VILI, cytology, and HPV DNA test results but was aware of colposcopy diagnosis. All histological slides from women with a local diagnosis of CIN2+ and approximately 10% of slides from women with ≤CIN1 histological findings were reviewed by the HARP Endpoint Committee of 5 pathologists, for consensus classification, which showed high agreement [24].
Participants were recalled for CIN2+ management according to local guidelines at each site, if found to have CIN2+ lesions by histology at the baseline and/or endline visit. The management visit was scheduled at the earliest convenient date once the result was known. Due to demands on local health services in SA, this often meant that CIN2+ management was scheduled up to 14 months after diagnosis. CIN2+ status was therefore defined according to whether the participant had received CIN2+ management between enrolment and follow-up. CIN2+ prevalence at baseline was defined as the number of women with CIN2+ detected at baseline among all women enrolled in the HARP study. Cumulative CIN2+ prevalence at endline was defined as the number of women with CIN2+ detected at endline among all women attending the endline visit, irrespective of whether women were treated for prevalent CIN2+ between baseline and endline. CIN2+ incidence was defined as newly detected CIN2+ at the endline visit among women without CIN2+ at baseline.
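The three outcome definitions above can be illustrated with a minimal Python sketch. The record layout, field names, and toy cohort are hypothetical and are not HARP data. Restricting the incidence denominator to women who attended the endline visit is an assumption made here for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Participant:
    # Hypothetical record layout; field names are illustrative only.
    cin2plus_baseline: bool
    attended_endline: bool
    cin2plus_endline: Optional[bool] = None  # None if endline was not attended


def baseline_prevalence(cohort: List[Participant]) -> float:
    """CIN2+ detected at baseline among all women enrolled."""
    return sum(p.cin2plus_baseline for p in cohort) / len(cohort)


def cumulative_prevalence_endline(cohort: List[Participant]) -> float:
    """CIN2+ detected at endline among all women attending endline,
    irrespective of treatment between the two visits."""
    attended = [p for p in cohort if p.attended_endline]
    return sum(bool(p.cin2plus_endline) for p in attended) / len(attended)


def incidence(cohort: List[Participant]) -> float:
    """Newly detected CIN2+ at endline among women without CIN2+ at baseline.
    Assumption: only women who attended endline contribute to the denominator."""
    at_risk = [p for p in cohort
               if p.attended_endline and not p.cin2plus_baseline]
    return sum(bool(p.cin2plus_endline) for p in at_risk) / len(at_risk)


# Toy cohort for illustration only (not study data).
cohort = [
    Participant(True, True, True),
    Participant(True, True, False),
    Participant(False, True, True),
    Participant(False, True, False),
    Participant(False, True, False),
    Participant(False, False, None),
]

print(baseline_prevalence(cohort))
print(cumulative_prevalence_endline(cohort))
print(incidence(cohort))
```

In this toy cohort, 2 of 6 enrolled women have CIN2+ at baseline, 2 of 5 women attending endline have CIN2+ at endline, and 1 of 3 women at risk has newly detected CIN2+.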

Statistical analysis
In the analysis of diagnostic accuracy (Analysis 1), the index tests evaluated included VIA alone, VIA/VILI (co-testing when either test is positive among all women screened), cytology (using thresholds of low-grade squamous intraepithelial lesion or greater [LSIL+] and high-grade squamous intraepithelial lesion or greater [HSIL+]), and HR-HPV DNA (Digene HC2) for the detection of histology-confirmed CIN2+ and CIN3+ (reference method) at baseline. Sensitivity, specificity, positive predictive value (PPV), complement of NPV (1 − NPV), the number of referrals to colposcopy that would be generated for each CIN2+ or CIN3+ case identified (number needed to refer [NNR] = 1/PPV) [25], and the number of referrals per 1,000 women screened were reported for each of the index tests. For HC2, we considered test positivity at varying thresholds of the relative light unit (RLU) between ≥1 and ≥20, corresponding with increasing HPV viral load [26], to evaluate the threshold effect on test specificity to distinguish CIN2+/CIN3+. Among HR-HPV (HC2) positive women, we evaluated the diagnostic accuracy of triage approaches, including VIA, VIA/VILI, cytology (ASCUS+ and HSIL+), a combination of HPV16/18 genotyping and cytology (test positive if HPV16 or HPV18 positive, or cytology [ASCUS+ or HSIL+] when negative for both HPV16 and HPV18), and combination of HPV16/18 genotyping and VIA (test positive if HPV16 or HPV18 positive, or VIA abnormal when negative for both HPV16 and HPV18).
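As an illustration of how the reported measures relate to a 2×2 table of index test against histology, the following Python sketch computes point estimates only. The function name and the example counts are hypothetical and are not HARP data. Confidence intervals are omitted because the interval method is not described in this section.

```python
def accuracy_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Point estimates for one index test against the histology reference.

    tp / fn: reference-positive women (e.g., CIN2+) who screen positive / negative
    fp / tn: reference-negative women (<=CIN1) who screen positive / negative
    """
    n = tp + fp + fn + tn
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": ppv,
        "one_minus_npv": 1 - npv,          # risk of disease among screen-negatives
        "nnr": 1 / ppv,                    # referrals per case identified
        "referrals_per_1000": 1000 * (tp + fp) / n,
    }


# Hypothetical counts for illustration only (not study data).
example = accuracy_metrics(tp=40, fp=160, fn=10, tn=790)
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

With these illustrative counts, 200 of 1,000 women screen positive and 40 of them have disease, so PPV is 0.2 and NNR is 5 referrals per case identified.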
We evaluated the diagnostic accuracy of a restricted genotyping approach using results of the INNO-LiPA genotyping assay in the following combinations (positive for any genotype): HPV16; HPV16/18/45 (3 high-risk [HR] types), and HPV16/18/45/31/33/35/52/58 (8 HR types). Because of the low limit of detection of INNO-LiPA and to improve clinical relevancy, test positivity was defined as positivity for any of those genotypes among women who were also HC2 positive. We also evaluated the diagnostic accuracy of an HPV-based test targeting HR types previously reported to be most significantly associated with CIN3+ in the HARP cohort [27]. Relative sensitivity (RSen) and relative specificity (RSpec) and 95% confidence intervals (CIs) of screening tests compared to the standard of care in each country (VIA/VILI in BF and HSIL+ cytology in SA) were calculated [28].
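The restricted genotype rule described above (positive only if HC2 positive and positive for at least one genotype in the panel) can be sketched in Python as follows. The panel labels, function names, and example inputs are hypothetical. The relative sensitivity helper returns a point estimate only and assumes both tests are read on the same reference-positive women; the confidence interval method of [28] is not reproduced here.

```python
# Genotype panels as listed in the text; dictionary keys are illustrative labels.
PANELS = {
    "HPV16": frozenset({16}),
    "3 HR types": frozenset({16, 18, 45}),
    "8 HR types": frozenset({16, 18, 31, 33, 35, 45, 52, 58}),
}


def restricted_positive(hc2_positive: bool, genotypes: set, panel: frozenset) -> bool:
    """Positive only if HC2 is positive AND any detected genotype is in the panel."""
    return bool(hc2_positive) and not panel.isdisjoint(genotypes)


def relative_sensitivity(tp_index: int, tp_comparator: int) -> float:
    """Ratio of sensitivities for two tests read on the same reference-positive
    women; the shared denominator cancels, leaving a ratio of true positives."""
    return tp_index / tp_comparator


# Hypothetical inputs for illustration only (not study data).
print(restricted_positive(True, {16, 51}, PANELS["8 HR types"]))   # in panel
print(restricted_positive(True, {39, 51}, PANELS["8 HR types"]))   # outside panel
print(restricted_positive(False, {16}, PANELS["8 HR types"]))      # HC2 negative
print(relative_sensitivity(tp_index=45, tp_comparator=30))
```

Requiring HC2 positivity first mirrors the stated rationale that the genotyping assay has a low limit of detection, so a genotype detected in an HC2-negative woman does not count as screen positive.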
In order to observe the performance of screening strategies in an already screened population, we evaluated the diagnostic accuracy of endline VIA/VILI, HPV DNA (careHPV), and cytology for cumulative CIN2+ detection at endline, excluding women who were treated for prevalent CIN2+ at baseline.
To evaluate the association of HIV-related factors with diagnostic accuracy of screening strategies, diagnostic accuracy for CIN2+/CIN3+ was evaluated separately among women on prolonged ART (>2 years), women on short-duration ART (≤2 years), and ART-naïve women at baseline (Analysis 2). Diagnostic accuracy was also evaluated according to age at screening (Analysis 2). The cumulative incidence of CIN2+ at endline was calculated among women who screened negative for each of the screening strategies at baseline (Analysis 3). Analyses for diagnostic accuracy were conducted for discrete outcomes of CIN2+ and CIN3+. Data are presented separately for each country. Data were analysed using Stata (version 16) and according to the study statistical analysis plan (S1 Text). This article was reported according to the Standards for Reporting of Diagnostic Accuracy (STARD) statement (S1 Checklist) [29]. The dataset is available in the Mendeley Data online repository at doi: 10.17632/yd5ygw38vj.1.
At endline, 27 (84.4%) women in BF with CIN2+ detected at baseline who underwent management of their CIN2/3 lesions returned for the endline visit. In SA, 97 women with CIN2+ detected at baseline returned for the endline visit, and of these, 61 (63%) underwent management before the colposcopy/biopsy endline visit. Of the 36 participants who did not undergo treatment, 20 (55.6%) had CIN2/3 detected again at endline, and 16 (44.4%) had lower grade lesions (≤CIN1). Of the women who underwent management, the median time from colposcopy visit to management was similar in both countries (BF: 10.5 months, IQR 7.3-12.6; SA: 10.7 months, IQR 6.2-13.8).

Discussion
We evaluated the diagnostic accuracy of screen and screen-triage approaches for CIN2+/CIN3+ in a large prospective cohort of WLHIV from 2 African countries with different HIV epidemics, different burdens of HPV infection and cervical cancer, and differing approaches to screening for cervical cancer. This allows the findings to be extended to a range of low- and middle-income settings. We found that an HPV-DNA-based test had high sensitivity but low specificity for CIN2+/CIN3+, but that simple modifications (increasing the threshold for test positivity and using a restricted genotype approach) resulted in higher specificity and correspondingly fewer referrals to colposcopy. Triage of HPV-positive women with VIA/VILI in BF and cytology (HSIL+) in SA resulted in a further reduction in referrals, with minimal impact on sensitivity for CIN3+ but a loss in sensitivity for CIN2+.
HPV-based tests have high sensitivity for CIN2+/CIN3+ in both HIV-negative women and WLHIV, but specificity to distinguish CIN2+ is lower in WLHIV compared to HIV-negative women [30][31][32][33][34]. HPV-based tests targeting up to 14 HR types, including HC2, careHPV, and GeneXpert, have all shown high sensitivity but low specificity for CIN2+/CIN3+ in WLHIV [4][5][6][31,35,36], due to the high prevalence of HPV infection among these women. In a meta-analysis of 20 studies evaluating the association between HR-HPV prevalence and the specificity of HPV DNA testing (HC2) to distinguish CIN2+, HC2 specificity decreased by 8.4% (95% CI 8.02%-8.81%) for each 10% increase in HR-HPV prevalence [7]. In the HARP study, approximately half of the WLHIV with HR-HPV were infected with 2 or more HR types, and 19% were infected with 3 or more at baseline. Over 16 months, 35% of infections persisted, and 54% of women acquired a new HR infection [27]. An HPV test that can distinguish clinically relevant from transient HR-HPV infection is thus warranted. Improved specificity could be achieved with a modified approach to use of HPV DNA by increasing the threshold for test positivity, corresponding to higher HPV viral load, which is associated with persistent infection or infection further along the pathway to CIN2+ [26], and by utilising a restricted genotype approach to target a smaller number of genotypes that are most associated with cervical cancer [37]. We have shown in this study that increasing the threshold for test positivity and restricting the test to specific HR genotypes can increase the specificity of HC2 to distinguish CIN2+ by 20%.
These findings are consistent with a cross-sectional study evaluating the diagnostic accuracy of GeneXpert among WLHIV in Cape Town, SA, that reported an increase in specificity to distinguish CIN2+ from 60% using the manufacturer-defined threshold and targeting 14 HR-HPV types to 77% using a higher threshold to determine test positivity and restricting analysis to 8 HR-HPV types [31]. There is however some loss in sensitivity associated with this approach, and a balance will need to be achieved based on capacity to refer HR-HPV-positive women for colposcopy and treatment.
We also found that the specificity of HC2 varied according to ART status, with the highest specificity observed in women taking ART for more than 2 years, corresponding with lower HR-HPV and CIN2+ prevalence. These findings are consistent with those reported in cohorts of WLHIV undergoing screening in Nairobi, Kenya [5], and Johannesburg, SA [4]. In the future, all women newly diagnosed with HIV should start ART immediately [38], irrespective of CD4+ cell count. It is expected that women starting ART at the time of HIV diagnosis who experience a shorter duration of immunosuppression, or none, will have lower risk of HR-HPV persistence, CIN2/3 incidence, and cervical cancer compared to WLHIV who may have initiated ART according to older guidelines [10]. As a consequence, the specificity of HPV-DNA-based approaches may be higher in these women due to the lower prevalence of transient or non-clinically relevant HPV infections. An HPV-based strategy using a modified threshold, with or without a restricted genotype approach, could be a highly accurate and reproducible screening strategy in these women. However, there will remain a significant proportion of WLHIV who started ART under older guidelines and at lower CD4+ cell count, or women in settings where early access to ART may be a challenge, who remain at elevated risk. HPV-based test specificity remained low in these women in our study, irrespective of the threshold for test positivity or use of a restricted genotype approach. In the short term, it may be necessary to consider a risk stratification approach with alternative screening strategies for women with poorly controlled HIV, and if HPV-based tests are used for screening, this group may require a second test in triage or repeat testing over time due to the low specificity of a one-time HPV test among these women.
Alternative approaches to the use of HPV tests could include repeat HPV DNA testing over time, which may distinguish HR-HPV persistent infection associated with CIN2+ from transient infections. We found in this study that 74% of WLHIV with CIN2+ detected at endline had type-specific persistence from baseline, compared with 23% of women without CIN2+ at endline. While such an approach may result in fewer women being unnecessarily treated or referred to colposcopy, the limitation is the potential for loss to follow-up of screen-positive women compared to a one-time HPV DNA test. On the other hand, repeat testing over a shorter interval (e.g., 6 months) may be a feasible approach to integrate in routine HIV care, where WLHIV may be more frequently followed. Further data collection on the effectiveness and feasibility of such an approach is warranted.
VIA is commonly used in LMICs, but we have shown it has low sensitivity for CIN2 lesions in WLHIV, consistent with other studies in Africa [4,5,36], but has higher sensitivity for CIN3+ in BF only. We also evaluated the diagnostic accuracy of VIA/VILI in HR-HPV-positive women, but this approach resulted in similarly low sensitivity as for VIA alone, although the addition of VILI to VIA (i.e., either test positive) improved sensitivity by approximately 15% for CIN2+/CIN3+. The combination of VIA/VILI also had better accuracy for CIN3+ compared to CIN2+ in BF, but not in SA. This may be because VIA/VILI is more frequently used as a screening test in BF compared to SA, although study nurses and midwives were trained on VIA/VILI procedures in a similar way in both settings prior to participant recruitment in this study. The difference might also be explained by the higher prevalence of other STIs and cervical inflammation among women in SA compared to women in BF [17], which could impact the visualisation of the cervix. Visual inspection methods are highly variable due to their subjective nature, and optimal performance is dependent on observer training and experience and the availability of quality assurance, including review of digital cervicography to ensure standardisation of VIA/VILI [4,6,36,39,40], which may be challenging to implement at scale [41]. Computer-aided approaches using automated visual evaluation (AVE) could improve the accuracy and reproducibility of visual inspection methods. AVE applied to cervigrams has been evaluated in HIV-negative women in Costa Rica and shown to have higher accuracy (area under the curve [AUC] = 0.91, 95% CI 0.89-0.93) compared to conventional cytology (AUC = 0.71, 95% CI 0.65-0.77) [42] but has not yet been studied in WLHIV, although studies are ongoing.
Cytology was the strategy with the best combination of sensitivity and specificity in SA, but only when the threshold for test positivity was increased to HSIL+. Similar high accuracy of cytology for CIN2+/CIN3+ has been reported in other studies in SA, which has an established cytology-based screening programme with quality control measures routinely implemented [4]. Studies conducted in the sub-Saharan African region have reported variable sensitivity and specificity of cytology for CIN2+/CIN3+ in WLHIV [5,40][43][44][45]; however, in countries where established cytology services exist, strengthening cytology services should ensure high accuracy. Sensitivity of HC2 for CIN2+/CIN3+ was higher than that achieved by cytology HSIL+ in SA, but HC2 detected fewer CIN2+ and CIN3+ cases in SA compared to BF, and the reasons behind this finding are unclear. Based on genotyping using INNO-LiPA, 11% (8/76) of women with prevalent CIN2 and 8% (4/53) of women with CIN3 were HR-HPV negative; 5% of CIN2+ cases were negative for any HPV DNA. It is not uncommon to find women with CIN2+ being HR-HPV negative. A systematic review comparing the HPV type distribution in ICC biopsy and cervical cell specimens of 770 WLHIV from 21 studies in 12 African countries reported that prevalence of any HPV was 89% in biopsy samples and 95% in cervical samples [37]. Similarly, in a review of 10,575 biopsies of ICC, 85% were positive for any HPV [46], and in a sub-analysis of a large cervical cancer screening study (ATHENA), among 497 cases of CIN2+, 55 (11%) tested negative by Cobas HPV test and 12 (2.4%) were negative by all HPV tests (Cobas, Amplicor, and Linear Array) [47]. Our finding of 5% of CIN2+ cases being negative for any HPV is not dissimilar to the findings of these large international studies. It is unlikely that CIN2+ cases were misclassified, as all CIN2+ cases were verified by consensus among 5 independent pathologists [24], although the risk of misclassification cannot be eliminated.
This study has several limitations. The study maximised the chances of obtaining histological results by basing the biopsy decision on positivity of any of 3 screening tests (HC2, cytology ASCUS+, or VIA/VILI abnormal) or colposcopy (abnormal), to which all participants were subjected (96% of women underwent all tests). This approach and the threshold to trigger biopsy for histology are in excess of usual recommendations to minimise ascertainment bias. The number of post-biopsy adverse events was low; 6 (1.0%) women in BF and 4 (0.6%) in SA reported post-biopsy bleeding and/or abdominal pain. Women negative by all tests were considered to be at extremely low risk of CIN2+ since in particular HPV DNA and cytology have very high NPV for CIN2 diagnosis [48], and it is therefore unlikely that many cases would have been missed. In addition, the study built a strong review of histological results by consensus of 5 pathologists, which included all histological slides from women with a local diagnosis of CIN2+ and approximately 10% of slides from women with ≤CIN1, which showed high agreement [24]. WLHIV included in this study were recruited from 2011 to 2012, at a time when they may have started ART according to older guidelines. As such, the study population may not be representative of contemporary or future cohorts of WLHIV in the universal ART era. However, our analysis of diagnostic accuracy according to ART status and duration attempted to correct for this period effect by restricting analysis to women with controlled HIV, which corresponds to the approach recently used in a contemporary cohort of WLHIV enrolled in 2013-2015 in the US [49]. We did not evaluate the HPV tests in the local study settings at baseline, and HC2 was conducted in France due to challenges in acquiring the careHPV assay in time for study initiation. However, careHPV testing was conducted locally at study sites at endline and showed diagnostic accuracy equivalent to that of HC2 in a previously published head-to-head comparison [21].

Conclusion
HPV-based tests may be sufficient as a screening strategy in WLHIV if a restricted genotype approach is utilised and a higher threshold for test positivity is established. Molecular-based tests such as HPV tests have the added advantage of being automatable and less prone to training and interpretational errors than morphological tests such as VIA/VILI and cytology and can be performed using the same clinician-collected or self-collected sample, thereby simplifying sample collection, which may facilitate cervical cancer screening without the need for women to attend clinical services. Cytology remains optimal in settings with an existing cytology-based programme, such as SA. ART users with low or unknown nadir CD4+ cell count and ART-naïve women should be screened frequently, although the optimal screening intervals remain unclear. Although cervical cancer screening is not widely implemented in LMICs, integration of cervical cancer screening within HIV treatment services would ensure that women at high risk of developing cervical cancer precursor lesions are screened, and would lead to continuity in primary prevention, favouring early detection and management of HPV-related cervical lesions with minimal loss to follow-up [50]. More longitudinal data are needed on the effectiveness and cost-effectiveness of different cervical cancer screening strategies in cervical cancer reduction in WLHIV.
Supporting information
S1 Checklist. STARD checklist. (DOCX)
S1 Table. HPV test positivity at baseline and endline and type-specific HR-HPV infection according to CIN status at baseline and endline among 933 WLHIV followed over a median 16 months in Burkina Faso and South Africa. (DOCX)
S9 Table. CIN2+ incidence at endline among baseline screen-negative women, by ART status (countries combined). (DOCX)
S10