A systematic review of alternative surveillance approaches for lymphatic filariasis in low prevalence settings: Implications for post-validation settings.

Due to the success of the Global Programme to Eliminate Lymphatic Filariasis (GPELF), many countries have either eliminated the disease as a public health problem or are scheduled to achieve this elimination status in the coming years. The World Health Organization (WHO) recommends that the Transmission Assessment Survey (TAS) is used routinely for post-mass drug administration (MDA) surveillance, but it is considered to lack sensitivity in low prevalence settings and not to be suitable for post-validation surveillance. Currently there is limited evidence to support programme managers on the design of appropriate alternative strategies to TAS that can be used for post-validation surveillance, as recommended by the WHO. We searched for human and mosquito LF surveillance studies conducted between January 2000 and December 2018 in countries which had either completed MDA or had been validated as having eliminated LF. Article screening and selection were independently conducted. 44 papers met the eligibility criteria, summarising evidence from 22 countries and comprising 83 methodologically distinct surveillance studies. No standardised approach was reported. The most common study type was community-based human testing (n = 42, 47.2%), followed by mosquito xenomonitoring (n = 23, 25.8%) and alternative (non-TAS) forms of school-based human testing (n = 19, 21.3%). Most studies were cross-sectional (n = 61, 73.5%) and used non-random sampling methods. 11 different human diagnostic tests were described. Results suggest that the sensitivity of LF surveillance can be increased by incorporating newer human diagnostic tests (including antibody tests), and that mosquito xenomonitoring may help to identify and target areas of active transmission. Alternative sampling methods, including the addition of adults to routine surveillance methods and consideration of community-based sampling, could also increase sensitivity.
The evidence base to support post-validation surveillance remains limited. Further research is needed on the diagnostic performance and cost-effectiveness of new diagnostic tests and methodologies to guide policy decisions and must be conducted in a range of countries. Evidence on how to integrate surveillance within other routine healthcare processes is also important to support the ongoing sustainability of LF surveillance.


Introduction
Lymphatic filariasis (LF) is a mosquito-borne parasitic infection which is caused by three species of filarial worms: Wuchereria bancrofti, Brugia malayi and Brugia timori [1,2]. It can damage the human lymphatic system, resulting in disabling complications including lymphoedema and hydrocele [1]. An estimated 886 million people live in areas at risk of LF infection and 36 million people are currently suffering from LF-related complications [2].
The Global Programme to Eliminate LF (GPELF) was established in 2000 with the intention of eliminating LF as a public health problem [3]. This has involved actions to interrupt transmission, through the systematic delivery of mass drug administration (MDA) at a population level, and to ensure that cases of morbidity linked to LF receive appropriate treatment [4].
Since 2010, demonstrating interruption of transmission has required three successful Transmission Assessment Surveys (TAS). These are school-based surveys which use rapid antigen tests (e.g. BinaxNOW) to sample a population of 6-7-year-old children at least 6 months after the final MDA [4,5]. Successful delivery of these TASs allows a country to be validated as having eliminated LF as a public health problem.
By the end of 2018, 14 countries had been validated as having eliminated LF, with a further 59 requiring ongoing interventions and surveillance [2]. In the coming decade, many of these countries are expected to be validated as having achieved elimination status. This work is supported by the continued funding commitment from international donors and new drug regimens such as triple therapy which could be scaled up in challenging areas, including India which has the largest burden of disease [1,6].
Following validation of elimination of LF as a public health problem, the WHO recommends that countries continue surveillance for LF to detect any possible recrudescence of infection, but there are no clear recommendations on specific surveillance methods and thresholds to be used [4,6]. It is acknowledged that the TAS methodology is resource-intensive and may also lack sensitivity in low-prevalence settings [5,7]. Consequently, there is increasing interest in the appropriateness and effectiveness of alternative methods of LF surveillance, and whether these can be integrated within health systems in post-validation settings.
This review focuses on alternative (non-TAS) LF surveillance studies conducted in low-prevalence settings since 2000, including both human and mosquito studies. This cut-off represents the establishment of GPELF, the introduction of a more standardised approach to LF surveillance and the emergence of newer diagnostic tests. The review aims to describe these studies in relation to factors including diagnostic tests, sampling methods and reported results, and to compare results with concurrent TAS outcomes where possible, in order to make recommendations to programme managers and highlight areas requiring further research.

Protocol and registration
This review was conducted and reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement guidelines (S1 File).

Search strategy
The following databases were searched for papers published from 2000 to November 2018: PubMed, Scopus and the Cochrane Database of Systematic Reviews. A combination of MeSH terms and text words was used to describe concepts relating to both LF and surveillance (S2 File). Any additional papers found to be relevant during this process were included.

Inclusion criteria
Studies were included in the systematic review if they (1) were a primary research study investigating methods of population-based LF surveillance other than routine TAS surveys; (2) included surveillance methods pertaining to humans (reservoir) and/or mosquitoes (vector); and (3) were conducted in a low prevalence setting, either post-MDA or post-validation. The review was limited to English-language publications with full-text availability conducted after 2000, following the establishment of the GPELF. Diagnostic test studies were not included if their design did not include population-level sampling.

Study selection and data extraction
A two-stage process was followed for data selection. Firstly, titles and abstracts of all eligible studies were independently reviewed (co-authors NR and XBR). Any article deemed 'potentially' relevant then underwent independent full-text review (NR and XBR). Discrepant ratings for any papers at stage two were discussed until consensus was reached. A standardised data extraction form was developed, piloted and refined. Where papers reported on more than one study design, these were extracted separately. NR extracted from all the papers and XBR extracted from a sample of 10% of the total. No significant discrepancies were identified during this process.
Extraction focused on the core themes identified during scoping work: (1) location (WHO Region and country, predominant mosquito type); (2) programme context (number of MDA rounds, date of last MDA and elimination status); (3) study design; (4) sampling strategy (including sample size and sampling methods); (5) diagnostic tests used; (6) outcomes of surveillance activity, including comparison with TAS results where applicable; and (7) integration of surveillance with other disease programmes.

Risk of bias assessment
Risk of bias was assessed using a modified version of the Crowe and GATE validated appraisal tools. Scores of 0-2 were assigned for all studies based on study design (not stated/cross-sectional/longitudinal). Human sampling studies were further assessed in relation to sample size terciles (0-760/761-2,464/>2,464), method of sampling participants (not stated/non-random/random) and study population (not stated/children or adults/children and adults). Mosquito sampling studies were also assigned scores according to sample size terciles (0-4,679/4,680-10,871/>10,871), catch-site sampling (not stated/non-random/random) and method of analysis (not stated/dissection/PCR analysis). It was decided not to include location sampling in the assessment since it may be preferable to use non-random methods in some scenarios (e.g. conducting surveillance activities in response to a suspected hotspot). Total risk of bias scores (marked out of 8) were calculated for each study and are presented in Tables 2 and 4. A full breakdown of scores for each study is listed in S1 Table.
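The scoring scheme described above can be illustrated with a minimal Python sketch. It assumes that each component is scored 0, 1 or 2 in the order in which the categories are listed in the text, and that the tercile bounds are inclusive upper limits. The function and constant names are hypothetical and are not taken from the Crowe or GATE tools.

```python
from typing import Sequence

# Ordered categories. ASSUMPTION: the position in each tuple is the score (0, 1, 2).
STUDY_DESIGN = ("not stated", "cross-sectional", "longitudinal")
HUMAN_SAMPLING = ("not stated", "non-random", "random")
HUMAN_POPULATION = ("not stated", "children or adults", "children and adults")
MOSQUITO_CATCH_SITE = ("not stated", "non-random", "random")
MOSQUITO_ANALYSIS = ("not stated", "dissection", "PCR analysis")

# Upper bounds of the first and second sample size terciles, as reported in the text.
HUMAN_TERCILE_BOUNDS = (760, 2464)
MOSQUITO_TERCILE_BOUNDS = (4679, 10871)


def category_score(value: str, ordered_levels: Sequence[str]) -> int:
    """Return 0, 1 or 2 according to the position of the category."""
    return list(ordered_levels).index(value)


def tercile_score(sample_size: int, bounds: Sequence[int]) -> int:
    """Return 0, 1 or 2 for the lowest, middle or highest tercile."""
    lower, upper = bounds
    if sample_size <= lower:
        return 0
    if sample_size <= upper:
        return 1
    return 2


def human_study_score(design: str, sample_size: int, sampling: str, population: str) -> int:
    """Total score (0-8) for a human surveillance study."""
    return (
        category_score(design, STUDY_DESIGN)
        + tercile_score(sample_size, HUMAN_TERCILE_BOUNDS)
        + category_score(sampling, HUMAN_SAMPLING)
        + category_score(population, HUMAN_POPULATION)
    )


def mosquito_study_score(design: str, sample_size: int, catch_site: str, analysis: str) -> int:
    """Total score (0-8) for a mosquito surveillance study."""
    return (
        category_score(design, STUDY_DESIGN)
        + tercile_score(sample_size, MOSQUITO_TERCILE_BOUNDS)
        + category_score(catch_site, MOSQUITO_CATCH_SITE)
        + category_score(analysis, MOSQUITO_ANALYSIS)
    )


# Hypothetical study: cross-sectional, 1,472 participants, non-random sampling,
# children and adults. Under the assumed mapping this scores 1 + 1 + 1 + 2.
print(human_study_score("cross-sectional", 1472, "non-random", "children and adults"))  # 5
```

Under this assumed mapping a higher total corresponds to a more robust design. The per-study scores actually assigned in the review are those listed in S1 Table.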

Data synthesis and analysis
Publication details, programme context and study design are presented for all studies combined. This is followed by data on sampling strategy, diagnostic test usage and outcomes, presented separately for human and mosquito surveillance studies. The impact of age and gender on diagnostic test performance in humans is explored. Analysis then included: (1) comparison between human and mosquito surveillance studies; (2) comparison with TAS results, where applicable; and (3) evidence of integration of surveillance methods within health systems. The analysis aims to determine factors which can increase the sensitivity of surveillance (defined as the proportion of true positive cases identified by a diagnostic test) in low prevalence settings.
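The definition of sensitivity used here can be written as follows. The symbols TP (infected individuals correctly identified by the test) and FN (infected individuals missed by the test) are notation introduced for illustration and are not used in the source studies.

```latex
\[
\text{Sensitivity} = \frac{TP}{TP + FN}
\]
```

A test or sampling strategy with higher sensitivity therefore misses fewer of the infections that remain in a low prevalence population.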

Selected studies
Fig 1 highlights the PRISMA steps of identification, screening, eligibility and inclusion of papers. A total of 1,378 papers were identified from the initial search, once duplicates had been excluded. Of these, 71 were considered potentially relevant following title/abstract screening by two independent reviewers. 57 of these were labelled potentially relevant by at least one reviewer following full-text screening. When discrepant results were reviewed, this total reduced to 40 papers, which then proceeded to data extraction. An additional four papers were identified during the peer review process. The 44 papers which met eligibility criteria comprised 83 methodologically distinct study designs (Table 1). These studies are henceforth considered separately, except for one paper which pooled results of school and community surveys.
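The screening flow described above can be tallied in a short Python sketch. The counts are those reported in the text; the variable names are illustrative only.

```python
# Paper counts at each PRISMA step, as reported in the text.
identified_after_deduplication = 1378
potentially_relevant_after_title_abstract = 71
relevant_to_at_least_one_reviewer_after_full_text = 57
agreed_after_discrepancy_review = 40
added_during_peer_review = 4

# Papers excluded at title/abstract screening.
excluded_at_title_abstract = (
    identified_after_deduplication - potentially_relevant_after_title_abstract
)

# Papers excluded between title/abstract screening and the consensus decision.
excluded_at_full_text = (
    potentially_relevant_after_title_abstract - agreed_after_discrepancy_review
)

# Final number of papers included in the review.
included_papers = agreed_after_discrepancy_review + added_during_peer_review

print(excluded_at_title_abstract, excluded_at_full_text, included_papers)
```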
A significant degree of heterogeneity was identified in the included studies. This included variation in study design, baseline endemicity, population sampled, use of diagnostic tests and reporting metrics. It was agreed that this variation precluded formal meta-analysis and instead required a narrative review structured according to the core themes identified.
Publication details. 26

Human surveillance studies
Table 2 summarises the characteristics of the 35 papers which reported data on human surveillance for lymphatic filariasis, comprising 60 distinct studies. Full results from these studies can be found in S2 Table.
Sample size. The median sample size was 1,472 (range = 40 to 35,582; interquartile range = 596-3,207). The majority of studies (n = 36; 60.0%) included both children and adults in their study design. 15 studies (25.0%) focused on children only and five (8.3%) on adults only. In total, the studies reported data on 208,568 participants.
Sampling methods. Where stated (n = 42), the most common approach to selecting a sampling location involved non-random methods, such as purposive or convenience sampling (n = 30, 71.4%). In most cases surveillance was conducted in response to identification of a hotspot of infection. Other studies used random sampling methods (n = 7, 17.5%), while four described national surveillance studies [10,11,31]. Participants were then sampled using either non-random methods (n = 34, 69.3%) or random methods (n = 15, 30.6%).
Diagnostic tests. Included studies described results using 12 different diagnostic tests. 58 studies involved blood samples, of which the majority were finger prick samples. The most common tests were microscopy for microfilaraemia (MF) (n = 38; 63.3%), BinaxNOW (n = 36; 60.0%), Bm14 Ab (n = 20; 33.3%), Og4C3 Ag (n = 17; 28.3%), Wb123 Ab (n = 9; 15.0%) and Wb PCR (n = 8; 13.3%). Table 3 compares results where the same diagnostic test was
Table 3 shows that antibody tests produce a higher proportion of positive results. Bm14 Ab and Wb123 Ab values are, on average, 5.1 and 6.7 times higher respectively than the corresponding BinaxNOW values, based on the median value of this ratio across the selected studies. Og4C3 Ag values are similar to BinaxNOW values in studies where both are used (median ratio = 0.95, range 0.2-1.6).
Impact of age. Age-specific prevalence was extracted for twelve different LF diagnostic tests from studies which reported data allowing 10-year age bands to be calculated (Fig 3). A similar pattern is seen for each test, with rates generally increasing through childhood and adolescence before stabilising during adulthood and occasionally falling in older age.
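The median ratio comparison reported under Diagnostic tests can be sketched as follows. The paired prevalences below are hypothetical and are not the values in Table 3. The decision to skip pairs with a BinaxNOW prevalence of zero, where the ratio is undefined, is also an assumption rather than a documented step of the review.

```python
from statistics import median

# HYPOTHETICAL paired prevalences (%) from studies that used both an antibody
# test and BinaxNOW in the same population. Not data from the review.
paired_results = [
    {"study": "A", "antibody": 10.0, "binaxnow": 2.0},
    {"study": "B", "antibody": 6.0, "binaxnow": 1.5},
    {"study": "C", "antibody": 9.0, "binaxnow": 1.0},
    {"study": "D", "antibody": 4.0, "binaxnow": 0.0},  # ratio undefined, skipped
]


def median_ratio(results, numerator, denominator):
    """Median of the per-study ratio numerator/denominator.

    ASSUMPTION: studies with a zero denominator are excluded.
    """
    ratios = [
        row[numerator] / row[denominator]
        for row in results
        if row[denominator] > 0
    ]
    if not ratios:
        raise ValueError("no study has a non-zero denominator")
    return median(ratios)


# Per-study ratios are 5.0, 4.0 and 9.0, so the median is 5.0.
print(median_ratio(paired_results, "antibody", "binaxnow"))  # 5.0
```

Using the median of per-study ratios, rather than the ratio of pooled prevalences, limits the influence of any single study with an unusually high or low value.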
Impact of gender. Reported prevalence for LF tests is also generally higher among men than among women (Fig 4).

Mosquito surveillance studies
Table 4 summarises the characteristics of the 23 papers which reported data on mosquito surveillance for LF. Full results from these studies can be found in S3 Table.
Sample size. The median number of mosquitoes collected was 7,860 per study (range 115-69,680, interquartile range 4,383-18,865).
Sampling methods. Similar to human surveillance studies, location sampling typically used non-random methods, following identification of a hotspot area by other methods. The majority of studies then described various methods for taking a random sample of households from which to sample mosquitoes, either indoors or outdoors. The most common mosquito sampling method was the gravid trap (n = 9; 39.1%), followed by various baited traps (n = 6; 26.1%), human landing collection (n = 5; 21.7%) and pyrethrum space spray catches (n = 4; 17.4%). The variation was partly due to the different species of mosquito being sampled.

PLOS NEGLECTED TROPICAL DISEASES
Alternative surveillance approaches for lymphatic filariasis in low prevalence settings

Table 5 summarises studies which performed both human testing and xenomonitoring in the same geographical area. Overall, there was great variability in survey methods and results, which limited comparisons. Interpretation is also limited by the fact that there are currently no recommended species-specific Mosquito Infectivity Rate (MIR) thresholds for LF [8,53]. A number of studies reported similar results between human testing and xenomonitoring. For example, Rao et al. (2018) [38] showed ICT rates of 3% and an MIR of 3%, but a similar pattern was not demonstrated in other Sri Lankan studies. There were also examples where human testing did not detect significant transmission but xenomonitoring did. For example, the study by Ramaiah et al. reported a mosquito infection rate of 4.7% when a community survey performed concurrently found no evidence of human infection on ICT testing.

Comparison with TAS results
18 studies reported alternative surveillance methods which were performed concurrently with, or subsequent to, a TAS which was passed successfully. The comparative results are illustrated in Table 6, which shows that alternative surveillance methods can identify evidence of ongoing transmission in areas which passed TAS. For example, Sheel et al. report LF prevalence (using Filarial Test Strips) of 6.2% in a community survey in an area which had recently passed TAS [14]. In American Samoa, Lau et al. (2014) found levels of Og4C3 Ag to be 3.2% and Wb123 Ab to be 8.1% in an area which had recently passed TAS. Xenomonitoring surveys also appear to have utility in identifying hotspots, as in the case of Rao et al. (2018), who detected an MIR of 5.2% in an area which had recently passed TAS [38].

Integration of surveillance with other disease programmes
The WHO recommends integrating post-MDA surveillance strategies with other ongoing surveillance activities [4]. Only three papers reported on efforts to integrate LF surveillance with other activities. A study from American Samoa tested stored blood samples from a leptospirosis survey for LF [10]. Two studies from Togo integrated LF testing (using either MF or Og4C3 Ag) within routine malaria investigations, either at the point of the diagnostic test being taken in the healthcare facility or when the blood film was being analysed in the laboratory [40,42].

Discussion
This review provides a timely collation of important information on alternative surveillance strategies for low prevalence and/or post-validation settings that will be useful to national programmes over the next decade as they seek to reduce LF incidence and meet the challenges of the NTD Roadmap 2030 [54]. However, the significant heterogeneity found in the study designs, population sampled, use of diagnostic tests and reporting metrics highlights the need for more systematic methods and new WHO guidelines to be developed to supplement TAS.
This review has identified that the sensitivity of LF surveillance in selected low prevalence populations can be increased by changes to the diagnostic test and/or study population. TAS is an important programmatic tool to guide decisions on when to stop MDA, but several studies report that it lacks sensitivity when used in low prevalence settings, such as a post-validation context [13,17,23,38], and may not accurately describe the spatial distribution of LF at community level [14]. This is important because evidence from countries that have recently eliminated LF indicates an increased risk of disease recrudescence, with ongoing hotspots of infection documented recently in both American Samoa and Sri Lanka [5,12,38]. The lag time between infection with LF and onset of symptoms may be 10 years or more, demonstrating the critical importance of maintaining surveillance programmes following elimination [41,51].

Alternative diagnostic tests
The studies included in this review indicate that there may be benefit in moving from the conventional rapid antigen tests to antibody tests, as they increase the proportion of positive results and, hence, the likelihood that residual hotspots will be detected. However, antibody tests are a measure of the host response to infection, which can persist for some time after all antigenic material from the original infection has been eliminated. Antibody tests are therefore associated with an increased false-positive rate and the detection of more historical cases, so there would be financial and logistical implications to switching to widespread antibody testing [13].
Antibody tests could be added to TAS without significant changes to study design [13]. Reported results suggest that testing Wb123 antibody (Ab) may have particular utility since it is thought to both become positive relatively soon after infection and decay faster following clearance, compared to Bm14 Ab [27,55]. It also has been found to be significantly associated with molecular xenomonitoring results, suggesting it could act as an indicator of ongoing transmission [13,55]. Urine ELISA may have greater acceptability than blood testing but requires further validation in LF-endemic regions [16,35]. However, the current increased costs of antibody and ELISA tests may limit their widespread uptake and further research is needed to characterise the spatial distribution of antibody signals [13].
All methods of human surveillance are affected by the persistence of the marker (antibody or antigen) in circulation. This is of variable duration for different test types, meaning that their results are not directly comparable. It also means that results are not truly indicative of current infectivity and will therefore include cases of historical disease. By contrast, mosquito surveillance gives a snapshot indication of current infection, and could serve as a useful adjunct to human surveillance methods [27,47]. However, mosquito surveillance requires entomology and laboratory capacity, which are both costly and time-consuming, meaning that it is typically only used in very defined areas, rather than for population-wide surveillance [27,45].

Alternative approaches to sampling
Studies found that LF tests typically show a higher prevalence of infection in adults than in children [14,31], and it is thought that adults (particularly adult men) may represent the majority of the reservoir of infection for LF [37]. As prevalence reduces, it may therefore be appropriate to target surveillance at these high-risk populations. Methods that have been suggested include adopting a 'test and treat' approach for adult males, which could focus on settings in which they may be more likely to congregate, such as marketplaces [37]. Post-validation surveillance in Togo found positive cases in low-risk areas, highlighting the importance of developing surveillance systems with nationwide coverage [40,42]. Areas with high levels of migration from endemic countries (e.g. border areas) may also require additional monitoring [24,56]. Other recommended sampling methods include community-based methods targeting adults and children, school-based surveys with a wider age range and snowball sampling of positive cases [14].

Future research needs
In order to support countries to develop appropriate surveillance in low prevalence or post-validation settings, further research will be required to inform choices regarding the selection of diagnostic tests and appropriate sampling strategies. This will include work to determine the diagnostic performance and cost-effectiveness of novel tests in a range of different epidemiological settings and the identification of suitable threshold values for new LF diagnostic tests in humans [13]. Further research is also required to determine appropriate sample size and infection cut-off thresholds for surveillance in different mosquito species [18,26,44,52].
There is a need to better understand the spatial and temporal dynamics of LF hotspots and their drivers, which will require more longitudinal studies to help inform future control and surveillance activities [12,23]. Emerging evidence suggests that LF hotspots may be highly focal, increasing the likelihood that cluster-based methods will lack sensitivity to detect them [10,12,23,57]. The risk of recrudescence of infection will depend on a range of factors including population density, baseline endemicity, uptake of MDA and concurrent vector control interventions. It may be appropriate to stratify the intensity of population-level surveillance based on assessment of these factors [58]. This must be supported by the development of data systems capable of continuously collecting, analysing and interpreting data in order to rapidly inform service planning and policy [6].
Further, there is a particular need to increase the evidence base in the African and South Asian Regions, which currently have the majority of ongoing transmission [1]. The evidence base supporting integration of surveillance activities with other health system processes must also be strengthened. Examples may include blood donation systems, surveillance for other co-endemic NTDs (e.g. onchocerciasis) or malaria and routine household surveys [24,40,41,48]. Finally, post-validation surveillance programmes will require clear guidance on how to respond to the identification of new cases. Such interventions may include watchful waiting, vector control, resumption of MDA, treatment of cases only, or a combination of methods [36].

Limitations
It was not possible to conduct a meta-analysis of surveillance results. This was largely due to the variation in study methods, but also to the variation in the infectivity of different mosquito vectors and the influence of environmental factors that are difficult to control for.
Regarding the study exclusion criteria, the decision to limit the analysis to the English language led to the exclusion of a small number of papers published in Spanish or Chinese, but we consider it unlikely that these results would have significantly changed the main outcomes of this review. The decision to limit the review to papers published after 2000 also excluded a small number of papers but it was considered that the results of more historical studies were likely to have limited transferability to current LF programmes. Finally, our search for unpublished data was limited. It is likely that some studies examining surveillance methods are conducted as part of routine LF programmatic activities and, hence, not published. If collected, such data could strengthen the evidence base in this area.

Conclusions
This is the first review to systematically investigate the evidence supporting alternative (non-TAS) approaches to LF surveillance in low prevalence and post-validation settings. The results demonstrate a need for a more standardised approach to LF surveillance in low prevalence and post-validation settings. Surveillance methods with greater sensitivity and more targeted sampling strategies to better detect residual hotspots than the current TAS methodology will be required. However, further research on the diagnostic performance and cost-effectiveness of new diagnostic tests, and how these can be integrated within routine health system activity, is needed to inform policy decisions over the next decade.