Detecting infection hotspots: Modeling the surveillance challenge for elimination of lymphatic filariasis

Background
During the past 20 years, enormous efforts have been expended globally to eliminate lymphatic filariasis (LF) through mass drug administration (MDA). However, small endemic foci (microfoci) of LF may threaten the presumed inevitable decline of infections after MDA cessation. We conducted microsimulation modeling to assess the ability of different types of surveillance to identify microfoci in these settings.

Methods
Five or ten microfoci of radius 1, 2, or 3 km with infection marker prevalence (intensity) of 3, 6, or 10 times background prevalence were placed in spatial simulations, run in R Version 3.2. Diagnostic tests included microfilaremia, immunochromatographic test (ICT), and Wb123 ELISA. Population size was fixed at 360,000 in a 60 x 60 km area; demographics were based on literature for Sub-Saharan African populations. Background ICT prevalence in 6–7 year olds was anchored at 1.0%, and the prevalence in the remaining population was adjusted by age. Adults ≥18 years, women aged 15–40 years (WCBA), children aged 6–7 years, or children ≤5 years were sampled. Cluster (CS), simple random sampling (SRS), and TAS-like sampling were simulated, with follow-up testing of the nearest 20, 100, or 500 persons around each infection-marker-positive person. A threshold number of positive persons in follow-up testing indicated a suspected microfocus. Suspected microfoci identified during surveillance and actual microfoci in the simulation were compared to obtain a predictive value positive (PVP). Each parameter set was referred to as a protocol. Protocols were scored by efficiency, defined as the most microfoci identified, the fewest persons requiring primary and follow-up testing, and the highest PVP. Negative binomial regression was used to estimate aggregate effects of different variables on efficiency metrics.

Results
All variables were significantly associated with efficiency metrics. Additional follow-up tests beyond 20 did not greatly increase the number of microfoci detected, but significantly negatively impacted efficiency. Of 3,402 protocols evaluated, 384 (11.3%) identified all five microfoci (PVP 3.4-100.0%) and required testing 0.73-35.6% of the population. All used SRS and 378 (98.4%) only identified all five microfoci if they were 2-3 km diameter or high-intensity (6x or 10x); 374 (97.4%) required ICT or Wb123 testing to identify all five microfoci, and 281 (73.0%) required sampling adults or WCBA. The most efficient CS protocols identified two (40%) microfoci. After limiting to protocols with 1-km radius microfoci of 3x intensity (n = 378), eight identified all five microfoci; all used SRS and ICT and required testing 31.2-33.3% of the population. The most efficient CS and TAS-like protocols as well as those using microfilaremia testing identified only one (20%) microfocus when they were limited to 1-km radius and 3x intensity.

Conclusion
In this model, SRS, ICT, and sampling of adults maximized microfocus detection efficiency. Follow-up sampling of more persons did not necessarily increase protocol efficiency. Current approaches towards surveillance, including TAS, may not detect small, low-intensity LF microfoci that could remain after cessation of MDA. The model provides many surveillance protocols that can be selected for optimal outcomes.

Introduction
Disease elimination is the endgame for much infectious disease-related public health work. Considered infinitely cost-effective when successful [1], elimination or eradication programs cost little per case prevented in the beginning, and enormous sums per case prevented at the end as efforts to prevent, detect, or treat every last case continue despite few remaining cases [2]. In part due to resource challenges and donor fatigue, efforts to eliminate infectious diseases have more often failed (malaria, yaws, yellow fever, hookworm) than succeeded (smallpox, rinderpest) [3]. Currently, several infectious diseases including polio, onchocerciasis, guinea worm, trachoma, malaria, and lymphatic filariasis are targeted for elimination or are at various stages of elimination or eradication programs [2,4,5].
Lymphatic filariasis (LF), a mosquito-borne filarial disease causing lymphedema, hydrocele, and elephantiasis, has been targeted by the World Health Organization's (WHO) Global Program to Eliminate Lymphatic Filariasis (GPELF) for elimination as a public health problem by 2020. For LF, this is defined as interruption of transmission using preventive chemotherapy, and management of morbidity and prevention of disability in persons already infected. The GPELF recommends steps to achieve interruption of transmission, including (i) mapping to define endemic areas; (ii) mass drug administration (MDA) in endemic areas to reduce infection below a threshold at which transmission is considered unsustainable; (iii) conducting and passing transmission assessment surveys (TAS) as a prerequisite for stopping MDA; (iv) post-treatment surveillance (PTS) after stopping MDA, comprising two repeat TAS and ongoing surveillance for at least five years; and (v) development of a dossier documenting these steps to achieve validation of the elimination of LF as a public health problem [6]. Although specifics of the last component are still to be determined, there is no doubt that complete elimination of LF transmission is the ultimate goal.
Unlike the elimination of smallpox or polio, the path to elimination for LF likely does not require the absence of every infection. This is largely due to the poor transmission characteristics of LF: multiple infective mosquito bites are needed to establish a patent infection with the causative filarial agents, and at least one pair of opposite-sex worms must be present for an infected person to manifest infectious microfilariae. The likelihood of both occurrences decreases as infection prevalence declines during multiple years of MDA [7-9]. Passing the TAS requires the identification of fewer infections during the survey than a pre-set cutoff, intended to signify a mean LF prevalence below which infections are likely to irreversibly decline. The TAS involves a community-based survey of 6-7 year old children in areas where school enrollment is <75%, or a school-based survey where enrollment is at least 75%. The design is usually a cluster survey, and the threshold is set at either <2% antigenemia (in W. bancrofti-endemic areas with Anopheles or Culex as the principal vector) or <1% antigenemia (in W. bancrofti-endemic areas where Aedes is the primary vector). In Brugia-endemic areas, thresholds are set for <2% antibody prevalence [10,11]. 'Passing' the TAS (detecting no more positive children than the critical cutoff value specified in the guidelines) is a prerequisite for stopping MDA.
However, whether or not this cutoff universally leads to a decline in infections is unclear. The existence of LF (and other diseases, such as malaria) in endemic foci as small as 1 km in diameter [12-18] before or during MDA suggests at least the possibility of residual endemic foci after treatment. The few data that exist about LF in post-MDA settings suggest that there are residual foci [19-21] amid large areas relatively free of infection. In addition, the area and population over which TAS are carried out vary widely [22], and may include as many as 2 million persons. Thus, an average antigenemia prevalence of 1% or 2% among 6-7 year old children might look quite different in different areas, depending on multiple factors both affected and not affected by LF elimination program activities. Beyond this, the absence of infection markers in children is not necessarily associated with the absence of infection and transmission among adults. Even in post-MDA settings, adults have a higher prevalence of infection markers than children [19,23-25]; whether or not these adults are actively transmitting infection is unclear. While single individuals infected with LF who are surrounded by large areas without infections are unlikely to restart active transmission cycles, there clearly exists a number and concentration of infected persons above which transmission will be sustained or expand. In these situations, the average antigenemia prevalence may indeed be below the cutoff in children aged 6-7 years without cessation of transmission; elimination of LF 'as a public health problem' may be briefly achieved, only to be lost in coming decades due to recrudescence.
For countries stopping MDA, PTS represents the last opportunity to detect any remaining foci of infection that may still lead to recrudescence [6,10]. However, methods which efficiently detect and address infections, including small residual foci of infections, in a low-prevalence setting are undefined, particularly for a disease such as LF for which infectiousness and clinical symptoms may be separated by years or even decades [26-28]. In this paper, we use microsimulation modeling to compare the effectiveness of different programmatic surveillance approaches in detecting both residual endemic foci ('microfoci') and individual, dispersed infections. Data in this paper are intended to provide a realistic framework in which to consider surveillance for low-prevalence infections, not only for LF but also for other diseases targeted for elimination, and shed light on how much assurance different approaches to surveillance can provide during the last stages of an elimination program.

Model flow
Microfoci, or areas of elevated infection prevalence relative to the background, were placed randomly on a map at the start of a simulation, with one household serving as the geographic center. Primary sampling included either 30-cluster sampling (CS) or simple random sampling (SRS), and only occurred in the population group specified by the simulation. The identification of a single infection-marker-positive person triggered follow-up testing ('trigger-based sampling') of the nearest X persons, irrespective of population group, around the initial positive (Fig 1, Table 1). Persons tested in trigger-based sampling included household members of the initial positive and persons living in the next-nearest households. A pre-set number of positives found during trigger-based sampling ('threshold') indicated the identification of a suspected microfocus (and the presumed requirement for action on the part of a country program). The number of true (known) microfoci was divided by the number of suspected microfoci to determine the predictive value positive of each simulation in identifying microfoci.
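The trigger-based follow-up step described above can be illustrated with a minimal Python sketch. It is not the authors' R code (which is in S1 File); the data layout (a list of dicts with hypothetical keys "xy" and "positive"), the function names, and the reading of PVP as the share of suspected microfoci that coincide with a true microfocus are all assumptions made for illustration. The initial positive is excluded from the threshold count, in line with the Discussion's description of the threshold as additional infected persons.

```python
import math

def nearest_persons(persons, index, x):
    """Indices of the x persons closest to persons[index], excluding that person."""
    cx, cy = persons[index]["xy"]
    others = [i for i in range(len(persons)) if i != index]
    # Household members share coordinates, so they sort first (distance 0).
    others.sort(key=lambda i: math.hypot(persons[i]["xy"][0] - cx,
                                         persons[i]["xy"][1] - cy))
    return others[:x]

def follow_up(persons, index, x, threshold):
    """Test the x nearest persons; return (suspected_microfocus, positives_found)."""
    followed = nearest_persons(persons, index, x)
    positives = sum(1 for i in followed if persons[i]["positive"])
    return positives >= threshold, positives

def predictive_value_positive(suspected_labels):
    """Share of suspected microfoci that coincide with a true microfocus.

    Each label is the id of the true microfocus that was hit, or None
    when the suspected microfocus lies outside every true microfocus.
    """
    if not suspected_labels:
        return 0.0
    hits = sum(1 for label in suspected_labels if label is not None)
    return hits / len(suspected_labels)
```

In this sketch a brute-force distance sort is used for clarity; a spatial index would be the more practical choice for 360,000 persons.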

Variables
Each simulation utilized a 60 x 60 km region (3,600 km²), based on the approximate sizes of TAS evaluation areas in Chu et al. [22], and included 360,000 persons with a mean population density of 100 persons/km², similar to population densities of Ghana and Kenya [29]. The mean village size was 1,200 individuals; thus, each simulation comprised 300 villages. A mean household size of six was estimated from the literature [30] and World Family Map data from Ethiopia and Nigeria [31]; the simulation population was developed using population projections for Sub-Saharan Africa 2015 [32]. Villages were heterogeneously distributed throughout the area; a single household was chosen in each village as an 'anchor' and household density in each village decreased as distance from the anchor household increased (Fig 2). Due to the computation time required to create an area, ten simulation areas were created to use in modeling. Density plots of these areas and further details on how the ten areas were used in the simulation models are included in the supplementary materials (S1 Fig). The population proportion for children in 1-year age groups for children ≤5 years and 6-10 years was determined by dividing the total population proportion assigned to those age groups by five (Table 2). Background age-prevalence curves for infection markers were estimated based on the literature [19,24,25], using a background ICT prevalence among 6-7-year-old children of 1.0% to approximate a plausible post-MDA setting (Table 2).
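The arithmetic behind the simulated area, and one possible way to place households around an anchor, can be sketched in Python as follows. The text does not state which distribution governed the fall-off in household density, so the Gaussian scatter and the `spread_km` parameter are assumptions, as is the use of fixed rather than mean village and household sizes.

```python
import random

AREA_KM = 60.0          # side length of the square region
POPULATION = 360_000
VILLAGE_SIZE = 1_200    # treated as fixed here; the paper reports a mean
HOUSEHOLD_SIZE = 6      # treated as fixed here; the paper reports a mean

n_villages = POPULATION // VILLAGE_SIZE                  # 300 villages
households_per_village = VILLAGE_SIZE // HOUSEHOLD_SIZE  # 200 households
density_per_km2 = POPULATION / AREA_KM ** 2              # 100 persons/km^2

def place_village(rng, n_households, spread_km=0.5):
    """Scatter households around a randomly placed anchor household.

    Assumption: offsets are Gaussian, so household density falls with
    distance from the anchor. Points outside the region are redrawn.
    """
    ax, ay = rng.uniform(0, AREA_KM), rng.uniform(0, AREA_KM)
    households = [(ax, ay)]  # the anchor is the first household
    while len(households) < n_households:
        x, y = rng.gauss(ax, spread_km), rng.gauss(ay, spread_km)
        if 0 <= x <= AREA_KM and 0 <= y <= AREA_KM:
            households.append((x, y))
    return households

village = place_village(random.Random(1), households_per_village)
```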
Parameters that were varied in simulations included those related to the simulation area and those related to the sampling approach. Variables related to the simulation area included microfocus size (1, 2, or 3 km in radius), microfocus intensity (3x, 6x, or 10x greater prevalence of infection markers in the microfocus compared to the background infection marker prevalence), and the number of microfoci (5 or 10) included in each simulation, resulting in 18  (Table 2).
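The 18 simulation-area settings follow from crossing the three area variables (3 radii x 3 intensities x 2 microfocus counts). A short Python sketch, with hypothetical variable names, enumerates them and shows how intensity scales a background prevalence:

```python
from itertools import product

RADII_KM = (1, 2, 3)       # microfocus radius
INTENSITIES = (3, 6, 10)   # multiple of background prevalence
N_MICROFOCI = (5, 10)      # microfoci per simulation

area_settings = [
    {"radius_km": r, "intensity": i, "n_microfoci": n}
    for r, i, n in product(RADII_KM, INTENSITIES, N_MICROFOCI)
]

def microfocus_prevalence(background, intensity):
    """Prevalence inside a microfocus, capped at 100%."""
    return min(1.0, background * intensity)
```

For example, with the 1.0% background ICT prevalence anchored in 6-7-year-old children, a 3x microfocus corresponds to roughly 3% prevalence in that age group and a 10x microfocus to roughly 10%.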
Primary Wb123 sampling only evaluated children aged ≤5 years, due to the potential for high prevalence of Wb123 antibodies among older age groups. The number of persons sampled in each cluster during CS was determined by the number of persons targeted for sampling, divided by 30 clusters. If the required sample size (1,800 or 7,200) for primary sampling was not reached in 30 clusters, all persons in the target age group in the 30 clusters were sampled. Additionally, a cluster sample comprising 1,548 6-7-year-olds was drawn to approximate a TAS.
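The cluster-sampling quota and the shortfall rule can be written out as a small Python sketch. The function names are hypothetical, and the sketch assumes that the achieved sample is simply capped by the number of eligible persons across the 30 clusters; how shortfalls within individual clusters were handled is not stated in the text.

```python
import math

N_CLUSTERS = 30
POPULATION = 360_000

def per_cluster_quota(target):
    """Persons to sample in each cluster for a given primary sample size."""
    return math.ceil(target / N_CLUSTERS)

def achieved_sample(target, eligible_per_cluster):
    """Primary sample actually obtained across the selected clusters.

    If the clusters hold fewer eligible persons than the target,
    everyone eligible in those clusters is sampled.
    """
    return min(target, sum(eligible_per_cluster))

# The two primary sample sizes correspond to 0.5% and 2.0% of the population.
fractions = {t: t / POPULATION for t in (1_800, 7_200)}
```

With a narrow age band such as 6-7-year-olds, 30 clusters may not contain 7,200 eligible children, which is consistent with the reported cases where the 2.0% target was not reached.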
Detailed definitions of the model terminology and a full list of outputs are included in the supplementary materials. All simulations were run in R versions 3.1 and 3.2 [33] (code available in S1 File).
Each set of simulated surveillance activities was termed a 'protocol.' We considered the priority of each protocol to maximize 'efficiency', defined as (in this order) maximizing the proportion of microfoci identified (which can also be thought of as the probability that any one microfocus is detected in each simulation), minimizing the number of total tests required to identify them, and maximizing the predictive value positive (PVP) of identification of microfoci. Protocols were sorted to optimize those three relevant outputs, with results reported as median proportions with 95% confidence intervals. The true proportion positive in each microfocus was tested against the background proportion positive for the targeted age group and test method to determine if a significantly higher proportion of positive persons was present in each microfocus. All comparisons were made with a two-sided Fisher's Exact test. The number of statistically significant associations was summed for each protocol and the percentage of microfoci with statistically more infections than background prevalence was recorded. Negative binomial regression was used to estimate aggregate differences across all protocols in regression analyses. Results of regression analyses are reported as rate ratios with 95% confidence intervals and p-values.
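The ordered definition of efficiency amounts to a lexicographic sort. The Python sketch below shows one way to express it; the `ProtocolResult` fields and the example values are hypothetical and are not taken from S1 Table.

```python
from typing import NamedTuple

class ProtocolResult(NamedTuple):
    name: str
    prop_detected: float  # proportion of true microfoci identified
    prop_tested: float    # proportion of the population tested
    pvp: float            # predictive value positive

def rank_by_efficiency(results):
    """Sort by most microfoci detected, then fewest tests, then highest PVP."""
    return sorted(results,
                  key=lambda r: (-r.prop_detected, r.prop_tested, -r.pvp))

# Hypothetical results, for illustration only.
example = [
    ProtocolResult("P1", 0.8, 0.035, 0.40),
    ProtocolResult("P2", 1.0, 0.320, 0.10),
    ProtocolResult("P3", 1.0, 0.320, 0.25),
    ProtocolResult("P4", 1.0, 0.050, 0.05),
]
ranked = rank_by_efficiency(example)
```

Because detection is the first key, a protocol that finds every microfocus outranks a cheaper protocol that misses one, regardless of how many more tests it needs.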

Regression analyses
In total, 6,804 protocols were identified and evaluated in regression analyses, presented in Tables 3 and 4. Modifying nearly all variables included in the model significantly affected the three relevant outputs.

Table 2. Population proportion and infection marker test status in background population.
Microfilaremia was estimated to be 10-fold lower than ICT in every age group. Wb123 prevalence was assumed to be four-fold higher than ICT prevalence among persons <20 years of age, 4.5 times ICT prevalence among persons 20-40 years of age, and 5-fold greater than ICT prevalence among persons >40 years of age [19,23-25]. Because these are based on test prevalences derived from real data (rather than gold standard prevalences for each infection marker), test sensitivity and specificity were not employed.

Table 3. Effect of varying the model variables on the median predictive value positive of identifying microfoci, the median proportion of the population requiring testing, and the median proportion of microfoci detected. In this analysis, TAS-like protocols are excluded. The threshold for all microfilaremia testing is set at 1. For ICT tests, the threshold for identification of a suspected microfocus is 1 positive when 20 follow-up tests are used, and 2 in all other cases. For Wb123 testing, the threshold is 50% of the persons followed up testing positive.

Predictive value positive
Maximizing the PVP is critical to protocol efficiency; protocols with low PVP waste resources by causing follow-up actions on suspected microfoci that are not truly microfoci. Increasing the number of microfoci primarily affected the PVP: having more microfoci in the model increased the likelihood that any microfocus investigated would be a 'true' microfocus. Similarly, increasing the radius of each microfocus increased the PVP, but had less effect on the median proportion of persons tested in each protocol. However, it did significantly increase the median proportion of microfoci detected.
Increasing the intensity of the microfocus caused a large and significant increase in PVP and the proportion of microfoci detected in each model, but had a smaller effect on the proportion of persons tested. Increasing the proportion of the population sampled in primary sampling increased the median total proportion of persons tested, but also increased the median proportion of microfoci detected, without affecting PVP (that is, more true microfoci were detected with increased primary sampling, but the number of false positive microfoci did not increase). All three metrics were significantly improved by using SRS instead of CS as the primary sampling methodology. In contrast, when using Wb123 or microfilaremia testing, compared with ICT, only PVP was increased; the proportion of persons tested declined but the median proportion of microfoci detected also declined significantly.
Increasing the number of follow-up tests generally caused a decrease in the PVP and had relatively small (though significant) effects on the proportion of microfoci detected. However, it did have a large and significant effect on the proportion of persons requiring testing (Table 3), with nearly five times more persons requiring testing when 500 persons were followed up instead of 20 persons.
To investigate the effect of testing different age groups on the same metrics, we additionally excluded Wb123 testing, which only included children ≤5 years of age. The results for all other variables (Table 4) were largely the same as described above (Table 3). Results for women of child-bearing age are largely the same as results for all adults: compared with testing children, testing adults generally increases the PVP minimally, and increases both the proportion of the population tested and the proportion of microfoci identified more substantially.
Based on the above results and to represent likely post-MDA scenarios as simply as possible, for the remaining analyses we limited the number of microfoci to five (n = 3,402 protocols). All 3,402 protocols with relevant inputs and outputs are available in S1 Table.

Table 4. Effect of varying the model variables on the median predictive value positive of identifying microfoci, the median proportion of the population requiring testing, and the median proportion of microfoci detected. In this analysis, both TAS-like protocols and Wb123 protocols are excluded. The threshold for all microfilaremia testing is set at 1. For ICT tests, the threshold for identification of a suspected microfocus is 1 positive when 20 follow-up tests are used, and 2 in all other cases.

Description of microfoci
Microfoci are described in Table 5. Although most simulations resulted in more infections in each microfocus compared with expected infections at background prevalence, we separated the simulations into those where >80% vs ≤80% of microfoci had statistically significantly more infections than would be expected (Table 5, shaded vs unshaded boxes). In total, 57 (70%) of the 81 simulation combinations shown had >80% of microfoci with statistically significantly more positives, as measured by the target infection marker in the target age group, than expected at background rates. Simulations with larger and more intense microfoci, those utilizing more sensitive tests, and those involving adults were more likely than others to have statistically significantly more infections in each microfocus than background.
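The comparison of each microfocus against background used a two-sided Fisher's Exact test. A minimal standard-library Python sketch of that test on a 2x2 table is given below; it is an illustration rather than the authors' R code, and the counts in the usage example are hypothetical (9 of 300 positive in a microfocus versus 30 of 3,000 in the background, i.e. 3% versus 1%).

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Rows: microfocus, background. Columns: positive, negative.
    Sums the hypergeometric probabilities of all tables (with the same
    margins) that are no more likely than the observed table.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = comb(n, col1)

    def prob(x):
        # Probability that x of the col1 positives fall in row 1.
        return comb(row1, x) * comb(n - row1, col1 - x) / denom

    lo, hi = max(0, col1 - (n - row1)), min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)

# Hypothetical counts: microfocus 9/300 positive, background 30/3,000.
p_value = fisher_exact_two_sided(9, 291, 30, 2970)
```

When background prevalence is very low (for example microfilaremia in children), even a 3x microfocus may contain too few positives for this test to separate it from background, which is the situation described for the unshaded cells of Table 5.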

Protocols which identify the most microfoci
Of the 3,402 protocols, 384 (11.3%) identified all five microfoci. The number of protocols identifying all five microfoci declined as size and intensity of the microfoci decreased (Table 6).
The 384 protocols required testing a range of 0.73%-35.6% of the total population, and had a range of PVPs of 3.2-100.0%. All used SRS and 378 (98.4%) only identified all five microfoci if they were 2-3 km diameter or high-intensity (6x or 10x) (Table 6). Of the 384, 374 (97.4%) required ICT or Wb123 testing to identify all five microfoci, and 281 (73.0%) required sampling adults or WCBA. The top 10 most efficient protocols (those which identified the most microfoci with the fewest tests at the highest PVP) are shown in Table 7.

Protocols which identify small, low-intensity microfoci
The most efficient protocols identified during this initial evaluation invariably involved models with high-intensity (6x or 10x) and large (radius 2-3 km) microfoci. However, most residual microfoci post-MDA are likely to be low-intensity and small. To address this, we limited our model to include only protocols with microfoci of size 1 km in radius and 3x intensity. Of these, eight (2.1%) protocols identified all five microfoci; nine (2.4%) identified four; 13 (3.4%) identified three; 18 (4.8%) identified two; 39 (10.3%) identified one; 291 (77.0%) identified no microfoci. Among the eight identifying all five microfoci, each required testing of 31%-33% of the population. As the proportion of microfoci identified decreased, the proportion of the population requiring testing similarly decreased. In Table 8, we show a sample of the most efficient protocols when 100%, 80%, 60%, or 40% of microfoci are identified.
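The breakdown above can be checked with a few lines of Python: the six counts sum to the 378 protocols with 1-km, 3x microfoci, and the percentages follow from that denominator. The variable names are illustrative only.

```python
# Number of microfoci identified -> number of protocols (from the text above).
counts = {5: 8, 4: 9, 3: 13, 2: 18, 1: 39, 0: 291}

total = sum(counts.values())
percent = {k: round(100 * v / total, 1) for k, v in counts.items()}

# Share of protocols that identified at least one microfocus.
any_detected = round(100 * (total - counts[0]) / total, 1)
```

Read this way, fewer than one in four of these protocols identified any small, low-intensity microfocus at all.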
Because all of the 'most efficient' protocols in the preceding examples specified ICT and SRS, we explored the most efficient options using different test types and diagnostic methods with small, low-intensity microfoci. Table 9 shows the most efficient protocols using microfilaria testing, Wb123 testing, or cluster sampling. The most efficient protocols using CS, MF, or Wb123 fail to identify more than one or two of the five microfoci and are markedly less efficient than the ICT/SRS protocols shown in Table 8. Spatial modeling for surveillance for lymphatic filarisis

Children vs adults
Different settings will facilitate sampling of different populations with greater ease. For this reason, we compared the most efficient sampling protocols limited to 6-7 year olds, WCBA, and adults. To facilitate a fair comparison across population groups, we limited the radius of microfoci to 1 km, the intensity to 3x, and the threshold to 1. The results are shown in Table 10. Both WCBA and adult sampling outperform sampling of 6-7 year olds in terms of efficiency. Children <5 years were not included in this evaluation as they were only evaluated with Wb123.

Cluster sampling protocols
Simple random sampling consistently resulted in higher efficiencies than cluster sampling. However, recognizing that simple random sampling might not always be feasible, we evaluated the peak efficiency of cluster sampling protocols at varying microfoci radii and intensities. In total, 1,458 protocols used cluster sampling (not including TAS-like sampling). The protocols identified 0%-40% of the five microfoci, and tested 0.9%-8.0% of the total population. Only 74 (5%) of the 1,458 protocols identified 40% of the microfoci; the remainder identified fewer. All involved large (3-km), high-intensity (10x) microfoci. The three most efficient protocols with cluster sampling are presented in Table 11. The targeted primary sample size was not achieved for 180 of 729 cluster sampling protocols specifying 2.0% primary sampling; all involved 6-7 year olds.

Table 8. Protocols which required the fewest total persons to be tested to identify 100%, 80%, 60%, or 40% of the microfoci of size 1 km in radius and 3x in intensity.

Sampling protocols that use microfilaremia testing
Microfilaria testing is used in many countries despite challenges with sensitivity, which decreases as the prevalence of infection declines. As demonstrated above, for small, low-intensity microfoci, child populations are much less likely than adults or WCBA to have statistically more microfilaremia-positive persons than background (Table 5), and thus detection of any microfocus using microfilaremia testing in children is unlikely (or due to chance). In total, 1,458 protocols used microfilaria testing; of these, 10 identified 100% of the microfoci by testing 3.0%-7.0% of the population. These primarily involved large, high-intensity microfoci. When the parameters were limited to microfoci of radius 1 km and 3x intensity, the highest proportion of microfoci that could be identified was 20%, by testing 5.5%-5.9% of the population. Table 12 shows the top two most efficient protocols when 100%, 80%, 60%, or 40% of the microfoci are identified. Table 13 shows the three most efficient protocols using microfilaremia testing when microfoci are limited to radius 1 km and 3x intensity. Notably, even though 81% of the 1 km, 3x-intensity microfoci have statistically greater numbers of infected adults than background as indicated by microfilaremia testing (Table 5), the most efficient protocol in adults detects only one in five microfoci (Table 13).

TAS-like sampling
To evaluate how well the TAS would perform at detecting microfoci of various sizes, we simulated a TAS (as described in the Methods). As with the other protocols, each of these did incorporate trigger-based follow-up testing around each positive and a threshold for determining a suspected microfocus; thus, each demonstrates how few people could be tested and how many microfoci found if follow-up testing around infected persons was carried out. There were 486 protocols that utilized TAS-like sampling. The most efficient of these, using each diagnostic tool, are shown in Table 14. Although each required testing <1% of the population, none of the protocols identified more than one of the five microfoci in each simulation.

Discussion
Endemic foci that lead to stable or increasing numbers of infections threaten the success of infectious disease elimination or eradication programs. Current approaches to post-treatment surveillance for LF require that we assume the absence of infections in between well-defined areas where infections are known to be absent. Due to the highly focal nature of LF endemicity, this approach may not be sufficient to confirm LF elimination. The model presented here provides several important pieces of information about what can be expected from various types of surveillance in terms of identifying small endemic disease foci.
First, we confirm the challenges in detecting small, low-intensity microfoci. While this in itself is unsurprising, this model demonstrates precisely how much additional effort is needed to identify microfoci (and which diagnostic test and follow-up testing combinations can do so most efficiently) as they become incrementally smaller and/or less intense. Programs may wish to have a specific level of confidence in their ability to detect microfoci of specific size, as measured by a specific marker: this model enables them to select approaches that could yield that level of certainty. While many protocols enabled the detection of all five large (3-km radius), high-intensity (10x) microfoci while testing small proportions of the total population, the only protocols that identified all five small, low-intensity microfoci required testing of an impractically high number of persons (>30% of the population). Notably, there are some protocols (Table 8) which identify most of the microfoci (4 of 5) and require testing of only 3.5% of the population (12,600 persons in this model). While this may seem like a high number, the use of protocols such as this two years in a row might provide reasonable confidence in the absence of microfoci. These protocols include simple random sampling of adults or women of childbearing age using ICT, conducting follow-up testing of the 20 nearest persons to any identified positives, and using a threshold of just one additional infected person to identify areas needing additional programmatic attention. One way to achieve this in a country with high antenatal clinic attendance might be to test all women attending antenatal clinics until the sample size is reached for two years in a row.
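The suggestion of repeating such a protocol two years in a row can be made concrete with a short Python calculation. It treats the proportion of microfoci identified (4 of 5) as the per-round probability of detecting any one microfocus, as the Methods allow, and it additionally assumes that rounds are independent and that the microfocus does not change between rounds; both assumptions are ours and are likely optimistic.

```python
POPULATION = 360_000

def persons_tested(proportion, population=POPULATION):
    """Number of persons tested for a given proportion of the population."""
    return round(proportion * population)

def detect_at_least_once(p_single, rounds):
    """Probability a microfocus is detected in at least one of several rounds.

    Assumes independent rounds with the same per-round detection probability.
    """
    return 1 - (1 - p_single) ** rounds

tested_per_round = persons_tested(0.035)       # 3.5% of the population
two_round_prob = detect_at_least_once(0.8, 2)  # about 0.96 under independence
```

Under these assumptions two rounds would cost about 25,200 tests in total, still far fewer than the more than 30% of the population needed to find all five microfoci in a single round.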
Interestingly, follow-up sampling of 500 persons around each infected person in this model, as is done in Togo [34], did not appear to provide more confidence in detection of microfoci than follow-up sampling of 20 persons, and was carried out at a large cost to the predictive value positive. The model also demonstrates that microfoci are exceedingly difficult to detect during PTS using a tool as insensitive as microfilaria smears. This is due to the low estimated prevalence of microfilaremia in a post-MDA population, and the declining sensitivity of microfilaria testing as the prevalence of infection (and thus the number of circulating microfilariae overall and per infected person) decreases [7]. In child populations, smaller, lower-intensity microfoci are essentially invisible, due to the small size of the target population and the very low prevalence of microfilaremia in the background child population. In this model, microfilaria testing can identify large and intense microfoci, but even the best protocols only identified one of five microfoci when they were small and low-intensity, and all required testing of >5% of the total population. Combined with the need to sample persons in most areas between 10 pm and 2 am, microfilaria testing is unlikely to be a practical solution for monitoring the success of an elimination program. Using ICT identifies more, less intense, smaller endemic foci of infection while testing fewer persons.
We additionally show the challenges of cluster sampling, as compared with simple random sampling, in detecting microfoci. The primary advantages in cluster sampling are logistical, as fewer areas need to be visited during a survey than would be required for simple random sampling. However, even in settings of large, high-intensity microfoci, a maximum of only 40% of microfoci were identified in this model. In these protocols, relatively few persons are tested overall due to the low proportion of persons tested during primary sampling and the follow-up of only 20 persons around each positive. When considering small, low-intensity microfoci, a maximum of one of five microfoci was detected by cluster sampling, at a total cost of testing approximately 3% of the population. Using the ostensibly more sensitive Wb123 test does not yield a meaningful improvement on this metric, although it does improve the predictive value positive: ICT testing of adults or WCBA is clearly the most efficient way of detecting small, low-intensity microfoci regardless of whether cluster sampling or simple random sampling is used. Importantly, TAS-like sampling performed poorly in this model regardless of diagnostic test type used, microfocus size, or microfocus intensity; too few areas were covered during sampling to identify more than one of five microfoci in each simulation.
Microfoci, and not individually dispersed infections, pose the greatest risk for recrudescence of LF. However, the uncertainty surrounding what type of microfoci will spread without further intervention has been a stumbling block in setting LF elimination program targets. Conducting studies to determine which microfoci will spread is unethical: one cannot identify infected persons and deny treatment to evaluate the potential for infection propagation. Results from this model allow us to consider changing our approach to PTS entirely. Current methods focus on measuring average antigen prevalence, a metric that becomes less relevant as evaluation unit size and focal endemicity increase. Instead of targeting maximum tolerable average antigen prevalence for PTS, a decision about the maximum tolerable size and intensity of residual microfoci, and the confidence we desire in their absence in a post-MDA setting, could be considered. For example, we may wish to be 80% confident in the absence of residual microfoci >2 km in diameter, with intensities of infection ≥3 times the overall antigen prevalence in the population. We may choose to use ICT, and know that we want to test adult outpatients to approximate SRS. Under this framework, different simulations could be examined to determine which provided at least 80% confidence in the absence of such microfoci.
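One way such a screening step might look is sketched below. It assumes that each protocol has been run over replicate simulations containing microfoci of the target size and intensity, and it treats the fraction of replicates in which those microfoci were detected as a rough stand-in for confidence in their absence after a negative survey. That reading is a simplification. The protocol names and replicate outcomes are invented for illustration and are not results from this model.

```python
from typing import Dict, List


def detection_fraction(detected_flags: List[bool]) -> float:
    """Fraction of replicate simulations in which the target microfoci were detected."""
    if not detected_flags:
        raise ValueError("at least one replicate is required")
    return sum(detected_flags) / len(detected_flags)


def protocols_meeting_target(
    runs_by_protocol: Dict[str, List[bool]], target: float = 0.80
) -> List[str]:
    """Names of protocols whose detection fraction reaches the target, sorted."""
    return sorted(
        name
        for name, flags in runs_by_protocol.items()
        if detection_fraction(flags) >= target
    )


# Hypothetical replicate outcomes (True = target microfoci detected).
hypothetical_runs = {
    "ICT_adults_SRS": [True] * 9 + [False] * 1,        # 9 of 10 replicates
    "ICT_children_cluster": [True] * 4 + [False] * 6,  # 4 of 10 replicates
}

acceptable = protocols_meeting_target(hypothetical_runs, target=0.80)
print(acceptable)
```

With more replicates per protocol, an interval estimate around each detection fraction would be a natural addition before comparing it with the 80% target.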
There are several limitations to this model. First, while we tried to design a conservative landscape that captured plausible post-MDA settings, there are currently few data to inform the true post-MDA situation with regard to residual infections or infection markers, particularly Wb123, for which the meaning of a positive test remains unclear. Because Wb123 was simply simulated as a more sensitive test than microfilaremia or ICT, it may be appropriate to consider the results for this test from that perspective, rather than as representing the actual performance of Wb123. Related to this, a minority of simulations did not yield detectable microfoci; that is, due to the low prevalence of background infections, particularly for children being tested for less-prevalent markers such as microfilaremia, even larger or more intense microfoci rarely had statistically significantly more infection-marker-positive persons than background. Thus, in these simulations, identification of microfoci was more a function of chance than of a true difference in infection marker prevalence. While this is important in terms of understanding model results, it is equally important in considering where the limits of detection lie with different markers. Notably, as mentioned above, it is unclear what comprises a microfocus that would spread without further intervention, and thus we cannot separate residual foci of infection into 'important' and 'less important'. Beyond this, risk of LF is not homogeneously distributed across an area; however, precise prediction of risk from related factors is not available and because of this, we chose to treat all areas as though they were at equal risk. Improved determination of risk factors for infection 'hotspots' could facilitate better identification of priority areas for PTS, although this is unlikely to occur in time for most countries stopping MDA.
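The detectability limit described above can be made concrete with a short binomial calculation. This is a sketch under simplifying assumptions: tested persons are treated as independent draws at a fixed prevalence, which ignores the spatial structure of nearest-neighbor follow-up, and the 1.0% background value is borrowed from the ICT anchor for 6–7 year olds. The function name and the threshold parameter `k` are hypothetical.

```python
from math import comb


def prob_at_least_k_positives(n_tested: int, prevalence: float, k: int = 1) -> float:
    """P(X >= k) for X ~ Binomial(n_tested, prevalence).

    Assumption: tested persons are independent, each positive with the
    same probability 'prevalence'.
    """
    p_fewer_than_k = sum(
        comb(n_tested, i) * prevalence**i * (1 - prevalence) ** (n_tested - i)
        for i in range(k)
    )
    return 1 - p_fewer_than_k


background_prevalence = 0.01   # illustrative: ICT anchor in 6-7 year olds
intensity = 3                  # microfocus at 3x background (lowest modeled intensity)
n_followup = 20                # smallest follow-up size considered

p_focus = prob_at_least_k_positives(n_followup, background_prevalence * intensity)
p_background = prob_at_least_k_positives(n_followup, background_prevalence)

print(f"P(>=1 positive | inside 3x microfocus) = {p_focus:.3f}")
print(f"P(>=1 positive | background only)      = {p_background:.3f}")
```

Under these assumptions, fewer than half of such 20-person samples drawn inside a 3x microfocus would contain any positive person, while a nontrivial share of background-only samples would. For rarer markers such as microfilaremia the two probabilities move closer together, which is consistent with detection in those simulations being driven largely by chance.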
While this model was designed with LF in mind, it can also be applied to surveillance for other low-prevalence diseases, particularly those for which clinical signs cannot be used to estimate concurrent infection prevalence. Hepatitis B virus (HBV) is one such infection for which this might prove useful. Similar to LF, the signs and symptoms of HBV are not overt until many years after infection, and the elimination target involves reducing antigenemia to <2% in 5-year-old children (with the eventual goal of <1% antigenemia in the general population) [35]. Other diseases with nonspecific symptoms, such as malaria, could also benefit from similar modeling.
While this is a simulation, the parameters, particularly for small, non-intense microfoci, are not extraordinary. Large microfoci may be more easily detected during TAS or sentinel/spot-check sampling, although there is no guidance about follow-up testing or actions when persons are identified as positive. Data from this model provide two critical pieces of information. First, to be confident that microfoci, if they existed, would be detected, we need to carry out more robust surveillance than our current TAS requires. Second, microfilaremia testing is unlikely to be useful for PTS if confirmation of elimination is the goal.
To summarize, we show here that our current efforts at post-treatment surveillance will not suffice to detect small, low-intensity microfoci that may remain after cessation of MDA for LF. The use of more sensitive tests and more thorough testing methods is obligatory if we are to have confidence in long-term elimination of LF. Determining surveillance methods that are both practical and useful may require some creativity, and perhaps graded efforts over time that help identify areas of interest during the first year, followed by much smaller continued surveys in subsequent years. High-intensity sampling of TAS-eligible areas would provide important data to improve this type of modeling.