Table 1.
Example surveillance system design parameters and their potential impacts on surveillance performance.
Fig 1.
Schematic of the DIOS framework.
The surveillance system optimization procedure uses data and knowledge about disease transmission and case ascertainment to identify optimal surveillance designs with regard to predefined surveillance goals. First, a disease system model is defined, using observed epidemiologic data and/or theory, and taking into account relevant factors influencing disease dynamics or distribution. Multiple realizations of disease data may be generated to explore optimal designs under uncertainty or variability of the underlying system (see Specify and parameterize disease system model). Furthermore, an ensemble of disease models can be combined to reduce the chance of model misspecification. Next, a surveillance model is defined to represent how information on the state of the disease system is captured as a function of design parameters θ and any other relevant variables (e.g., factors known to affect the sensitivity and specificity of a diagnostic test, or estimated underreporting rates for an area; see Specify and parameterize surveillance model). To initiate the optimization process, an initial design parameter set, θ1, is drawn from the design space subject to operational constraints g(θi) ≤ 0, h(θi) = 0 and, along with the underlying disease data, input to the surveillance model to generate a realization of surveillance information. The objective function, f, is evaluated based on the disease data and the surveillance information (see Define objective function(s)). If a stopping criterion (e.g., reaching a large number of iterations; de minimis improvement in objective function) is not met, a new design parameter set, θi, is proposed from the design space using metaheuristic search algorithms (e.g., simulated annealing, genetic algorithm, particle swarm algorithm) when the design space is large, or enumeration when the design space is small. This new design parameter set is then used to generate a new realization of surveillance information and evaluation of the objective functions (see Simulation optimization search). After a stopping criterion is met, design parameter sets with the best objective function values are output as optimal surveillance designs.
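The search loop described in this caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the design θ is taken to be a set of k sites chosen from a candidate list, the search is simulated annealing with a geometric cooling schedule, the only stopping criterion is an iteration cap, and `coverage_objective` is a hypothetical stand-in (mean squared distance from grid points to the nearest selected site) for an objective function that would in practice be evaluated on disease data and surveillance information. All names, parameter values, and the random candidate coordinates are invented for the example.

```python
import math
import random


def simulated_annealing(candidates, k, objective, n_iter=2000,
                        t0=1.0, cooling=0.995, seed=0):
    """Search for a k-site design that minimizes objective(design).

    Assumes k < len(candidates). The fixed design size k plays the role
    of an operational constraint on the design space.
    """
    rng = random.Random(seed)
    current = rng.sample(candidates, k)        # initial design
    current_val = objective(current)
    best, best_val = list(current), current_val
    temp = t0
    for _ in range(n_iter):                    # stopping criterion: iteration cap
        # Propose a neighboring design: swap one selected site
        # for a currently unselected candidate.
        proposal = list(current)
        unselected = [c for c in candidates if c not in current]
        proposal[rng.randrange(k)] = rng.choice(unselected)
        val = objective(proposal)
        # Always accept improvements; accept a worse design with
        # probability exp(-(val - current_val) / temp).
        if val <= current_val or rng.random() < math.exp((current_val - val) / temp):
            current, current_val = proposal, val
            if val < best_val:
                best, best_val = list(proposal), val
        temp *= cooling                        # geometric cooling (assumed schedule)
    return best, best_val


def coverage_objective(design):
    """Hypothetical objective: mean squared distance from each point of a
    5 x 5 grid on the unit square to its nearest selected site."""
    grid = [(x / 4, y / 4) for x in range(5) for y in range(5)]
    total = sum(min((gx - sx) ** 2 + (gy - sy) ** 2 for sx, sy in design)
                for gx, gy in grid)
    return total / len(grid)


# Illustrative candidate sites: 20 random points in the unit square.
site_rng = random.Random(42)
candidates = [(site_rng.random(), site_rng.random()) for _ in range(20)]

best, best_val = simulated_annealing(candidates, k=3, objective=coverage_objective)
print(best, best_val)
```

When the design space is small, the same `objective` could instead be evaluated for every feasible design (for example, with `itertools.combinations(candidates, k)`), which corresponds to the enumeration option mentioned in the caption.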
Table 2.
Examples of objective functions for optimization analysis of surveillance networks.
Fig 2.
Simulated data used for surveillance system optimization.
Spatial variation of (A) the risk factor X and (B) log prevalence when ρ = 0.1 and (C) ρ = 0.3. Triangles represent the existing 30 surveyed sites; dots represent the 70 candidate sites; crosses represent two point sources of the risk factor of interest (e.g., locations of mass gatherings); background color in Panel A and contour lines in Panels B and C represent the levels of risk factor X. Three realizations of the log prevalence surface when ρ = 0.1 or 0.3 are shown in S1 Fig.
Fig 3.
Optimal site placement to augment a surveillance network for spatial prediction or effect estimation under scenarios of spatially patchy or smooth disease distributions.
Black triangles represent initially enrolled sites, gray circles represent unselected candidate sites, and the cyan circle indicates the optimal site to add to the network. White crosses represent point sources for risk factor X. Raster colors represent objective function values for hypothetical sites added across a regular 41 × 41 grid in order to visualize the response surface in relation to initial network locations and the underlying risk factor. Colors represent the mean squared error of spatial predictions at unmonitored sites in A and B, and the variance of effect estimation for risk factor X in C and D.
Fig 4.
Results from Pareto optimization under the spatially smooth disease scenario (ρ = 0.3).
(A) Mean squared error of log predicted disease prevalence (OFV1) and variance of causal effect estimate (OFV2) of the Pareto set (colored dots) and all other candidate sites (hollow dots). (B) Locations of the Pareto set (colored triangles), color-coded as in Panel A. Black triangles represent initially enrolled sites, and gray dots represent unchosen candidate sites. Background color in Panel B represents log prevalence when ρ = 0.3 using the same color scheme as in Fig 2C, while contour lines represent levels of risk factor X.
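As a minimal sketch of how a Pareto set like the one in Panel A can be identified, the snippet below keeps every candidate whose pair of objective values is not dominated by any other candidate, assuming both objectives are minimized. The function name `pareto_set` and the (OFV1, OFV2) pairs are hypothetical and purely illustrative; they are not values from the analysis.

```python
def pareto_set(points):
    """Return the indices of non-dominated points (all objectives minimized)."""
    def dominates(a, b):
        # a dominates b if it is no worse in every objective
        # and strictly better in at least one.
        return (all(x <= y for x, y in zip(a, b))
                and any(x < y for x, y in zip(a, b)))

    return [i for i, p in enumerate(points)
            if not any(dominates(q, p)
                       for j, q in enumerate(points) if j != i)]


# Illustrative (OFV1, OFV2) pairs, one per candidate site.
ofv = [(0.40, 0.90), (0.35, 1.10), (0.50, 0.70),
       (0.45, 0.95), (0.60, 0.65), (0.55, 0.80)]

front = pareto_set(ofv)
print(front)  # [0, 1, 2, 4]
```

Here candidate 3 is dominated by candidate 0 and candidate 5 by candidate 2, so neither enters the set. The pairwise comparison is quadratic in the number of candidates, which should be acceptable for a candidate pool of the size shown in Fig 2.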
Fig 5.
Metaheuristic optimization with simulated annealing (spatial prediction, ρ = 0.3).
(A) Mean squared error of predicted log prevalence (OFV1) across iterations of three SA runs. (B) The locations of the optimal 3 sites. Black triangles represent existing sites, blue triangles represent the optimal additional sites, and gray dots represent unchosen alternative sites. Background color in Panel B represents log prevalence when ρ = 0.3 using the same color scheme as in Fig 2C, while contour lines represent levels of risk factor X.