Identifying optimal COVID-19 testing strategies for schools and businesses: Balancing testing frequency, individual test technology, and cost

Background. COVID-19 test sensitivity and specificity have been widely examined and discussed, yet optimal use of these tests will depend on the goals of testing, the population or setting, and the anticipated underlying disease prevalence. We model various combinations of key variables to identify and compare a range of effective and practical surveillance strategies for schools and businesses.
Methods. We coupled a simulated data set incorporating actual community prevalence and test performance characteristics to a susceptible, infectious, removed (SIR) compartmental model, modeling the impact of base and tunable variables including test sensitivity, testing frequency, results lag, sample pooling, disease prevalence, externally-acquired infections, symptom checking, and test cost on outcomes including case reduction and false positives.
Findings. Increasing testing frequency was associated with a non-linear positive effect on cases averted over 100 days. While precise reductions in cumulative number of infections depended on community disease prevalence, testing every 3 days versus every 14 days (even with a lower sensitivity test) reduces the disease burden substantially. Pooling provided cost savings and made a high-frequency approach practical; one high-performing strategy, testing every 3 days, yielded per person per day costs as low as $1.32.
Interpretation. A range of practically viable testing strategies emerged for schools and businesses. Key characteristics of these strategies include high frequency testing with a moderate or high sensitivity test and minimal results delay. Sample pooling allowed for operational efficiency and cost savings with minimal loss of model performance.


Overview
We give a detailed description of the model underpinning the analysis for testing schools and businesses. The modeling framework describes a scenario in which a common group of people who mix continuously (as in a school or office setting) is monitored and tested, and is subject to the introduction of infection from the surrounding (unmonitored) community. Altogether, the model links a testing strategy, described by a number of tunable parameters, to a disease model for COVID-19. The disease model is dynamic in time, as infections spread both from internal mixing and from the surrounding community. The tunable testing parameters correspond to attributes of the tests themselves (like sensitivity and specificity) and also to the elements of the strategy (like how many days between tests). By running a wide range of realistic combinations of these parameters and counting the expected number of tests required, we may estimate the costs of each distinct surveillance strategy. To measure the effectiveness of a strategy, we may also compare it to the disease model running without any testing at all, or to the use of symptom tracking without testing. We begin by describing the testing strategies considered, and then describe the disease model in detail.

Testing strategies
2.1. Symptom tracking. Symptom tracking is a simple (and free!) way to attempt to suppress community spread of the disease. In our modeling, we consider three scenarios:
• No interventions. In this case we run the model without any interventions. The no intervention model establishes a baseline against which various strategies incorporating symptom tracking and testing can be compared.
• Symptom tracking alone. In this scenario, we assume 66% of cases entering a symptomatic phase are caught and a fraction b of these cases self-isolate.
• Testing + Symptom tracking. We adopt a regular regimen of testing in addition to symptom checking.

2.2. Test attributes and tunable parameters.
Our first assumption is that testing happens on a regular cadence: for example, every day, every 2 days, or weekly. Moreover, we assume that every individual in the organization is tested during each round of testing. Each test is characterized by four numbers:
• sensitivity (Se) and specificity (Sp),
• cost (C) in dollars/test, and
• results lag d in days.
Point-of-care tests feature d = 0, while, in the manuscript, traditional lab-based tests are assumed to have d = 2. Other values of d may be appropriate to local circumstances. The two main choices of the organization are the frequency of testing (how many days between tests, τ − 1) and the number of samples to pool (m, if pooling is an option for that particular test). Finally, the model allows less than 100% of individuals with positive tests to comply with isolation protocols. While this number (b) is not determined by organizational leaders, they will be in the best position to estimate realistic values for compliance in their organization. Another decision to be made at an organizational level concerns confirmatory testing. In low prevalence scenarios, especially with a non-specific test, the number of false positives will be large. An organizational commitment to fund confirmatory testing may involve substantial expense.
Tunable testing parameters for the model are listed in Table 1.
[3]. In that work (and in ours), we imagine samples (swabs) from multiple people being pooled and tested all at once. In low prevalence settings, this kind of group testing is widely used in human infectious disease applications, and the simple version we propose here was used by the US military to screen new inductees for syphilis during WWII [1]. In many circumstances, it is possible to greatly reduce the number of tests required to screen a population. However, there is a cost: pooled testing results in decreased sensitivity. Yelin et al. [4], using standard RT-PCR, estimated a sensitivity of $Se = 0.9$ after pooling a single SARS-CoV-2-positive sample together with 31 negative samples. We assume that there are no false negatives in the collection of un-pooled samples. We further assume a linear decrease in sensitivity with respect to negative samples added to the pool. Then, it is a simple matter to calculate the discount rate in sensitivity per additional negative sample:
$$r = \frac{s_0 - s_{31}}{31}. \tag{2.1}$$
In equation (2.1), $s_0$ is the test sensitivity when no true negative samples are pooled with the original true positive sample, and $s_{31}$ is the sensitivity when 31 true negative samples are pooled with the positive sample. With $s_0 = 1$ and $s_{31} = 0.9$, arithmetic yields $r \approx 0.00323$. Finally, for a pooled test with $s$ samples and $p$ true positive samples, the sensitivity of the pooled test, $Se_{gp}$, is calculated from the sensitivity of the individual-level test, $Se_{ind}$, by applying this discount for the $s - p$ negative samples in the pool. In practice, of course, we don't know the number of true positives in the pool, so we assume the same prevalence of infection as in the monitored population. Individual RT-PCR testing for SARS-CoV-2 is highly specific, so a specificity of $Sp = 0.995$ was assumed for all PCR-type tests regardless of the number of samples pooled.
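The pooling discount can be illustrated with a short Python sketch. It is not the authors' code. The function names are hypothetical, and the form of the pooled sensitivity is an assumption: the discount is applied multiplicatively, $Se_{gp} = Se_{ind}\,(1 - r\,(s - p))$, which is one reading of a linear decrease per negative sample. An additive form, $Se_{ind} - r\,(s - p)$, would be another.

```python
def discount_rate(s0: float = 1.0, s31: float = 0.9, n_negatives: int = 31) -> float:
    """Loss in sensitivity per added negative sample (linear-decrease assumption)."""
    return (s0 - s31) / n_negatives


def pooled_sensitivity(se_ind: float, pool_size: int, n_positive: int, r: float) -> float:
    """Pooled-test sensitivity under an ASSUMED multiplicative linear discount.

    se_ind     -- sensitivity of the individual-level test
    pool_size  -- number of samples in the pool (s in the text)
    n_positive -- number of true positive samples in the pool (p in the text)
    r          -- discount rate per negative sample
    """
    n_negative = pool_size - n_positive
    # Clamp at zero so very large pools cannot yield a negative sensitivity.
    return max(0.0, se_ind * (1.0 - r * n_negative))


r = discount_rate()
# One positive pooled with 31 negatives recovers the Se = 0.9 anchor point.
se_32 = pooled_sensitivity(se_ind=1.0, pool_size=32, n_positive=1, r=r)
```

By construction, the sketch reproduces the anchor value of 0.9 for a 32-sample pool containing one positive sample; any other pool size is an extrapolation under the stated assumption.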
When pooling (m > 1) and considering confirmatory testing, we assumed a simple 2-stage Dorfman testing process in which each individual in a positive pool is retested individually using a high-sensitivity diagnostic test. We then calculated the expected number of tests required to complete each round of testing. That is, suppose X is the random variable counting the number of tests required to complete 2-stage pooling. Then, evidently, the quantity of interest is E[X], the expected number of tests needed to complete one round of Dorfman 2-stage pooled testing. This will depend on the prevalence p and on the sensitivity Se and specificity Sp.
For an individual test, the probability of returning a positive result is
$$q = p\,Se + (1 - p)(1 - Sp),$$
and the probability of returning a negative result is
$$1 - q = p\,(1 - Se) + (1 - p)\,Sp.$$
We treat each test in the pool as an independent Bernoulli trial with probability of success $q$, and we denote by $Y$ the random variable counting the number of successful trials in a pool of $m$ samples. By elementary probability,
$$P(Y = k) = \binom{m}{k} q^k (1 - q)^{m - k}, \qquad k = 0, 1, \dots, m,$$
whence $\pi_m$, the probability of having at least one positive test in the pool, is given by
$$\pi_m = 1 - P(Y = 0) = 1 - (1 - q)^m.$$
Finally, if a pool tests positive, then $m$ additional confirmatory tests are required. Otherwise, only the single pool-level test is needed. Therefore,
$$E[X] = (1 - \pi_m) \cdot 1 + \pi_m\,(1 + m) = 1 + m\,\pi_m.$$
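The expected test count for one pool can be sketched in Python as follows. This is an illustration, not the authors' code, and the function names are hypothetical. It takes the per-sample positive probability to be $q = p\,Se + (1 - p)(1 - Sp)$ and the probability that a pool of $m$ samples is flagged to be $\pi_m = 1 - (1 - q)^m$, so that a flagged pool costs $1 + m$ tests and an unflagged pool costs 1.

```python
def prob_positive(p: float, se: float, sp: float) -> float:
    """Probability that a single sample tests positive (true or false positive)."""
    return p * se + (1.0 - p) * (1.0 - sp)


def prob_pool_flagged(m: int, p: float, se: float, sp: float) -> float:
    """pi_m: probability of at least one positive among m independent samples."""
    q = prob_positive(p, se, sp)
    return 1.0 - (1.0 - q) ** m


def expected_tests_per_pool(m: int, p: float, se: float, sp: float) -> float:
    """E[X] for 2-stage Dorfman pooling: one pooled test, plus m retests if flagged."""
    if m < 2:
        raise ValueError("Dorfman pooling needs a pool of at least 2 samples")
    return 1.0 + m * prob_pool_flagged(m, p, se, sp)


# Example: pools of 10 at 1% prevalence with Se = 0.9 and Sp = 0.995.
tests_per_person = expected_tests_per_pool(10, 0.01, 0.9, 0.995) / 10
```

Dividing E[X] by the pool size gives the expected number of tests per person per round; pooling saves tests whenever this value is below 1.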

Disease Model
3.1. Population characteristics and disease. We assume a monitored population of $P$ individuals. To characterize the disease, we use a continuous-time SIR model from classical epidemiology. That is, the monitored population is divided into compartments $S$, $I_A$, $I_S$, and $R$ (susceptible, asymptomatic infectious, symptomatic infectious, and removed). The disease model is characterized by 4 parameters: $\beta$, $\gamma_1$, $\gamma_2$, and $\gamma_3$. The parameter $\beta$ is the per capita rate of effective contacts. The parameters $\gamma_1$, $\gamma_2$, and $\gamma_3$ are the removal rates for the disease, where $\gamma_1^{-1} = \gamma_2^{-1} + \gamma_3^{-1}$ is the average period of infectiousness. We assume $\gamma_1^{-1} = 4.5$ days. Moreover, the ratio is the fraction of cases which are asymptomatic. The literature suggests a very wide range for this number; we follow CDC modeling guidelines and assume this number is 40%. Together, these assumptions supply a system of three equations in the three unknowns $\gamma_1$, $\gamma_2$, and $\gamma_3$. Because the disease model is formulated as a system of differential equations, we must supply initial conditions, namely the values of $S$, $I_A$, $I_S$, and $R$ at the initial time $t_0$. In the manuscript, the initial conditions are chosen from the average of population-scaled new confirmed cases reported by the New York Times for September 23, 2020 in a sample of counties, scaled by $1/\gamma_1$. That is, we begin with 0.675 infections on day 0 (time $t_0$). These are split such that 40% are asymptomatic and 60% are symptomatic. We further assume no one in the population has immunity to the virus based on previous infection or otherwise. Thus $R(t_0) = 0$.

3.2. Equations. The core of the model is formed by the simple nonlinear, nonautonomous system of ordinary differential equations.
In (3.2), the forcing term accounts for the introduction of infections from outside the organization; see §3.5 for an explanation and derivation. A critical epidemiological parameter, $R_0$, the basic reproduction number, is often defined as the average number of infections generated by each infectious individual. For Equation (3.2) we find

3.3. Implementation.
To model symptom tracking, we stop the model each day and remove the appropriate fraction of individuals from the $I_S$ compartment. To model pooled testing with symptom tracking, on days whose number is divisible by τ there is, in addition to the removal from the $I_S$ compartment due to symptom tracking, removal from both infectious compartments due to positive tests. The initial test is assumed to be on day zero. To account for possible delays in receiving test results, we allow for a delay parameter, d. On day τ + d we stop the model and restart it with new "initial conditions" which account for the transfer of the number of people who tested positive from the infectious to the removed compartment. This process is repeated according to the testing strategy defined by τ, m, d, Sp, and Se, and the total number of tests administered and the number of infections caught are recorded. Rounding happens at the end of the simulation, to return whole number values for the number of infections caught.
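The stop-and-restart bookkeeping described here can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' Julia implementation. The function `run_strategy`, the `advance_one_day` callback (a stand-in for integrating the disease model over one day), and the dictionary keys are all hypothetical. The sketch further assumes that results for a test taken on a given day take effect d days later, that the number removed is Se × b × (the infectious count at test time), and that symptom tracking removes 0.66 × b of the $I_S$ compartment each day.

```python
from typing import Callable, Dict

State = Dict[str, float]  # keys: "S", "IA", "IS", "R"


def run_strategy(state: State,
                 advance_one_day: Callable[[State, int], State],
                 days: int, tau: int, d: int, se: float, b: float,
                 symptom_catch: float = 0.66) -> Dict[str, object]:
    """Daily stop-and-restart loop: apply removals, then integrate one more day."""
    state = dict(state)
    pending: Dict[int, Dict[str, float]] = {}  # result day -> positives by compartment
    rounds = 0
    caught = 0.0
    for day in range(days):
        # Symptom tracking: caught and compliant symptomatic cases are removed.
        moved = symptom_catch * b * state["IS"]
        state["IS"] -= moved
        state["R"] += moved
        caught += moved

        # A testing round happens when tau divides the day number (first on day 0).
        if day % tau == 0:
            rounds += 1
            pending[day + d] = {k: se * b * state[k] for k in ("IA", "IS")}

        # Results arriving today move positives into the removed compartment.
        for k, n in pending.pop(day, {}).items():
            n = min(n, state[k])  # cannot remove more than are still infectious
            state[k] -= n
            state["R"] += n
            caught += n

        # Restart the disease model from the adjusted "initial conditions".
        state = advance_one_day(state, day)

    # Rounding happens only at the end of the simulation.
    return {"rounds": rounds, "caught": round(caught), "state": state}
```

Passing the disease dynamics in as a callback keeps the testing schedule separate from the differential equations, so the same loop can be reused with any one-day integrator.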
3.4. The cost of doing nothing. To measure the effectiveness of the various strategies, we can compare them to the strategy of doing no testing or to doing symptom tracking alone. This gives one easily quantified measure, in terms of the reduction in cumulative infections over the time period, for evaluating the performance of a strategy. Figure 1 illustrates the dynamics of the model in the absence of testing.
3.5. Community prevalence and time-dependent forcing. To account for the introduction of infections from the surrounding community, we add a time-dependent forcing term which represents the rate of people becoming infected from an external source continuously in time. With frequent testing, this external forcing drives the behavior of the model. In general, the forcing takes the form of a function of time, measured in people/time, that represents the rate of importation of infections into the organization. A key challenge is that this function is not known in general. We assume proportionality to local case counts, and note that county-level case counts are reported daily. Thus, while the future course of case counts is unknown, current and past values are known to decision makers, so institution leaders will be able to make informed assumptions about local conditions. The continuous dependence on time means that we can test various scenarios, which gives more realism and flexibility than the periodic forcing induced by the "exogenous shocks" considered by Paltiel et al. [2]. In the manuscript, we examine two data-driven scenarios for this forcing, but many possibilities can be incorporated into the model. The two extremes are meant to illustrate one of the key sources of uncertainty in any organization-based testing strategy: the number of infections entering the organization from outside.
• The first is a relatively flat profile, which comes from the 7-day rolling average of the case count in Fayette County, Pennsylvania, for the 100 days beginning March 26, 2020, as reported in the New York Times. We scale the case counts by population, which in the manuscript we choose to be 1500. This low-growth profile is reported as panel (a) in Figure 2.
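A forcing profile of this kind can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code. The function names, the proportionality constant `scale`, the trailing form of the rolling window, and the linear interpolation between days are all assumptions, and the case counts and county population in the example are made-up placeholders rather than the Fayette County data.

```python
from typing import List, Sequence


def rolling_mean(values: Sequence[float], window: int = 7) -> List[float]:
    """Trailing rolling mean; early entries average over the days available so far."""
    out: List[float] = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]  # drop the day that left the window
        out.append(running / min(i + 1, window))
    return out


def forcing_profile(daily_cases: Sequence[float], county_population: float,
                    org_population: float = 1500.0, scale: float = 1.0) -> List[float]:
    """Daily importation rate (people/day), proportional to population-scaled cases."""
    smoothed = rolling_mean(daily_cases)
    return [scale * org_population * c / county_population for c in smoothed]


def forcing_at(profile: Sequence[float], t: float) -> float:
    """Linear interpolation between daily values, so the forcing is continuous in t."""
    if t <= 0:
        return profile[0]
    if t >= len(profile) - 1:
        return profile[-1]
    i = int(t)
    frac = t - i
    return (1.0 - frac) * profile[i] + frac * profile[i + 1]


# Placeholder data: 10 new cases per day in a hypothetical county of 100,000 people.
profile = forcing_profile([10.0] * 100, county_population=100_000.0)
rate_midday = forcing_at(profile, 3.5)
```

The interpolation step is what lets a daily case series drive a model that is continuous in time; other smooth interpolants would serve equally well.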

Core Julia code
The code representing the heart of the dynamic model is reproduced below. Complete code for creating all of the figures and the full table of experiments is available from the authors.