Abstract
We developed and implemented a framework for examining how molecular assay sensitivity for a viral RNA genome target affects its utility for wastewater-based epidemiology. We applied this framework to digital droplet RT-PCR measurements of SARS-CoV-2 and Pepper Mild Mottle Virus genes in wastewater. Measurements were made using 10 replicate wells, which allowed for high assay sensitivity and therefore enabled detection of SARS-CoV-2 RNA even when COVID-19 incidence rates were relatively low (~10⁻⁵). We then used a computational downsampling approach to determine how using fewer replicate wells to measure the wastewater concentration reduced assay sensitivity and how the resultant reduction affected the ability to detect SARS-CoV-2 RNA at various COVID-19 incidence rates. When the percentage of positive droplets was between 0.024% and 0.5% (as was the case for SARS-CoV-2 genes during the Delta surge), measurements obtained with 3 or more wells were similar to those obtained using 10. When the percentage of positive droplets was less than 0.024% (as was the case prior to the Delta surge), 6 or more wells were needed to obtain results similar to those obtained using 10 wells. When the COVID-19 incidence rate is low (~10⁻⁵), as it was before the Delta surge, and SARS-CoV-2 gene concentrations are <10⁴ cp/g, using 6 wells will yield a detectable concentration 90% of the time. Overall, results support an adaptive approach where assay sensitivity is increased by running 6 or more wells during periods of low SARS-CoV-2 gene concentrations, and 3 or more wells during periods of high SARS-CoV-2 gene concentrations.
Citation: Kim S, Wolfe MK, Criddle CS, Duong DH, Chan-Herur V, White BJ, et al. (2022) Effect of SARS-CoV-2 digital droplet RT-PCR assay sensitivity on COVID-19 wastewater based epidemiology. PLOS Water 1(11): e0000066. https://doi.org/10.1371/journal.pwat.0000066
Editor: Ricardo Santos, Universidade Lisboa, Instituto superior Técnico, PORTUGAL
Received: August 31, 2022; Accepted: October 25, 2022; Published: November 16, 2022
Copyright: © 2022 Kim et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All data are submitted to the Stanford Digital Repository and are available publicly. The link to the data is: https://exhibits.stanford.edu/data/catalog/km637ys9238.
Funding: This study was supported by the CDC Foundation (to ABB), NSF RAPID (CBET-2023057 to ABB), and by the Epidemiology and Laboratory Capacity for Infectious Diseases Cooperative Agreement (no. 6NU50CK000539-03-02 to CDPH colleagues) from CDC. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: I have read the journal’s policy and the authors of this manuscript have the following competing interests: D. Duong, V. Chan-Herur, and B. White are employees of Verily Life Sciences.
Introduction
Wastewater-based epidemiology for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is becoming an increasingly important tool in monitoring coronavirus disease 2019 (COVID-19) incidence rates in communities. Monitoring programs that use samples from publicly owned treatment works (POTWs) [1, 2] and from sewer conveyances [3] have actively been implemented to supplement clinical testing data. Wastewater surveillance has the advantage of providing insights into population health by overcoming limitations of clinical testing, such as test seeking behavior and test availability. Wastewater surveillance has also been used to gain insight into the epidemiology of other respiratory viruses such as influenza A virus [4] and respiratory syncytial virus [5] (RSV), as well as gastrointestinal pathogens such as hepatitis A virus [6] and Salmonella [7]. Therefore, wastewater surveillance is likely to become increasingly useful for assessing various aspects of community health beyond COVID-19.
SARS-CoV-2 RNA concentrations in wastewater, whether measured in the solid or liquid phase or using quantitative or digital reverse-transcription polymerase chain reaction (RT-PCR), correlate with laboratory-confirmed COVID-19 incidence rates in the population contributing to the wastewater [8–10]. The lower detection limit, or the sensitivity, of any method for detection of SARS-CoV-2 RNA, or any other disease target, will dictate the lowest levels of disease occurrence that can be detected using wastewater.
Digital (RT-)PCR can be a sensitive method for detecting disease targets in wastewater. Digital (RT-)PCR methods divide the entire (RT-)PCR solution (master mix, primers, probes and template) into a large number of partitions (droplets or physical partitions in a plate) such that each partition likely contains only one copy of template nucleic acid. By increasing the number of partitions (assuming their volume remains constant) and the associated reaction volume, the analytical sensitivity of the measurement can be increased. For most digital (RT-)PCR platforms, the number of partitions can be increased by increasing the number of wells on a 96-well plate used to analyze samples; the results from partitions generated by all replicate wells are merged to compute the final measurement. In this context, a well is an aliquot of fluid from which partitions (droplets for droplet digital PCR) are generated. However, increasing the number of wells used for each sample to improve sensitivity increases project reagent costs. In a previous study, we compared the lowest measurable concentration of SARS-CoV-2 RNA reported by different laboratories using different pre-analytical methods and digital RT-PCR, and the detection limit decreased as the number of replicate wells used in the method increased [11]. Authors have reported using between one and ten merged wells without fully justifying how the number of merged wells was selected [2, 11, 12].
To survey studies that investigated the effect of replicate wells on digital (RT-)PCR sensitivity in environmental sample measurements, we conducted a literature review with the following keywords using Web of Science on July 19, 2022: (((ALL = (air OR soil OR water OR wastewater OR environment*))) AND ALL = ("digital PCR" OR dPCR OR ddPCR OR "digital droplet PCR" OR RTddPCR OR RTdPCR OR dRTPCR OR ddRTPCR OR "digital droplet RT-PCR" OR "digital RT-PCR" OR RT-ddPCR OR RTdd-PCR OR dd-RTPCR OR ddRT-PCR)) AND ALL = (sensitivity OR modeling OR simulation OR "detection limit"). This review resulted in 216 search results: 104 studies passed initial title and abstract screening where the inclusion criteria required a description of digital (RT-)PCR method development or investigation of digital (RT-)PCR sensitivity, and only one study [13] passed the full text screening. This single study [13] proposed statistical models that describe relationships between different digital PCR parameters including number of replicate wells used; however, the study was mostly theoretical and focused on clinical case studies. The other 107 studies mainly focused on comparing digital (RT-)PCR sensitivity to that of other methods like q(RT-)PCR; a few of the papers mentioned the need for replicate PCR samples [14] or the limit of small reaction volume for digital PCR [15–19], with one study pointing out that replicate wells can be merged retrospectively for an increase in sensitivity [20]. To date, no study has empirically investigated how the number of replicate wells affects digital (RT-)PCR sensitivity in environmental samples, and how the resultant sensitivity affects applications of the technology for public health decision making. The present study aims to fill this knowledge gap.
Given the ongoing importance of wastewater surveillance for COVID-19 disease monitoring and its potential use for other disease surveillance, and increasing use of digital (RT-)PCR, it is important to better understand how analytical measurement sensitivity is controlled by increasing the number of (constant volume) partitions, and also how this change in sensitivity affects the use of the measurements for wastewater-based epidemiology applications. To achieve this aim, we measured SARS-CoV-2 RNA and Pepper Mild Mottle Virus (PMMoV) RNA in wastewater solids samples using digital droplet RT-PCR (ddRT-PCR) using 10 replicate wells. We then computationally downsampled the wells to investigate how the number of wells, and thus partitions, affects the lowest measurable concentrations of SARS-CoV-2 and PMMoV genes and associations between SARS-CoV-2 gene measurements and disease incidence rates. The framework developed herein for examining how molecular assay sensitivity for a viral RNA genome target affects its utility for wastewater-based epidemiology is generalizable to other infectious agents and other analytical approaches for measuring molecular targets.
Materials and methods
POTWs and data collection
Data used in this study were obtained from an ongoing SARS-CoV-2 wastewater monitoring program in California, USA described by Wolfe et al. [2]. Samples were collected and processed daily between June 1, 2021 and August 31, 2021 from four POTWs: 80 samples from City of Davis Wastewater Treatment Plant (Dav) in Davis, 83 samples from South County Regional Wastewater Authority Wastewater Treatment Plant (Gil) in Gilroy, 75 samples from Oceanside Water Pollution Control Plant (Ocean) in San Francisco, and 89 samples from San Jose-Santa Clara Regional Wastewater Facility (SJ) in San Jose (listed in order of size from smallest to largest). Further details on the POTWs and sampling procedures can be found in Wolfe et al. [2] and in Table A in S3 Text. The data reported herein have not been previously published.
The solids samples were processed within 24 hours of collection exactly according to the methods described by Wolfe et al. [2] and are summarized in the S1 Text. In brief, dewatered solids were suspended in a buffer, and then 10 replicate aliquots of the buffer containing a suspension of solids were subjected to RNA extraction and purification, followed by inhibitor removal using commercial kits. The RNA from each of the 10 replicates was assayed in a single 20 μL ddRT-PCR well (10 replicate wells total per sample) to determine SARS-CoV-2 N gene and PMMoV gene concentrations; we also measured recovery of spiked-in bovine coronavirus. The approach of using 10 replicate RNA extracts from 10 replicate solids aliquots per sample allows us to account for variability inherent in the wastewater solids matrix, as viral RNA may be dispersed heterogeneously throughout the matrix. While we recognize that some laboratories may not have the resources for such an approach, we believe biological replication is superior to technical replication for environmental samples owing to the inherent variability of complex environmental matrices. EMMI reporting guidelines [21], which promote transparency in methodologies, and results of controls, were followed in our descriptions below.
Data for each individual well were downloaded from QuantaSoft Analysis Pro software (BioRad, CA, version 1.0.596). Samples in which any well had fewer than 10 000 generated droplets were excluded from our analysis. This eliminated a total of 41 SARS-CoV-2 N gene measurements (12 from Dav, 9 from Gil, 17 from Ocean, and 3 from SJ), resulting in 327 measurements for further analysis (80 from Dav, 83 from Gil, 75 from Ocean, 89 from SJ).
COVID-19 epidemiology data
Laboratory-confirmed incident cases of COVID-19 as a function of episode date were obtained as described previously [2]; see S2 Text for details.
Downsampling simulation
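To estimate the SARS-CoV-2 N gene and PMMoV RNA concentrations we would have obtained had we run a smaller number of wells (X = 1–9), we randomly selected X wells from the 10 wells and calculated the resultant concentration from the merged droplet counts using the Poisson relationship:

$$C = \frac{-\ln\left(1 - N_{\mathrm{pos}}/N_{\mathrm{total}}\right)}{0.00085\ \mu\mathrm{L}}$$

where $C$ is the concentration in copies per μL of reaction, $N_{\mathrm{pos}}$ and $N_{\mathrm{total}}$ are the numbers of positive droplets and total droplets summed across the X selected wells, and 0.00085 μL is the volume of a single droplet [22]. If the total number of positive droplets was less than three, the concentration was denoted as not detected (ND).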
A thousand simulations were conducted for each possible number of merged wells (X = 1–9) for each sample. The resulting concentrations were converted to units of cp/g dry weight using dimensional analysis [2]. From these thousand simulations, we calculated 1) the percent of the simulations that resulted in fewer than 3 positive droplets across merged wells and were therefore designated as ND, 2) the median concentration, and 3) its interquartile range (25th and 75th percentiles). No substitution for ND was made, and it was noted whether the median or interquartile range included ND. Similar analyses were performed for ten randomly selected PMMoV measurements from each POTW; each measurement had at least 10 000 total droplets generated in each well. Simulations were conducted using R (version 4.0.4) implemented using RStudio (version 1.4.1106).
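The downsampling procedure described above can be sketched as follows. This is a minimal illustration in Python, not the authors' R code. It assumes the standard Poisson relationship for digital PCR, reports concentrations in copies per μL of reaction (the conversion to cp/g dry weight is omitted), and uses a simple nearest-rank quantile. The function names and the example droplet counts are hypothetical.

```python
import math
import random

DROPLET_VOLUME_UL = 0.00085   # single-droplet volume given in the text
MIN_POSITIVE_DROPLETS = 3     # fewer positives across merged wells -> ND


def merged_concentration(wells):
    """Return copies per uL of reaction for merged wells, or None for ND.

    `wells` is a sequence of (positive_droplets, total_droplets) pairs.
    Assumes at least one negative droplet, so the logarithm is defined.
    """
    positives = sum(p for p, _ in wells)
    total = sum(t for _, t in wells)
    if positives < MIN_POSITIVE_DROPLETS:
        return None
    return -math.log(1.0 - positives / total) / DROPLET_VOLUME_UL


def rank_quantile(ranked, q):
    """Nearest-rank quantile of an already sorted list (a simplification)."""
    return ranked[min(len(ranked) - 1, int(q * len(ranked)))]


def downsample(wells, x, n_sim=1000, seed=0):
    """Merge `x` randomly chosen wells `n_sim` times and summarize."""
    rng = random.Random(seed)
    results = [merged_concentration(rng.sample(wells, x)) for _ in range(n_sim)]
    # Rank ND (None) below every detected value; nothing is substituted.
    ranked = sorted(results, key=lambda c: -1.0 if c is None else c)
    return {
        "percent_nd": 100.0 * results.count(None) / n_sim,
        "q25": rank_quantile(ranked, 0.25),
        "median": rank_quantile(ranked, 0.50),
        "q75": rank_quantile(ranked, 0.75),
    }


# Hypothetical droplet counts for one sample (not data from the study).
example_wells = [(0, 15000), (1, 14000), (0, 16000), (2, 15500), (1, 15000),
                 (0, 14500), (1, 16000), (0, 15000), (2, 14000), (1, 15500)]
summary = downsample(example_wells, x=3)
```

A `None` in the returned median or quartiles indicates that the statistic falls on an ND, which mirrors the practice of noting, rather than substituting, non-detects.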
Statistical analysis
Statistics were computed using R in conjunction with RStudio (see above), using packages pracma and tidyverse for data analysis, and ggplot2 for data visualization. The Shapiro-Wilk test was used to determine whether simulation outputs were normally distributed. The dispersion of the simulation outputs is defined by the interquartile range (IQR). The relative dispersion of the simulation outputs is defined as the ratio of the interquartile range to the median. The dispersion and relative dispersion of the simulated concentrations were compared to the standard deviation of the concentration, or the standard deviation normalized by the concentration, respectively, derived from the measurement obtained using 10 wells. The standard deviation of that measurement is the 68% confidence interval as defined by the total error from the instrument software, which includes Poisson error and variation among wells; the total error formula is proprietary and not available from the vendor.
Nonparametric Kendall’s tau was used to assess the association between the N gene concentrations in wastewater and laboratory confirmed COVID-19 incidence rates. Kendall’s tau was calculated for both the entire time series and the low incidence month of June. Half of the theoretical lower measurement limit for each number of wells was substituted for measurements considered NDs. Half of the theoretical lower measurement limit was chosen to represent an average of all the concentrations below the theoretical lower measurement limit.
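The non-detect substitution and rank correlation described above can be sketched as follows. This is a minimal pure-Python illustration of Kendall's tau-b (the tie-corrected variant, which matters here because substituted NDs share one value); it does not compute a p-value, and the function names are hypothetical. In practice a statistics package would be used.

```python
import math
from itertools import combinations


def substitute_nd(concentrations, lower_limit):
    """Replace ND (None) with half of the lower measurement limit."""
    return [lower_limit / 2.0 if c is None else c for c in concentrations]


def kendall_tau_b(x, y):
    """Kendall's tau-b for two equal-length sequences (no p-value)."""
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0:
            ties_x += 1          # pair tied in x (possibly also in y)
        if dy == 0:
            ties_y += 1          # pair tied in y (possibly also in x)
        if dx * dy > 0:
            concordant += 1
        elif dx * dy < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    denominator = math.sqrt((n_pairs - ties_x) * (n_pairs - ties_y))
    if denominator == 0:
        return float("nan")      # undefined when one variable is constant
    return (concordant - discordant) / denominator


# Hypothetical values: two NDs, a lower limit of 750 cp/g, and made-up
# incidence rates per 100 000.
wastewater = substitute_nd([None, None, 2.0e3, 5.0e3, 4.0e4], lower_limit=750.0)
incidence = [1.0, 1.5, 2.0, 4.0, 12.0]
tau = kendall_tau_b(wastewater, incidence)
```

Because every ND receives the same substituted value, using fewer wells produces more ties, which is one way a smaller X can change tau during low-incidence periods.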
The theoretical lower measurement limit was calculated for each number of merged wells (X = 1–10) by: 1) calculating the concentration resulting from three positive droplets total across merged wells out of 20 000 total accepted droplets (theoretical number of droplets generated) per each well merged, and 2) converting the concentration to units of cp/g dry weight using average solid content for samples from each POTW (Table B in S3 Text).
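The first step of this calculation can be sketched as below, assuming the standard Poisson relationship for digital PCR and the 0.00085 μL droplet volume used in the downsampling simulation. The result is in copies per μL of reaction; the conversion to cp/g dry weight depends on POTW-specific solids content and processing factors that are not reproduced here. Function names are hypothetical.

```python
import math

DROPLET_VOLUME_UL = 0.00085  # single-droplet volume given in the text


def theoretical_lower_limit(x, droplets_per_well=20000, min_positive=3):
    """Concentration (copies per uL of reaction) implied by `min_positive`
    positive droplets across `x` merged wells."""
    total = droplets_per_well * x
    return -math.log(1.0 - min_positive / total) / DROPLET_VOLUME_UL


def percent_positive_at_limit(x, droplets_per_well=20000, min_positive=3):
    """Percentage of positive droplets at the lower measurement limit."""
    return 100.0 * min_positive / (droplets_per_well * x)


# Lower limit for each number of merged wells, X = 1..10.
limits = {x: theoretical_lower_limit(x) for x in range(1, 11)}
```

Because the fraction of positive droplets is tiny at the limit, the limit scales almost exactly as 1/X, which is consistent with the tenfold difference between the one-well and ten-well limits reported in the Results.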
Linear regression was used to derive an empirical relationship between log10-transformed COVID-19 laboratory-confirmed incidence rates and log10-transformed measured SARS-CoV-2 N gene concentrations using data obtained by merging ten wells; relationships were quantified for each POTW separately, and for the POTWs in aggregate. For the linear regression, NDs were substituted with half the theoretical lower measurement limit calculated for X = 10. Using the empirical relationship between incidence rate and SARS-CoV-2 RNA concentration for the associated POTW, the lowest detectable COVID-19 incidence rate was estimated based on the calculated theoretical lower measurement limits.
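The regression and its use to translate a lower measurement limit into a lowest detectable incidence rate can be sketched as follows. This is an ordinary least-squares fit written out by hand for illustration; the data points and the intercept in the example are synthetic and hypothetical, chosen only so that the fitted slope is easy to check.

```python
import math


def fit_loglog(concentrations, incidence_rates):
    """Least-squares fit of log10(incidence) = intercept + slope * log10(conc)."""
    xs = [math.log10(c) for c in concentrations]
    ys = [math.log10(i) for i in incidence_rates]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def incidence_at(concentration, slope, intercept):
    """Incidence rate predicted at a given concentration (cp/g)."""
    return 10 ** (intercept + slope * math.log10(concentration))


# Synthetic points lying exactly on a line with slope 0.64 and a
# hypothetical intercept of -7 (not a value from the study).
concs = [1e3, 1e4, 1e5]
rates = [10 ** (-7 + 0.64 * math.log10(c)) for c in concs]
slope, intercept = fit_loglog(concs, rates)

# Lowest detectable incidence implied by a hypothetical 750 cp/g limit.
lowest_incidence = incidence_at(750.0, slope, intercept)
```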
A logistic regression was used to model the fraction of samples that were assigned a concentration (versus assigned ND) for X = 1–9 as a function of the true concentration of the sample, defined as the concentration obtained using 10 wells. The concentration corresponding to a detection frequency of 0.5 (C0.5) was calculated using the regression equation. For this analysis, half of the theoretical lower measurement limit was substituted for the 6 NDs for X = 10. All code for simulations and statistics is available through the Stanford Digital Repository (https://purl.stanford.edu/km637ys9238).
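For reference, C0.5 follows directly from the fitted coefficients. The short derivation below assumes the logistic model takes the log10-transformed true concentration as its predictor; the text does not state the transformation, so this form and the symbols $\beta_0$ and $\beta_1$ are assumptions made for illustration.

```latex
p(c) = \frac{1}{1 + \exp\!\left[-\left(\beta_0 + \beta_1 \log_{10} c\right)\right]}
\qquad\Longrightarrow\qquad
p(C_{0.5}) = 0.5
\;\iff\; \beta_0 + \beta_1 \log_{10} C_{0.5} = 0
\;\iff\; C_{0.5} = 10^{-\beta_0/\beta_1}
```

If the untransformed concentration were used as the predictor instead, the same argument would give $C_{0.5} = -\beta_0/\beta_1$.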
The Institutional Review Board of Stanford University determined that this project does not meet the definition of human subject research as defined in federal regulations 45 CFR 46.102 or 21 CFR 50.3 and indicated that no formal IRB review is required.
Results
QA/QC
We ran a total of 3–7 negative and 1 positive extraction controls, and 3–7 negative and 1 positive PCR controls per plate. All negative controls were ND and positive controls showed positive detections. BCoV was used as a process control to verify that the extraction was successful and there was no gross inhibition in quantification. Samples that had less than 10% recovery of BCoV were to be rerun; no sample had less than 10% recovery. No further correction or analysis of BCoV recoveries is provided here given the complexities of interpreting recoveries of exogenous controls [23]. PMMoV concentrations across samples are similar to those measured and reported previously, also suggesting no gross issues with extraction or inhibition (Fig A in S3 Text). Data on N and PMMoV gene concentrations are available through the Stanford Digital Repository (https://purl.stanford.edu/km637ys9238).
Measurement overview
A total of 327 samples from four different POTWs were analyzed for the N gene of SARS-CoV-2. When using ten wells, the SARS-CoV-2 RNA concentration ranged from ND to 3.05 × 10⁵ (Dav), ND to 3.64 × 10⁵ (Gil), ND to 1.93 × 10⁵ (Ocean), and 3.09 × 10³ to 2.00 × 10⁵ (SJ) cp/g dry weight. A summary of SARS-CoV-2 RNA concentration for each month is shown in the SI (Table C in S3 Text).
Simulation output trends at high and low concentrations
For each measurement, a thousand simulations were conducted to sample each possible number of merged wells (X = 1–9) and the results are reported as concentration in units of cp/g dry weight (cp/g, hereafter). The resulting concentration distributions obtained for each X for each measurement were not normally distributed based on Shapiro-Wilk tests (p < 0.05); therefore, medians and interquartile ranges are used to describe the results. Simulation outputs of example measurements for SARS-CoV-2 N gene in samples collected during a period of low COVID-19 incidence (June 1, 2021) and high COVID-19 incidence (August 31, 2021); as well as example PMMoV gene measurements (June 6, 2021) are provided in Fig 1. The simulation dispersion can be compared to the results obtained using 10 merged wells and its standard deviation, defined by the total error as reported by the ddPCR instrument, which includes errors associated with the Poisson distribution and variability among replicate wells.
Fig 1. (Top) SARS-CoV-2 N gene in the June 1, 2021 sample during low COVID-19 incidence, (middle) SARS-CoV-2 N gene in the August 31, 2021 sample during high COVID-19 incidence, and (bottom) PMMoV in the June 6, 2021 sample. For X = 1–9, the circle in the box represents the median, and the top and bottom of the box represent the 75th and 25th percentiles, respectively. Any X that resulted in ND in all simulations is marked with an unfilled circle. For X = 10, the circle in the red box represents the software-reported concentration from merging all ten wells, and the top and bottom of the box represent the upper and lower confidence intervals, respectively, from the 68% total error as given by the instrument software, which includes errors associated with the Poisson distribution and variability among replicate wells. The percentage of positive droplets in 10 wells is shown in boxes within each plot.
There are several important insights to glean from these results. First, when the number of positive droplets is less than 3 across the 10 wells (< 0.0015% of droplets positive), and the measurement is deemed as ND (see Dav sample from June 1, 2021), the results obtained from fewer wells agree with the results obtained from 10 wells. Second, when the number of positive droplets is high (for example, for PMMoV where there are 10⁴–10⁵ positive droplets across 10 wells, 5–50% of droplets positive), then the dispersion, as represented by the interquartile range (IQR), in the simulations for X < 10 wells is similar to the standard deviation reported by the instrument for X = 10 wells. Finally, when the number of positive droplets is intermediate to these two regimes (fraction of positive droplets between 0.0015% and 0.5%), then the IQR increases as X decreases and is often greater than the standard deviation from the X = 10 well measurement, particularly when X < 3. Below a SARS-CoV-2 concentration of 10⁴ cp/g (where the fraction of positive droplets is 0.0062%–0.024% depending on POTW), the value of X below which the relative dispersion, defined by IQR divided by the median, is larger than the standard deviation normalized by the measured concentration for X = 10 scales inversely with SARS-CoV-2 N gene concentration (Fig B in S3 Text).
As PMMoV is present in such high concentrations and the measurements yielded high positive droplet counts, resulting in similar concentrations across X = 1–10, the remainder of this analysis will focus on the SARS-CoV-2 N gene concentrations, as those measurements yielded low to intermediate positive droplet counts.
Theoretical sensitivity
An empirical relationship between the log10-transformed COVID-19 incidence rate and the log10-transformed SARS-CoV-2 concentration using ten merged wells was derived with linear regression. The regression showed that for 1 log10 increase in SARS-CoV-2 RNA cp/g, there was between 0.50 and 0.88 log10 increase in laboratory-confirmed COVID-19 incidence rate (Table D in S3 Text), depending on the POTW. The data from all four POTWs appear to fall on a single line when COVID-19 incidence rate is plotted against SARS-CoV-2 concentration (Fig C in S3 Text); when data from all POTW are combined, there was a 0.64 log10 increase in laboratory-confirmed COVID-19 incidence rate for 1 log10 increase in SARS-CoV-2 RNA cp/g.
The theoretical lower measurement limit (Table E in S3 Text) and the corresponding incidence rate lower limit (Table F in S3 Text) were calculated. The theoretical lower measurement limit for each POTW ranged from 7500 (SJ) to 24000 (Gil) cp/g when using only one well and from 750 (SJ) to 2400 (Gil) cp/g when using ten merged wells. Since this theoretical lower measurement limit was calculated using the average solid content of samples from each POTW (determined by measuring the percent weight of the dewatered solids) and assuming a total of 20 000 generated droplets per well, the observed lower measurement limit may be different. The corresponding incidence rate lower limit per 100 000, calculated using the empirical relationships in Table D in S3 Text, ranged from 1.6 (SJ) to 6.9 (Gil) when using one well and from 0.2 (SJ) to 1.4 (Gil) when using ten merged wells (Table F in S3 Text).
Association with clinical data
Time series of median concentrations resulting from a thousand simulations for each measurement for all possible numbers of wells are provided in Fig D in S3 Text, with X = 1, 3, 6 highlighted in Fig 2. Lines representing low number of wells deviate from those representing higher number of wells during the low incidence rate period in June 2021 when SARS-CoV-2 N gene concentrations in wastewater were relatively low. When the entire study period of June 1, 2021 to August 31, 2021 was considered, SARS-CoV-2 N gene concentrations were positively and significantly correlated with 7-day smoothed COVID-19 incidence rates at all four POTWs regardless of X (Table G in S3 Text, tau > 0.54, p < 0.05 for all); X did not have an effect on Kendall’s tau when considering the entire time series (tau changed by < 0.05 as X varied). However, when considering the month of June alone when COVID-19 incidence rate was relatively low, SARS-CoV-2 N gene concentrations were positively and significantly correlated with 7-day smoothed COVID-19 incidence rates only when X > 1 (α = 0.1) and X did affect tau by as much as 0.15 (Table H in S3 Text).
Fig 2. (Top to bottom) X = 1, 3, 6, for wastewater SARS-CoV-2 gene concentration (cp/g dry weight) and 7-day centered smoothed average laboratory-confirmed SARS-CoV-2 incidence rate for each of the four POTWs from June 1, 2021 to August 31, 2021. Note that the SARS-CoV-2 N gene concentrations are displayed in log10-scale format for ease of visualization. Each wastewater data point represents the median SARS-CoV-2 RNA concentration for a single sample obtained from 1000 simulations; for X = 10, each data point is the concentration obtained by merging 10 wells. Samples that resulted in ND were substituted with zero. A figure showing all possible numbers of merged wells is included in the SI (Fig D in S3 Text).
Detection frequency
Detection frequency for SARS-CoV-2 N gene across all POTWs was examined as a function of X and the true concentration of N gene, as defined by the concentration obtained using 10 merged wells (Fig 3). Logistic regressions were fit to the curves (Table I in S3 Text) and the concentration corresponding to a detection frequency of 0.5 (C0.5) was calculated, as well as the corresponding incidence rate using the empirical relationship between log-transformed SARS-CoV-2 RNA N gene concentrations and COVID-19 incidence rate for all POTWs (Table 1). Because few measurements resulted in ND for X > 8, the confidence interval of the logistic regression is relatively large, especially for X = 9 and 10. Therefore, C0.5 derived for X = 9 and 10 is an extrapolation of the data used in this study and may be more uncertain. C0.5 scales with X according to the following equation: log10(C0.5) = -0.13*X + 4.0 (R² = 0.90, p-value < 10⁻⁴). The percentage of samples that fall under C0.5 for each X was calculated for the entire three-month time series and for the month of June. Most samples that fell under C0.5 were collected in June. Fig 4 shows data exclusively from June 2021 to focus on this time period; measurements obtained using X = 10 are shown, and the C0.5 values for X = 1, 3, 6, and 10 are shown as dashed lines. During this low incidence rate period, for X = 1, 81% (92 of 113) of measurements fall below the C0.5 across all POTWs. For X = 3, 38% (43 of 113) of measurements, and for X = 6, 8.8% (10 of 113) of measurements, fall below the corresponding C0.5. For reference, when X = 10, 4.4% (5 of 113) of measurements fall below the C0.5 across all POTWs.
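The reported scaling between C0.5 and X can be evaluated directly. The sketch below uses only the fitted equation quoted above; the inverse function, which returns the smallest X whose C0.5 does not exceed a target concentration, is an illustrative extension rather than something reported in the study, and values of X above 10 lie outside the fitted range.

```python
import math

SLOPE = -0.13      # fitted slope from the text
INTERCEPT = 4.0    # fitted intercept from the text


def c_half(x):
    """C0.5 in cp/g for X merged wells, from log10(C0.5) = -0.13*X + 4.0."""
    return 10 ** (SLOPE * x + INTERCEPT)


def wells_needed(target_concentration):
    """Smallest X (at least 1) with C0.5(X) <= target concentration.

    Illustrative only: results above 10 extrapolate beyond the study.
    """
    x = (INTERCEPT - math.log10(target_concentration)) / -SLOPE
    return max(1, math.ceil(x))


# C0.5 for the numbers of wells highlighted in the text.
for wells in (1, 3, 6, 10):
    print(wells, round(c_half(wells)))
```

Under this fit, C0.5 falls from roughly 7400 cp/g with one well to roughly 500 cp/g with ten wells.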
Fig 3. True concentration is defined as the concentration obtained by merging all ten wells. Each data point shows the fraction of 1000 simulations that did not result in ND on the y-axis and its true concentration on the x-axis. ND for X = 10 was substituted with half of the theoretical lower measurement limit. The 95% confidence intervals of the logistic regression are shown as the gray ribbon.
Fig 4. Note that the SARS-CoV-2 RNA concentrations are displayed in log10-scale format for ease of visualization. Each wastewater data point represents SARS-CoV-2 concentration measured for a single sample. Samples above the lower measurement limit are shown as filled circles. Samples that resulted in ND, shown as empty circles, were substituted with half the lower measurement limit.
Discussion
With the growing interest in application of wastewater-based epidemiology to various infectious diseases, it is crucial to understand how sensitivity of the assay used to measure infectious disease targets impacts the ability to use wastewater measurements to represent community disease burden. The COVID-19 pandemic provides a unique opportunity to investigate this relationship as active disease surveillance during the first year and a half of the pandemic provides relatively robust disease incidence data [24]. In this study, we developed a framework for investigating how the number of merged replicate wells in digital RT-PCR affects the lowest measurable concentration and number of non-detects, and in turn influences the use of the measurements to detect low incidence rates in the community, or to infer trends in disease occurrence. Understanding this relationship is important in optimizing surveillance efforts for COVID-19 and other infectious diseases.
We measured concentrations of SARS-CoV-2 and PMMoV genes daily in wastewater settled solids at four POTWs in California using ddRT-PCR during a period of time that included both low (~10⁻⁵) and high (>10⁻⁴) COVID-19 incidence rates. We used 10 merged wells for the measurements, and then determined how the measurement would have been affected by using fewer than 10 wells through a downsampling scheme. Our findings indicate that when a large fraction of droplets is positive (> 5% positive), as was observed for PMMoV, a virus found in high quantity in human stool and wastewater [25], concentrations measured using just one well are similar to those obtained using ten when considering the variability associated with the measurements. On the other hand, when a smaller fraction of droplets is positive (< 0.5%), as was the case for the SARS-CoV-2 gene measurements and expected for other human viral gene targets, using fewer wells can result in measurements that may vary from those obtained using 10 wells and produce more measurements characterized as non-detects.
For the SARS-CoV-2 gene measurements, variability in measurements increased as the number of wells decreased. Generally, we found that when the fraction of positive droplets was greater than 0.024% (corresponding to a conservative approximate concentration of 10⁴ cp/g dry weight), the variability in the measurement resulting from using 3 or more wells was similar to or smaller than the measurement total error obtained using 10 wells. In contrast, when the fraction of positive droplets was less than 0.024%, the variability in the measurements resulting from using 6 or more wells was similar to or smaller than the measurement total error obtained using 10 wells. These results could guide adaptive analysis plans that use fewer wells to reduce costs when concentrations of SARS-CoV-2 are relatively high.
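One way such an adaptive plan could be encoded is sketched below. The thresholds come from the results above, but the decision function itself, its name, and its treatment of a value exactly at 0.024% are assumptions made for illustration; the study does not prescribe a specific rule, and the thresholds are specific to its methods.

```python
LOW_SIGNAL_THRESHOLD_PERCENT = 0.024  # percent positive droplets, from the text


def percent_positive(positive_droplets, total_droplets):
    """Percentage of accepted droplets that are positive."""
    return 100.0 * positive_droplets / total_droplets


def recommended_min_wells(recent_percent_positive):
    """Suggest a minimum number of merged wells for upcoming samples.

    Assumption: a value exactly at the threshold is treated as 'high'.
    """
    if recent_percent_positive >= LOW_SIGNAL_THRESHOLD_PERCENT:
        return 3   # variability with >= 3 wells was similar to 10 wells
    return 6       # low signal: >= 6 wells were needed


# Hypothetical recent measurement: 30 positives in 200 000 droplets.
recent = percent_positive(30, 200_000)
wells = recommended_min_wells(recent)
```

A laboratory following this kind of rule would base the input on recent measurements made with enough wells to estimate the positive fraction reliably, since the fraction itself is poorly estimated when too few wells are run.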
The probability of obtaining a non-detect increased as the SARS-CoV-2 gene concentration decreased and the number of wells used in ddRT-PCR decreased. Using logistic regression, we identified the concentration corresponding to a detection frequency of 0.5 (C0.5), and this value varies inversely with the number of wells; that is, C0.5 is higher when fewer wells are used for ddRT-PCR. This means that when low concentrations of SARS-CoV-2 genes are expected, using too few wells can result in a large number of non-detects. For example, during the low incidence period of June, if only 1 well had been used instead of 10, 92 of the 113 measurements across four POTWs would have been below C0.5. We found that at least 6 wells were needed for 90% of the June 2021 measurements to be above C0.5.
Consistent with other studies, the wastewater concentration showed positive and significant correlation with 7-day smoothed COVID-19 incidence rates [1–3, 8–12]. When there was variation in COVID-19 incidence rates within the time frame being investigated (here before and during the Delta variant surge), the number of wells being used for the analysis did not affect the magnitude or statistical significance of the correlation. There was a positive and significant correlation even when using only one well because there was enough variation in both variables, although the majority of June measurements were characterized as non-detects. This illustrates that finding a significant correlation between disease incidence and SARS-CoV-2 gene concentrations does not necessarily indicate good measurement sensitivity. It should be noted that while we take the laboratory confirmed COVID-19 incidence rates to be reflective of the level of COVID-19 that the community is experiencing, they are likely an underestimate of incidence rates in the sewershed as the reported incidence rates are dependent on test-seeking behavior and test availability [26].
The results described here on the effect of the number of wells used for ddRT-PCR on the sensitivity of PMMoV and SARS-CoV-2 measurements are extendable to other platforms and other gene targets. Increasing the number of wells is analogous to increasing the volume of the PCR reaction (for any PCR method) and to increasing the number of (constant volume) partitions for digital PCR applications. Although uncommon, some researchers have previously used a similar approach to increase the sensitivity of qPCR by adding the resulting concentration of replicates [27]. Similarly, the recommendations for increased sensitivity herein apply to other gene targets. For any high copy number target, like PMMoV, increased sensitivity is generally not needed, so efforts to improve sensitivity through replication are unnecessary. Examples of other high copy targets in wastewater matrices include the 16S rRNA and crAssphage genes. For lower copy number targets, or rare targets, increased sensitivity is likely needed, particularly if the results will be used for disease surveillance. Examples include other viral targets like norovirus and rotavirus RNA or bacterial targets like those for Salmonella or Campylobacter.
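The analogy between merging wells and adding partitions can be made concrete with a short sketch of how the lowest measurable concentration scales with the number of wells. It assumes a nominal 20,000 droplets per well, a nominal droplet volume of 0.85 nL, and a detection rule requiring at least 3 positive droplets across the merged wells; all three values are illustrative assumptions rather than parameters confirmed by this section.

```python
from math import log


def lowest_measurable_copies_per_ul(n_wells, droplets_per_well=20000,
                                    droplet_volume_nl=0.85, min_positive=3):
    """Lowest concentration (copies per µL of reaction) that meets the detection rule.

    All default parameter values are assumptions chosen for illustration.
    """
    total_droplets = n_wells * droplets_per_well
    # Poisson-corrected copies per droplet when exactly min_positive droplets are positive.
    copies_per_droplet = -log(1.0 - min_positive / total_droplets)
    # Convert droplet volume from nL to µL before dividing.
    return copies_per_droplet / (droplet_volume_nl * 1e-3)


limit_1_well = lowest_measurable_copies_per_ul(1)
limit_10_wells = lowest_measurable_copies_per_ul(10)
ratio = limit_1_well / limit_10_wells
```

Under these assumptions the lowest measurable concentration is almost exactly inversely proportional to the number of merged wells, so ten wells lower it roughly tenfold relative to one well. Converting to copies per gram of solids would additionally require the pre-analytical dilution and mass factors, which are not modeled here.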
There are a few limitations of this analysis. First, we assumed that the measurement obtained using 10 wells is the “true concentration” and compared all results simulated with fewer than 10 wells to the true concentration and its error from the ddRT-PCR instrument. Second, the results presented herein regarding assay sensitivity, and in particular the C0.5 values in Table 1, are specific to the methods applied in this study. The relationships between the number of wells used and both the number of non-detects and the lowest measurable concentration will be affected by the pre-analytical and analytical processes used. Additionally, we were able to run a large number of replicates, each with its own extraction, to capture the variability one might expect in environmental samples; not all labs may be able to do this because of cost constraints. However, we would not expect the general trend of reduced sensitivity with fewer merged PCR wells to change if the replication scheme were different.
Although the specific values in Table 1 are only extendable to other studies using our methods (available on protocols.io [28–30]), the framework for examining the required sensitivity for wastewater surveillance is extendable to all studies. That is, careful attention to how sensitivity affects the lowest measurable concentration and the number of non-detects, as well as to the relationships between these values and laboratory-confirmed COVID-19 incidence rates, is needed to fully understand how decisions on assay implementation are made.
Conclusions
We developed and implemented a framework for examining how molecular assay sensitivity for a viral RNA genome target affects its utility for wastewater-based epidemiology. The framework involves understanding how assay sensitivity affects the lowest measurable concentrations, in units of copies per environmental matrix mass, and the detection probability of a target that is present, and how changes in these quantities across periods of different disease occurrence can affect the resultant statistical associations between the viral target and measures of disease incidence. We applied this framework to digital droplet RT-PCR (ddRT-PCR) measurements of a SARS-CoV-2 gene made using 10 replicate wells, and determined how using fewer wells affected assay sensitivity and its performance for wastewater-based epidemiology applications. From a reagent cost savings perspective, we recommend an adaptive analytical approach where assay sensitivity is increased by running more replicate wells (6 or more) during periods of low SARS-CoV-2 gene concentrations (using our methods, < 10^4 cp/g) and COVID-19 incidence rate (< 3.5/100 000) and fewer replicate wells (3 or more) during periods of higher SARS-CoV-2 RNA concentrations and COVID-19 incidence. While the precise recommendations here are only generalizable if one is using the same pre-analytical and analytical protocols, the framework and the conclusion that adaptive approaches can reduce costs and increase sensitivity during periods of low disease incidence can be applied to other methods and other wastewater-based epidemiology targets.
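The adaptive recommendation can be written as a simple decision rule. The sketch below uses the thresholds stated above (10^4 cp/g and 3.5 cases per 100 000) and returns the minimum number of replicate wells. The function name, the handling of a missing incidence value, and the conservative choice to use more wells when either indicator is low are assumptions for illustration, and the thresholds apply only to the pre-analytical and analytical methods used in this study.

```python
LOW_CONCENTRATION_CP_PER_G = 1e4   # threshold stated for the study's methods
LOW_INCIDENCE_PER_100K = 3.5       # threshold stated for COVID-19 incidence
WELLS_LOW_PERIOD = 6               # minimum wells when concentrations are low
WELLS_HIGH_PERIOD = 3              # minimum wells when concentrations are high


def recommended_min_wells(recent_cp_per_g, incidence_per_100k=None):
    """Return the minimum number of replicate wells for the next measurement.

    Assumption: the higher-sensitivity setting is chosen if either indicator is low.
    """
    low_concentration = recent_cp_per_g < LOW_CONCENTRATION_CP_PER_G
    low_incidence = (incidence_per_100k is not None
                     and incidence_per_100k < LOW_INCIDENCE_PER_100K)
    if low_concentration or low_incidence:
        return WELLS_LOW_PERIOD
    return WELLS_HIGH_PERIOD


wells_before_surge = recommended_min_wells(5e3, incidence_per_100k=1.0)
wells_during_surge = recommended_min_wells(8e4, incidence_per_100k=25.0)
```

A laboratory applying such a rule would need to decide how recent concentrations are summarized (for example, a trailing average) and how non-detects are treated, neither of which is specified here.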
Supporting information
S1 Text. Brief description of experimental methods.
https://doi.org/10.1371/journal.pwat.0000066.s001
(DOCX)
S2 Text. Details on COVID-19 epidemiology data.
https://doi.org/10.1371/journal.pwat.0000066.s002
(DOCX)
Acknowledgments
This study was supported by the CDC Foundation, NSF RAPID (CBET-2023057), and by the Epidemiology and Laboratory Capacity for Infectious Diseases Cooperative Agreement (no. 6NU50CK000539-03-02) from CDC. We thank the California Department of Public Health COVID-19 Wastewater Surveillance, Epidemiology and Data teams for their help with COVID-19 incidence data. We thank Dr. Linlin Li, Michael Balliet, Dr. Pamela Stoddard and Dr. George Han at the County of Santa Clara Public Health Department for provision of case data. Numerous people contributed to sample collection, including Payak Sarkar (SJ), Noel Enoki (SJ), Amy Wong (SJ), Alexandre Miot (Ocean), Lily Chan (Ocean), the Oceanside plant operations personnel, Saeid Vaziry (Gil), Chris Vasquez (Gil), and Jeromy Miller (Dav).
References
- 1. Kantor RS, Greenwald HD, Kennedy LC, Hinkle A, Harris-Lovett S, Metzger M, et al. Operationalizing a routine wastewater monitoring laboratory for SARS-CoV-2. Plos Water. 2022 Feb;e0000007.
- 2. Wolfe MK, Topol A, Knudson A, Simpson A, White B, Vugia DJ, et al. High-Frequency, High-Throughput Quantification of SARS-CoV-2 RNA in Wastewater Settled Solids at Eight Publicly Owned Treatment Works in Northern California Shows Strong Association with COVID-19 Incidence. Langelier CR, editor. mSystems [Internet]. 2021 Sep 14 [cited 2021 Sep 16];6(5). Available from: https://journals.asm.org/doi/ pmid:34519528
- 3. Gibas C, Lambirth K, Mittal N, Juel MAI, Barua VB, Roppolo Brazell L, et al. Implementing building-level SARS-CoV-2 wastewater surveillance on a university campus. Sci Total Environ. 2021 Aug;782:146749. pmid:33838367
- 4. Wolfe MK, Duong D, Bakker KM, Ammerman M, Mortenson L, Hughes B, et al. Wastewater-Based Detection of Two Influenza Outbreaks. Environ Sci Technol Lett. 2022 Jul 5;acs.estlett.2c00350.
- 5. Hughes B, Duong D, White BJ, Wigginton KR, Chan EMG, Wolfe MK, et al. Respiratory Syncytial Virus (RSV) RNA in Wastewater Settled Solids Reflects RSV Clinical Positivity Rates. Environ Sci Technol Lett. 2022 Feb 8;9(2):173–8.
- 6. McCall C, Wu H, O’Brien E, Xagoraraki I. Assessment of enteric viruses during a hepatitis outbreak in Detroit MI using wastewater surveillance and metagenomic analysis. J Appl Microbiol. 2021 Sep;131(3):1539–54. pmid:33550682
- 7. Diemert S, Yan T. Municipal Wastewater Surveillance Revealed a High Community Disease Burden of a Rarely Reported and Possibly Subclinical Salmonella enterica Serovar Derby Strain. Elkins CA, editor. Appl Environ Microbiol. 2020 Aug 18;86(17):e00814–20.
- 8. Peccia J, Zulli A, Brackney DE, Grubaugh ND, Kaplan EH, Casanovas-Massana A, et al. Measurement of SARS-CoV-2 RNA in wastewater tracks community infection dynamics. Nat Biotechnol. 2020 Oct;38(10):1164–7. pmid:32948856
- 9. Graham KE, Loeb SK, Wolfe MK, Catoe D, Sinnott-Armstrong N, Kim S, et al. SARS-CoV-2 RNA in Wastewater Settled Solids Is Associated with COVID-19 Cases in a Large Urban Sewershed. Environ Sci Technol. 2021 Jan 5;55(1):488–98. pmid:33283515
- 10. Fernandez-Cassi X, Scheidegger A, Bänziger C, Cariti F, Tuñas Corzon A, Ganesanandamoorthy P, et al. Wastewater monitoring outperforms case numbers as a tool to track COVID-19 incidence dynamics when test positivity rates are high. Water Res. 2021 Jul;200:117252. pmid:34048984
- 11. Kim S, Kennedy LC, Wolfe MK, Criddle CS, Duong DH, Topol A, et al. SARS-CoV-2 RNA is enriched by orders of magnitude in primary settled solids relative to liquid wastewater at publicly owned treatment works. Environ Sci Water Res Technol. 2022; pmid:35433013
- 12. Feng S, Roguet A, McClary-Gutierrez JS, Newton RJ, Kloczko N, Meiman JG, et al. Evaluation of Sampling, Analysis, and Normalization Methods for SARS-CoV-2 Concentrations in Wastewater to Assess COVID-19 Burdens in Wisconsin Communities. ACS EST Water. 2021 Aug 13;1(8):1955–65.
- 13. Dorazio RM, Hunter ME. Statistical Models for the Analysis and Design of Digital Polymerase Chain Reaction (dPCR) Experiments. Anal Chem. 2015 Nov 3;87(21):10886–93. pmid:26436653
- 14. Brys R, Halfmaerten D, Neyrinck S, Mauvisseau Q, Auwerx J, Sweet M, et al. Reliable eDNA detection and quantification of the European weather loach (Misgurnus fossilis). J Fish Biol. 2021 Feb;98(2):399–414.
- 15. Basu AS. Digital Assays Part I: Partitioning Statistics and Digital PCR. SLAS Technol. 2017 Aug;22(4):369–86. pmid:28448765
- 16. Blaya J, Lloret E, Santísima-Trinidad AB, Ros M, Pascual JA. Molecular methods (digital PCR and real-time PCR) for the quantification of low copy DNA of Phytophthora nicotianae in environmental samples: Digital PCR: a new method to quantify P. nicotianae. Pest Manag Sci. 2016 Apr;72(4):747–53.
- 17. Doi H, Takahara T, Minamoto T, Matsuhashi S, Uchii K, Yamanaka H. Droplet Digital Polymerase Chain Reaction (PCR) Outperforms Real-Time PCR in the Detection of Environmental DNA from an Invasive Fish Species. Environ Sci Technol. 2015 May 5;49(9):5601–8. pmid:25850372
- 18. Acosta Soto L, Santísima-Trinidad AB, Bornay-Llinares FJ, Martín González M, Pascual Valero JA, Ros Muñoz M. Quantitative PCR and Digital PCR for Detection of Ascaris lumbricoides Eggs in Reclaimed Water. BioMed Res Int. 2017;2017:1–9.
- 19. Tan SYH, Kwek SYM, Low H, Pang YLJ. Absolute quantification of SARS-CoV-2 with Clarity Plus™ digital PCR. Methods. 2022 May;201:26–33.
- 20. Cao Y, Raith MR, Griffith JF. Droplet digital PCR for simultaneous quantification of general and human-associated fecal indicators for water quality assessment. Water Res. 2015 Mar;70:337–49. pmid:25543243
- 21. Borchardt MA, Boehm AB, Salit M, Spencer SK, Wigginton KR, Noble RT. The Environmental Microbiology Minimum Information (EMMI) Guidelines: qPCR and dPCR Quality and Reporting for Environmental Microbiology. Environ Sci Technol. 2021 Aug 3;55(15):10210–23. pmid:34286966
- 22. Bio-Rad Laboratories, Inc. Droplet Digital PCR Application Guide [Internet]. Available from: https://www.bio-rad.com/webroot/web/pdf/lsr/literature/Bulletin_6407.pdf
- 23. Kantor RS, Nelson KL, Greenwald HD, Kennedy LC. Challenges in Measuring the Recovery of SARS-CoV-2 from Wastewater. Environ Sci Technol. 2021 Mar 16;55(6):3514–9. pmid:33656856
- 24. Haldane V, De Foo C, Abdalla SM, Jung AS, Tan M, Wu S, et al. Health systems resilience in managing the COVID-19 pandemic: lessons from 28 countries. Nat Med. 2021 Jun;27(6):964–80. pmid:34002090
- 25. Rosario K, Symonds EM, Sinigalliano C, Stewart J, Breitbart M. Pepper Mild Mottle Virus as an Indicator of Fecal Pollution. Appl Environ Microbiol. 2009 Nov 15;75(22):7261–7. pmid:19767474
- 26. Wu SL, Mertens AN, Crider YS, Nguyen A, Pokpongkiat NN, Djajadi S, et al. Substantial underestimation of SARS-CoV-2 infection in the United States. Nat Commun. 2020 Dec;11(1):4507. pmid:32908126
- 27. Viau EJ, Lee D, Boehm AB. Swimmer Risk of Gastrointestinal Illness from Exposure to Tropical Coastal Waters Impacted by Terrestrial Dry-Weather Runoff. Environ Sci Technol. 2011 Sep 1;45(17):7158–65. pmid:21780808
- 28. Topol A, Wolfe MK, White B, Wigginton K, B Boehm A. High Throughput pre-analytical processing of wastewater settled solids for SARS-CoV-2 RNA analyses v1 [Internet]. 2021 [cited 2021 Sep 16]. Available from: https://www.protocols.io/view/high-throughput-pre-analytical-processing-of-waste-btyqnpvw
- 29. Topol A, Wolfe MK, Wigginton K, White B, B Boehm A. High Throughput RNA Extraction and PCR Inhibitor Removal of Settled Solids for Wastewater Surveillance of SARS-CoV-2 RNA v1 [Internet]. 2021 [cited 2021 Sep 16]. Available from: https://www.protocols.io/view/high-throughput-rna-extraction-and-pcr-inhibitor-r-btyrnpv6
- 30. Topol A, Wolfe MK, White B, Wigginton K, B Boehm A. High Throughput SARS-COV-2, PMMOV, and BCoV quantification in settled solids using digital RT-PCR v1 [Internet]. 2021 [cited 2021 Sep 16]. Available from: https://www.protocols.io/view/high-throughput-sars-cov-2-pmmov-and-bcov-quantifi-btywnpxe