Time to revisit the endpoint dilution assay and to replace the TCID50 as a measure of a virus sample’s infection concentration

The endpoint dilution assay’s output, the 50% infectious dose (ID50), is calculated using the Reed-Muench or Spearman-Kärber mathematical approximations, which are biased and often miscalculated. We introduce a replacement for the ID50 that we call Specific INfection (SIN) along with a free and open-source web-application, midSIN (https://midsin.physics.ryerson.ca) to calculate it. midSIN computes a virus sample’s SIN concentration using Bayesian inference based on the results of a standard endpoint dilution assay, and requires no changes to current experimental protocols. We analyzed influenza and respiratory syncytial virus samples using midSIN and demonstrated that the SIN/mL reliably corresponds to the number of infections a sample will cause per mL. It can therefore be used directly to achieve a desired multiplicity of infection, similarly to how plaque or focus forming units (PFU, FFU) are used. midSIN’s estimates are shown to be more accurate and robust than the Reed-Muench and Spearman-Kärber approximations. The impact of endpoint dilution plate design choices (dilution factor, replicates per dilution) on measurement accuracy is also explored. The simplicity of SIN as a measure and the greater accuracy provided by midSIN make them an easy and superior replacement for the TCID50 and other in vitro culture ID50 measures. We hope to see their universal adoption to measure the infectivity of virus samples.


Introduction
The progression of a virus infection in vivo or in vitro, or the effectiveness of therapeutic interventions in reducing viral loads, is monitored over time through sample collections to measure changes (increases or decreases) in virus concentrations. As such, accurate measurement of the concentration in a sample is critical to study and manage virus infections. The most direct method is to count individual virions as observed under an electron microscope. However, this technique is costly, time consuming, and largely destructive of the samples, and is thus almost never used. Viral RNA can be counted via quantitative polymerase chain reaction (qPCR), a method that amplifies a specific virus genome segment (RNA or DNA) within the sample over multiple cycles. The growth curve resulting from successive amplification cycles, compared against the standard curve for a sample of known concentration, provides an estimate of the number of viral segments in the sample. The major limitation of this method is that it measures not only viral RNA from intact virions, only some of which are infection-competent, but also debris from apoptotic or lysed cells, and antibody- or antiviral-neutralized virions, which misrepresents the effective virion concentration. For this reason, a count of infectious particles rather than, or in addition to, total viral genome segments is preferred.
Infectious virions do not systematically differ in any observable way from replication-defective virions, nor do they differ in a physical way that would allow for their mechanical or chemical separation. For this reason, methods to count infectious virions are based on counting the infections they cause, rather than the particles themselves. In practice, however, not all infection-competent virions contained in a sample will go on to successfully cause infection. Certain experimental conditions, such as temperature or acidity of the medium, can hasten the rate at which virions that were infection-competent in the sample lose infectivity before they can cause infection. This is why, hereafter, we will refer to the quantity measured by infectivity assays as the infection concentration or the number of infections the sample will cause per unit volume, rather than its concentration of infectious virions, which is not a measurable quantity. Two main types of assays are used to quantify the infection concentration within a virus sample: (1) the plaque forming and focus forming assays; and (2) assays we will collectively refer to as endpoint dilution (ED) assays, which include the 50% tissue culture infectious dose (TCID50) or cell culture infectious dose (CCID50) or egg infectious dose (EID50) assays, etc.
The plaque forming assay was introduced by Renato Dulbecco in 1952 [3], as an improvement over the ED assay. The plaque forming assay and the focus forming assay, which rely on the same principles, suffer from a number of critical issues that cannot be overcome. For example, the liquid accumulation (meniscus) that forms around well edges means some infectious doses will not get quantified correctly or at all. It can be hard to distinguish two merged plaques from a single large plaque, or to decide how small a plaque one should consider when counting. Some of the difficulties in establishing a robust, unambiguous plaque or focus count for a given well are illustrated in Figure 1. For these reasons, different researchers will count a different number of plaques or foci when observing the same well. This subjectivity in the count means there is opportunity to (sub)consciously count a few more plaques or foci, for example, when expecting a virus strain to be more severe than another or in the absence of an antiviral compound. Ideally, there would be no discretion involved in the counting process of a quantification assay. Indeed, the decision process should be made by a physical, automated measurement, without the possibility of post-facto adjustments of any kind, for any reason.
In contrast, the ED assay offers a more decisive and robust binary determination as to whether or not infection has taken place in each well (or egg, animal, etc.). This determination is insensitive to small, spatially localized irregularities and is typically unanimously agreed upon by all observers. Therefore, it is less subject to (sub)conscious bias. In fact, this feature of the ED assay makes it ideal for systematic, machine-based determination of positive wells (or eggs or other culture types), eliminating subjectivity. Furthermore, infection of wells in the ED assay can be carried out in exactly the same way as planned infection experiments where they will make up the inoculum, e.g., in the same cell type, reproducing whether the inoculum is rinsed or not post-inoculation, and the duration of incubation with the inoculum. In contrast, plaque and focus forming assays can require the use of a semi-solid cellular overlay (e.g., agarose) to restrict the spread of virus beyond cells neighbouring those initially infected by the inoculum. The need to rinse or remove the inoculum to add the semi-solid overlay imposes strict constraints on the timing of this rinse. Because a longer incubation provides more opportunities for infectious virus to cause infection, the number of infectious doses counted via a plaque or focus assay can underestimate the true number of infections that will result when the quantified sample is later used to infect cells under longer incubation periods. The plaque assay can also require the use of different cells than those used in the infection experiments whenever the latter fail to die or detach (form clearly visible plaques) post-infection, making it difficult to predict the number of infections that will result when the quantified sample is later used to infect different cells.
Figure 1. MDCK cells were infected with a sample containing influenza A (H3N2) virions, and cell infection was visualized via staining by antibodies against the matrix (M) viral protein. The uneven liquid distribution along the well's edges means some infectivity is lost or miscounted. It can be hard to distinguish between two merged foci and a single larger uneven focus. It is difficult to determine how small a focus should be counted, and to settle on a focus size threshold that is used consistently for all wells and all samples within a particular experiment. As a result of these difficulties, different individuals will commonly count a different number of foci in the same well. Stained well image graciously provided by Frederick Koster (Lovelace Research Institute, NM, USA).

For all its many advantages, the ED assay currently has one key, remediable weakness: its output quantity, the TCID50 (or CCID50 or EID50), does not directly correspond, or trivially relate, to one infectious dose. The simplistic calculations, introduced by Spearman-Kärber (SK) [5,9] and Reed-Muench (RM) [8] nearly a century ago, remain the primary methods to quantify a virus sample's infectivity using the ED assay. Many research groups rely on spreadsheet calculators that are passed down through generations of trainees or found on the internet, and can contain errors. While, theoretically, a dose of 1 TCID50 is expected to cause −1/ln(50%) = 1.44 infections [2], the approximation used by the SK and RM methods introduces an often overlooked bias where 1 TCID50 ≈ 1.781 infections, where 1.781 = $e^{\gamma}$ and γ = 0.5772 is the Euler-Mascheroni constant [4,10]. This makes it problematic to experimentally achieve the desired multiplicity of infection when inoculating from a sample quantified via the SK or RM methods. Many have proposed replacements for the RM and SK calculations, with some based on logit or probit transforms of the data [2,4,6] and others on statistical analysis of the ED assay output [6,7]. Sadly, none of these improvements were widely adopted, possibly due to a lack of visibility of these publications, or the lack of widespread awareness of the limitations of the RM and SK methods.
Thus, the one issue with the ED assay is not with the assay itself but with the calculation of the TCID50. We submit that for all the reasons outlined above, the ED assay is experimentally more robust and reliable than the plaque and focus forming assays, and should be preferred over the latter. We propose to:
1. Continue the use, or encourage the adoption, of the ED assay (e.g., TCID50 assay), but to replace its output, the TCID50/mL (or CCID50/mL, EID50/mL, etc.), with a new quantity in units of Specific INfections or SIN/mL corresponding to the number of infections the sample will cause per mL. The word specific highlights the fact that the infectivity of a sample is specific to the particulars of the experimental conditions (temperature, medium, cell type, incubation time, etc.).
2. Replace the Reed-Muench and Spearman-Kärber approximations with computer software, midSIN (measure of infectious dose in SIN), that relies on Bayesian inference to measure the SIN/mL of a virus sample. To avoid calculation errors and make the new method widely accessible, midSIN is maintained and distributed as free, open-source software on GitHub (https://github.com/cbeauc/midSIN) for user installation, but also via a free-to-use website application (https://midsin.physics.ryerson.ca) with an intuitive user interface.
Here, we present examples of midSIN being used to analyze influenza and respiratory syncytial virus samples. We demonstrate that midSIN's output, SIN/mL, is an accurate estimate of the number of infections the sample will cause per unit volume. We show how the accuracy of the SIN concentration estimate is affected by experimental choice of plate layout, including the dilution factor, and the number of replicates per dilution. We compare midSIN's performance to that of the RM and SK methods, and demonstrate how the latter estimators are inaccurate under various circumstances, underlining the need to adopt midSIN to quantify virus samples via the ED assay.

Key features of midSIN's output
Let us consider a fictitious ED experiment, with 11 dilutions and 8 replicate wells per dilution, in which the minimum sample dilution, $D_1 = 1/100 = 10^{-2}$, is serially diluted by a factor of $10^{-0.5} \approx 0.32$ ($D_2 = 10^{-2.5}$, $D_3 = 10^{-3}$, ..., $D_{11} = 10^{-7}$), and the total volume of inoculum (diluted virus sample + diluent) placed in each well is $V_\mathrm{inoc}$ = 0.1 mL. Now, consider that a virus sample is measured using this ED experiment and one observes (8,8,8,8,8,7,7,5,2,0,0) infected wells out of 8 replicates at each of the 11 dilutions, as illustrated in Figure 2A. midSIN provides a graphical output of its results, shown in Figure 2B,C for this example. Note how the likelihood distribution for log10(SIN/mL) (Fig. 2B) is approximately a normal distribution. This is why log10 of the infection concentration should be used and reported, rather than the concentration itself. midSIN also graphically compares the number of infected wells observed experimentally (Fig. 2C, black dots) against the theoretically expected values (blue curve and grey CI bands). This graphical representation makes it easy to identify issues with the data entered or with the experiment itself.
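The comparison between observed and expected infected wells can be sketched in a few lines. This is an illustration only, not midSIN's code: it assumes the relation derived in Methods, namely that a well receiving on average $C D V_\mathrm{inoc}$ infections stays uninfected with probability $e^{-C D V_\mathrm{inoc}}$. The function name and argument names are hypothetical.

```python
import math

def expected_infected_wells(log10_conc, log10_dilutions, reps=8, v_inoc_ml=0.1):
    """Expected number of infected wells in each dilution column.

    Assumes P(well infected) = 1 - exp(-C * D * V_inoc), where C is the
    infection concentration (SIN/mL) and D the sample dilution.
    """
    conc = 10.0 ** log10_conc
    expected = []
    for log10_d in log10_dilutions:
        mean_infections = conc * (10.0 ** log10_d) * v_inoc_ml
        p_infected = -math.expm1(-mean_infections)  # 1 - exp(-x), stable for small x
        expected.append(reps * p_infected)
    return expected

# Example plate from the text: D1 = 10^-2 down to D11 = 10^-7 in steps of 10^-0.5
log10_dilutions = [-2.0 - 0.5 * j for j in range(11)]
observed = [8, 8, 8, 8, 8, 7, 7, 5, 2, 0, 0]
expected = expected_infected_wells(6.2, log10_dilutions)

for log10_d, obs, exp_wells in zip(log10_dilutions, observed, expected):
    print(f"dilution 10^{log10_d:+.1f}: observed {obs}, expected {exp_wells:.2f}")
```

Large gaps between the two columns of numbers would flag a data entry error or an experimental problem, which is the role the graphical comparison plays.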
Importantly, midSIN provides a more useful quantity to the user than the TCID50: an estimate of the concentration of infections the sample will cause, SIN/mL. For this example, the concentration is $10^{6.2 \pm 0.1}$ SIN/mL, where 6.2 is the mode (most likely value) of log10(SIN/mL), and ±0.1 is its 68% credible interval (CI). The SIN/mL corresponds to the number of infections that will be caused per mL of the sample, which can be directly used to determine the sample dilution required to obtain a desired multiplicity of infection (MOI).
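As a rough illustration of how a SIN/mL value translates into a dilution for a target MOI, the sketch below treats the MOI as the expected number of infections per cell. The function name, the cell count, and the target MOI are hypothetical values chosen for the example; the calculation only holds if the planned infection uses the same conditions as the ED assay.

```python
def dilution_for_moi(log10_sin_per_ml, n_cells, moi, v_inoc_ml):
    """Fraction of stock in the inoculum needed to reach the target MOI.

    A result above 1 means the stock is too dilute for that MOI at this volume.
    """
    infections_wanted = moi * n_cells
    infections_in_undiluted_inoculum = (10.0 ** log10_sin_per_ml) * v_inoc_ml
    return infections_wanted / infections_in_undiluted_inoculum

# Hypothetical example: 10^6.2 SIN/mL stock, 1e5 cells, MOI of 0.01, 0.1 mL inoculum
dilution = dilution_for_moi(6.2, n_cells=1e5, moi=0.01, v_inoc_ml=0.1)
print(f"dilute the stock to {dilution:.5f} (about 1 in {1 / dilution:.0f})")
```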
In a laboratory setting, ED experiments can be performed in batches, such as to quantify the infectious concentration in samples collected at several time points over the course of a cell culture infection. For such applications, midSIN provides a comma-separated values (csv) template file readily editable in a spreadsheet program, to collect and submit the results for batch processing. Details on the format of the template file are available on midSIN's website (https://midsin.physics.ryerson.ca). Figure 3 illustrates the output for a subset of measurements for in vitro infection with the respiratory syncytial virus (RSV). Each sample was measured twice, and midSIN's estimates are in good agreement with one another (within 95% CI).
The y-axis in the left graph panels of midSIN's graphical output is the non-normalized scale of the likelihood distribution for log10(SIN/mL), which ranges between $10^{-7}$ and $10^{-2}$. The scale loosely relates to the likelihood of observing a particular ED experimental outcome (see Methods). Unlikely ED outcomes appear as large departures of the observed number of infected wells (right panels, black dots) from what is theoretically expected (right panels, curve). It is interesting that the uncertainty (CI) of midSIN's estimated log10(SIN/mL) appears to be independent of how much the ED outcome deviates from theoretical expectations. That is, the accuracy of midSIN is not strongly affected even when it is provided more unlikely, noisy experimental data. This robustness is explored further below.

Comparing SIN to TCID50 and PFU virus sample concentrations
The midSIN calculator provides an estimate of the number of infections that will be caused per mL of a virus sample (SIN/mL). In principle, a plaque assay also measures the number of infections a sample will cause, with each infection expected to develop into a plaque. If a plaque assay is performed under experimental conditions and protocols as similar as possible to those of the ED assay (i.e., using the same cells, medium, period of incubation, rinsing method, etc.), midSIN's SIN/mL estimate is expected to be comparable, in theory, to the number of PFU/mL observed in the plaque assay. In practice, however, the plaque assay likely provides a biased estimate of the concentration of infections in a sample due to its many experimental issues, discussed in the Introduction. To evaluate midSIN's performance compared to existing methods, the infection concentrations of two influenza A (H1N1) virus strain samples were measured via both plaque and ED assays, and their concentrations in units of PFU, TCID50, and SIN were compared (Fig. 4). Details regarding the samples, and how the plaque and ED assays were performed, are provided in Methods.
The TCID50 concentrations estimated via the RM and SK methods are ∼1.5-1.7 times larger (Fig. 4C,D) than the SIN concentration, and the set of ratios is statistically inconsistent with the assumption of equality (p-value: 0.01-0.03). Theoretically, 1 TCID50 is expected to cause 1.44 infections (= 1/ln(2)) [2]. However, the RM or SK approximations are known to introduce a bias such that 1 TCID50 estimated by these methods is expected to cause 1.781 infections (= $e^{\gamma}$, where γ = 0.5772 is the Euler-Mascheroni constant) [4,10]. Using the RM, SK, and SIN measurements presented in Figure 4A,B, we confirmed that 1.781 SIN ≈ 1 TCID50 when the latter is estimated via the RM or SK approximations, as expected theoretically if SIN is indeed measuring the infection concentration in a sample.

Figure 2 (panels B and C). (B) The midSIN-estimated likelihood distribution of the log10 infection concentration, log10(SIN/mL), for the example ED experiment. The vertical lines correspond to log10(SIN/mL), based on the most likely value (mode) of midSIN's likelihood distribution (solid blue), or computed from the RM (solid orange) and SK (dashed green) approximations of the log10(TCID50) (see Methods). The x-values of the white and light grey regions on either side of the mode indicate the edges of the 68% and 95% credible intervals (CI), respectively. The midSIN-estimated log10(SIN/mL) mode ± 68% [±95%] CI are indicated numerically above the graph. (C) The number of infected wells (black circles) out of the 8 replicates, as a function of the 11 serial dilutions of the example ED plate, from the least (leftmost) to the most (rightmost) diluted. For example, x = 3.0 corresponds to a sample dilution of $10^{-3}$ or 1/1,000. The average (expected) number of infected wells, as a function of sample dilution, is shown for the most likely value of log10(SIN/mL) (blue curve) or its 68% and 95% CI (inner and outer edge of the grey bands, respectively). The sample dilution (x-value) at which the blue curve crosses the horizontal dotted line (50% infected wells) corresponds to a concentration of 1 TCID50 per ED well volume. The vertical lines indicate the sample dilution that yields a concentration of 1 TCID50 according to the RM and SK approximations.

Figure 3. [...] and sampling time point (e.g., 8 h, 36 h), and each sample was measured in duplicate (rep1, rep2). These data were collected from in vitro infections with the RSV A Long strain, and were previously reported in [1]. The ED measurement experiments were conducted using a plate layout of 11 dilutions, with 8 replicates per dilution, an inoculum volume of $V_\mathrm{inoc}$ = 0.1 mL, and serial dilutions from $D_1 = 10^{-1}$ to $D_{11} = 10^{-6}$, separated by a dilution factor of $10^{-0.5}$.
Similarly, the ratio of the PFU concentration determined via the plaque assay and the SIN concentrations estimated by midSIN is ∼0.89-0.93, which is statistically consistent with the assumption of equality (p-value: 0.2-0.5). These results confirm the theoretical expectation that 1 PFU ≈ 1 SIN when the plaque and ED assays are performed in the same manner, as was the case here. This provides further support, via two independent assays, that the SIN concentration estimated by midSIN from the ED assay is a robust measure of the infection concentration of a virus sample.

Comparing midSIN's performance to that of the RM and SK methods
The RM and SK methods rely on the number of infected wells decreasing as dilution increases. Their estimates are affected when the number of infected wells remains unchanged or even increases as dilution increases, which statistics tell us can reasonably occur experimentally. The RM and SK methods also mostly require that at the lowest and highest sample dilutions, all wells be infected and uninfected, respectively. In contrast, midSIN is robust to these issues. Figure 5 demonstrates how midSIN can provide an estimate for the log10(SIN/mL) in a sample using the number of infected wells at a single dilution, as long as at least one well is uninfected if all others are infected or vice-versa. This is because midSIN relies on Bayesian inference, i.e., when more than one column is available, it uses information from each column successively to revise and improve its estimate. This allows midSIN to correct for even large deviations from theoretical expectations, and thus improves its accuracy.

For the RM and SK methods, which estimate the log10(TCID50/mL) rather than the log10(SIN/mL), the agreement is generally poor due to the bias they introduce. Furthermore, the RM and SK predictions are more variable (wavy pattern), and lose accuracy dramatically as the sample concentration approaches the limits of detection (the 2 ends) which, for the example plate layout simulated here, are around $10^{3}$ SIN/mL and $10^{9}$ SIN/mL. Interestingly, the basic calculations behind the RM and SK methods constrain the set of values they can return (sparsely populated grey histograms), compared to the more continuous range returned by midSIN, which contributes to its increased accuracy.
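The column-by-column updating described in this section can be illustrated with a small grid-based sketch. It is not midSIN's implementation: it assumes a flat prior in log10(SIN/mL) over an arbitrary range of 0 to 10, a binomial likelihood per dilution column, and the relation P(well uninfected) = $e^{-C D V_\mathrm{inoc}}$ from Methods. All function names are hypothetical.

```python
import math

def column_log_likelihood(log10_conc, log10_dilution, n_infected, n_wells, v_inoc_ml=0.1):
    """Binomial log-likelihood of one dilution column (constant term dropped)."""
    mean_infections = 10.0 ** (log10_conc + log10_dilution) * v_inoc_ml
    log_q = -mean_infections                         # log P(well stays uninfected)
    log_p = math.log(-math.expm1(-mean_infections))  # log P(well becomes infected)
    return n_infected * log_p + (n_wells - n_infected) * log_q

def posterior_on_grid(columns, grid, v_inoc_ml=0.1):
    """Posterior over the grid, updated one dilution column at a time."""
    log_post = [0.0] * len(grid)  # ASSUMPTION: flat prior in log10(concentration)
    for column in columns:
        log_post = [lp + column_log_likelihood(c, *column, v_inoc_ml=v_inoc_ml)
                    for lp, c in zip(log_post, grid)]
    peak = max(log_post)
    weights = [math.exp(lp - peak) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]

grid = [i * 0.01 for i in range(1001)]  # log10(SIN/mL) from 0 to 10
log10_dilutions = [-2.0 - 0.5 * j for j in range(11)]
infected = [8, 8, 8, 8, 8, 7, 7, 5, 2, 0, 0]
plate = [(d, k, 8) for d, k in zip(log10_dilutions, infected)]

post_full = posterior_on_grid(plate, grid)
post_single = posterior_on_grid([plate[7]], grid)  # only the 10^-5.5 column (5 of 8)
mode_full = grid[post_full.index(max(post_full))]
mode_single = grid[post_single.index(max(post_single))]
print(f"mode, all 11 columns: {mode_full:.2f}")
print(f"mode, single column:  {mode_single:.2f}")
```

Under these assumptions the full-plate mode should fall close to the $10^{6.2}$ SIN/mL quoted for the example plate, and the single-column call shows that one partially infected column is already enough to give a proper, if broader, distribution.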

Estimate accuracy as a function of plate layout
In Figure 3, we observed that even for large discrepancies between the expected (right panels, blue curve) and observed (right panels, black dots) ED assay outcome, the uncertainty (CI) of midSIN's estimate remains relatively unchanged. This apparent robustness is because the uncertainty is primarily determined by the experimental design, namely the change in dilution between columns (dilution factor) and the number of replicate wells per dilution. Figure 7 explores the impact of varying either only the dilution factor, or only the number of replicates at each dilution, or varying one at the expense of the other by using a fixed number of wells (96 wells). When using midSIN, smaller changes in dilution (e.g., going from a dilution factor of 2.2/100 to 61/100) or more replicates per dilution (4 to 24) each improves the measure's accuracy (narrower CIs) by comparable amounts, but only when the total number of wells is allowed to increase to accommodate the change. When the total number of wells used is fixed, changing one at the expense of the other leaves the accuracy (CI) unchanged. This is somewhat also true for the log10(TCID50) output concentration estimated by the RM and SK methods. However, at the smallest dilution factors (10/100 and 2.2/100), the bias introduced by the RM and SK methods becomes even larger and more unpredictable. For the input concentration considered in Figure 7 ($10^{5}$ SIN/mL), the dilution at which 50% of wells are infected is near the middle dilution. For sample concentrations such that 50% infected wells occur near or at the lowest or highest dilution chosen, the effect is even more significant.
Figure 7. [...] (B,E,H) increasing the number of replicates per dilution (4 to 24) while keeping a fixed dilution factor (≈ 35/100); or (C,F,I) increasing the dilution factor while decreasing the number of replicates, keeping a fixed number of 96 wells used in total to titer one virus sample. Different rows represent the ratio of the estimated output concentration using (A-C) midSIN in SIN/mL, (D-F) RM or (G-I) SK in TCID50/mL, and the input concentration. In all cases (A-I), the input concentration was $10^{5}$ SIN/mL, and as the dilution factor was varied, the highest and lowest dilutions in the simulated ED plate were held fixed to $D_1 = 10^{-2}$ and $D_\mathrm{last} = 10^{-7}$, respectively, by changing the total number of dilutions performed (simulated). Everything else is generated or computed as described in the caption of Figure 6.

Figure 7 also demonstrates that varying the dilution by smaller increments (e.g., a dilution factor of 61/100 rather than 10/100) provides greater granularity (uniqueness) of ED plate outcomes, and thus, greater accuracy of the log10 infection concentration estimates. Here, a distinct plate outcome means a distinct number of infected wells at each dilution, with no distinction as to exactly which of the replicate wells (e.g., the second versus the fourth) is infected at each dilution. An ED plate with serial dilutions ranging over 6 orders of magnitude (e.g., $10^{-2}$ to $10^{-7}$), with 4 different dilutions and 24 replicates/dilution (i.e., dilution factor of 2.2/100) provides ∼$10^{6}$ ($[24+1]^{4}$) possible, distinct ED plate outcomes. In contrast, a plate with the same serial dilution range, but with 24 different dilutions and 4 replicates/dilution (i.e., dilution factor of 61/100) yields ∼$10^{17}$ ($[4+1]^{24}$) distinct outcomes. More generally, $[\mathrm{reps}+1]^{\mathrm{dils}}$ is the number of distinct plate outcomes for a chosen number of dilutions (dils) and replicates (reps). Having fewer possible plate outcomes means that a larger range of concentrations would share the same most-likely ED plate outcome, yet each plate outcome only maps to one (the most likely) concentration estimate. This means that with fewer dilutions, the concentration estimate is forced to take on the nearest possible value it can take (Fig. 7, the next grey bar), and the accuracy of the concentration estimate is therefore reduced. So although having a greater number of dilutions is more labour intensive, it should be preferred over having a greater number of replicates per dilution.
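The outcome counts discussed in this section follow directly from $[\mathrm{reps}+1]^{\mathrm{dils}}$ and are easy to check for any candidate layout. The helper name below is hypothetical.

```python
import math

def distinct_plate_outcomes(reps, dils):
    """Number of distinct ED plate outcomes: each column shows 0..reps infected wells."""
    return (reps + 1) ** dils

few_dilutions = distinct_plate_outcomes(reps=24, dils=4)   # 4 dilutions x 24 replicates
many_dilutions = distinct_plate_outcomes(reps=4, dils=24)  # 24 dilutions x 4 replicates

for label, count in [("4 x 24", few_dilutions), ("24 x 4", many_dilutions)]:
    print(f"{label} layout: {count} outcomes (about 10^{math.log10(count):.1f})")
```

Both layouts use 96 wells, yet the second offers roughly eleven orders of magnitude more distinguishable outcomes.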

Discussion
We have introduced a new calculator tool called midSIN to replace the Reed-Muench (RM) and Spearman-Kärber (SK) calculations to quantify the infectivity of a virus sample based on a TCID50 endpoint dilution (hereafter ED) assay. Rather than estimating the TCID50 of a virus sample, midSIN calculates the number of infections the sample will cause, reported in units of specific infections (SIN). It does so without requiring any changes to current ED assay protocols, and can be accessed for free via an open-source web-application (https://midsin.physics.ryerson.ca). Importantly, since the SIN of a virus sample corresponds to the number of infections it will cause, it can be used directly to determine what dilution of the sample will achieve the desired multiplicity of infection (MOI).
We showed that midSIN provides more accurate and robust estimates than the biased RM and SK approximations. We confirmed that the RM and SK approximations overestimate the TCID50 by 23.5%, such that 1 TCID50 estimated by these methods will cause 1.781 rather than 1.44 infections [4,10]. While in theory one can obtain the intended MOI by multiplying the TCID50 by 0.7 (or rather ln(2) = 0.693), one should instead multiply by 0.561 to account for the overestimation by RM and SK. Even when accounting for the overestimation, we showed that these methods perform particularly poorly when too few replicate wells per dilution are used or when the change in dilution is large between successive serial dilutions. The two methods perform especially poorly when quantifying samples whose infection concentration approaches, but is still well within, the detection limit of the ED assay. In such cases, the bias introduced by these methods becomes even larger and more significant. For example, if the minimum and maximum dilutions of an ED plate are $10^{-2}$ and $10^{-8}$, virus samples with a concentration less than $10^{2.2}$ SIN or greater than $10^{7.6}$ SIN per inoculated well volume (typically 0.1 mL) will see their concentration estimated with an even larger bias by the RM and SK methods.
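The conversion factors quoted in this paragraph can be reproduced from the two constants involved, ln(2) and the Euler-Mascheroni constant γ. The sketch below only restates that arithmetic. The function name is hypothetical, and it does not correct the additional layout-dependent errors of RM and SK described above.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

theoretical_ratio = 1.0 / math.log(2.0)  # 1.44, the theoretical value quoted in the text
rm_sk_ratio = math.exp(EULER_GAMMA)      # 1.781, the value quoted for RM and SK
overestimate = rm_sk_ratio / theoretical_ratio - 1.0

def tcid50_to_sin(tcid50_per_ml, from_rm_or_sk=True):
    """Rescale a TCID50/mL value to an approximate SIN/mL value."""
    factor = math.exp(-EULER_GAMMA) if from_rm_or_sk else math.log(2.0)
    return tcid50_per_ml * factor

print(f"theoretical ratio: {theoretical_ratio:.3f}, RM/SK ratio: {rm_sk_ratio:.3f}")
print(f"RM/SK overestimate: {100 * overestimate:.1f}%")
print(f"multiply TCID50 by {math.log(2.0):.3f} in theory, "
      f"by {math.exp(-EULER_GAMMA):.3f} for RM/SK")
```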
Using midSIN, rather than RM or SK, to measure the infectivity of a virus sample based on an ED assay does not require any change to ED experimental protocols and methods currently in use in one's laboratory (e.g., dilution factor, replicates per dilution, minimum dilution). Indeed, we demonstrated that midSIN can estimate a virus sample's SIN concentration based on even just a single dilution, as long as only a fraction of the replicate wells are infected at that dilution. For a given number of ED wells used to titrate the sample and fixed minimum and maximum dilutions (ED detection range), we showed that having smaller changes between dilutions (a larger number of serial dilutions) is better than having more replicates per dilution. So those wishing to improve the accuracy in estimating the infectivity of their virus samples should consider using more wells in titrating each virus sample, and favouring smaller dilution changes over more replicates. For example, using 11 dilutions, with a 4-fold dilution factor between dilutions and 8 replicate wells per dilution uses up 88 wells, leaving 8 wells of a 96-well plate for controls. This ED plate design, analyzed using midSIN, accurately measures virus sample concentrations ranging over ∼6 orders of magnitude (e.g., $[10^{1}$-$10^{7}]$ SIN/mL, or $[10^{6}$-$10^{12}]$ SIN/mL, etc.) with an accuracy of ∼1.6-fold ($\times 10^{\pm 0.2}$, 95% CI). In comparison, using 7 dilutions, with a 10-fold dilution factor, and 4 replicates (which uses 28 rather than 88 wells) would also span 6 orders of magnitude, but with an accuracy of ∼3.2-fold ($\times 10^{\pm 0.5}$, 95% CI). To put these 2 accuracies in perspective: 1 mL of a sample measured to contain 10 SIN/mL is expected to yield either 6-16 or 3-31 infections 95% of the time, given an accuracy of either $\times 10^{\pm 0.2}$ or $\times 10^{\pm 0.5}$ SIN/mL, respectively. Such an important decrease in accuracy means a reduced ability to detect experimental changes as statistically significant, with the $\times 10^{\pm 0.5}$ accuracy requiring a >10-fold change for statistical significance. Failing to identify a change as statistically significant as part of a study is far more costly than using a few more wells for each sample to increase measurement accuracy, and thus the statistical power of the study.
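The fold-accuracies and infection ranges quoted above are simple powers of ten, reproduced below as a quick check. The helper name is hypothetical, and the calculation treats the 95% CI as a symmetric interval in log10.

```python
def fold_range(value, log10_halfwidth):
    """Lower and upper bounds of value x 10^(+/- halfwidth)."""
    factor = 10.0 ** log10_halfwidth
    return value / factor, value * factor

# (dilutions, replicates, 95% CI half-width in log10) for the two designs in the text
designs = {"11 dilutions x 8 replicates": (11, 8, 0.2),
           "7 dilutions x 4 replicates": (7, 4, 0.5)}

for name, (dils, reps, halfwidth) in designs.items():
    low, high = fold_range(10.0, halfwidth)  # sample measured at 10 SIN/mL
    print(f"{name}: {dils * reps} wells, {10 ** halfwidth:.1f}-fold accuracy, "
          f"{low:.1f} to {high:.1f} infections per mL")
```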
The midSIN-estimated SIN obtained from an ED assay was also compared to the PFU from a plaque assay for a set of influenza A virus samples. When the plaque and ED assays are performed as identically as possible (cell type, incubation time, etc.), as was the case here, 1 SIN ≈ 1 PFU. This demonstrates that indeed midSIN's SIN is a measure of the number of infections a virus sample will cause. However, as mentioned, the plaque and focus forming assays often impose experimental requirements (e.g., an early rinse of the inoculum to add agarose, use of cells with pronounced CPE). Such constraints on the plaque or focus assay inoculation protocol make it nearly impossible to relate the number of plaques or foci observed to the number of infections the virus sample will cause under the intended, experimental infection conditions (e.g., late or no inoculum rinse, no agarose, to infect cells exhibiting no significant CPE). Adding to this the subjectivity of counting plaques or foci, it is clear the ED assay combined with midSIN to estimate the SIN concentration of a virus sample is more accessible, accurate, and predictive.
Beyond the work presented herein, the development of midSIN will continue online, as we implement new features and inputs for integration with various colorimetric and fluorescence instruments. The ease of use of midSIN and the greater usefulness and relevance of SIN as a measure of a virus sample's infectivity make them far superior to all currently available alternatives, including the PFU, FFU, TCID50, and other ID50 measures. We hope to see them adopted widely.

Considering a single well
Consider a virus sample of volume $V_\mathrm{sample}$ which contains an unknown concentration of infectious virions, $C_\mathrm{inf}$, which we aim to determine. Drawing a small volume, $V_\mathrm{inoc} < V_\mathrm{sample}$, from the sample is analogous to drawing balls out of a bag containing green and yellow balls, and considering green balls a success, and yellow ones a failure. It is a series of Bernoulli trials where
$n = V_\mathrm{inoc}/V_\mathrm{vir}$ is the number of draws, i.e., the number of virion-size volumes ($V_\mathrm{vir}$) drawn from the sample to form the inoculum volume ($V_\mathrm{inoc}$), analogous to the number of balls drawn.
$k$ is the number of successes, i.e., the number of infectious virions drawn from the sample to form the inoculum, analogous to the number of green balls drawn.
$p$ is the probability of success, i.e., the fraction of virion-size volumes in the sample that are occupied by infectious virions, analogous to the probability of drawing a green ball.
The probability of success, $p$, is related to the concentration of infectious virus in the sample, $C_\text{inf}$, as

$$p = C_\text{inf}\, V_\text{vir}$$

where $C_\text{inf}$ is the quantity we aim to estimate. Unlike the ball analogy where it is easy to count how many green balls $k$ were drawn, after having drawn $n$ virion-size volumes from the sample into our inoculum, we cannot count how many infectious virions were drawn into the inoculum. However, if this inoculum is deposited onto a susceptible cell culture, we can observe whether or not infection occurs, and an infection would indicate that the inoculum contained at least one infectious virion. Note that, as explained in the Introduction, even a productively infectious virion, i.e., one capable of completing the full virus replication cycle from attachment to progeny release, might not result in a productive infection. As such, from here on, $C_\text{inf}$ is used to designate the concentration of specific infections in the sample, which is smaller than or equal to the concentration of infectious virions, i.e., it measures a subset of the infectious virions.
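Having deposited the inoculum into one well of the 96-well plate of our ED experiment, the likelihood that the well will not become infected corresponds to the likelihood of having drawn $k = 0$ infectious virions (or rather, specific infections) out of the $n$ virion volumes that make up our inoculum, namely

$$q_\text{noinf} = P(k=0|n,p) = (1-p)^{n} = \left(1 - C_\text{inf} V_\text{vir}\right)^{V_\text{inoc}/V_\text{vir}} \qquad (1)$$

where $q_\text{noinf}$ can be simplified by realizing that, because a virion-size volume is minute such that $C_\text{inf} V_\text{vir} \ll 1$,

$$\lim_{x \to 0}\, (1-x)^{1/x} = e^{-1} \quad \text{with} \quad x = C_\text{inf} V_\text{vir} .$$

As such,

$$q_\text{noinf} = \left[\left(1 - C_\text{inf} V_\text{vir}\right)^{1/(C_\text{inf} V_\text{vir})}\right]^{C_\text{inf} V_\text{inoc}} \approx e^{-C_\text{inf} V_\text{inoc}}$$

where $q_\text{noinf}$ and $(C_\text{inf} V_\text{vir}) \in [0, 1]$ because $C_\text{inf} = N_\text{vir}/V_\text{sample}$ and the number of specific infections in the sample, $N_\text{vir}$, is at a minimum zero, and at most the maximum number of virion-size volumes that can physically fit in the sample volume, namely $V_\text{sample}/V_\text{vir}$. As such, the maximum possible infection concentration, given a sample of volume $V_\text{sample}$, is $C_\text{inf} = (V_\text{sample}/V_\text{vir})/V_\text{sample} = 1/V_\text{vir}$, and hence $C_\text{inf} V_\text{vir} \le 1$.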
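As a numerical illustration of the single-well reasoning, the sketch below compares the exact expression $(1 - C_\text{inf} V_\text{vir})^{V_\text{inoc}/V_\text{vir}}$ with the exponential form $e^{-C_\text{inf} V_\text{inoc}}$. The virion volume (a sphere 100 nm in diameter), the inoculum volume, and the concentration are illustrative assumptions rather than values taken from this study, and the function names are hypothetical.

```python
import math

def q_noinf_exact(c_inf, v_inoc, v_vir):
    """Exact probability of drawing zero specific infections: (1 - p)**n."""
    p = c_inf * v_vir      # chance that one virion-size volume holds a specific infection
    n = v_inoc / v_vir     # number of virion-size volumes in the inoculum
    # log1p preserves precision because p is tiny while n is huge
    return math.exp(n * math.log1p(-p))

def q_noinf_approx(c_inf, v_inoc):
    """Exponential form, valid when c_inf * v_vir << 1."""
    return math.exp(-c_inf * v_inoc)

# Illustrative assumptions, not values from the study
V_VIR = (4.0 / 3.0) * math.pi * (50e-7) ** 3   # mL; sphere of radius 50 nm = 50e-7 cm
V_INOC = 0.1                                   # mL
C_INF = 10.0                                   # specific infections per mL

exact = q_noinf_exact(C_INF, V_INOC, V_VIR)
approx = q_noinf_approx(C_INF, V_INOC)
print(f"exact = {exact:.6f}, approx = {approx:.6f}")
```

With these assumed values the inoculum carries on average one specific infection, so both expressions evaluate to approximately $e^{-1}$; the difference only becomes appreciable as $C_\text{inf}$ approaches the physical limit $1/V_\text{vir}$.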

Considering replicate wells at a given dilution
The ED assay is based on serial dilutions of the sample, with each dilution separated by a fixed dilution factor. We define the dilution factor $\in (0, 1)$ as the fraction of the inoculum volume drawn from the previous dilution. For example, if the inoculum for a well, $V_\text{inoc} = 100$ µL, comprises 10 µL drawn from the previous dilution and 90 µL of dilution media, the dilution factor is 10/100 = 0.1. If the serial dilution begins with a dilution of $D_1 = 0.2$, then the following dilution will be $D_2 = 0.02$. In Eqn. (1), the dilution under consideration, $D_i$, will affect $n$, the number of virion-sized volumes drawn from the sample and deposited into the wells of the $i$th dilution, such that $n = D_i V_\text{inoc}/V_\text{vir}$. Therefore, the probability that a well at the $i$th dilution will not become infected is given by

$$q_i = \left(1 - C_\text{inf} V_\text{vir}\right)^{D_i V_\text{inoc}/V_\text{vir}} = q_\text{noinf}^{D_i}$$

where $1 - q_i = 1 - q_\text{noinf}^{D_i}$ is the probability of infection for a well at the $i$th dilution.

When conducting an ED assay, each dilution in the assay contains a number of independent infection wells (replicates), all inoculated with the same dilution, $D_i$. This is analogous again to drawing balls out of a bag, but this time there are $n_i$ draws (replicate wells), and the probability of success (i.e., that a well becomes infected) is simply one minus the probability of failure (i.e., that a well does not become infected, $q_i$). The probability that $k_i$ out of the $n_i$ wells become infected at dilution $D_i$ is described by the Binomial distribution

$$P(k_i|q_\text{noinf}) = \binom{n_i}{k_i} \left(1 - q_\text{noinf}^{D_i}\right)^{k_i} q_\text{noinf}^{D_i (n_i - k_i)}$$

where $n_i$ is the number of replicate wells at dilution $D_i$, which could be less than planned if any wells at that dilution are spoiled or contaminated. However, our interest is not in determining $k_1$ given $q_\text{noinf}$, but rather in determining $q_\text{noinf}$ given that we observed $k_1$ infected wells out of $n_1$ wells in the first column. To this aim, we can make use of Bayes' theorem which, in our context, can be expressed as

$$P(p|\text{data}) = \frac{P(\text{data}|p)\, P(p)}{\int_0^1 P(\text{data}|p)\, P(p)\, \mathrm{d}p}$$

or rather

$$P_\text{post,1}(q_\text{noinf}|k_1) = \frac{P(k_1|q_\text{noinf})\, P_\text{prior}(q_\text{noinf})}{\int_0^1 P(k_1|q_\text{noinf})\, P_\text{prior}(q_\text{noinf})\, \mathrm{d}q_\text{noinf}} = \frac{\binom{n_1}{k_1} \left(1 - q_\text{noinf}^{D_1}\right)^{k_1} q_\text{noinf}^{D_1 (n_1 - k_1)}\, P_\text{prior}(q_\text{noinf})}{\int_0^1 P(k_1|q_\text{noinf})\, P_\text{prior}(q_\text{noinf})\, \mathrm{d}q_\text{noinf}}$$

$$P_\text{post,1}(q_\text{noinf}|k_1) \propto \left(1 - q_\text{noinf}^{D_1}\right)^{k_1} q_\text{noinf}^{D_1 (n_1 - k_1)}\, P_\text{prior}(q_\text{noinf})$$

where $P_\text{post,1}(q_\text{noinf}|k_1)$ is our updated, posterior belief about $q_\text{noinf}$ after having observed $k_1$ successes out of $n_1$ trials in the first column ($i = 1$), and given our prior belief, $P_\text{prior}(q_\text{noinf})$, about $q_\text{noinf}$ before making this observation.
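The Bayesian update for a single dilution column can be sketched numerically on a grid of $q_\text{noinf}$ values. This is a minimal illustration, not midSIN's implementation: the flat prior in $q_\text{noinf}$, the grid resolution, the observation (3 of 8 wells infected at $D_1 = 0.1$), and the function names are all assumptions made for the example.

```python
import math

def column_likelihood(k, n, dilution, q_noinf):
    """Binomial probability that k of n replicate wells are infected at one dilution."""
    q_i = q_noinf ** dilution            # probability that a well stays uninfected
    return math.comb(n, k) * (1.0 - q_i) ** k * q_i ** (n - k)

def posterior_on_grid(k, n, dilution, grid_size=10001):
    """Posterior density of q_noinf on a regular grid, assuming a flat prior in q_noinf."""
    step = 1.0 / (grid_size - 1)
    grid = [j * step for j in range(grid_size)]
    unnorm = [column_likelihood(k, n, dilution, q) for q in grid]
    # Trapezoidal rule for the normalising integral over [0, 1]
    norm = step * (sum(unnorm) - 0.5 * (unnorm[0] + unnorm[-1]))
    return grid, [u / norm for u in unnorm]

# Hypothetical observation: 3 of 8 wells infected at dilution D_1 = 0.1
grid, post = posterior_on_grid(k=3, n=8, dilution=0.1)
q_mode = grid[max(range(len(post)), key=post.__getitem__)]
print(f"posterior mode of q_noinf ~ {q_mode:.4f}")
```

With a flat prior the posterior mode sits where $q_\text{noinf}^{D_1}$ equals the observed fraction of uninfected wells, here $5/8$, which gives a quick sanity check on the grid result.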

Considering all dilutions of the ED assay
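As mentioned above, in the 96-well ED assay, each dilution contains a number of independent infection wells (replicates) inoculated with the same sample concentration. This process is then repeated over a series of dilutions, each separated from the previous by a fixed dilution factor. Having observed the fraction of wells infected at the first dilution considered, $D_1$, we have updated our posterior belief about $q_\text{noinf}$. We will now use this updated belief as our new prior as we observe our second dilution ($D_2$), such that

$$P_\text{post,2}(q_\text{noinf}|\vec{k}_2) \propto P(k_2|q_\text{noinf})\, P_\text{post,1}(q_\text{noinf}|k_1) \propto Q(\vec{k}_2|q_\text{noinf})\, P_\text{prior}(q_\text{noinf})$$

where we introduce $\vec{k}_2 = \{k_1, k_2\}$ and

$$Q(\vec{k}_2|q_\text{noinf}) = \prod_{i=1}^{2} \binom{n_i}{k_i} \left(1 - q_\text{noinf}^{D_i}\right)^{k_i} q_\text{noinf}^{D_i (n_i - k_i)}$$

as short-hands for convenience. From this, it is easy to extrapolate the posterior likelihood distribution (pPLD) after having observed all $J$ dilutions ($D_1, D_2, ..., D_J$) of the ED assay, namely

$$P_\text{post,J}(q_\text{noinf}|\vec{k}_J) \propto Q(\vec{k}_J|q_\text{noinf})\, P_\text{prior}(q_\text{noinf}) \qquad (4)$$

where

$$Q(\vec{k}_J|q_\text{noinf}) = \prod_{i=1}^{J} \binom{n_i}{k_i} \left(1 - q_\text{noinf}^{D_i}\right)^{k_i} q_\text{noinf}^{D_i (n_i - k_i)} .$$

Note that this expression is largely equivalent to that obtained by Mistry et al. [7].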
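The product over dilution columns is most safely evaluated as a sum of logarithms. The sketch below is one possible way to do this and is not taken from midSIN: it assumes that each column contributes a binomial term, and it takes $\ln(q_\text{noinf})$ rather than $q_\text{noinf}$ as input because $q_\text{noinf}$ itself underflows to zero for concentrated samples. The function name and the plate outcome are hypothetical.

```python
import math

def log_Q(k_wells, n_wells, dilutions, log_q_noinf):
    """Log of the plate likelihood: sum over columns of binomial log-probabilities.

    log_q_noinf is ln(q_noinf) and must be negative.
    """
    total = 0.0
    for k, n, d in zip(k_wells, n_wells, dilutions):
        log_q_i = d * log_q_noinf                  # ln of q_i = q_noinf**D_i
        log_p_i = math.log(-math.expm1(log_q_i))   # ln of 1 - q_i, computed stably
        total += math.log(math.comb(n, k)) + k * log_p_i + (n - k) * log_q_i
    return total

# Hypothetical outcome: 8 replicate wells at each of three 10-fold dilutions
k_wells = [8, 5, 1]
n_wells = [8, 8, 8]
dilutions = [1e-3, 1e-4, 1e-5]
log_q = -1e5 * 0.1    # assumes q_noinf = exp(-C_inf * V_inoc), C_inf = 1e5 /mL, V_inoc = 0.1 mL

joint = log_Q(k_wells, n_wells, dilutions, log_q)
stepwise = sum(log_Q([k], [n], [d], log_q)
               for k, n, d in zip(k_wells, n_wells, dilutions))
print(joint, stepwise)
```

Updating the belief one column at a time and evaluating all columns at once give the same log-likelihood, which is why the order in which the dilutions are considered does not matter.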

Considering the choice of prior
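In Eqn. (4), we obtained a pPLD for $q_\text{noinf}$. Our objective, however, is to estimate the pPLD of $C_\text{inf}$, the specific infection concentration in our sample, rather than that of $q_\text{noinf}$. In fact, because both the plaque and ED assays provide an accuracy that is normally distributed in $\log_{10}(C_\text{inf})$ rather than $C_\text{inf}$, it follows that $\log_{10}(C_\text{inf})$ (hereafter $\tilde{C}_\text{inf}$) rather than $C_\text{inf}$ is the quantity of interest. We note that $Q(\vec{k}_J|q_\text{noinf})$ in Eqn. (4) is a probability density function in $\vec{k}_J$ rather than in $q_\text{noinf}$. As such, a change of variables, say from $q_\text{noinf}$ to $\tilde{C}_\text{inf}(q_\text{noinf})$, would affect only the prior because

$$Q(\vec{k}_J|q_\text{noinf})\, P_\text{prior}(q_\text{noinf})\, \mathrm{d}q_\text{noinf} = Q(\vec{k}_J|\tilde{C}_\text{inf})\, P_\text{prior}(\tilde{C}_\text{inf})\, \mathrm{d}\tilde{C}_\text{inf} .$$

Thus, the pPLD for $\tilde{C}_\text{inf}$ is given by

$$P_\text{post,J}(\tilde{C}_\text{inf}|\vec{k}_J) \propto Q(\vec{k}_J|\tilde{C}_\text{inf})\, P_\text{prior}(\tilde{C}_\text{inf}) \qquad (6)$$

where $Q(\vec{k}_J|\cdot)$ can be written in terms of $q_\text{noinf}$, $C_\text{inf}$, or $\tilde{C}_\text{inf}$, because it is a probability density function in $\vec{k}_J = \{k_1, k_2, ..., k_J\}$ rather than in $q_\text{noinf}$. To complete this expression, we need to choose a physically and biologically appropriate prior belief regarding $\tilde{C}_\text{inf}$.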
Prior to conducting the ED assay, we know at least that $C_\text{inf} \in [1/V_\text{Earth}, 1/V_\text{vir}]$, where $1/V_\text{vir}$ is the maximum possible concentration, namely that obtained if the entire volume of the sample consisted solely of infectious virions, and $1/V_\text{Earth}$ is the minimum possible concentration, namely that obtained if there were only one infectious virion left on Earth. As we explain below, these limits are not important; only the fact that the concentration is convincingly physically bounded both from above and below, i.e., $C_\text{inf} \in (0, \infty)$, is relevant.
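If we choose our prior to be uniform in $C_\text{inf}$, and using the fact that $P_\text{prior}(C_\text{inf})\, \mathrm{d}C_\text{inf} = P_\text{prior}(\tilde{C}_\text{inf})\, \mathrm{d}\tilde{C}_\text{inf}$, where $\tilde{C}_\text{inf} = \log_{10}(C_\text{inf})$, we can write

$$P_\text{post,J}(\tilde{C}_\text{inf}|\vec{k}_J) \propto Q(\vec{k}_J|\tilde{C}_\text{inf})\, P_\text{prior}(C_\text{inf})\, \frac{\mathrm{d}C_\text{inf}}{\mathrm{d}\tilde{C}_\text{inf}} \propto Q(\vec{k}_J|\tilde{C}_\text{inf})\, 10^{\tilde{C}_\text{inf}} \qquad (7)$$

We see here that the range chosen for the uniform prior in $C_\text{inf}$ is not important because it only contributes a constant to our proportionality Eqn. (6). Alternatively, because the ED assay estimates $\tilde{C}_\text{inf}$ rather than $C_\text{inf}$, our prior belief about the virus concentration is more appropriately expressed in $\tilde{C}_\text{inf}$ rather than $C_\text{inf}$. Again, the bounds of the uniform distribution in $\tilde{C}_\text{inf}$ are unimportant, provided that it is finite in extent such that $\tilde{C}_\text{inf} \in [\tilde{C}_\text{inf,min}, \log_{10}(1/V_\text{vir})]$ where $\tilde{C}_\text{inf,min} > -\infty$, such that we can write

$$P_\text{post,J}(\tilde{C}_\text{inf}|\vec{k}_J) \propto Q(\vec{k}_J|\tilde{C}_\text{inf}) \qquad (8)$$

Figure 8 illustrates the two distinct priors assumed to arrive at Eqns. (7) and (8) and their impact on the posterior $P_\text{post,J}(\tilde{C}_\text{inf}|\vec{k}_J)$ for the example ED experiment described in Section 2.1. Figure 8A illustrates the consequence of choosing a prior uniform in $C_\text{inf}$, i.e., a bias towards higher virus concentrations. This is because a uniform prior in $C_\text{inf}$ corresponds to a belief that one is as likely to measure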

Calculation of midSIN's outputs
One of the graphical outputs of midSIN is the non-normalized PLD of $\tilde{C}_\text{inf} = \log_{10}(C_\text{inf})$ given the number of wells that were infected at each dilution, $\vec{k}_J$, like that shown in Figure 2 (left panel), computed as

$$U_\text{post}(\tilde{C}_\text{inf}|\vec{k}_J) = \prod_{i=1}^{J} \binom{n_i}{k_i} \left(1 - q_i\right)^{k_i} q_i^{\,n_i - k_i}$$

where

$$q_i = \exp\left(-10^{\tilde{C}_\text{inf}}\, D_i\, V_\text{inoc}\right) .$$

While $U_\text{post}$ is not the normalized likelihood of $\tilde{C}_\text{inf}$, its maximum value at its mode ($\tilde{C}_\text{inf,mode}$) is the normalized probability of observing this particular ED plate outcome ($\vec{k}_J$) out of all other possible plate outcomes, assuming the true, specific infection concentration in the sample is $\tilde{C}_\text{inf,mode}$. Another visual output of midSIN is a graphical representation of the theoretical number of wells that would be infected given the most likely $\tilde{C}_\text{inf}$, like that shown in Figure 2, computed as

$$n_\text{inf}(x) = n \left[1 - \exp\left(-10^{\tilde{C}_\text{inf,mode}}\, 10^{-x}\, V_\text{inoc}\right)\right] \qquad (11)$$

where $n$ is the number of replicate wells per dilution and $x$ is minus the $\log_{10}$ of the dilution, such that $D = 10^{-x}$ is the dilution. It corresponds to the continuous equivalent of this quantity which is discrete in the ED assay, namely $D_i = 10^{-x_i}$ which is the $i$th dilution of the sample. As such, $D_i = (\text{minimum dilution}) \cdot (\text{dilution factor between columns})^{i-1}$ where $i \in [1, J]$. For example, if the dilution of the least diluted column is $0.1 = 10^{-1}$ and the dilution factor between dilutions in the ED assay is such that it halves the concentration between each dilution, i.e., $1/2 = 2^{-1} = 10^{-\log_{10}(2)} \approx 10^{-0.301}$, then $D_i = 10^{-1} \cdot 10^{-0.301 \cdot (i-1)}$ such that $D_1 = 10^{-1}$, $D_2 = 10^{-1.301}$, $D_3 = 10^{-1.602}$, and so on, such that $x_1 = 1$, $x_2 = 1.301$, $x_3 = 1.602$, and so on.
In the graphical representation of the ED assay, the edges of the grey bands flanking the theoretical blue curve correspond to Eqn. (11) wherein $\tilde{C}_\text{inf,mode}$, the most likely value of $\tilde{C}_\text{inf} = \log_{10}(C_\text{inf})$, has been replaced by the 68% and 95% CI values for $\tilde{C}_\text{inf}$. These CI bands do not correspond to the 68% and 95% CI of the expected number of infected wells at each dilution given $\tilde{C}_\text{inf,mode}$.
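The dilution series and the theoretical curve described in this section can be sketched as follows. The expected-wells formula assumes that a well escapes infection with probability $\exp(-C_\text{inf} D V_\text{inoc})$; the function names and the numerical values in the second part (a sample at $10^5$ SIN/mL, 0.1 mL inoculum, 8 replicates) are illustrative assumptions.

```python
import math

def dilution_exponents(x_first, log10_step, n_dilutions):
    """Exponents x_i such that D_i = 10**(-x_i), for equally spaced serial dilutions."""
    return [x_first + log10_step * i for i in range(n_dilutions)]

def expected_infected_wells(log10_c_inf, x, v_inoc, n_replicates):
    """Expected number of infected wells at dilution D = 10**(-x)."""
    mean_infections = 10.0 ** (log10_c_inf - x) * v_inoc   # C_inf * D * V_inoc
    return -n_replicates * math.expm1(-mean_infections)    # n * (1 - exp(-mean))

# Worked example from the text: first dilution 10**-1, then halving at each step
xs = dilution_exponents(1.0, math.log10(2.0), 3)
print([round(x, 3) for x in xs])   # → [1.0, 1.301, 1.602]

# Assumed sample: 10**5 SIN/mL, 0.1 mL inoculum, 8 replicate wells per dilution
x_half = math.log10(1e5 * 0.1 / math.log(2.0))   # dilution at which half the wells are infected
print(expected_infected_wells(5.0, x_half, 0.1, 8))
```

Under this model, half of the wells are expected to be infected when $C_\text{inf} D V_\text{inoc} = \ln 2$, which is the dilution the second part of the example solves for.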

Cell culture
Madin-Darby canine kidney cells (MDCKs) were cultured in growth media (complete MEM media with 5% heat-inactivated FBS), in tissue culture treated T75 flasks, at 37 °C with 5% CO$_2$ and 95% relative humidity. Cells were split 1/10 every 3-4 days or upon reaching approximately 95% confluency. One passage of cells was expanded for use by both researchers in one experiment to quantify the 50% tissue culture infectious dose (TCID50) and plaque forming units (PFU) of one viral strain.

Viral stocks
Stocks of influenza A/Puerto Rico/8/34 (H1N1) (PR8) and influenza A/California/4/09 (Cali/09) were stored at −80 °C and thawed on ice immediately before use. The TCID50 and PFU of the stock viruses were known to both researchers prior to this study. Serial dilutions were made in MDCK infection media (complete MEM media with 4.25% BSA), and were prepared by each researcher independently for titering. 'Researcher A' and 'Researcher B' independently performed the TCID50 and PFU assays of one viral strain for one experiment on the same day using the same viral stock, reagents, and passage of cells. Each experiment was performed on a separate day (Fig. 4).

TCID 50 assay
MDCKs were seeded in 96-well flat bottom plates ($5 \times 10^4$ cells/100 µL, 100 µL/well) and grown to 80% confluency overnight (37 °C, 5% CO$_2$, 95% relative humidity). For each experiment, 4 replicate wells, at each of 7 different dilutions separated by a 10-fold dilution, were infected, and the dilution series was performed 5 times. Cells were washed with PBS with Ca$^{2+}$ and Mg$^{2+}$ before the addition of 100 µL of viral dilutions per well. After 1 h at room temperature on a rocker, the inoculum was aspirated and replaced with 100 µL of infection media containing 1 µg/mL TPCK-Trypsin. Cells were incubated (37 °C, 5% CO$_2$, 95% relative humidity) for 3 d (PR8) or 4 d (Cali/09). Supernatants were used to do a hemagglutination (HA) assay with chicken red blood cells. HA assays were performed and read by 'Researcher A' or 'Researcher B' on their respective experiments.

Statistical analysis
The data points reported in Figure 4C,D were computed by pairing each of the 5 replicates measured with either the PFU, RM, or SK with each of the 5 replicates measured via SIN (5 replicates × 5 replicates = 25 pairs) for each of the 2 experiments by each of the 2 researchers, yielding 100 pairs. For each pair, the $\log_{10}$ of the ratio of either PFU, RM or SK over SIN was computed. The mean and standard deviation of the resulting 100 $\log_{10}$(ratio) values were computed and are reported in Figure 4C,D. The statistical significance (p-value) of the differences between (PFU, RM, SK) and (SIN) was computed using the Mann-Whitney U test (scipy.stats.mannwhitneyu).
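The pairing procedure can be sketched as below for a single experiment by a single researcher. The titers are made-up numbers used only to show the mechanics, the function name is hypothetical, and the use of the sample (n − 1) standard deviation is an assumption since the text does not say which estimator was used.

```python
import itertools
import math
import statistics

def log10_ratio_pairs(other_titers, sin_titers):
    """log10(other / SIN) for every pairing of one replicate from each set."""
    return [math.log10(a / b) for a, b in itertools.product(other_titers, sin_titers)]

# Hypothetical titers for 5 replicates of one experiment by one researcher
pfu = [2.1e7, 1.8e7, 2.5e7, 1.9e7, 2.2e7]   # PFU/mL
sin = [2.0e7, 2.3e7, 1.7e7, 2.4e7, 2.1e7]   # SIN/mL

ratios = log10_ratio_pairs(pfu, sin)         # 5 x 5 = 25 pairs
mean_ratio = statistics.mean(ratios)
sd_ratio = statistics.stdev(ratios)          # sample standard deviation (assumed)
print(len(ratios), round(mean_ratio, 3), round(sd_ratio, 3))
```

Pooling the 25 pairs from each of the 2 experiments and 2 researchers would give the 100 values described above. The Mann-Whitney U test is available as `scipy.stats.mannwhitneyu(x, y)`; it is left out of the sketch to keep it free of third-party dependencies.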

Figure 1 :
Figure 1: Examples of challenges in establishing a robust count of infection plaques or foci. MDCK cells were infected with a sample containing influenza A (H3N2) virions, and cell infection was visualized via staining by antibodies against the matrix (M) viral protein. The uneven liquid distribution along the well's edges means some infectivity is lost or miscounted. It can be hard to distinguish between two merged foci and a single larger uneven focus. It is difficult to decide how small a focus can be and still be counted, and to settle on a focus size threshold that is used consistently for all wells and all samples within a particular experiment. As a result of these difficulties, different individuals will commonly count a different number of foci in the same well. Stained well image graciously provided by Frederick Koster (Lovelace Research Institute, NM, USA).

Figure 2 :
Figure 2: Visual representation of midSIN's output for the example ED plate. (A) Illustration of the example ED plate where $D_i$ are the chosen serial dilutions of the sample. For the example described in the text, $D_1 = 10^{-2}$, $D_2 = 10^{-2.5}$, ..., $D_{11} = 10^{-7}$, with 8 replicates per dilution. The number of infected wells (# inf) is indicated at the bottom of each dilution column. (B) The midSIN-estimated likelihood distribution of the $\log_{10}$ infection concentration, $\log_{10}$(SIN/mL), for the example ED experiment. The vertical lines correspond to $\log_{10}$(SIN/mL), based on the most likely value (mode) of midSIN's likelihood distribution (solid blue), or computed from the RM (solid orange) and SK (dashed green) approximations of the $\log_{10}$(TCID50) (see Methods). The x-values at the edges of the white and light grey regions on either side of the mode indicate the bounds of the 68% and 95% credible intervals (CI), respectively. The midSIN-estimated $\log_{10}$(SIN/mL) mode ± 68% [±95%] CI are indicated numerically above the graph. (C) The number of infected wells (black circles) out of the 8 replicates, as a function of the 11 serial dilutions of the example ED plate, from the least (leftmost) to the most (rightmost) diluted. For example, $x = 3.0$ corresponds to a sample dilution of $10^{-3}$ or 1/1,000. The average (expected) number of infected wells, as a function of sample dilution, is shown for the most likely value of $\log_{10}$(SIN/mL) (blue curve) or its 68% and 95% CI (inner and outer edge of the grey bands, respectively). The sample dilution (x-value) at which the blue curve crosses the horizontal dotted line (50% infected wells) corresponds to a concentration of 1 TCID50 per ED well volume. The vertical lines indicate the sample dilution that yields a concentration of 1 TCID50 according to the RM and SK approximations.

Figure 3 :
Figure 3: Quantification of RSV sampled from in vitro infections. Each row corresponds to a different experiment (mock-yield [my] or single-cycle [sc]) and sampling time point (e.g., 8 h, 36 h), and each sample was measured in duplicate (rep1, rep2). These data were collected from in vitro infections with the RSV A Long strain, and were previously reported in [1]. The ED measurement experiments were conducted using a plate layout of 11 dilutions, with 8 replicates per dilution, an inoculum volume of $V_\text{inoc} = 0.1$ mL, and serial dilutions from $D_1 = 10^{-1}$ to $D_{11} = 10^{-6}$, separated by a dilution factor of $10^{-0.5}$.

Figure 4 :
Figure 4: Comparing SIN to TCID50 and PFU for influenza A virus samples. (A,B) The infection concentration in two influenza A (H1N1) virus strain samples was measured via both an ED assay and a plaque assay (x, PFU). The ED assay was quantified in $\log_{10}$(TCID50) using the RM (square) or SK (triangle) methods, or in $\log_{10}$(SIN) using midSIN (circle with 68%, 95% CI). Each of the 2 strain samples was measured over 2 separate experiments (Exp. #1, #2), performed each time by 2 different researchers (Researcher A or B), with 5 biological replicates each. The grey bars indicate the range of $\log_{10}$(SIN) values across the 5 replicates. The RM, SK, and SIN measures were estimated for each replicate based on the same ED plate. The experimental details are provided in Methods. (C,D) The $\log_{10}$ of the ratio between either the TCID50 via the RM or SK method or the PFU, over the SIN via midSIN. The ratios were computed for each pair of replicates (5 × 5 replicates), per experiment, per researcher (25 pairs × 2 researchers × 2 experiments = 100 ratios), shown as individual symbols (dots) for each method (RM, SK, PFU). The mean and 68% CI of the 100 ratios are indicated numerically and as black circles with error bars.

Figure 5 :
Figure 5: midSIN's estimate of a sample's infection concentration based on a single dilution. This is a simulated example of an ED plate with an inoculation volume of $V_\text{inoc} = 0.1$ mL. Instead of serial dilutions, a single dilution ($D_1 = 0.01$) is used, and either 1, 2 or 3 well(s) out of the 4 replicate wells are infected. As the fraction of infected wells increases, the uncertainty on the estimate (68% and 95% CIs) decreases, and the likelihood distribution becomes more symmetric (Normal-like).

Figure 6 :
Figure 6: Comparing known input to estimated output concentrations. For each input concentration between $10^{2.2}$ and $10^{9.4}$, one million random ED experiment outcomes (# of positive wells in each dilution column) were generated. For each ED outcome, either (A) midSIN was used to determine the most likely $\log_{10}$(SIN/mL); or the (B) RM or (C) SK method was used to estimate the $\log_{10}$(TCID50/mL). Vertically stacked grey bands at each input concentration are sideways histograms, proportional to the number of ED outcomes that yield a given y-axis value. The black curves join the median (thick), 68th (thin) and 95th (dashed) percentiles of the histograms, determined at (but not between) each input concentration. A plate layout of 11 dilutions, with 8 replicates per dilution, an inoculum volume of $V_\text{inoc} = 0.1$ mL, and serial dilutions from $D_1 = 10^{-2}$ to $D_{11} = 10^{-8}$, separated by a dilution factor of $10^{-0.6} \approx 1/4$, was used in the simulated ED experiments.
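One way to generate random ED plate outcomes like those summarised in this caption is sketched below. It assumes that each well is infected independently with probability $1 - \exp(-C_\text{inf} D_i V_\text{inoc})$; the authors' actual simulation code is not shown in this document, so the function name, the seed, and the input concentration are illustrative assumptions.

```python
import math
import random

def simulate_ed_plate(log10_c_inf, x_values, v_inoc, n_replicates, rng):
    """Draw the number of infected wells in each dilution column of one plate."""
    outcome = []
    for x in x_values:
        mean_infections = 10.0 ** (log10_c_inf - x) * v_inoc   # C_inf * D_i * V_inoc
        p_infect = -math.expm1(-mean_infections)               # 1 - exp(-mean)
        infected = sum(rng.random() < p_infect for _ in range(n_replicates))
        outcome.append(infected)
    return outcome

rng = random.Random(0)                            # fixed seed, for reproducibility only
x_values = [2.0 + 0.6 * i for i in range(11)]     # D_1 = 10**-2 ... D_11 = 10**-8
plate = simulate_ed_plate(5.0, x_values, 0.1, 8, rng)   # assumed input: 10**5 per mL
print(plate)
```

Repeating the draw many times for each input concentration, and passing each outcome to the estimator under study, would produce histograms of the kind described in the caption.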

Figure 7 :
Figure 7: Comparing the effect of the dilution factor and number of replicates per dilution. The effect of either (A,D,G) decreasing the change in dilution (from a dilution factor of 2.2/100 to 61/100) while keeping 8 replicates per dilution; or (B,E,H) increasing the number of replicates per dilution (4 to 24) while keeping a fixed dilution factor (≈ 35/100); or (C,F,I) increasing the dilution factor while decreasing the number of replicates, keeping a fixed number of 96 wells used in total to titer one virus sample. Different rows represent the ratio of the estimated output concentration using (A-C) midSIN in SIN/mL, (D-F) RM or (G-I) SK in TCID50/mL, and the input concentration. In all cases (A-I), the input concentration was $10^5$ SIN/mL, and as the dilution factor was varied, the highest and lowest dilutions in the simulated ED plate were held fixed to $D_1 = 10^{-2}$ and $D_\text{last} = 10^{-7}$, respectively, by changing the total # of dilutions performed (simulated). Everything else is generated or computed as described in the caption of Figure 6.
Figure 8: Impact of the choice of prior on the posterior distribution for $\tilde{C}_\text{inf} = \log_{10}(C_\text{inf})$. (A) Non-normalized priors for $\log_{10}$(specific infections, SIN/mL) $= \tilde{C}_\text{inf}$ that are uniform in either $C_\text{inf}$ or $\tilde{C}_\text{inf}$ are shown. A prior uniform in $C_\text{inf}$ is biased towards larger values of $\tilde{C}_\text{inf}$. (B) Updated posterior belief about $\tilde{C}_\text{inf}$ for each of the two prior beliefs shown in (A), as per Eqns. (7) and (8), after having observed the ED assay example provided in Section 2.1. While the prior uniform in $C_\text{inf}$ yields a pPLD with a mode of $\tilde{C}_\text{inf} = 6.21$, that for a prior uniform in $\tilde{C}_\text{inf}$ yields a mode of $\tilde{C}_\text{inf} = 6.18$.

a set of virus concentrations in the range [0.001, 0.002] as in the range [1,000,000.001, 1,000,000.002]. When plotted on a log-scale, there are 100× more intervals of width 0.001 in $[10^4, 10^5]$ than in $[10^2, 10^3]$. Thus, this prior corresponds to a belief that the likelihood of measuring a certain virus concentration increases exponentially as $\tilde{C}_\text{inf}$ increases linearly. In contrast, a prior uniform in $\tilde{C}_\text{inf}$ corresponds to a belief that one is as likely to measure a set of virus concentrations in the range [0.001, 0.002] as in the range [1,000,000, 2,000,000], or rather in the range $[1, 2] \times 10^{-3}$ as in the range $[1, 2] \times 10^{6}$. As such, a uniform distribution in $\tilde{C}_\text{inf}$ is more physically and biologically sensible and therefore was chosen for our estimation method.
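The effect of the two priors on the posterior mode can be explored with a simple grid search, sketched below. The plate outcome is hypothetical (it is not the example of Section 2.1, so the modes of 6.21 and 6.18 quoted above are not expected to be reproduced), the likelihood assumes a well stays uninfected with probability $\exp(-C_\text{inf} D_i V_\text{inoc})$, and the function names, grid range, and grid spacing are arbitrary choices.

```python
import math

def log_Q(log10_c_inf, k_wells, n_replicates, x_values, v_inoc):
    """Log-likelihood of the plate outcome, up to an additive constant."""
    total = 0.0
    for k, x in zip(k_wells, x_values):
        mean_inf = 10.0 ** (log10_c_inf - x) * v_inoc   # expected infections per well
        log_q = -mean_inf                               # log P(well not infected)
        log_p = math.log(-math.expm1(-mean_inf))        # log P(well infected)
        total += k * log_p + (n_replicates - k) * log_q
    return total

def posterior_mode(k_wells, n_replicates, x_values, v_inoc, prior):
    """Grid search for the posterior mode in log10(C_inf)."""
    grid = [i * 0.001 for i in range(10001)]            # log10(C_inf) from 0 to 10

    def log_post(c):
        value = log_Q(c, k_wells, n_replicates, x_values, v_inoc)
        if prior == "uniform_in_C":
            value += c * math.log(10.0)                 # prior density proportional to 10**c
        return value                                    # otherwise: flat in log10(C_inf)

    return max(grid, key=log_post)

# Hypothetical plate: 11 dilutions from 10**-2 to 10**-7, 8 replicates, 0.1 mL inoculum
x_values = [2.0 + 0.5 * i for i in range(11)]
k_wells = [8, 8, 8, 8, 8, 8, 7, 5, 2, 0, 0]

mode_log = posterior_mode(k_wells, 8, x_values, 0.1, "uniform_in_log10_C")
mode_lin = posterior_mode(k_wells, 8, x_values, 0.1, "uniform_in_C")
print(mode_log, mode_lin)
```

Because the prior uniform in $C_\text{inf}$ adds a term that increases with $\tilde{C}_\text{inf}$, its mode can only sit at or above the mode obtained with the prior uniform in $\tilde{C}_\text{inf}$, in line with the bias towards higher concentrations described above.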