
Viral expansion after transfer is a primary driver of influenza A virus transmission bottlenecks

  • Katie E. Holmes,

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Validation, Visualization, Writing – original draft, Writing – review & editing

    Affiliation Department of Microbiology and Immunology, Emory University School of Medicine, Atlanta, Georgia, United States of America

  • Lucas M. Ferreri,

    Roles Data curation, Formal analysis, Writing – review & editing

    Affiliation Department of Microbiology and Immunology, Emory University School of Medicine, Atlanta, Georgia, United States of America

  • Baptiste Elie,

    Roles Methodology

    ¤ Current address: Institute of Medical Virology, University of Zurich, Zurich, Switzerland

    Affiliation Department of Microbiology and Immunology, Emory University School of Medicine, Atlanta, Georgia, United States of America

  • Ketaki Ganti,

    Roles Investigation, Supervision, Writing – review & editing

    Affiliation Department of Microbiology and Immunology, Emory University School of Medicine, Atlanta, Georgia, United States of America

  • Chung-Young Lee,

    Roles Investigation, Writing – review & editing

    Affiliation Department of Microbiology, School of Medicine, Kyungpook National University, Jung-gu, Daegu, Republic of Korea

  • David VanInsberghe,

    Roles Data curation, Formal analysis, Methodology, Software, Validation, Visualization, Writing – original draft, Writing – review & editing

    Affiliation Department of Microbiology and Immunology, Emory University School of Medicine, Atlanta, Georgia, United States of America

  • Anice C. Lowen

    Roles Conceptualization, Funding acquisition, Resources, Supervision, Writing – original draft, Writing – review & editing

    anice.lowen@emory.edu

    Affiliations Department of Microbiology and Immunology, Emory University School of Medicine, Atlanta, Georgia, United States of America; Emory Center of Excellence for Influenza Research and Response (CEIRR), Atlanta, Georgia, United States of America

Abstract

For many viruses, narrow bottlenecks acting during transmission sharply reduce genetic diversity in a recipient host relative to the donor. Since genetic diversity represents adaptive potential, such losses of diversity are thought to limit the opportunity for viral populations to undergo antigenic change and other adaptive processes. Thus, a detailed picture of evolutionary dynamics during transmission is critical to understanding the forces driving viral evolution at an epidemiologic scale. To advance this understanding, we used a barcoded virus library and a guinea pig model of transmission to decipher where in the transmission process influenza A virus populations lose diversity. In inoculated guinea pigs, we show that a high level of viral barcode diversity is maintained. Within-host continuity in the barcodes detected across time furthermore indicates that stochastic effects are not pronounced within the inoculated hosts. Importantly, in both aerosol-exposed and direct contact animals, we observed many barcodes at the earliest time point(s) positive for infectious virus, indicating robust transfer of diversity through the environment. This high viral diversity is short-lived, however, with a sharp decline seen 1–2 days after initiation of infection. Although major losses of diversity at transmission are well described for influenza A virus, our data indicate that events that occur following viral transfer and during the earliest stages of natural infection have a central role in this process. This finding suggests that host factors, such as immune effectors, may have greater opportunity to impose selection during influenza A virus transmission than previously recognized.

Introduction

The high mutation rate characteristic of RNA viruses [1,2], coupled with genetic recombination [3,4] and/or reassortment [5] of segmented genomes, enables constant production of viral variants [6–8]. In turn, this variation provides the substrate for viral evolution [9–11]. For some viruses, including influenza A virus and SARS-CoV-2, viral evolution and viral spread through host populations occur on a similar timescale, such that each shapes the other [12]. Understanding viral evolution is therefore crucial in our efforts to control outbreaks.

Selection and genetic drift are two major drivers of evolutionary change in viral populations [13]. Selection is a deterministic process whereby differences in fitness result in changes in variant frequencies over time. In contrast, genetic drift is a stochastic process whereby changes in variant frequencies result from chance. In general, when population sizes are large, selection predominates over drift and changes in variant frequencies result from differences in the variants’ reproductive output. When population sizes are small, drift instead predominates and changes in variant frequencies result simply from random chance associated with reproductive output. Both selection and drift can reduce genetic diversity, but by different means. Under drift, the loss of a variant comes about as a chance event. Under selection, the loss of a variant comes about from the variant having lower fitness. As such, these processes have different impacts on a population’s fitness trajectory: if driven by selection, the loss of diversity increases viral population-level fitness and as such results in viral adaptation. If driven by drift, the loss of viral diversity could (by chance) either increase or decrease population-level fitness. For influenza viruses, the dynamic interplay between selection and genetic drift is still poorly understood [13–15].
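The dependence of drift on population size can be illustrated with a minimal neutral simulation. The sketch below is not part of the study; it assumes a Wright–Fisher-style resampling scheme, 50 equally frequent variants in the source population, and hypothetical founder numbers (10 versus 20,000), all chosen only for illustration.

```python
import random

def surviving_variants(n_founders, n_variants=50, generations=5, seed=1):
    """Count variants left after a neutral bottleneck and a few generations of drift."""
    rng = random.Random(seed)
    # Founders are drawn at random from a source with equally frequent variants.
    population = rng.choices(range(n_variants), k=n_founders)
    for _ in range(generations):
        # Each generation resamples the previous one with replacement;
        # no variant carries a fitness advantage (pure drift).
        population = rng.choices(population, k=n_founders)
    return len(set(population))

small = surviving_variants(n_founders=10)      # tight bottleneck
large = surviving_variants(n_founders=20000)   # loose bottleneck
print(f"variants remaining: small population = {small}, large population = {large}")
```

With only 10 founders, at most 10 of the 50 variants can survive regardless of their fitness, whereas the large population is expected to retain essentially all of them over the same interval.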

A reduction in viral population size during transmission has been documented for many viruses [16–20]. Termed a transmission bottleneck, this effect is characterized by the establishment of a new infection by few viruses derived from a large source population. The reduction in population size is typically accompanied by a loss of population diversity, known as a genetic bottleneck. Classically, bottlenecks are defined as random events and are therefore a source of genetic drift [21]. When a random bottleneck is active, even highly adaptive variants may go extinct. In the case of HIV, however, the sharp reduction in population size at transmission is a selective process favoring variants that use CCR5 co-receptors [22–24]. The term selective bottleneck is often used to describe this deterministic process.

As for other viruses, loss of diversity during influenza A virus transmission has been documented. In humans with naturally occurring infections, as few as one to two viral genomes from an infected individual were found to establish infection in an uninfected individual [13]. Studies using influenza A virus populations with inserted genetic barcodes similarly revealed narrow transmission bottlenecks in animal models [25,26]. More relaxed bottlenecks have been reported in natural hosts aside from humans, suggesting roles for host biology and/or modes of transmission in shaping species-specific between-host dynamics [27,28]. Finally, avian influenza A virus transmission in mammals has been shown to involve a tight bottleneck with reductions in viral diversity attributed to either stochastic or selective forces [29–31].

While it is known that diversity is lost during influenza A virus transmission, it is unclear at which point—in the donor host, the environment, or the recipient host—this takes place. The specific stage at which diversity is lost is likely to define the potential for selection to act during transmission. We therefore made an influenza A virus library with 4,096 potential barcodes to monitor the fate of many unique viral lineages within infected guinea pigs and through the course of transmission to contacts. Our data reveal that viral diversity remains high in inoculated animals throughout infection, and many viral genomes are transferred to animals exposed by aerosols or direct contact. However, a severe bottleneck occurs in exposed animals 1–2 days after the initiation of infection, such that few lineages are sustained in the context of population expansion. Thus, in our system, losses of diversity are primarily driven by events that occur during the expansion of infection, not during the process of donor-to-recipient transfer itself. Importantly, this phase of expansion could be a previously unrecognized opportunity for selection to operate.

Results

Generation of a barcoded influenza A virus library

We designed a genetic barcode for influenza A/Panama/2007/99 (H3N2) virus (Pan/99) with the goals of producing a diverse viral population while avoiding attenuation of viral replication and fitness differences among barcode variants. To achieve high diversity, 12 nucleotide sites within a 50-nucleotide region of the NA segment were made bi-allelic. Since one of two bases was possible at each of 12 positions, 2¹² = 4,096 unique barcodes were introduced (Fig 1A). To avoid attenuation, we introduced polymorphisms within the native sequence of Pan/99 rather than inserting a foreign sequence. To avoid fitness differences among the barcode variants, the polymorphisms introduced were synonymous and chosen based on their natural occurrence in H3N2-subtype influenza A viruses circulating in humans from 1994 to 2004. We reasoned that naturally occurring variants detected in circulation would be likely to have minimal fitness effects. We named the gene segment modified in this way Pan/99 NA-BC.

Fig 1. NA barcode diversity is maintained in both plasmid and virus stocks, and the barcode does not affect overall fitness.

A) Barcode design for the NA segment of influenza A/Panama/2007/99 (H3N2) virus. At each of 12 sites, one of two nucleotides is possible, allowing for up to 4,096 unique barcodes within the population. B) Barcodes detected in the pDP Pan/99 NA-BC plasmid preparation and passage 1 stock of Pan/99 NA-BC virus. Colors represent unique barcodes, and their frequencies within the stock are indicated by the height of the color. C) Shannon Diversity Index (H) of the stock samples is compared to the maximum possible diversity for this system (Hmax = 8.32, shown with a horizontal dashed line). This theoretical maximum reflects a population in which all 4,096 potential barcodes are equally represented. D) Sequence logo plots demonstrate the sequence motifs present in plasmid and passage 1 virus stocks. Each pair of nucleotides represents one of the 12 bi-allelic sites. The height of the letter indicates the corresponding nucleotide frequency in the stock. E) Multicycle replication of Pan/99 WT (black) and Pan/99 NA-BC (pink) viruses. Infections were performed in triplicate, and data points show results from individual cell culture dishes. Lines connect mean titers. Dashed line indicates limit of detection (50 PFU/mL). Two-way ANOVA showed no significant difference between viruses at any time point (p = 0.11). Fig 1A created in BioRender. Underlying data available in S1 Data and on the Zenodo database (https://doi.org/10.5281/zenodo.16115331).

https://doi.org/10.1371/journal.pbio.3003352.g001

To evaluate barcode diversity in the reverse genetics plasmid and the passage 1 (P1) viral stock carrying Pan/99 NA-BC, we subjected the barcode region to next-generation sequencing. No barcode was found to be dominant in the plasmid library or the virus stock, and no nucleotide was dominant at any site within the barcode (Fig 1B and 1D). An aberrant nucleotide was detected at the 12th barcode site in the plasmid preparation but was not carried through to the virus stock. To quantify diversity, we applied the Shannon Diversity Index (H), which considers the richness (i.e., the number of species present) and the evenness (i.e., the abundance of the present species) in a community [32]. To calculate H, each unique barcode was taken to represent a species. The measured diversity in the plasmid and virus stocks was 8.05 and 7.81, respectively (Fig 1C), near to the theoretical maximum (Hmax) for this system of 8.32 and revealing little loss of barcode diversity during viral recovery from cDNA. To determine whether the barcode in the NA segment altered fitness of the Pan/99 virus, multi-cycle replication was evaluated in MDCK cells. No significant differences between wild type and Pan/99 NA-BC viruses were detected (Fig 1E).
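As an illustration of these metrics, the sketch below computes Shannon diversity, richness, and evenness from a list of barcode read counts. It assumes the natural logarithm, which is consistent with the stated Hmax of 8.32 for 4,096 equally frequent barcodes. The evenness measure shown (Pielou's J, H divided by the log of richness) is an assumption, since the text does not specify which evenness index was used; the function names are likewise illustrative.

```python
import math

def shannon_diversity(counts):
    """Shannon index H = -sum(p * ln p) over barcodes with non-zero counts."""
    total = sum(counts)
    freqs = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in freqs)

def richness(counts):
    """Number of distinct barcodes detected."""
    return sum(1 for c in counts if c > 0)

def pielou_evenness(counts):
    """Evenness as H / ln(richness); assumed measure, defined as 0 for <2 barcodes."""
    s = richness(counts)
    return shannon_diversity(counts) / math.log(s) if s > 1 else 0.0

# Theoretical maximum: all 2**12 barcodes equally represented.
h_max = math.log(2 ** 12)
print(f"Hmax = {h_max:.2f}")
```

A perfectly even population of all 4,096 barcodes returns H equal to Hmax, and a population dominated by a single barcode returns H close to 0.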

Modeling transmission of Pan/99 NA-BC virus in guinea pigs

To evaluate viral dynamics within and between hosts, we inoculated guinea pigs with 5 × 10⁴ plaque-forming units (PFU) of Pan/99 NA-BC virus intranasally [33]. After 24 h, we placed a naïve animal with each inoculated animal in either direct or aerosol contact (S1A Fig). Three independent experiments were performed, each including four direct contact and four aerosol transmission pairs. Viral shedding, assessed by plaque assay of daily nasal lavage samples, was similar for all inoculated animals (S1B Fig). Transmission efficiency varied from 75% to 100% for pairs in direct contact and from 50% to 100% for pairs in aerosol contact. The daily nasal lavages collected from both inoculated and exposed animals furnished samples with which to investigate viral population dynamics and the drivers of the influenza A virus transmission bottleneck.

Stochastic effects are not pronounced in inoculated animals

To determine how barcode diversity changed during infection in inoculated animals, we evaluated the barcodes present in nasal lavage samples by next-generation sequencing. In all directly inoculated animals, many barcodes were detected throughout the course of infection (Figs 2 and S2). Using the Shannon Diversity index (H), we found that barcode diversity either remains consistent or declines slightly over time in inoculated animals (Figs 3 and S3). Changes in diversity are furthermore mirrored by changes in both barcode richness and evenness, indicating that declines in diversity are driven both by loss of barcode species from the within-host population and skewing of the relative frequencies of individual barcodes that comprise the population (Figs 3 and S3).

Fig 2. Population diversity declines between inoculated and exposed guinea pigs.

Viral titers in nasal lavage samples are indicated by the overall height of the bar. Red lines show LOD (50 PFU/mL). Colors within the bars represent unique barcodes, and the height of each color indicates barcode frequency within the sample. Plots for individual animals are paired with those of their cage mate. Representative pairs for direct contact (A) and aerosol (B) exposure are shown from three experimental replicates. Guinea pig ID numbers are shown in the upper right corner of each plot. Data from additional animals are shown in S2 Fig. The data underlying this figure can be found at https://doi.org/10.5281/zenodo.16115331 and in S2 Data.

https://doi.org/10.1371/journal.pbio.3003352.g002

Fig 3. Initial high barcode diversity in exposed animals plummets after the first 1–2 days of viral positivity.

A) For each animal, the Shannon diversity index (H) was normalized to that on the first day of viral positivity (as determined by plaque assay). Data from a total of 24 inoculated animals (left) and 19 exposed animals (right) across three experiments are shown. The Shannon diversity in each animal relative to its first day of positivity is on average 3.7× lower in exposed animals compared to inoculated animals across all time points (p < 0.0001). B) Changes in barcode composition between successive days were assessed using the Bray–Curtis dissimilarity index to compare barcode composition in each sample to that observed on the prior day. Lines connect data points for one animal, with inoculated (left) and exposed (right) animals shown on separate graphs. Underlying data can be found at https://doi.org/10.5281/zenodo.16115331 and in S1 Data.

https://doi.org/10.1371/journal.pbio.3003352.g003

High initial diversity in exposed animals precedes a sharp decline

To determine how barcode diversity changed between donor and recipient animals, barcodes present in nasal lavage samples of recipients were analyzed. Strikingly, at the earliest time point(s) positive for infectious virus in recipients, many barcodes are present, signifying transfer of an appreciably large and diverse viral population through the environment (Figs 2 and S2). This observation is true of both aerosol-exposed and direct contact animals. Populations present early in exposed animals are characterized by high barcode diversity, richness, and evenness (Figs 3 and S3).

However, the observed high barcode diversity in exposed animals is short-lived, with a sharp decline seen within 1–2 days of the initiation of infection (Figs 2, S2, and S3). Both richness and evenness contribute to the precipitous drop in diversity seen within exposed animals (Figs 3A and S3). The stark change in population composition is unique to early times after transmission: pairwise comparisons between successive time points show high dissimilarity of barcode composition between the first and second positive days but low dissimilarity between successive days thereafter (Fig 3B). While the few viral lineages that penetrate the bottleneck persist through the remainder of the acute infection, those that persist differ across animals (Figs 2 and S2), excluding the possibility that some barcodes carry a fitness advantage. Notably, barcode diversity in some exposed animals rebounds late in the infection.
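The day-to-day comparison described above can be sketched as follows. This is a minimal illustration of the Bray–Curtis dissimilarity, assuming that it is computed on relative barcode frequencies (in which case it reduces to one minus the summed minimum frequency of each barcode); whether the analysis used raw counts or frequencies is not stated here, and the barcode names and counts are invented for the example.

```python
def bray_curtis(sample_a, sample_b):
    """Bray-Curtis dissimilarity between two samples given as {barcode: read count}.

    Counts are converted to relative frequencies first (an assumption), so the
    result is 0 for identical compositions and 1 for samples sharing no barcode.
    """
    total_a = sum(sample_a.values())
    total_b = sum(sample_b.values())
    barcodes = set(sample_a) | set(sample_b)
    shared = sum(
        min(sample_a.get(bc, 0) / total_a, sample_b.get(bc, 0) / total_b)
        for bc in barcodes
    )
    return 1.0 - shared

# Hypothetical recipient: diverse on the first positive day, then dominated by one barcode.
day1 = {"bc01": 25, "bc02": 25, "bc03": 25, "bc04": 25}
day2 = {"bc03": 95, "bc04": 5}
day3 = {"bc03": 96, "bc04": 4}
print(bray_curtis(day1, day2), bray_curtis(day2, day3))
```

In this toy example the dissimilarity between the first and second days is high, while that between the second and third days is close to zero, mirroring the pattern described for exposed animals.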

To assess the potential for the observed population dynamics to be driven by selection acting on de novo mutations outside the barcoded region, we performed whole-genome sequencing of a subset of nasal lavage samples. Specifically, viral genomes collected on the first or second day of positivity in exposed animals from experimental replicates 1 and 2 were sequenced in full. Among these samples, a single variant was found above a 1% frequency, which was a nonsynonymous change present at 4.9% in the NP segment of guinea pig 31 on day 5 post-inoculation. This variant resulted in a methionine to valine substitution at amino acid 238. Thus, reductions in viral diversity observed in exposed animals were not driven by selective sweeps of de novo variants.

As a means of validating results obtained from targeted sequencing of the barcoded region of the viral genome, we analyzed the whole-genome sequencing reads that spanned this region. This analysis relied on reads spanning the entirety of the barcode, which were relatively rare. Nevertheless, high-level observations could be confirmed: first-positive time points showed multiple barcodes present and subsequent time points revealed the same predominant barcodes in both sequencing datasets (S4A Fig). Furthermore, results of targeted barcode sequencing were validated by sequencing a subset of samples multiple times to ensure reproducibility (S4B and S4C Fig).

To evaluate whether barcodes detected on the first day of positivity in recipient animals were carried by viable viruses, we performed plaque assays with nasal lavage samples from the initial days of viral positivity from 12 recipient animals. The barcodes present in plaque isolates were then identified by sequencing. In five of the 12 animals (GP02, GP34, GP38, GP40, GP44), plaques from the first day of positivity yielded a diverse set of barcodes, indicating that many independent viral lineages present early in these five animals were represented in the infectious viral population (Fig 4). In the remaining seven animals, the plaque isolates from the first positive day were low diversity and largely matched those derived from the subsequent positive days (Fig 4). We also included several late samples in this analysis, to investigate whether increases in barcode diversity seen at late times were also apparent among infectious viruses. Again, we saw a mixed outcome, with the infectious virus isolated from three animals (GP02, GP27, GP40) reflecting the increase in diversity seen at the bulk level and the isolates from the other three animals instead showing low diversity despite a rebound of diversity in the last bulk sample (Fig 4).

Fig 4. Infectious viruses isolated early after transmission carry diverse barcodes in a subset of animals.

For 12 exposed animals, up to 24 plaques were isolated from nasal wash samples collected on the indicated days post-inoculation. Barcodes detected in isolated plaques are shown beneath those detected in the corresponding nasal lavage sample, with each row representing a plaque. Each color represents a unique barcode, with frequency shown on the horizontal axis. Underlying data can be found at https://doi.org/10.5281/zenodo.16115331.

https://doi.org/10.1371/journal.pbio.3003352.g004

To test whether the differing population dynamics between donor and recipient animals relate to viral dose, we evaluated dynamics of barcode composition in animals inoculated with escalating doses of virus (S5A Fig). Three animals were inoculated intranasally with 1 × 10² PFU, 3.3 × 10² PFU, 1 × 10³ PFU, 3.3 × 10³ PFU, 1 × 10⁴ PFU, or 5 × 10⁴ PFU of Pan/99 NA-BC virus. We saw that, at each inoculation dose, measured values for diversity, richness, and evenness are below what would be expected if all inoculated barcodes were recovered in nasal lavage samples (S5B Fig). Thus, a genetic bottleneck acts within animals infected through inoculation. This bottleneck is of comparable stringency across the dose range tested. A bottleneck acting during the establishment of infection was not, however, observed in inoculated animals as it was in transmission recipients.
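One simple way to form an expectation for barcode richness at a given dose is sketched below. This is not necessarily the calculation underlying S5B Fig. It assumes that each PFU carries one barcode drawn independently from 4,096 equally frequent barcodes (the stock is close to, but not exactly, even) and that every inoculated PFU is recovered; the function name is illustrative.

```python
def expected_richness(n_virions, n_barcodes=4096):
    """Expected number of distinct barcodes among n_virions independent, uniform draws."""
    # Probability that a given barcode is missed by every draw, then its complement.
    p_missed = (1 - 1 / n_barcodes) ** n_virions
    return n_barcodes * (1 - p_missed)

# Inoculation doses (PFU) used in the dose-escalation experiment.
doses = [1e2, 3.3e2, 1e3, 3.3e3, 1e4, 5e4]
for dose in doses:
    print(f"{dose:>8.0f} PFU -> about {expected_richness(dose):.0f} barcodes expected")
```

Under these assumptions, nearly every PFU at the lowest dose would carry a distinct barcode, while the highest dose would be expected to contain almost the full library.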

Discussion

We used a barcoded influenza A virus system to generate a high-resolution view of changes in the composition and diversity of viral populations as they expand, contract, transmit, and establish infection in a new host. Our data suggest that viral evolutionary dynamics through a transmission event comprise two distinct stages: inter-host transfer and intra-host establishment. The bottleneck acting during the first stage can be loose, allowing many genotypes to pass to the new host. Conversely, the bottleneck acting at the second stage is tight, such that few genotypes dominate the established within-host population. The evolutionary implications of this two-stage process are potentially great. The existence of a loose bottleneck between hosts could enable selection to act efficiently on the transferred viral population prior to a stochastic contraction of diversity imposed by the process of viral establishment. Indeed, our data offer a potential solution to a long-standing conundrum in evolutionary virology, that of disparate evolutionary dynamics at the individual and global population scales [34].

The dynamics we observe here for influenza A virus transmission are analogous to those seen in mice inoculated orally with a barcoded coxsackievirus, wherein high initial barcode diversity was followed by a sharp decline during enteric infection [35]. The observed reduction in diversity between influenza virus populations in donor animals and those that become established in recipients is furthermore consistent with prior studies on influenza in human cohorts and experimental animals [13,25,26,29]. Our conclusion that early viral dynamics in the recipient make a major contribution to the loss of diversity during transmission is, however, novel. This pattern was likely not apparent in prior work due to the time points analyzed. In human cohort studies, sample collection is typically triggered by the onset of symptoms, such that the very early stages of infection are often not sampled [13]. Similarly, previous studies of the influenza A virus transmission bottleneck in experimental models did not characterize viral populations present early after transmission [25,29].

The very transient nature of early, diverse, viral populations is also apparent within our own data sets. Direct sequencing of viral genomes sampled from the nasal cavity of recipient hosts revealed a collapse of barcode diversity over a period of approximately 24 h. Conversely, sequencing of infectious isolates derived from the sampled populations suggested a more rapid extinction of viral lineages. Among 12 recipient animals that each showed a high diversity of genomes in early nasal lavages, only five showed a high diversity of infectious viruses in these sampled populations. In the remaining seven, the infectious viral populations detected early matched the low diversity populations that persisted through the course of infection. We postulate that the discrepancy seen in these seven samples relates to the detection in bulk nasal lavage of viral genomes within infected cells and within released particles that lack portions of the genome. In line with the heterogeneity of viral burst size [36–39] and composition [38–42] at a single-cell level, many infected cells will not produce progeny viruses that can be detected by plaque assay. Thus, these seven samples appear to have captured a time point in which many unique barcodes were being replicated within infected cells, but the bottleneck had already occurred at the level of the released viral population. Conversely, the five samples that showed concordance between bulk nasal lavage and plaque isolates captured an earlier phase of the transition from high to low diversity.

Based on our results, we propose a model in which many viruses are delivered to the recipient and initiate infection in the nasal cavity, such that their genomes are replicated to a level detectable by sequencing. Among this first generation of infected cells, only a subset produces large bursts of progeny viruses early, in part due to the high frequency with which influenza virus genomes replicated within singly infected cells are incomplete [36–39,43,44]. The progeny from productive cells initiate a second round of infection, which is again subject to heterogeneity. Within these very first generations of viral amplification, the few genotypes replicated rapidly and to high levels quickly dominate both infected-cell and released-virus populations. The eventual death of producer cells precipitates loss of barcode diversity in the infected-cell population. Among released viruses, competition for new target cells leads to the extinction of many minor lineages [45,46]. Innate antiviral immunity likely hastens the contraction of barcode diversity both in infected cells and in released viruses [47,48]. Within these contractions, some minor lineages may persist owing to long-lived cells [49,50] or spatial isolation [51–53]. Our data suggest that such persistent lineages can expand late in infection as dominant lineages are cleared, yielding a transient rebound in barcode diversity.

In contrast to recipient animals infected through transmission, we found only modest reductions in diversity in donor animals during infection. This finding extended to animals inoculated with low doses, where a bottleneck acting at inoculation was apparent, but a growth-induced bottleneck acting during the establishment of infection was not. Since the kinetics of viral expansion may differ with infection route, this difference may be attributable to the timing of sampling. Alternatively, viral population dynamics may differ between animals infected through inoculation and transmission owing to differences in the mode of viral deposition. For example, viral aggregation state may differ, which would modify the potential for coordinated infection and, in turn, alter the extent of heterogeneity in viral burst size [38,54]. Alternatively, or in addition, the size of the epithelial area across which viruses are delivered may differ. If this area is greater following intranasal instillation, the ensuing spatial separation of initial sites of infection may allow more lineages to establish within the host [51–53].

Seasonal influenza virus evolution at large geographic and temporal scales is characterized by a clear pattern of positive selection: on a recurring basis, antigenically distinct variants sweep the global viral population, driving epidemic spread [55–57]. In contrast, the within-host evolutionary dynamics of influenza viruses show strong genetic drift and purifying selection; positive selection has rarely been observed [13,14,58,59]. A similar dichotomy is true for SARS-CoV-2 at within-host and population scales [60–63]. Our data suggest that these scales may be linked by a transmission process in which selective forces have an opportunity to act before potent stochastic forces within hosts. For example, antibodies present at mucosal surfaces could act on an antigenically diverse incoming viral population, mediating antigenic selection. Only positively selected variants would then be available to pass through the stochastic bottleneck associated with population expansion. Experiments performed in pre-immune hosts are needed to formally test this model.

There are some limitations of our study that are important to consider. First, the transmission model used places animals in close proximity at a time when viral loads are high in donors. This approach is designed to achieve transmission to a high proportion of exposed animals, but the extent to which these conditions reflect those in humans is not clear. If they support transmission, many human interactions would impose tight bottlenecks during inter-host transfer of respiratory viruses [64].

Second, processing of plaque-negative nasal wash samples by PCR does not yield a detectable amplification product but does reveal barcodes upon next-generation sequencing. These spurious barcodes are not reproducible across samples or upon resequencing of the same negative sample, indicating that they are neither biologically meaningful nor due to a common contaminant. Nevertheless, their presence frustrates comparison of results to a traditional negative control.

Third, the technical limitations of next-generation sequencing leave some uncertainty in viral population composition and therefore in the relationships between linked populations. Despite substantial read depth, the approach used cannot be assumed to comprehensively sample the barcodes of a diverse, high-titer population. More thorough sampling is, however, expected for populations with lower diversity or lower titer. Subsampling of reads partially corrects for these discrepancies. Nonetheless, our data show a minority of barcodes in recipients that are not detected in the corresponding donors. Since barcodes cannot arise de novo, these discrepancies likely relate to differences in sampling intensity among nasal lavages with substantially different viral titers.
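The subsampling step mentioned above can be illustrated with a short sketch. The exact procedure and target depth used in the analysis are not given here, so the approach shown (drawing a fixed number of reads without replacement from each sample) and the example counts and depth are assumptions for illustration only.

```python
import random
from collections import Counter

def subsample_counts(counts, depth, seed=0):
    """Randomly draw `depth` reads without replacement from {barcode: read count}."""
    rng = random.Random(seed)
    # Expand the count table into one entry per read, then sample from it.
    reads = [barcode for barcode, n in counts.items() for _ in range(n)]
    if depth > len(reads):
        raise ValueError("requested depth exceeds the number of reads in the sample")
    return Counter(rng.sample(reads, depth))

# Hypothetical samples sequenced to very different depths.
deep_sample = {"bc01": 9000, "bc02": 900, "bc03": 90, "bc04": 10}
shallow_sample = {"bc01": 450, "bc02": 50}
common_depth = 500  # hypothetical common depth
deep_sub = subsample_counts(deep_sample, common_depth)
shallow_sub = subsample_counts(shallow_sample, common_depth)
print(dict(deep_sub), dict(shallow_sub))
```

Bringing samples to a common depth makes richness estimates more comparable, although barcodes that are rare in a deeply sequenced sample may be lost in the process.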

In conclusion, our data reveal a two-stage transmission process in which transfer between hosts can include a large and diverse viral population, but early events in the recipient host impose a stringent and stochastic bottleneck. Transmission may therefore represent an opportunity for selection to operate in the earliest phases of infection before any single genotype sweeps the population. This effect is expected to have a strong influence on viral evolution and is likely relevant across diverse viral families.

Materials and methods

Ethical considerations

All the animal experiments were conducted in accordance with the Guide for the Care and Use of Laboratory Animals of the National Institutes of Health. The studies were conducted under animal biosafety level 2 containment and approved by the IACUC of Emory University (PROTO201700595) for the guinea pig (Cavia porcellus). The animals were humanely euthanized following guidelines approved by the American Veterinary Medical Association.

Cells

Madin–Darby canine kidney (MDCK) cells were a gift from Dr. Daniel Perez, University of Georgia, Athens, GA. A seed stock of MDCK cells at passage 23 was amplified and maintained in Minimal Essential Medium (Gibco) supplemented with 10% fetal bovine serum (FBS; Atlanta Biologicals) and Normocin (Invivogen). 293T cells (ATCC, CRL-3216) were maintained in Dulbecco’s Minimal Essential Medium (Gibco) supplemented with 10% FBS and Normocin. All cells were cultured at 37 °C and 5% CO₂ in a humidified incubator. The cell lines were not authenticated. All cell lines were tested monthly for Mycoplasma contamination while in use. The medium for the culture of influenza A virus in MDCK cells (virus medium) was prepared by supplementing Minimal Essential Medium with 4.3% bovine serum albumin (BSA; Sigma) and Normocin.

Generation of Pan/99 NA-BC plasmid

The region of Pan/99 into which the barcode was inserted was first identified by aligning 593 sequences of H3N2-subtype influenza A viruses isolated between 1994 and 2004. A region with many nucleotide substitutions was identified in the NA segment from nucleotide position 484–532. Twelve synonymous mutations were identified within this region. Mutations used for the barcode are listed in S1 Data for Fig 1A. Nucleotide changes are numbered with respect to the beginning of the 3′-UTR of the NA segment, and amino acids are numbered based on the start codon. Double-stranded Ultramers (IDT) were designed that contained degenerate bases with two possible nucleotides at each of the 12 chosen barcode sites (cacagtacatgataggaccccttaycgraccytattgatgaatgarttrggtgtyccattycayytrggracyaagcaagtgtgtatagcatggtcc).
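The size of the barcode library implied by this design can be checked directly from the Ultramer sequence. The following is a minimal sketch, assuming only the standard IUPAC meanings of the two degenerate codes that appear in the sequence (r = a/g, y = c/t); the function names are illustrative and are not part of BarcodeID.

```python
# IUPAC two-fold degenerate codes present in the Ultramer (assumed standard meanings)
DEGENERATE = {"r": "ag", "y": "ct"}

# Ultramer sequence as given in the text, split across two lines for readability
ULTRAMER = ("cacagtacatgataggaccccttaycgraccytattgatgaatgarttrggtgty"
            "ccattycayytrggracyaagcaagtgtgtatagcatggtcc")


def count_degenerate_sites(seq):
    """Count positions that carry a degenerate (two-nucleotide) code."""
    return sum(1 for base in seq if base in DEGENERATE)


def library_size(seq):
    """Number of distinct sequences encoded: product of choices per position."""
    size = 1
    for base in seq:
        # Fixed bases contribute one choice; degenerate codes contribute two
        size *= len(DEGENERATE.get(base, base))
    return size


if __name__ == "__main__":
    print(count_degenerate_sites(ULTRAMER), "degenerate sites")
    print(library_size(ULTRAMER), "possible barcodes")
```

With two possible nucleotides at each of the 12 sites, the theoretical library contains 2^12 = 4,096 distinct barcodes.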

Site-directed mutagenesis was used to insert an Xho1 restriction site into the wild-type reverse genetics plasmid, pDP Pan/99 NA, prior to barcode insertion. This was done to give a means of destroying the parental template following barcode insertion. To generate a linearized template for barcode insertion, the following steps were performed. Xho1 digestion of the plasmid stock was followed by phosphatase treatment using rSAP (NEB) to dephosphorylate the cut ends of the plasmid. The plasmid was then amplified by PCR using primers that extend outward from the barcode region: P99_NA_536F 5′-caagtgtgtatagcatggtcc-3′ and P99_NA_479R 5′-gggtcctatcatgtactgtg-3′. PCR purification (Qiagen QIAquick PCR Purification Kit) was used to isolate the linearized PCR product, followed by a dual digestion with Dpn1 and Xho1 to remove residual WT plasmid. PCR purification was repeated, and then an assembly reaction using the NEBuilder HiFi DNA Assembly Kit (NEB) was performed to insert the Ultramers into the linearized vector and re-circularize. The product was then transformed into DH5-α cells (NEB). After plating onto LB-amp plates, approximately 1 × 10^4 colonies were collected and pooled into LB-amp culture media and then incubated at 37 °C for five hours prior to harvesting the bacterial population for plasmid purification by maxiprep (Qiagen Plasmid Maxi Kit). The presence of a diverse barcode in the plasmid stock was verified by next-generation sequencing (see below). This plasmid stock was then used to generate Pan/99 NA-BC virus by reverse genetics in combination with seven plasmids encoding Pan/99 WT gene segments in a pDP2002 vector [65].

Viruses

Pan/99 WT and Pan/99 NA-BC viruses were derived from influenza A/Panama/2007/99 (H3N2) virus (Pan/99 WT) and were generated by reverse genetics. In brief, 293T cells transfected with eight reverse genetics plasmids 16–24 h previously were co-cultured with MDCK cells in virus medium at 33 °C for 40 h. Recovered virus was propagated in MDCK cells to generate working stocks. Propagation was carried out from low MOI to avoid accumulation of defective viral genomes, but with a sufficient viral population size to maintain barcode diversity. Titration of stocks and experimental samples was carried out by plaque assay in MDCK cells. The presence of a diverse barcode in the virus stock was verified by next-generation sequencing (see below).

Growth kinetics

Replication of Pan/99 WT and Pan/99 NA-BC was determined in triplicate culture wells. MDCK cells in six-well dishes were inoculated at an MOI of 0.01 PFU/cell in phosphate-buffered saline (PBS). After 1 h incubation at 33 °C, inoculum was removed, cells were washed with PBS, 2 mL virus medium was added to cells, and dishes were returned to 33 °C. A 120 μL volume of culture medium was sampled at the indicated time points and stored at −80 °C. Viral titers were determined by plaque assay on MDCK cells.

Guinea pig infections

Female Hartley strain guinea pigs weighing 250–350 g were obtained from Charles River Laboratories and housed by Emory University Department of Animal Resources. Guinea pigs were sedated with ketamine (30 mg/kg) and xylazine (4 mg/kg) by intramuscular injection prior to intranasal inoculation or nasal lavage. Virus inoculum was given intranasally in a 300 μL volume of PBS containing 5 × 10^4 PFU of Pan/99 NA-BC. At day 1 post-inoculation, one naïve animal was placed with each inoculated animal in either a cage that allowed for direct physical contact (to model transmission by all modes) or a cage in which the two animals were separated by a double-walled, perforated metal barrier (to model transmission via airborne infectious respiratory particles). Nasal lavage was performed with 1 mL PBS per animal on days 1–7 post-inoculation for inoculated animals and days 2–8 post-inoculation for exposed animals. Collected fluid was divided into aliquots and stored at −80 °C. Viral titers of nasal lavage samples were subsequently determined by plaque assay.

For the dose escalation experiment (S5 Fig), guinea pigs were obtained and sedated as described above. Virus inoculum was given intranasally in a 30 μL volume of PBS containing 1 × 10^2 PFU, 3.3 × 10^2 PFU, 1 × 10^3 PFU, 3.3 × 10^3 PFU, 1 × 10^4 PFU, or 5 × 10^4 PFU of Pan/99 NA-BC. Nasal lavage was performed with 1 mL PBS per animal on days 1–4 post-inoculation, and samples were stored as described above.

Sample processing for next-generation sequencing of barcode region

The following method was used to validate plasmid and virus stocks and to evaluate experimental samples. Viral RNA extraction (QiaAmp Viral RNA kit, Qiagen) was performed using a 140 μL volume of each nasal lavage sample or virus stock, followed by reverse transcription (Maxima RT, Thermo Fisher) with pooled Univ.F(A)+6 and Univ.F(G)+6 primers [46,66]. Either cDNA or, for the purpose of plasmid validation, plasmid DNA was subjected to PCR amplification with primers flanking the barcode region. Samples were sent for amplicon sequencing to either Azenta Life Sciences or the Emory National Primate Research Center (ENPRC) Genomics Core. For samples sent to Azenta, PCR to amplify the region containing the barcode was performed using PFU Turbo AD (ThermoFisher) with primers with partial sequencing adapters that generated a 155-nt product (P99_NA_adptr_428-449_F 5′- acactctttccctacacgacgctcttccgatctggaacaacactaaacaacaggc-3′ and P99_NA_adptr_563-582_F 5′- gactggagttcagacgtgtgctcttccgatctgcttttccatcgtgacaact-3′). For samples sequenced by ENPRC, amplicon PCR was performed in a similar manner but using primers with sequencing adapters that generated a 100-nt product (P99NA_462F5_adptr 5′- tcgtcggcagcgtcagatgtgtataagagacagggactcctcagtacatgataggacccctt-3′ and P99NA_553Rev_adptr 5′- gtctcgtgggctcggagatgtgtataagagacagccatgctatacacacttgct-3′). Thirty PCR cycles were performed for all samples. Column-based PCR purification was performed on all samples followed by quantification of DNA using either NanoDrop or Qubit. Samples were normalized to 20 ng/µL in nuclease-free H2O.

Samples submitted to Azenta underwent Amplicon-EZ sequencing, an Illumina-based sequencing service compatible with amplicons of 150–500 nt in length that does not include a fragmentation step in library preparation. Amplicons were sequenced as 2 × 250 bp paired reads and demultiplexed prior to delivery. At ENPRC, library preparation was performed with the omission of a tagmentation step and sequencing was performed on an Illumina NovaSeq 6000 platform. Amplicons were sequenced as 2 × 100 bp paired reads and demultiplexed prior to delivery.

Analysis of barcode sequencing data

Sequences were processed using our custom software, BarcodeID, available in the GitHub repository: https://github.com/Lowen-Lab/BarcodeID. Briefly, BarcodeID uses BBTools [35] to process raw reads, then uses a custom Python script to screen and identify the barcode sequences present in each sample, performs an error correction step to remove spurious barcode sequences, and finally calculates diversity statistics and writes summary tables. BBMerge screens for adapter sequences and merges forward and reverse reads using default settings, and BBDuk removes reads with a low average quality (<30). BarcodeID then screens each read to verify that the nucleotides at barcode and non-barcode sites match the nucleotides expected at those sites. Reads with mismatches are excluded from overall barcode sequence counts, but BarcodeID collects all high-quality variant amplicons and calculates overall mismatch rates by site to determine whether any mutants with non-barcode alleles are driving the observed barcode dynamics. Samples with evidence of non-barcode-driven dynamics were excluded from further analyses.
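The read-screening step can be illustrated with a short sketch. It assumes reads are already merged, quality-filtered, and aligned position by position to a template written with IUPAC codes; the template in the usage example is a made-up eight-base string, and `screen_read` is a hypothetical helper, not the BarcodeID implementation.

```python
# Allowed nucleotides for each template code (only the codes used by the barcode design)
IUPAC = {"a": "a", "c": "c", "g": "g", "t": "t", "r": "ag", "y": "ct"}


def screen_read(read, template):
    """Return the barcode carried by a read, or None if any site mismatches.

    Backbone (non-degenerate) positions must equal the template base, and
    barcode (degenerate) positions must hold one of the two allowed bases.
    """
    if len(read) != len(template):
        return None
    barcode = []
    for base, code in zip(read.lower(), template.lower()):
        if base not in IUPAC[code]:
            return None  # mismatch at a backbone or barcode site
        if code in DEGENERATE_CODES:
            barcode.append(base)
    return "".join(barcode)


# Codes that mark barcode positions in the template
DEGENERATE_CODES = ("r", "y")

if __name__ == "__main__":
    toy_template = "acyggrta"  # hypothetical template with two barcode sites
    print(screen_read("actgggta", toy_template))  # read matching at every site
    print(screen_read("actgcgta", toy_template))  # read with a backbone mismatch
```

In this sketch, mismatching reads are simply dropped; the tallying of mismatch rates by site described above would require keeping those reads in a separate collection.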

The error correction step was created to emulate DADA2 [67], the error correction software designed for bacterial 16S rRNA amplicons. The largest difference between the two methods is how the nucleotide substitution model is generated. Whereas DADA2 infers this model with a complicated statistical method, BarcodeID measures it directly from the backbone sites, which should deviate from the expected sequence only through mutation or through errors accumulated during library preparation and sequencing. Specifically, it measures the rate at which bases with a specific quality score match or do not match the expected base at each backbone site (λji,Q). For example, the rate of A to T with a quality score of 25 (λTA,25) is the number of times a T with quality score of 25 was observed at a backbone site where A was expected, divided by the number of observations of all bases with a quality score of 25 at those sites. Using this error model and the observed prevalence of barcode sequences, BarcodeID estimates the probability that any barcode sequence could be derived from another through error. This probability is calculated by:
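The empirical substitution model described above amounts to a table of counts. The sketch below assumes that backbone observations are available as (expected base, observed base, quality score) triples and that the denominator is the number of observations with that quality score at sites where the same base was expected; both the input layout and this reading of the denominator are assumptions, not details taken from BarcodeID.

```python
from collections import Counter


def backbone_error_rates(observations):
    """Estimate quality-specific substitution rates from backbone sites.

    observations: iterable of (expected_base, observed_base, quality) triples.
    Returns a dict keyed by (observed_base, expected_base, quality), following
    the subscript order of the worked example in the text (T observed, A expected).
    """
    pair_counts = Counter()
    totals = Counter()  # observations per (expected base, quality score)
    for expected, observed, quality in observations:
        pair_counts[(observed, expected, quality)] += 1
        totals[(expected, quality)] += 1
    return {
        (observed, expected, quality): n / totals[(expected, quality)]
        for (observed, expected, quality), n in pair_counts.items()
    }


if __name__ == "__main__":
    # Toy data: 100 observations at A sites with quality 25, two of them read as T
    obs = [("a", "a", 25)] * 98 + [("a", "t", 25)] * 2 + [("c", "c", 25)] * 50
    rates = backbone_error_rates(obs)
    print(rates[("t", "a", 25)])
```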

where the probability of a barcode sequence j becoming barcode sequence i is Poisson distributed, and determined by the prevalence of both barcodes, nj and ni, and the product of probabilities of observing all nucleotide changes required to mutate barcode j into i.
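For reference, the abundance p-value defined for DADA2 [67], which this step emulates, has the form below. It is offered as a sketch that agrees with the description above (a Poisson model governed by n_j, n_i, and the product of per-site substitution probabilities); the exact expression implemented in BarcodeID may differ, and the symbols ρ_pois, a, L, and q_i(l) are notation borrowed for illustration rather than taken from this text.

```latex
p_{i>j} \;=\; \frac{1}{1-\rho_{\mathrm{pois}}\!\left(n_j\lambda_{ji},\,0\right)}
\sum_{a=n_i}^{\infty}\rho_{\mathrm{pois}}\!\left(n_j\lambda_{ji},\,a\right),
\qquad
\rho_{\mathrm{pois}}(\mu,a)=\frac{\mu^{a}e^{-\mu}}{a!},
\qquad
\lambda_{ji}=\prod_{l=1}^{L}\lambda_{\,j(l)\,i(l),\,q_i(l)}
```

Here the sum gives the probability of seeing at least n_i error-derived copies of barcode i when n_j copies of barcode j are present, and the leading factor conditions on barcode i having been observed at least once.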

BarcodeID uses the same agglomerative clustering process as DADA2, which iteratively identifies barcodes that are too abundant to be explained by error and collapses erroneous barcodes into larger barcode clusters. To initiate the agglomeration process, the most abundant barcode sequence in a sample, with its associated quality score values, is chosen as the initial seed barcode. BarcodeID then calculates the probability that every observed barcode sequence-quality combination could have been observed through mutation of the initial seed sequence (pi,j where j is the seed barcode and i is the query barcode). Barcodes are then ranked in order of increasing probability of observation due to error. If the probability for the highest-ranked barcode (the one least likely to have arisen through error) falls below a Bonferroni-adjusted p-value threshold, then that barcode is used to create a new barcode cluster seed. This iterative process of adding the top seed and then re-calculating the probability of observation due to error for all remaining barcodes, relative to the growing list of seed barcodes, continues until no more new seeds are found. Then, BarcodeID enters the agglomeration phase, where query barcodes are sorted in decreasing order of probability of observation due to error, and the barcode with the highest probability of being an error derived from a seed barcode is added to that seed barcode’s cluster. This process again proceeds iteratively until no barcodes can be collapsed into existing barcode clusters, at which point the remaining barcodes are included as new seeds until all observed barcodes are included. Importantly, the Pi>j of barcodes that are observed once will always be equal to 1, and these barcodes are therefore unable to be clustered. Accordingly, singleton barcodes are excluded from calculating diversity statistics.
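The seed test in this procedure can be sketched numerically. The code below assumes the DADA2-style abundance p-value (a Poisson tail conditioned on at least one observation) and a significance level of 0.05 before Bonferroni adjustment; both are assumptions made for illustration, and the function names are hypothetical rather than BarcodeID's own.

```python
import math


def poisson_pmf(mu, a):
    """Poisson probability of exactly a events with mean mu > 0 (log-space for stability)."""
    return math.exp(-mu + a * math.log(mu) - math.lgamma(a + 1))


def abundance_p_value(n_i, n_j, lam_ji):
    """Probability of seeing at least n_i error-derived reads of barcode i.

    n_j: read count of the seed barcode j
    lam_ji: per-read probability that j is misread as i (must be > 0)
    The result is conditioned on barcode i being observed at least once.
    """
    mu = n_j * lam_ji
    tail = 1.0 - sum(poisson_pmf(mu, a) for a in range(n_i))
    return tail / (1.0 - poisson_pmf(mu, 0))


def is_new_seed(p_value, n_candidates, alpha=0.05):
    """Bonferroni-adjusted test: too abundant to be explained by error?"""
    return p_value < alpha / n_candidates


if __name__ == "__main__":
    # A singleton is always consistent with error under this model
    print(abundance_p_value(1, 1000, 1e-4))
    # Fifty reads from a seed expected to yield 0.1 error-derived reads
    p = abundance_p_value(50, 1000, 1e-4)
    print(is_new_seed(p, n_candidates=100))
```

Under this form, a barcode observed once always returns a value of exactly 1, which is consistent with the statement above that singleton barcodes cannot be evaluated by the clustering procedure.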

To account for uneven sampling intensity, BarcodeID performs repeated subsampling and linear regression to calculate diversity statistics that are more directly comparable between samples. Specifically, samples were subsampled at up to 19 different read depths (between 10^3 and 10^6 reads, where possible), with five iterations per subsampled depth. The median value of each diversity statistic at each depth was then used to fit the relationship between each diversity statistic and sampling intensity. The fit was then used to calculate an adjusted value for all samples at an equivalent depth. For alpha diversity statistics (richness, Shannon diversity, Simpson diversity, and evenness), the equivalent depth was 10,000 reads per sample because it provided sufficient depth to distinguish samples across multiple levels of diversity while minimizing bias due to variable sampling intensity. For beta diversity statistics (Bray–Curtis and Jaccard dissimilarity), 50,000 reads per sample was chosen because the added depth improved resolution in pairwise sample comparisons.
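The subsample-and-fit procedure can be sketched as follows for Shannon diversity. The sketch assumes subsampling without replacement and an ordinary least-squares fit of the median statistic against log10 read depth; the choice of predictor scale, the random seed, and all function names are assumptions for illustration, and the depths in the usage example are far smaller than those used in the study.

```python
import math
import random
from collections import Counter
from statistics import median


def shannon(counts):
    """Shannon diversity (natural log) from a barcode -> read count mapping."""
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values())


def subsample(counts, depth, rng):
    """Draw `depth` reads without replacement and recount barcodes."""
    reads = [barcode for barcode, n in counts.items() for _ in range(n)]
    return Counter(rng.sample(reads, depth))


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def adjusted_statistic(counts, depths, target_depth, stat=shannon,
                       iterations=5, seed=0):
    """Median statistic per depth, fitted against log10(depth), read off at target."""
    rng = random.Random(seed)
    total = sum(counts.values())
    usable = [d for d in depths if d <= total]  # "where possible"
    medians = [
        median(stat(subsample(counts, d, rng)) for _ in range(iterations))
        for d in usable
    ]
    slope, intercept = fit_line([math.log10(d) for d in usable], medians)
    return slope * math.log10(target_depth) + intercept


if __name__ == "__main__":
    sample = Counter({"bc1": 1000, "bc2": 1000, "bc3": 1000, "bc4": 1000})
    print(adjusted_statistic(sample, [100, 300, 1000, 3000], target_depth=1000))
```

At least two usable depths are needed for the fit; a sample with fewer reads than the second-smallest depth would need separate handling.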

Validation of barcode sequencing data

Samples were processed for barcode sequencing if an infectious virus could be detected therein by plaque assay (which has a limit of detection of 50 PFU/mL). We found that many reads were obtained and analyzed for plaque-positive samples, irrespective of viral titer. To evaluate the reproducibility of sequencing results, samples from three guinea pigs were sequenced in triplicate. This set of samples includes high titer, low titer, and plaque negative samples. For each sample, the original RNA extract was processed independently three times to produce, amplify, purify, and sequence the cDNA. Results indicate that barcode composition detected in plaque-positive samples is highly reproducible, while barcode composition detected in the plaque-negative samples is not reproducible (S4C Fig).

Preparation of nasal lavage samples for viral whole-genome sequencing

Viral RNA extraction (QiaAmp Viral RNA kit, Qiagen) was performed using a 140 μL volume of each nasal lavage sample, followed by one-step reverse transcription PCR amplification of full viral genomes using pooled Univ.F(A)+6, Univ.F(G)+6 primers and Univ.R primers and SuperScript III Platinum kit (ThermoFisher) [46,66]. Following PCR purification (Qiagen QiaQuick PCR Purification Kit), cDNA was processed at the ENPRC core for sequencing on an Illumina NovaSeq 6000 platform. Samples were sequenced as 2 × 100 bp paired reads and demultiplexed prior to delivery.

Analysis of whole-genome sequencing data

Whole genome sequencing reads were merged and filtered to retain those with an average quality of ≥30 using BBMerge, then separated according to segment using BLAT [68]. The reads were then mapped to their corresponding reference segment using BBMap, with local alignment set to false. From these alignments, we used custom Python scripts to identify iSNVs and reads that map to the barcoded region of the NA segment. Cutoffs for inclusion of iSNVs were set empirically. First, sites were evaluated based on their total coverage and their average quality and mapping statistics. Only sites with ≥100× coverage were considered. For minor variants at these sites to be included in subsequent analyses, they were required to be present at ≥1% frequency and have an average phred score of ≥35, and the reads that contained the minor allele at any given site also had to have sufficient mapping quality to justify inclusion. Specifically, reads containing the minor allele needed an average mapping quality score of ≥40, the average location of the minor allele needed to be ≥20 bases from the nearest end of a read, and the reads overall needed to have ≤2.0 average mismatch and indel counts relative to the reference sequence.
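The inclusion criteria listed above can be expressed as a single predicate. This is a minimal sketch using the thresholds stated in the text; the `MinorVariant` record and its field names are hypothetical and do not reflect the data structures of the authors' scripts.

```python
from dataclasses import dataclass


@dataclass
class MinorVariant:
    coverage: int                   # total read depth at the site
    frequency: float                # minor allele frequency (0-1)
    mean_phred: float               # average base quality of the minor allele
    mean_mapq: float                # average mapping quality of supporting reads
    mean_dist_from_read_end: float  # average distance to nearest read end (bases)
    mean_mismatch_indel: float      # average mismatch + indel count per read


def passes_isnv_filters(v):
    """Apply the coverage, frequency, quality, and mapping cutoffs from the text."""
    return (
        v.coverage >= 100
        and v.frequency >= 0.01
        and v.mean_phred >= 35
        and v.mean_mapq >= 40
        and v.mean_dist_from_read_end >= 20
        and v.mean_mismatch_indel <= 2.0
    )


if __name__ == "__main__":
    candidate = MinorVariant(500, 0.03, 37.0, 42.0, 35.0, 1.2)
    print(passes_isnv_filters(candidate))
```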

Analysis of beta diversity

Dissimilarity between two populations can be measured using beta diversity metrics, and in this study, we used both the Jaccard index and the Bray–Curtis dissimilarity [69,70]. Both metrics consider the species shared between two populations, and in this case, unique barcodes were considered species. In addition to presence/absence data, the Bray–Curtis dissimilarity also reflects abundance data. For either measure, a value closer to one indicates that the two populations are more dissimilar, whereas a value closer to zero indicates that the two populations are more alike in composition. Pairwise comparisons of barcode data were made between the viral populations present in plaque-positive nasal lavage samples acquired from a contact pair of guinea pigs.
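Both measures are simple to compute from barcode counts. The sketch below assumes each population is a mapping from barcode to read count and that Bray–Curtis is computed on relative frequencies, which is equivalent to using raw counts when both samples have the same depth; the barcode names and counts in the usage example are invented.

```python
def jaccard_dissimilarity(a, b):
    """1 - (shared barcodes / all barcodes), using presence/absence only."""
    present_a = {k for k, n in a.items() if n > 0}
    present_b = {k for k, n in b.items() if n > 0}
    return 1.0 - len(present_a & present_b) / len(present_a | present_b)


def bray_curtis_dissimilarity(a, b):
    """1 - sum of the smaller relative frequency of each barcode."""
    total_a, total_b = sum(a.values()), sum(b.values())
    shared = sum(
        min(a.get(k, 0) / total_a, b.get(k, 0) / total_b)
        for k in set(a) | set(b)
    )
    return 1.0 - shared


if __name__ == "__main__":
    donor = {"bcA": 50, "bcB": 30, "bcC": 20}
    recipient = {"bcA": 90, "bcD": 10}
    print(jaccard_dissimilarity(donor, recipient))      # 1 of 4 barcodes shared
    print(bray_curtis_dissimilarity(donor, recipient))  # weighted by abundance
```

In this toy pair, only one of four barcodes is shared, so the presence/absence measure is high, while the abundance-weighted measure is lower because the shared barcode is common in both populations.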

Supporting Information

S1 Fig. Pan/99 NA-BC infects inoculated animals and transmits to exposed animals.

A) Schematic showing transmission set-up in guinea pigs. For each of the three experimental replicates, eight guinea pigs were intranasally inoculated with 5 × 10^4 PFU of Pan/99 NA-BC in 300 µL. Twenty-four hours post-inoculation, a single naïve animal was placed with each inoculated animal. Cages either allowed for direct contact (n = 4) or maintained separation with a double-walled, perforated metal barrier (n = 4). B) Viral titers in nasal lavage samples in direct contact and aerosol exposure settings from replicates 1, 2, and 3. Inoculated animals are shown in blue and exposed animals in red (direct contact) or yellow (aerosol contact). The dashed black line represents the limit of detection (50 PFU/mL). Paired animals share the same line type. Negative results are plotted at the limit of detection. S1A Fig created in BioRender. Underlying data are available in S2 Data.

https://doi.org/10.1371/journal.pbio.3003352.s001

(PDF)

S2 Fig. Population diversity declines between inoculated and exposed guinea pigs.

Data from experimental replicates 1, 2, and 3 are shown in panels A, B, and C, respectively. Guinea pig ID numbers are shown in the upper right corner of each plot. Nasal lavage titers are indicated by the total height of each bar. Colors within the bars represent unique barcodes, and the height of each color indicates the relative frequency within the sample. Only samples that were plaque-positive are shown. Red lines show LOD of 50 PFU/mL. Plots for individual animals are paired with those of their cage mate. For the exposed animal in the first aerosol transmission pair of Replicate 1 (GP10), most reads were discarded due to poor quality, and the data from this animal were excluded from further analyses. Data underlying this figure are available in S2 Data and at https://doi.org/10.5281/zenodo.16115331.

https://doi.org/10.1371/journal.pbio.3003352.s002

(PDF)

S3 Fig. Changes in richness and evenness both contribute to alterations in diversity.

Shannon diversity (left), richness (right, left axis), and evenness (right, right axis) were determined for inoculated animals (i, blue) and exposed animals (e, red) in replicate experiments 1, 2, and 3 (A, B, and C, respectively). Data from cage mates are paired. In the right-hand plots, bold colors show richness and faded colors show evenness. Line widths show 95% confidence intervals based upon subsampling of barcode reads. Underlying data available in S2 Data.

https://doi.org/10.1371/journal.pbio.3003352.s003

(PDF)

S4 Fig. Targeted amplicon sequencing of the barcode region is reliable and reproducible.

A) Whole-genome sequencing (WGS) was performed on a subset of samples from experimental replicates 1 and 2. WGS reads that spanned the length of the barcode region were compared to targeted barcode sequencing (amplicon or Amp) results for each sample. Each graph is labeled with the experiment number (E1 or E2), the guinea pig number (GP01, GP03, etc.), and the day post-inoculation (D1, D4, etc.). The number of reads mapping to the barcode region is indicated above the stacked bar plots. B) A subset of positive nasal wash samples from two animals (GP02, GP27) were sequenced in triplicate to evaluate reproducibility of sequencing results. These were compared to an animal that was plaque-negative throughout the experiment (GP25). C) Bray–Curtis dissimilarity values between replicate sequencing runs from each sample are plotted for the plaque-negative animal and the plaque-positive animals (*** p-value < 0.0005, Wilcoxon signed-rank test). Underlying data can be found in S2 Data and at https://doi.org/10.5281/zenodo.16115331.

https://doi.org/10.1371/journal.pbio.3003352.s004

(PDF)

S5 Fig. Growth-induced bottlenecks are not detected in inoculated animals, even at low inoculation doses.

At each inoculation dose indicated above the plots, three guinea pigs were inoculated intranasally with Pan/99 NA-BC. A) Nasal lavage titers and barcode compositions over time. The height of the bars indicates viral titer in samples above the limit of detection (50 PFU/mL). Colors indicate barcodes detected, with barcode frequency shown by the height of the color. B) Maximum measured Shannon diversity, richness, and evenness of barcode compositions in each animal are plotted by inoculation dose (blue). For comparison, theoretical data are plotted to show the expected characteristics of inocula of each size derived from a perfectly even population of 4,096 barcodes (red) and from the Pan/99 NA-BC passage 1 stock (orange). Ninety-five percent confidence intervals are shown with shading. Statistical significance was determined at each inoculation dose by Kruskal–Wallis test (* p-value < 0.01, ** p-value < 0.001) comparing the medians of the experimental data with those of the perfectly even population and those of the passage 1 viral stock. Underlying data can be found in S2 Data and at https://doi.org/10.5281/zenodo.16115331.

https://doi.org/10.1371/journal.pbio.3003352.s005

(PDF)

Acknowledgments

We are grateful to Katia Koelle for helpful discussion.

References

  1. Nobusawa E, Sato K. Comparison of the mutation rates of human influenza A and B viruses. J Virol. 2006;80(7):3675–8. pmid:16537638
  2. Smith DB, Inglis SC. The mutation rate and variability of eukaryotic viruses: an analytical review. J Gen Virol. 1987;68(Pt 11):2729–40. pmid:3316486
  3. Boni MF, Zhou Y, Taubenberger JK, Holmes EC. Homologous recombination is very rare or absent in human influenza A virus. J Virol. 2008;82(10):4807–11. pmid:18353939
  4. Suarez DL, Senne DA, Banks J, Brown IH, Essen SC, Lee C-W, et al. Recombination resulting in virulence shift in avian influenza outbreak, Chile. Emerg Infect Dis. 2004;10(4):693–9. pmid:15200862
  5. Marshall N, Priyamvada L, Ende Z, Steel J, Lowen AC. Influenza virus reassortment occurs with high frequency in the absence of segment mismatch. PLoS Pathog. 2013;9(6):e1003421. pmid:23785286
  6. Barbezange C, Jones L, Blanc H, Isakov O, Celniker G, Enouf V, et al. Seasonal genetic drift of human influenza A virus quasispecies revealed by deep sequencing. Front Microbiol. 2018;9:2596. pmid:30429836
  7. Visher E, Whitefield SE, McCrone JT, Fitzsimmons W, Lauring AS. The mutational robustness of influenza A virus. PLoS Pathog. 2016;12(8):e1005856. pmid:27571422
  8. Simon B, Pichon M, Valette M, Burfin G, Richard M, Lina B, et al. Whole genome sequencing of A(H3N2) influenza viruses reveals variants associated with severity during the 2016–2017 season. Viruses. 2019;11(2):108. pmid:30695992
  9. Nelson MI, Holmes EC. The evolution of epidemic influenza. Nat Rev Genet. 2007;8(3):196–205. pmid:17262054
  10. Spielman SJ, Weaver S, Shank SD, Magalis BR, Li M, Kosakovsky Pond SL. Evolution of viral genomes: interplay between selection, recombination, and other forces. Methods Mol Biol. 2019;1910:427–68. pmid:31278673
  11. Lynch M, Ackerman MS, Gout J-F, Long H, Sung W, Thomas WK, et al. Genetic drift, selection and the evolution of the mutation rate. Nat Rev Genet. 2016;17(11):704–14. pmid:27739533
  12. Rambaut A, Pybus OG, Nelson MI, Viboud C, Taubenberger JK, Holmes EC. The genomic and epidemiological dynamics of human influenza A virus. Nature. 2008;453(7195):615–9. pmid:18418375
  13. McCrone JT, Woods RJ, Martin ET, Malosh RE, Monto AS, Lauring AS. Stochastic processes constrain the within and between host evolution of influenza virus. Elife. 2018;7:e35962. pmid:29683424
  14. Han AX, Felix Garza ZC, Welkers MR, Vigeveno RM, Tran ND, Le TQM, et al. Within-host evolutionary dynamics of seasonal and pandemic human influenza A viruses in young children. Elife. 2021;10:e68917. pmid:34342576
  15. Sobel Leonard A, McClain MT, Smith GJD, Wentworth DE, Halpin RA, Lin X, et al. The effective rate of influenza reassortment is limited during human infection. PLoS Pathog. 2017;13(2):e1006203. pmid:28170438
  16. McCrone JT, Lauring AS. Genetic bottlenecks in intraspecies virus transmission. Curr Opin Virol. 2018;28:20–5. pmid:29107838
  17. Valesano AL, Taniuchi M, Fitzsimmons WJ, Islam MO, Ahmed T, Zaman K, et al. The early evolution of oral poliovirus vaccine is shaped by strong positive selection and tight transmission bottlenecks. Cell Host Microbe. 2021;29(1):32-43.e4. pmid:33212020
  18. Wang GP, Sherrill-Mix SA, Chang K-M, Quince C, Bushman FD. Hepatitis C virus transmission bottlenecks analyzed by deep sequencing. J Virol. 2010;84(12):6218–28. pmid:20375170
  19. Edwards CTT, Holmes EC, Wilson DJ, Viscidi RP, Abrams EJ, Phillips RE, et al. Population genetic estimation of the loss of genetic diversity during horizontal transmission of HIV-1. BMC Evol Biol. 2006;6:28. pmid:16556318
  20. Smith DR, Adams AP, Kenney JL, Wang E, Weaver SC. Venezuelan equine encephalitis virus in the mosquito vector Aedes taeniorhynchus: infection initiated by a small number of susceptible epithelial cells and a population bottleneck. Virology. 2008;372(1):176–86. pmid:18023837
  21. Forrester NL, Coffey LL, Weaver SC. Arboviral bottlenecks and challenges to maintaining diversity and fitness during mosquito transmission. Viruses. 2014;6(10):3991–4004. pmid:25341663
  22. Carlson JM, Schaefer M, Monaco DC, Batorsky R, Claiborne DT, Prince J, et al. HIV transmission. Selection bias at the heterosexual HIV-1 transmission bottleneck. Science. 2014;345(6193):1254031. pmid:25013080
  23. Sugrue E, Wickenhagen A, Mollentze N, Aziz MA, Sreenu VB, Truxa S, et al. The apparent interferon resistance of transmitted HIV-1 is possibly a consequence of enhanced replicative fitness. PLoS Pathog. 2022;18(11):e1010973. pmid:36399512
  24. Itell HL, Guenthoer J, Humes D, Baumgarten NE, Overbaugh J. Host cell glycosylation selects for infection with CCR5- versus CXCR4-tropic HIV-1. Nat Microbiol. 2024;9(11):2985–96. pmid:39363105
  25. Varble A, Albrecht RA, Backes S, Crumiller M, Bouvier NM, Sachs D, et al. Influenza A virus transmission bottlenecks are defined by infection route and recipient host. Cell Host Microbe. 2014;16(5):691–700. pmid:25456074
  26. Frise R, Bradley K, van Doremalen N, Galiano M, Elderfield RA, Stilwell P, et al. Contact transmission of influenza virus between ferrets imposes a looser bottleneck than respiratory droplet transmission allowing propagation of antiviral resistance. Sci Rep. 2016;6:29793. pmid:27430528
  27. Hughes J, Allen RC, Baguelin M, Hampson K, Baillie GJ, Elton D, et al. Transmission of equine influenza virus during an outbreak is characterized by frequent mixed infections and loose transmission bottlenecks. PLoS Pathog. 2012;8(12):e1003081. pmid:23308065
  28. Ferreri LM, Geiger G, Seibert B, Obadan A, Rajao D, Lowen AC, et al. Intra- and inter-host evolution of H9N2 influenza A virus in Japanese quail. Virus Evol. 2022;8(1):veac001. pmid:35223084
  29. Braun KM, Haddock LA 3rd, Crooks CM, Barry GL, Lalli J, Neumann G, et al. Avian H7N9 influenza viruses are evolutionarily constrained by stochastic processes during replication and transmission in mammals. Virus Evol. 2023;9(1):vead004. pmid:36814938
  30. Zaraket H, Baranovich T, Kaplan BS, Carter R, Song M-S, Paulson JC, et al. Mammalian adaptation of influenza A(H7N9) virus is limited by a narrow genetic bottleneck. Nat Commun. 2015;6:6553. pmid:25850788
  31. Wilker PR, Dinis JM, Starrett G, Imai M, Hatta M, Nelson CW, et al. Selection on haemagglutinin imposes a bottleneck during mammalian transmission of reassortant H5N1 influenza viruses. Nat Commun. 2013;4:2636. pmid:24149915
  32. Magurran AE. Ecological diversity and its measurement. Princeton, NJ: Princeton University Press. 1988.
  33. Tao H, Steel J, Lowen AC. Intrahost dynamics of influenza virus reassortment. J Virol. 2014;88(13):7485–92. pmid:24741099
  34. Xue KS, Bloom JD. Linking influenza virus evolution within and between human hosts. Virus Evol. 2020;6(1):veaa010. pmid:32082616
  35. McCune BT, Lanahan MR, tenOever BR, Pfeiffer JK. Rapid dissemination and monopolization of viral populations in mice revealed using a panel of barcoded viruses. J Virol. 2020;94(2):e01590-19. pmid:31666382
  36. Bacsik DJ, Dadonaite B, Butler A, Greaney AJ, Heaton NS, Bloom JD. Influenza virus transcription and progeny production are poorly correlated in single cells. Elife. 2023;12:RP86852. pmid:37675839
  37. Heldt FS, Kupke SY, Dorl S, Reichl U, Frensing T. Single-cell analysis and stochastic modelling unveil large cell-to-cell variability in influenza A virus infection. Nat Commun. 2015;6:8938. pmid:26586423
  38. Jacobs NT, Onuoha NO, Antia A, Steel J, Antia R, Lowen AC. Incomplete influenza A virus genomes occur frequently but are readily complemented during localized viral spread. Nat Commun. 2019;10(1):3526. pmid:31387995
  39. Brooke CB, Ince WL, Wrammert J, Ahmed R, Wilson PC, Bennink JR, et al. Most influenza A virions fail to express at least one essential viral protein. J Virol. 2013;87(6):3155–62. pmid:23283949
  40. Kupke SY, Riedel D, Frensing T, Zmora P, Reichl U. A novel type of influenza A virus-derived defective interfering particle with nucleotide substitutions in its genome. J Virol. 2019;93(4):e01786-18. pmid:30463972
  41. von Magnus P. Incomplete forms of influenza virus. Adv Virus Res. 1954;2:59–79. pmid:13228257
  42. Ranum JN, Ledwith MP, Alnaji FG, Diefenbacher M, Orton R, Sloan E, et al. Cryptic proteins translated from deletion-containing viral genomes dramatically expand the influenza virus proteome. Nucleic Acids Res. 2024;52(6):3199–212. pmid:38407436
  43. Zath GK, Thomas MM, Loveday EK, Bikos DA, Sanche S, Ke R, et al. Influenza A viral burst size from thousands of infected single cells using droplet quantitative PCR (dqPCR). PLoS Pathog. 2024;20(7):e1012257. pmid:38950082
  44. Farrell A, Phan T, Brooke CB, Koelle K, Ke R. Semi-infectious particles contribute substantially to influenza virus within-host dynamics when infection is dominated by spatial structure. Virus Evol. 2023;9(1):vead020. pmid:37538918
  45. Sims A, Tornaletti LB, Jasim S, Pirillo C, Devlin R, Hirst JC, et al. Superinfection exclusion creates spatially distinct influenza virus populations. PLoS Biol. 2023;21(2):e3001941. pmid:36757937
  46. Delima GK, Ganti K, Holmes KE, Shartouny JR, Lowen AC. Influenza A virus coinfection dynamics are shaped by distinct virus-virus interactions within and between cells. PLoS Pathog. 2023;19(3):e1010978. pmid:36862762
  47. Pearson JE, Krapivsky P, Perelson AS. Stochastic theory of early viral infection: continuous versus burst production of virions. PLoS Comput Biol. 2011;7(2):e1001058. pmid:21304934
  48. Ramos I, Smith G, Ruf-Zamojski F, Martínez-Romero C, Fribourg M, Carbajal EA, et al. Innate immune response to influenza virus at single-cell resolution in human epithelial cells revealed paracrine induction of interferon lambda 1. J Virol. 2019;93(20):e00559-19. pmid:31375585
  49. Heaton NS, Langlois RA, Sachs D, Lim JK, Palese P, tenOever BR. Long-term survival of influenza virus infected club cells drives immunopathology. J Exp Med. 2014;211(9):1707–14. pmid:25135297
  50. Fiege JK, Stone IA, Dumm RE, Waring BM, Fife BT, Agudo J, et al. Long-term surviving influenza infected cells evade CD8+ T cell mediated clearance. PLoS Pathog. 2019;15(9):e1008077. pmid:31557273
  51. Ferreri LM, Seibert B, Caceres CJ, Patatanian K, Holmes KE, Gay LC, et al. Dispersal of influenza virus populations within the respiratory tract shapes their evolutionary potential. Proc Natl Acad Sci U S A. 2025;122(4):e2419985122. pmid:39835898
  52. Ganti K, Bagga A, Carnaccini S, Ferreri LM, Geiger G, Joaquin Caceres C, et al. Influenza A virus reassortment in mammals gives rise to genetically distinct within-host subpopulations. Nat Commun. 2022;13(1):6846. pmid:36369504
  53. Amato KA, Haddock LA 3rd, Braun KM, Meliopoulos V, Livingston B, Honce R, et al. Influenza A virus undergoes compartmentalized replication in vivo dominated by stochastic bottlenecks. Nat Commun. 2022;13(1):3416. pmid:35701424
  54. Shartouny JR, Lee C-Y, Delima GK, Lowen AC. Beneficial effects of cellular coinfection resolve inefficiency in influenza A virus transcription. PLoS Pathog. 2022;18(9):e1010865. pmid:36121893
  55. Ferguson NM, Galvani AP, Bush RM. Ecological and immunological determinants of influenza evolution. Nature. 2003;422(6930):428–33. pmid:12660783
  56. Koelle K, Cobey S, Grenfell B, Pascual M. Epochal evolution shapes the phylodynamics of interpandemic influenza A (H3N2) in humans. Science. 2006;314(5807):1898–903. pmid:17185596
  57. Fitch WM, Leiter JM, Li XQ, Palese P. Positive Darwinian evolution in human influenza A viruses. Proc Natl Acad Sci U S A. 1991;88(10):4270–4. pmid:1840695
  58. Debbink K, McCrone JT, Petrie JG, Truscon R, Johnson E, Mantlo EK, et al. Vaccination has minimal impact on the intrahost diversity of H3N2 influenza viruses. PLoS Pathog. 2017;13(1):e1006194. pmid:28141862
  59. Valesano AL, Fitzsimmons WJ, McCrone JT, Petrie JG, Monto AS, Martin ET, et al. Influenza B viruses exhibit lower within-host diversity than influenza A viruses in human hosts. J Virol. 2020;94(5):e01710-19. pmid:31801858
  60. Valesano AL, Rumfelt KE, Dimcheff DE, Blair CN, Fitzsimmons WJ, Petrie JG, et al. Temporal dynamics of SARS-CoV-2 mutation accumulation within and across infected hosts. PLoS Pathog. 2021;17(4):e1009499. pmid:33826681
  61. Gu H, Quadeer AA, Krishnan P, Ng DYM, Chang LDJ, Liu GYZ, et al. Within-host genetic diversity of SARS-CoV-2 lineages in unvaccinated and vaccinated individuals. Nat Commun. 2023;14(1):1793. pmid:37002233
  62. 62. Braun KM, Moreno GK, Wagner C, Accola MA, Rehrauer WM, Baker DA, et al. Acute SARS-CoV-2 infections harbor limited within-host diversity and transmit via tight transmission bottlenecks. PLoS Pathog. 2021;17(8):e1009849. pmid:34424945
  63. 63. Kistler KE, Huddleston J, Bedford T. Rapid and parallel adaptive mutations in spike S1 drive clade success in SARS-CoV-2. Cell Host Microbe. 2022;30(4):545-555.e4. pmid:35364015
  64. 64. Sinclair P, Zhao L, Beggs CB, Illingworth CJR. The airborne transmission of viruses causes tight transmission bottlenecks. Nat Commun. 2024;15(1):3540. pmid:38670957
  65. 65. Chen H, Ye J, Xu K, Angel M, Shao H, Ferrero A, et al. Partial and full PCR-based reverse genetics strategy for influenza viruses. PLoS One. 2012;7(9):e46378. pmid:23029501
  66. 66. Zhou B, Wentworth DE. Influenza A virus molecular virology techniques. Methods Mol Biol. 2012;865:175–92. pmid:22528160
  67. 67. Callahan BJ, McMurdie PJ, Rosen MJ, Han AW, Johnson AJA, Holmes SP. DADA2: High-resolution sample inference from Illumina amplicon data. Nat Methods. 2016;13(7):581–3. pmid:27214047
  68. 68. Kent WJ. BLAT--the BLAST-like alignment tool. Genome Res. 2002;12(4):656–64. pmid:11932250
  69. 69. Chung NC, Miasojedow B, Startek M, Gambin A. Jaccard/Tanimoto similarity test and estimation methods for biological presence-absence data. BMC Bioinformatics. 2019;20(Suppl 15):644. pmid:31874610
  70. 70. Bray JR, Curtis JT. An ordination of the upland forest communities of southern Wisconsin. Ecological Monographs. 1957;27(4):325–49.