Molecularly barcoded Zika virus libraries to probe in vivo evolutionary dynamics

Defining the complex dynamics of Zika virus (ZIKV) infection in pregnancy and during transmission between vertebrate hosts and mosquito vectors is critical for a thorough understanding of viral transmission, pathogenesis, immune evasion, and potential reservoir establishment. Within-host viral diversity in ZIKV infection is low, which makes it difficult to evaluate infection dynamics. To overcome this biological hurdle, we constructed a molecularly barcoded ZIKV. This virus stock consists of a “synthetic swarm” whose members are genetically identical except for a run of eight consecutive degenerate codons, which creates approximately 64,000 theoretical nucleotide combinations that all encode the same amino acids. Deep sequencing this region of the ZIKV genome enables counting of individual barcodes to quantify the number and relative proportions of viral lineages present within a host. Here we used these molecularly barcoded ZIKV variants to study the dynamics of ZIKV infection in pregnant and non-pregnant macaques as well as during mosquito infection/transmission. The barcoded virus had no discernible fitness defects in vivo, and the proportions of individual barcoded virus templates remained stable throughout the duration of acute plasma viremia. In a pregnant animal infected with barcoded virus, ZIKV RNA was detected in maternal plasma for 67 days. The complexity of the virus population declined precipitously 8 days following infection of the dam, consistent with the timing of typical resolution of ZIKV in non-pregnant macaques, and remained low for the subsequent duration of viremia. Our approach showed that synthetic swarm viruses can be used to probe the composition of ZIKV populations over time in vivo to understand vertical transmission, persistent reservoirs, bottlenecks, and evolutionary dynamics.

Introduction
Zika virus (ZIKV; Flaviviridae, Flavivirus) infection during pregnancy can cause congenital Zika syndrome (CZS), a collection of neurological, visual, auditory, and developmental birth defects, in at least 5% of babies [1]. The frequency of vertical transmission is not known, although data suggest that it may be very common, especially if infection occurs during the first trimester [2]. For both pregnant and nonpregnant women, it was previously thought that ZIKV caused an acute self-limiting infection that was resolved in a matter of days. It is now clear that ZIKV can persist for months in other body tissues after it is no longer detectable in blood and in the absence of clinical symptoms [2][3][4][5][6][7]. During pregnancy, unusually prolonged maternal viremia has been noted, with viral RNA detected in maternal blood up to 107 days after symptom onset [8][9][10][11]. The source of virus responsible for prolonged viremia is not known, though it has been speculated that this residual plasma viral load could represent virus genome release from maternal tissues, the placenta, and/or the fetus.
Recently, we established Indian-origin rhesus macaques (Macaca mulatta) as a relevant animal model to understand ZIKV infection during pregnancy, demonstrating that ZIKV can be detected in plasma, CSF, urine, and saliva. In nonpregnant animals viremia was essentially resolved by 10 days post infection [12,13]. In contrast, in pregnant monkeys infected in either the first or third trimester of pregnancy, viremia was prolonged, and was associated with decreased head growth velocity and consistent vertical transmission [2]. Strikingly, significant ocular pathology was noted in fetuses of dams infected with French Polynesian ZIKV during the first trimester [2]. We also showed that viral loads were prolonged in pregnant macaques despite robust maternal antibodies [2]. We therefore aimed to better understand the in vivo replication and evolutionary dynamics of ZIKV infection in this relevant animal model.
To do this, we developed a novel "synthetic swarm" virus based on a pathogenic molecular ZIKV clone that allows for tracking and monitoring of individual viral lineages. The synthetic swarm consists of viruses that are engineered to be genetically identical except for a run of 8 consecutive degenerate codons present in up to ~64,000 theoretical combinations that all encode the same amino acid sequence. This novel barcoded virus is replication competent in vitro and in vivo, and the number and relative proportion of each barcode can be characterized by deep sequencing to determine if the population composition changes among or within hosts. Here and in a companion manuscript by Weger-Lucarelli et al., we demonstrate that this system will provide a useful tool to study the complexity of ZIKV populations within and among hosts; for example, this system can assess bottlenecks following various types of transmission and determine whether non-sterilizing prophylaxis and therapeutics impact the composition of the virus population. Moreover, data from molecularly barcoded viruses will help inform research on ZIKV infection during pregnancy by providing a better understanding of the kinetics of tissue reservoir establishment, maintenance, and reseeding.

Generation and characterization of a molecularly barcoded virus stock
Molecular barcoding has been a useful tool to study viruses including simian immunodeficiency virus, influenza virus, poliovirus, Venezuelan equine encephalitis virus, and West Nile virus, establishing conceptual precedent for our approach [14][15][16][17][18][19][20]. To generate barcoded ZIKV, we introduced a run of eight consecutive degenerate codons into a region of NS2A (amino acids 144-151) that allows for every possible synonymous mutation to occur in the ZIKV infectious molecular clone (ZIKV-IC) derived from the Puerto Rican isolate ZIKV-PR-VABC59 [21]. Following bacteria-free cloning and rolling circle amplification (RCA), linearized and purified RCA reaction products were used for virus production via transfection of Vero cells. All produced virus was collected, pooled, and aliquoted into single-use aliquots, such that single aliquots contain a representative sampling of all genetic variants generated; this barcoded synthetic swarm virus was termed ZIKV-BC-1.0.
We used a multiplex-PCR approach to deep sequence the entire coding genome of the ZIKV-BC-1.0 stock, as well as the ZIKV-IC from which ZIKV-BC-1.0 was derived. For each stock, 1 x 10^6 viral RNA templates were used in each cDNA synthesis reaction (Table 1), and both stocks were sequenced in duplicate. We identified two nucleotide positions outside of the barcode region that encoded fixed differences between ZIKV-IC and ZIKV-BC-1.0, when compared to the KU501215 reference that we used for mapping. The variant at site 1964 encodes a nonsynonymous change (V to L) in Envelope, and the variant at site 8488 encodes a synonymous substitution in NS5. The variant at site 1964 was also present in our ZIKV-PRVABC59 stock (see [22]), and GenBank contains records for two sequences that match this sequence (accession numbers KX087101 and KX601168) and two that do not (KU501215 and KX377337). In addition, a single nucleotide position in NS5 (site 9581) contained an 80/20 ratio of C-to-T nucleotide substitutions in ZIKV-BC-1.0 but was fixed as a C in ZIKV-IC. The C-to-T change is a synonymous mutation in a leucine codon. There were no other high-frequency

Diversity of barcode sequences in the stock of ZIKV-BC-1.0
We then characterized the diversity of barcode sequences present in the ZIKV-BC-1.0 stock prior to in vitro and in vivo studies. We used three separate approaches to define which barcodes to consider 'authentic.' In the first approach (Approach 'A'), we identified all the distinct non-WT barcodes that were detected in the ZIKV-IC and the ZIKV-BC-1.0 stocks in the region of NS2A encompassing the barcode. We then calculated the arithmetic mean (0.0018%) plus 3 times the standard deviation (0.016%) of the frequency of all the non-WT barcodes present in the two replicates of the ZIKV-IC stock, even if the frequency of a specific barcode in the ZIKV-IC stock was 0%. This threshold frequency was 0.049%. For the second approach (Approach 'B'), we calculated the arithmetic mean (0.012%) plus 3 times the standard deviation (0.040%) of the frequency of all the non-WT barcodes present only in the ZIKV-IC stock. This threshold frequency was 0.13%. For the final method (Approach 'C'), we identified the highest frequency of the most common non-WT barcode present in either replicate of the ZIKV-IC stock. This threshold frequency was 0.57%. As this third calculation was the most conservative, we used 0.57% to be the minimum threshold to consider a barcode in ZIKV-BC-1.0 as 'authentic.' Using this value, we included 20 sequences in our list of authentic barcodes, and these were followed throughout the study. These barcodes were given independent labels (e.g. Zika_BC01, Zika_BC02, etc.) to simplify reporting. The wild type barcode sequence was also tracked, and it is labeled Zika_WT. All remaining sequences that were detected were labeled as 'Other' (see S1-S3 Tables for barcodes identified using all three approaches).
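The threshold rules described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function names and the example stock frequencies are hypothetical, and the use of the sample standard deviation is an assumption because the text does not say which form was used. Only the rule itself (mean plus 3 standard deviations, or the highest non-WT frequency in ZIKV-IC) and the reported summary values come from the text.

```python
from statistics import mean, stdev

def threshold_from_summary(mean_percent, sd_percent, k=3):
    """Threshold (%) = mean + k x SD, from already-summarised values."""
    return mean_percent + k * sd_percent

def threshold_from_frequencies(freqs_percent, k=3):
    """Same rule from raw per-barcode frequencies (sample SD is an assumption)."""
    return mean(freqs_percent) + k * stdev(freqs_percent)

def select_authentic(stock_freqs_percent, threshold_percent):
    """Keep barcodes whose stock frequency meets or exceeds the threshold."""
    return {bc: f for bc, f in stock_freqs_percent.items()
            if f >= threshold_percent}

# Reported summary values for Approach 'B' (mean 0.012%, SD 0.040%).
approach_b = threshold_from_summary(0.012, 0.040)  # about 0.13%

# Approach 'C': highest frequency of any non-WT barcode seen in ZIKV-IC.
approach_c = 0.57

# Hypothetical stock frequencies (%), for illustration only.
example_stock = {"bc_x": 15.0, "bc_y": 0.60, "bc_z": 0.20}
kept = select_authentic(example_stock, approach_c)
```

Applying the same arithmetic to the rounded values reported for Approach 'A' (0.0018% and 0.016%) gives roughly 0.05%, in line with the reported 0.049% once rounding of the inputs is taken into account.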
To ascertain whether input RNA template numbers influenced barcode composition, we sequenced a dilution series of viral RNA templates in triplicate (Fig 1 and S4 and S5 Tables). When we used 10,000 or 2000 input vRNA templates, we detected all 20 barcodes. For 500, 250, 100, and 50 input templates, the average number of enumerated barcodes was 17.3 ± 0.9, 14.3 ± 1.2, 12.7 ± 0.5, and 6.7 ± 0.5, respectively. These observations suggest that some barcodes are lost from the population when the number of input templates is reduced.
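The loss of rare barcodes at low template input is what simple random sampling predicts. The sketch below draws a fixed number of templates from a stock and counts how many distinct barcodes appear. The stock composition is hypothetical (20 barcodes with relative weights 20 down to 1), the function names are illustrative, and the model ignores PCR and sequencing error, so it is only a qualitative illustration of the dilution experiment.

```python
import random

def count_detected(freqs, n_templates, rng):
    """Draw n_templates at random from the stock; count distinct barcodes."""
    labels = list(freqs)
    weights = [freqs[label] for label in labels]
    drawn = rng.choices(labels, weights=weights, k=n_templates)
    return len(set(drawn))

def mean_detected(freqs, n_templates, replicates=3, seed=0):
    """Average number of barcodes detected across simulated replicates."""
    rng = random.Random(seed)
    counts = [count_detected(freqs, n_templates, rng)
              for _ in range(replicates)]
    return sum(counts) / replicates

# Hypothetical, uneven stock: 20 barcodes with relative weights 20, 19, ..., 1.
hypothetical_stock = {f"BC{i:02d}": 21 - i for i in range(1, 21)}

for n in (10000, 2000, 500, 250, 100, 50):
    print(n, mean_detected(hypothetical_stock, n))
```

In this toy model the rarest barcodes drop out first as the number of input templates falls, which is the same qualitative pattern as in the titration data.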
We also examined diversity and similarity across sequencing replicates in this titration experiment using all the detected sequences, including the sequences labeled as 'Other.' Not surprisingly, Simpson's diversity increased when a greater number of input templates were used, plateauing at 500 input copies (S1 Fig). When comparing similarity across replicates, the samples with 2,000 and 10,000 inputs had the highest Morisita-Horn similarity index (S2 Fig). Unfortunately, it was not possible to obtain a large number of input templates at all timepoints from ZIKV-infected pregnant animals; therefore, the absence of a barcode in sequencing reads from a particular experiment could mean that either the barcode was not present at that timepoint or that it was present in the biological sample but not at a high enough concentration to be detected when sequencing from a small number of templates (S3 Fig).
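For readers unfamiliar with the two indices, the following sketch shows one common way to compute them from barcode read counts. The text does not state which form of Simpson's index was used, so the Gini-Simpson form (1 minus the sum of squared proportions) is an assumption here; the Morisita-Horn formula is the standard proportion-based version. Function names are illustrative.

```python
def proportions(counts):
    """Convert a {barcode: read_count} mapping to relative frequencies."""
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}

def simpson_diversity(counts):
    """Gini-Simpson diversity: 1 - sum(p_i^2). Assumed form, see lead-in."""
    p = proportions(counts)
    return 1.0 - sum(x * x for x in p.values())

def morisita_horn(counts_a, counts_b):
    """Morisita-Horn similarity: 2*sum(p_i*q_i) / (sum(p_i^2) + sum(q_i^2))."""
    p, q = proportions(counts_a), proportions(counts_b)
    keys = set(p) | set(q)
    cross = sum(p.get(k, 0.0) * q.get(k, 0.0) for k in keys)
    denom = sum(x * x for x in p.values()) + sum(x * x for x in q.values())
    return 2.0 * cross / denom
```

Two replicates with identical barcode proportions give a Morisita-Horn value of 1, and two replicates sharing no barcodes give 0, which is why the index is useful for judging replicate agreement at low template input.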

Molecularly-barcoded ZIKV in vivo replication kinetics and barcode dynamics
Prior to use in nonhuman primates, viral infectivity and replication of ZIKV-BC-1.0 were assessed in vitro using Vero, LLC-MK2, C6/36, and Aag2 cells. Viral growth curves were similar between ZIKV-BC-1.0, infectious clone-derived virus (ZIKV-IC), and wild-type ZIKV-PR-VABC59 (ZIKV-PR) (S4 Fig and Weger-Lucarelli et al., manuscript submitted). These results suggested that insertion of degenerate nucleotides in the barcode viral genome did not have a significant deleterious effect on either infectivity or replicative capacity in vitro, but we cannot exclude the possibility that different barcodes may have different effects with respect to each other. To confirm that ZIKV-BC-1.0 did not have any replication defects in vivo, we assessed its replication capacity in rhesus macaques. Three rhesus macaques were inoculated subcutaneously with 1 x 10^4 PFU of ZIKV-BC-1.0. All three animals were productively infected with ZIKV-BC-1.0, with detectable plasma viral loads one day post inoculation (dpi) (Fig 2). Plasma viral loads in all three animals peaked between two and four dpi and ranged from 2.34 x 10^3 to 9.77 x 10^4 vRNA copies/ml. Indeed, ZIKV-BC-1.0 displayed viral replication kinetics comparable to ZIKV-IC and ZIKV-PR (Fig 2), and replication kinetics were comparable to previous studies with other strains of ZIKV in nonpregnant rhesus macaques [12,13,23]. To compare overall replication kinetics, the data were log10-transformed and area under the curve (AUC) was calculated. One-way ANOVA then was conducted to compare AUC between groups, and AUC did not differ significantly [F(2,6) = 0.887, p = 0.460] (S6 Table).
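The AUC comparison above can be illustrated with a short sketch. It computes a trapezoidal AUC on log10-transformed viral loads and the one-way ANOVA F statistic with its degrees of freedom. The function names are hypothetical, and flooring values at 100 copies/ml before the log transform is an assumption (the text does not say how values below the assay limit were handled). The p-value is omitted because it requires the F distribution, which the Python standard library does not provide.

```python
import math

def auc_log10(days, copies_per_ml, floor=100.0):
    """Trapezoidal AUC of log10 viral load; values below `floor` are floored
    (assumed handling, not stated in the text)."""
    logs = [math.log10(max(v, floor)) for v in copies_per_ml]
    return sum((days[i + 1] - days[i]) * (logs[i] + logs[i + 1]) / 2.0
               for i in range(len(days) - 1))

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within
```

With three groups of three animals each, the degrees of freedom are 2 and 6, matching the reported F(2,6).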
We also infected a single pregnant macaque (776301) by subcutaneous inoculation of 1 x 10^4 PFU of ZIKV-BC-1.0. This animal had been exposed to dengue virus serotype 3 (DENV-3; strain Sleman/78) approximately one year prior to inoculation with ZIKV-BC-1.0. To evaluate cross-reactive neutralizing antibody (nAb) responses elicited by prior exposure to DENV-3 in this animal, serum was obtained prior to inoculation with ZIKV-BC-1.0. Neutralization curves with both DENV-3 and ZIKV revealed that DENV-3 immune sera did not cross-react with ZIKV, whereas DENV-3 was potently neutralized (Fig 3A). The animal then was infected with ZIKV-BC-1.0 at 35 days of gestation (mid-first trimester; rhesus term is 165 ± 10 days) and had detectable plasma viral loads for 67 days (Fig 3B), consistent with replication kinetics of wildtype ZIKV in both pregnant macaques [2] and humans [8,9,24]. The animal also had four days of detectable vRNA in urine but no detectable vRNA in the amniotic fluid on 22, 36, 50, or 120 dpi (57, 71, 85, and 155 days gestation, respectively) (Fig 3B). By 29 dpi neutralization curves of both viruses revealed a similar profile, indicating the production of a robust maternal nAb response to ZIKV (Fig 3A) coincident with prolonged plasma viral loads, similar to what has been shown previously in other ZIKV-infected pregnant macaques [2]. DENV-3 neutralization curves at 0 and 29 dpi were indistinguishable (Fig 3A). The pregnancy progressed without adverse outcomes, and at 155 days of gestation, the fetus was surgically delivered and euthanized, and tissues were collected. The fetus had no evidence of microcephaly or other abnormalities upon gross examination.
Table. Each dilution was sequenced in triplicate, with individual replicates labeled A, B, and C. As a comparison, data for ZIKV-BC-1.0 that was collected by the multiplex PCR approach (Table 1)
Approximately 60 fetal and maternal tissues (see S7 Table for a complete list) were collected for histopathology and vRNA by qRT-PCR. No ZIKV RNA was detected in any samples collected from the fetus. This was surprising because, of the seven neonatal macaques we have examined to date (zika.labkey.com), this was the only animal found not to have detectable ZIKV RNA in tissues. Still, ZIKV RNA was detected at the maternal-fetal interface in a section of placental disc (S7 Table). Fetal histology also revealed neutrophilic infiltration of the spleen (Fig 4A), minimal to mild suppurative lymphadenitis of the inguinal lymph node (Fig 4B), minimal multifocal lymphocytic deciduitis, and mild multifocal placental infarction with suppurative villositis (Fig 4C and 4D), but normal CNS anatomy, similar to changes noted in previous in utero ZIKV infections [2,25,26]. These data provide indirect evidence that vertical transmission did occur and demonstrate that ZIKV-BC-1.0 is fully functional in vivo with replication kinetics indistinguishable from other ZIKV strains. Thus, inclusion of the barcode did not detectably impair infectivity or replication in adult macaques.

Evaluation of barcodes during acute infection of nonpregnant macaques
We deep sequenced the viruses replicating in the nonpregnant animals who were infected with ZIKV-BC-1.0 and ZIKV-IC (Fig 5A and 5B, Tables 2 and S8). In each group of three animals, we sequenced viruses at two time points from two animals, and then one time point from a third animal. In animals infected with ZIKV-IC, we found that >95% of sequences in the virus stock and all three animals were wild type across the 24 nucleotides that corresponded to where the barcode was located in ZIKV-BC-1.0.
Fig 3. A.) Immune sera from a macaque infected with ZIKV-BC-1.0 during the first trimester of pregnancy were tested for their capacity to neutralize DENV-3 (blue dashes) and ZIKV-PR (blue). Infection was measured by plaque reduction neutralization test (PRNT) and is expressed relative to the infectivity of ZIKV-PR in the absence of serum. The concentration of sera indicated on the x-axis is expressed as log10 (dilution factor of serum). The EC90 and EC50, estimated by non-linear regression analysis, are also indicated by a dashed line. Neutralization curves for each virus (ZIKV, solid blue; DENV-3, dashed blue) at 0 (open symbols) and 28 (closed symbols) dpi are shown. B.) Zika vRNA copies per ml blood plasma (solid lines) or urine (dashed line). Blue tracings represent the animal infected with ZIKV-BC-1.0 at 35 days gestation. The day of gestation is estimated +/-2 days. Grey tracings represent viremia in nonpregnant/male rhesus monkeys infected with the identical dose of ZIKV-BC-1.0 (Fig 2). The y-axis crosses the x-axis at the limit of quantification of the qRT-PCR assay (100 vRNA copies/ml).
https://doi.org/10.1371/journal.ppat.1006964.g003
We counted the number of authentic barcodes detected in the stock and the plasma of the nonpregnant animals infected with ZIKV-BC-1.0. We detected a range of 8 to 20 authentic barcodes in these samples (Fig 5C). We then compared the frequency distribution of the individual barcodes in the plasma of these three animals relative to that in the stock to assess whether there was any evidence for a bottleneck that influenced overall barcode distribution. This was accomplished using two independent statistical approaches. The first compared the frequency distributions by a stochastic equality test, which compares several random pairs of individual values taken from the two samples to test whether there is a significant tendency to get higher values in one over the other [27]. In all three animals, the frequency of each barcode at day two (514982) or at day three (715132 and 688387) was not significantly different from the frequency of each barcode in the stock: p-value based on 1000 bootstrap replications = 0.478, 0.602, and 0.114, respectively. Likewise, the frequency of each barcode at day two (514982) or at day three (715132) was not significantly different from the frequency of each barcode in the stock when compared using the Kolmogorov-Smirnov test: p-value = 0.358 and 0.841. However, the Kolmogorov-Smirnov test did detect a significant difference between the stock and 688387 at day three (p-value = 0.0021), but we believe this to be an artifact of the animal's viral load at this time point, because the frequency of each barcode at day five did not differ significantly from the frequency of each barcode in the stock (p-value = 0.358). These data in conjunction with the results of the stochastic equality test therefore suggest that there was no evidence for changes in barcode frequency compared to input.
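As a rough illustration of how two sets of barcode frequencies can be compared, the sketch below computes the two-sample Kolmogorov-Smirnov statistic (the largest gap between the two empirical distribution functions) and attaches a permutation p-value. The permutation step is a stand-in chosen so the example needs only the standard library; it is not the asymptotic Kolmogorov-Smirnov p-value and it is not the bootstrap stochastic equality test of [27]. Function names are hypothetical.

```python
import random

def ks_statistic(xs, ys):
    """Largest absolute difference between the two empirical CDFs."""
    d = 0.0
    for v in sorted(set(xs) | set(ys)):
        fx = sum(1 for x in xs if x <= v) / len(xs)
        fy = sum(1 for y in ys if y <= v) / len(ys)
        d = max(d, abs(fx - fy))
    return d

def permutation_pvalue(xs, ys, n_perm=1000, seed=0):
    """Share of label shufflings whose KS statistic is at least as large
    as the observed one (with the usual +1 correction)."""
    rng = random.Random(seed)
    observed = ks_statistic(xs, ys)
    pooled = list(xs) + list(ys)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        shuffled = ks_statistic(pooled[:len(xs)], pooled[len(xs):])
        if shuffled >= observed - 1e-12:  # tolerance for float rounding
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

Here `xs` would hold the per-barcode frequencies in the stock and `ys` the frequencies of the same barcodes in a plasma sample; a large p-value means the data give no evidence that the two frequency distributions differ.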
We also examined the sequences outside the barcode region to determine if there were additional nucleotide differences present in the virus population as it replicated in animals. There were small fluctuations in some viral SNPs, but we detected no dramatic shifts in nucleotide frequencies among viruses replicating in vivo, except at site 9581, which is synonymous. In the ZIKV-BC-1.0 stock, there was a mixture of T and C nucleotides (22% and 78% of sequences, respectively) at this site. This position remained a mixture in the animals, but the ratios fluctuated, ranging from 10/90 in animal 688387 at day 5 to 30/70 in animal 715132 at day 5. Unfortunately, this site was too far from the barcode (~5,000 bp) for us to obtain linkage on the same set of paired sequences, making it impossible to know whether this particular nucleotide change was carried on specific barcodes. Overall, there were no new mutations that were detected at greater than 10% frequency in both replicates in the virus populations during the first 5 days after infection in nonpregnant animals.

Evaluation of barcodes during pregnancy
We also deep sequenced the barcode in virus populations replicating in the one pregnant animal (776301) infected with ZIKV-BC-1.0. Recognizing that the later time points from this animal had persistent but low plasma viral loads, we modified our sequencing approach to prepare one tube of cDNA, and then split it into two independent PCR reactions that amplified small fragments (131 bp and 178 bp) spanning the region containing the barcode (Fig 6A, Tables 3 and S9). We quantified the number of authentic barcodes we detected using the same parameters described in the previous section (Fig 6B). At days 3, 5, and 7, we detected all 20 barcodes. For the remainder of the infection, we detected 8.1 ± 2.3 barcodes. Likewise, barcode diversity, as measured by Simpson's diversity index, also declined beginning at day 8 and remained low throughout the duration of infection (Fig 6C). Interestingly, some barcodes were not detected at later time points; Zika_BC02, for example, was absent even though it had been present at ~15% during early infection. Other barcodes, such as Zika_BC07, 08, and 09, became more common at later time points, even though they were only present at ~2-5% during early infection. Unfortunately, with such low numbers of input virus templates at the late time points, there were differences between replicates indicative of sampling uncertainty. With the exception of two samples (day57_A and day60_B), however, greater than 85% of the sequences matched one of the 20 authentic barcodes.

Using ZIKV-BC-1.0 to evaluate transmission bottlenecks
To begin to understand potential transmission bottlenecks within the vector and the impact they might have on ZIKV population diversity, Aedes aegypti vector competence for ZIKV-BC-1.0 was evaluated at 7, 13, and 25 days post feeding (PF) in mosquitoes that were exposed to the pregnant macaque at 4 dpi. A single Ae. aegypti (out of 90 tested) was transmission-competent at All other mosquitoes screened using this methodology were ZIKV-negative.
We also found low mosquito infection rates in a previous study exposing mosquitoes to ZIKV-infected rhesus macaques [22]. We deep sequenced virus (viral template numbers added to cDNA synthesis reactions are listed in S10 Table) from all three anatomic compartments from this mosquito (body, leg, and saliva), and we only detected the presence of a single barcode: Zika_BC02. The viral loads in the body, leg, and saliva were 2.57 x 10^8, 4.73 x 10^7, and 4.29 x 10^4 vRNA copies/ml, respectively. Zika_BC02 was present in the pregnant animal's virus population at ~16% between days 3 and 5 after infection, representing the second most common barcode in the population (Fig 7, Tables 5 and S10).

Discussion
Mosquito-borne viruses like ZIKV typically exist in hosts as diverse mutant swarms. Defining the way in which stochastic forces within hosts shape these swarms is critical to understanding the evolutionary and adaptive potential of these pathogens and may reveal key insight into transmission, pathogenesis, immune evasion, and reservoir establishment. To date, no attempts have been made to enumerate and characterize individual viral lineages during ZIKV infection. Here, we characterized the dynamics of ZIKV infection in rhesus macaques. Specifically, using a synthetic swarm of molecularly barcoded ZIKV, we tracked the composition of the virus population over time in both pregnant and nonpregnant animals. Our results demonstrated that viral diversity fluctuated in both a spatial and temporal manner as host barriers or selective pressures were encountered and this likely contributed to narrowing of the barcode composition in macaques. For example, the proportions of individual barcoded virus templates remained stable during acute infection, but in the pregnant animal infected with ZIKV-BC-1.0 the complexity of the virus population declined precipitously 8 days following infection of the dam. This was coincident with the timing of typical resolution of ZIKV in non-pregnant macaques (Figs 2 and 3), and after this point the complexity of the virus population remained low for the subsequent duration of viremia (Fig 6C). We speculate that the narrowing of the barcode composition in the pregnant animal was the result of establishment of an anatomic reservoir of ZIKV that is not accessible to maternal neutralizing antibodies, which is shed into maternal plasma at low, but detectable, levels. It also is possible that declining viral barcode diversity was an artifact of a declining viral population size and the consequent effects on sampling, without reservoir establishment. 
Unfortunately, the absence of ZIKV RNA in the fetus at term prevented us from comparing the barcode composition in the fetus to the barcodes in maternal plasma, so this experiment could not resolve whether the feto-placental unit acts as a tissue reservoir of ZIKV.
There are several factors that could explain the apparent lack of ZIKV RNA in the fetus at term. One possibility is that ZIKV-BC-1.0 was impaired in its ability to traffic to the feto-placental unit due to introduction of the barcode sequence. However, we believe this scenario to be unlikely because the presence of ZIKV-induced pathology in both the fetus and placenta provides indirect evidence for vertical transmission (Fig 4). In addition, the inability to detect ZIKV RNA in affected tissues could be due to the focal nature of infection, assay sensitivity, and/or viral clearance by the time of necropsy. Indeed, resolution of maternal viral loads occurred 91 days prior to necropsy, and Hirsch et al. recently demonstrated that ZIKV infection of the placenta was highly focal and could only be determined by comprehensive biopsy of all placental perfusion domains [26]. Furthermore, although our previous studies suggest high rates of vertical transmission [2], it is unlikely that vertical transmission occurs 100% of the time in humans or macaques. Finally, this animal had pre-existing DENV-3 immunity and it remains unclear what role this may play in subsequent ZIKV infection during pregnancy. Therefore, matching ZIKV barcodes in neonatal tissues with barcodes found in the mother will be important for better understanding vertical transmission. While the ZIKV-BC-1.0 reported here has limited complexity, we have recently developed a new synthetic swarm, ZIKV-BC-2.0, which uses an optimized transfection strategy and has orders of magnitude more putative authentic barcodes. This new virus will be used in future studies in conjunction with deep sequencing techniques that enumerate individual templates with unique molecular identifiers [28]. We therefore expect that future studies of pregnant animals infected with barcoded ZIKV will help distinguish between these possibilities.
In addition to better understanding vertical transmission, synthetic swarm viruses will be useful tools for future studies aimed at understanding persistent reservoirs, bottlenecks, and overall evolutionary dynamics. For example, by using synthetic swarm viruses it should be possible to estimate the effective size of ZIKV populations (Ne), which determines whether selection or genetic drift is the predominant force shaping their genetic structure and evolution [29,30]. Likewise, it should be possible to estimate the number of founder viruses that are required to initiate infection of the fetus during vertical transmission and/or the number of founder viruses required to initiate infection during mosquito-borne versus sexual transmission. Neither the number of founder viruses nor Ne has been estimated for any step of the ZIKV infection and/or transmission cycle, but we postulate that a single or limited number of infectious particles likely contribute to the infection of the fetus during vertical transmission. Strong bottlenecks have been observed previously during vertical transmission in plant virus systems [30,31] and during mother-to-offspring transmission of HIV-1 [32,33] and bovine viral diarrhea virus [34]. In these studies, the vast majority of offspring harbored a single or few viral variants, which suggested a stringent population bottleneck associated with vertical transmission. Therefore, knowledge of Ne is of major interest for a better understanding of how virus population structure changes and/or regenerates as it encounters host barriers or selective pressures within and between hosts. Furthermore, barcoded ZIKV will be useful in studies that combine deep sequencing with experimental evolution to observe within host dynamics of ZIKV variants.
Barcoded ZIKV is particularly appropriate for studying the effects of evolutionary forces, such as selection and genetic drift, on the emergence of new ZIKV variants that result from host adaptation or that may emerge in the face of new selective pressures: for example, biocontrol strategies, antiviral therapies, immune escape, vaccines, etc. The effects of these evolutionary forces on virus evolution historically have been challenging to address without the inclusion of neutral markers to estimate selection coefficients and Ne.
Although we developed this system to better understand the dynamics of ZIKV infection in the vertebrate host, this approach can be applied to address other questions about ZIKV transmission. For example, ZIKV-BC-1.0 can be used to quantify the bottleneck forces during mosquito infection and transmission. As a result, we also attempted to characterize barcodes present in mosquitoes that fed on the ZIKV-BC-1.0-infected pregnant animal. Consistent with our previous experiments [22], only a single Ae. aegypti became infected with ZIKV-BC-1.0 after feeding on ZIKV-BC-1.0-viremic macaques. This was likely the result of the low amount of infectious virus in macaque blood [35]. We only detected a single barcode during infection of mosquitoes. This is not entirely surprising because mosquitoes ingest small amounts of blood from infected hosts, which limits the size of the viral population founding infection in the vector. For example, it has been previously estimated that as few as 5-42 founder viruses initiate DENV infection of the mosquito midgut [36]. Also, during replication in mosquitoes, flaviviruses undergo population bottlenecks as they traverse physical barriers like the midgut and salivary glands [36,37]. We therefore expected barcode diversity to be low in infected mosquitoes and these data are perhaps indicative of a stringent midgut bottleneck in this individual that limited the variant pool in other anatomic compartments, but this requires further experimental confirmation. Consistent with what we show here, previous work has demonstrated considerable haplotype turnover for West Nile virus in Culex pipiens but not in Ae. aegypti, i.e., haplotypes remained relatively stable as the virus trafficked from the midgut to the saliva [37]. Likewise, Weger-Lucarelli et al. (manuscript submitted) most often detected only a single barcode in different Ae. aegypti populations that were exposed to ZIKV-BC-1.0 using an artificial membrane feeding system.
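To give a feel for how founder number relates to the number of barcodes expected in a mosquito, the sketch below treats founders as independent random draws from the barcode frequencies in the blood meal. This neutral sampling model is an assumption made for illustration only: it ignores fitness differences and any later bottlenecks, and the function names are hypothetical. Under the model, the probability that all N founders carry the same barcode is the sum of p_i^N, and the expected number of distinct barcodes is the sum of 1 - (1 - p_i)^N.

```python
def prob_single_barcode(freqs, n_founders):
    """Probability that every founder carries the same barcode."""
    total = sum(freqs)
    return sum((f / total) ** n_founders for f in freqs)

def expected_distinct(freqs, n_founders):
    """Expected number of distinct barcodes among the founders."""
    total = sum(freqs)
    return sum(1.0 - (1.0 - f / total) ** n_founders for f in freqs)

# Hypothetical even population of 20 barcodes, with the founder range
# quoted above for DENV (5 to 42) plus a single-founder case.
even_population = [1.0] * 20
for n in (1, 5, 42):
    print(n, prob_single_barcode(even_population, n),
          expected_distinct(even_population, n))
```

Under this model a single detected barcode becomes very unlikely once more than a handful of founders establish infection, so such an observation points toward a very small effective founder number, a strong downstream bottleneck, or both.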
In sum, our approach showed that synthetic swarm viruses can be used to probe the composition of viral populations over time in vivo to understand vertical transmission, persistent reservoirs, bottlenecks, and evolutionary dynamics.

Study design
This proof-of-concept study was designed to examine whether molecularly barcoded ZIKV could be used to elucidate the source of prolonged maternal viremia during pregnancy (Fig 3). Datasets used in this manuscript are publicly available at zika.labkey.com.

Ethical approval
This study was approved by the University of Wisconsin-Madison Institutional Animal Care and Use Committee (Animal Care and Use Protocol Number G005401).

Nonhuman primates
Five male and five female Indian-origin rhesus macaques utilized in this study were cared for by the staff at the Wisconsin National Primate Research Center (WNPRC) in accordance with the regulations, guidelines, and recommendations outlined in the Animal Welfare Act, the Guide for the Care and Use of Laboratory Animals, and the Weatherall report. In addition, all macaques utilized in the study were free of Macacine herpesvirus 1, Simian Retrovirus Type D, Simian T-lymphotropic virus Type 1, and Simian Immunodeficiency Virus. For all procedures, animals were anesthetized with an intramuscular dose of ketamine (10ml/kg). Blood samples were obtained using a vacutainer or needle and syringe from the femoral or saphenous vein. The pregnant animal (776301) had a previous history of experimental DENV-3 exposure, approximately one year prior to ZIKV infection.

Construction of molecularly-barcoded ZIKV
Genetically-barcoded ZIKV was constructed using the ZIKV reverse genetic platform developed by Weger-Lucarelli et al. [21]. The region for the barcode insertion was selected by searching for consecutive codons in which inserting a degenerate nucleotide in the third position would result in a synonymous change. The genetically-barcoded ZIKV clone then was constructed using a novel method called bacteria-free cloning (BFC). First, the genome was amplified as two overlapping pieces from the two-part plasmid system of the reverse genetic platform (see [21]). The CMV promoter was amplified from pcDNA3.1 (Invitrogen). The barcode region was then introduced in the form of an overlapping PCR-amplified oligo (IDT, Iowa, USA). All PCR amplifications were performed with Q5 DNA polymerase (New England Biolabs, Ipswich, MA, USA). Amplified pieces were then gel purified (Macherey-Nagel). The purified overlapping pieces were then assembled using the HiFi DNA assembly master mix (New England Biolabs) and incubated at 50˚C for four hours. The Gibson assembly reaction then was treated with Exonuclease I (specific for ssDNA), lambda exonuclease (removes noncircular dsDNA) and DpnI (removes any original bacteria derived plasmid DNA) at 37˚C for 30 minutes followed by heat inactivation for 20 minutes at 80˚C. Two microliters of this reaction then was used for rolling circle amplification (RCA) using the REPLI-g Mini kit (Qiagen). RCA was performed following the manufacturer's specifications except that 2M trehalose was used in place of water in the reaction mixture because it has been previously shown that this modification reduces secondary amplification products [38]. Reactions were incubated at 30˚C for four hours and then inactivated at 65˚C for three minutes. Sequence was confirmed by Sanger sequencing.
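The search for a barcode site described above can be illustrated with a short sketch. It assumes an uppercase A/C/G/T coding sequence in frame, the standard genetic code, and that only fourfold-degenerate codons (any third base is synonymous) qualify; the function names and the `min_len` parameter are hypothetical and are not taken from the authors' scripts.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq):
    """Translate an in-frame uppercase DNA string with the standard code."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def is_fourfold(codon):
    """True if every base at the third position encodes the same amino acid."""
    return len({CODON_TABLE[codon[:2] + b] for b in BASES}) == 1


def fourfold_runs(cds, min_len=8):
    """Return (first_codon_index, run_length) for runs of >= min_len fourfold codons."""
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    runs, start = [], None
    # A trailing ATG (never fourfold) acts as a sentinel that closes the last run.
    for idx, codon in enumerate(codons + ["ATG"]):
        if is_fourfold(codon):
            if start is None:
                start = idx
        else:
            if start is not None and idx - start >= min_len:
                runs.append((start, idx - start))
            start = None
    return runs


def barcode_variants(run_codons):
    """Yield every synonymous sequence obtained by varying each third base."""
    for thirds in product(BASES, repeat=len(run_codons)):
        yield "".join(codon[:2] + t for codon, t in zip(run_codons, thirds))
```

Under these assumptions, a run of eight such codons gives 4^8 = 65,536 possible barcode sequences, all encoding the same eight amino acids.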

Molecularly-barcoded ZIKV stocks
Virus was prepared in Vero cells transfected with the purified RCA reaction. Briefly, RCA reactions were digested with NruI at 37˚C for one hour to linearize the product and remove the branched structure. Generation of an authentic 3'UTR was assured due to the presence of the hepatitis-delta ribozyme immediately following the viral genome [21]. The digested RCA reaction then was purified using a PCR purification kit (Macherey-Nagel) and eluted with molecular-grade water. Purified and digested RCAs were transfected into 80-90% confluent Vero cells using the Xfect transfection reagent (Clontech) following manufacturer's specifications. Infectious virus was harvested when 50-75% cytopathic effects were observed (6 days post transfection). Viral supernatant then was clarified by centrifugation and supplemented to a final concentration of 20% fetal bovine serum and 10 mM HEPES prior to freezing and storage as single use aliquots. Titer was measured by plaque assay on Vero cells as described in a subsequent section.

Subcutaneous inoculations
The ZIKV-PR stock, ZIKV-IC, and ZIKV-BC-1.0 were thawed, diluted in PBS to 1 × 10^4 PFU/ml, and loaded into a 3 ml syringe maintained on ice until inoculation. Each of nine nonpregnant Indian-origin rhesus macaques was anesthetized and inoculated subcutaneously over the cranial dorsum with 1 ml ZIKV-PR stock (n = 3), ZIKV-IC stock (n = 3), or ZIKV-BC-1.0 stock (n = 3) containing 1 × 10^4 PFU. Likewise, the pregnant animal was anesthetized and inoculated via the same route with 1 ml barcoded virus stock containing 1 × 10^4 PFU. All animals were closely monitored by veterinary and animal care staff for adverse reactions and signs of disease. Nonpregnant animals were examined, and blood and urine were collected from each animal daily from 1 through 10 days, and 14 days post inoculation (dpi). Sampling continued for the pregnant animal until the resolution of viremia.

Mosquito strain, colony maintenance, and vector competence
The Aedes aegypti black-eyed Liverpool (LVP) strain used in this study was obtained from Lyric Bartholomay (University of Wisconsin-Madison, Madison, WI) and maintained at the University of Wisconsin-Madison as previously described [39]. Ae. aegypti LVP are susceptible to ZIKV [40]. Infection, dissemination, and transmission rates were determined for individual mosquitoes and sample sizes were chosen using long established procedures [40][41][42]. Mosquitoes that fed to repletion on macaques were randomized and separated into cartons in groups of 40-50 and maintained as described in [22]. All samples were screened by plaque assay on Vero cells. Dissemination was indicated by virus-positive legs. Transmission was defined as release of infectious virus with salivary secretions, i.e., the potential to infect another host, and was indicated by virus-positive salivary secretions.

Plaque assay
All ZIKV screens from mosquito tissue and titrations for virus quantification from virus stocks were completed by plaque assay on Vero cell cultures. Duplicate wells were infected with 0.1 ml aliquots from serial 10-fold dilutions in growth media and virus was adsorbed for one hour. Following incubation, the inoculum was removed, and monolayers were overlaid with 3 ml containing a 1:1 mixture of 1.2% oxoid agar and 2X DMEM (Gibco, Carlsbad, CA) with 10% (vol/vol) FBS and 2% (vol/vol) penicillin/streptomycin. Cells were incubated at 37˚C in 5% CO2 for four days for plaque development. Cell monolayers then were stained with 3 ml of overlay containing a 1:1 mixture of 1.2% oxoid agar and 2X DMEM with 2% (vol/vol) FBS, 2% (vol/vol) penicillin/streptomycin, and 0.33% neutral red (Gibco). Cells were incubated overnight at 37˚C and plaques were counted.
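As an illustration of how a titer follows from such counts, the sketch below applies the usual plaque assay arithmetic: mean plaques per well divided by the product of the dilution and the inoculum volume. The function name and the example counts are hypothetical; only the 0.1 ml inoculum and the duplicate wells come from the protocol above.

```python
def plaque_titer(plaque_counts, dilution, inoculum_ml=0.1):
    """Return the titer in PFU/ml from replicate wells at a single dilution.

    plaque_counts: plaques counted in each replicate well (e.g. duplicate wells)
    dilution:      dilution of the sample plated, e.g. 1e-4 for a 10^-4 dilution
    inoculum_ml:   volume adsorbed per well (0.1 ml in the protocol above)
    """
    if not plaque_counts:
        raise ValueError("at least one well count is required")
    mean_plaques = sum(plaque_counts) / len(plaque_counts)
    return mean_plaques / (dilution * inoculum_ml)


# Hypothetical example: 23 and 27 plaques in duplicate wells at a 10^-4 dilution.
titer = plaque_titer([23, 27], dilution=1e-4)
print(f"{titer:.2e} PFU/ml")
```

With these made-up counts the mean is 25 plaques per well, which corresponds to 2.5 × 10^6 PFU/ml.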

Plaque reduction neutralization test (PRNT)
Macaque serum samples were screened for ZIKV and DENV neutralizing antibody utilizing a plaque reduction neutralization test (PRNT) on Vero cells as described in [43] against ZIKV-PR and DENV-3. Neutralization curves were generated using GraphPad Prism software. The resulting data were analyzed by non-linear regression to estimate the dilution of serum required to inhibit 50% and 90% of infection.

Fetal rhesus amniocentesis
Under real-time ultrasound guidance, a 22-gauge, 3.5-inch Quincke spinal needle was inserted into the amniotic sac. After 1.5-2 ml of fluid were removed and discarded due to potential maternal contamination, an additional 3-4 ml of amniotic fluid were removed for viral qRT-PCR analysis as described elsewhere [2,13]. These samples were obtained at the gestational ages 57, 71, 85, and 155 days. All fluids were free of any blood contamination.

Viral RNA isolation
Plasma was isolated from EDTA-anticoagulated whole blood collected the same day by Ficoll density centrifugation at 1860 rcf for 30 minutes. Plasma was removed to a clean 15 ml conical tube and centrifuged at 670 rcf for an additional 8 minutes to remove residual cells. Viral RNA was extracted from 300 µl plasma using the Viral Total Nucleic Acid Kit (Promega, Madison, WI) on a Maxwell 16 MDx instrument (Promega, Madison, WI). Tissues were processed with RNAlater (Invitrogen, Carlsbad, CA) according to the manufacturer's protocols. Viral RNA was isolated from the tissues using the Maxwell 16 LEV simplyRNA Tissue Kit (Promega, Madison, WI) on a Maxwell 16 MDx instrument. A range of 20-40 mg of each tissue was homogenized using homogenization buffer from the Maxwell 16 LEV simplyRNA Tissue Kit, the TissueLyser (Qiagen, Hilden, Germany) and two 5 mm stainless steel beads (Qiagen, Hilden, Germany) in a 2 ml snap-cap tube, shaking twice for 3 minutes at 20 Hz each side. The isolation was continued according to the Maxwell 16 LEV simplyRNA Tissue Kit protocol, and samples were eluted into 50 µl RNase-free water. RNA was then quantified using quantitative RT-PCR. If a tissue was negative by this method, a duplicate tissue sample was extracted using the Trizol Plus RNA Purification kit (Invitrogen, Carlsbad, CA). Because this purification kit allows for more than twice the weight of tissue starting material, there is an increased likelihood of detecting vRNA in tissues with low viral loads. RNA then was re-quantified using the same quantitative RT-PCR assay. Viral load data from plasma are expressed as vRNA copies/ml. Viral load data from tissues are expressed as vRNA copies/mg tissue.

Cesarean section and tissue collection (necropsy)
At ~155 days gestation, the fetus was removed via surgical uterotomy and maternal tissues were biopsied during laparotomy. These were survival surgeries for the dams. The entire conceptus (fetus, placenta, fetal membranes, umbilical cord, and amniotic fluid) was collected and submitted for necropsy. The fetus was euthanized with an overdose of sodium pentobarbital (50 mg/kg). Tissues were dissected using sterile instruments that were changed between each organ and tissue type to minimize possible cross contamination. Each organ/tissue was evaluated grossly in situ, removed with sterile instruments, placed in a sterile culture dish, and sectioned for histology, viral burden assay, or banked for future assays. Sampling priority for small or limited fetal tissue volumes (e.g., thyroid gland, eyes) was vRNA followed by histopathology, so not all tissues were available for both analyses. Sampling of all major organ systems and associated biological samples included the CNS (brain, spinal cord, eyes), digestive, urogenital, endocrine, musculoskeletal, cardiovascular, hematopoietic, and respiratory systems as well as amniotic fluid, gastric fluid, bile, and urine. A comprehensive listing of all specific tissues collected and analyzed is presented in S7 Table. Biopsies of the placental bed (uterine placental attachment site containing deep decidua basalis and myometrium), maternal liver, spleen, and a mesenteric lymph node were collected aseptically during surgery into sterile petri dishes, weighed, and further processed for viral burden and, when sufficient sample size was obtained, histology. Maternal decidua was dissected from the maternal surface of the placenta.

Histology
Tissues (except neural tissues) were fixed in 4% paraformaldehyde for 24 hours and transferred into 70% ethanol until alcohol processed and embedded in paraffin. Neural tissues were fixed in 10% neutral buffered formalin for 14 days until routinely processed and embedded in paraffin. Paraffin sections (5 µm) were stained with hematoxylin and eosin (H&E). Pathologists were blinded to vRNA findings when tissue sections were evaluated microscopically. Photomicrographs were obtained using Olympus BX43 and Olympus BX46 bright light microscopes (Olympus Inc., Center Valley, PA) with attached Olympus DP72 digital camera (Olympus Inc.) and Spot Flex 152 64 Mp camera (Spot Imaging) and captured using commercially available image-analysis software (cellSens Dimension, Olympus Inc. and spot software 5.2).

Quantitative reverse transcription PCR (qRT-PCR)
For ZIKV-PR, vRNA from plasma and tissues was quantified by qRT-PCR using primers with a slight modification to those described by Lanciotti et al. to accommodate African lineage ZIKV sequences [44]. The modified primer sequences are: forward 5'-CGYTGCCCAACACAAGG-3', reverse 5'-CACYAAYGTTCTTTTGCABACAT-3', and probe 5'-6fam-AGCCTACCTTGAYAAGCARTCAGACACYCAA-BHQ1-3'. The RT-PCR was performed using the SuperScript III Platinum One-Step Quantitative RT-PCR system (Invitrogen, Carlsbad, CA) on a LightCycler 480 instrument (Roche Diagnostics, Indianapolis, IN). The primers and probe were used at final concentrations of 600 nM and 100 nM, respectively, along with 150 ng random primers (Promega, Madison, WI). Cycling conditions were as follows: 37˚C for 15 min, 50˚C for 30 min and 95˚C for 2 min, followed by 50 cycles of 95˚C for 15 sec and 60˚C for 1 min. Viral RNA concentration was determined by interpolation onto an internal standard curve composed of seven 10-fold serial dilutions of a synthetic ZIKV RNA fragment based on a ZIKV strain derived from French Polynesia that shares >99% similarity at the nucleotide level to the Puerto Rican strain used in the infections described in this manuscript.
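Interpolation onto a standard curve can be sketched as a least-squares line relating quantification cycle (Cq) to log10 copy number, which is then inverted for unknown samples. This is a generic illustration and not the LightCycler 480 software's algorithm; the function names and the standard values are hypothetical.

```python
import math


def fit_standard_curve(standard_copies, standard_cq):
    """Least-squares fit of Cq = slope * log10(copies) + intercept."""
    x = [math.log10(c) for c in standard_copies]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(standard_cq) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, standard_cq))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def interpolate_copies(cq, slope, intercept):
    """Invert the fitted line to estimate copies per reaction for a sample Cq."""
    return 10 ** ((cq - intercept) / slope)


# Hypothetical standards: seven 10-fold dilutions with idealized Cq values.
copies = [10 ** k for k in range(1, 8)]
cqs = [40.0 - 3.32 * math.log10(c) for c in copies]
slope, intercept = fit_standard_curve(copies, cqs)
efficiency = 10 ** (-1 / slope) - 1  # amplification efficiency implied by the slope
print(round(slope, 2), round(interpolate_copies(26.72, slope, intercept)))
```

Converting copies per reaction to vRNA copies/ml additionally requires the extraction input volume, elution volume, and template volume per reaction, which are not modeled here.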

Deep sequencing
Virus populations replicating in macaque plasma or mosquito tissues were sequenced in duplicate using a method adapted from Quick et al. [45]. Viral RNA was isolated from mosquito tissues or plasma using the Maxwell 16 Total Viral Nucleic Acid Purification kit, according to manufacturer's protocol. Viral RNA then was subjected to RT-PCR using the SuperScript IV Reverse Transcriptase enzyme (Invitrogen, Carlsbad, CA). Theoretical input viral template numbers are shown in Tables 1-3 and 5. For sequencing the entire ZIKV genome, the cDNA was split into two multiplex PCR reactions using the PCR primers described in Quick et al. with the Q5 High-Fidelity DNA Polymerase enzyme (New England Biolabs, Inc., Ipswich, MA). For sequencing solely the barcode region, individual PCR reactions were performed that either used a primer pair generating a 131 bp amplicon (131F: 5'-TGGTTGGCAATACGAGCGATGGTT-3'; 131R: 5'-CCCCCGCAAGTAGCAAGGCCTG-3') or a 178 bp amplicon (178F: 5'-CCTTGGAAGGCGACCTGATGGTTCT-3'; 178R (same as 131R): 5'-CCCCCGCAAGTAGCAAGGCCTG-3'). Purified PCR products were tagged with the Illumina TruSeq Nano HT kit or the and sequenced with a 2 x 300 kit on an Illumina MiSeq.

Sequence analysis
Full genome ZIKV sequences generated with the multiplex PCR approach were analyzed using a workflow we termed "Zequencer_2017" (https://bitbucket.org/dholab/zikv_barcode_manuscript_scripts/src). Briefly, sequences were analyzed using a series of custom Python scripts. To characterize the entire ZIKV genome, up to 1000 reads spanning each of the 35 amplicons were extracted from the data set and then mapped to the Zika reference for PRVABC59 (GenBank:KU501215). Variant nucleotides were called using SNPeff [46], using a 5% cutoff. Mapped reads and reference scaffolds were loaded into Geneious Pro (Biomatters, Ltd., Auckland, New Zealand) for intrasample variant calling and differences between each sample and the KU501215 reference were determined. Sequence alignments of the stock viruses can be found in the sequence read archive: ZIKV-IC (accession number: SRX3258286); ZIKV-BC-1.0 (accession number: SRX3258287).
To characterize the barcodes and their frequencies, we developed a workflow called "ZIKV_barcode_analysis" (https://bitbucket.org/dholab/zikv_barcode_manuscript_scripts/src) that makes use of the bbmap suite of tools. Briefly, paired-end reads were merged and quality trimmed using bbmerge. Then, reads containing the barcode were extracted, using bbduk to select reads containing both 20 bp sequences upstream and downstream of the barcode region. These reads were mapped against the Zika reference (GenBank:KU501215) with bbmap, and were then oriented and trimmed so that only the 24 bp barcode remained. Identical barcodes were identified and counted. Custom Python scripts were used to identify authentic barcodes and calculate their frequency in each sample.
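The extraction and counting logic can be sketched in plain Python. This is a simplified stand-in for the bbmerge/bbduk/bbmap steps and not the authors' scripts: it assumes reads are already merged, quality trimmed, and in the forward orientation, and the flank sequences passed in are placeholders supplied by the caller.

```python
from collections import Counter

BARCODE_LEN = 24  # eight codons, per the text


def extract_barcode(read, upstream, downstream, barcode_len=BARCODE_LEN):
    """Return the sequence between the two flanks, or None if it is not a clean barcode."""
    start = read.find(upstream)
    if start == -1:
        return None
    start += len(upstream)
    end = read.find(downstream, start)
    # Reject reads lacking the downstream flank or carrying an indel in the barcode.
    if end - start != barcode_len:
        return None
    return read[start:end]


def barcode_frequencies(reads, upstream, downstream):
    """Count identical barcodes and return their relative frequencies."""
    extracted = (extract_barcode(r, upstream, downstream) for r in reads)
    counts = Counter(bc for bc in extracted if bc is not None)
    total = sum(counts.values())
    return {bc: n / total for bc, n in counts.items()} if total else {}
```

Deciding which observed barcodes are authentic, for example by applying a frequency threshold to exclude sequencing errors, is a separate step that this sketch does not attempt.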

Diversity and similarity analysis
The diversities of the sequence populations were evaluated using the Simpson's diversity index: $D_S = 1 - \sum_{i=1}^{c} \frac{n_i (n_i - 1)}{n (n - 1)}$, where $n_i$ is the number of copies of the $i$th unique sequence, $c$ is the number of different unique sequences, and $n$ is the total number of sequences in the sample.
The similarities between pairs of samples were assessed using the Morisita-Horn similarity index: $C_{MH} = \frac{2 \sum_{i=1}^{c} f_i g_i}{\sum_{i=1}^{c} f_i^2 + \sum_{i=1}^{c} g_i^2}$, where $f_i = n_{1i}/N_1$ and $g_i = n_{2i}/N_2$, $n_{1i}$ and $n_{2i}$ are the number of copies of the $i$th unique sequence in samples 1 and 2, and $N_1$ and $N_2$ are the total number of sequences in samples 1 and 2, respectively. The summations in the numerator and the denominator are over the $c$ unique sequences in both samples. The Simpson's diversity and Morisita-Horn similarity indices account for both the number of unique sequences and their relative frequencies. These relative diversity and similarity indices range in value from 0 (minimal diversity/similarity) to 1 (maximal diversity/similarity). The Simpson's diversity index considers a more diverse population as one with a more even distribution of sequence frequencies, and the Morisita-Horn similarity index considers populations to be more similar if the higher frequency sequences in both samples are common to both samples and have similar relative frequencies.
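A minimal sketch of both indices follows, assuming the commonly used forms D = 1 - Σ n_i(n_i - 1) / (n(n - 1)) for Simpson's diversity and C = 2 Σ f_i g_i / (Σ f_i² + Σ g_i²) for Morisita-Horn similarity, with the symbols as defined in the text. The function names and the example barcode counts are hypothetical.

```python
def simpson_diversity(counts):
    """Simpson's diversity index from a list of per-barcode read counts."""
    n = sum(counts)
    if n < 2:
        raise ValueError("need at least two sequences in the sample")
    return 1 - sum(c * (c - 1) for c in counts) / (n * (n - 1))


def morisita_horn(counts1, counts2):
    """Morisita-Horn similarity between two samples given as {barcode: count} dicts."""
    total1 = sum(counts1.values())
    total2 = sum(counts2.values())
    barcodes = set(counts1) | set(counts2)  # unique sequences across both samples
    f = {b: counts1.get(b, 0) / total1 for b in barcodes}
    g = {b: counts2.get(b, 0) / total2 for b in barcodes}
    numerator = 2 * sum(f[b] * g[b] for b in barcodes)
    denominator = sum(v * v for v in f.values()) + sum(v * v for v in g.values())
    return numerator / denominator


# Hypothetical barcode counts for two samples.
day3 = {"bcA": 500, "bcB": 300, "bcC": 200}
day5 = {"bcA": 450, "bcB": 350, "bcC": 200}
print(simpson_diversity(list(day3.values())), morisita_horn(day3, day5))
```

A sample dominated by a single barcode gives a diversity near 0, and two samples that share no barcodes give a similarity of 0, which matches the interpretation of the indices given above.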

Data availability
Primary data that support the findings of this study are available at the Zika Open-Research Portal (https://zika.labkey.com). Raw FASTQ sequencing data are available at the sequence read archive, accession number: SRP131908. The authors declare that all other data supporting the findings of this study are available within the article and its supplementary information files.