Informing the Historical Record of Experimental Nonhuman Primate Infections with Ebola Virus: Genomic Characterization of USAMRIID Ebola Virus/H.sapiens-tc/COD/1995/Kikwit-9510621 Challenge Stock “R4368” and Its Replacement “R4415”

The creation of licensed medical countermeasures against Select Agents such as Ebola virus (EBOV) is critically dependent on the use of standardized reagents, assays, and animal models. We performed full genome reconstruction, population genomics, contaminant analysis, and characterization of the glycoprotein gene editing site of historical United States Army Medical Research Institute of Infectious Diseases (USAMRIID) nonhuman-primate challenge stock Ebola virus Kikwit “R4368” and its 2014 replacement “R4415.” We also provide characterization of the master stock used to create “R4415.” The obtained data are essential to understanding the quality of the seed stock reagents used in pivotal animal studies that have been used to inform medical countermeasure development. Furthermore, these data might add to the understanding of the influence of EBOV variant populations on pathogenesis and disease outcome and inform attempts to avoid the evolution of EBOV escape mutants in response to current therapeutics. Finally, as the primary challenge stocks have changed over time, these data will provide a baseline for understanding and correlating past and future animal study results.


Introduction
Ebola virus disease (EVD) is a frequently lethal human viral hemorrhagic fever caused by four distinct ebolaviruses (Bundibugyo virus, Ebola virus, Sudan virus, and Taï Forest virus). EVD occurs sporadically and usually affects no more than several hundred people during an outbreak (reviewed in [1]). In 2014, Ebola virus (EBOV) was identified as the etiological agent of an unprecedented EVD outbreak that started in Western Africa in late 2013 and has since caused 28,637 cases and 11,315 deaths (as of January 3, 2016) [2].
Taxonomically, EBOV is the only member of the species Zaire ebolavirus in the genus Ebolavirus (Mononegavirales: Filoviridae) [3]. The EBOV genome is rather conserved over time and geographic distances, which may be due to genetic bottlenecks in the yet-unidentified host reservoir [1,4]. For instance, the EBOV variant that caused the 2013-present outbreak in Western Africa (Makona [5]) differs from those that caused EVD outbreaks in Zaire in 1976 (Yambuku [6,7]) and Zaire's successor country Democratic Republic of the Congo in 1995 (Kikwit [8]) by less than 3% over the entire 19 kb genome [9].
The development, evaluation, and final licensure of medical countermeasures (MCMs) against EVD in the US is critically dependent on standardized animal models of filovirus infection and standardized assays and reagents [10][11][12], including well-characterized virus stocks. Ebola virus/H.sapiens-tc/COD/1976/Yambuku-Mayinga (EBOV/Yam-May) is the best characterized EBOV isolate and has been used for the majority of in vitro experiments [1,7,13]. Ebola virus/H.sapiens-tc/COD/1995/Kikwit-9510621 (EBOV/Kik-9510621), on the other hand, has become the most used EBOV isolate for animal, and in particular nonhuman primate (NHP), experimentation in the United States (US) [1,13,14]. Unfortunately, this isolate has been passaged/maintained by different procedures in different locations and even within the same institutes. These variable procedures result in NHP challenge stocks of different quality and possibly in different experimental outcomes upon use. In addition, only one of these NHP challenge stocks has been genomically characterized to assess potential mutations relative to the published consensus genome sequence [14].
All ebolaviruses make use of co-transcriptional editing of their glycoprotein-encoding GP genes to access three partially overlapping open reading frames (ORFs) [15,16]. Editing occurs at a distinct editing site in the GP gene that typically consists of seven consecutive uridylyls (7U). Read-through results in the transcription of a cognate 7-adenylyl mRNA (7A mRNA) and thereby in the expression of pre-sGP, a protein precursor that is proteolytically processed to a secreted homodimeric glycoprotein (sGP) and a probably monomeric secreted peptide (Δ-peptide). Stuttering of the ebolavirus RNA-dependent RNA polymerase (L) at the editing site results in transcription of mRNAs that contain various numbers of adenylyls. The majority of non-7A mRNAs contain either eight adenylyls, leading to the expression of the homotrimeric ebolavirus spike glycoprotein GP1,2, or six or nine adenylyls, leading to expression of the small soluble glycoprotein (ssGP) [15][16][17][18][19][20]. The functions of sGP, ssGP, and Δ-peptide are largely unknown, but a role for sGP in immune evasion has been described [21]. Editing appears to be tightly controlled, with the ratio of proteins expressed from a 7U virus of roughly sGP:GP1,2:ssGP = 70%:25%:5% [17,22].
EBOV, Sudan virus (SUDV), and possibly other ebolaviruses adapt to different in vitro and in vivo environments [14,23,24,25]. These adaptations include changes in the EBOV and SUDV GP gene editing sites. For instance, during serial passage of a 7U EBOV/Yam-May isolate in Vero E6 cells, a viral population evolved that predominantly contained an 8U editing site. The opposite occurs in vivo: 8U EBOV/Yam-May populations converted to 7U populations in guinea pigs [10]. These changes may be related to selective advantages linked to the controlled expression of GP1,2 and/or sGP. Preliminary evaluations indicate that a 7U→8U mutation in EBOV does not alter pathogenesis in guinea pigs or nonhuman primates [25]. However, the observations described above indicate that there may be selective advantages associated with the ratio of expressed sGP:GP1,2 in different environments (8U-containing viruses predominantly express GP1,2 rather than sGP). In addition, a recent study reported subtle, but significant, differences in disease course and severity between nonhuman primates exposed to 7U or 8U virus stock variants [26]. All of these findings underscore the importance of fully characterizing viral stocks used in MCM development.
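The relationship between the number of adenylyls at the GP editing site of an mRNA and the expressed product, as described above, can be illustrated with a minimal Python sketch. The function names and the transcript counts in the usage note are hypothetical and serve only as an illustration; they are not data from the stocks characterized here, and the mapping covers only the adenylyl counts named in the text.

```python
from collections import defaultdict

# Product expressed for each adenylyl count at the GP editing site,
# limited to the counts described in the text (6, 7, 8, and 9).
# 7A mRNAs encode pre-sGP, which is processed to sGP and delta-peptide;
# the sketch labels that class simply as "sGP".
PRODUCT_BY_ADENYLYL_COUNT = {
    6: "ssGP",
    7: "sGP",
    8: "GP1,2",
    9: "ssGP",
}


def product_for(adenylyl_count):
    """Return the product for a given adenylyl count, or None if the
    count is not one of those described in the text."""
    return PRODUCT_BY_ADENYLYL_COUNT.get(adenylyl_count)


def product_fractions(transcript_counts):
    """Summarize hypothetical per-count transcript tallies
    ({adenylyl_count: number_of_transcripts}) as product fractions.

    Counts outside the described set are grouped under "other".
    """
    total = sum(transcript_counts.values())
    if total <= 0:
        raise ValueError("at least one transcript is required")
    totals = defaultdict(int)
    for adenylyls, n in transcript_counts.items():
        totals[product_for(adenylyls) or "other"] += n
    return {product: n / total for product, n in totals.items()}
```

For example, `product_fractions({6: 2, 7: 140, 8: 50, 9: 8})` groups the 6A and 9A transcripts together as ssGP and reports each product as a fraction of the 200 hypothetical transcripts.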
Here, we report the coding-complete genome sequence (see [27] for sequencing terminology), population characterization, and contaminant analysis of an EBOV/Kik-9510621 challenge stock "R4368" (passage 4, 8U) that was used in the past for NHP studies at the United States Army Medical Research Institute of Infectious Diseases (USAMRIID) at Fort Detrick, Maryland. We further report the complete genome sequence, at the same level of characterization, of challenge stock "R4415" (passage 3, 7U), which replaced "R4368" (passage 4, 8U) in 2014, and of the master stock used to establish this new stock, "R4414" (passage 2, 7U).

History of USAMRIID EBOV/Kik-9510621 challenge stocks
A clinical specimen (designated Centers for Disease Control and Prevention Special Pathogens Branch Log [CDC SPBLOG] 9510621) was obtained from an EBOV-infected 65-year-old female patient during an EVD outbreak that occurred in 1995 around Kikwit, Zaire (today Democratic Republic of the Congo, COD). The patient's disease onset was recorded as April 29, 1995. She was hospitalized on May 1, 1995, and died on May 5, 1995. The clinical specimen, most likely plasma or serum, was obtained from the patient on May 4, 1995. How the patient became infected and what medical care she may have received are unclear. Unfortunately, chain-of-custody records that further detail the origin of 9510621 or its shipment to the CDC are no longer available, and EBOV titration was not attempted from this specimen prior to cell-culture passage.
The first passage of EBOV/Kik-9510621, designated "virus seed pool (VSP) 807223" (Fig 1), was conducted at the CDC using grivet (Chlorocebus aethiops) Vero E6 cells (ATCC #CRL-1586). The multiplicity of infection (MOI) used for this passage is unknown. Virus was harvested after a 6-day incubation period on May 19, 1995, but the method of harvest (initial freeze/thaw, clarification by centrifugation, etc.) is not indicated in the available records. A second passage of virus was conducted at the CDC, again using Vero E6 cells, and resulted in "VSP 807224" (MOI unknown). Virus was harvested after an 8-day incubation, using a "freeze/thaw" method, on May 29, 1995. A titer of 3.2E+06 was determined for "VSP 807224" in January 2008 using the TCID50/Reed-Muench viral titration method. "VSP 807224" was transferred from the CDC to USAMRIID between late May and late June 1995.
For rapid amplification of cDNA ends (RACE), "R4415" (passage 3) RNAs were extracted with Zymo Direct-zol (Zymo Research Corporation, Irvine, CA) from cell-culture supernatant in TRIzol according to the manufacturer's instructions. The SMARTer RACE 5'/3' kit (Clontech Laboratories, Inc., Mountain View, CA) was used to amplify both the 5' and 3' untranslated regions (UTRs) of the virus genome from the extracted RNAs. The kit's two-stage nested PCR protocol was found to be optimal. The gene-specific primers for each RACE experiment were as follows: 5' RACE (outer primer): ATTACCAGAGTTGATTAGTGTG; 5' RACE (inner primer): TTAAATAACGAAAGGAGTC; 3' RACE (outer primer): TGAATCTCCAATCCTCTAAGTA; 3' RACE (inner primer): AAGGGATTTTCAACTGAGCACACT.
Amplification primer removal, duplicate removal, low average quality exclusion (<Q30), and quality trimming were performed. Viral assemblies were completed in DNAStar Lasergene nGen (Madison, WI) with 4×10^5 reads. Only single nucleotide polymorphisms (SNPs) present in the population above the 2% threshold are presented in this report (however, the alignment files are provided in SRA if a less conservative approach is desired). Given this threshold, at a target depth of 200 a variant must have 4 supporting reads before it is called an SNP. The depths are reported in the tables presented in this text for all samples (below a depth of 200, calls should be viewed with increasing skepticism). A consensus change is defined here as a change relative to the published sequence for EBOV/Kik-9510621 "134" (GenBank accession # AY354458) that is present in at least 50% of the population. Below that threshold, SNPs are considered subclonal substitutions and part of a minority subpopulation of the virus.
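The reporting thresholds described above can be illustrated with a short Python sketch. It is not the pipeline used in this study; the function names and category labels are hypothetical. It also assumes that both thresholds are inclusive (a variant at exactly 2% is reported, and a variant at exactly 50% counts as a consensus change), an interpretation chosen because 4 supporting reads at a depth of 200 correspond to exactly 2%.

```python
# Thresholds taken from the text, expressed as whole percentages so that
# all comparisons can be done with exact integer arithmetic.
REPORTING_THRESHOLD_PERCENT = 2    # minimum population frequency to report an SNP
CONSENSUS_THRESHOLD_PERCENT = 50   # frequency at which a change is "consensus"
TARGET_DEPTH = 200                 # below this depth, calls deserve more skepticism


def min_supporting_reads(depth, percent=REPORTING_THRESHOLD_PERCENT):
    """Smallest read count that reaches `percent` of `depth` (ceiling division)."""
    if depth <= 0:
        raise ValueError("depth must be positive")
    return (depth * percent + 99) // 100


def classify_variant(supporting_reads, depth):
    """Classify a variant by its frequency in the read population.

    Assumes inclusive thresholds (see the lead-in); labels are hypothetical.
    """
    if depth <= 0:
        raise ValueError("depth must be positive")
    if not 0 <= supporting_reads <= depth:
        raise ValueError("supporting_reads must lie between 0 and depth")
    if supporting_reads * 100 >= CONSENSUS_THRESHOLD_PERCENT * depth:
        return "consensus change"
    if supporting_reads * 100 >= REPORTING_THRESHOLD_PERCENT * depth:
        return "subclonal substitution"
    return "below reporting threshold"


def is_low_depth(depth):
    """Flag positions sequenced below the target depth."""
    return depth < TARGET_DEPTH
```

With these definitions, `min_supporting_reads(200)` reproduces the 4 supporting reads stated in the text, and `classify_variant(4, 200)` falls into the subclonal category, whereas `classify_variant(3, 200)` falls below the reporting threshold.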

Ethics Statement
Research has been reviewed for compliance with dual-use guidelines and approved for publication by the USAMRIID Institute Biosafety Committee (IBC) and the Operational Security office.

Discussion
We have provided here a concise report on the history and genomic characterization of the USAMRIID Ebola virus/H.sapiens-tc/COD/1995/Kikwit-9510621 NHP challenge stocks "R4368" (passage 4) and "R4415" (passage 3), as well as the "R4415" (passage 3) predecessor, "R4414" (passage 2). "R4368" (passage 4) was used between July 2011 and December 2014 in both in vitro and in vivo Ebola virus experimentation, including pathogenesis studies and candidate medical countermeasure evaluation. "R4415" (passage 3) replaced "R4368" (passage 4) in December 2014 and has been used for most pathogenesis and medical countermeasure evaluation research since then. This work provides a framework for genomic comparison between past experiments as challenge stocks are replaced to address propagation issues and depletion. Characterization of these NHP challenge stocks was completed to the level of "Coding Complete" plus population-level characterization in the case of "R4368" (passage 4) and "Finished" in the case of "R4415" (passage 3) [27]. This characterization includes genome reconstruction (excluding determination of the 3' and 5' UTRs in the case of "R4414" (passage 2) and "R4415" (passage 3)), characterization of intrahost single-nucleotide variants (iSNVs), and determination of the absence of contaminants. Studies to determine the role of the identified iSNVs in interactions with the host (e.g., in the immune response) are being considered as an extension of this body of work. This level of characterization is crucial for studies evaluating the possibility of EBOV escape from candidate therapeutics or vaccines, as minority variants can play an important phenotypic role in viral escape [28].