
Choclo virus (CHOV) recovered from deep metatranscriptomics of archived frozen tissues in natural history biorepositories

  • Paris S. Salazar-Hamm,

    Roles Data curation, Formal analysis, Methodology, Writing – original draft, Writing – review & editing

    psalazarhamm@salud.unm.edu (PSSH); DLDinwiddie@salud.unm.edu (DLD)

    Affiliations Clinical and Translational Science Center, University of New Mexico, Albuquerque, New Mexico, United States of America, Center for Global Health, Department of Internal Medicine, University of New Mexico Health Sciences Center, Albuquerque, New Mexico, United States of America, Department of Biology, University of New Mexico, Albuquerque, New Mexico, United States of America

  • William L. Johnson,

    Roles Data curation, Methodology, Writing – review & editing

    Affiliation Department of Pediatrics, University of New Mexico Health Sciences Center, Albuquerque, New Mexico, United States of America

  • Robert A. Nofchissey,

    Roles Data curation, Writing – review & editing

    Affiliation Center for Global Health, Department of Internal Medicine, University of New Mexico Health Sciences Center, Albuquerque, New Mexico, United States of America

  • Jacqueline R. Salazar,

    Roles Data curation, Writing – review & editing

    Affiliation Department of Research in Emerging and Zoonotic Infectious Diseases, Gorgas Memorial Institute of Health Studies, Panama City, Panama

  • Publio Gonzalez,

    Roles Data curation, Writing – review & editing

    Affiliation Department of Research in Emerging and Zoonotic Infectious Diseases, Gorgas Memorial Institute of Health Studies, Panama City, Panama

  • Samuel M. Goodfellow,

    Roles Formal analysis

    Affiliation Center for Global Health, Department of Internal Medicine, University of New Mexico Health Sciences Center, Albuquerque, New Mexico, United States of America

  • Jonathan L. Dunnum,

    Roles Data curation, Writing – review & editing

    Affiliations Department of Biology, University of New Mexico, Albuquerque, New Mexico, United States of America, Museum of Southwestern Biology, University of New Mexico, Albuquerque, New Mexico, United States of America

  • Steven B. Bradfute,

    Roles Supervision, Writing – review & editing

    Affiliation Center for Global Health, Department of Internal Medicine, University of New Mexico Health Sciences Center, Albuquerque, New Mexico, United States of America

  • Blas Armién,

    Roles Data curation, Investigation, Supervision, Writing – review & editing

    Affiliations Department of Research in Emerging and Zoonotic Infectious Diseases, Gorgas Memorial Institute of Health Studies, Panama City, Panama, Sistema Nacional de Investigación (SNI), Secretaria Nacional de Ciencia, Tecnología e Innovacion (SENACYT), Panama City, Panama

  • Joseph A. Cook,

    Roles Data curation, Investigation, Supervision, Writing – review & editing

    Affiliations Department of Biology, University of New Mexico, Albuquerque, New Mexico, United States of America, Museum of Southwestern Biology, University of New Mexico, Albuquerque, New Mexico, United States of America

  • Daryl B. Domman,

    Roles Conceptualization, Investigation, Methodology, Supervision, Writing – review & editing

    Affiliations Clinical and Translational Science Center, University of New Mexico, Albuquerque, New Mexico, United States of America, Center for Global Health, Department of Internal Medicine, University of New Mexico Health Sciences Center, Albuquerque, New Mexico, United States of America

  • Darrell L. Dinwiddie

    Roles Conceptualization, Data curation, Methodology, Supervision, Writing – review & editing

    psalazarhamm@salud.unm.edu (PSSH); DLDinwiddie@salud.unm.edu (DLD)

    Affiliation Department of Pediatrics, University of New Mexico Health Sciences Center, Albuquerque, New Mexico, United States of America

Abstract

Background

Hantaviruses are negative-stranded RNA viruses that can sometimes cause severe disease in humans; however, they are maintained in mammalian host populations without causing harm. In Panama, sigmodontine rodents serve as hosts to transmissible hantaviruses. Due to natural and anthropogenic forces, these rodent populations are having increased contact with humans.

Methods

We extracted RNA and performed Illumina deep metatranscriptomic sequencing on Orthohantavirus seropositive museum tissues from rodents. We acquired sequence reads mapping to Choclo virus (CHOV, Orthohantavirus chocloense) from heart and kidney tissue of a two-decade old frozen museum sample from a Costa Rican pygmy rice rat (Oligoryzomys costaricensis) collected in Panama. Reads mapped to the CHOV reference were assembled and then validated by visualization of the mapped reads against the assembly.

Results

We recovered a 91% complete consensus sequence from a reference-guided assembly to CHOV with an average of 16X coverage. The S and M segments used in our phylogenetic analyses were nearly complete (98% and 99%, respectively). There were 1,199 ambiguous base calls, of which 93% were in the L segment. Our assembled genome differed by 1.1% from the CHOV reference sequence, resulting in eight nonsynonymous mutations. Further analysis of all publicly available partial S segment sequences supports a clear relationship between CHOV clinical cases and O. costaricensis acquired strains.

Conclusions

Viruses occurring at extremely low abundances can be recovered from deep metatranscriptomics of archival tissues housed in research natural history museum biorepositories. Our efforts resulted in the second publicly available CHOV genome. These genomic data are important for future surveillance and diagnostic tools as well as for understanding the evolution and pathogenicity of CHOV.

Author summary

Hantavirus cardiopulmonary syndrome (HCPS) in Panama, caused by Choclo virus (CHOV, Orthohantavirus chocloense), is intimately linked to the primary mammalian reservoir host, the Costa Rican pygmy rice rat (Oligoryzomys costaricensis). Although the prevalence of hantavirus disease is relatively low in Panama, over a quarter of the country has the agroecological conditions that favor this rodent. In addition, serologic evidence suggests infections are under-reported. Sequence data of the pathogen and host collected across temporal and spatial scales is necessary for diagnostics, surveillance, and forecasting; however, only one complete genome is available in NCBI GenBank. By leveraging deep metatranscriptomics of archived frozen mammal tissues, we generated a low-coverage genome using a reference-guided assembly approach. Sequence data can be used to develop pan-hantavirus diagnostic tools to facilitate acquisition of more detailed genetic data from archival samples to increase our understanding of the evolutionary and population dynamics of rare and neglected hantaviruses. This approach further illustrates the utility of cryopreserved biorepositories archived in natural history museums for pathogen discovery and pathobiology. Generating additional genomic sequence data will also be essential for developing a rigorous taxonomic framework to improve the understanding of hantavirus diversity and distribution.

Introduction

Hantaviruses are tri-segmented negative-stranded RNA viruses within the family Hantaviridae that can cause two severe diseases in human populations, namely hemorrhagic fever with renal syndrome (HFRS) and hantavirus cardiopulmonary syndrome (HCPS). Hantaviruses in the Americas are more closely associated with HCPS, which is characterized by fever, headache, myalgia, hypotension, and thrombocytopenia that can progress to cardiopulmonary failure. The mortality rate for HCPS is estimated at 15–50% and varies among virus species and across countries [1–5].

Despite high mortality rates in humans, hantaviruses are maintained naturally in rodent populations and can persist for months to the lifetime of the animal [6]. Infected rodents shed virus through saliva, urine, and feces, which can form aerosols that are inhaled by other rodents or humans [7]. In Panama, the Costa Rican pygmy rice rat (Oligoryzomys costaricensis) serves as a rodent reservoir for Choclo virus (CHOV, Orthohantavirus chocloense). O. costaricensis is distributed across areas that are experiencing high levels of habitat conversion from natural to agricultural lands. This transformation is hypothesized to increase population densities of commensal rodents and subsequent human contact with infected rodents, ultimately increasing zoonotic pathogen transmission [8]. CHOV was first identified by RT-PCR after an outbreak of HCPS in the agroecosystems of western Panama from December 1999 to February 2000 [9,10]. Monitoring of human and rodent populations in Panama over the past two decades has discovered multiple hantaviruses in the country (e.g., Calabazo virus and Rio Segundo virus); however, CHOV is responsible for almost all human cases [11–13]. From 2001 to 2007, multiple community-wide surveys of western Panamanians without reported HCPS symptomology found that 16–60% of individuals were hantavirus seropositive depending on region [14], documenting that many mild or asymptomatic exposures were not accounted for in clinical case count data. CHOV-associated disease generally has a lower case fatality rate (mean 7.9%, range from 3% to 50.0%) than other New World hantaviruses, but unfortunately predominantly affects young people (ages 20–49) [12].

Despite being first reported more than 40 years ago, hantaviruses are often described as ‘emerging’ pathogens due to their increasing number of infections, global distribution, and great breadth of pathogen diversity [15]. Current proactive approaches aimed at pathogen prediction are utilizing hantaviruses as a model for understanding spillover events [16]. Diverse specimens of wild mammals archived in museum biorepositories over temporal and spatial scales are increasingly being utilized for surveillance and characterization of emerging diseases [17–20]. For instance, the first complete CHOV sequence was obtained from archival Oligoryzomys costaricensis (= fulvescens, [21]) splenic tissue (MSB:Mamm:96073) using Sanger sequencing [13]. Of the 69 CHOV sequences available in NCBI GenBank, only nine are considered complete segments, eight of which are derived from the same voucher specimen (accessed August 10, 2023). Here, we deep sequenced archived mammalian tissue specimens to generate a metatranscriptome-assembled CHOV genome, doubling the number of available CHOV genomes.

Methods

Sample acquisition

We requested ten samples designated Orthohantavirus positive by an immunoglobulin G (IgG) serological screening test from the University of New Mexico Museum of Southwestern Biology (MSB) for metatranscriptomic sequencing. Upon collection of specimens in the field, tissues were immediately flash frozen in liquid nitrogen, subsequently transferred to -80°C freezers at the Instituto Conmemorativo Gorgas in Panama City, and then permanently archived in vapor phase nitrogen freezers (-196°C) in the MSB Division of Genomic Resources. One sample (10%), MSB:Mamm:131232, resulted in sufficient sequencing depth to assemble a Choclo virus (CHOV, Orthohantavirus chocloense) RNA genome. This voucher specimen was from an adult female Costa Rican pygmy rice rat (Oligoryzomys costaricensis) collected in El Bebedero, Tonosi, Los Santos, Panama in January 2003. Identification of host species was based on morphological characters and was subsequently verified using mitochondrial cytochrome b region sequence data (GenBank accession OR365535).

RNA extraction, amplification, and sequencing

We performed a total RNA extraction on frozen tissue using the QIAamp Viral RNA Mini Kit (Qiagen Inc, Cambridge, MA, USA) according to the manufacturer’s instructions, with slight modifications. Briefly, the tissue was bead beaten using 0.8 g of 1.0-mm Zirconia beads (BioSpec Inc, Bartlesville, OK, USA) and 1.5 g of 2.3-mm Zirconia beads (BioSpec Inc) in 800 μl of AVL buffer using the Benchmark Bead Bug-6 homogenizer at a speed of 4,350 rpm for 45 sec for 2 cycles with a 1 min rest in between. Homogenates were centrifuged twice to pellet debris, first at 4,000 rpm for 7 min and then, after transfer, at 7,000 rpm for 10 min. The clear lysate was transferred to a new tube with the RNA carrier and vortexed for 15 sec. The final RNA isolation was conducted per the manufacturer’s protocol, including final elution with 50 μl of nuclease-free water.

Using the Zymo RNA Clean & Concentrator-5 kit, extracted RNA was concentrated and treated with DNase I by on-column digestion following the manufacturer’s protocol (Zymo Research, Irvine, CA, USA). The resulting RNA was depleted of ribosomal RNA for two hours and converted to cDNA using random hexamers, and i7 and i5 sequencing adaptors were added. Finally, individual samples were barcoded and amplified by PCR (7 cycles). All depletion and library preparation steps were conducted using the Zymo-Seq RiboFree Total RNA Kit (Zymo Research) following the manufacturer’s recommended protocol for degraded RNA. Prepared libraries were normalized to 2 nM, pooled, and combined with PhiX control (v.3, Illumina Inc, San Diego, CA, USA) at a final concentration of 1%. Pooled libraries were loaded at a final concentration of 750 pM and sequenced on an Illumina NextSeq 2000 using a P3 2x150 kit (Illumina Inc). De-multiplexing, adapter trimming, and preliminary QC were conducted using the Dragen pipeline (v.1.3.0, Illumina Inc). Reads were submitted to the NCBI Sequence Read Archive (SRA) under BioProject PRJNA1015235.

Metatranscriptomic RNA genome assembly

We initially processed the sequence data with the nextflow nf-core/mag pipeline for assembly, binning, and annotation of metagenomes [22]. Initial k-mer based taxonomic classification was performed by Kraken2 [23]. To maximize recovery of CHOV reads, Illumina paired-end reads were mapped to the CHOV reference sequence using BWA v.0.7.17 [24]. The reference consisted of the S and M segments from the RefSeq assembly (NC_038373.1 and NC_038374.1) and a more recently sequenced L segment (KT983773.1) linked through the Arctos specimen record (https://arctos.database.museum/guid/MSB:Mamm:96073). Mapped paired-end reads were filtered and converted from bam to fastq files using SAMtools v.1.15.1 [25]. We generated an assembly from the mapped reads using SPAdes v.3.15.5 [26]. The assembly and mapped reads were then visualized with Artemis v.18.0.0 [27] for validating base calls. A threshold for trusted base calls was set at 3X coverage. Assembly quality was assessed with QUAST v.5.2.0 [28]. The resulting assembly for each segment was deposited in NCBI GenBank under accessions OR365536, OR365537, OR365538 and linked to the voucher specimen as best practice [19,29]. We used the Bacterial and Viral Bioinformatics Resource Center (BV-BRC) v.3.30.19a annotation tool for protein prediction (https://www.bv-brc.org/).

Phylogenetic analyses

Reference sequences were obtained from GenBank for New World hantaviruses including Orthohantavirus andesense (Andes virus, ANDV), Orthohantavirus bayoui (Bayou virus, BAYV), Orthohantavirus nigrorivense (Black Creek Canal virus, BCCV), Orthohantavirus chocloense (Choclo virus, CHOV), Orthohantavirus moroense (El Moro Canyon virus, ELMCV), Orthohantavirus negraense (Laguna Negra virus, LANV and Rio Mamore virus, RIOMV), Orthohantavirus montanoense (isolate Limestone Canyon virus, LSCV), Orthohantavirus sinnombreense (New York virus, NYV and Sin Nombre virus, SNV), and for the Old World hantavirus Orthohantavirus seoulense (Seoul virus, SEOV) as an outgroup [30] (S1 Table). For each segment, sequences were aligned with mafft v.7.487 [31] using automatically determined settings (i.e., mafft --auto). The alignment was trimmed with trimal v.1.4.rev22 [32] in automated1 mode with the additional removal of positions with <50% representation (-resoverlap) and sequences with <60% representation (-seqoverlap). The resulting alignments were 1,825 bp for the S segment, 3,618 bp for the M segment, and 6,562 bp for the L segment. A concatenated alignment of the 5,443 bp complete S and M segment was also generated. Maximum likelihood trees were built in IQ-Tree v.1.6.12 [33] with the GTR+GAMMA model with 10,000 ultrafast bootstraps and 10,000 bootstraps for the SH-like approximate likelihood ratio (SH-aLRT) and visualized in ggtree [34] for each segment and the concatenated alignment. A tanglegram for evaluation of phylogenetic concordance between segments was visualized in R v.4.2.2.
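The concatenation step can be illustrated with a minimal Python sketch. It assumes each trimmed segment alignment has already been read into a dictionary mapping a taxon label to its aligned sequence; the function name, taxon labels, and toy sequences are hypothetical and are not taken from the study. Only taxa present in every segment are kept, which is one reasonable convention rather than a description of the authors' procedure.

```python
def concatenate_alignments(segments):
    """Join per-segment alignments end to end for taxa shared by all segments.

    segments: list of dicts, each mapping taxon label -> aligned sequence.
    Every sequence within one segment must have the same length.
    """
    for seg in segments:
        lengths = {len(seq) for seq in seg.values()}
        if len(lengths) > 1:
            raise ValueError("sequences within a segment alignment differ in length")

    # Keep only taxa that have a sequence in every segment.
    shared = set(segments[0])
    for seg in segments[1:]:
        shared &= set(seg)

    return {taxon: "".join(seg[taxon] for seg in segments)
            for taxon in sorted(shared)}


# Toy alignments (hypothetical labels and sequences, for illustration only).
s_aln = {"CHOV_ref": "ATGA", "CHOV_new": "ATGG", "SEOV": "ATCA"}
m_aln = {"CHOV_ref": "GGC-TT", "CHOV_new": "GGCATT", "SEOV": "GACATT"}
concat = concatenate_alignments([s_aln, m_aln])

# The concatenated length is the sum of the segment lengths,
# as with the reported 1,825 bp + 3,618 bp = 5,443 bp S+M alignment.
concat_length = len(next(iter(concat.values())))
```

In the toy case the concatenated alignment is 10 columns wide (4 + 6), mirroring how the reported S and M alignment lengths sum to 5,443 bp.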

Because the nucleocapsid protein is a primary detection marker for clinical diagnostics [35] and therefore the most abundant in sequence archives, we obtained 55 partial S segment sequences from GenBank by searching ‘Choclo orthohantavirus’ to explore strain diversity (S2 Table). The sequences were aligned, trimmed, and filtered as above, resulting in a 585 bp alignment. A maximum likelihood tree was built under the GTR+GAMMA model in IQ-Tree with ultrafast and SH-aLRT bootstraps and visualized in ggtree as described above.

Spatial distribution

We acquired GPS localities for capture sites of 32 Costa Rican pygmy rice rat (O. costaricensis) voucher specimens (https://arctos.database.museum) and for the approximate residency of 21 clinical cases of HCPS (S2 Table). The administrative political division of the Panamanian map was generated by the Topography Department of the National Institute of Statistics and Census of the General Comptroller of the Republic of Panama (https://www.inec.gob.pa/). The human cases and rodent capture sites were georeferenced using the Datum UTM, WGS 1984, with ArcMap software from ArcGIS v.10.7 (ESRI2019) [36]. Samples collected within 12 km of one another were aggregated based on maximum spatial movements of close rodent relatives [37,38]. Although we only present samples here with sequence data, previous surveillance efforts have found hantavirus seropositive O. costaricensis across its broad geographic range in Panama in five out of the nine ecoregions: Central American Atlantic Moist Forests, Isthmian-Pacific Moist Forests, Panamanian Dry Forests, Pacific Mangrove S. America, and Choco/Darién Moist Forests [36].
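As a rough illustration of aggregating localities that fall within 12 km of one another, the following Python sketch groups points greedily using great-circle (haversine) distance. This is an assumption-laden stand-in, not the ArcMap procedure used in the study: the grouping rule (distance to the first point of each group), the function names, and the coordinates are all hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def aggregate_sites(points, max_km=12.0):
    """Greedy grouping: a point joins the first group whose seed point
    (the group's first member) lies within max_km; otherwise it starts a new group."""
    groups = []
    for point in points:
        for group in groups:
            if haversine_km(*group[0], *point) <= max_km:
                group.append(point)
                break
        else:
            groups.append([point])
    return groups


# Hypothetical (latitude, longitude) pairs, not real capture sites.
sites = [(7.50, -80.40), (7.55, -80.40), (8.10, -80.40)]
groups = aggregate_sites(sites)
```

With these made-up coordinates the first two points are roughly 5.6 km apart and fall into one group, while the third is more than 60 km away and forms its own group. A greedy rule like this depends on input order; a real analysis might prefer a formal clustering method.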

Results and Discussion

An outbreak of hantavirus cardiopulmonary syndrome (HCPS) in western Panama from late 1999 to early 2000 [9,10] has spurred over two decades of epidemiological and wildlife surveillance [12,36]. During those 20 years, 712 clinical cases of HCPS were reported in Panama [12] and >11,000 specimens of non-volant mammals with archived biological materials were contributed to museum repositories (https://arctos.database.museum/). Of these, 883 have been identified as O. costaricensis, and 778 (88%) have been screened for prior hantavirus infection using an IgG strip immunoblot assay [39] with an average seropositivity of 16% [36]. We requested samples designated hantavirus positive at the University of New Mexico MSB for metatranscriptomic sequencing. Initial taxonomic assignment by Kraken2 classified fewer than 15 reads as Orthohantavirus in all samples except MSB:Mamm:131232, for which it identified 658 reads. Hantavirus-specific IgG antibodies can be detected up to six months after initial infection [40]. Given that 90% of the hantavirus-positive samples had 0 to 14 reads, there was likely no active infection in those animals.

We then took a reference-based mapping approach to maximize recovery of CHOV reads. We generated 113,237,341 reads from deep sequencing the metatranscriptome of MSB:Mamm:131232, of which 2,072 reads (0.002%) mapped to the CHOV reference. The average depth of coverage from the mapping-based alignment was 16X; however, coverage varied between segments, with the greatest coverage in the small (S) segment (25X) followed by the medium (M) segment (19X) and the large (L) segment (12X) (Fig 1). Due to the low abundance of sequencing reads mapping to CHOV, we implemented a 3X coverage threshold for base calling in the consensus genome. If the coverage at a site was below 5X, we only called a base when the allele frequency was 100% at that site. Using these parameters, we recovered a 91% complete CHOV assembly. Similar coverage thresholds have aided in the recovery of more complete assemblies [41,42]. This was an improvement upon our initial de novo assembly, which was only 64% complete. Under these thresholds, there were 1,199 ambiguous base calls, of which 93% (1,114 bp) were in the L segment. Therefore, completeness of the L segment (83%) was less than that of the M (99%) and the S (98%) segments. While our L segment is not complete, this marks a major improvement compared to many of the hantaviruses deposited in GenBank, which are missing the L segment (ELMCV, LANV, and NYV) or have only small portions of the L segment sequenced (BCCV). The International Committee on Taxonomy of Viruses Hantaviridae Study Group is revisiting requirements for proper hantavirid classification, including standards for minimum sequence quality and genome completeness [43] as well as minimum information standards for uncultured viral genomes [44]. Limited biological material precluded additional sequencing necessary for genome completion and verification of low coverage base calls.
However, our efforts are still highly valuable as they provide the second CHOV genome, which will aid in our ability to design PCR amplicon tiling array and hybridization target capture sequencing strategies that can lead to affordable and scalable targeted sequencing efforts [45]. Such protocols that enable high throughput and cost-effective means to generate robust genetic information are needed for future genomic comparative studies and systematic exploration of understudied viruses, such as CHOV.
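The base-calling rule described above (no call below 3X; unanimous support required below 5X) can be sketched in Python as follows. The study validated base calls by visual inspection in Artemis, so this is only an illustrative restatement of the thresholds, not the authors' code. In particular, the text does not state the rule applied at 5X and above; the simple-majority cutoff used here (MAJORITY_FRACTION) is an assumption, as are all function and variable names.

```python
from collections import Counter

MIN_DEPTH = 3            # below this depth no base is trusted (from the text)
STRICT_DEPTH = 5         # below this depth the call must be unanimous (from the text)
MAJORITY_FRACTION = 0.5  # ASSUMPTION: rule at >= 5X is not given in the text


def call_base(pileup_bases):
    """Return a consensus base for one site, or 'N' if it cannot be trusted.

    pileup_bases: string of read bases observed at the site, e.g. 'AAAC'.
    """
    depth = len(pileup_bases)
    if depth < MIN_DEPTH:
        return "N"
    base, support = Counter(pileup_bases).most_common(1)[0]
    if depth < STRICT_DEPTH:
        # Low coverage: require 100% allele frequency.
        return base if support == depth else "N"
    # Higher coverage: hypothetical strict-majority rule.
    return base if support / depth > MAJORITY_FRACTION else "N"


def completeness(consensus):
    """Fraction of consensus positions that are not ambiguous (N)."""
    return sum(1 for b in consensus if b != "N") / len(consensus)


# Toy pileups for four sites (made-up data).
pileups = ["AAA", "AC", "GGGG", "TTTTA"]
consensus = "".join(call_base(p) for p in pileups)
fraction_called = completeness(consensus)
```

For the toy pileups the second site has only 2X coverage and is masked, so three of four sites are called. Applied genome-wide, the same fraction corresponds to the completeness figures reported for each segment.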

Fig 1. Coverage plot of sequencing reads mapped to the S, M, and L segments of the CHOV reference genome.

https://doi.org/10.1371/journal.pntd.0011672.g001

To place our CHOV genome into the phylogenetic context of other hantaviruses, we performed single and concatenated segment alignments and generated maximum likelihood phylogenetic trees. Our sequence fell into a well-supported clade (≥99% bootstrap values) with the CHOV reference genome from voucher MSB:Mamm:96073 (Figs 2A and S1). Both the reference and our sequence were isolated from O. costaricensis specimens from Los Santos, Panama, and were captured nearly a year apart (Fig 3). Excluding ambiguous base calls (Ns), there are 125 single nucleotide polymorphisms (SNPs) differing from the CHOV reference, of which 40 SNPs were in the L segment, 58 SNPs were in the M segment, and 27 SNPs were in the S segment. This corresponds to a divergence of 1.1% from the CHOV reference genome. This variation resulted in eight non-synonymous mutations, with five located in the M segment (p.D246G, p.S247F, p.K510R, p.A658T, and p.V1133G) and three in the L segment (p.L342S, p.V412I, and p.N778S) (S3 Table). Overall, the limited number of amino acid differences discovered across two samples collected nearly a year apart, but within the same geographic region, is interesting. However, to understand the true extent of genomic heterogeneity of CHOV regionally and temporally, generating additional complete genomes from both rodents and humans is necessary. It should be noted that while the CHOV reference genome was generated with conventional Sanger sequencing (average accuracies of ~99.4%), Illumina next-generation sequencing (NGS) produces comparable sequence accuracies for reads with phred-like quality scores above 30, and overall accuracy increases with increasing depth of coverage [46]. The M segment encodes two glycoproteins (G1 and G2), involved in signal transduction and mitigation of host antiviral defenses, at positions 22–545 and 658–1136, respectively.
While amino acid substitutions from lysine (K) to arginine (R), or alanine (A) to threonine (T), are less likely to affect protein structure because of similar side chain polarity and charge, modifications from aspartic acid (D) or valine (V) to glycine (G) are more likely to cause structural changes because of differing side chain biochemistry. Structural and functional studies of the CHOV L segment, similar to those in SNV [47], should be undertaken. Non-synonymous mutations can have effects on protein stability and structure or alter protein-protein interactions [48]. In addition, variation in the non-coding regions at the 5′ and 3′ termini could disrupt complementary sequences that aid in the panhandle structure formation essential for transcription and replication [49]. These regions are generally highly conserved, and we observed only one SNP in the S segment (g.T29C), which is not predicted to interfere with panhandle formation.
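The SNP count and percent divergence, computed while excluding ambiguous positions, can be expressed as a short Python sketch. It assumes the assembled segment and the reference have already been aligned to equal length; the function name and toy sequences are hypothetical, and this is not the authors' script.

```python
VALID_BASES = {"A", "C", "G", "T"}


def count_snps(reference, query):
    """Count mismatches between two aligned sequences, skipping any column
    where either sequence has an ambiguous base (e.g. N) or a gap.

    Returns (snps, compared_sites).
    """
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    snps = 0
    compared = 0
    for ref_base, query_base in zip(reference.upper(), query.upper()):
        if ref_base not in VALID_BASES or query_base not in VALID_BASES:
            continue  # excluded from both numerator and denominator
        compared += 1
        if ref_base != query_base:
            snps += 1
    return snps, compared


# Toy aligned pair (made-up sequences): one N is skipped, one SNP remains.
ref = "ACGTACGTAC"
qry = "ACGTNCGTAT"
snps, compared = count_snps(ref, qry)
divergence_percent = 100 * snps / compared

# Per-segment SNP counts reported in the text sum to the genome-wide total.
total_snps = 40 + 58 + 27  # L + M + S
```

Excluding Ns from the denominator as well as the numerator keeps low-coverage regions, which are concentrated in the L segment, from diluting the divergence estimate.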

Fig 2.

Maximum likelihood phylogenies of the S and M segments and a concatenated alignment (A) and a tanglegram used to visualize possible reassortment histories (B). Phylogenies were built with the GTR+GAMMA model with 10,000 ultrafast bootstraps and 10,000 bootstraps for the SH-aLRT on single segment and concatenated alignments. In the tanglegram, lines connect the same taxa/tip in each tree to one another such that crossing lines suggest topological discordance.

https://doi.org/10.1371/journal.pntd.0011672.g002

Fig 3. Geographic distribution of Panamanian CHOV sequences from 32 capture sites of the Costa Rican pygmy rice rat (O. costaricensis) and approximate residence of 21 clinical cases georeferenced using the Datum UTM, WGS 1984, with ArcMap software from ArcGIS (ESRI2019).

Red dots equating to the overlap in the locality of human and rodent sequences represent six rodents and one human in Los Santos, two rodents and two humans in Veraguas, and two rodents and one human in Coclé.

https://doi.org/10.1371/journal.pntd.0011672.g003

Hantavirus evolution has likely been shaped by co-evolutionary history with rodent reservoirs [50]. The phylogenetic relationships between South American and North American hantaviruses from sigmodontine rodents (Fig 2B) are most likely derived from a complex history of co-speciation events and the biogeographic constraints that influenced rodent expansion into South America [51]. However, intra- and inter-lineage reassortment between closely related variants has been reported for several hantaviruses [52–56]. Although assumptions of reassortment events are often based on conflicting phylogenetic tree topologies, reassortment has also been demonstrated in in vitro experiments [57–59]. The high degree of genomic similarity in the S and L segments suggests the exchange of the M segment is more common and potentially beneficial [60]. We did not find evidence of reassortment between the two CHOV genomes (Fig 2B). With additional genomic variants, this question should be revisited with more robust analyses [61].

One outstanding question is the relatedness of hantaviruses isolated from rodents to human clinical cases. A phylogenetic analysis of all publicly available partial S segment CHOV sequences clearly demonstrates viruses isolated from O. costaricensis, including the strain described here, are intimately related to all clinical strains from Panama (Fig 4). Both clinical and rodent CHOV strains were captured from Los Santos, Veraguas, and Coclé provinces; however, only rodent-acquired sequences were found in Panamá (Fig 3). This is reflective of the clinical disease burden with the greatest number of cases in Los Santos (77%) followed by Veraguas (12%) and Coclé (7%) [12]. However, it should be noted that locations of clinical cases are by residence, and it is possible patients were exposed while traveling to another endemic area of the country. The strains from the province of Panamá form a well-supported clade demonstrating potential geographic substructure (Fig 4) which is congruent with previous findings [36]. More sequencing is needed to determine how closely these are related to clinical strains; however, only seven Panamá residents have reported HCPS in the last 20 years (1% of all cases) [12]. Five other sequences from Panamanian hantaviruses were isolated from the short-tailed cane mouse (Zygodontomys brevicauda) and the Chiriqui harvest mouse (Reithrodontomys creper), but these are likely representative of other unclassified hantaviruses (e.g., Calabazo virus and Rio Segundo virus) that have yet to be fully sequenced [43] or associated with human disease (S2 Fig).

Fig 4. A maximum likelihood phylogeny of the partial S segment of 51 CHOV genomes demonstrates sequences isolated from O. costaricensis are associated with human disease.

The phylogeny was built with the GTR+GAMMA model with 10,000 ultrafast bootstraps and 10,000 bootstraps for the SH-aLRT. Bootstrap values ≥ 65% are shown. The sequence in bold was acquired in this study.

https://doi.org/10.1371/journal.pntd.0011672.g004

Virus discovery has been dramatically accelerated by advances in sequencing [62]. Here, we provide a proof-of-concept for using deep metatranscriptomics to recover a hantavirus genome from archived museum tissue samples. Viral enrichment protocols prior to sequencing, including those that utilize fraction and filtration methods, may further increase recovery of target organisms [63,64]. Other methods, such as PCR or target enrichment by capture [45], that seek to enrich virus sequences during library prep require a priori knowledge of genomic sequences to perform optimally, and will be assisted by the metatranscriptomic sequencing described here. Furthermore, we demonstrate that tissues from voucher MSB:Mamm:131232, which were ultrafrozen for 20 years before our current sequencing efforts, were still viable for NGS, highlighting the longevity and utility of such preserved specimens. Our understanding of hantavirus systematics, evolution, and ecology has been profoundly influenced by integrating collaborative efforts between public health agencies, virologists, field biologists, and museum scientists. Museum specimen collections have been leveraged to document the distribution of SNV [65] and to discover multiple novel hantaviruses [66]. Utilizing these invaluable archives is essential to understanding pathogens circulating past and present. Given that habitat conversion [8] and climate change [67] could influence reservoir host densities and therefore increase the burden of CHOV and other hantaviruses, it is necessary to continue capturing sequence data to inform diagnostics, detection, and vaccine design.

Supporting information

S1 Table. Orthohantavirus sequences obtained from GenBank.

https://doi.org/10.1371/journal.pntd.0011672.s001

(XLSX)

S2 Table. Partial S segment of 56 Panamanian Orthohantavirus sequences obtained from GenBank.

https://doi.org/10.1371/journal.pntd.0011672.s002

(XLSX)

S3 Table. Nonsynonymous mutations against the CHOV reference.

https://doi.org/10.1371/journal.pntd.0011672.s003

(XLSX)

S1 Fig. A concatenated 11,972 bp alignment of the S, M, and L segments was used to infer a maximum likelihood phylogeny of New World Hantaviruses.

The phylogeny was built with the GTR+GAMMA model with 10,000 ultrafast bootstraps and 10,000 bootstraps for the SH-aLRT. LANV, BCCV, ELMCV, and NYV were limited to just the complete S and M segments.

https://doi.org/10.1371/journal.pntd.0011672.s004

(PDF)

S2 Fig. A maximum likelihood phylogeny of the partial S segment of 56 Panamanian hantaviruses.

Sequences were obtained from sick human patients and three rodent hosts (Oligoryzomys costaricensis, Zygodontomys brevicauda, and Reithrodontomys creper).

https://doi.org/10.1371/journal.pntd.0011672.s005

(PDF)

Acknowledgments

The authors thank the field crews and museum scientists at the Instituto Conmemorativo Gorgas de Estudios de la Salud (Panama City) and Museum of Southwestern Biology (Albuquerque) for collecting and preserving archival tissues of the rodents. Specifically, we thank the Panamanian Ministry of Environmental Affairs, the Panamanian Ministry of Health, the Panamanian Institute of Livestock and Agricultural Research, the Gorgas Committee for Animal Care and Use, and the International Center for Infectious Disease Research Program of the National Institutes of Health for their support. We also thank the UNM Center for Advanced Research Computing for providing high performance computing resources.

References

  1. MacNeil A, Ksiazek TG, Rollin PE. Hantavirus Pulmonary Syndrome, United States, 1993–2009. Emerg Infect Dis. 2011;17: 1195–1201. pmid:21762572
  2. Torales M, Martínez B, Román J, Rojas K, De Egea V, Torres J, et al. Actualización de áreas de riesgo y perfil epidemiológico de hantavirus en Paraguay (2013–2020). Mem Inst Investig Cienc Salud. 2022;20: 108–116.
  3. Figueiredo LTM, Souza WMD, Ferrés M, Enria DA. Hantaviruses and cardiopulmonary syndrome in South America. Virus Research. 2014;187: 43–54. pmid:24508343
  4. Martinez VP, Bellomo CM, Cacace ML, Suárez P, Bogni L, Padula PJ. Hantavirus Pulmonary Syndrome in Argentina, 1995–2008. Emerg Infect Dis. 2010;16: 1853–1860. pmid:21122213
  5. Reyes Zaldívar FT, Ferrés M. Hantavirus: Descripción de dos décadas de endemia y su letalidad. ARS med. 2019;44: 30–39.
  6. Ermonval M, Baychelier F, Tordo N. What do we know about how Hantaviruses interact with their different hosts? Viruses. 2016;8: 223. pmid:27529272
  7. Nuzum E, Rossi C, Stephenson E, LeDuc J. Aerosol transmission of Hantaan and related viruses to laboratory rats. 1988;38: 636–640. pmid:2908582
  8. Suzán G, Marcé E, Giermakowski JT, Armién B, Pascale J, Mills J, et al. The effect of habitat fragmentation and species diversity loss on Hantavirus prevalence in Panama. Annals of the New York Academy of Sciences. 2008;1149: 80–83. pmid:19120179
  9. Vincent MJ, Quiroz E, Gracia F, Sanchez AJ, Ksiazek TG, Kitsutani PT, et al. Hantavirus Pulmonary Syndrome in Panama: Identification of novel hantaviruses and their likely reservoirs. Virology. 2000;277: 14–19. pmid:11062031
  10. Bayard V, Kitsutani PT, Barria EO, Ruedas LA, Tinnin DS, Muñoz C, et al. Outbreak of Hantavirus Pulmonary Syndrome, Los Santos, Panama, 1999–2000. Emerg Infect Dis. 2004;10: 1635–1642. pmid:15498167
  11. Armién A, Armién B, Koster F, Pascale J, Avila M, Gonzalez P, et al. Hantavirus infection and habitat associations among rodent populations in agroecosystems of Panama: Implications for human disease risk. 2009;81: 59–66. pmid:19556568
  12. Armién B, Muñoz C, Cedeño H, Salazar JR, Salinas TP, González P, et al. Hantavirus in Panama: Twenty years of epidemiological surveillance experience. Viruses. 2023;15: 1395. pmid:37376694
  13. Nelson R, Cañate R, Pascale JM, Dragoo JW, Armien B, Armien AG, et al. Confirmation of Choclo virus as the cause of hantavirus cardiopulmonary syndrome and high serum antibody prevalence in Panama. J Med Virol. 2010;82: 1586–1593. pmid:20648614
  14. Armien B, Pascale JM, Munoz C, Lee S-J, Choi KL, Avila M, et al. Incidence rate for hantavirus infections without pulmonary syndrome, Panama. Emerg Infect Dis. 2011;17: 1936–1939. pmid:22000376
  15. Khan A, Khan M, Ullah S, Wei D-Q. Hantavirus: The next pandemic we are waiting for? Interdiscip Sci Comput Life Sci. 2021;13: 147–152. pmid:33486690
  16. Colella JP, Cobos ME, Salinas I, Cook JA, The PICANTE Consortium. Advancing the central role of non-model biorepositories in predictive modeling of emerging pathogens. Silverman N, editor. PLoS Pathog. 2023;19: e1011410. pmid:37319170
  17. Salazar-Hamm PS, Montoya KN, Montoya L, Cook K, Liphardt S, Taylor JW, et al. Breathing can be dangerous: Opportunistic fungal pathogens and the diverse community of the small mammal lung mycobiome. Front Fungal Biol. 2022;3: 996574. pmid:37746221
  18. Cheng TL, Rovito SM, Wake DB, Vredenburg VT. Coincident mass extirpation of neotropical amphibians with the emergence of the infectious fungal pathogen Batrachochytrium dendrobatidis. Proc Natl Acad Sci USA. 2011;108: 9502–9507. pmid:21543713
  19. Dunnum JL, Yanagihara R, Johnson KM, Armien B, Batsaikhan N, Morgan L, et al. Biospecimen repositories and integrated databases as critical infrastructure for pathogen discovery and pathobiology research. Bethony JM, editor. PLoS Negl Trop Dis. 2017;11: e0005133. pmid:28125619
  20. Schindel DE, Cook JA. The next generation of natural history collections. PLoS Biol. 2018;16: e2006125. pmid:30011273
  21. Hanson JD, Utrera A, Fulhorst CF. The delicate pygmy rice rat (Oligoryzomys delicatus) is the principal host of Maporal Virus (Family Bunyaviridae, Genus Hantavirus). Vector-Borne and Zoonotic Diseases. 2011;11: 691–696. pmid:21548760
  22. Krakau S, Straub D, Gourlé H, Gabernet G, Nahnsen S. nf-core/mag: a best-practice pipeline for metagenome hybrid assembly and binning. NAR Genomics and Bioinformatics. 2022;4: lqac007. pmid:35118380
  23. Wood DE, Lu J, Langmead B. Improved metagenomic analysis with Kraken 2. Genome Biol. 2019;20: 257. pmid:31779668
  24. Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25: 1754–1760. pmid:19451168
  25. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25: 2078–2079. pmid:19505943
  26. Prjibelski A, Antipov D, Meleshko D, Lapidus A, Korobeynikov A. Using SPAdes De Novo Assembler. Current Protocols in Bioinformatics. 2020;70. pmid:32559359
  27. Carver TJ, Rutherford KM, Berriman M, Rajandream M-A, Barrell BG, Parkhill J. ACT: the Artemis comparison tool. Bioinformatics. 2005;21: 3422–3423. pmid:15976072
  28. Gurevich A, Saveliev V, Vyahhi N, Tesler G. QUAST: quality assessment tool for genome assemblies. Bioinformatics. 2013;29: 1072–1075. pmid:23422339
  29. Thompson CW, Phelps KL, Allard MW, Cook JA, Dunnum JL, Ferguson AW, et al. Preserve a voucher specimen! The critical need for integrating natural history collections in infectious disease studies. Prasad VR, editor. mBio. 2021;12: e02698–20. pmid:33436435
  30. Arai S, Yanagihara R. Genetic diversity and geographic distribution of bat-borne Hantaviruses. Current Issues in Molecular Biology. 2020; 1–28. pmid:31997775
  31. Katoh K. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Research. 2002;30: 3059–3066. pmid:12136088
  32. Capella-Gutiérrez S, Silla-Martínez JM, Gabaldón T. trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics. 2009;25: 1972–1973. pmid:19505945
  33. Nguyen L-T, Schmidt HA, Von Haeseler A, Minh BQ. IQ-TREE: A fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Molecular Biology and Evolution. 2015;32: 268–274. pmid:25371430
  34. Yu G, Smith DK, Zhu H, Guan Y, Lam TT. ggtree: an R package for visualization and annotation of phylogenetic trees with their covariates and other associated data. McInerny G, editor. Methods Ecol Evol. 2017;8: 28–36.
  35. Amada T, Yoshimatsu K, Yasuda SP, Shimizu K, Koma T, Hayashimoto N, et al. Rapid, whole blood diagnostic test for detecting anti-hantavirus antibody in rats. Journal of Virological Methods. 2013;193: 42–49. pmid:23684845
  36. Gonzalez P, Salazar JR, Salinas TP, Avila M, Colella JP, Dunnum JL, et al. Two decades of wildlife pathogen surveillance: Case study of Choclo orthohantavirus and its wild reservoir Oligoryzomys costaricensis. Viruses. 2023;15: 1390. pmid:37376689
  37. Ortiz N, Juan EE, Chiappero MB, Gardenal CN, Provensal MC, Polop JJ, et al. Microgeographic genetic structure of Oligoryzomys longicaudatus (Rodentia, Cricetidae) in periods of different population density. Ojeda R, editor. Journal of Mammalogy. 2019; gyz152.
  38. Juan EE, Provensal MC, Steinmann AR. Space use and social mating system of the Hantavirus host, Oligoryzomys longicaudatus. EcoHealth. 2018;15: 96–108. pmid:29196828
  39. Hjelle B, Jenison S, Torrez-Martinez N, Herring B, Quan S, Polito A, et al. Rapid and specific detection of Sin Nombre virus antibodies in patients with hantavirus pulmonary syndrome by a strip immunoblot assay suitable for field diagnosis. J Clin Microbiol. 1997;35: 600–608. pmid:9041397
  40. Groen J, Gerding M, Jordans JGM, Clement JP, Osterhaus ADME. Class and subclass distribution of hantavirus-specific serum antibodies at different times after the onset of nephropathia epidemica. J Med Virol. 1994;43: 39–43. pmid:7916034
  41. Card DC, Schield DR, Reyes-Velasco J, Fujita MK, Andrew AL, Oyler-McCance SJ, et al. Two Low Coverage Bird Genomes and a Comparison of Reference-Guided versus De Novo Genome Assemblies. Kolokotronis S-O, editor. PLoS ONE. 2014;9: e106649. pmid:25192061
  42. Schneeberger K, Ossowski S, Ott F, Klein JD, Wang X, Lanz C, et al. Reference-guided assembly of four diverse Arabidopsis thaliana genomes. Proc Natl Acad Sci USA. 2011;108: 10249–10254. pmid:21646520
  43. Kuhn JH, Bradfute SB, Calisher CH, Klempa B, Klingström J, Laenen L, et al. Pending reorganization of Hantaviridae to include only completely sequenced viruses: A call to action. Viruses. 2023;15: 660. pmid:36992369
  44. Roux S, Adriaenssens EM, Dutilh BE, Koonin EV, Kropinski AM, Krupovic M, et al. Minimum information about an uncultivated virus genome (MIUViG). Nat Biotechnol. 2019;37: 29–37. pmid:30556814
  45. O’Flaherty BM, Li Y, Tao Y, Paden CR, Queen K, Zhang J, et al. Comprehensive viral enrichment enables sensitive respiratory virus genomic identification and analysis by next generation sequencing. Genome Res. 2018;28: 869–877. pmid:29703817
  46. Manley LJ, Ma D, Levine SS. Monitoring error Rates in Illumina sequencing. J Biomol Tech. 2016;27: 125–128. pmid:27672352
  47. Meier K, Thorkelsson SR, Durieux Trouilleton Q, Vogel D, Yu D, Kosinski J, et al. Structural and functional characterization of the Sin Nombre virus L protein. Whelan SPJ, editor. PLoS Pathog. 2023;19: e1011533. pmid:37549153
  48. Zhang Z, Miteva MA, Wang L, Alexov E. Analyzing effects of naturally occurring missense mutations. Computational and Mathematical Methods in Medicine. 2012;2012: 1–15. pmid:22577471
  49. Meier K, Thorkelsson SR, Quemin ERJ, Rosenthal M. Hantavirus replication cycle—An updated structural virology perspective. Viruses. 2021;13: 1561. pmid:34452426
  50. Plyusnin A, Morzunov SP. Virus evolution and genetic diversity of hantaviruses and their rodent hosts. In: Schmaljohn CS, Nichol ST, editors. Hantaviruses. Berlin, Heidelberg: Springer Berlin Heidelberg; 2001. pp. 47–75. https://doi.org/10.1007/978-3-642-56753-7_4 pmid:11217406
  51. Hughes AL, Friedman R. Evolutionary diversification of protein-coding genes of Hantaviruses. Molecular Biology and Evolution. 2000;17: 1558–1568. pmid:11018161
  52. Li D, Schmaljohn AL, Anderson K, Schmaljohn CS. Complete nucleotide sequences of the M and S segments of two Hantavirus isolates from California: Evidence for reassortment in nature among viruses related to Hantavirus Pulmonary Syndrome. Virology. 1995;206: 973–983. pmid:7856108
  53. Razzauti M, Plyusnina A, Henttonen H, Plyusnin A. Accumulation of point mutations and reassortment of genomic RNA segments are involved in the microevolution of Puumala hantavirus in a bank vole (Myodes glareolus) population. Journal of General Virology. 2008;89: 1649–1660. pmid:18559935
  54. Kim J-A, Kim W, No JS, Lee S-H, Lee S-Y, Kim JH, et al. Genetic diversity and reassortment of Hantaan Virus tripartite RNA genomes in nature, the Republic of Korea. McElroy AK, editor. PLoS Negl Trop Dis. 2016;10: e0004650. pmid:27315053
  55. Lee S-H, Kim W-K, No JS, Kim J-A, Kim JI, Gu SH, et al. Dynamic circulation and genetic exchange of a shrew-borne Hantavirus, Imjin virus, in the Republic of Korea. Sci Rep. 2017;7: 44369. pmid:28295052
  56. Liphardt SW, Kang HJ, Arai S, Gu SH, Cook JA, Yanagihara R. Reassortment between divergent strains of Camp Ripley Virus (Hantaviridae) in the northern short-tailed shrew (Blarina brevicauda). Front Cell Infect Microbiol. 2020;10: 460. pmid:33014888
  57. Rodriguez LL, Owens JH, Peters CJ, Nichol ST. Genetic reassortment among viruses causing Hantavirus Pulmonary Syndrome. Virology. 1998;242: 99–106. pmid:9501041
  58. Rizvanov AA, Khaiboullina SF, St. Jeor S. Development of reassortant viruses between pathogenic hantavirus strains. Virology. 2004;327: 225–232. pmid:15351210
  59. Handke W, Oelschlegel R, Franke R, Wiedemann L, Kruger DH, Rang A. Generation and characterization of genetic reassortants between Puumala and Prospect Hill hantavirus in vitro. Journal of General Virology. 2010;91: 2351–2359. pmid:20505009
  60. Klempa B. Reassortment events in the evolution of hantaviruses. Virus Genes. 2018;54: 638–646. pmid:30047031
  61. De Vienne DM. Tanglegrams are misleading for visual evaluation of tree congruence. Townsend J, editor. Molecular Biology and Evolution. 2019;36: 174–176. pmid:30351416
  62. Zhang Y-Z, Chen Y-M, Wang W, Qin X-C, Holmes EC. Expanding the RNA Virosphere by Unbiased Metagenomics. Annu Rev Virol. 2019;6: 119–139. pmid:31100994
  63. Thurber RV, Haynes M, Breitbart M, Wegley L, Rohwer F. Laboratory procedures to generate viral metagenomes. Nat Protoc. 2009;4: 470–483. pmid:19300441
  64. Thannesberger J, Hellinger H-J, Klymiuk I, Kastner M-T, Rieder FJJ, Schneider M, et al. Viruses comprise an extensive pool of mobile genetic elements in eukaryote cell cultures and human clinical samples. FASEB j. 2017;31: 1987–2000. pmid:28179422
  65. Yates TL, Mills JN, Parmenter CA, Ksiazek TG, Parmenter RR, Vande Castle JR, et al. The ecology and evolutionary history of an emergent disease: Hantavirus Pulmonary Syndrome. BioScience. 2002;52: 989.
  66. Yanagihara R, Gu SH, Arai S, Kang HJ, Song J-W. Hantaviruses: Rediscovery and new beginnings. Virus Research. 2014;187: 6–14. pmid:24412714
  67. Klempa B. Hantaviruses and climate change. Clinical Microbiology and Infection. 2009;15: 518–523. pmid:19604276