Abstract
Background
We set out to investigate the utility of publicly available tick transcriptomic data to identify and characterize known and recently described tick-borne viruses, using de novo assembly and subsequent protein database alignment and taxonomical binning.
Methodology/principal findings
A total of 127 virus contigs were recovered from 35 transcriptomes, originating from cell lines (40%), colony-reared ticks (25.7%) or field-collected ticks (34.2%). Generated virus contigs encompass DNA (n = 2) and RNA (n = 13) virus families, with 3 and 28 taxonomically distinct isolates, respectively. Known human and animal pathogens comprise 32.8% of the contigs, where Beiji nairovirus (BJNV) was the most prevalent tick-borne pathogenic virus, identified in 22.8% of the transcriptomes. Other pathogens included Nuomin virus (NUMV) (2.8%), African swine fever virus (ASFV) (5.7%), African horse sickness virus 3 (AHSV-3) (2.8%) and Alongshan virus (ALSV) (2.8%).
Conclusions
Previously generated transcriptome data can be leveraged for detecting tick-borne viruses, as exemplified by new descriptions of ALSV and BJNV in new geographic locations and other viruses previously detailed in screening reports. Monitoring pathogens using publicly available data might facilitate biosurveillance by directing efforts to regions of preliminary spillover and identifying targets for screening. Metadata availability is crucial for further assessments of detections.
Author summary
Ticks can transmit many viruses with public health impact on humans and livestock. Here we explore the reuse of tick transcriptomic or metagenomic sequencing data, freely available in open access databases, as a mechanism to passively increase xenosurveillance data on the distribution and genomic identity of tick-borne viruses. Using open-source tools for de novo assembly and taxonomic classification, we recovered 127 partial virus genomes from 35 data sets. Human and animal pathogens comprise 32.8% of the genomes, including recently described pathogens Alongshan virus (ALSV), Beiji nairovirus (BJNV) and Nuomin virus, along with African swine fever virus (ASFV) and African horse sickness virus 3 (AHSV-3). Our findings demonstrate that previously generated data can be leveraged for detecting tick-borne viruses, as exemplified by records of the human pathogens ALSV and BJNV in Europe, and other viruses previously detailed in screening reports. The monitoring of pathogens using publicly available data, combined with accurate sample information, can potentially support surveillance efforts, highlighting regions of preliminary spillover and identifying targets for screening.
Citation: Ergunay K, Bourke BP, Linton Y-M (2025) Exploring the potential of tick transcriptomes for virus screening: A data reuse approach for tick-borne virus surveillance. PLoS Negl Trop Dis 19(3): e0012907. https://doi.org/10.1371/journal.pntd.0012907
Editor: Travis J. Bourret, Creighton University, UNITED STATES OF AMERICA
Received: August 6, 2024; Accepted: February 11, 2025; Published: March 6, 2025
Copyright: © 2025 Ergunay et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: Data generated in this study are available in the S1 Appendix.
Funding: The study was financially supported by the Armed Forces Health Surveillance Division – Global Emerging Infections Surveillance (AFHSD-GEIS) awards P0057_22_WR and P0044_23_WR, to the Walter Reed Army Institute of Research (WRAIR) One Health Branch (Y-M. L.). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Introduction
Infections caused by tick-borne viruses constitute a global zoonotic risk with substantial disease burden and public health threat and account for a major portion of the vector-borne diseases [1]. As blood-feeding vectors, ticks (Acari: Ixodida) are quintessential vessels for virus transmission, owing to their different life stages feeding on various animal hosts and potential for adaptation to diverse ecological environments. Currently, expansion of tick populations into new geographic regions with subsequent human and animal exposure is widely documented, and impacted by ongoing global climatic and environmental changes [1,2]. Increasing prevalence and geographical range have been reported for particular tick-borne viral pathogens, such as Powassan virus in North America, African swine fever virus (ASFV) in Africa and Crimean-Congo hemorrhagic fever virus (CCHFV) in Eurasia [1]. Moreover, increasing examples of novel tick-borne viruses capable of producing symptomatic human infections and severe outcomes have been documented, such as Alongshan virus (ALSV), Jingmen tick virus (JMTV), Beiji nairovirus (BJNV) and Tacheng tick virus 1 (TTV1) [3–6]. Although many of the case descriptions and screening findings originate from Asia, recent investigations demonstrate expansion of these viruses into Europe, raising questions about undiagnosed human infections occurring at a larger geographical scale [7,8]. Due to the lack of commercially available nucleic acid or serology-based testing, scarce information is currently available on the distribution and public health impact of these viruses.
Timely identification of the circulating pathogens prior to disease emergence or outbreak is significantly facilitated by surveillance [9]. Monitoring the spread of viruses and identifying new pathogens in vectors and animal reservoirs constitute key steps in describing spillover, emergence and subsequent mitigation of infectious diseases caused by these agents. Vector surveillance provides an effective strategy for monitoring the introduction or circulation of emerging pathogens into susceptible populations and an early warning system for predicting outbreaks [1,9]. Direct bio- or xeno-surveillance for arthropod-borne infections is mainly based on targeted pathogen antigen or nucleic acid screening and requires biological sample collection, handling, and testing, and therefore personnel, infrastructure, and funding. Nowadays, transcriptomic or metagenome sequencing (MS) technologies have become widely available and utilized in various settings that require sequence generation, including those involving field-collected or laboratory-reared vector arthropods [10]. Enabling analysis of the nucleic acid content in the sample without prior information, MS has proved to be a robust method for virus identification, as well as describing novel virus genomes [11]. Regardless of the goals of the original investigations, MS data is frequently deposited in publicly accessible repositories, with massive amounts of sequencing data collected from various sources [12]. This exponentially growing data is likely to contain sequences originating from a wide variety of viruses, including divergent local strains and those newly described as pathogens, unknown at the time of data generation [13]. Hence, existing raw data sets in public repositories can be recycled for retrospective virus screening and might be leveraged for preliminary screening.
This study was carried out to investigate the utility of publicly available tick transcriptomic data as a resource to identify and characterize known and recently described tick-borne viruses.
Methods
Dataset selection
High-throughput sequencing data available at the NCBI Sequence Read Archive (SRA) database [12] was initially screened during November 2023 using the following search query: (“Ixodidae”[Organism]) AND (“Illumina”[Platform]) AND (“biomol rna”[Properties] AND “library layout paired”[Properties]). Search outputs and associated metadata were manually downloaded and checked for availability of information on location, sample properties and target tissues of ticks. Other runs in the bioproject associated with the selected data were further examined for additional samples. Any transcriptome lacking these metadata was omitted. Raw data available as FASTQ was used for downstream analysis. In addition, publication records linked with the bioproject were examined, and datasets originally generated via metatranscriptomics for virome exploration, virus screening or discovery were omitted. The list of transcriptomes selected for downstream processing is provided in S1 Table.
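The selection logic described above can be illustrated with a minimal Python sketch. It only assembles the query string quoted in the text and applies a completeness check to metadata records; it does not contact NCBI. The function names and the metadata field names (`location`, `sample_properties`, `target_tissue`) are hypothetical and are not taken from the SRA schema or from the study.

```python
def build_sra_query(organism="Ixodidae", platform="Illumina"):
    """Assemble the SRA search string quoted in the Methods (plain ASCII quotes)."""
    terms = [
        f'("{organism}"[Organism])',
        f'("{platform}"[Platform])',
        '("biomol rna"[Properties] AND "library layout paired"[Properties])',
    ]
    return " AND ".join(terms)


def has_required_metadata(record,
                          required=("location", "sample_properties", "target_tissue")):
    """Return True only if every required field is present and non-empty.

    The field names are hypothetical placeholders for the location, sample
    properties and target tissue information checked manually in the study.
    """
    return all(str(record.get(key, "")).strip() for key in required)


# Example: one complete and one incomplete (hypothetical) metadata record.
records = [
    {"run": "run_A", "location": "Spain", "sample_properties": "field collected",
     "target_tissue": "whole body"},
    {"run": "run_B", "location": "", "sample_properties": "colony",
     "target_tissue": "midgut"},
]
selected = [r["run"] for r in records if has_required_metadata(r)]
```

In this sketch only `run_A` is retained, mirroring the rule that transcriptomes lacking the required metadata were omitted.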
Virus contig generation
Demultiplexed raw Illumina data were first adaptor trimmed and quality filtered using fastp v0.23.3 (--qualified_quality_phred = 15; --unqualified_percent_limit = 40) [14,15]. The data were then de novo assembled using MEGAHIT v1.2.9 and its “basic usage” setting for paired end libraries [16]. De novo assembled contigs were then aligned to the National Center for Biotechnology Information (NCBI) protein non-redundant (nr) database (ftp://ftp.ncbi.nlm.nih.gov/blast/db/FASTA/nr.gz; accessed April 5 2023) using Diamond (--long-reads; --evalue 1e-9) [17,18] and taxonomically binned using Megan v6.24.20 (--minSupport 1; --minPercentIdentity 70; --maxExpected 1.0E-9; --lcaAlgorithm longReads; --lcaCoveragePercent 51; --longReads) [19,20]. A classification table comprising sequence names and NCBI Taxonomy IDs was created using the daa2info module available with Megan v6.24.20. A list of NCBI virus Taxonomy IDs was generated based on the International Committee on Taxonomy of Viruses (ICTV) (https://ictv.global/taxonomy; accessed May 4 2023), using the list function in the taxonkit tool [21]. Subsequently, all virus sequences listed in the classification table were extracted from assembly files using the subseq option available in the seqtk tool [22].
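The final extraction step can be sketched in plain Python. This is a stand-in for the filtering logic only, not a reproduction of the taxonkit or seqtk interfaces. It assumes the classification table has two tab-separated columns (contig name, Taxonomy ID) and that contig names are the first whitespace-delimited token of each FASTA header. The contig names and Taxonomy IDs in the example are placeholders.

```python
def read_classification(lines):
    """Parse 'contig_name<TAB>taxonomy_id' lines into a dict (assumed layout)."""
    table = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        name, taxid = line.split("\t")[:2]
        table[name] = int(taxid)
    return table


def select_virus_contigs(classification, virus_taxids):
    """Keep the names of contigs whose Taxonomy ID is in the virus ID list."""
    return {name for name, taxid in classification.items() if taxid in virus_taxids}


def extract_fasta(fasta_lines, wanted):
    """Return the FASTA lines (headers and sequence) of the wanted contigs."""
    kept, keep = [], False
    for line in fasta_lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # The contig name is the first token after '>'.
            keep = line[1:].split()[0] in wanted
        if keep and line:
            kept.append(line)
    return kept


# Placeholder data: IDs 111 and 222 stand in for virus and non-virus taxa.
classification = read_classification(["contig_1\t111", "contig_2\t222", "contig_3\t111"])
virus_names = select_virus_contigs(classification, virus_taxids={111})
assembly = [">contig_1 len=8", "ACGTACGT", ">contig_2 len=4", "TTTT",
            ">contig_3 len=6", "GGG", "CCC"]
virus_fasta = extract_fasta(assembly, virus_names)
```

Here `virus_fasta` holds the records for `contig_1` and `contig_3` only, including the multi-line sequence of the latter.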
Sequence handling and phylogenetic analysis
Contigs were handled using Geneious Prime (v2023.2.1) (Biomatters Ltd., Auckland, New Zealand). BLASTn and BLASTp algorithms were used with default settings for similarity searches in the NCBI database [23]. Contigs were mapped using Minimap2 v2.24 (long assembly to reference mapping; preset option -x asm20, i.e., up to 20% divergence), and nucleotide/protein alignments were generated using CLUSTALW (full multiple alignment mode) [24,25]. Phylogenetic relationships between virus contigs and near relatives according to ICTV were explored using maximum likelihood analysis performed in MEGA v11.0.13 [26]. The optimal model for the phylogenetic and molecular evolutionary analyses was determined using the built-in “Find Best DNA/protein-substitution model” tools. Maximum-likelihood trees based on the amino acid sequences were constructed using the Jones-Taylor-Thornton model. The reliability of the inferred trees was evaluated by standard bootstrap analysis of 500 replicates.
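As an illustration of what a standard bootstrap involves, the sketch below generates pseudo-replicate alignments by resampling alignment columns with replacement. It covers the resampling step only: tree inference on each replicate and the counting of clade support, which MEGA performs internally, are not shown. The function names, the toy alignment and the seed are assumptions made for this example.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample the columns of an alignment with replacement.

    `alignment` maps sequence names to aligned strings of equal length.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    length = lengths.pop()
    # Draw as many column indices as the alignment has columns.
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in columns)
            for name, seq in alignment.items()}


def bootstrap_replicates(alignment, n_replicates=500, seed=0):
    """Generate `n_replicates` pseudo-replicate alignments (500 as in the text)."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]


# Toy amino acid alignment with hypothetical sequence names.
toy = {"seq_a": "MKVLA", "seq_b": "MRVLA", "seq_c": "MKILS"}
replicates = bootstrap_replicates(toy, n_replicates=500, seed=1)
```

Each replicate keeps the same sequences and alignment length, and the same columns are drawn for every sequence, so site patterns stay intact while their frequencies vary between replicates.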
Results
A total of 85 tick transcriptomes were processed (S1 Table), resulting in 127 virus contigs generated in 35 transcriptomes (41.2%) (Fig 1). The virus contigs originated from cell lines (30, 23.6%), colony reared ticks (20, 15.7%), and field collected ticks (77, 60.6%). The tick transcriptomes (n=21) were produced from the whole body (n=12), salivary gland (n=5), or the midgut (n=4) (Table 1). Generated virus contigs encompass DNA (n=2) and RNA (n=13) virus families with three and 28 taxonomically distinct isolates, respectively (Fig 2). Field-collected ticks provided more virus contigs compared to other samples, with pronounced diversity by various metrics including total virus family and species counts, as well as contigs per transcriptome and viruses per sample (Fig 1) (S1 Table). In four samples originating from cell lines (C5), colony-reared ticks (L5) or field-collected ticks (F3 and F4), longer contigs of identical viruses were generated by reference mapping of the initial contigs (S2 Table).
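The proportions reported in this paragraph follow directly from the stated counts, as the short Python check below shows. The helper name `percentage` is introduced here for illustration only.

```python
def percentage(part, total, digits=1):
    """Return part/total as a percentage rounded to `digits` decimals."""
    return round(100.0 * part / total, digits)


# Counts taken from the paragraph above.
contigs_by_source = {
    "cell lines": 30,
    "colony-reared ticks": 20,
    "field-collected ticks": 77,
}
total_contigs = sum(contigs_by_source.values())      # 127 contigs
positive_transcriptomes = percentage(35, 85)          # 41.2
shares = {source: percentage(n, total_contigs)
          for source, n in contigs_by_source.items()}
# shares: cell lines 23.6, colony-reared ticks 15.7, field-collected ticks 60.6
```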
Viruses in cell lines
Fourteen of the transcriptomes with virus contigs were originally generated in cell lines established from the embryonic tissues of laboratory reared ticks belonging to four distinct species (Table 1). A total of 30 contigs representing 6 viral taxa from 4 families were identified. Interestingly, three of the four cell lines (AAE1, DAE100 and DVE1) yielded only DNA viruses of the Parvoviridae family; namely, Culex pipiens pallens densovirus and Anopheles gambiae densonucleosis virus (S2 Table). In contrast, the Ixodes scapularis cell line (ISE6) transcriptomes yielded various tick-associated virus contigs of wider divergence, encompassing many RNA viruses of different families as well as currently unclassified isolates (S2 Table). Three of the virus contigs (Ixodes scapularis bunyavirus, Ixodes scapularis iflavirus, Ixodes scapularis-associated virus 3) were originally described from ISE6 cells [27], while the remainder (Ixodes scapularis-associated virus 1) was further detected in field-collected ticks [28].
Viruses in colony ticks
Twenty virus contigs representing 8 taxa belonging to 6 families were generated from 8 transcriptomes from colony reared ticks, mostly comprising Ixodes ricinus species (Fig 1, S2 Table). Interestingly, 6 (75%) of the transcriptomes yielded tick-borne pathogen contigs including ASFV, African horse sickness virus 3 (AHSV-3) and BJNV. ASFV and AHSV-3 originated from the single transcriptome generated from salivary glands of Rhipicephalus zambeziensis from South Africa. This dataset further provided a contig of Citrus tristeza virus. BJNV was exclusively identified in I. ricinus samples, which further yielded viruses from the Chuviridae (Blacklegged tick chuvirus 2, Lesnoe mivirus) and Phenuiviridae (Blacklegged tick phlebovirus 3, Phenuiviridae sp.) families, previously described in I. scapularis and Ixodes persulcatus ticks collected in the USA and China (Tables 1 and S2) [29].
Viruses in field ticks
Twelve transcriptomes from five tick species collected from a broad geographical distribution yielded 77 virus contigs of 19 viruses, classified in 10 families, representing a larger and more diverse cohort compared to other samples (Table 1, Fig 1). Pathogenic viruses including ALSV, ASFV, BJNV-Gakugsa tick virus and Nuomin virus (NUMV) were identified in 5 (41.6%) SRAs. The transcriptomes from Ixodes persulcatus and Rhipicephalus annulatus were observed as the main contributors to the virus diversity (Table 1). In Rhipicephalus annulatus transcriptomes, contigs of Bole tick virus 3, Wuhan mivirus and Wuhan tick virus 2 (Chuviridae), Lihan tick virus and Rhipicephalus associated phlebovirus 1 (Phenuiviridae) and unclassified Flaviviridae sp. were generated. These viruses have been described in Rhipicephalus as well as other tick species, displaying a worldwide distribution as evidenced by MS [29–31]. The I. persulcatus SRAs produced different viruses from related families including Taiga tick nigecruvirus, Yichun mivirus and unclassified Chuviridae sp. (Chuviridae), Fangzheng tombus-like virus (Tombusviridae) and Peribunyaviridae sp., previously documented in Ixodes sp. ticks, in addition to Totivirus sp. (Totiviridae), hosted by fungi in nature [29–33]. In two SRAs from Ornithodoros erraticus samples (F11, F12), contigs of nyaviruses (Nyamiviridae), mainly hosted by Ornithodoros ticks and birds but also detected in diverse invertebrates [34], were identified (S2 Table).
Tick-borne pathogenic viruses
Virus contigs of previously documented tick-borne pathogens were identified in 11 transcriptomes (31.4%) and comprise 42 contigs (32.8%) of 6 viral taxa from 5 families (Table 2). The most prevalent pathogen was BJNV (including Gakugsa tick virus, a discouraged synonym for BJNV; family Nairoviridae) [35], identified as 25 contigs (19.5%) in 8 samples (22.8%), in colony reared I. ricinus (n=5) and field collected I. persulcatus (n=3) samples, including various developmental stages, feeding status and sex (Table 2). Both genome segments of BJNV were represented in the contigs. Alignment and maximum likelihood analyses of virus nucleoprotein and replicase sequences (encoded by S and L genome segments) in samples with contigs of sufficient sizes revealed clustering with BJNV sequences within the genus Norwavirus (S1 Fig). In the nucleoprotein phylogeny, while contigs from field collected I. persulcatus (F4, F7) grouped with BJNV-Gakugsa tick viruses (Norwavirus beijiense species), the contig from the I. ricinus colony sample (L5) was placed as a separate taxon within the Norwavirus grotenhoutense cluster with Grotenhout and Pustyn viruses, currently classified as a separate species within the Norwavirus genus (S1 Fig) [35]. Given the diversity in identity and coverage, some contigs identified in colony-reared ticks might indicate both virus species to be present in these transcriptomes.
We further identified 7 contigs (5.4%) of NUMV (Chuviridae), implicated in human febrile diseases associated with tick bites [36], in a field collected I. persulcatus transcriptome, encompassing various regions in the NUMV genome (Table 2). ASFV (Asfarviridae) and AHSV-3 (Sedoreoviridae) were observed in an R. zambeziensis transcriptome, with various regions of the virus genomes represented in the contigs. ASFV was identified from a field collected soft tick Ornithodoros moubata transcriptome as well, with a total of 5 contigs (3.9%) in the dataset. Finally, a single contig of ALSV (Flaviviridae) genome segment 4 was generated in the transcriptome of another field collected soft tick, O. erraticus, in Spain (S2 Table).
Discussion
In this proof-of-concept study, our findings demonstrate that publicly available tick transcriptome data can be leveraged as a low-cost resource to monitor the geographical expansion and new tick species associations of tick-borne viruses. Overall, we generated 127 contigs of DNA and RNA virus genomes belonging to 15 families from 35 tick transcriptomes, where 32.8% of the contigs comprise known human and animal pathogens. BJNV was the most prevalent tick-borne pathogenic virus, identified in 19.5% of the virus contigs and in 22.8% of the transcriptomes. Classified in Norwavirus beijiense species (genus Norwavirus, family Nairoviridae), the BJNV genome comprises two segments, encoding the viral nucleoprotein (S segment) and replicase (RNA-dependent RNA polymerase, L segment), and lacks the M segment present in other nairovirus genera [35]. It is described as the causative agent of tick-associated human febrile diseases occurring in the Inner Mongolia autonomous region of China [5]. In the region, high rates of virus exposure identified by virus-specific antibodies were documented in humans, sheep and cattle. Moreover, viral pathogenicity was observed in cell culture and experimental animal inoculations. Screening of ticks from different regions of China revealed BJNV genomes in many Dermacentor, Haemaphysalis, and Ixodes spp. (including I. persulcatus), and most recently, in Rhipicephalus sanguineus sensu lato [5,33,37,38]. We detected BJNV in colony-reared I. ricinus from Switzerland and field collected I. persulcatus from China, encompassing various developmental stages, feeding status and sex. BJNV has not been previously reported from Europe or in colony-reared ticks.
Maximum likelihood analyses using longer contigs of viral genome segments provided further evidence for virus identification in field collected samples, while findings in colony samples might indicate co-infections of Norwavirus beijiense and Norwavirus grotenhoutense species in the Norwavirus genus. Currently, Norwavirus grotenhoutense species includes only Grotenhout virus, described in I. ricinus ticks from Belgium [39], and closely related viruses such as Pustyn virus and Norway nairovirus 1 documented in Bulgaria, Poland and Norway [7,40]. In any case, these findings warrant further screening for this virus species accommodating a documented pathogen in Europe.
We further detected NUMV, comprising 5.4% of the total contigs, in a field collected I. persulcatus transcriptome (Tables 1 and S2). NUMV is the only chuvirus with potential medical significance demonstrated so far [36]. Chuviruses (family Chuviridae) are single-stranded negative sense RNA viruses, with diverse genome topologies including unsegmented, segmented, linear, or circular genomes [41,42]. Widespread around the globe, they were discovered by MS and subsequently identified in arachnids including ticks and spiders as well as several insects, barnacles, decapod crustaceans and reptiles. The NUMV index case had presented with non-specific febrile disease following tick bites and the resulting investigations documented 54 patients with detectable virus genomes and antibody responses during 2017–2019 [36]. Therefore, NUMV is a strong candidate to be included in the list of recently described viruses to be considered an etiological agent in the workup of cases with unknown etiology. In China, it is found in many Ixodes and Haemaphysalis species and isolated from I. persulcatus [36], in parallel with our transcriptome findings.
Analysis of the transcriptome data revealed ASFV, comprising 3.9% of the contigs and present in 5.7% of the datasets (Tables 2 and S2). ASFV produces a tick-transmitted hemorrhagic fever with up to 100% mortality rate in domestic and feral swine, resulting in tremendous socioeconomic impact [43]. It is endemic in Africa affecting many countries and has emerged in Europe, Asia, and in the Americas [44]. Soft ticks of the genus Ornithodoros are the only recognized ASFV biological vectors and I. ricinus and D. reticulatus are unlikely to be relevant for transmission [45]. Our finding of ASFV in the field collected O. moubata transcriptome coincides with virus epidemiology, although detection in an R. zambeziensis colony is perplexing. It remains to be described whether this species has potential to contribute to virus circulation or merely represents a coincidental finding. Another unexpected pathogen, AHSV-3, was also present in the same transcriptome data, with four of ten virus genome segments being represented as contigs (Table 2). African horse sickness is widely distributed in sub-Saharan Africa and affects several equids, with detrimental impact across working equids and domestic horse industries [46]. The primary vectors responsible for transmission are the Culicoides biting midges, with evidence for involvement of mosquitoes and/or ticks [47]. Hence, the impact of virus detection in the colony requires further assessment for local AHSV-3 epidemiology and control.
We identified another tick-borne pathogen, ALSV, represented as a single contig of genome segment 4, encoding the virus structural proteins VP2/3, in the study dataset [48]. ALSV is classified in the Jingmenvirus group of the family Flaviviridae and possesses a positive-sense single-stranded RNA genome in four segments. Human infections are described as tick bite-associated, non-specific febrile disease [3]. ALSV has been reported in Ixodes, Dermacentor and Haemaphysalis ticks from many Eurasian countries such as China, Finland, France, Germany, Russia, Serbia and Switzerland, with evidence of exposure in sheep, cattle, and deer [3,49–52]. Moreover, vector competence of I. ricinus and Dermacentor reticulatus could be experimentally demonstrated [52]. However, ALSV has so far not been reported in Argasid ticks of the O. erraticus complex, nor from Spain. Capable of transmitting several important livestock and human pathogens, including ASFV and Borrelia hispanica, the O. erraticus tick complex is reported from the Iberian Peninsula, North and West Africa, and western Asia [53]. A possible genome integration should also be considered, given that Jingmen tick virus, a tick-borne human pathogen closely related to ALSV, was previously reported to partially integrate into the I. ricinus genome [54]. Further tick screening is needed to elucidate presence and probable replication of ALSV in O. erraticus and possible impact on human/animal health.
Our exploration further revealed other viruses not directly associated with tick lifecycle in nature (Table 1), such as Triatoma virus (Dicistroviridae) that has so far only been described in wild and colony populations of Triatominae (kissing bugs) [55]. The detection of this virus in field collected O. rostratus possibly resulted from shared host feeding in the environment. Similarly, presence of Citrus tristeza virus, the agent of the economically damaging plant disease of the Citrus genus and vectored by certain aphid species, in colony-reared R. zambeziensis suggests a potential environmental origin. Tick cell line ISE6 transcriptomes further revealed bunyaviruses and iflaviruses, previously reported in identical cell lines, with scarce information about modes of infection, persistence, and impact on tick cells or individual ticks [27]. Interestingly, we observed particular tick cell lines to generate densovirus contigs, isolated from many mosquito species of the genera Culex and Aedes as well as mosquito cell lines [56,57]. These viruses have not been reported in ticks or tick cell lines previously and their origins remain obscure.
In this study, we used a straightforward strategy involving de novo assembly, followed by aligning to the NCBI non-redundant protein database and taxonomical binning by robust tools, for generating virus contigs in tick transcriptomes selected for analysis [58]. Comparable approaches have been used on mosquito, mammalian and avian transcriptomes, mostly with the prime objective of novel virus discovery [13,59]. Other strategies include querying well-conserved amino acid imprints in RNA-dependent RNA polymerase enzymes, a hallmark gene of RNA viruses that lack a DNA stage during replication [60]. Screening publicly available data by aligning to a custom curated database with tick-borne virus reference genomes of interest was performed as well, with a subsequent risk prediction for zoonotic infections [61]. Currently, there is no standardized or optimized approach or toolset for transcriptome/metagenome data processing for broad range pathogen screening or virus discovery. When previously generated data is recycled for other purposes, the quality of the initial sequencing, frequently intended for targets other than viruses, becomes an important bottleneck and may result in lower virus abundance and shorter contig assemblies [62]. Complete virus genome assemblies are often difficult to generate, requiring scaffold assemblies and contig extensions to maximize coverage. Moreover, such assemblies carry the risk of being derived from multiple virus populations, leading to artificially generated chimeric genomes, which can occur between closely related exogenous virus populations and with integrated but sufficiently similar sequences in the host genomes [62]. Therefore, appropriate checks for data accuracy and integrity are essential, especially in novel virus discovery attempts.
In this study, with aims to explore existing data to uncover tick-borne viral pathogens, we examined the generated virus contigs directly, utilizing commonly-used tools and straightforward approaches, for unbiased data interpretation and better reproducibility.
A significant bottleneck is the insufficient metadata associated with some publicly available transcriptomes, which can significantly hamper understanding the origin and nature of the sample. Most public repositories including NCBI do not currently impose stringent criteria on metadata to be provided during submissions. Also, not all information embedded in the submission is searchable, preventing robust selection of transcriptomes for further analysis. We experienced these issues during our initial selection, which resulted in our manual assessment of metadata and additional, non-standardized selection of entries to be included in the downstream process, a major limitation of the current study. Nevertheless, our study demonstrates that the data is suitable for pathogen screening and would be of significance with more stringent and reproducible data selection criteria. Moreover, it has advantages for identifying known tick-borne viruses, such as directly following NCBI database updates and generating contigs encompassing various regions of the virus genomes; therefore, multi-layered evidence for pathogen presence. Integrated or endogenized viral sequences in tick chromosomes have been documented for many viruses, which poses a challenge for data interpretation as well as nucleic acid assays commonly used in screening, and might require additional genome targets to confirm presence of replication competent viruses [54,63]. In any case, the findings of in silico explorations such as in this study must be interpreted with caution and pathogen detections should preferably be confirmed by follow-up screening using field collected samples and standardized assays.
As genetic databases are estimated to double almost every 18 months (https://www.ncbi.nlm.nih.gov/genbank/statistics/; accessed May 4 2023), re-use of original datasets might provide a low-cost option for monitoring the spread of global viruses, and help elucidate new insect vector and vertebrate host associations. Our findings demonstrate that previously generated transcriptome data can be leveraged for detecting tick-borne viruses, as exemplified by preliminary evidence for several human pathogens. Monitoring tick-borne viruses using publicly available data can significantly augment surveillance efforts, highlighting regions of probable spillover and identifying targets for screening.
Supporting information
S1 Appendix. Sequences of virus contigs generated in the study.
Sample information is provided in S1 Table.
https://doi.org/10.1371/journal.pntd.0012907.s001
(XLSX)
S1 Fig. The maximum likelihood tree of the Norwavirus replicase (A: L segment, 540 amino acids), and nucleoprotein (B: S segment, 450 amino acids), constructed using Jones-Taylor-Thornton model with uniform rates for 500 replications.
Sequences obtained in the study are marked and indicated with sample identifiers. Virus strains are indicated by GenBank accession number, name and isolate identifier. Crimean-Congo hemorrhagic fever virus isolate Matin was included as an outgroup.
https://doi.org/10.1371/journal.pntd.0012907.s002
(PDF)
S1 Table. List of transcriptomes processed for virus contig generation.
https://doi.org/10.1371/journal.pntd.0012907.s003
(XLSX)
S2 Table. Sample information, virus contigs and top BLAST hits.
Longer contigs produced by reference mapping are indicated as Lcontig. Contigs with insertion/deletions (InDel) were marked.
https://doi.org/10.1371/journal.pntd.0012907.s004
(XLSX)
Acknowledgments
This manuscript was prepared whilst KE and BPB held National Research Council (NRC) Research Associateship Awards at the Walter Reed Biosystematics Unit, through the Walter Reed Army Institute of Research, Silver Spring, MD. All or portions of the laboratory and/or data analysis were conducted in and with the support of the Laboratories of Analytical Biology (https://ror.org/05b8c0r92) of the National Museum of Natural History. The opinions or assertions contained herein are the private views of the authors, and are not to be construed as official, or as reflecting true views of the Department of the Army, Navy or the Department of Defense. Material contained within this publication has been reviewed by the Walter Reed Army Institute of Research. There is no objection to its presentation and/or publication. Preliminary findings of the study were presented at the 3rd Annual Ticks and Tickborne Diseases Symposium, Johns Hopkins Bloomberg School of Public Health, April 30th, 2025, Baltimore, MD.
References
- 1. Mansfield KL, Jizhou L, Phipps LP, Johnson N. Emerging Tick-Borne Viruses in the Twenty-First Century. Front Cell Infect Microbiol. 2017;7:298. pmid:28744449
- 2. Ogden NH, Mechai S, Margos G. Changing geographic ranges of ticks and tick-borne pathogens: drivers, mechanisms and consequences for pathogen diversity. Front Cell Infect Microbiol. 2013;3:46. pmid:24010124
- 3. Wang Z-D, Wang B, Wei F, Han S-Z, Zhang L, Yang Z-T, et al. A New Segmented Virus Associated with Human Febrile Illness in China. N Engl J Med. 2019;380(22):2116–25. pmid:31141633
- 4. Jia N, Liu H-B, Ni X-B, Bell-Sakyi L, Zheng Y-C, Song J-L, et al. Emergence of human infection with Jingmen tick virus in China: A retrospective study. EBioMedicine. 2019;43:317–24. pmid:31003930
- 5. Wang Y-C, Wei Z, Lv X, Han S, Wang Z, Fan C, et al. A new nairo-like virus associated with human febrile illness in China. Emerg Microbes Infect. 2021;10(1):1200–8. pmid:34044749
- 6. Liu X, Zhang X, Wang Z, Dong Z, Xie S, Jiang M, et al. A Tentative Tamdy Orthonairovirus Related to Febrile Illness in Northwestern China. Clin Infect Dis. 2020;70(10):2155–60. pmid:31260510
- 7. Ergunay K, Bourke BP, Reinbold-Wasson DD, Nikolich MP, Nelson SP, Caicedo-Quiroga L, et al. The expanding range of emerging tick-borne viruses in Eastern Europe and the Black Sea Region. Sci Rep. 2023;13(1):19824. pmid:37963929
- 8. Ergunay K, Bourke BP, Reinbold-Wasson DD, Caicedo-Quiroga L, Vaydayko N, Kirkitadze G, et al. Novel clades of tick-borne pathogenic nairoviruses in Europe. Infect Genet Evol. 2024;121:105593. pmid:38636618
- 9. Liang G, Gao X, Gould EA. Factors responsible for the emergence of arboviruses; strategies, challenges and limitations for their control. Emerg Microbes Infect. 2015;4(3):e18. pmid:26038768
- 10. Ergunay K, Bourke BP, Achee N, Jiang L, Grieco J, Linton Y-M. Vector-borne pathogen surveillance in a metagenomic world. PLoS Negl Trop Dis. 2024;18(2):e0011943. pmid:38386620
- 11. Bassi C, Guerriero P, Pierantoni M, Callegari E, Sabbioni S. Novel Virus Identification through Metagenomics: A Systematic Review. Life (Basel). 2022;12(12):2048. pmid:36556413
- 12. Katz K, Shutov O, Lapoint R, Kimelman M, Brister JR, O’Sullivan C. The Sequence Read Archive: a decade more of explosive growth. Nucleic Acids Res. 2022;50(D1):D387–90. pmid:34850094
- 13. Kawasaki J, Kojima S, Tomonaga K, Horie M. Hidden Viral Sequences in Public Sequencing Data and Warning for Future Emerging Diseases. mBio. 2021;12(4):e0163821. pmid:34399612
- 14. Chen S, Zhou Y, Chen Y, Gu J. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics. 2018;34(17):i884–90. pmid:30423086
- 15. Chen S. Ultrafast one-pass FASTQ data preprocessing, quality control, and deduplication using fastp. Imeta. 2023;2(2):e107. pmid:38868435
- 16. Li D, Liu C-M, Luo R, Sadakane K, Lam T-W. MEGAHIT: an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph. Bioinformatics. 2015;31(10):1674–6. pmid:25609793
- 17. Buchfink B, Xie C, Huson DH. Fast and sensitive protein alignment using DIAMOND. Nat Methods. 2015;12(1):59–60. pmid:25402007
- 18. Buchfink B, Reuter K, Drost H-G. Sensitive protein alignments at tree-of-life scale using DIAMOND. Nat Methods. 2021;18(4):366–8. pmid:33828273
- 19. Huson DH, Albrecht B, Bağcı C, Bessarab I, Górska A, Jolic D, et al. MEGAN-LR: new algorithms allow accurate binning and easy interactive exploration of metagenomic long reads and contigs. Biol Direct. 2018;13(1):6. pmid:29678199
- 20. Bağcı C, Patz S, Huson DH. DIAMOND+MEGAN: Fast and Easy Taxonomic and Functional Analysis of Short and Long Microbiome Sequences. Curr Protoc. 2021;1(3):e59. pmid:33656283
- 21. Shen W, Ren H. TaxonKit: A practical and efficient NCBI taxonomy toolkit. J Genet Genomics. 2021;48(9):844–50. pmid:34001434
- 22. Li H. Seqtk; 2013 [cited 2023 May 19]. Database: GitHub [Internet]. Available from: https://github.com/lh3/seqtk
- 23. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215(3):403–10. pmid:2231712
- 24. Li H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics. 2018;34(18):3094–100. pmid:29750242
- 25. Larkin MA, Blackshields G, Brown NP, Chenna R, McGettigan PA, McWilliam H, et al. Clustal W and Clustal X version 2.0. Bioinformatics. 2007;23(21):2947–8. pmid:17846036
- 26. Tamura K, Stecher G, Kumar S. MEGA11: Molecular Evolutionary Genetics Analysis Version 11. Mol Biol Evol. 2021;38(7):3022–7. pmid:33892491
- 27. Nakao R, Matsuno K, Qiu Y, Maruyama J, Eguchi N, Nao N, et al. Putative RNA viral sequences detected in an Ixodes scapularis-derived cell line. Ticks Tick Borne Dis. 2017;8(1):103–11. pmid:27769656
- 28. Tokarz R, Williams SH, Sameroff S, Sanchez Leon M, Jain K, Lipkin WI. Virome analysis of Amblyomma americanum, Dermacentor variabilis, and Ixodes scapularis ticks reveals novel highly divergent vertebrate and invertebrate viruses. J Virol. 2014;88(19):11480–92. pmid:25056893
- 29. Ni X-B, Cui X-M, Liu J-Y, Ye R-Z, Wu Y-Q, Jiang J-F, et al. Metavirome of 31 tick species provides a compendium of 1,801 RNA virus genomes. Nat Microbiol. 2023;8(1):162–73. pmid:36604510
- 30. Molina-Hoyos K, Montoya-Ruíz C, Aguilar PV, Pérez-Doria A, Díaz FJ, Rodas JD. Virome analyses of Amblyomma cajennense and Rhipicephalus microplus ticks collected in Colombia. Acta Trop. 2024;253:107158. pmid:38402921
- 31. Ergünay K, Dinçer E, Kar S, Emanet N, Yalçınkaya D, Polat Dinçer PF, et al. Multiple orthonairoviruses including Crimean-Congo hemorrhagic fever virus, Tamdy virus and the novel Meram virus in Anatolia. Ticks Tick Borne Dis. 2020;11(5):101448. pmid:32723637
- 32. Qin T, Shi M, Zhang M, Liu Z, Feng H, Sun Y. Diversity of RNA viruses of three dominant tick species in North China. Front Vet Sci. 2023;9:1057977. pmid:36713863
- 33. Liu Z, Li L, Xu W, Yuan Y, Liang X, Zhang L, et al. Extensive diversity of RNA viruses in ticks revealed by metagenomics in northeastern China. PLoS Negl Trop Dis. 2022;16(12):e0011017. pmid:36542659
- 34. Dietzgen RG, Firth AE, Jiāng D, Junglen S, Kondo H, Kuhn JH, et al. ICTV Virus Taxonomy Profile: Nyamiviridae 2021. J Gen Virol. 2021;102(11):001681. pmid:34738886
- 35. Kuhn JH, Abe J, Adkins S, Alkhovsky SV, Avšič-Županc T, Ayllón MA, et al. Annual (2023) taxonomic update of RNA-directed RNA polymerase-encoding negative-sense RNA viruses (realm Riboviria: kingdom Orthornavirae: phylum Negarnaviricota). J Gen Virol. 2023;104(8):001864. pmid:37622664
- 36. Quan L, Wang ZD, Gao Y, Lv X, Han S, Zhang X, et al. Identification of a new chuvirus associated with febrile illness in China. Rs:104938/v1 [Preprint]. 2020 [cited 2020 December 15]. Available from: https://128.84.21.199/abs/1403.3301v1
- 37. Cai X, Cai X, Xu Y, Shao Y, Fu L, Men X, et al. Virome analysis of ticks and tick-borne viruses in Heilongjiang and Jilin Provinces, China. Virus Res. 2023;323:199006. pmid:36414189
- 38. Wang G, Tian X, Peng R, Huang Y, Li Y, Li Z, et al. Genomic and phylogenetic profiling of RNA of tick-borne arboviruses in Hainan Island, China. Microbes Infect. 2024;26(1–2):105218. pmid:37714509
- 39. Vanmechelen B, Laenen L, Vergote V, Maes P. Grotenhout Virus, a Novel Nairovirus Found in Ixodes ricinus in Belgium. Genome Announc. 2017;5(21):e00288-17. pmid:28546475
- 40. Pettersson JH-O, Shi M, Bohlin J, Eldholm V, Brynildsrud OB, Paulsen KM, et al. Characterizing the virome of Ixodes ricinus ticks from northern Europe. Sci Rep. 2017;7(1):10870. pmid:28883464
- 41. Kuhn JH, Dheilly NM, Junglen S, Paraskevopoulou S, Shi M, Di Paola N. ICTV Virus Taxonomy Profile: Jingchuvirales 2023. J Gen Virol. 2023;104(12):001924. pmid:38112154
- 42. Li C-X, Shi M, Tian J-H, Lin X-D, Kang Y-J, Chen L-J, et al. Unprecedented genomic diversity of RNA viruses in arthropods reveals the ancestry of negative-sense RNA viruses. Elife. 2015;4:e05378. pmid:25633976
- 43. Dixon LK, Stahl K, Jori F, Vial L, Pfeiffer DU. African Swine Fever Epidemiology and Control. Annu Rev Anim Biosci. 2020;8:221–46. pmid:31743062
- 44. Ruiz-Saenz J, Diaz A, Bonilla-Aldana DK, Rodríguez-Morales AJ, Martinez-Gutierrez M, Aguilar PV. African swine fever virus: A re-emerging threat to the swine industry and food security in the Americas. Front Microbiol. 2022;13:1011891. pmid:36274746
- 45. de Carvalho Ferreira HC, Tudela Zúquete S, Wijnveld M, Weesendorp E, Jongejan F, Stegeman A, et al. No evidence of African swine fever virus replication in hard ticks. Ticks Tick Borne Dis. 2014;5(5):582–9. pmid:24980962
- 46. Dennis SJ, Meyers AE, Hitzeroth II, Rybicki EP. African Horse Sickness: A Review of Current Understanding and Vaccine Development. Viruses. 2019;11(9):844. pmid:31514299
- 47. Carpenter S, Mellor PS, Fall AG, Garros C, Venter GJ. African Horse Sickness Virus: History, Transmission, and Current Status. Annu Rev Entomol. 2017;62:343–58. pmid:28141961
- 48. Colmant AMG, Charrel RN, Coutard B. Jingmenviruses: Ubiquitous, understudied, segmented flavi-like viruses. Front Microbiol. 2022;13:997058. pmid:36299728
- 49. Stegmüller S, Qi W, Torgerson PR, Fraefel C, Kubacki J. Hazard potential of Swiss Ixodes ricinus ticks: Virome composition and presence of selected bacterial and protozoan pathogens. PLoS One. 2023;18(11):e0290942. pmid:37956168
- 50. Wang Z-D, Wang W, Wang N-N, Qiu K, Zhang X, Tana G, et al. Prevalence of the emerging novel Alongshan virus infection in sheep and cattle in Inner Mongolia, northeastern China. Parasit Vectors. 2019;12(1):450. pmid:31511049
- 51. Kuivanen S, Levanov L, Kareinen L, Sironen T, Jääskeläinen AJ, Plyusnin I, et al. Detection of novel tick-borne pathogen, Alongshan virus, in Ixodes ricinus ticks, south-eastern Finland, 2019. Euro Surveill. 2019;24(27):1900394. pmid:31290392
- 52. Ebert CL, Söder L, Kubinski M, Glanz J, Gregersen E, Dümmer K, et al. Detection and Characterization of Alongshan Virus in Ticks and Tick Saliva from Lower Saxony, Germany with Serological Evidence for Viral Transmission to Game and Domestic Animals. Microorganisms. 2023;11(3):543. pmid:36985117
- 53. Boinas F, Ribeiro R, Madeira S, Palma M, de Carvalho IL, Núncio S, et al. The medical and veterinary role of Ornithodoros erraticus complex ticks (Acari: Ixodida) on the Iberian Peninsula. J Vector Ecol. 2014;39(2):238–48. pmid:25424252
- 54. Morozkin ES, Makenov MT, Zhurenkova OB, Kholodilov IS, Belova OA, Radyuk EV, et al. Integrated Jingmenvirus Polymerase Gene in Ixodes ricinus Genome. Viruses. 2022;14(9):1908. pmid:36146715
- 55. Marti GA, Bonica MB, Susevich ML, Reynaldi F, Micieli MV, Echeverría MG. Host range of Triatoma virus does not extend to Aedes aegypti and Apis mellifera. J Invertebr Pathol. 2020;173:107383. pmid:32298695
- 56. Zhai Y-G, Lv X-J, Sun X-H, Fu S-H, Gong Z, Fen Y, et al. Isolation and characterization of the full coding sequence of a novel densovirus from the mosquito Culex pipiens pallens. J Gen Virol. 2008;89(Pt 1):195–9. pmid:18089743
- 57. Li W-J, Wang J-L, Li M-H, Fu S-H, Wang H-Y, Wang Z-Y, et al. Mosquitoes and mosquito-borne arboviruses in the Qinghai-Tibet Plateau--focused on the Qinghai area, China. Am J Trop Med Hyg. 2010;82(4):705–11. pmid:20348523
- 58. Portik DM, Brown CT, Pierce-Ward NT. Evaluation of taxonomic classification and profiling methods for long-read shotgun metagenomic sequencing datasets. BMC Bioinformatics. 2022;23(1):541. pmid:36513983
- 59. Shi C, Zhao L, Atoni E, Zeng W, Hu X, Matthijnssens J, et al. Stability of the Virome in Lab- and Field-Collected Aedes albopictus Mosquitoes across Different Developmental Stages and Possible Core Viruses in the Publicly Available Virome Data of Aedes Mosquitoes. mSystems. 2020;5(5):e00640-20. pmid:32994288
- 60. Edgar RC, Taylor B, Lin V, Altman T, Barbera P, Meleshko D, et al. Petabase-scale sequence alignment catalyses viral discovery. Nature. 2022;602(7895):142–7. pmid:35082445
- 61. Lin Y, Pascall DJ. Characterisation of putative novel tick viruses and zoonotic risk prediction. Ecol Evol. 2024;14(1):e10814. pmid:38259958
- 62. Brait N, Hackl T, Morel C, Exbrayat A, Gutierrez S, Lequime S. A tale of caution: How endogenous viral elements affect virus discovery in transcriptomic data. Virus Evol. 2023;10(1):vead088. pmid:38516656
- 63. Barnes M, Price DC. Endogenous Viral Elements in Ixodid Tick Genomes. Viruses. 2023;15(11):2201. pmid:38005880