Transovarial transmission of a core virome in the Chagas disease vector Rhodnius prolixus

Triatomine assassin bugs comprise hematophagous insect vectors of Trypanosoma cruzi, the causative agent of Chagas disease. Although the microbiome of these species has been investigated to some extent, only one virus infecting Triatoma infestans has been identified to date. Here, we describe for the first time seven (+) single-strand RNA viruses (RpV1-7) infecting Rhodnius prolixus, a primary vector of Chagas disease in Central and South America. We show that the RpVs belong to the Iflaviridae, Permutotetraviridae and Solemoviridae and are vertically transmitted from the mothers to the progeny via transovarial transmission. Consistent with this, all the RpVs, except RpV2 that is related to the entomopathogenic Slow bee paralysis virus, established persistent infections in our R. prolixus colony. Furthermore, we show that R. prolixus ovaries express 22-nucleotide viral siRNAs (vsiRNAs), but not viral piRNAs, that originate from the processing of dsRNA intermediates during viral replication of the RpVs. Interestingly, the permutotetraviruses and sobemoviruses display shared pools of vsiRNAs that might provide the basis for a cross-immunity system. The vsiRNAs are maternally deposited in the eggs, where they likely contribute to reduce the viral load and protect the developing embryos. Our results unveil for the first time a complex core virome in R. prolixus and begin to shed light on the RNAi-based antiviral defenses in triatomines.


Author summary
Rhodnius prolixus is a triatomine insect and a primary vector of Trypanosoma cruzi, the etiologic agent of Chagas disease, in Central and South America. Despite the medical relevance, very little is known about the viruses that infect these so-called assassin bugs. In this study, we show for the first time that triatomines can support the concomitant infection of a variety of RNA viruses belonging to distantly related viral families. Remarkably, we show that the viruses are vertically transmitted from the mothers to the progeny via transovarial transmission. The detection of 22-nucleotide viral small interfering RNAs in mature eggs strongly suggests that RNAi mechanisms contribute to reduce the viral load

Introduction
Triatomine insects (Hemiptera, Reduviidae, Triatominae) include hematophagous species that are responsible for the transmission of Chagas disease, an infectious illness that affects 6-7 million people worldwide [1]. The insect genera Rhodnius, Triatoma and Panstrongylus harbor well established vectors of Trypanosoma cruzi, the etiologic agent of the illness, with Rhodnius prolixus being the most important vector in Colombia, Venezuela and other areas in Central and South America.The protozoan is typically transmitted to the human host through the feces and urine of the bug. Chagas disease is characterized by chronic cardiac, digestive and neurologic alterations, which can culminate in sudden death due to heart failure [1,2]. Approximately 10,000 people die of Chagas disease per year. Although vectorial transmission has only been documented in Central and South America, this disease has already spread to the United States of America, Canada, many European Countries, Australia and Japan due to food-borne transmission, blood transfusions, organ transplantation, laboratory accidents and congenital transmission from mother to child [3][4][5]. Currently there are no vaccines or cures for the illness and the most efficient methods to reduce its spreading rely on vector control strategies [1]. R. prolixus is a hemimetabolous insect and its development proceeds through 5 nymph stages before reaching the winged adult stage [6,7]. Since each molting phase is promoted by a blood meal, the insect is potentially capable of transmitting the disease at any developmental stage. Blood meals are also critical to initiate oogenesis in adult females, where they allow the production of up to 45 eggs per female. Each ovary in this species is formed by 6-8 ovarioles resembling assembly lines ( Fig 1A) [6,[8][9][10]. At the anterior region of the ovarioles, a lanceolate structure known as the tropharium harbors both the trophocytes (i.e. nurse cells) and the immature pro-oocytes. The trophocytes are arranged into a syncytium around a central cavity termed the trophic core. At the posterior region of the tropharium, the oocytes resume meiosis and are encapsulated by follicle cells in an orderly fashion to produce the egg chambers [9,11]. In R. prolixus, the egg chambers are exclusively formed by the oocyte surrounded by external follicle cells and remain connected to the tropharium via cytoplasmic bridges termed trophic cords [8]. The cords represent transport routes that transfer RNAs, proteins and nutrients produced by the trophocytes from the trophic core onto the growing egg chamber. Oogenesis in R. prolixus culminates in the production of eggs encapsulated by a hard chorion that protects the oocyte from dehydration and mechanical stress [12]. However, small pores and micropili along the margin of the operculum allow the exchange of gases and liquids as well as the fertilization process. At the end of the embryonic development, the 1st instar nymphs hatch by displacing the operculum.
A broad range of arthropod species have been shown to host complex populations of viruses, which can establish persistent infections that, in many cases, do not cause easily detectable symptoms. However, certain viruses can dramatically affect the behavior, survival and fitness of the host [13][14][15]. Several members of the Picorna-Calici clade can be either beneficial or pathogenic to their hosts [16]. For instance, the Slow Bee Paralysis Virus (SBPV) transmitted by Varroa destructor mites together with other related viruses, like the Kashmir virus and the Deformed wing virus, have been connected to the collapse of honeybee colonies (i.e. Colony Collapse Disorder), which result in severe economic losses [17][18][19]. The Drosophila C virus, also a Dicistrovirus like SBPV, was shown to increase the mortality in fruit flies by causing intestinal obstruction and metabolic depression [20][21][22]. Some arboviruses are well-established agents of zoonosis, like the Dengue, Zika, Chikungunya and Yellow Fever Flaviviruses, which are transmitted by mosquitoes of the Aedes genus and represent a global health burden [23]. The number of newly discovered arthropod viruses has been exponentially increasing after the development of massive parallel sequencing technologies, but in the majority of the cases, little is known about their biology and ecology as well as their impact on human health and economy [24,25]. It was recently shown that insects host stable populations of viruses, or core viromes, that are vertically transmitted both in laboratory colonies and in nature [26,27].
The host/virus interaction is regulated by the balance between antiviral systems and evasion strategies evolved by the virus. This arms race can culminate in the clearing of the virus from the infected cells, the death of the host or the establishment of persistent infections lacking apparent phenotypes. RNA interference is the primary antiviral defense system in insects [28,29]. Typically, double-strand RNA intermediates that are transiently generated during viral replication provide a template for the host Dicer2 (Dcr2) enzyme that cleaves the dsRNA to generate 20 to 22-nucleotide (nt) small non-coding RNAs depending on the species known as viral small interfering RNAs (vsiRNAs). With the help of the R2D2 factor, the vsiRNAs are then loaded into the RNA-induced Silencing Complex or RISC centred on the Argonaute2 (Ago2) protein. The vsiRNAs serve as guides for the RISC to recognize and cleave the viral genomes via the slicing activity of Ago2 [30]. A variety of cellular exonucleases then further degrade the viral sequences and clear the virus from the cell. A growing body of evidence points to a role for another branch of the RNAi phenomena, namely the piRNA pathway, in antiviral defenses in mosquitos [31,32]. In the fruit fly Drosophila melanogaster, where the pathway was first discovered, the Piwi-interacting RNAs or piRNAs are 18-30-nt non-coding RNAs that are found in complex with members of the Piwi-clade Argonaute proteins (i.e. PIWIs). In this species, the PIWI/piRNA complexes do not exert antiviral functions, rather they are involved in the silencing of transposable elements and the maintenance of genomic stability. In mosquitos instead, piRNAs originating from viral sequences integrated in the host genome, that is the Endogenous Viral Elements or EVEs, guide the PIWIs to target and eliminate cognate viral genomes [33,34].
Chagas disease is mainly transmitted by insects belonging to the Rhodnius, Triatoma and Panstrongylus genera, but more than 150 triatomine species maintain T. cruzi infections in the wild and are potential vectors of the disease [3,35]. Despite the medical relevance however, the virome of these insect species has been poorly investigated and to date, only one virus, namely Triatoma virus (TrV), has been described in Triatoma infestans [36][37][38]. Also, RNAi has been extensively used as a tool for functional studies [39,40], but its physiological role in antiviral defense systems remains unexplored in triatomines. In this study, we employed stage-specific de novo transcriptome assembly and small RNA profiling to identify novel viruses and investigate the antiviral systems during R. prolixus oogenesis. We show for the first time that R.
prolixus harbors a complex core virome that is vertically transferred from the mothers to the offspring via transovarial transmission. Furthermore, we find that viral siRNAs, but not viral piRNAs, are produced during oogenesis and likely contribute to protect R. prolixus germ cells and early embryos and to prevent persistent viral infections. Our findings shed light on the complexity of the virome in triatomine species and their antiviral defense systems and provide a new toolkit for the development of vector population control strategies.

Identification of seven novel (+) ssRNA viruses in R. prolixus ovaries
We have recently described RNA-Seq transcriptomes from previtellogenic stages of oogenesis (PVS) and mature chorionated eggs (Egg) of R. prolixus [39]. In this study, we further interrogated those datasets and discovered that at least 39,340 (0.1%), 46,413 (0.2%), 51,905 (0.2%) and 139,648 (0.6%) reads were related to viral sequences in our RNA-Seq libraries PVS_1, PVS_2, Egg_1 and Egg_2, respectively. De novo transcript assembly initially provided 8 contigs in PVS and 7 in Egg stages. Alignment of the contigs in PVS with those found in Egg ultimately led us to identify 7 unique viral genomes, which we labeled Rhodnius prolixus Virus 1-7 (RpV1-7) (Fig 1). Multiple sequence alignments and NCBI Blast analyses of the putative RNAdirected RNA polymerases (RdRp) or Open Reading Frames (ORFs) allowed us to establish the phylogenetic relationships among the RpVs and closely related viruses from other arthropods ( Fig 1B). The RpVs can be grouped in three different families: Iflaviridae (RpV1 and RpV2), Permutotetraviridae (RpV3, RpV4 and RpV7) and Solemoviridae (RpV5 and RpV6) (Fig 1B-1E) [25]. None of the RpVs shares similarity with TrV, the only known triatomine virus even though TrV also belongs to the Picornavirales order.
Two of the longest contigs correspond to novel viruses of the Iflavirus genus. RpV1 (MZ328304) is represented by a 9.6 Kilobases (Kb) long contig and displays two main ORFs encoding 1,248 AA and 1,632 AA putative proteins sharing 36.99% and 40.88% amino acid (AA) sequence identity with the protein encoded by ORF1 and ORF2 of the Nesidiocoris tenuis iflavirus 1, respectively [41] (Fig 1B and 1C). Remarkably, RpV1 is also highly similar to Deformed wing virus (~32% AA identity), a well characterized entomopathogenic virus. ORF2 encodes a putative RdRp enzyme and a RNA helicase, while ORF1 codes for the capsid proteins typical of Picornavirales (rhv-like) and Cricket Paralysis virus (CRPV). ORF1 and ORF2 of RpV1 display a 1-nt overlap typical of iflaviruses that however was not observed in N. tenuis virus [41] (Fig 1C). The 5' and 3' Untranslated Regions (UTRs) of RpV1 are 759nt and 214nt respectively. Although the size of RpV1 is close to the typical length of iflaviruses (i.e.~10kb), it is possible that our RNA-Seq library preparation protocols failed to fully capture the sequences at the 5' and 3' ends. Similar limitations in defining the exact 5' and 3' ends might also apply to the other RpVs described in this study. RpV2 (MZ328305) also appears to belong to the Iflavirus genus, but its genome encodes a single polyprotein 2,890 AA in length, that dis-plays~70% AA identity with the polyprotein of the SBPV [42]. A closer relative of RpV2 in the phylogenetic tree is also the Bat iflavirus (Fig 1B and 1C). The longest contig for RpV2 is 8.9Kb long and the encoded polyprotein is similar to that of the SBPV with RhV-like and CRPV capsid proteins at the N-terminal region and the RdRp enzyme harbored in the C-terminus ( Fig 1C). Three novel viruses appear to belong to the Permutotetraviridae family (Alphapermutotetravirus genus). One of the best characterized viruses of this clade is the Thosea asigna virus that displays a typical T = 4 icosahedral symmetry of the capsid [43]. The contigs for the RpV3 (MZ328306), RpV4 (MZ328307) and RpV7 (MZ328310) are~5Kb long (Fig 1D). The genomes of these RpVs display two partially overlapping ORFs with ORF1 encoding the RdRp

PLOS PATHOGENS
Transovarial transmission of RNA viruses in R. prolixus and ORF2 the putative capsid and structural proteins. The putative RdRp enzymes of RpV3, RpV4 and RpV7 share~42% to~47% AA sequence identity with proteins from the closely related Shuangao and Hubei permutotetra-like viruses (Fig 1B and 1D). ClustalW multiple sequence alignment of the RpV3, RpV4 and RpV7 genomes revealed extensive nucleotide sequence identity in any of the combinations (~74% between RpV3 and RpV4, 73% between RpV4 and RpV7 and 78% between RpV3 and RpV7) (S1 Table).
Finally, RpV5 (MZ328308) and RpV6 (MZ328309) belong to the Solemoviridae family (Sobemovirus genus). This family harbors a broad variety of well-characterized plant viruses that are often phytopathogenic and severely affect agriculture [44,45]. Although these viruses cycle between plants and insects, including beetles, aphids and ticks, their interaction with the invertebrate hosts has been poorly investigated. RpV5 and RpV6 display~41% and~66% AA sequence identity with Atrato sobemo-like virus 6 respectively (Fig 1A and 1D). The size of both RpV5 and RpV6 contigs is~2.8Kb with two overlapping ORFs, whereby ORF1 encodes the Peptidase activity and ORF2 the RdRp enzyme. The alignment of the putative RdRp enzymes encoded by these viruses showed 70% AA sequence identity, while their genomes share very low nucleotide sequence identity (<30%). Sobemoviruses are best characterized in plants, while much less is known about the insect variants. Our data reveal for the first time that R. prolixus ovaries support the infection of a range of phylogenetically distant (+) single strand RNA viruses.

Simultaneous RpVs infections are detected during R. prolixus oogenesis
The detection of viral sequences in previtellogenic stages of oogenesis as well as in mature unfertilized eggs suggested that the RpVs might be vertically transmitted from the mothers to the offspring. We therefore sought to investigate this process quantitatively by comparing the steady-state RNA levels of each virus in the PVS and Egg (Fig 2). The endogenous Rp-rp49 gene, that is abundantly expressed in R. prolixus oogenesis, is displayed for comparison (Fig  2A and 2B) [39]. The comparison of the normalized reads over the two stages of oogenesis shows that the RpVs were generally present both in PVS as well as in the chorionated eggs (Fig  2A and 2B). Despite the variability between the biological replicates and the limited number of biological replicates, we found that RpV1 was the most abundant virus in R. prolixus ovaries at the time that the RNA-Seq libraries were prepared (2015). However, the accumulation of each RpV in a specific stage of oogenesis may vary. For instance, RpV3 RNA levels were~100 fold higher in Egg compared to the PVS (Fig 2A and 2B). Conversely, RpV4, RpV5, RpV6 and RpV7 levels were higher in PVS versus Egg, with RpV6 being almost undetectable in chorionated eggs (Fig 2A and 2B). The differential accumulation of the RpVs during R. prolixus oogenesis was also apparent when we looked at the read coverage along each viral genome ( Fig  2C-2E and S1A-S1D Fig). This analysis also revealed that the RNA-Seq coverage along the RpV1, RpV3 and RpV4 genomes is apparently uneven. Higher coverage is observed at the 3' end of the RpV1, while this region shows lower coverage in RpV3 and RpV4 (Fig 2D-2E). Although the bias might be due to the protocols employed in the current study, we cannot rule out that the uneven profiles might be due to the accumulation of defective viral particles or subgenomic RNAs corresponding to portions of the viral genomes. These data demonstrate that the RpVs can infect both the early and late stages during R. prolixus oogenesis.

RpVs are vertically transmitted to progeny via transovarial transmission
All the RpVs are detected in mature eggs dissected from the abdomens of the R. prolixus females. This observation prompted us to investigate whether the viruses can actually be passed onto the developing embryos and ultimately to the hatching nymphs. To answer this

PLOS PATHOGENS
Transovarial transmission of RNA viruses in R. prolixus question, we performed qRT-PCR assays with oligonucleotides specific for each RpV using total RNA samples extracted from PVS, mature eggs, embryos and 1st instar nymphs as templates ( Fig 2F). For each stage, we calculated the number of viral genome copies per microgram of total RNA. Compared to 2015, when RpV1 was the most abundant virus in PVS and Egg as per RNAseq, it appears that also RpV4, RpV6 and RpV7 are currently present at levels comparable to RpV1 in these stages of oogenesis ( Fig 2F). All the RpVs, except RpV2, are detected also in the developing embryos and 1st instar nymphs ( Fig 2F). RpV3 and RpV5 are the least abundant viruses in all evaluated developmental stages ranging from 2.5×10 2 to 2.6×10 5 mean number of copies/ug of RNA in embryos and PVS, respectively. Interestingly, the levels of all the RpVs are several orders of magnitude lower in the embryos compared to non-oviposited eggs and 1st instar nymphs ( Fig 2F). For instance, RpV1, RpV4 and RpV6 display mean levels ranging between 10 6 and 10 7 copies/ug of RNA in PVS and Egg, but in embryos their levels drop down to approximately 10 5 copies/ug of RNA for RpV1 and below 10 4 copies/ug of RNA for RpV4 and RpV6. In the 1st instar nymphs, the mean levels increase again above 10 5 for these three viruses. RpV2 was below detectable levels in all the assays we performed and thus it was likely lost from our insectarium some time after the transcriptome datasets were produced.
Next, we asked whether the RpVs are able to establish systemic infections and spread to other tissues and organs in R. prolixus. We interrogated the RNA-Seq datasets published by Ribeiro and collaborators in 2012, who had already noticed viral sequences in their tissue-specific transcriptomes [46]. Despite the small size of each library ranging from 151,210 to 465,610 reads and the lack of biological replicates, we could detect sequences of RpV1 in the whole body, anterior and posterior midgut, fat bodies, rectum, testes, ovaries, and embryos, but not in the malpighian tubules ( Fig 2G). Although reads corresponding to RpV2, RpV4 and RpV6 sequences were found, the counts were too low to draw robust conclusions.
Collectively, these results show that RpVs can establish persistent and systemic (at least for RpV1) infections in R. prolixus via vertical transovarial transmission from the mother to the offspring. viral siRNAs for all the RpVs, except RpV2, are detected in early oogenesis and in mature eggs RNA interference was shown to provide a primary antiviral system in fruit flies and mosquitos [29]. Thus, we asked whether a RNAi system might also be active in R. prolixus against the RpVs. First, we used our transcriptomic datasets to investigate the expression levels of the main components of the miRNA, siRNA and piRNA pathway described in Drosophila melanogaster during Rhodnius prolixus oogenesis [28]. Our data show that the mRNAs encoding critical components of the three RNAi branches are maternally deposited in the mature eggs (S2 Fig), thus suggesting that RNAi pathways are active in this insect species. Second, we generated and analysed small RNA datasets from PVS and Egg stages separately. When we mapped the

PLOS PATHOGENS
Transovarial transmission of RNA viruses in R. prolixus reads against the RpVs, we could identify a total of 113,759 and 139,556 "viral" reads in PVS and Egg, respectively. The length distribution of the small RNAs revealed a clear 22-nt peak for all the viruses except for RpV2 in both stages of oogenesis (Fig 3A and 3B). As expected, we did not find small RNAs for TrV that was never detected in our transcriptomes. We then analysed the strand bias of the small RNAs with respect to the viral genomes (Figs 3C and S3). The small RNA profiles show that both sense and antisense small RNAs for all the RpVs, except for RpV2, are expressed in PVS as well as in chorionated eggs. The distribution of the vsiRNAs seems rather uniform along the viral genomes and approximately equal amounts of sense and antisense vsiRNAs are detected for all the RpVs suggesting that they originate from the processing of viral dsRNA replication intermediates (Figs 3C-3H, S3 and S4). Interestingly, some hotspots in certain RpVs appear consistent among replicates and stages, but additional assays will be necessary to determine their nature.
It has been recently proposed that the piRNA pathway might contribute to antiviral defenses in mosquitos [31]. Seminal studies in the fruit fly D. melanogaster revealed that the piRNAs range from 23 to 30nt in length and their biogenesis requires the activity of the Piwiclade Argonaute proteins Aub and Ago3 [47,48]. These enzymes coordinate a positive feedback amplification loop termed "ping-pong" mechanism that generates sense and antisense secondary piRNAs with a typical 10nt overlap at their 5' ends. Furthermore, sense and antisense piRNAs are characterized by a Uridine bias at position +1 and an Adenine bias at position +10 respectively. However, when we analysed these features in our RpVs visRNA pools, we did not find strong evidence for a ping-pong amplification mechanism. Also, we did not detect viral small RNAs with a 23-30nt size range (Figs 3I, 3J and S5 Fig). One exception might be represented by RpV6, whose small RNAs display a 10nt overlap enrichment in PVS (Fig 3J). Yet, the RpV6 small RNAs also display enrichments of the 9nt and 11nt overlaps in this stage of oogenesis that are not generally observed for piRNAs, and the 10nt overlap is not detected in Egg stages. Finally, we do not find evidence for the +1 U-bias and +10 A-bias when we analyse the nucleotide frequencies along the viral small RNAs (S6 and S7 Figs).
Although our data strongly point to the absence of viral piRNAs in R. prolixus ovaries, we asked whether the viral small RNAs might help identify EVEs in this species. Roughly 2-3% of the PVS and Egg vsiRNAs map both to the viral as well as to the insect genome. However, 50% of these multi mappers match sequences in protein-coding genes either in intronic (40.22%) or exonic (9.76%) sequences (S2 Table). The remaining vsiRNAs map to intergenic regions in the R. prolixus' genome, but neither they appear to cluster together nor the cognate genomic sequences are related to the viral genomes. Furthermore, qRT-PCR and PCR results show that RpVs are not present in R. prolixus' genome and do not appear to be genomically encoded as EVEs (S8 Fig). These results demonstrate that the piRNA pathway may not exert a critical role in protecting R. prolixus ovaries against RpVs infections.
Intriguingly, we noticed that subsets of vsiRNAs for the RpV3-7, but not for RpV1 and RpV2, appeared to map to more than one viral genome (Figs 4 and S9). Upon closer inspection, we found that RpV3, RpV4 and RpV7 share pools of vsiRNAs in PVS, whereby 217 are common to RpV3 and RpV4, 1553 to RpV3 and RpV7 and 404 to RpV4 and RpV7 (Fig 4A). Remarkably, 717 vsiRNAs are shared among the three viruses (Fig 4A). These findings are supported by the analysis of the Egg datasets, where comparable numbers are observed (Fig 4B). Similarly, RpV5 and RpV6 display shared vsiRNAs in PVS (279) and in Egg (762) (Fig 4B). We then asked what regions of the viral genomes produce the shared set of vsiRNAs. To answer this question, we profiled the distribution of the shared vsiRNAs along the viral genomes ( Fig  4C and 4D). Surprisingly, both for the permutotetraviruses (Fig 4C) and sobemoviruses ( Fig  4D), the shared vsiRNAs mostly map to the 3' end of the genome rather than to sequences encoding the conserved RdRp or structural proteins. We also checked whether the shared

PLOS PATHOGENS
Transovarial transmission of RNA viruses in R. prolixus vsiRNAs could be related to low-complexity regions, but this was not the case for any of the tested sequences (S10 Fig). Our data reveal that vsiRNAs are expressed in Rhodnius' ovaries in response to viral infections, while viral piRNAs do not appear to provide a primary antiviral system in this organ.

Discussion
R. prolixus is a hematophagous triatomine insect responsible for the transmission of Chagas disease in Central and Latin American countries. To our knowledge, the composition of the R. prolixus' virome is completely unknown and only one virus (i.e. Triatoma virus) is known to infect triatomines, which comprise hundreds of species that are potential vectors of Chagas disease. Here we substantially expand the virome in these insect species using R. prolixus as a model system. We show that R. prolixus harbors a complex core virome comprising a range of (+) ssRNA viruses belonging to the Iflaviridae, Permutotetraviridae and Solemoviridae families. We demonstrate that RpVs established persistent infections in our colony that were maintained at least in part via transovarial transmission from the mother to the offspring. Accordingly, viral genomes can be detected not only in early stages of oogenesis and in mature nonoviposited eggs, but also in developing embryos and in 1st instar nymphs. It is possible that the transport of viral particles through the trophic cords might contribute to the accumulation of the viruses in the developing egg chambers and mature chorionated eggs. Vertical transmission is being extensively studied for medically relevant viruses, like Dengue and Zika virus and has been proposed as a possible explanation for recurrent seasonal epidemics [49][50][51]. The virus stored in the mosquito eggs, that are resistant to desiccation, might be protected and overcome the unfavorable seasons, when the population of the mosquito vector is reduced. Our results demonstrate that vertical transmission of the RpVs occurs in R. prolixus and accounts at least in part for the persistence of the viruses in our insectarium, although additional mechanisms including horizontal transfer via cannibalism and coprophagy might be in place. It will certainly be of great interest to investigate the virome composition and transmission routes in wild caught R. prolixus.
Our study also reveals for the first time that RNA interference is active during R. prolixus oogenesis and likely protects the developing germline and embryos from viral infections. Indeed, we show that 22nt vsiRNAs for all the viruses, except for RpV2, are detected both in previtellogenic stages of oogenesis as well as in mature eggs. The length of the vsiRNAs in the Hemipteran R. prolixus (22nt) is similar to that observed in some Hymenopteran and Orthoptera insect species, while it differs from Diptera like Drosophila melanogaster (21nt) and Lepidoptera (20nt) [52]. It seems reasonable to conclude that the RpVs infect the tropharium and early egg chambers, where their dsRNA replication intermediates are converted into vsiRNAs by the R. prolixus Dcr2 and associated factors. Both the vsiRNAs and the transcripts encoding the RNAi factors are then loaded into the growing oocyte most likely via the trophic cords, and stored in the mature chorionated eggs. Interestingly, the number of viral genomes for all the RpVs drops by several orders of magnitude in embryos compared to the eggs prior In both stages of R. prolixus oogenesis, a peak at 22nt is readily detectable for all the RpVs, except for RpV2 and TrV, which did not generate vsiRNA reads above background levels. RpV2 was lost from our insectarium by the time that the small RNA datasets were produced. TrV was never detected in our colony and was used as control for the background small RNA levels. (C-H) vsiRNA profiling along the RpV1, RpV4 and RpV6 genomes and length distribution of the sense and antisense vsiRNAs. Sense and antisense vsiRNAs detected in previtellogenic stages of oogenesis (dark blue = sense, red = antisense) and mature chorionated eggs (light blue = sense, orange = antisense) are displayed for each virus. (I-J) Ping-pong signal for the RpV1 and RpV6 viruses in PVS (white bars) and Egg (grey bars). The red box highlights the position of the typical 10nt overlap between sense and antisense small RNA pairs. https://doi.org/10.1371/journal.ppat.1009780.g003

PLOS PATHOGENS
Transovarial transmission of RNA viruses in R. prolixus

PLOS PATHOGENS
Transovarial transmission of RNA viruses in R. prolixus oviposition, but it increases again in the 1st instar nymphs. It is tempting to speculate therefore that maternally provided RNAi machinery might be responsible for reducing the viral levels during embryogenesis in order to safeguard the development of the embryos and nymphs. Conversely, our data demonstrate that the piRNA pathway does not coordinate a prominent antiviral branch during R. prolixus oogenesis, even though Rp-piwi genes are expressed in the ovaries of this insect species and required for female adult fertility [39]. The observation that the RpVs are not completely eliminated from the ovaries despite an active RNAi machinery is not surprising since it has been shown that the RNAi defenses do not always clear the viruses from the infected cells, rather they seem to reduce the viral burden and avoid the death of the host. This mechanism is thought to sustain persistent viral infections both in plants and animals [53,54] and might account for the fact that most of the RpVs have been maintained in our colony for five years at least. The only exception is represented by the RpV2 virus that was lost from our colony sometime between 2015 and 2017. The closest relative of RpV2 in the phylogenetic tree is the SBPV that is known to cause foreleg paralysis and death in honeybees. It is possible that the replication of RpV2 might have reached critical levels for the survival of the infected insects, whose death eventually interrupted the transmission of the RpV2 in the insectarium.
Interestingly, the permutotetraviruses (RpV3, RpV4 and RpV7) and the sobemoviruses (RpV5 and RpV6), display common pools of vsiRNAs, that mostly correspond to sequences in their 3' end. Hundreds of vsiRNAs can also target any combination of two permutotetraviruses. These findings seem to point to a cross-immunity mechanism whereby ovarian tissues can use a shared set of vsiRNAs to prevent or mitigate the replication of distinct, but closelyrelated viruses. Such a mechanism would also provide an adaptive arm in that the visRNAs for an existing viral infection can be potentially employed to fight a new infection from a phylogenetically related virus. Furthermore, the shared vsiRNA complement might allow the rapid establishment of permanent infections from newly infecting viruses of the same family as well as support the evolution of novel viral variants, while still protecting the host cells from possible damage and death. This hypothesis warrants further investigation in the light of the rapidly increasing viral diversity that is being discovered in arthropods [25].
It was shown that both enveloped and non-enveloped viral particles can be found in Trypanosoma cruzi, the etiologic agent of Chagas disease [55]. Our results suggest that the trypanosome might exchange viruses with the host R. prolixus. Given the medical relevance of these organisms, it will be of great interest to thoroughly investigate the interaction between the RpVs, the triatomine insect and the trypanosome as well as the related antiviral defense mechanisms. The employment of metatranscriptomic studies in field-captured triatomines coupled with a thorough analysis of the viral entomopathogenicity will help understand the virome diversity in the triatomine vectors and the possible use of certain viruses in insect population control strategies.

Ethic statement
Animal handling and experimental protocols were conducted in accordance with the guidelines of the Committee for Evaluation of Animal Use for Research (Universidade Federal do Rio de Janeiro, CAUAP-UFRJ) and the NIH Guide for the Care and Use of Laboratory Animals (ISBN 0-309-05377-3). Protocols were approved by CAUAP-UFRJ under registry #IBQM155/13. Dedicated technicians in the animal facility localized at the Instituto de Bioquímica Médica Leopoldo de Meis (UFRJ) carried out all protocols related to rabbit husbandry under strict guidelines, with supervision of veterinarians to ensure appropriate animal handling.

PLOS PATHOGENS
Transovarial transmission of RNA viruses in R. prolixus

Rhodnius handling, total RNA extraction and RNA-Seq library preparation
The Rhodnius prolixus colony was maintained at 28˚C and 75% relative humidity and regularly fed on New Zealand white rabbit blood. The colony originated in 1980 from animals hosted in the insectary of the Universidade Federal Fluminense (Rio de Janeiro), which, in turn, were obtained from the Instituto Oswaldo Cruz (Rio de Janeiro). Currently, the colony hosts thousands of animals with partially overlapping generations. Previtellogenic stages and mature unfertilized eggs (~30 eggs per biological replicate) were dissected in ice cold 1X PBS from adult females two weeks after the blood meal. Embryos were collected over a period of 5 days after oviposition and roughly 30 embryos were used for each biological replicate. Approximately, 10 1st-instar nymphs per biological replicates were employed. Total RNA was extracted with Trizol Reagents (Life Technologies) as per manufacturer instructions and treated with turbo DNA-free kit (Ambion). Total RNA extracted from unfertilized eggs dissected from females' abdomens were subjected to paired-end RNA sequencing (RNA-Seq) library production (Illumina) as previously described [39]. RNAseq libraries were generated and sequenced on HiSeq Illumina platforms at Lactad Facility (University of Campinas/Brazil). For small RNA library preparations, 20ug of total RNA was separated on 15% TBE-Urea gels and small RNAs ranging in size 18-30nt were selected as previously described [56]. Library preparation and sequencing were performed at the Max Planck Institute Sequencing Facility with Illumina protocols and NextSeq sequencing platforms.

PCR and qRT-PCR
Quantitative RT-PCR (qRT-PCR) assays to validate the RNA-Seq datasets and investigate the stage-specific abundance of the viruses were performed as previously described [39,57]. Briefly, 1μg of total RNA for each developmental stage was subjected to genomic DNA removal with the turbo DNA-free kit (Ambion) and retrotranscription with Multiscribe Reverse Transcriptase (ThermoFisher Scientific). Nine biological replicates were produced for each assay. The resulting cDNAs were used for qPCR assays with the set of oligonucleotides listed below. Approximately 50ng of cDNA for each sample was mixed with specific oligonucleotides and SYBR green reagent (Life Technologies). qRT-PCR was carried on QuantStudio 7 Flex (Ther-moFisher). Analysis of the qRT-PCR results was performed using absolute quantification with standard curves. The standard curve was generated using a 747nt long cDNA corresponding to a portion of the RpV1 virus. The cDNA fragment was amplified by PCR in ovarian cDNAs with oligonucleotides bearing a T7 promoter sequence at the 5' end as listed in Table 1. The PCR product was subjected to in vitro transcription with the T7 Megascript kit (Ambion) followed by DNA removal (Turbo DNA free kit, Ambion). 1μg of the resulting DNA-free RNA was converted into cDNA and serial dilutions were used in qRT-PCR assays to produce the standard curve represented by the linear regression of CT values for the Rpv1 fragment transcribed in vitro versus the logarithm of RNA molecules per μg of total RNA. To this aim, the number of RNA molecules of the RpV1 fragment was calculated using the formula , where "G" is the number of RNA molecules, "m" is the amount of single strand RNA in nanograms, "N A " is the Avogadro's constant (6.022×10 23 ), "L" is the length of ssRNA in nucleotides (747) and "Da" is the average weight of a single strand DNA molecule in Daltons (330). The resulting values were used to calculate the logarithms of the number of RNA molecules per microgram. Based on this standard curve, the Number (N) of viral genome copies per microgram of total RNA for each virus (RpVs) was calculated as N ¼ 10 ðCTÀ bÞ�m À 1 , where CT is the cycle threshold for a given virus, b is the intercept-y of the standard curve and m is the slope of the standard curve.

PLOS PATHOGENS
Transovarial transmission of RNA viruses in R. prolixus

Identification of viral genomes by de novo assembly
First we filtered out the host sequences aligning the PVS and Egg RNA-Seq data [39] to the R. prolixus reference RproC3 [58] genome using STAR v2.7.2b [59] (-outFilterMultimapNmax 30;-outFilterMismatchNoverLmax 0.1). Unmapped reads were used for de novo transcriptome assembly with the Trinity method v2.5.1 [60] with default parameters to identify novel transcripts that were not previously detected due to incomplete sequencing and assembly of the Rhodnius genome for each development stage PVS and Egg. De novo assembled transcripts

PLOS PATHOGENS
Transovarial transmission of RNA viruses in R. prolixus were post-processed to filter out potential artefacts based on abundance and quality. First, expression levels of the de novo assembled transcripts were quantified by Salmon v1.2.1 [61] and isoforms with TPM < 10 were removed. Next, deduplication of redundant contig sequences was performed by CD-HIT-EST v4.6 [62,63] at a nucleotide identity of 95%. Trans-Rate software [64] used the unmapped reads and the remaining contigs from the previous filtering steps as input to evaluate common assembly errors (chimeras, structural errors, incomplete assembly and base errors) and to produce a diagnostic quality score for each contig, thereby removing possible artefacts. Assembled transcripts remaining from this last step were considered to be good quality transcripts. To identify potential protein-coding transcripts, we compared the assembled contigs with Swissprot and Uniprot (release 2020_01) [65] protein databases using BLASTX v2.9 (e-value < 1e-20). We selected the longest sequences with similarity to virus proteins or without hit and blasted against the non-redundant protein database (https://blast.ncbi.nlm.nih.gov).
After the identification of the seven candidate virus sequences, we used the tool rascaf (RnA-seq SCAFfolder) to extend the draft assembly using the RNA-Seq data information [66] (minimum support for connecting two contigs equals to 10 mapped reads). In addition, we performed an all vs all BLASTN alignment v2.10.1 [67] between the PVS and Egg final contigs to identify overlappings that could aid us to extend the viral sequences (default parameters). Alternatively, we performed an independent assembly of the PVS and Egg library reads using the assembler rnaSpades [68] with default parameters. The latter was decisive for the length improvement of RpV2 and RpV7 genomes. All genomes are available in GenBank [69] under the following accession numbers: MZ328304 (RpV1), MZ328305 (RpV2), MZ328306 (RpV3), MZ328307 (RpV4), MZ328308 (RpV5), MZ328309 (RpV6), MZ328310 (RpV7).

Virus phylogeny and genome organization
Using the Conserved Domain Database [70], R. prolixus viruses were searched in order to identify conserved RNA-dependent RNA polymerase (RdRp) domains in each ORF (Open Reading Frame). The domain regions were then extracted using an in-house script. When RdRp sequences were not readily available in NCBI's Protein database [71], the same methodology was applied to the other viruses: Triatoma [72]. The phylogenetic tree was constructed with MEGA X [73] using the Neighbor-Joining method [74], bootstrap test with 1000 replicates [75] and Jones-Taylor-Thornton (JTT) matrix [75,76] to compute evolutionary distances.
Gene predictions were performed with the Viral Genome Annotation System (http://cefg. uestc.cn/vgas/) [77] (gene length � 60 nucleotides; ATG, GTG, TTG as start codon). ORF candidates were predicted using NCBI ORFinder [78] (ORF length � 75; standard genetic code; ATG and alternative initiation codons as start codons). Protein domains and families were predicted by Conserved Domain Database v3.18 [70] (e-value � 0.05). To compare the RpVs genome organization with known viruses, sequences of the latter were subjected to ORF and protein family prediction methods with the same parameters.

PLOS PATHOGENS
Transovarial transmission of RNA viruses in R. prolixus

RNA-Seq bioinformatic analyses
To obtain the RNA-Seq profiles of the virus genomes, we mapped the reads of the PVS and Egg stages [39] against the genome index of the seven RpV genomes using STAR [59] version 2.7.2b (-outFilterMultimapNmax 30;-outFilterMismatchNoverLmax 0.1). Script bamCoverage v3.1.3 [79] produced BigWig read coverage files from the mappings normalized by CPM and bin size equal to 5. Profile plots were visualized using a local instance of the Genome Browser [80].
Expression levels of RNAi components were obtained from the quantification of the RNA--Seq data from pre-vitellogenic stage and unfertilized mature egg [39]. Salmon 1.2.1 [61] was set to produce aggregated gene-level abundance estimates (-g). The transcriptome index was built using k-mer size equal to 19 (-k 19) and the DNA sequences of the improved transcriptome produced by Coelho et al, unpublished.

SmallRNA-Seq bioinformatic analysis
Raw smallRNA-Seq reads had adapter sequences removed and low quality ends trimmed using TrimGalore! [81] with default parameters and filtering out reads with less than 18 nucleotides in length (-length 18), while qualities of sequences were evaluated using FastQC [82]. Genome index for R. prolixus, TrV and RpV genomes was made with Bowtie [83] using bowtie-build with default parameters. SmallRNA-Seq reads were then mapped using Bowtie's "-v-beststrata" mode, allowing up to 3 mismatches. Genome coverage for each virus was performed using Bedtools [84] with genomecov (-d -ibam -strand) and normalized by Reads Per Million method, where the number of reads mapped in each position was multiplied by one million and divided by the number of total reads mapped to each virus. The same normalization method was used for the length distribution of mapped reads in Fig 3A, 3B, 3D, 3F and 3H.
Multi mapped reads that mapped to more than one virus were filtered out and counted. The cons software, part of the EMBOSS package [85], was used to generate consensus sequences with only shared regions between different combinations of RpV genomes (RpV3/ 4/7, RpV3/4, RpV3/7, RpV4/7 and RpV5/6) in order to identify sites possibly responsible for the production of these small RNAs. SmallRNA-Seq reads were then mapped to these consensus genomes and genome coverage was produced and normalized.
Reads that mapped to R. prolixus' genome were filtered and mapped to RpV genomes. Reads that mapped to both were selected and overlapped with R. prolixus' genomic features present in VectorBase [86] using the intersect software, part of the Bedtools package [84].
To check if reads mapped to R. prolixus viruses displayed any signal of a ping-pong amplification mechanism, reads mapped to the positive strand in each virus were used to search for read pairs mapped in the opposite strand, which displayed a 5' to 5' overlap. Overlap between pairs that mapped in more than one place were counted as (1/N1 + 1/N2) / 2, where N1 is the number of different mappings for the read mapped on the positive strand and N2 is the number of different mappings for the read mapped in the negative strand. Using primers specific for each virus, we compared RT-PCR assays on ovarian cDNA (cD) for each virus with a PCR on genomic DNA (GD) extracted from Rhodnius prolixus. An amplification product of the expected length is observed for all the viruses as well as the EF-1 control gene in the RT-PCR lanes, while only EF-1 is detectable by PCR on genomic DNA. Notice that the size of the EF-1 amplification products differs between the RT-PCR and the genomic PCR due to the presence of small intron in the amplified region. The genomic DNA was extracted from 1st instar nymphs as described in Gloor et. al., 1993 [87] and oligonucleotide sequences for EF-1 were obtained from Majerowicz et. al., 2011 [88]. The oligonucleotides specific for each RpVs are those used for the qRT-PCR (Table 1)

PLOS PATHOGENS
Transovarial transmission of RNA viruses in R. prolixus sequences were considered low-complexity since they did not exceed the threshold of 80% of As and Ts. (TIFF) S1

PLOS PATHOGENS
Transovarial transmission of RNA viruses in R. prolixus