Unraveling the genetic diversity and phylogeny of Leishmania RNA virus 1 strains of infected Leishmania isolates circulating in French Guiana

Introduction Leishmania RNA virus type 1 (LRV1) is an endosymbiont of some Leishmania (Vianna) species in South America. Presence of LRV1 in parasites exacerbates disease severity in animal models and humans, related to a disproportioned innate immune response, and is correlated with drug treatment failures in humans. Although the virus was identified decades ago, its genomic diversity has been overlooked until now. Methodology/Principles findings We subjected LRV1 strains from 19 L. (V.) guyanensis and one L. (V.) braziliensis isolates obtained from cutaneous leishmaniasis samples identified throughout French Guiana with next-generation sequencing and de novo sequence assembly. We generated and analyzed 24 unique LRV1 sequences over their full-length coding regions. Multiple alignment of these new sequences revealed variability (0.5%–23.5%) across the entire sequence except for highly conserved motifs within the 5’ untranslated region. Phylogenetic analyses showed that viral genomes of L. (V.) guyanensis grouped into five distinct clusters. They further showed a species-dependent clustering between viral genomes of L. (V.) guyanensis and L. (V.) braziliensis, confirming a long-term co-evolutionary history. Noteworthy, we identified cases of multiple LRV1 infections in three of the 20 Leishmania isolates. Conclusions/Significance Here, we present the first-ever estimate of LRV1 genomic diversity that exists in Leishmania (V.) guyanensis parasites. Genetic characterization and phylogenetic analyses of these viruses has shed light on their evolutionary relationships. To our knowledge, this study is also the first to report cases of multiple LRV1 infections in some parasites. Finally, this work has made it possible to develop molecular tools for adequate identification and genotyping of LRV1 strains for diagnostic purposes. Given the suspected worsening role of LRV1 infection in the pathogenesis of human leishmaniasis, these data have a major impact from a clinical viewpoint and for the management of Leishmania-infected patients.

Introduction Protozoan parasites of the genus Leishmania are unicellular eukaryotes with a complex digenetic cycle. In the vertebrate host, they are obligatory intracellular parasites. They cause a broad spectrum of diseases, collectively known as leishmaniases, that occur predominantly in tropical and subtropical regions [1]. Given their frequency and the severity of certain clinical forms, these diseases represent a major public health problem in endemic countries. Depending on their behavior in the vector's gut, parasites of the Leishmania genus are divided into two subgenera: Leishmania and Viannia. The Leishmania subgenus Viannia, endemic in Latin America, is the etiological agent of cutaneous (CL) and mucocutaneous leishmaniasis (MCL) [2]. Leishmania RNA viruses are small, nonenveloped double-stranded RNA viruses [3]. LRV (genus Leishmaniavirus) are members of the Totiviridae family, along with several other groups of protozoal or fungal viruses, including Giardia lamblia virus (GLV), Trichomonas vaginalis viruses and Eimeria spp. viruses [3][4][5][6][7][8][9]. They have a nonsegmented genome approximately 5 kb (4.9-5.3 kb) in length with two long, partially overlapping, open reading frames on the positive strand encoding the capsid/coat protein (CP; orf2) and the RNA-dependent RNA polymerase (RdRp; orf3) and a small 5'-proximal potential orf1 [10]. LRV orf1 presents similarities with orf1 of ScV-L-A virus, a totivirus of the yeast Saccharomyces cerevisiae that encodes a toxin, but its function remains unsolved. LRV genomic RNA is flanked at its extremities by 5' and 3' untranslated regions (UTR). It has been shown that the 5'UTR promotes internal initiation of translation with the presence of an internal ribosomal entry site (IRES) [11]. The 5'UTR also possesses five predicted stem-loop structures (I to V) and a consensus cleavage sequence [12][13][14]. Stem-loop IV and the consensus cleavage site represent the minimal essential components for the viral capsid-dependent RNA cleavage that might play a regulatory role for maintaining persistent infection of host cells [13,15].
The presence of virus-like particles in Leishmania parasites was first reported in the early 1970s [16]. Since then, two types of LRV have been identified: LRV1 carried by some New World Leishmania species and LRV2 by Old World species [3,8,17]. To date, LRV1 was detected in Leishmania parasites from Colombia, Brazil, Peru, Bolivia, Suriname and French Guiana, all originating in the Amazon basin [18][19][20][21][22]. LRV2 was first isolated from L. hertigi, a non-human parasite species [16]. It was then found in a single isolate of L. major and, more recently, in strains of L. aethiopica isolated from biopsies of cutaneous leishmaniasis patients in the Ethiopian highlands as well as in two different Leishmania isolates from Iran belonging to the L. infantum and L. major species [17,23,24]. At the time of their discovery, arbitrary identifiers were given to type 1 Leishmania RNA viruses showing a certain degree of sequence conservation on the basis of hybridization analysis, namely LRV1-1 to LRV1-12 [20]. Two additional types, LRV1-13 and LRV1-14, were then described based on sequence comparison of fragments of two genomic regions, i.e., the 5'UTR and orf3 regions [25]. Several subtypes were isolated from L. (V.) guyanensis, LRV1- (1, 4, 5, 6, 7, 8 and 9), others from L. (V.) braziliensis, LRV1-(2, 3, 13 and 14) and untyped Leishmania spp., LRV1- (10,11,12). In addition to L.  [26]. So far, no LRV1 has been detected in other L. (V.) species and none in L. (L.) species [18,21]. These results, proving the wide distribution of LRV1 in the different New World parasite species, combined with the lack of an infectious phase for these viruses, suggested that LRV1 arose prior to the divergence of New World parasites and further supported the theory that LRVs are ancient viruses [25]. Nevertheless, most of these viruses have only been characterized on short stretches of sequences a few hundred base-pairs long and, depending on the studies, different regions of the genome have been amplified. Complete cDNA sequences are only available for two New World virus isolates (LRV1-1 and LRV1-4), one LRV2-1 isolated from L. major and three more LRV2-1 strains from L. aethiopica [10,12,23,27]. The LRV1-1 and LRV1-4 entire genomic sequences share 77% overall nucleotide identity.
Little is known on the biology of LRV, which persistently infects Leishmania parasites and is unable to produce extracellular infectious particles. Nevertheless, LRV1 infection of parasites seems to affect their virulence. Indeed, Ives et al. showed that metastasizing parasites have a high LRV1 burden that is recognized by host Toll-like receptor 3 (TLR3) to induce proinflammatory cytokines and chemokines in murine models [28]. They also showed that LRV1 in the metastasizing parasites subverted the host immune response to Leishmania, promoting parasite persistence and affecting treatment efficiency. We and others recently showed that LRV1 status in L. (V.) guyanensis-and L. (V.) braziliensis-infected patients was significantly predictive of first-line treatment failure and symptomatic relapse, and might stand to guide therapeutic choices in acute cutaneous leishmaniasis (ACL) [29,30]. However, other studies reported contradictory results with respect to the association between the severity of leishmaniasis (cutaneous vs. mucocutaneous) and the presence of LRV1 [19,22,26,31,32].
In  [33,34]. Among them, L. (V.) guyanensis and L. (V.) braziliensis are the two predominant species, accounting for more than 85% and about 10% of CL cases, respectively [33]. The three other species are only occasionally diagnosed [33,35]. For L. (V.) guyanensis and L. (V.) braziliensis, patients present a broad spectrum of clinical manifestations (nodular, ulcerative, disseminated and mucous forms). In addition, symptomatic relapses and treatment failures are frequently seen in L. (V.) guyanensis infections during which patients variously respond to first-line anti-leishmanial treatments and are more prone to developing chronic CL. Factors underlying disease evolution are yet to be fully understood. We recently reported that 74% of Leishmania spp. isolates collected between 2011 and 2014 were LRV1-positive [21]. LRV1 was detected in 80% (90/112) of L. (V.) guyanensis isolates, 55% (6/11) of L. (V.) braziliensis and in none of the (0/6) L. (L.) amazonensis isolates. Considering the paucity of sequence data available for LRV1, because of the many questions that arise about its origin and distribution, we were interested in gaining insight into the genetic diversity, at the genomic level, and the evolution of this virus in the circulating isolates of Leishmania spp. in French Guiana. To this end, using Illumina deepsequencing technology and de novo sequence assembly, we generated and analyzed LRV1 sequences derived from Leishmania parasite cultures obtained from skin lesions of 20 patients suffering of ACL, i.e., with lesions present for less than 6 months. Polymorphism and phylogenetic analyses were then performed to measure LRV1 genome-wide diversity. Here we report the full-length coding sequences of 24 LRV1 strains and use them to make the first-ever estimate of LRV1 viral diversity that exist in Leishmania (V.) guyanensis. These results represent the largest number of LRV1 full-length coding sequences published to date and the first time that in some parasite strains multiple viral infections have been identified.

Ethics statement
The Leishmania isolates used in this study were obtained from a previously published study [21]. Briefly, these isolates had been successfully cultured from biopsies collected between 2011 and 2014 from 20 adult patients (16 males and four females; 19-66 years old) diagnosed with ACL and enrolled in the study before treatment with pentamidine isethionate (Pentacarinat; Rhone Poulenc) ( Table 1). All patients, in consultation for a suspicion of leishmaniasis at the dermatology unit of the Cayenne hospital (Centre Hospitalier Andrée Rosemon, CHAR), Cayenne, French Guiana, were informed during consultation by the clinician that case records and biological data might be further used in research and that they had the right to refuse. Biological samples were taken for diagnostic purposes and oral informed consent was documented in the case records by the clinician during consultation. Patient data was anonymized at the Laboratory of Parasitology and Mycology of the Cayenne hospital, which carried out diagnostic analyzes and initial parasite culture. The project did not raise any concerns and was approved by the Ethical Committee at Cayenne Hospital. Ethical approval was granted based on the human experimentation guidelines of the "Comité consultatif sur le traitement de l'information en matière de recherche" (CCTIRS: 2012-42). The monocentric audit of retrospective anonymized case record data was permitted by the "Commission nationale de l'informatique et des libertés" (CNIL: DR.2014-091) as well as the regulations of Cayenne Hospital (http://www.ch-cayenne.net/Droits-et-Devoirs.html).

LRV1 status and parasite species confirmation
All 20 isolates of Leishmania spp. had formerly been tested for their LRV1 status by RT-PCR and Leishmania spp. were identified using PCR-RFLP, as previously described (Table 1) [21,36].

Biological cloning
For parasite cloning, 0.5 mL of culture medium containing 2-4.10 7 parasites was spread onto freshly prepared Schneider plates (Schneider's medium, 10% FCS, Biopterin 0.6 mg/L and folate hemin 5 mg/L, 2% low-melting-point agarose). After 7 days, the colonies were harvested and cultured separately in tubes containing complemented Schneider's medium for an additional 5-7 days. Separate cultures were then processed for RNA isolation and conventional RT-PCR using primer pairs specific to each virus (see below).

RNA extraction and sequence-independent amplification
Total RNA was extracted from dry pellets of parasites (10 8 parasites) using the RNeasy mini kit (QIAGEN) as recommended by the manufacturer, treated with Turbo DNase (Life Technologies) for 1 h at 37˚C and then measured on a Biophotometer (Eppendorf). Then the generation of double-stranded cDNA and SPIA amplification was performed from 50 ng of total RNA using the Ovation RNA-Seq system V2 (NuGEN Technologies, Inc.) as specified by the manufacturer's protocol.

Generation of libraries and high-throughput sequencing
The libraries were generated using the NextFlex PCR-free DNA Seq Kit (Bioo Scientific, Austin, TX, USA) with (15 strains, with a two-letter and two-number code) or without (five strains, with a 20XX code) a 10-cycle PCR enrichment before quantification and validation. They were then sequenced on an Illumina MiSeq instrument in 250-base paired-end reads (Illumina, San Diego, CA, USA). Sequence files were generated using Illumina Analysis Pipeline version 1.8 (CASAVA).

Bioinformatics analysis
Raw sequences were filtered and trimmed using fastq-mcf with the following parameters: duplicates were removed when 50 bases were identical between reads (-D parameter) while minimum read length was set to 50 [37]. Adapter sequences were removed using -t (% of occurrence) set to zero. The other parameters were left to their default settings. Then Deconseq was used to eliminate phiX174 phage sequences [38]. Fastq clean files were assembled using SPAdes (version 3.0.0) with the following range of k-mers: 21, 55, 77, 99 and 127 [39]. We used blastn for sequence comparison between assembled contigs and reference LRV1 genomes (LRV1-1 and LRV1-4, accession numbers M92355 and U01899, respectively) [6]. The LRV1-positive contigs were then extracted and used as references for mapping using the bwa mem algorithm (version 0.7.10-r789) with a kmer set to 55 and a minimum alignment score of 40, using a minimum fastq quality of 30 [40]. Some assembly abnormalities were manually corrected. The contigs and known LRV genomes were then multiple-aligned with MAFFT alignment software (version 7.037b) [41].

Conventional RT-PCR and Sanger sequencing
Conventional methods were used to extend sequences when needed. Three sets of degenerate consensus primers were designed according to the NGS sequences obtained. Primers were degenerated so as to amplify products from all samples regardless of the sequence. Full information on the primers is available in Table 2. These sets of primers were used in RT-PCR reactions carried out using the Transcriptor one-step RT-PCR kit (Roche), as recommended by the manufacturer, using 500 ng of input total RNA. RT-PCR products at the expected size were sent to Beckman Coulter Genomics (http://www.beckmangenomics.com/) for direct sequencing. Furthermore, to specifically amplify the different viruses in case of multiple-infected Leishmania isolates, one set of nondegenerate specific primers per virus was designed and used for RT-PCR ( Table 2).

Phylogenetic analyses
Contigs from HTS as well as raw sequences downloaded from the Beckman Coulter Genomics website were analyzed and edited in MEGA 5.05 [42]. Multiple sequence alignments were constructed using ClustalW with all published complete and partial LRV1 sequences. Alignments were checked manually. Sequences were translated into amino acids and both nucleotide and amino acid sequences were checked for irregularities. For phylogenetic trees inferred from the aligned nucleotide sequences, the MrModeltest2.3 program was used to determine the optimal model of nucleotide evolution [43]. The GTR model, with a gamma distribution shape parameter (G) and invariable sites (I), was identified and used for the Bayesian approach, which was performed with Mr. Bayes 3.2.2 to infer phylogenetic relationships [44]. Markov Chain Monte Carlo (MCMC) simulations were run for 10,000,000 generations, with four simultaneous chains, using a sample frequency of 100 and a burn-in of 25,000. Majority rule consensus trees were obtained from the output. Validation of the inference was assessed based on the standard deviation of split frequencies, which was less than the expected threshold value of 0.01. For phylogenetic trees inferred from the aligned amino acid sequences, the ProtTest3 program was used to determine the optimal model of amino acid evolution. The JTT model, with gamma (G) distribution, was identified and used for the Bayesian approach [45], which was performed with Mr. Bayes 3.2.2 [44]. MCMC simulations were run for 10,000,000 generations, with four simultaneous chains, using a sample frequency of 100 and a burn-in of 25,000. Majority rule consensus trees were obtained from the output. Validation of the inference was assessed based on the standard deviation of split frequencies, which was less than the expected threshold value of 0.01.

Naming assignation
LRV1 strains were named according to a recent proposal to the International Committee on Taxonomy of Viruses, i.e., "LRV1" followed by two letters corresponding to the parasite species to which the strain designation was then assigned [46]. For example: LRV1-Lg-LF94 for LRV1 from L. (V.) guyanensis strain LF94 ( Table 1).

Accession numbers
Sequences reported in this paper were deposited in the GenBank nucleotide database under accession numbers KY750607 to KY750630. The raw sequencing data are available in the Sequence Read Archive under accession number PRJNA371487 (www.ncbi.nlm.nih.gov/ bioproject/371487).

Patients
In the context of epidemiological screening, we tested 129 isolates of Leishmania sp. Collected in French Guiana between 2011 and 2014 for the presence of LRV1 [21]. Here, we analyzed a subset of previously identified LRV1-positive isolates obtained from patients diagnosed with ACL. A total of 20 parasite cultures, obtained from human biopsies isolated from 16 men and four women, aged 19-66 years with a median age of 35 years, were selected for high-throughput sequencing. The data (gender and age) collected for each patient are listed in Table 1 Global analysis of high-throughput sequencing data Using Illumina MiSeq sequencing technology, 24 full-length or almost full-length genomic sequences were de novo assembled into individual contigs with a mean 49.8 depth of coverage. Sequences were 4.8-5.3 kb long. Alignment of the full-length genome sequences obtained with those available in the database made it possible to design three sets of consensus-degenerate primers ( Table 2). These primer pairs were used for conventional RT-PCR and Sanger sequencing to extend the sequences to the 5' and 3' ends that were missing for some strains. High-throughput sequencing combined with conventional RT-PCR applied to the 20 LRV1-positive isolates generated almost complete genomic sequences (from nucleotide 60 to 5248 of LRV1-1, accession number NC_002063) from each viral strain. This covers the full-length coding sequence of the virus. In addition, Sanger sequencing of the RT-PCR products also allowed verifying the correctness of the sequences obtained. A total of 24 different viral sequences were obtained from the 20 LRV1-positive parasite isolates tested. All sequences were 5.2 kb long with an average G+C content ranging from 44.7% to 46.6%. Detailed genome statistics are outlined in Table 1. Unexpectedly, co-infections by two or three LRV1 viruses were identified in three parasite isolates, 2028, WF69 and XJ93. LRV1 full-length coding sequences obtained from the same isolate were named 2028 G1 , 2028 G2 and 2028 G3 , WF69 G1 and WF69 G2 , XJ93 G1 and XJ93 G2 .

Sequence features
The 24 sequences obtained had the same structural organization as partially overlapping open reading frames surrounded by untranslated regions similar to the sequence already described [10,12]. In addition, they showed an identical genome size.

Sequence identities
The almost complete genome sequences obtained for each strain as well as the nucleotide and amino acid sequences of the capsid and RNA-dependent RNA polymerase were compared to one another and to those published in the databases. All sequences were unique. On the basis of the common % 5260 bp sequences obtained for each strain, pairwise comparisons showed that nucleotide sequence identities ranged from 76.5% (  1% of the nucleotide positions were identical. These 100% conserved nucleotide positions were distributed into blocks of different sizes. The longest one, a 50-bp motif, covering nucleotides 313-362, encompassed the previously identified pyrimidine-rich region (333-UUUCUUGUUACUAUU-347), whose first nine nucleotides are complementary to a purine-rich region of the Leishmania 18S rRNA [47], as well as the nucleocapsid endoribonuclease cleavage site 314-GAUCˇCGAA-321 [13,14]. In addition, analysis of the stem-loop IV sequences located upstream, from nucleotide 266 to 281, showed that all but two sequences were 100% conserved and identical to the published one [15]. The only two divergent sequences corresponded to strains 2001 and XJ93 G2 . Strain XJ93 G2 presented two complementary substitutions at positions U269C and A278G of the stem, while strain 2001 presented one substitution at position A278G able to form a wobble base pair with the uracil at position 269 (all positions are given relative to LRV1-1 sequence, accession number NC_002063). This shows that LRV1 5'UTR sequences are strongly conserved in both nucleotide sequences and predicted RNA secondary structures.

Phylogenies
Phylogenetic analyses were done on the almost-complete genome sequences (nucleotides 60-5248 relative to the LRV1-1 sequence) (S1 Fig), as well as on either nucleotide or amino acid sequences of the capsid (Fig 1A) and RNA-dependent RNA polymerase (Fig 1B). They all gave strictly identical tree topologies. As mentioned above, we identified six monophyletic "clades", A-F, all supported by high posterior probabilities. Most of our strains clustered in clades A and B. Indeed, nine of the strains, with LRV1-4 and LRV1-LgM5313, belonged to the monophyletic clade A while clade B was composed of 11 of our strains. All LRV1 strains belonging to clades A and B were from L. (V.) guyanensis parasites. Clades A and B belonged to a monophyletic lineage supported by a posterior probability value of 1. In addition, in clade A, LRV1-LgM5313 and LRV1-4 sequences formed a monophyletic well-supported branch in clade A (Fig 1A). These two sequences correspond to Brazilian L. (V.) guyanensis isolates, whereas the others were from French Guiana. Clade C contained only a new divergent strain, XJ93 G2 , which is well supported phylogenetically in all analyses, while clade D was composed of strains 2001 and LRV1-Lg1398. In addition, a phylogenetic analysis restricted to a common 219-bp fragment of the 5'UTR region (S2 Fig), which allowed including sequences from most of the LRV1-1 to LRV1-13 strains (except LRV1-3, -5, -6 and -12 for which no sequence is available), demonstrated that strain 2001 was close to LRV1-8 and LRV1-9 strains. These four strains were identified from L. (V.) guyanensis isolates from Brazil. Clade E comprised two strains, one of our sequences, LF94, and LRV1-1. Finally, YA70 and LRV1-Lb2169, identified from L. (V.) braziliensis strains, belonged to a separate clade, F, distinct from the other clades composed of LRV1 sequences of L. (V.) guyanensis (Fig 1). Interestingly enough, the two viral sequences identified from WF69 belonged to two distinct groups of sequences within clade A, two of the three strains from 2028 (2028 G2 and 2028 G3 ) were also part of two distinct groups of sequences within clade A, while the third sequence, 2028 G1 , belonged to clade B. Finally, XJ93 G1 belonged to clade B and XJ93 G2 was the only representative of clade C. The last phylogenetic analysis, restricted to partial capsid sequences (299bp/99aa), allowed including sequences recently published by Adaui et al. obtained from L. (V.) braziliensis isolates from Peru and Bolivia [29]. All of the LRV1 sequences from L. (V.) braziliensis isolates formed a distinct monophyletic clade, highly supported by a posterior probability of 0.91

Biological cloning
To determine if the multiple LRV1 strains identified in three Leishmania isolates were real viral co-infections or due to the simultaneous presence of distinct parasite populations in our cultures, we proceeded to the biological cloning of the parasites. Two of the three parasites, XJ93 and WF69, were tested. The initial parasite cultures were divided for either RNA isolation or biological cloning. Seven Leishmania clones and three subclones of one of the clones were obtained from XJ93. Five clones were obtained from WF69. RNA extracted from the initial cultures and from the different clones and subclones was then submitted to RT-PCR using different combinations of primers, each one specific to a virus ( Table 2). RT-PCR proved the presence of co-occurring viruses in the initial cultures as well as in all the different clones and subclones.

Discussion
Herein we present the full-length coding sequences of 24 LRV1 strains obtained from 18 L. (V.) guyanensis and one L. (V.) braziliensis isolates from across French Guiana as well as one L. (V.) guyanensis isolate from the Manaus area, Amazonas state, Brazil. This sequence data set is the largest analysis of LRV1 sequences undertaken so far. Analysis of these sequence data reveals that they have the same genomic organization. With the exception of a few indels detected in five strains, the 5'UTR sequences show, as previously reported, a higher degree of nucleotide sequence conservation than the coding sequences [47]. This high level of sequence conservation, with long-100% conserved-motifs (up to 50 bp in length), emphasizes the biological importance of this region in the viral life cycle and for maintaining persistent infection [13,15,47]. Moreover, only two sequences possessed indels in the coding sequence of the RNA-dependent RNA polymerase. Two of the three indels observed were located at the same position between the two strains. Given that both strains derived from the only two L. (V.) braziliensis isolates for which complete cds of the RdRp are available, these indels could correspond to a molecular signature of LRV1 strains from L. (V.) braziliensis. Further studies are necessary to confirm this point.
From a phylogenetic perspective, it appears that differential clustering exists between viral genomes of L. (V.) guyanensis and L. (V.) braziliensis isolates and among viral genomes of L. (V.) guyanensis isolates. These results support the hypothesis that LRV1 is an ancient virus that has co-evolved with its parasite [25]. In addition, the intra-and inter-clade variability is, overall, below and above 10%, respectively, at both the nucleotide and amino acid levels. Finally, geographical clustering is observed with some clades related to the supposed geographical origin of the parasites. Nevertheless, insufficient geographical sampling, with many areas/countries lacking sufficient data, only based on very short sequences, currently limits phylogeographic interpretations. These results emphasize the crucial need for a better assessment of LRV1 distribution and genetic variation in the different parasite species at a wider geographical scale. Analysis of other L. (V.) guyanensis isolates from different geographical regions as well as of other New World Leishmania parasite species should thus expand our understanding of the diversification Fig 2. Phylogenetic analysis of partial capsid gene sequences (alignment of 299 nucleotides). The tree was inferred using the Bayesian method with the GTR + G + I model. Novel sequences generated in this processes and the evolutionary history of LRV1. These results are also of major importance from a taxonomic viewpoint. They should help define analytical methods and criteria to be used for virus classification and to set up an informative naming system.
A unique viral genome sequence was recovered from 17 of the 20 Leishmania isolates tested. Unexpectedly, the three other isolates studied contained more than one viral sequence, indicating that they were co-or super-infected with two or more viruses. The different strains identified from the same isolate showed from about 8% to almost 17% nucleotide divergence among them on the almost complete genome sequences generated. A closer look at the 5'UTR motifs identified as minimal essential elements for site-specific targeting of the capsid endoribonuclease, i.e. the stem-loop IV structure and single-site specific cleavage site, of the different strains showed their perfect conservation, confirming that they had exact endoribonuclease sites [15]. Furthermore, biological cloning of these parasite isolates followed by RT-PCR using primers specific to each virus confirmed the presence of multiple viruses in each clone. Taken together, these data sustain that the multiple viruses identified in these parasite isolates correspond to true persistent viral co-infections. These results suggest that co-or super-infections with divergent LRV1 strains is common given that three of 20 (15%) isolates were multiply infected. The frequency of co-infections will nevertheless have to be confirmed on a larger series. Furthermore, the relevance of viral co-infection for parasite pathogenesis is unknown. Additional surveys on their biological significance will have to be implemented.
These results provide, moreover, a unique opportunity for implementing reliable diagnostic testing methods. Indeed, knowledge of the genomic sequence of many LRV1 strains was essential for oligonucleotide design ensuring adequate detection of the virus (100% detection rate for LRV1s-LRV2as and 5LRVs1-LRV11as/LRV14as). It is noteworthy that the primers and probes that have previously been described, whether used for quantification or identification of LRV1 in clinical samples, possessed mismatches relative to our sequence data set [22,26,29,32,[48][49][50]. This could have led to underestimation of the viral load or under-detection of divergent unknown strains. Furthermore, phylogenetic analyses conducted on sequences obtained with our primer pairs (S3 Fig and S4 Fig) gave an identical tree topology to those based on complete coding sequences of CP or RdRp. These couples of consensus-degenerate primers are thus robust diagnostic tools intended to be used for future screening of large panels of clinical isolates as well as for molecular phylogenetic studies. Finally, characterization of highly conserved sequence motifs, especially in the 5'UTR region, should help, by defining primers and probes with no mismatch relative to the available sequence data set, implement a pan-LRV1 real-time quantitative reverse-transcription (qRT)-PCR assay for unbiased quantitation of viral RNA. Therefore, given previous results by Ives et al. showing in murine models that a high LRV1 burden, eliciting a high pro-inflammatory profile, was associated with metastasizing parasites, it will be of particular interest to analyze various types of clinical samples to further define the role of LRV1 viral load in parasite virulence, disease progression and response to treatment [28]. These tools should contribute to drawing a more definite causal relationship of LRV1 in disease manifestation. study are shown in bold. Sequence identifiers include the strain ID and the NCBI accession number. The major clades representing the different LRV1 genotypes identified (A-F) as well as the clades of LRV1 isolates from L. (V.) guyanensis (Lg) or L. (V.) braziliensis (Lb) parasites are labeled. Posterior probabilities of the Bayesian analysis (>90%) are shown next to each node. The scale bar indicates nucleotide sequence divergence among sequences. https://doi.org/10.1371/journal.pntd.0005764.g002 Supporting information S1 Fig. Phylogenetic analysis of Leishmania RNA virus 1 isolates based on the almost complete genome sequences. The phylogenetic trees were inferred from the almost complete nucleotide sequences (nucleotides 60-5248 relative to the LRV1-1 sequence) using the Bayesian method with the GTR + G + I model. New LRV1 sequences generated in this study are in boldface. Sequence identifiers include the strain ID and the NCBI accession number. The major clades representing the different LRV1 genotypes identified (A-F) are labeled. Posterior probabilities of the Bayesian analysis (>80%) are shown next to each node. The scale bar indicates nucleotide sequence divergence among sequences. (PDF) The tree was inferred using the Bayesian method with the T92 +G + I model. New LRV1 sequences generated in this study are in boldface. Sequence identifiers include the strain ID and the NCBI accession number. The major clades representing the different LRV1 genotypes identified (A-F) are labeled. Posterior probabilities of the Bayesian analysis (>80%) are shown next to each node. The scale bar indicates nucleotide sequence divergence among sequences. (PDF)