Full-Genome Characterisation of Orungo, Lebombo and Changuinola Viruses Provides Evidence for Co-Evolution of Orbiviruses with Their Arthropod Vectors

The complete genomes of Orungo virus (ORUV), Lebombo virus (LEBV) and Changuinola virus (CGLV) were sequenced, confirming that they each encode 11 distinct proteins (VP1-VP7 and NS1-NS4). Phylogenetic analyses of the cell-attachment protein ‘outer-capsid protein 1’ (OC1) show that orbiviruses fall into three large groups, identified as: VP2(OC1), in which OC1 is the 2nd largest protein, including the Culicoides-transmitted orbiviruses; VP3(OC1), which includes the mosquito-transmitted orbiviruses; and VP4(OC1), which includes the tick-transmitted viruses. Differences in the size of OC1 between these groups place the T2 ‘subcore-shell protein’ as the third largest protein ‘VP3(T2)’ in the first of these groups, but the second largest protein ‘VP2(T2)’ in the other two groups. ORUV, LEBV and CGLV all group with the Culicoides-borne VP2(OC1)/VP3(T2) viruses. The G+C content of the ORUV, LEBV and CGLV genomes is also similar to that of the Culicoides-borne, rather than the mosquito-borne or tick-borne orbiviruses. These data suggest that ORUV and LEBV are Culicoides- rather than mosquito-borne. Multiple isolations of CGLV from sand flies suggest that they are its primary vector. OC1 of the insect-borne orbiviruses is approximately twice the size of the equivalent protein of the tick-borne viruses. Together with internal sequence similarities, this suggests its origin by duplication (concatemerisation) of a smaller OC1 from an ancestral tick-borne orbivirus. Phylogenetic comparisons showing linear relationships between the dates of evolutionary separation of their vector species and the genetic distances between tick-, mosquito- or Culicoides-borne virus groups provide evidence for co-evolution of the orbiviruses with their arthropod vectors.


Introduction
The genus Orbivirus contains 22 virus species that are formally recognised by the International Committee for the Taxonomy of Viruses (ICTV) [1], as well as multiple unclassified viruses some of which may represent additional Orbivirus species. The orbiviruses are vectored by Culicoides midges, ticks, phlebotomine flies (sandflies), and anopheline or culicine mosquitoes [1,2]. Lebombo (LEBV) and Orungo viruses (ORUV) were originally isolated from mosquitoes [3], leading to suggestions that they might be mosquito-transmitted [1,2].
Transmission studies of ORUV by Aedes mosquitoes have been inconclusive, hampered by the lack of a suitable laboratory host [6,7] (http://wwwn.cdc.gov/arbocat). A low level of replication was detected in intra-thoracically inoculated mosquitoes, which could subsequently transmit the virus. However, orally fed mosquitoes failed to replicate or transmit the virus, suggesting an insect infection barrier. ORUV causes lethal encephalitis in suckling mice and hamsters. It also causes CPE and plaques in Vero and BHK-21 cells [8]. Mice, hamsters and chickens were not infected by subcutaneous inoculation, although mice and hamsters did produce a low-grade viraemia following intra-cranial inoculation [9].
Lebombo virus type 1 (LEBV-1, the only serotype of the Lebombo virus species) was isolated in Ibadan, Nigeria, in 1968, from a child with fever [7,10] (http://wwwn.cdc.gov/arbocat). The virus replicates in C6/36 cells without CPE and lyses Vero and LLC-MK2 (Rhesus monkey kidney) cells. It is pathogenic for suckling mice and has also been isolated from rodents and mosquitoes (Mansonia africana: 1 isolate; and Aedes circumluteolus species) in Africa [3,7] (http://wwwn.cdc.gov/arbocat). The species Changuinola virus contains twelve 'named' serotypes that have been isolated from sandflies (phlebotomines) [1,2]. Changuinola virus (CGLV) replicates in mosquito cells (C6/36) without producing CPE and is pathogenic for newborn mice or hamsters following intracerebral inoculation [11]. During a study in central Panama, seven virus strains were isolated, using Vero cells, from whole blood samples of 80 wild-caught sloths (Bradypus variegatus and Choloepus hoffmanni) [12]. Four strains (Pan An 59663, Pan An 53061, Pan An 307566 and Pan An 341275) were found to belong to two different serotypes, and two strains belonging to the same serotype (Pan An 307566 and Pan An 341275) were associated with prolonged or recrudescent viraemias in sloths. Antibodies against CGLV were widespread in both sloth species and especially prevalent in Choloepus, but were virtually absent from all other wild vertebrate species tested [12]. However, CGLV was also isolated in Panama from a human with a brief febrile illness, and antibodies were detected in rodents [11].
The increasing availability of representative sequence data for multiple Orbivirus species provides a valuable resource to study their evolution. Previous comparisons of homologous proteins of the insect and tick-borne orbiviruses, have shown only 23-38% aa identity, revealing high levels of genetic diversity within the genus [13]. We present a comparison of the genome sequences of ORUV, LEBV and CGLV, focussing on the genes coding for the viral polymerase (VP1(Pol)), the cell attachment and outer-capsid protein 1 (OC1), the sub-core shell 'T2' protein and the outer-core 'T13' protein.

Cell Culture and Virus Propagation
Orungo virus (UG MP 359) was isolated in 1959. Lebombo virus (SAAR 3896) was isolated in 1968. Changuinola virus (strain Xaraira, BE AR 490492) was isolated in 1990. All viruses were propagated in BHK-21 cells (clone BSR, a gift from Dr. Noel Tordo, Institut Pasteur, France), at 37°C, in Glasgow Minimum Essential Medium (GMEM) supplemented with 10% foetal bovine serum and 100 IU of penicillin/100 µg of streptomycin per ml. Infected cell cultures were incubated at 37°C for 72 hours, until cell lysis began. The cells were scraped into the supernatant and centrifuged at 3,000 × g for 10 minutes. The cell pellet was used for dsRNA extraction, using RNA NOW reagent (Biogentex, Tx, USA), as described earlier [14,15].

Cloning of dsRNA Segments
LEBV, ORUV and CGLV genome segments were copied into cDNA, cloned and sequenced using a single primer amplification technique as previously reported [14,15].

Sequence Comparisons
VP1(Pol), VP2(OC1), VP3(T2) and VP7(T13) protein sequences of ORUV, LEBV and CGLV were compared with their homologues from 10 different Orbivirus species retrieved from international sequence databases. Sequence accession numbers used in these analyses are provided in table 1.

Methods used for Sequence Analysis and Phylogenetic Comparisons
The genome sequences of ORUV, LEBV and CGLV were compared to available sequences for other selected reoviruses, using the DNATools package (version 5.2.018, S.W. Rasmussen: Valby Data Center, Denmark). Nucleotide (nt) and amino acid (aa) sequence alignments were generated using Clustal X version 1.8 [16]. Phylogenetic analyses were performed using MEGA5 [17]. The neighbour-joining method [18] was used, together with a p-distance model, for initial phylogenetic reconstructions of trees. Maximum likelihood trees (nearest neighbour interchange) were then constructed using the Kimura 2-parameter model for nucleic acid sequences and the Poisson model for amino acid sequences.
Co-Evolution of Orbiviruses and Arthropod Vectors PLOS ONE | www.plosone.org
The best-fit model of nucleotide substitution to be used in Bayesian coalescent analyses was determined using jModelTest (v 0.1.1) [19]. Bayesian coalescent analysis based on Markov Chain Monte Carlo (MCMC) sampling [20] was implemented in BEAST (Bayesian evolutionary analysis by sampling trees) [21]. Unrooted models of phylogeny and strict molecular clock models are two extremes of a continuum [22]. Substitution rates were therefore calculated in BEAST, using a relaxed uncorrelated lognormal clock model. The most general Bayesian skyline coalescent prior was used [23], which allows for both constant and complex changes in population size through time. As a measure of estimate uncertainty, the program returns the 95% highest posterior density (HPD) interval. Molecular evolutionary rates were calculated using BEAST for the three genes whose amino acid sequences are most highly conserved between orbiviruses: those encoding VP1(Pol), T2 and T13. Although the amino acid sequences are well conserved, the corresponding nucleotide sequences are more variable. Therefore, to ensure a reliable alignment of the nucleotide sequences, ORFs encoding VP1(Pol), T2 or T13 were aligned using DAMBE [24] or the web-based programme RevTrans (http://www.cbs.dtu.dk/services/RevTrans/), creating a codon-to-codon alignment based on the profile of the amino acid alignment for the corresponding proteins.
Analyses were carried out using a chain length of 10,000,000 states with the first 10% removed as burn-in. Output log files of 4 independent BEAST runs were combined using LogCombiner (v1.5.4). This increased the effective sample sizes and made it possible to check whether the various runs converged on the same distribution. The program Tracer (v1.5) was used to inspect posterior distributions and estimate evolutionary parameters. The PredictProtein server (http://www.predictprotein.org) was used to predict specific localisations and interactions. Repeated aa sequences were identified using the programme REPRO (http://www.ibi.vu.nl/programs/reprowww/). The presence of nuclear localisation signals was analysed using PredictNLS, implemented in the PredictProtein server, and cNLS Mapper (http://nlsmapper.iab.keio.ac.jp/cgi-bin/NLS_Mapper_form.cgi). Sequence relatedness to proteins in public databases was assessed using NCBI BLAST (http://blast.ncbi.nlm.nih.gov/Blast.cgi) and the Pfam software (http://pfam.sanger.ac.uk/search/sequence).
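The burn-in removal and pooling of independent runs described above can be illustrated with a minimal Python sketch. It is not the LogCombiner implementation: the function names `discard_burn_in` and `combine_runs` are hypothetical, and each run is assumed to be a plain list of sampled parameter values.

```python
def discard_burn_in(samples, fraction=0.10):
    """Drop the first `fraction` of the sampled states from one MCMC run."""
    if not 0 <= fraction < 1:
        raise ValueError("fraction must be in the range [0, 1)")
    start = int(len(samples) * fraction)
    return samples[start:]


def combine_runs(runs, fraction=0.10):
    """Pool the post-burn-in samples of several independent runs."""
    pooled = []
    for run in runs:
        pooled.extend(discard_burn_in(run, fraction))
    return pooled


if __name__ == "__main__":
    # Toy numbers only (hypothetical), standing in for logged parameter values.
    run_a = [0.9, 0.5, 0.4, 0.42, 0.41, 0.39, 0.40, 0.41, 0.40, 0.42]
    run_b = [0.1, 0.3, 0.41, 0.40, 0.43, 0.38, 0.40, 0.41, 0.39, 0.40]
    pooled = combine_runs([run_a, run_b])
    print(len(pooled), sum(pooled) / len(pooled))
```

Pooling only after burn-in removal keeps the early, non-stationary part of each chain out of the combined sample; agreement between the per-run summaries is what suggests convergence.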
Hydrophobicity profiles of proteins were analysed using the Kyte and Doolittle algorithm [25] implemented in the Winpep programme [26].
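A Kyte and Doolittle hydropathy profile is a sliding-window average of per-residue hydropathy values. The sketch below is a minimal Python illustration and not the Winpep implementation; the window size of 9 and the function name `hydropathy_profile` are assumptions, since the window used in this study is not stated.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KD_SCALE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=9):
    """Return the mean hydropathy of every full window along the sequence."""
    seq = seq.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and len(seq)")
    # A KeyError here signals a non-standard residue code.
    scores = [KD_SCALE[aa] for aa in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


if __name__ == "__main__":
    # Hypothetical toy peptide, not a sequence from this study.
    toy = "MKILLAVVGGDDEERK"
    for pos, value in enumerate(hydropathy_profile(toy, window=5), start=1):
        print(pos, round(value, 2))
```

Positive window averages indicate hydrophobic stretches and negative averages hydrophilic ones, which is what makes two profiles comparable by eye when they are plotted along the sequence.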

Sequence Analysis and Comparison of Orbivirus Proteins
The 10 dsRNA genome segments of ORUV, LEBV and CGLV were converted into full-length cDNAs, cloned and sequenced. The resulting data have been deposited in GenBank (see table 2). The first three and last three nucleotides of all segments of ORUV or CGLV, and the first two and last two nucleotides of LEBV, are inverted complements. In all three viruses the 5′ dinucleotide and 3′ trinucleotide are identical to those found in other orbiviruses [1,2]. Most of the ORUV, LEBV and CGLV genome segments contain a single major open reading frame (ORF), which spans almost the entire length of the +ve strand. The only exception is Seg-9, which in each case contains two overlapping but out-of-phase ORFs. The first spans almost the entire length of the segment, encoding the viral helicase VP6(Hel), while a second, overlapping ORF encodes NS4, as found in other orbiviruses [27,28]. The sizes of the encoded proteins, together with the lengths of the 3′ and 5′ non-coding regions (NCRs), are given for each genome segment characterised in table 2.
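Whether the termini of a segment are inverted complements can be checked by comparing the first n nucleotides with the reverse complement of the last n. The following is a minimal Python sketch; the function names and the toy sequence are hypothetical and are not taken from the deposited data.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA string (A, C, G, U only)."""
    return "".join(COMPLEMENT[nt] for nt in reversed(rna.upper()))


def termini_are_inverted_complements(segment, n=3):
    """True if the first n nt equal the reverse complement of the last n nt."""
    seg = segment.upper().replace("T", "U")  # accept cDNA-style input too
    if n < 1 or 2 * n > len(seg):
        raise ValueError("n is too large for this segment")
    return seg[:n] == reverse_complement(seg[-n:])


if __name__ == "__main__":
    # Hypothetical toy segment: starts GUA..., ends ...UAC.
    toy_segment = "GUAAAACCCGGGUUUUAC"
    print(termini_are_inverted_complements(toy_segment, n=3))
```

Setting `n=3` corresponds to the check described for ORUV and CGLV, and `n=2` to that described for LEBV.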
The NS4 sequences of ORUV, LEBV and CGLV contain a high proportion of charged residues, with basic R+K (arginine+lysine) content ranging from 13% to 22%, while acidic E+D (glutamic+aspartic acids) content ranges from 12% to 22%. Each NS4 protein contains 4-5 histidine residues. As seen in other orbivirus NS4s [27], these analyses also identified either monopartite or bipartite nuclear localisation signals (NLS) (table 7). The 3 NS4s are rich in arginine and lysine residues, which are essential for NLS [29]. The NS4 proteins of ORUV (133 aa long), LEBV (92 aa long) and CGLV (87 aa long) were also predicted, using BLAST and Pfam analyses, to bind DNA, consistent with previous results obtained with NS4s of GIV and BTV [27]. In particular, the ORUV NS4 exhibited 30% amino acid identity with the XRE transcriptional regulation factor (which binds DNA and regulates transcription). These findings confirm the presence of an NS4 ORF in sandfly-borne orbiviruses, as recently shown in other insect- and tick-borne orbiviruses [27].
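The charged-residue percentages quoted above are simple composition counts. A minimal Python sketch is given here for illustration; the function name `residue_percent` and the toy peptide are hypothetical.

```python
def residue_percent(seq, residues):
    """Percentage of positions in seq occupied by any of the given residues."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    hits = sum(seq.count(r) for r in residues.upper())
    return 100.0 * hits / len(seq)


if __name__ == "__main__":
    # Hypothetical toy peptide, not an NS4 sequence from this study.
    toy = "MKRRHEDKLAHGSEDKRH"
    print("R+K %:", round(residue_percent(toy, "RK"), 1))
    print("E+D %:", round(residue_percent(toy, "ED"), 1))
    print("His count:", toy.count("H"))
```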

Comparisons of the VP1(Pol) to the Polymerase of other Orbiviruses
The polymerase genes and proteins of ORUV, LEBV and CGLV were aligned with those of other Orbivirus species and compared phylogenetically (figure 1 and figure S1), showing that the tick and tick-borne viruses cluster together, 'rooting' the insect-borne orbiviruses. A previous study detected 53% to 73% identity in VP1(Pol) between different insect-transmitted Orbivirus species, including AHSV, EHDV, BTV, Equine encephalosis virus (EEV) and Palyam virus (PALV) [13]. In contrast, only ~35% aa identity was detected between these insect-transmitted viruses and the tick orbivirus SCRV, and 45% between the insect-transmitted viruses and members of the tick-borne Great Island virus species (GIV). Intermediate identity levels of 41% were detected between the polymerases of GIV and SCRV [30]. Accession numbers for orbivirus VP1(Pol) downloaded from the databases are provided in table 1. Comparisons of VP1(Pol) of ORUV with the Culicoides-borne orbiviruses showed 50% to 62% aa identity with Ibaraki virus (EHDV-2) and AHSV, respectively. In contrast, comparison of ORUV VP1 with the mosquito-borne orbiviruses showed 47% to 49% aa identity with PHSV and Umatilla virus (UMAV), respectively. Amino acid identities with tick-borne orbivirus VP1 ranged from 39% to 47% with SCRV and GIV, respectively.
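Percentage identities of this kind are computed from a pairwise alignment. The Python sketch below shows one common convention, in which columns containing a gap in either sequence are excluded from the count; whether the programmes used in this study apply the same convention is an assumption, and the function name and toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identical residues over the ungapped columns of an alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical toy alignment, not VP1(Pol) data.
    print(percent_identity("MKV-LAGT", "MRVGLA-T"))
```

Other conventions (for example, dividing by the length of the shorter sequence) give slightly different values, so identities are only comparable when computed in the same way.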

Comparisons of the T2 Subcore Proteins
The orbiviruses show 26% to 83% aa identity in their T2 proteins between different virus species [13]. The levels of aa identity between the T2 proteins of ORUV, LEBV and CGLV ranged from 57% to 67% (table 6) confirming their classification as three different species.
Accession numbers for orbivirus T2 proteins downloaded from the databases are provided in table 1.
Previous evolutionary analyses have suggested that ticks appeared approximately 225 million years ago (MYA) [32], whilst the earliest dating of culicine mosquitoes is about 150 MYA [33] and Culicoides biting midges have been dated to the Cretaceous period (140-65 MYA) [34,35]. The earliest extant lineage of biting midges was found in 120-122 million-year-old amber [36]. The oldest sandflies were identified in Lebanese amber that is 135-120 million years old [37,38]. The evolutionary and fossil studies are in agreement regarding the dates of separation of ticks, mosquitoes and Culicoides. However, they disagree on the date of separation of sandflies [39]. The use of two different arthropod genes to assess arthropod phylogenies was therefore important. The COXI-based tree for the 3 groups that transmit orbiviruses (ticks, mosquitoes and Culicoides) is shown in figure 3. The antigen 5-related protein-based tree of all 4 arthropod groups is shown in figure 4.
Comparisons of the VP1 trees with the COXI tree of ticks, mosquitoes and Culicoides also revealed strikingly similar topologies (figure 5). The antigen 5-related protein-based tree showed an identical topology to that of the VP1 trees (figure 6). Such a resemblance has been considered an indication of co-evolution of viruses and their hosts [40].
Topologies of trees for the T2 aa and nt sequences differed from the VP1 trees. The orbivirus T2 protein sequences cluster into two groups, containing either the VP2(T2) tick-borne/mosquito-borne viruses or the VP3(T2) Culicoides-borne viruses. This clustering indicates that the mosquito-borne T2 sequences are closer to those of the tick-borne than to those of the Culicoides-borne viruses.

Comparisons of the T13 (VP7) Core Surface Proteins
Accession numbers for orbivirus VP7(T13) downloaded from the databases are provided in table 1.
Accordingly, VP7(T13) of ORUV, LEBV and CGLV shows highest aa identity levels compared to other Culicoides-borne orbiviruses, but is less closely related to the mosquito-borne and tick-borne orbiviruses. The aa maximum likelihood phylogenetic tree (figure S2) confirms that VP7(T13) of ORUV, LEBV and CGLV clusters within the Culicoides-borne virus-group. A codon-to-codon aligned nucleic acid ML tree (figure 7) showed a similar topology to those of VP1(Pol), where tick-borne viruses provide a root to insect-borne orbiviruses. The VP7 nucleic acid ML tree has a strikingly similar topology to that of the arthropod COXI-based and antigen 5-related protein-based trees, consistent with the 'co-evolution' hypothesis.

Comparison of the Genetic Distances between Groups of Orbiviruses and Dates of Vector-family Divergence
The largest genetic distances between members of the tick-borne, mosquito-borne or Culicoides-borne groups were calculated for VP1(Pol), T2 (VP2 or VP3) and VP7(T13) (the three most conserved orbivirus proteins) (table 8).
Comparisons of the divergence dates for the ticks, mosquitoes and Culicoides midges, with the genetic distances between the orbiviruses transmitted by these three vector groups showed nearly linear relationships for both VP1 and T2 proteins, with correlation coefficients of 0.998 and 0.994, respectively. The linearity in the T13 protein is less obvious, with a correlation coefficient of 0.931 for that series (figure 8).
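A correlation coefficient of this kind can be computed as Pearson's r between the vector divergence dates and the within-group genetic distances. The Python sketch below is illustrative only: the dates are taken from the ranges cited in the text (the value used for Culicoides is an assumption), and the distances are placeholder numbers, not the values in table 8.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length (at least 2 points)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant series has no defined correlation")
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # Divergence dates in MYA: ticks, mosquitoes, Culicoides (assumed values).
    dates_mya = [225, 150, 140]
    # Placeholder genetic distances (hypothetical, NOT from table 8).
    distances = [0.90, 0.62, 0.58]
    print(round(pearson_r(dates_mya, distances), 3))
```

With only three vector groups, a high coefficient is supportive rather than conclusive, which is why the tree-topology comparisons are reported alongside it.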

G+C Content of the Orbivirus Genome
Analysis of the G+C contents of genomes of various midge-borne, mosquito-borne and tick-borne orbiviruses showed specificities to each group. The G+C content of Culicoides-borne viruses ranged from 42

Calculations of Molecular Evolutionary Rates (MRE)
Molecular evolutionary rates (MREs) were calculated for the three most conserved orbivirus genes using BEAST and were consistent with what is known for RNA viruses in general. The overall mean rates were 3.22 × 10⁻⁴ … MREs for the insect-borne orbiviruses are almost double those of the tick-borne orbiviruses.

Outer Capsid Protein 1 of the Insect-borne Orbiviruses Represents a Concatemer of an Ancestral Tick-borne Counterpart
Agarose gel electropherotypes of a Culicoides-borne orbivirus (BTV), a mosquito-borne orbivirus (YUOV) and a tick-borne orbivirus (GIV) are shown in figure 9. These electropherotypes show the relative mobility (related to the size) of genome segments encoding OC1 and OC2 of these groups of orbiviruses. The relative migration of genome segments encoding OC1s indicates that Seg-5, encoding VP4(OC1) of the tick-borne viruses, is about half the size of that encoding VP2(OC1) of Culicoides-borne and VP3(OC1) of mosquito-borne viruses. VP4 of GIV is related to the carboxy-terminal half (aa 483 to 954) of VP2 from BTV, EHDV and AHSV (Culicoides-transmitted) and VP3 from YUOV and PHSV (both mosquito-transmitted), with 28-30% aa sequence identity. Figure 9 also shows a schematic for the match between VP4(OC1) of GIV and the carboxy-terminal half of VP2(OC1) of BTV. The hydrophobicity plot of GIV VP4 between aa 114 and 523 is similar to that of aa 642 to 956 of VP2 (VP2D2) (figure 9).
An amino acid-based neighbour-joining phylogenetic tree shows three groups of the highly variable cell-attachment and outer-capsid protein 'OC1' (figure 10). Use of the programme 'REPRO' indicates that OC1 of the insect-borne viruses contains sequences that have been duplicated at some point during their evolution (figures 11 and 12). In BTV, aa 63 to 471 were identified as a repeat of aa 500 to 955. Finer sequence analyses identified that aa 75-442 have highly similar hydrophobicity plots to aa 567-955 (figure 11). In YUOV, aa 11 to 448 were identified as a repeat of aa 45 to 851. Finer sequence analysis identified that aa 60-448 have highly similar hydrophobicity plots to aa 462-851 (figure 11).
For the three viruses characterised in this study, OC1 is identified as VP2 (based on its relative size, as the second largest virus protein). In ORUV (figure 13), aa 26 to 421 of VP2(OC1) were identified as a repeat of aa 427 to 899 of the same protein.
Finer sequence analysis identified that aa 75-384 have highly similar hydrophobicity plots to aa 520-899 (figure 13). In LEBV, aa 7 to 412 of VP2(OC1) represent a repeat of aa 417 to 831. In CGLV, which has the longest orbivirus OC1 reported to date (1151 aa), aa 1 to 505 were derived by duplication of aa 521 to 1002. In the tick-borne orbiviruses, OC1 of SCRV is also identified as VP3, containing 654 aa, while OC1 of tick-borne KEMV and GIV is identified as VP4, containing 551 aa. Amino acids 1 to 81 of SCRV VP3(OC1) may also represent a duplication of aa 88 to 160 (figure 14). The hydrophobicity plots of the two sequences are highly similar (figure 14).
Single isolates of ORUV have been obtained from Culex perfuscus, Anopheles gambiae, or Aedes aegypti mosquitoes [3,7] (http://wwwn.cdc.gov/arbocat). The inability of orally fed Aedes aegypti to transmit ORUV, even though transmission occurred after intrathoracic inoculation (bypassing a potential gut barrier), raises questions about the current assumption that this virus is mosquito-borne.
Differences in the migration order of the T2 protein, between the groups of orbiviruses that are transmitted by different vectors, are caused by large variations in the relative size of the highly variable outer-capsid protein one (OC1). In the Culicoides-borne orbiviruses, OC1 is the 2nd largest viral protein (VP2, encoded by Seg-2: 110-120 kDa), while in the mosquito-borne viruses it is slightly (~10%) smaller (VP3, encoded by Seg-3: 90-100 kDa).
Our analyses indicate that OC1 of both groups of insect-borne orbiviruses was generated by concatemerisation/duplication of OC1 from an ancestral tick-borne virus. The ancestral form of OC1 would be ~573 aa, similar in length to that of Great Island virus (GIV). GIV, which is transmitted by ticks, contains a smaller OC1 (VP4, encoded by Seg-5, 62 kDa) that is approximately 55% of the size of its counterpart in the insect-borne orbiviruses [13]. The OC1 and T2 proteins of ORUV, LEBV and CGLV are identified as VP2 and VP3 respectively, again grouping them with the Culicoides-borne viruses.
Phylogenetic analysis of tick-, mosquito- and Culicoides-borne sequences indicates that OC1 of the tick-borne viruses forms a distinct phylogenetic cluster, which has a common root with OC1 of the mosquito-borne orbiviruses, while the Culicoides-borne OC1 forms a further separate cluster. These analyses also indicate two groups of OC1 representing the tick or tick-borne orbiviruses. One is represented by VP4(OC1) of the GIV group, which may be closer to the ancestral orbivirus, while the other is represented by SCRV VP3(OC1), which has a partial duplication within its first 160 aa.
Duplication of individual orbivirus genome segments can occur via a process of concatemerisation [47]. Partial or full gene duplication (concatemerisation) has also been identified in several other 'reovirus' proteins, indicating that it is a generalised mechanism creating sequence diversity in viruses of the family Reoviridae [47,48,49,50]. For example, genome segment 9 of Colorado tick fever virus (Coltivirus, Reoviridae) was found to have been generated by a full gene duplication. Following a duplication event the repeated sequences can evolve separately, in response to functional constraints [48,49,51].
The two subdomains of the insect-borne OC1s show significant levels of aa identity (28-29%) and have very similar (almost superimposable) hydrophobicity profiles. It therefore appears unlikely that the smaller OC1s of the tick-borne orbiviruses could have originally evolved through partial deletion of a larger (insect-borne) precursor protein. Deletions in dsRNA viruses often generate defective-interfering viruses that are unable to spread in the absence of the original complementing virus strain [50,52], while concatemerisation does not affect the viability of the viruses [47,53]. Recently, an African horse sickness virus expressing a truncated VP2, from which 20% of the protein (amino acids 279 to 503) had been lost, was found to replicate efficiently in cell culture [54]. In BTV VP2, the sequence corresponding to this deletion encompasses the neutralisation epitopes described earlier [55]. The size differences observed in OC1 between the different groups of orbiviruses may have implications for their interactions with cell-surface receptors in the different groups of vectors/hosts. Antibody neutralisation of the tick-borne orbiviruses involves both OC1 and OC2, while OC1 is clearly the dominant neutralisation antigen of the insect-borne orbiviruses [1,2,56]. From an evolutionary perspective, for such a duplication of aa sequence to become fixed within the virus population it must provide a fitness advantage in terms of replication or transmission efficiency, promoting survival of the new modified gene, for example through adaptation to a new environment/host [57,58,59]. Mutations that positively affect protein function could potentially increase, rather than reduce, the probability of retention of the duplicated gene [60].
After a concatemeric duplication of the aa sequence within a single protein, subsequent evolution of the duplicated gene could lead to partitioning and separation of its original functions into the different halves of the protein, rather than simply a duplication of these functions [60]. Indeed, this may be true for OC1 of the insect-borne orbiviruses such as BTV, where neutralisation epitopes are principally mapped to the amino-terminal half of VP2(OC1) [55,61]. It is noteworthy that the deletion identified in AHSV VP2 involves a sequence of domain 1, while domain 2 is intact. It has been suggested that this deletion also uncovers a sialic acid binding site [54], which may be located on domain 2. Domains 1 and 2 of BTV VP2(OC1), expressed separately in a soluble form, were both found to raise neutralising antibodies in mice. However, the neutralising antibody titres were 10 times higher with domain 1 than with domain 2 (Mohd Jaafar et al., manuscript in preparation).
The G+C content of the ORUV and LEBV genomes also places them closer to the Culicoides-borne viruses than to the mosquito-borne viruses, while the G+C content of CGLV (transmitted by sandflies) is borderline between those of mosquito-borne and Culicoides-borne viruses.
Based on their isolation from mosquitoes, both ORUV and LEBV were originally considered likely to be mosquito-borne. However, a virus can be isolated from freshly engorged mosquitoes that have ingested an infectious blood meal, without an actual infection of the mosquitoes, which would be required for transmission. Although the data presented here indicate that both ORUV and LEBV are likely to be Culicoides-borne viruses, this will require confirmation by vector competence studies. CGLV, the only known sandfly-borne orbivirus, clusters among the Culicoides-borne viruses. Interestingly, CGLV was also found to replicate in KC cells (cells derived from Culicoides variipennis) (data not shown).
It has been previously suggested that the non-vectored dsRNA viruses have evolved by co-evolution with their respective hosts [62]. Neighbour-joining analysis of orbivirus T2 proteins using the P-distance or Poisson's correction algorithms, as well as maximum likelihood analyses, indicate that SCRV represents the oldest known orbivirus lineage, providing a 'root' for all of the other orbiviruses. SCRV has no known vertebrate hosts and could be a true ''tick virus'' rather than a ''tick-borne virus'' [13,30,63]. The same analyses also show that T2 proteins of the tick-borne and mosquito-borne orbiviruses form distinct phylogenetic clusters originating from a common branch, but are more closely related to each other than to the Culicoides-borne viruses, which are located on a distinct branch of the tree.
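The p-distance and Poisson correction mentioned above are related by d = -ln(1 - p), where p is the proportion of differing aligned sites. The Python sketch below illustrates both; it is not the MEGA implementation, gapped columns are ignored (an assumption about gap handling), and the function names and toy alignment are hypothetical.

```python
from math import log


def p_distance(aligned_a, aligned_b, gap="-"):
    """Proportion of ungapped alignment columns at which two sequences differ."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a != gap and b != gap
    ]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return sum(a != b for a, b in pairs) / len(pairs)


def poisson_distance(p):
    """Poisson-corrected distance, d = -ln(1 - p), for a p-distance p < 1."""
    if not 0 <= p < 1:
        raise ValueError("p must be in the range [0, 1)")
    return -log(1.0 - p)


if __name__ == "__main__":
    # Hypothetical toy alignment, not T2 data.
    p = p_distance("MKVLAGTW", "MRVLSGTW")
    print(p, round(poisson_distance(p), 4))
```

The correction matters most for divergent pairs: as p grows, multiple substitutions at the same site become likely and the uncorrected p-distance increasingly underestimates the true distance.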
The VP7(T13)-based amino acid trees showed similar topologies to those of the T2 protein. Together with the amino acid or nucleic acid ML trees for Seg-1/VP1(Pol), these indicate that the tick-borne orbiviruses group together, providing a root for the mosquito-borne and Culicoides-borne orbivirus groups. Previous phylogenetic analyses, based on mitochondrial genes, indicate that ticks also represent a root for other arthropods, including the flies (Culicoides and sandflies) and mosquitoes [64]. The clustering of ORUV and LEBV among Culicoides-borne viruses disagrees with previous suggestions, based on their isolation from mosquitoes, that both viruses are mosquito-borne.
A linear relationship was observed between the largest genetic distances calculated within each of the three phylogenetic groups of orbiviruses and dates for the evolutionary separation of their vectors. The similar topology of the viral-gene trees and vector-COXI based trees or antigen 5-related protein based trees is not shared with the mammalian-host-COXI tree (data not shown) and no linearity was detected between the genetic distances between viruses and the dates of separation of their mammalian hosts (data not shown). These results provide primary evidence for coevolution of the orbiviruses with their arthropod vectors rather than their vertebrate hosts.
The G+C content of the mosquito genome is within the range of 35.2%-38.7% [65,66] (http://www.broadinstitute.org/annotation/genome/aedes_aegypti.2/SingleGenomeIndex.html). In contrast, the G+C content of the ixodid tick genome is approximately 56% for coding regions (http://mail.vectorbase.org:82/pipermail/iscapularis/2008-December/000017.html). From available Culicoides sequences in the databases, the G+C content of a Culicoides coding region is approximately 39%. The G+C content of the genomes of the different vector groups of orbiviruses is similar to that of their vectors, supporting the co-evolution hypothesis between orbiviruses and their respective hosts.
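G+C content is the percentage of G and C among all unambiguous nucleotides, computed over the concatenated genome segments. A minimal Python sketch follows; the function names are hypothetical, ambiguous codes such as N are excluded from the denominator (an assumption), and the toy segments are not sequence data from this study.

```python
def gc_percent(seq):
    """Percent G+C among the unambiguous nucleotides (A, C, G, T/U) of seq."""
    counted = [nt for nt in seq.upper() if nt in "ACGTU"]
    if not counted:
        raise ValueError("no unambiguous nucleotides in sequence")
    gc = sum(nt in "GC" for nt in counted)
    return 100.0 * gc / len(counted)


def genome_gc_percent(segments):
    """G+C percent over all segments of a segmented genome, length-weighted."""
    return gc_percent("".join(segments))


if __name__ == "__main__":
    # Hypothetical toy segments standing in for the 10 genome segments.
    toy_segments = ["GUUAAAGGCC", "GUAAAUUUGCAC", "GUAUACGGAUAC"]
    print(round(genome_gc_percent(toy_segments), 1))
```

Concatenating the segments before counting weights each segment by its length, which differs from averaging the per-segment percentages when segment sizes vary.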
The G+C content is significantly different between the tick-borne and insect-borne orbiviruses (14% to 17% difference). This is inconsistent with a simple and rapid jump to a new vector species and instead suggests a much slower co-evolution/adaptation process. In contrast, there are smaller differences in G+C content (of only ~9%) between the tick-borne and insect-borne flaviviruses, which appear to have diverged more recently from a proposed mosquito/mosquito-borne ancestor [67,68]. In phylogenetic trees, the insect-borne flaviviruses provide a 'root' for the tick-borne flaviviruses, while the reverse is true for the orbiviruses.
Previous evolutionary studies suggest that ticks appeared approximately 225 million years ago (MYA) [32], while the earliest dating of culicine mosquitoes is about 150 MYA [33]. Culicoides biting midges are vectors for several orbiviruses and their appearance has been dated to the Cretaceous period (140-65 MYA) [34,35].
The topologies of phylogenetic trees for the orbivirus genes/proteins are similar to those of the vectors' genes. The relationships between genetic distances of the orbivirus genes and the dates of separation of the three vector groups (ticks, mosquitoes and midges) are near linear. OC1 of the insect-borne orbiviruses appears to have evolved from an ancestral OC1, probably from a tick-borne virus. It is therefore likely that orbiviruses have co-evolved with their vector groups, generating three major phylogenetic groups. The available data suggest that viruses in these groups do not cross between the vector-species groups. The lack of co-speciation with their vertebrate hosts suggests that the ancestral orbiviruses were primarily arthropod viruses that subsequently crossed the species barrier between arthropods and mammalian hosts.
Based on the T2 gene (which showed the lowest rates of change in both the tick-borne and insect-borne orbiviruses), the most recent common ancestor of the known tick-borne orbiviruses is dated to ~7,000 years ago (range: ~4,500 to ~8,500), while the most recent common ancestor of the currently known insect-borne orbiviruses is dated to 3,700 years ago (range: ~2,100 to ~5,200).
The data provided in this manuscript support the co-evolution hypothesis for the orbiviruses with their vectors [13], indicating that it is more likely than switching from one vector group to another. Isolates of a single virus species can be transmitted by more than one vector species (e.g. BTV has been isolated from several Culicoides species), making it difficult to infer co-speciation at the vector-species level. The earliest orbiviral ancestor was a tick/tick-borne orbivirus which existed at least 225 MYA. Mosquito or mosquito-borne orbiviral ancestors would have evolved from this ancestral virus, followed by Culicoides or Culicoides-borne orbiviruses.
The generation of full genome sequence data for ORUV, LEBV and CGLV will facilitate the development of sequence-specific RT-PCR assays for epidemiological studies, as well as identification of other virus isolates belonging to the same Orbivirus species.
Figure S1 A maximum likelihood tree showing phylogenetic comparisons of the nucleotide sequences of Seg-1 encoding the VP1(Pol) of ORUV, LEBV and CGLV, aligned with those of other Orbivirus species. The figure depicts the three groups of orbiviruses (i-Culicoides-/sandfly-borne, ii-mosquito-borne and iii-tick-borne) as separate clusters. The tree is based on codon-to-codon nucleotide alignments generated from the aa profile alignment.