Full Genome Sequencing of Corriparta Virus, Identifies California Mosquito Pool Virus as a Member of the Corriparta virus Species

The species Corriparta virus (CORV), within the genus Orbivirus, family Reoviridae, currently contains six virus strains: corriparta virus MRM1 (CORV-MRM1); CS0109; V654; V370; Acado virus; and Jacareacanga virus. However, a lack of neutralization assays and reference genome sequence data has prevented further analysis of their intra-serogroup/species relationships and identification of individual serotypes. We report whole-genome sequence data for CORV-MRM1, which was isolated in 1960 in Australia. Comparisons of the conserved polymerase (VP1), sub-core-shell ‘T2’ and core-surface ‘T13’ proteins, encoded by genome segments 1, 2 and 8 (Seg-1, Seg-2 and Seg-8) respectively, show that this virus groups with the other mosquito-borne orbiviruses. However, the highest levels of nt/aa sequence identity (75.9%/91.6% in Seg-2/T2; 77.6%/91.7% in Seg-8/T13) were detected between CORV-MRM1 and California mosquito pool virus (CMPV), an orbivirus isolated in the USA in 1974, showing that they belong to the same virus species. The data presented here identify CMPV as a member of the Corriparta virus species and will facilitate identification of additional CORV isolates, diagnostic assay design and epidemiological studies.


Introduction
Corriparta viruses are mosquito-borne arboviruses, classified within one of the 22 virus species currently recognised within the genus Orbivirus, family Reoviridae. Currently there are also 15 'unclassified' orbiviruses in the genus, which may represent additional species [1][2][3][4][5][6]. The orbivirus genome is composed of 10 segments of linear dsRNA, packaged as one copy of each segment within each of the non-enveloped icosahedral virus particles. The intact virion is composed of three concentric protein shells (the 'outer-capsid', the 'core-surface layer' and the 'subcore-shell'). Orbiviruses are transmitted by ticks or hematophagous insect vectors (including Culicoides, mosquitoes or sand flies) and collectively have a wide host-range that includes both domesticated and wild ruminants, equids, camelids, marsupials, sloths, bats, birds, large canine and feline carnivores, and humans [2,7,8,9].
The species Corriparta virus currently contains six distinct viruses, that are identified as: corriparta virus MRM1 (CORV-MRM1); CS0109; V654; V370; Acado virus; and Jacareacanga virus [2]. The structural and chemical properties of the corriparta viruses are similar to those of other orbiviruses [2]. They are sensitive to low pH and heat, and can be modified by treatment with trypsin or chymotrypsin [10]. They have also been shown to multiply in mosquitoes after intra-thoracic inoculation [10].
Members of the Corriparta virus species/serogroup have been detected in Australia, Africa and South America [11]. They have been isolated from wild birds, and neutralizing antibodies were found in wild and domestic birds, cattle, marsupials, horses and man [12][13][14][15]. Corriparta virus MRM1 was isolated in 1960, from Culex mosquitoes, as well as from Aedeomyia catasticta, a rare mosquito species collected near Mitchell River in North Queensland, Australia. Subsequently, strains CS0109, V654 and V370 were also isolated in Australia [2,8,15,16]. Acado virus and Jacareacanga virus were isolated from pools of Culex mosquitoes collected in Ethiopia and Brazil during 1963 and 1975 respectively [8,11].
The International Committee on Taxonomy of Viruses (ICTV) has agreed 'polythetic' definitions for individual virus species [17]. The ability to exchange genome segments with other viruses belonging to the same virus species by 'reassortment' is recognised as the primary determinant of Orbivirus species [2,7]. However, in the absence of data concerning their compatibility for reassortment, the members of individual species can be identified by other 'polythetic' parameters that include similarities in RNA and protein sequences, their RNA-segment size distribution (reflected by their migration patterns, or electropherotype, during agarose gel electrophoresis [AGE]), host and/or vector range, the clinical signs of infection, and serological relationships [2,7,18,19,20].
The members of the different Orbivirus species were originally identified as belonging to distinct 'serogroups', based on their cross-reactivity in 'group-specific' serological assays that include complement fixation (CF) tests, group-specific ELISA, or agar gel immunodiffusion (AGID) tests, most of which target outer-core protein VP7(T13) [2,7,21]. The corriparta viruses were initially grouped primarily on the basis of CF tests [8,22]. However, a lack of neutralization assays has prevented further analysis of their intra-serogroup serological relationships and the identification of distinct serotypes.
'California mosquito pool virus' (CMPV) was isolated in 1974 from pooled Culex tarsalis mosquitoes collected as part of an infectious agent surveillance program conducted by The California Department of Public Health [5]. Partial sequences for genome segments 2, 4, 6, 7 and 9 from CMPV (accession numbers EU789391 to EU789395) were compared to available data for other orbiviruses, suggesting that CMPV might represent a novel virus species [5]. However, the lack of reference sequences for representatives of all Orbivirus species, made it impossible to confirm the taxonomic status and species identity of CMPV at that time.

Characterisation and coding assignments of CORV genome segments
Sequences for Seg-1 to Seg-10 of CORV-MRM1 (AUS1960/01) have been deposited in GenBank with accession numbers KC853042 to KC853051, respectively. They range from 3,925 bp to 790 bp (encoding proteins of 1,290 aa to 108 aa) with a total length of 19,093 bp (Table 1). The different genome segments of CORV-MRM1 all share six fully conserved nucleotides at their 5′ ends, and 10 at their 3′ ends (+ve: 5′-GUAUAG………..CAAAGGAUAC-3′). Two terminal nucleotides at the 5′ end and three nucleotides at the 3′ end (5′-GU……UAC-3′) are characteristic of the genus Orbivirus and the first and last two nucleotides represent inverted complements (http://www.reoviridae.org/dsRNA_virus_proteins/CPV-RNA-Termin.htm).
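The statement that the first and last two nucleotides are inverted complements can be checked directly from the conserved termini quoted above. The following is a minimal sketch in Python; the function names are illustrative assumptions, and only the two terminal motifs are taken from the text.

```python
# Illustrative check of the conserved terminal motifs quoted in the text.
# Function names are hypothetical; only the two motifs come from the paper.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA string (A/C/G/U only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna.upper()))


def termini_are_inverted_complements(five_prime: str, three_prime: str, n: int = 2) -> bool:
    """True if the first n bases at the 5' end can pair with the last n bases at the 3' end."""
    return reverse_complement(five_prime[:n]) == three_prime[-n:]


CORV_5P = "GUAUAG"      # conserved 5' hexanucleotide (positive strand)
CORV_3P = "CAAAGGAUAC"  # conserved 3' decanucleotide (positive strand)

# The first two bases (GU) and the last two bases (AC) are inverted complements.
print(termini_are_inverted_complements(CORV_5P, CORV_3P, n=2))
```

The same helper could be applied to the termini of any other segment or virus, provided the sequences are supplied in the positive-strand 5′ to 3′ orientation.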
Like other orbiviruses, most genome segments of CORV-MRM1 (AUS1960/01) have shorter 5′ than 3′ NCRs, except for Seg-6 (encoding the smaller outer capsid protein VP5(OC2)), which has a longer 5′ NCR, and Seg-9 (encoding VP6), which has 3′ and 5′ NCRs of equal length (Table 1). Exceptions also occur in Umatilla virus (UMAV) and Great Island virus (GIV), although the significance of these variations is unclear.
The 'highly conserved' subcore-shell VP2(T2) protein and the 'highly variable' outer-capsid/cell-attachment protein VP3(OC1) of CORV-MRM1 are encoded by Seg-2 and Seg-3, respectively. A similar coding pattern is seen in other MBOs, but the presence of a larger OC1 in the CBOs (including bluetongue virus, the orbivirus 'type' species) results in a reversed coding assignment for these two genome segments (Figure 1 and Table 2).
The size of the highly conserved T13 core-surface protein (VP7 of BTV) is also relatively consistent in most orbiviruses, while the viral inclusion-body matrix-protein (non-structural protein 2 [NS2(ViP)]) is more variable in size (Table 2). As a result, NS2 of CORV-MRM1 (AUS1960/01) is encoded by Seg-7, while Seg-8 encodes the core-surface protein VP7(T13) (Table 1), in a manner similar to the other MBOs (Table 2). However, this coding assignment is again reversed in BTV and in some CBOs (EUBV and CHUV), due to variability in the size of NS2 between different viruses [34,35]. The capping enzyme 'VP4(CaP)' of CORV-MRM1 (at 643 aa) is smaller than that of the other MBOs and some of the CBOs (Table 2). However, it is encoded by the largest Seg-4 (at 2,032 bp) so far identified in any orbivirus.
Most of the genome segments of CORV-MRM1 (AUS1960/01), except Seg-9 and Seg-10, are monocistronic, encoding a protein from a single large ORF, starting from an initiation codon with a strong Kozak sequence (RNNAUGG) [36]. However, like BTV, CORV Seg-10 has two in-frame AUG initiation sites encoding the NS3 and NS3a proteins of 238 aa and 108 aa (starting at 17 bp and 407 bp) respectively (Table 1). The first of these (coding for NS3) has a 'weak Kozak context' (GUAAUGU), possibly enhancing read-through and initiation of translation from the second 'in-frame' initiation site (at 407 bp). This has a strong Kozak context (GUUAUGG), but would express the smallest NS3a in any of the orbiviruses characterized to date.
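The in-frame relationship between the two Seg-10 initiation sites can be verified with simple arithmetic on the positions and protein lengths given above. The short Python sketch below is illustrative only; the function and variable names are assumptions, while the numbers (17 bp, 407 bp, 238 aa, 108 aa) are those stated in the text.

```python
# Arithmetic check of the Seg-10 initiation sites described in the text.
# Names are hypothetical; positions and lengths are taken from the paper.

def same_reading_frame(start_a: int, start_b: int) -> bool:
    """Two start positions share a reading frame if they differ by a multiple of 3."""
    return (start_b - start_a) % 3 == 0


def codons_between(start_a: int, start_b: int) -> int:
    """Number of whole codons separating two in-frame start positions."""
    return (start_b - start_a) // 3


NS3_START, NS3A_START = 17, 407   # nucleotide positions of the two AUG codons
NS3_AA, NS3A_AA = 238, 108        # reported protein lengths

in_frame = same_reading_frame(NS3_START, NS3A_START)
extra_codons = codons_between(NS3_START, NS3A_START)
lengths_consistent = (NS3_AA - extra_codons) == NS3A_AA

print(in_frame, extra_codons, lengths_consistent)  # True 130 True
```

The 390 nt between the two AUG codons correspond to 130 codons, which matches the 130 aa difference between NS3 (238 aa) and NS3a (108 aa), as expected if both proteins terminate at the same stop codon.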
The first start codon of CORV-MRM1 (AUS1960/01) Seg-9 also has a moderate Kozak sequence (UUGAUGA) and a second down-stream ORF. However, this is in the +2 reading-frame (at 143-598 bp), encoding the 152 aa NS4 protein. NS4 has previously been identified in several other orbiviruses and has been characterised in BTV and GIV [24,37,38]. The downstream Seg-9 ORF of CORV has a strong Kozak context (AGGAUGG) and is expected to produce a protein in infected cells. Weak or moderate Kozak sequences have also been observed in several of the genome segments of other orbiviruses, but they still appear to be translated effectively [24,39].
The G+C content of the CORV genome is 45.16%, which is considerably higher than that of other MBOs [Peruvian horse sickness virus (PHSV) and YUOV with 36.66% and 41.59% respectively] but within the overall G+C range of the insect-borne orbiviruses [36.66% in PHSV (mosquito), to 45.86% in Equine encephalosis virus (EEV) (Culicoides)]. However, it is lower than that of the tick-borne or tick-associated orbiviruses [57.29% in GIV and 51.93% in St Croix River virus (SCRV)].
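A genome-wide G+C percentage of this kind is the number of G and C residues divided by the total number of nucleotides, pooled over all ten segments. The following Python sketch shows one way such a value could be computed; the function names and the toy sequences are hypothetical, and this is not necessarily the procedure used by the authors.

```python
# Minimal G+C calculation pooled over genome segments.
# Function names and toy sequences are illustrative assumptions.
from typing import Iterable


def gc_counts(seq: str) -> tuple:
    """Return (number of G/C bases, number of unambiguous bases) for one sequence."""
    bases = [b for b in seq.upper() if b in "ACGTU"]
    gc = sum(1 for b in bases if b in "GC")
    return gc, len(bases)


def gc_content(seq: str) -> float:
    """G+C content of a single sequence, as a percentage."""
    gc, total = gc_counts(seq)
    if total == 0:
        raise ValueError("sequence contains no unambiguous nucleotides")
    return 100.0 * gc / total


def genome_gc_content(segments: Iterable[str]) -> float:
    """G+C content pooled over all segments (weighted by segment length)."""
    gc = total = 0
    for seg in segments:
        seg_gc, seg_total = gc_counts(seg)
        gc += seg_gc
        total += seg_total
    if total == 0:
        raise ValueError("no unambiguous nucleotides in any segment")
    return 100.0 * gc / total


# Toy example with two made-up 'segments'; real input would be the ten
# CORV-MRM1 segment sequences (GenBank KC853042 to KC853051).
print(f"{genome_gc_content(['GGCC', 'AAUU']):.2f}%")  # 50.00%
```

Pooling the counts, rather than averaging per-segment percentages, weights each segment by its length, which matters when segments range from 790 bp to 3,925 bp.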

Relationships between CORV and CMPV
The sequences of CORV-MRM1 Seg-4/VP4(Cap), Seg-6/VP5(OC2), and Seg-9/VP6(Hel)/NS4 were also compared with the incomplete sequence data available for CMPV (Table 4). CMPV again grouped closely with CORV-MRM1 (AUS1960/01), sharing aa/nt identities of 75/67.4% and 86.2/74.3% in VP4(Cap) and VP5(OC2) respectively, further supporting the identification of CMPV as a member of the Corriparta virus species. Seg-9 of CORV-MRM1 (coding for NS4 and VP6(Hel)) shares only 42.6/61.6% aa/nt identities with CMPV. However, no closer matches were found in either of these proteins with members of other Orbivirus species. Further analysis of CMPV Seg-9 confirmed the presence of an alternate ORF, encoding a 153 aa NS4 protein, which shares 55.9%/63.2% aa/nt identity with NS4 of CORV-MRM1.

Figure 2. Unrooted neighbour-joining tree for orbivirus subcore-shell T2 proteins. A phylogenetic tree was constructed using distance matrices, generated using the p-distance determination algorithm and pairwise deletion parameters in MEGA 5 (1000 bootstrap replicates) [65]. Since many of the available sequences are incomplete, the analysis is based on partial sequences (aa 393 to 548, numbered with reference to the aa sequence of BTV-VP3(T2)). The numbers at nodes indicate bootstrap confidence values after 1000 replications. The trees shown in Figures 3 and 4 were drawn using the same parameters. The CORV and CMPV isolates are shown in red font in the amber coloured circle. Full names of virus isolates and accession numbers of T2 protein sequences used for comparative analysis are listed in Table S1 (supplementary data). 'e' and 'w' after serotype number indicate eastern and western topotype strains, respectively. doi:10.1371/journal.pone.0070779.g002

Discussion
Collectively the orbiviruses infect a wide range of hosts and are transmitted by a diverse group of vectors, including Culicoides, mosquitoes, sand flies and ticks [7]. Initially, the different 'serogroups' of orbiviruses, which are now recognised as distinct virus species, were identified by CF, AGID, immunofluorescence (IF) tests and/or enzyme-linked immunosorbent assays (ELISA). However, low level serological cross-reactions have been detected between some of the more closely related Orbivirus species (for example between isolates of BTV (the Orbivirus type species) and EHDV [41]), making it difficult to conclusively identify the species of new isolates by these techniques alone. The high quality reference-strains and antisera needed for these assays are not widely available for all of the existing serogroups/species, and may themselves represent a biosecurity risk. In addition, serological methods do not generate absolute quantitative values for the relatedness of individual virus strains.
In contrast, nucleotide sequence data for reference orbivirus strains and novel isolates can be compared and transmitted easily between laboratories, without risk, providing highly reproducible and fully quantitative numerical values for the relatedness of each genome segment/protein. These data can also be used to unambiguously identify different genome segments, proteins, virus species, topotypes and serotypes [1,25,33,39,42].
Sequence variations in the outermost orbivirus capsid and cell-attachment protein, VP2(OC1) of BTV, correlate with both the geographic origin of the virus (topotype) and with its serotype [42,47]. In contrast, sequence variation in the core proteins VP1(Pol) and T2 (VP3 of BTV) correlates only with virus genus, species and topotype [1,32,33,39,40]. However, a lack of full-genome sequence data for all 22 recognized Orbivirus species has hindered identification of isolates belonging to novel Orbivirus species and development of nucleic acid based diagnostic tests.
The full-genome sequence data presented here for CORV-MRM1 (AUS1960/01) provides a primary reference for identification of other (novel) members of the species Corriparta virus (CORV). Conserved nucleotide sequences are present at both the upstream and downstream termini of the genome segments of different Orbivirus species [2,52]. All of the genome segments of CORV-MRM1 (AUS1960/01) contain the terminal sequences (5′-GUAUAG……CAAAGGAUAC-3′), showing several differences from those of BTV (e.g. 5′-GUUAAA……..ACUUAC-3′). The significance of a longer conserved 3′ terminal region in CORV-MRM1 is unknown, although it has been suggested that these regions may play a role in initiation of transcription or translation of the RNA or its packaging during virus replication [53,54]. The orbivirus proteins VP1(Pol), 'T2' and 'T13', which are highly conserved, have taken priority in development of molecular diagnostic assays and in phylogenetic analyses [24,29,31,32,55]. Studies with large numbers of different BTV and EHDV isolates show >73%, >83% and >73% intra-species aa identities in VP1, T2 and T13 respectively, providing useful markers for the identification and classification of existing and novel orbivirus isolates [1,40,44].
CORV-MRM1 shares less than 60% aa identity (Table 3) in VP1, T2 and T13, with members of the other recognised Orbivirus species, confirming the classification of Corriparta virus as a distinct species. CORV-MRM1 is most closely related to other MBO species, particularly Umatilla virus (UMAV), with 59.66% and 50.17% aa identity in VP1 and T2 proteins respectively (Table 3). However, CORV-MRM1 shows 91.6%/75.9% aa/nt identity and 91.7%/77.6% aa/nt identity to CMPV in its T2 and T13 protein/gene sequences, indicating that they belong to the same virus species and therefore that CMPV does not represent a new species as previously suggested [5].
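The species assignment described above amounts to comparing pairwise aa identities against the intra-species levels observed for BTV and EHDV (>73% in VP1, >83% in T2 and >73% in T13). The Python sketch below illustrates that comparison using only the identity values quoted in the text. The function name and data layout are assumptions, and the thresholds are treated here as working markers rather than formal cut-offs.

```python
# Illustrative comparison of aa identities against the intra-species levels
# reported for BTV and EHDV. Names and data layout are hypothetical.
INTRA_SPECIES_MIN_AA = {"VP1": 73.0, "T2": 83.0, "T13": 73.0}  # percent aa identity


def exceeds_intra_species_level(identities: dict) -> dict:
    """Map each protein to True if its aa identity exceeds the intra-species level."""
    return {
        protein: identity > INTRA_SPECIES_MIN_AA[protein]
        for protein, identity in identities.items()
    }


# aa identities of CORV-MRM1 to two other viruses, as quoted in the text.
corv_vs_cmpv = {"T2": 91.6, "T13": 91.7}
corv_vs_umav = {"VP1": 59.66, "T2": 50.17}

print("CMPV:", exceeds_intra_species_level(corv_vs_cmpv))
print("UMAV:", exceeds_intra_species_level(corv_vs_umav))
```

With these inputs both CMPV proteins exceed the intra-species levels, whereas both UMAV proteins fall below them, which is consistent with placing CMPV, but not UMAV, in the species Corriparta virus.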
Previous studies have suggested that the MBOs have evolved from tick-borne ancestors, with CBOs being last to evolve [39]. The concatemerisation of orbivirus genome segments and subsequent mutations may provide a mechanism that can progressively increase the size of individual genome segments [39,56]. It may therefore be significant that the size of OC1 increases in the order: TBOs (551 aa in GIV and 654 aa in SCRV), MBOs (755 aa in CORV to 881 aa in PHSV) and CBOs (961 aa in BTV to 1056 aa in AHSV-1) (Table 2).
Previous studies also indicate that the orbiviruses have evolved through a process of 'co-speciation' with their vectors [39]. Phylogenetic analyses of the conserved Pol, T2 and T13 proteins (presented here - Fig. 2, 3 and 4), show consistent grouping of the CBOs, MBOs and TBOs. In each case CORV-MRM1 groups with the other MBOs (WGRV, UMAV, PHSV and YUOV).
Corriparta viruses have been isolated in Australia, Africa and South America [11]. However, the data presented here clearly identify CMPV, which was isolated in North America [5], as a member of the species Corriparta virus. The occurrence of these closely related viruses in the Americas and Australia indicates that there has been spread of viruses between these regions, which could be due to movement of infected hosts or vectors. Similar movements are also suggested by the detection of other orbiviruses (UMAV and PHSV, and individual serotypes of BTV and EHDV) in more than one continent, e.g. in Australia, Africa and the Americas [1,28,57,58,59]. Additional strains of each serogroup/species, from different locations/origins, need to be isolated and characterised/sequenced to better understand their geographical distribution and its significance.

Figure 3. … proteins. An unrooted NJ phylogenetic tree for orbivirus VP1(Pol) proteins was constructed using a p-distance algorithm and pairwise deletion parameters, as indicated in Figure 1. The CORV-MRM1 isolate characterised in this study is indicated in red font in amber coloured circle. Full names of virus isolates and accession numbers of polymerase sequences used for comparative analysis are listed in Table S1 (supplementary data). 'e' and 'w' after serotype number indicate eastern and western topotype strains, respectively. doi:10.1371/journal.pone.0070779.g003

Figure 4. Unrooted neighbour-joining for orbivirus outer-core VP7(T13) proteins. An unrooted NJ phylogenetic tree for orbivirus VP7(T13) proteins was constructed using a p-distance algorithm and pairwise deletion parameters, as indicated in Figure 1. The CORV-MRM1 and CMPV isolates are shown in red font in amber coloured circle. Full names of virus isolates and accession numbers of T13 protein sequences used for comparative analysis are listed in Table S1 (supplementary data). 'e' and 'w' after serotype number indicate eastern and western strains, respectively. doi:10.1371/journal.pone.0070779.g004

The sequences and relative sizes of the VP1(Pol), T2 and OC1 proteins are important evolutionary markers that can help differentiate/group orbiviruses by species, serotype and topotype. The sequence data generated in this study will facilitate the use of phylogenetic analyses to identify other novel isolates belonging to the Corriparta virus species, as well as helping to identify the arthropod vectors involved in their transmission. Further studies are still needed to define the different serotypes and topotypes of CORV.

Virus propagation
CORV-MRM1 (AUS1960/01), obtained at passage level MB6/BHK2 from the Orbivirus Reference Collection at The Pirbright Institute, was propagated in BHK-21 cell monolayers [clone 13 obtained from European Collection of Animal Cell Cultures (ECACC 84100501)], in Dulbecco's minimum essential medium (DMEM) supplemented with antibiotics (100 units/ml penicillin and 100 µg/ml streptomycin) and 2 mM glutamine. Infected cell-cultures were incubated at 37°C until they showed widespread (100%) cytopathic effects (CPE). Then viruses were harvested, aliquoted and used for the extraction of viral dsRNA. All virus isolates used in these studies were obtained from diagnostic samples of naturally infected animals and were taken as a part of normal diagnostic investigations by qualified veterinarians in the individual countries.

Extraction and purification of CORV dsRNA
Cell monolayers showing 100% CPE after infection with CORV-MRM1 were harvested and pelleted at 3000 g for 5 min. The viral dsRNA was released and purified using TRIzol® reagent (Invitrogen) as described by Attoui et al. [60]. Briefly, the infected cell pellet was lysed in 1 ml of TRIzol®, then 0.2 volume of chloroform was added, vortexed and the mixture incubated on ice for 10 min. The aqueous phase, containing total RNA, was separated from the phenol-chloroform phase by centrifugation at 10,000 g for 10 min before 900 µl of isopropanol was added prior to incubation at -20°C for 2 hours. The RNA was pelleted at 10,000 g for 10 min, washed with 70% ethanol, air dried and dissolved in 100 µl of nuclease free water (NFW). Single stranded RNA (ssRNA) was removed by precipitation with 2 M LiCl (Sigma) at 4°C overnight, followed by centrifugation at 10,000 g for 5 min. An equal volume of isopropanol, containing 750 mM ammonium acetate, was mixed with the supernatant. After precipitation at -20°C for 2 hours, the viral dsRNA was pelleted at 10,000 g for 10 min, washed with 70% ethanol, air dried and suspended in 50 µl of NFW. The RNA was either used immediately or stored at -20°C.

Reverse transcription and PCR amplification
CORV-MRM1 genome segments were reverse-transcribed into cDNA using the full-length amplification (FLAC) technique described by Maan et al. [61]. Briefly, a 35 base oligonucleotide 'anchor-primer', with a phosphorylated 5′ terminus, was ligated to the 3′ ends of the viral dsRNAs using T4 RNA ligase overnight at 16°C. Then dsRNA segments were fractionated on 1% agarose gel and recovered from the gel using a 'silica binding' method (RNaid® kit, MP Biomedicals) as per the manufacturer's instructions. The dsRNA, eluted in NFW, was denatured at 99°C for 5 minutes, and then snap chilled on ice before synthesising first-strand cDNA using RT system (Promega). The resulting cDNAs were amplified using primers complementary to the anchor primer and high fidelity KOD polymerase enzyme (Novagen). PCR amplicons were analyzed by agarose gel electrophoresis.

Cloning and sequencing of cDNA segments
cDNA amplicons were purified and cloned into the 'pCR®-Blunt' vector supplied with the Zero Blunt® PCR Cloning Kit (Invitrogen). Recombinant plasmid-vectors containing CORV-MRM1 inserts were transformed into One Shot® TOP10 competent cells supplied with the cloning kit. Clones containing the desired inserts were identified by colony touch PCR using M13 universal primers. Plasmids were extracted from the clones identified, using the QIAprep Spin MiniPrep Kit (Qiagen). The plasmids and PCR products were sequenced using an automated ABI 3730 DNA sequencer (Applied Biosystems).

Sequence and phylogenetic analysis
'Raw' ABI sequence data was assembled into 'contigs' using the SeqManII sequence analysis package (DNAstar version 5.0). The ORFs of CORV-MRM1 genome segments were identified and translated to aa sequences for further analysis using EditSeq (DNAstar version 5.0). The putative function of each protein was identified by BLASTX comparisons to homologous orbivirus (BTV) proteins in GenBank (http://blast.ncbi.nlm.nih.gov/Blast.cgi?CMD=Web&PAGE_TYPE=BlastHome). Multiple alignments of consensus sequences were performed using Clustal X (Version 2.0) [62], Clustal Omega (http://www.ebi.ac.uk/Tools/msa/clustalo/) and MAFFT [63] to ensure proper alignment. Aligned protein sequences were back translated to nucleotide sequences using DAMBE [64] or the RevTrans 1.4 server available online (http://www.cbs.dtu.dk/services/RevTrans/) for further nucleotide analysis. Pairwise distance (aa and nt) calculations and phylogenetic tree construction were performed using MEGA 5 software [65] with the p-distance parameter and neighbour-joining method [66]. GenBank nucleotide accession numbers of polymerase (VP1), T2 and T13 protein sequences that were used in phylogenetic analyses are provided in Table S1 (supplementary data).
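The p-distance used for the pairwise comparisons is the proportion of aligned sites at which two sequences differ; with pairwise deletion, sites where either sequence has a gap or missing data are excluded for that pair only. The Python sketch below illustrates the calculation under those assumptions. It is not the MEGA 5 implementation, and the function names and toy alignment are hypothetical.

```python
# Illustrative p-distance with pairwise deletion of gapped/missing sites.
# This is a sketch of the general technique, not MEGA's implementation;
# function names and the toy alignment below are hypothetical.
MISSING = {"-", "?"}  # characters treated as gaps or missing data


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or missing-data symbol are
    skipped for this pair only (pairwise deletion).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in MISSING or b in MISSING:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")
    return differences / compared


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage identity, i.e. 100 * (1 - p-distance)."""
    return 100.0 * (1.0 - p_distance(seq_a, seq_b))


# Toy aligned peptides: one gapped site is skipped, one of five sites differs.
print(f"{p_distance('MKT-LV', 'MRTALV'):.2f}")        # 0.20
print(f"{percent_identity('MKT-LV', 'MRTALV'):.1f}")  # 80.0
```

The same function applies to nucleotide alignments, so aa and nt identities of the kind reported in the text can be obtained from the protein alignment and its back-translated codon alignment, respectively.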

Supporting Information
Table S1 Nucleotide accession numbers for sequences used in phylogenetic analysis. (DOCX)