RNA Viral Metagenome of Whiteflies Leads to the Discovery and Characterization of a Whitefly-Transmitted Carlavirus in North America

  • Karyna Rosario,

    Affiliation College of Marine Science, University of South Florida, Saint Petersburg, Florida, United States of America

  • Heather Capobianco,

    Affiliation Department of Plant Pathology, University of Florida, Gainesville, Florida, United States of America

  • Terry Fei Fan Ng,

    Current address: Blood Systems Research Institute, San Francisco, California, United States of America

    Affiliation College of Marine Science, University of South Florida, Saint Petersburg, Florida, United States of America

  • Mya Breitbart,

    Affiliation College of Marine Science, University of South Florida, Saint Petersburg, Florida, United States of America

  • Jane E. Polston

    Affiliation Department of Plant Pathology, University of Florida, Gainesville, Florida, United States of America


Abstract

Whiteflies from the Bemisia tabaci species complex have the ability to transmit a large number of plant viruses and are some of the most detrimental pests in agriculture. Although whiteflies are known to transmit both DNA and RNA viruses, most of the diversity has been recorded for the former, specifically for the Begomovirus genus. This study investigated the total diversity of DNA and RNA viruses found in whiteflies collected from a single site in Florida to evaluate if there are additional, previously undetected viral types within the B. tabaci vector. Metagenomic analysis of viral DNA extracted from the whiteflies only resulted in the detection of begomoviruses. In contrast, whiteflies contained sequences similar to RNA viruses from divergent groups, with a diversity that extends beyond currently described viruses. The metagenomic analysis of whiteflies also led to the first report of a whitefly-transmitted RNA virus similar to Cowpea mild mottle virus (CpMMV Florida) (genus Carlavirus) in North America. Further investigation resulted in the detection of CpMMV Florida in native and cultivated plants growing near the original field site of whitefly collection and determination of its experimental host range. Analysis of complete CpMMV Florida genomes recovered from whiteflies and plants suggests that the current classification criteria for carlaviruses need to be reevaluated. Overall, metagenomic analysis supports that DNA plant viruses carried by B. tabaci are dominated by begomoviruses, whereas significantly less is known about RNA viruses present in this damaging insect vector.

Introduction

The majority of vectored plant viruses are transmitted by hemipteran insects, whose piercing-sucking mouthparts allow efficient transmission [1]. Whiteflies (Aleyrodidae), in particular the Bemisia tabaci species complex, are among the most detrimental insect vectors causing considerable economic losses to multiple agricultural industries [2], [3]. Whiteflies damage crops directly through feeding, which can weaken plants and elicit undesirable plant responses [4], and through depositing excreta that favors sooty mold production. In addition, whiteflies indirectly damage crops by transmitting pathogenic viruses [2], [3], [5]. Viruses are responsible for almost half of the emerging diseases affecting plants and whitefly-transmitted viruses are some of the most devastating agents affecting cash crops [6].

Among the large diversity of viral types known to infect plants, only DNA viruses belonging to the genus Begomovirus (Geminiviridae) and a small diversity of RNA viruses have been associated with whiteflies. Whiteflies are known to transmit more than 280 begomovirus species [5], [7] and, to our knowledge, no other DNA viral groups have been detected in whiteflies. The emergence of begomoviruses as important pathogens is closely associated with the increased prevalence of highly polyphagous whitefly species [5]. Whiteflies feed on a large number of cultivated and native plant species and thus may provide the opportunity to transmit viruses among a variety of hosts, including wild and cultivated vegetation [3], [8], [9]. The ability of whiteflies to transmit begomoviruses into diverse hosts, as well as the high potential for co-infection and recombination opportunities, may have contributed to the emergence of the genus Begomovirus as the group of plant viruses with the largest number of recognized species [5], [7]. In contrast to the species-rich, genus-specific association of whiteflies with begomoviruses, a low species diversity of RNA viruses is known to be vectored by whiteflies. There are four genera of RNA viruses (each with fewer than 15 species) known to be transmitted by whiteflies, namely: Crinivirus (Closteroviridae; 12 species), Carlavirus (Betaflexiviridae; 1 species), Ipomovirus (Potyviridae; 4 species), and Torradovirus (Secoviridae; 4 species) [5].

It is possible that the diversity of whitefly-transmitted viruses reported to date does not accurately represent the total complement of viruses in this insect vector. Viral types known to be carried by B. tabaci may instead reflect methodological limitations, since commonly used methods can only detect close relatives of known vector-transmitted viruses (e.g., PCR with degenerate primers designed based on known sequences). In addition, agricultural surveillance efforts typically focus on viral species that negatively affect economically important crops. Therefore, it is likely that viruses present in native vegetation that do not show any impact on agricultural crops in the area will be overlooked. Since viruses found in native vegetation may emerge as serious pathogens for crops and asymptomatic hosts may facilitate virus spread by serving as reservoirs [10]–[12], there is a critical need to investigate the total community of DNA and RNA viruses associated with whiteflies in a given area. This endeavor can be accomplished by applying the vector-enabled metagenomics approach (VEM; where viruses are purified and sequenced directly from insect vectors) using whiteflies. The main advantage of VEM is that it allows the detection of viruses carried by insect vectors without a priori knowledge of the plant pathogens present in a given area [13]. Moreover, the VEM approach does not depend on the collection of foliar tissue exhibiting virus-like infection symptoms to detect viruses present in an area, circumventing limitations associated with sampling individual plants and enabling the identification of asymptomatic infections. VEM using a small sequencing effort has been successfully implemented to identify whitefly-transmitted begomoviruses infecting both commercial crops and native vegetation [13].

In an effort to shed light on the total diversity of viruses carried by whiteflies, this study incorporated high-throughput sequencing into the VEM approach to detect DNA and RNA viruses in B. tabaci specimens collected from an experimental field site in Florida. Begomoviruses were the only DNA plant viruses detected, whereas known and novel RNA viruses from different families were found in whiteflies from this single field site. Furthermore, sequencing efforts resulted in the detection and first report of a whitefly-transmitted carlavirus most similar to Cowpea mild mottle virus (CpMMV) in North America. Although the CpMMV Florida isolate was originally detected in whiteflies, it was subsequently identified in wild and cultivated plants from the same area and its host range was experimentally determined. Analysis of the CpMMV Florida genome suggests that the current classification criteria for carlaviruses need to be reevaluated.

Materials and Methods

Whiteflies: Sample collection, processing, and metagenomic sequencing

The B. tabaci specimens used for viral metagenomics were collected in an experimental field in Citra, Florida (29°24′N 82°06′W) in August 2007 as previously described [13]. Briefly, adult whiteflies were collected from soybean and volunteer watermelon plants using a battery-operated vacuum. The whiteflies were manually inspected using a Nikon model C-DSD115 stereoscope, and debris and other insects were removed before storage at −80°C. A subset of the whiteflies was used in a pilot study investigating DNA viruses [13], while the remainder were processed for the present study as described below.

Virus particles were partially purified from the whiteflies before nucleic acid extraction and sequencing. For this purpose, approximately 250 whiteflies were homogenized in SM Buffer (50 mM Tris·HCl, 10 mM MgSO4, 0.1 M NaCl, pH 7.5) using a bead-beater (BioSpec) with 1.0 mm glass beads (Research Products International) for 1 min. Cells were removed from homogenates by centrifuging at 10,000 × g for 10 min and filtering the supernatant through a 0.22 µm Sterivex filter (Millipore). Virus particles present in the filtrate were treated with 0.2 volumes of chloroform, followed by DNase I (2.5 U/µl) and RNase A (0.25 U/µl) treatment at 37°C for 3 hrs to remove non-encapsidated nucleic acids.

DNA and RNA were simultaneously extracted from purified virus particles using the All Prep DNA/RNA Mini Kit (Qiagen) following manufacturer's instructions and sequenced individually. For the RNA fraction, the ‘on-column DNase digestion’ step was used to minimize DNA carryover. The extracted DNA fraction was amplified using the GenomiPhi V2 DNA Amplification Kit (GE Healthcare) followed by further amplification and fragmentation using the GenomePlex Whole Genome Amplification (WGA) Kit (Sigma-Aldrich). The RNA fraction was amplified using the TransPlex Whole Transcriptome Amplification (WTA) Kit (Sigma-Aldrich). WGA- and WTA-amplified nucleic acids were used for next-generation sequencing using a single lane of a Genome Analyzer IIx System (Illumina) by multiplexing.

Metagenomic data analysis

WGA and WTA adapter sequences as well as multiplexing barcodes were removed from the DNA and RNA sequence libraries, respectively, using the TagCleaner server [14]. Trimmed sequences from both DNA and RNA libraries are publicly available on the Metavir website under the project ‘Whiteflies_Citra_2007’. Sequences (1.4 million from the DNA library and 2.1 million from the RNA library) were then assembled with a minimum identity of 95% over 25 bp using the Geneious software package (Biomatters). Contigs over 80 bp in length were compared against the GenBank non-redundant database in June 2011 using either BLASTn (DNA library) or BLASTx (RNA library) with an e-value cut-off of E < 0.001 [15]. BLAST results were summarized and inspected using the Metagenome Analyzer (MEGAN4) software [16] to identify viral sequences. The top viral match for each contig was accepted only if the score for the top virus hit was at least 10% higher than the next best hit; otherwise, the contig was annotated as “unassigned”. In most cases where the BLAST scores were within 10% of each other, the viral matches belonged to the same genus and thus the genus was identified.
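The acceptance rule for top viral matches can be illustrated with a short Python sketch. This is not the authors' pipeline (they inspected BLAST results in MEGAN4); the `Hit` record, the `assign_contig` function, and the interpretation of "at least 10% higher" as `top >= 1.10 × runner-up` are assumptions made for illustration, and the genus fallback only compares the top two hits.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    """Hypothetical record for one BLAST hit of a contig."""
    virus: str
    genus: str
    score: float  # BLAST score of the hit


def assign_contig(hits: List[Hit], margin: float = 0.10) -> str:
    """Accept the top hit only if it beats the runner-up by `margin`.

    If the two best scores are within `margin` of each other but both
    hits belong to the same genus, the genus is reported instead.
    Otherwise the contig is labeled "unassigned".
    """
    if not hits:
        return "unassigned"
    ranked = sorted(hits, key=lambda h: h.score, reverse=True)
    top = ranked[0]
    if len(ranked) == 1:
        return top.virus
    runner_up = ranked[1]
    # Assumed reading of "at least 10% higher than the next best hit".
    if top.score >= runner_up.score * (1.0 + margin):
        return top.virus
    if top.genus == runner_up.genus:
        return top.genus
    return "unassigned"
```

With illustrative (made-up) scores, a contig whose two best hits score 105 and 100 and both fall in Carlavirus would be reported at the genus level only.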

CpMMV Florida isolate genome completion

The majority of contigs from the RNA library with significant matches to viral sequences were similar to the carlavirus CpMMV. To sequence the full genome of this virus, contigs with similarities to CpMMV were ordered according to the genomic positions they matched in a CpMMV reference genome from Africa (NC_014730). Primer pairs were designed to bridge the gaps between contigs, and primer pairs that spanned the entire genome were used to complete the genome (Table S1). cDNA for PCR reactions was produced from RNA extracted from purified virus particles using a SuperScript III First-Strand Synthesis System kit (Invitrogen). All PCR reactions contained 1 µl cDNA, 1 U Apex Red Taq Polymerase (Genesee), 1× NH4 buffer, 1.5 mM MgCl2, and 0.5 µM of each primer. Amplification was performed with an initial denaturation at 94°C for 5 min followed by 35 cycles of 94°C for 45 sec, 50°C for 45 sec (incrementally decreasing the temperature by 0.1°C each cycle), 72°C for 1.5 min, followed by a final extension at 72°C for 8 min. The 5′ end of the genome was completed using gene-specific primers with the 5′ RACE System Kit (Invitrogen) according to manufacturer's instructions (Table S1). All PCR products were cloned using the TOPO TA system (Invitrogen) and Sanger sequenced. PCR product sequences were assembled using Sequencher 4.7 (Gene Codes) and the complete genome was annotated using SeqBuilder (DNASTAR). Each region of the genome had at least 4× sequence coverage.
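The touchdown-style annealing step described above (50°C, lowered by 0.1°C each cycle over 35 cycles) can be made concrete with a small Python sketch. The function names are hypothetical, and the sketch assumes the first cycle anneals at exactly 50°C and that ramp times between steps are ignored.

```python
from typing import List


def annealing_schedule(start_c: float = 50.0,
                       step_c: float = 0.1,
                       cycles: int = 35) -> List[float]:
    """Annealing temperature (°C) for each cycle.

    Assumes cycle 1 uses `start_c` and every later cycle is
    `step_c` cooler than the one before it.
    """
    return [round(start_c - step_c * i, 1) for i in range(cycles)]


def cycling_minutes(cycles: int = 35,
                    denature_s: float = 45,
                    anneal_s: float = 45,
                    extend_s: float = 90,
                    initial_min: float = 5.0,
                    final_min: float = 8.0) -> float:
    """Total programmed hold time in minutes, excluding ramp times."""
    per_cycle_min = (denature_s + anneal_s + extend_s) / 60
    return initial_min + cycles * per_cycle_min + final_min
```

Under these assumptions the last cycle anneals at 46.6°C, and the programmed hold times add up to 118 minutes.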

Survey and isolation of CpMMV from wild and cultivated vegetation

To investigate the presence of the carlavirus CpMMV in the vegetation, 90 plants were surveyed in the summer of 2011 from an area around the field site in Citra, FL where whiteflies were originally collected. Leaves exhibiting a variety of viral infection symptoms (e.g., mottling, leaf curling) were collected from plants belonging to the Fabaceae family which are known hosts of CpMMV, namely peanuts (Arachis hypogaea L.; n = 71), hairy indigo (Indigofera hirsuta L.; n = 14), and dixie ticktrefoil (Desmodium tortuosum (Sw.) DC.; n = 5). All plant tissues were tested for the presence of CpMMV using a CpMMV-specific ELISA Reagent Set (Neogen Europe Ltd) in accordance with manufacturer's protocols. Samples were considered positive when their absorbance values were greater than the mean of the negative controls plus three standard deviations. Positive samples were verified through a degenerate carlavirus RT-PCR assay targeting part of the capsid protein (CP) and the 3′ end poly-A tail of these RNA genomes [17]. Briefly, RNA was extracted from plant tissues using TRI Reagent following manufacturer's protocols (Ambion Inc.). Reverse transcription was performed using ImProm-II™ Reverse Transcriptase (Promega) with the oligo-d(T21) primer according to manufacturer's protocols. The cDNA was used for PCR with the Carla-CP (5′GGBYTNGGBGTNCCNACNGA3′) and oligo-d(T21) primers under the following conditions: 0.5 µl cDNA, Taq DNA Polymerase (New England Biolabs), 1× standard Taq (Mg-free) buffer, 3.0 mM MgCl2, and 1 µM spermidine. Amplification was performed with an initial denaturation at 94°C for 5 min, followed by 35 cycles of 94°C for 1 min, 50°C for 1 min, and 72°C for 1 min, ending with a final extension at 72°C for 5 min.
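The ELISA positivity criterion (absorbance greater than the mean of the negative controls plus three standard deviations) can be sketched in Python as follows. The function names are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not state which form was used.

```python
from statistics import mean, stdev
from typing import Sequence


def elisa_cutoff(negative_controls: Sequence[float]) -> float:
    """Mean of the negative-control absorbances plus three standard deviations.

    Uses the sample standard deviation (assumption; needs >= 2 controls).
    """
    return mean(negative_controls) + 3 * stdev(negative_controls)


def is_positive(absorbance: float, negative_controls: Sequence[float]) -> bool:
    """A sample is positive when its absorbance exceeds the cutoff."""
    return absorbance > elisa_cutoff(negative_controls)
```

For example, with illustrative negative-control readings of 0.10, 0.12, and 0.08, the mean is 0.10 and the sample standard deviation is 0.02, giving a cutoff of about 0.16.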

One D. tortuosum plant sample that tested positive for CpMMV by ELISA was used to establish a culture and obtain inoculum for transmission and host range determination experiments. The sample of D. tortuosum was mechanically inoculated to Chenopodium quinoa L., which is an established local lesion host for CpMMV [18], using a 1∶5 dilution of tissue to phosphate buffer (100 mM K2HPO4, 100 mM Na2HPO4, 10 mM Na2SO3, pH 7.4). Eight days later, chlorotic local lesions were observed on inoculated leaves of C. quinoa. Four of these lesions were removed and individually mechanically inoculated to primary leaves of common bean (Phaseolus vulgaris L. ‘Topcrop’), which is a known systemic host of some CpMMV isolates [19]–[22]. Three bean plants exhibited virus-like symptoms from this inoculation and tested positive for CpMMV by ELISA and RT-PCR.

Whitefly transmission of isolated CpMMV

Viral isolates from each of the three infected bean plants were transmitted to new bean plants using B. tabaci (Mediterranean/Asia Minor/Africa clade, formerly known as B. tabaci Biotype B). For this purpose, infected beans were placed in separate cages in different growth rooms for acquisition and transmission. Transmissions were performed at different times throughout the day to prevent contamination through whitefly carryover between rooms. Non-viruliferous whiteflies were placed on each infected bean and given an acquisition access period of 20 min. Whiteflies were then transferred to three healthy beans and given an inoculation access period of 4 hrs. Transmission was terminated using insecticidal soap (20 ml/L Safer Soap®) and Imidacloprid (0.2% active ingredient formulation, applied as a 30 ml per plant drench). The presence of CpMMV was confirmed in all three whitefly-inoculated beans by RT-PCR. The CpMMV genome was sequenced from each of these bean plants through PCR using the same primers used to sequence the CpMMV genome from whiteflies (Table S1).

Experimental CpMMV Florida host range

A variety of hosts were selected for experimental infectivity assays based on previously reported hosts for isolates of CpMMV (Table 1) [20]. Bean leaf tissue infected with an isolate of CpMMV from D. tortuosum was collected 19 days post inoculation, frozen and used as the inoculum source for all inoculations. Three to four experimental host species were tested at a time. Five to twenty plants of each species were mechanically inoculated at the first true leaf stage. At the same time, three to five plants of each test species were mock-inoculated to serve as negative controls and three to five common bean plants were inoculated to serve as positive controls for the quality of the inoculum. Plants were visually assessed daily and systemic symptoms were recorded at 14 days post inoculation. Plants were then sampled and tested for the presence of CpMMV by ELISA. Inconclusive results based on ELISA were further tested by RT-PCR.

Table 1. Responses observed in a range of selected host plants mechanically inoculated with the Cowpea mild mottle virus Florida isolate.

CpMMV Florida genome, pairwise comparisons, and phylogenetic analysis

The CpMMV Florida genomes sequenced from whiteflies and bean plants as well as their predicted protein sequences were compared against known members of the Carlavirus genus. Predicted protein sequences were compared against the Pfam database [23] to identify conserved motifs. For all pairwise comparisons, alignments were performed using the MUSCLE algorithm [24] implemented in MEGA5 [25]. Pairwise distances were calculated in MEGA5 using p-distance and pairwise deletion of gaps. For phylogenetic analysis of the capsid protein, alignments were optimized using the PRALINE server [26] with default settings. A maximum likelihood tree was constructed using the PhyML online server [27] with the (LG+I+G+F) model chosen as the best-fit substitution model according to ProtTest [28]. The approximate likelihood ratio test (aLRT) was used to assess branch support [29].
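As a rough illustration of the distance measure named above, the following Python sketch computes a p-distance with pairwise deletion of gaps for two already-aligned sequences. It is not MEGA5 code; the function names are hypothetical, and only the `-` character is treated as a gap.

```python
def p_distance(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Proportion of differing sites between two aligned sequences.

    Pairwise deletion: a site is skipped only when one of the two
    sequences being compared has a gap at that position.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    differences = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # site excluded for this pair only
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Pairwise identity in percent, i.e. (1 - p-distance) * 100."""
    return (1.0 - p_distance(seq_a, seq_b)) * 100.0
```

For the toy alignment `ACGT-A` versus `ACCTTA`, five sites are compared and one differs, so the p-distance is 0.2 and the pairwise identity is 80%.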

Results

Viruses identified in whiteflies

VEM revealed a diversity of DNA and RNA plant viruses present in whiteflies collected from a single site in Citra, Florida. Viral sequences in the DNA library displayed high levels of similarity to previously described viruses, enabling their identification through BLASTn searches, while RNA sequences had to be identified through BLASTx due to limited similarities to known sequences. Viral contigs (n = 259) in the DNA library were dominated by begomoviruses (Geminiviridae; 97.3%), the majority of which shared >88% nucleotide identity with their top match in the database (Table S2). Note that short reads hinder any definitive classification of begomovirus species or strains; therefore, Table S2 only provides an overview of potential begomovirus types detected in whiteflies. In addition to begomoviruses, five contigs were most similar to novel begomovirus-associated satellites, Whitefly VEM Satellites, discovered in a nearby field [13]. Only two contigs were not related to plant viruses, including a single-stranded DNA bacteriophage and a human virus.

Although fewer viral contigs were recovered from the RNA library (n = 64) compared to the DNA library, the RNA sequences encompassed a broader viral diversity at the family level. The viral sequences identified in the RNA library had similarities to viruses from at least five different families (Betaflexiviridae, Closteroviridae, Bunyaviridae, Bromoviridae, Virgaviridae), three of which (Bunyaviridae, Bromoviridae, Virgaviridae) have not been detected in whiteflies previously (Table 2). Most of the identified RNA viruses were similar to plant viruses and contigs similar to the carlavirus CpMMV dominated the viral sequences. Viral sequences similar to plant viruses known to be transmitted by whiteflies, including criniviruses and CpMMV, had high amino acid identities (up to 100%) with their top match in the database. In contrast to the DNA library, many of the RNA viral contigs (33%) were highly divergent from known species since they shared less than 45% amino acid identity with their top match in the database. Several contigs had low identities to double-stranded RNA viruses, Circulifer tenellus virus 1 and Spissistilus festinus virus 1, recently discovered in plant-feeding hemipteran pests; however, it remains unknown whether these viruses replicate in insect cells or those of associated microorganisms [30]. Only three contigs had similarities to viruses that infect hosts other than plants or insects, including diatoms (Rhizosolenia setigera RNA virus) and humans (Uukuniemi virus and Armero virus). However, due to low amino acid identities, it is possible that these sequences represent novel plant or whitefly viruses.

Table 2. Plant or insect RNA viruses identified in whiteflies and amino acid (aa) identity ranges.

Isolation and experimental host range determination of CpMMV Florida

Since CpMMV-like sequences were abundant in whiteflies and this virus had never been reported in North America, a survey of 90 symptomatic plants from three different species using ELISA with a CpMMV antibody was conducted in the same crop field four years after the whitefly collection. Thirty-eight percent of A. hypogaea (n = 71), 36% of I. hirsuta (n = 14), and 100% of D. tortuosum (n = 5) plants tested positive for CpMMV. Note that infection symptoms observed in the field may not have been caused by CpMMV. Since all of the samples of D. tortuosum tested positive for CpMMV, an infected seedling from this species was used to establish a culture and obtain virus inoculum. This plant was used to mechanically infect C. quinoa and local lesions were subsequently inoculated into common bean plants (P. vulgaris). To confirm infection by CpMMV, the common bean plants were tested by ELISA and a degenerate carlavirus RT-PCR assay. Once the presence of CpMMV was confirmed, the CpMMV Florida isolate was successfully transmitted to three common bean plants using whiteflies. All three plants exposed to CpMMV-bearing whiteflies exhibited mild mottling symptoms and were verified as infected with CpMMV based on ELISA and RT-PCR.

To determine host range, a CpMMV isolate from a common bean plant infected through whitefly transmission was used to inoculate 18 species of experimental hosts belonging to five different families (Amaranthaceae, Chenopodiaceae, Cucurbitaceae, Fabaceae, Solanaceae). Ten of the 18 species tested were successfully infected by the CpMMV Florida isolate, all of which belonged to the Chenopodiaceae and Fabaceae families (Table 1). Six of the ten infected species did not show any visible symptoms of infection and had the same appearance as the negative controls; only four species exhibited local or systemic symptoms of infection. C. quinoa showed local chlorotic lesions on inoculated leaves, Vigna unguiculata exhibited both local and systemic symptoms, whereas Glycine max and Pisum sativum only displayed systemic symptoms (Table 1).

CpMMV Florida genome

PCR primers (Table S1) were used to obtain and sequence the entire CpMMV Florida genome from the field-collected whiteflies and each of the three bean plants experimentally infected with CpMMV Florida using whiteflies. The CpMMV genomes sequenced from bean plants, CpMMV Florida [Beans_2011] (Accession no. KC774020), are 100% identical to each other and share 99% genome-wide nucleotide identity with the genome retrieved from whiteflies collected four years earlier, CpMMV Florida [Whiteflies_2007] (Accession no. KC774019). The CpMMV Florida genomes exhibit a genome organization identical to that of members of the Carlavirus genus, including six open reading frames (ORFs) encoding the following proteins from 5′ to 3′: replication polyprotein, movement proteins [i.e., triple gene block (TGB)], capsid protein (CP) and nucleic acid binding (NB) protein (Fig. 1). Among the carlaviruses, the CpMMV Florida genomes are most closely related to the only other complete CpMMV genome sequence that was available at the time the analysis was performed, an isolate from Ghana (NC_014730) [31], with which they share 67.5% genome-wide pairwise identity. Although the average amino acid pairwise identity for the complete protein complement of these two viral genomes is ∼62%, the CP exhibits 95% identity (Fig. 1). Phylogenetic analysis of the CP of different carlavirus species also supports identification of the Florida carlavirus as an isolate of CpMMV (Fig. 2). Based on pairwise distances among available CpMMV CP and NB sequences (Tables S3 and S4), the CpMMV Florida isolate may be more closely related to isolates from South America (Brazil) and the Caribbean (Puerto Rico) than to the Ghana isolate, since it shares 98–99% amino acid identity with these isolates.

Figure 1. Schematic genome organization of Cowpea mild mottle virus (CpMMV) isolates identified in Ghana (top) and Florida (middle) as well as the general genome organization observed in carlavirus species (bottom; exact species shown in phylogenetic tree in Figure 2).

Each block represents identified open reading frames (ORFs) in these genomes and the encoded protein names are given on top of the CpMMV Florida panel, including the replication polyprotein, triple gene block (TGB), capsid (CP), and nucleic acid binding (NB) protein. The four different domains within the replication polyprotein exhibiting viral methyltransferase, peptidase, superfamily one (SF1) helicase, and RNA-dependent RNA polymerase (polymerase) motifs are indicated. Percentage values in the CpMMV Ghana genome schematic represent amino acid pairwise identities among individual ORFs of CpMMV Ghana and CpMMV Florida. Percentages in the Carlaviruses genome schematic represent amino acid pairwise identity ranges observed between different carlavirus species and CpMMV Florida. ORFs highlighted in grey represent ORFs without a standard start codon (a lighter grey in the carlaviruses panel indicates that some carlavirus species have a standard start codon while others do not).

Figure 2. Maximum likelihood phylogenetic tree of capsid proteins (CP) found in different carlavirus species.

Cowpea mild mottle virus (CpMMV) isolates identified in Florida, USA in this study are highlighted in grey. Branch support was assessed with the approximate likelihood ratio test and values >70% are shown.

Searches in the Pfam database using predicted amino acid sequences for each of the six ORFs present in the CpMMV Florida genomes revealed significant matches (e-value≪0.001) to conserved motifs observed in carlaviruses. The replication polyprotein contains four different domains characterized by viral methyltransferase [32], carlavirus endopeptidase (family C23 peptidase) [33], superfamily one helicase [34], and supergroup three RNA-dependent RNA polymerase [core motif: TGX3TX3NTX22GDD, where ‘X’ represents any amino acid residue] [35] motifs. Downstream from the replication polyprotein, there is a triple gene block involved in cell-to-cell movement with characteristic motifs of filamentous viruses, specifically the ‘potex-like’ class [36]. The first block contains NTPase/helicase sequence domains belonging to superfamily one helicases. The second and third genes contain the signature sequences GDX6GGXYXDG and CX5GX8C, respectively. Similar to several carlavirus species, the third gene in the TGB of the CpMMV Florida isolate lacks a standard start codon. The capsid protein exhibits both carlavirus- and potexvirus-specific domains and the carlavirus capsid signature sequence ‘GLGVPTE’ [17]. The putative NB protein encoded at the 3′ end of the CpMMV Florida genome, whose presence distinguishes virus species in the Carlavirus genus from other members of the Betaflexiviridae with similar genome organization (i.e., foveaviruses) [37], exhibits four characteristic cysteine residues in the pattern CX2CX12CX4C [38].
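The signature sequences quoted above use a compact notation in which ‘X’ followed by a number stands for that many arbitrary residues. A minimal Python sketch, assuming that notation and using hypothetical helper names, shows how such signatures could be turned into regular expressions and searched for in a protein sequence. It is not the Pfam search used in the study, and the example sequence in the usage note is synthetic.

```python
import re

# Signature sequences quoted in the text (X = any residue, Xn = n residues).
SIGNATURES = {
    "RdRp core motif": "TGX3TX3NTX22GDD",
    "TGB second gene": "GDX6GGXYXDG",
    "TGB third gene": "CX5GX8C",
    "capsid signature": "GLGVPTE",
    "NB cysteine pattern": "CX2CX12CX4C",
}


def motif_to_regex(motif: str) -> str:
    """Convert 'CX2CX12CX4C'-style notation into a regular expression.

    Each 'X' (optionally followed by a count) becomes '[A-Z]{count}';
    an 'X' without a count matches exactly one residue.
    """
    return re.sub(r"X(\d*)",
                  lambda m: "[A-Z]{%s}" % (m.group(1) or "1"),
                  motif)


def has_motif(protein: str, motif: str) -> bool:
    """True if the protein sequence contains the signature anywhere."""
    return re.search(motif_to_regex(motif), protein.upper()) is not None
```

For instance, a synthetic string built as `"C" + "AA" + "C" + "A" * 12 + "C" + "A" * 4 + "C"` satisfies the NB cysteine pattern, whereas a string with different spacing between the cysteines does not.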

Discussion

Diversity of viruses identified in whiteflies

The VEM approach has been introduced as a strategy to survey viruses carried by insect vectors in a given region without a priori knowledge of the viral types present [13], [39]. Here VEM was used to detect both DNA and RNA plant viruses found in whiteflies collected from an experimental field station in Citra, FL. Strikingly, all of the DNA plant viruses identified with this deep Solexa sequencing effort were limited to the well-established whitefly-transmitted genus Begomovirus with high nucleotide identities (>88%) to known viral species. Although the short reads (maximum fragment size of 71 nt) hindered our ability to conclusively identify these begomoviruses to the species level even after assembly, results indicated that DNA viruses present in whiteflies from this site are dominated by members of a single genus. On the other hand, a diversity of RNA viral sequences from various families was detected and the VEM approach ultimately led to the discovery of the whitefly-transmitted carlavirus CpMMV in Florida, USA.

Since the VEM approach does not rely on sequence-specific primers/probes, this method should have recovered any type of DNA virus present in the whiteflies whose virions are resistant to chloroform and nuclease treatment. However, the use of a 0.22 µm filter during virus particle purification may have excluded larger non-plant DNA viruses which are commonly associated with insects (e.g., baculoviruses). The fact that only begomoviruses were identified suggests that this group indeed dominates the whitefly-transmitted DNA plant viruses, and perhaps exclusively occupies this niche. Future studies investigating plant DNA viruses in whiteflies from different locations using sequence-independent methods are needed to confirm whether or not whiteflies have evolved an exclusive relationship with begomoviruses.

In contrast to DNA viruses, the RNA library indicated that whiteflies from a single site can carry RNA viruses from disparate families. Currently, whiteflies are known to transmit four different groups of RNA viruses, including filamentous viruses from the Ipomovirus (Potyviridae), Crinivirus (Closteroviridae) and Carlavirus (Betaflexiviridae) genera, and icosahedral viruses from the Torradovirus genus (Secoviridae) [5]. Two of these groups were identified in the RNA metagenomic library, with high amino acid identities to known viruses including two criniviruses (Lettuce chlorosis virus (LCV) and Tomato chlorosis virus (TCV)) and a carlavirus (CpMMV). TCV has previously been reported from tomato in Florida [40]. However, this is the first evidence documenting the presence of CpMMV in the United States and LCV in the eastern United States. The remaining virus-like sequences identified in the RNA metagenomic library have low amino acid identities (<45%) with their top matches in the database. These novel viral sequences are most similar to groups that are not known to be carried by whiteflies and encompass divergent species, including viruses classified in three different families, as well as unclassified viruses (Table 2). It remains to be determined if these RNA viral sequences indeed represent novel whitefly-transmitted plant viruses, viruses infecting the whiteflies themselves, or simply transient viruses picked up by the whiteflies through feeding. Nevertheless, the detection of novel RNA viral sequences with weak similarities to known plant pathogens suggests that there are RNA plant viruses that have not yet been described.

Discovery of the carlavirus CpMMV in Florida

The VEM approach led to the detection of the first whitefly-transmitted carlavirus (CpMMV Florida) in North America. This virus has been reported in a wide range of geographical areas including Africa, India, Asia, the Middle East, and South America, where it can infect and negatively impact important food crops [19]–[22], [41]–[43]. The host range of the CpMMV Florida isolate includes members of the Chenopodiaceae and Fabaceae families that have been previously reported as either natural or experimental hosts for CpMMV isolates from different regions. Many of the susceptible hosts did not show any visible signs of infection, which is similar to other CpMMV isolates [18], [21], [22], [44]. Asymptomatic infection by CpMMV may contribute to the high prevalence and transmission of this virus in some crop fields [18]. Although CpMMV Florida infects hosts that have been previously reported for CpMMV isolates from other regions, there are differences in the host ranges of isolates from different locations and crops. For example, CpMMV isolates from Israel and Ghana are able to infect representative members of the Solanaceae family [18], [45], whereas CpMMV Florida and isolates from Brazil, Thailand, and Southern Iran did not infect any members of this family [22], [41], [44]. Despite these differences, CpMMV isolates cannot be distinguished by electron microscopy or serologically, and no stringent comparative tests have been performed to determine if host range differences are sufficient to distinguish strains, pathotypes, or species [20].

A recent report published during the review process of this manuscript described six novel CpMMV isolates from Brazil that are closely related to CpMMV Florida (see below) [46]. Based on experimental hosts tested in both studies, most of the Brazilian isolates shared a similar host range with CpMMV Florida, mainly infecting members of the Fabaceae. However, two of the isolates (CpMMV:BR:BA:02 and CpMMV:BR:GO:01:1) were able to infect a member of the Solanaceae (Nicotiana glutinosa), which CpMMV Florida fails to infect. CpMMV:BR:BA:02 and CpMMV:BR:GO:01:1 share 98% and 93% genome-wide pairwise identity with CpMMV Florida, respectively. Therefore, CpMMV isolates may exhibit different host ranges despite high nucleotide identities. Further research examining both the full genomes and experimental host range of all available CpMMV isolates will provide insight into which genetic differences explain differences in host range.

The CpMMV genome recovered from whiteflies collected in 2007 is 99% identical to the genome isolated from vegetation in the same field four years later. The genome organization of the CpMMV Florida isolate is similar to other carlaviruses. Full genome comparisons between CpMMV isolates from Florida and Ghana, which were the only genomes available at the time when the analyses were performed, clearly show that the CP shares a much greater degree of identity (95%) than non-structural proteins in the genomes (∼62%) (Fig. 1). This conservation of structural proteins but high variability in non-structural genes has been noted by other authors investigating CpMMV partial sequences [21], [31] and a recent report investigating six new CpMMV genomes from Brazil which share 93–99% genome-wide pairwise identity with the Florida isolate [46]. Despite the higher genetic distance among non-structural genes, CpMMV Florida exhibits all the core and functional domains that have been identified for these proteins. According to the ICTV classification criteria for members of Betaflexiviridae (formerly known as Flexiviridae), distinct species share <72% nucleotide or <80% amino acid identity between the entire CP or replication genes [47]. Due to the difference between the CP and non-structural identities, these criteria present a problem for properly classifying CpMMV isolates. Based on the replication gene, CpMMV Florida represents a novel species, whereas CP identities suggest it does not.
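The classification conflict described above can be made concrete with a minimal sketch that applies the quoted demarcation thresholds (<72% nucleotide or <80% amino acid identity) gene by gene. The function names and dictionary layout are hypothetical and are not part of the original analysis. Treating the reported 95% (CP) and ∼62% (non-structural proteins) values as amino acid identities, and using ∼62% as a stand-in for the replication gene, are assumptions made only for illustration.

```python
# Demarcation thresholds quoted in the text for Betaflexiviridae [47].
NT_THRESHOLD = 72.0  # percent nucleotide identity
AA_THRESHOLD = 80.0  # percent amino acid identity


def is_distinct_species(identity_pct, kind="aa"):
    """Return True if the identity falls below the demarcation threshold."""
    if kind == "aa":
        threshold = AA_THRESHOLD
    elif kind == "nt":
        threshold = NT_THRESHOLD
    else:
        raise ValueError("kind must be 'aa' or 'nt'")
    return identity_pct < threshold


def classify_by_gene(identities, kind="aa"):
    """Map each gene name to the verdict its identity value implies."""
    return {
        gene: "distinct species" if is_distinct_species(pct, kind) else "same species"
        for gene, pct in identities.items()
    }


# Florida vs. Ghana values reported in the text. Interpreting them as
# amino acid identities, and using ~62% for the replicase, are
# assumptions of this sketch.
florida_vs_ghana = {"CP": 95.0, "replicase": 62.0}
verdicts = classify_by_gene(florida_vs_ghana)

for gene, verdict in verdicts.items():
    print(f"{gene}: {verdict}")
```

Under these assumptions the two genes yield opposite verdicts for the same pair of isolates, which is the inconsistency the paragraph above points out.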

Unfortunately, most of the available CpMMV sequences only encompass the 3′ end of the genomes, containing the CP and/or NB, since available carlavirus-specific degenerate PCR assays target this region [17], [48]. Most studies have based their classifications on ELISA, microscopy, and/or degenerate PCR targeting the coat protein and, thus, many viruses previously identified as CpMMV may actually represent different strains or even species. CP identities among available sequences only range from 88–99% whereas NB identities range from 56–99% (Tables S3 and S4). Furthermore, full genome comparisons between the Florida and Ghana isolates suggest that NB identities reflect identities for non-structural proteins (Fig. 1). Therefore, our analysis suggests that the NB may be more representative of overall genomic similarities than the CP for classification purposes at the strain level. Based on this observation, it was expected that the CpMMV Florida isolate would be closely related to isolates from Brazil and Puerto Rico. This was confirmed with the recently released CpMMV genome sequences from Brazil, which seem to belong to the same viral strain as CpMMV Florida. Due to the scarcity of genomic data regarding currently classified CpMMV isolates and strong CP similarities, we have named the Florida isolate as CpMMV; however, this classification may need to be revised as more genomic and infectivity data become available. It was also concluded that the Brazilian CpMMV isolates represented a new strain belonging to the same viral species as CpMMV based on their close phylogenetic relationship with the CpMMV isolate from Ghana. Interestingly, recombination analyses of CpMMV genomes from Brazil and Ghana suggested that low pairwise identities in the RdRP compared to the rest of the genome may be partly due to recombination events in this ORF [46]. Due to the occurrence of recombination events, it may be necessary to use full genomic sequences for classification of carlavirus strains.

The biological significance of the highly conserved CP in CpMMV isolates is largely unknown; however, it may result from selective pressure of transmission driven by the whitefly vector. CpMMV isolates have been reported to be transmitted both non-persistently and semi-persistently as there are no latent periods for virus transmission in whiteflies but retention times vary from minutes to hours [20]. Regardless, even non-persistent transmission in other vector-virus systems can depend on specific interactions between vector and virus [49]. Therefore the diversity of CpMMV populations may be constrained by the need to retain specific interactions between its CP and the whitefly vector [8].

Concluding remarks

The results of this study demonstrate that current understanding of RNA viruses found in B. tabaci whiteflies is not nearly as complete as that of DNA viruses, which appear to be restricted to the genus Begomovirus. Our findings indicate that the range of RNA viruses found in whiteflies may not be limited to the four groups that have been described since viral sequences with low amino acid identities likely represent novel groups. In addition to expanding current knowledge regarding viruses that can be captured by whiteflies, the VEM approach allowed us to expand the geographical range of CpMMV by documenting its presence in North America. Genomic comparisons among CpMMV genomes suggest that the classification criteria for carlaviruses need to be reevaluated, especially when considering variants that cannot be serologically distinguished. Future studies need to establish criteria to classify CpMMV variants and pathotypes by comparing genomic features, symptoms, infectivity and host range.

Supporting Information

Table S1.

Primers used to amplify the CpMMV genome.


Table S2.

Plant DNA viruses (i.e., begomoviruses) and associated satellite DNAs identified in whiteflies.


Table S3.

Amino acid pairwise comparisons among available Cowpea mild mottle virus (CpMMV) capsid proteins (CP).


Table S4.

Amino acid pairwise comparisons among available Cowpea mild mottle virus (CpMMV) nucleic acid binding (NB) proteins.



Acknowledgments

We thank the staff at the NGS Core at the Scripps Research Institute for providing DNA sequencing support.

Author Contributions

Conceived and designed the experiments: KR TFFN MB JEP. Performed the experiments: KR HC TFFN JEP. Analyzed the data: KR MB JEP. Contributed reagents/materials/analysis tools: KR HC TFFN MB JEP. Wrote the paper: KR MB JEP.


  1. Hogenhout SA, Ammar ED, Whitfield AE, Redinbaugh MG (2008) Insect vector interactions with persistently transmitted viruses. Annual Review of Phytopathology 46: 327–359.
  2. De Barro PJ, Liu SS, Boykin LM, Dinsdale AB (2011) Bemisia tabaci: a statement of species status. Annual Review of Entomology 56: 1–19.
  3. Oliveira MRV, Henneberry TJ, Anderson P (2001) History, current status, and collaborative research projects for Bemisia tabaci. Crop Protection 20: 709–723.
  4. Zarate SI, Kempema LA, Walling LL (2007) Silverleaf whitefly induces salicylic acid defenses and suppresses effectual jasmonic acid defenses. Plant Physiology 143: 866–875.
  5. Navas-Castillo J, Fiallo-Olive E, Sanchez-Campos S (2011) Emerging virus diseases transmitted by whiteflies. Annual Review of Phytopathology 49: 219–248.
  6. Anderson PK, Cunningham AA, Patel NG, Morales FJ, Epstein PR, et al. (2004) Emerging infectious diseases of plants: pathogen pollution, climate change and agrotechnology drivers. Trends in Ecology & Evolution 19: 535–544.
  7. King AMQ, Lefkowitz E, Adams MJ, Carstens EB, editors (2012) Virus Taxonomy: Ninth Report of the International Committee on Taxonomy of Viruses. San Diego: Academic Press.
  8. Power AG (2000) Insect transmission of plant viruses: a constraint on virus variability. Current Opinion in Plant Biology 3: 336–340.
  9. Harrison BD, Robinson DJ (1999) Natural genomic and antigenic variation in whitefly-transmitted geminiviruses (Begomoviruses). Annual Review of Phytopathology 37: 369–398.
  10. Jones RAC (2009) Plant virus emergence and evolution: Origins, new encounter scenarios, factors driving emergence, effects of changing world conditions, and prospects for control. Virus Research 141: 113–130.
  11. Polston JE, Cohen L, Sherwood TA, Ben-Joseph R, Lapidot M (2006) Capsicum species: Symptomless hosts and reservoirs of Tomato yellow leaf curl virus. Phytopathology 96: 447–452.
  12. Seal S, van den Bosch F, Jeger M (2006) Factors influencing begomovirus evolution and their increasing global significance: Implications for sustainable control. Critical Reviews in Plant Sciences 25: 23–46.
  13. Ng TFF, Duffy S, Polston JE, Bixby E, Vallad GE, et al. (2011) Exploring the diversity of plant DNA viruses and their satellites using vector-enabled metagenomics on whiteflies. PLoS ONE 6.
  14. Schmieder R, Lim Y, Rohwer F, Edwards R (2010) TagCleaner: Identification and removal of tag sequences from genomic and metagenomic datasets. BMC Bioinformatics 11: 341.
  15. Altschul SF, Madden TL, Schaffer AA, Zhang JH, Zhang Z, et al. (1997) Gapped BLAST and PSI-BLAST: A new generation of protein database search programs. Nucleic Acids Research 25: 3389–3402.
  16. Huson DH, Mitra S, Ruscheweyh HJ, Weber N, Schuster SC (2011) Integrative analysis of environmental sequences using MEGAN4. Genome Research 21: 1552–1560.
  17. Gaspar JO, Belintani P, Almeida AMR, Kitajima EW (2008) A degenerate primer allows amplification of part of the 3′-terminus of three distinct carlavirus species. Journal of Virological Methods 148: 283–285.
  18. Brunt AA, Kenten RH (1973) Cowpea mild mottle, a newly recognized virus infecting cowpeas (Vigna-Unguiculata) in Ghana. Annals of Applied Biology 74: 67–74.
  19. Brito M, Fernández-Rodríguez T, Garrido M, Mejías A, Romano M, et al. (2012) First report of Cowpea mild mottle carlavirus on yardlong bean (Vigna unguiculata subsp. sesquipedalis) in Venezuela. Viruses 4: 3804–3811.
  20. Jeyanandarajah P, Brunt AA (1993) The natural occurrence, transmission, properties and possible affinities of Cowpea mild mottle virus. Journal of Phytopathology-Phytopathologische Zeitschrift 137: 148–156.
  21. Naidu RA, Gowda S, Satyanarayana T, Boyko V, Reddy AS, et al. (1998) Evidence that whitefly-transmitted cowpea mild mottle virus belongs to the genus Carlavirus. Archives of Virology 143: 769–780.
  22. Tavasoli M, Shahraeen N, Ghorbani SH (2009) Serological and RT-PCR detection of Cowpea mild mottle carlavirus infecting soybean. Journal of General and Molecular Virology 1: 007–011.
  23. Punta M, Coggill PC, Eberhardt RY, Mistry J, Tate J, et al. (2012) The Pfam protein families database. Nucleic Acids Research 40: D290–D301.
  24. Edgar RC (2004) MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Research 32: 1792–1797.
  25. Tamura K, Peterson D, Peterson N, Stecher G, Nei M, et al. (2011) MEGA5: Molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Molecular Biology and Evolution 28: 2731–2739.
  26. Simossis VA, Heringa J (2005) PRALINE: a multiple sequence alignment toolbox that integrates homology-extended and secondary structure information. Nucleic Acids Research 33: W289–W294.
  27. Guindon S, Dufayard J-F, Lefort V, Anisimova M, Hordijk W, et al. (2010) New algorithms and methods to estimate maximum-likelihood phylogenies: Assessing the performance of PhyML 3.0. Systematic Biology 59: 307–321.
  28. Abascal F, Zardoya R, Posada D (2005) ProtTest: Selection of best-fit models of protein evolution. Bioinformatics 21: 2104–2105.
  29. Anisimova M, Gascuel O (2006) Approximate likelihood-ratio test for branches: A fast, accurate, and powerful alternative. Systematic Biology 55: 539–552.
  30. Spear A, Sisterson MS, Yokomi R, Stenger DC (2010) Plant-feeding insects harbor double-stranded RNA viruses encoding a novel proline-alanine rich protein and a polymerase distantly related to that of fungal viruses. Virology 404: 304–311.
  31. Menzel W, Winter S, Vetten HJ (2010) Complete nucleotide sequence of the type isolate of Cowpea mild mottle virus from Ghana. Archives of Virology 155: 2069–2073.
  32. Rozanov MN, Koonin EV, Gorbalenya AE (1992) Conservation of the putative methyltransferase domain: a hallmark of the ‘Sindbis-like’ supergroup of positive-strand RNA viruses. Journal of General Virology 73: 2129–2134.
  33. Lawrence DM, Rozanov MN, Hillman BI (1995) Autocatalytic processing of the 223-kDa protein of blueberry scorch carlavirus by a papain-like proteinase. Virology 207: 127–135.
  34. Gorbalenya AE, Koonin EV (1989) Viral proteins containing the purine NTP-binding sequence pattern. Nucleic Acids Research 17: 8413–8438.
  35. Koonin EV (1991) The phylogeny of RNA-dependent RNA polymerases of positive-strand RNA viruses. Journal of General Virology 72: 2197–2206.
  36. Morozov SY, Solovyev AG (2003) Triple gene block: modular design of a multifunctional machine for plant virus movement. Journal of General Virology 84: 1351–1366.
  37. Martelli GP, Adams MJ, Kreuze JF, Dolja VV (2007) Family Flexiviridae: a case study in virion and genome plasticity. Annual Review of Phytopathology 45: 73–100.
  38. Foster GD (1992) The structure and expression of the genome of carlaviruses. Research in Virology 143: 103–112.
  39. Ng TFF, Willner DL, Lim YW, Schmieder R, Chau B, et al. (2011) Broad surveys of DNA viral diversity obtained through viral metagenomics of mosquitoes. PLoS ONE 6: e20579.
  40. Wisler GC, Li RH, Liu HY, Lowry DS, Duffus JE (1998) Tomato chlorosis virus: a new whitefly-transmitted, Phloem-limited, bipartite closterovirus of tomato. Phytopathology 88: 402–409.
  41. Almeida AMR, Piuga FF, Marin SRR, Kitajima EW, Gaspar JO, et al. (2005) Detection and partial characterization of a carlavirus causing stem necrosis of soybean in Brazil. Fitopatologia Brasileira 30: 191–194.
  42. Chang C-A, Chien L-Y, Tsai C-F, Cheng YH, Lin Y-Y (2013) First report of Cowpea mild mottle virus in cowpea and French bean in Taiwan. Plant Disease 97: 1001.
  43. Pardina PER, Arneodo JD, Truol GA, Herrera PS, Laguna IG (2004) First record of Cowpea mild mottle virus in bean crops in Argentina. Australasian Plant Pathology 33: 129–130.
  44. Iwaki M, Thongmeearkom P, Prommin M, Honda Y, Hibi T (1982) Whitefly transmission and some properties of Cowpea mild mottle virus on soybean in Thailand. Plant Disease 66: 365–368.
  45. Antignus Y, Cohen S (1987) Purification and some properties of a new strain of Cowpea mild mottle virus in Israel. Annals of Applied Biology 110: 563–569.
  46. Zanardo L, Silva F, Bicalho A, Urquiza G, Lima A, et al. (2013) Molecular and biological characterization of Cowpea mild mottle virus isolates infecting soybean in Brazil and evidence of recombination. Plant Pathology DOI: 10.1111/ppa.12092.
  47. Adams MJ, Antoniw JF, Bar-Joseph M, Brunt AA, Candresse T, et al. (2004) The new plant virus family Flexiviridae and assessment of molecular criteria for species demarcation. Archives of Virology 149: 1045–1060.
  48. Badge J, Brunt A, Carson R, Dagless E, Karamagioli M, et al. (1996) A carlavirus-specific PCR primer and partial nucleotide sequence provides further evidence for the recognition of cowpea mild mottle virus as a whitefly-transmitted carlavirus. European Journal of Plant Pathology 102: 305–310.
  49. Gray SM, Banerjee N (1999) Mechanisms of arthropod transmission of plant and animal viruses. Microbiology and Molecular Biology Reviews 63: 128–148.