Wolbachia is a genus of bacterial endosymbionts that affects the breeding systems of its hosts. Wolbachia can confuse the patterns of mitochondrial variation, including DNA barcodes, because it influences the pathways through which mitochondria are inherited. We examined the extent to which these endosymbionts are detected in routine DNA barcoding, assessed their impact upon insect sequence divergence and identification accuracy, and considered the variation present in Wolbachia COI. Using both standard PCR assays (Wolbachia surface protein gene, wsp) and bacterial COI fragments, we found evidence of Wolbachia in insect total genomic extracts created for DNA barcoding library construction. When >2 million insect COI trace files were examined on the Barcode of Life Datasystem (BOLD), Wolbachia COI was present in 0.16% of the cases. It is possible to generate Wolbachia COI using standard insect primers; however, that amplicon was never confused with the COI of the host. The Wolbachia alleles recovered were predominantly Supergroup A and were broadly distributed geographically and phylogenetically. We conclude that the presence of Wolbachia DNA in total genomic extracts made from insects is unlikely to compromise the accuracy of the DNA barcode library; in fact, the ability to query this DNA library (the database and the extracts) for endosymbionts is one of the ancillary benefits of such a large-scale endeavor, for which we provide several examples. It is our conclusion that regular assays for Wolbachia presence and type can, and should, be adopted by large-scale insect barcoding initiatives. While COI is one of the five multi-locus sequence typing (MLST) genes used for categorizing Wolbachia, there is limited overlap with the eukaryotic DNA barcode region.
Citation: Smith MA, Bertrand C, Crosby K, Eveleigh ES, Fernandez-Triana J, Fisher BL, et al. (2012) Wolbachia and DNA Barcoding Insects: Patterns, Potential, and Problems. PLoS ONE 7(5): e36514. https://doi.org/10.1371/journal.pone.0036514
Editor: Jonathan H. Badger, J. Craig Venter Institute, United States of America
Received: June 29, 2011; Accepted: April 2, 2012; Published: May 2, 2012
This is an open-access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.
Funding: MAS was supported by a Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery Grant and the Nouragues research grant by the Centre national de la recherche scientifique (CNRS) (French Guiana). ESE was supported by Natural Resources Canada, Canadian Forest Service. MH was supported by an NSERC Discovery Grant. DS was supported by the Alfred P. Sloan Foundation. BLF was supported by National Science Foundation Grants DEB-0072713, DEB-0344731, and DEB-0842395. JBW was supported by NSF grant DEB 1020510 and USDA Grant AG sub UGA RC293-359. SEM, JH and MJ were supported by NSF grant DEB-0841885 and Czech Science Foundation grant P505/10/0673. MJ was supported by a Marie Curie Fellowship (PIOFGA2009-25448). HDW was supported by the National Natural Science Foundation of China (NSFC grant no. 31090253). JR was supported by the National Center for Ecological Analysis and Synthesis, US NSF Grant #EF-0553768. DHJ was supported by NSF DEB 0515699, the Wege Foundation of Grand Rapids, Michigan, the International Conservation Fund of Canada (Nova Scotia), and private donors to the Guanacaste Dry Forest Conservation Fund. JG was supported through funding to the Canadian Barcode of Life Network from Genome Canada, NSERC, and other sponsors listed at www.BOLNET.ca. Laboratory analyses on sequences generated since 2009 were funded by the Government of Canada through Genome Canada and the Ontario Genomics Institute (2008-0GI-ICI-03). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
DNA barcoding uses a standardized short sequence of DNA as a key character for species-level identification and discovery. Barcode variation can be used for the identification of known species from trace amounts of tissue or a taxonomically unidentifiable stage, or as part of a suite of characters for the discovery and description of new species. As a tool in revisionary studies it can speed up the rate of taxonomic research by flagging otherwise cryptic diversity. Within arthropods, the approach has been used in many orders utilizing the mitochondrial cytochrome c oxidase subunit 1 (COI) gene, with reports of success and of failure. In some cases where it has failed – when there was not sufficient variation present in the barcode region to differentiate between species, or where there was an evident mito-nuclear discordance such that intra-specific mtDNA variation might be confused with inter-specific variation – the failures were hypothesized to be due to the effects of the host-manipulating intracellular rickettsial-type symbiotic bacteria, Wolbachia.
Wolbachia are alpha-proteobacterial reproductive parasites which can alter the sex-ratio and reproductive compatibility of their host to their own benefit. They are among the most common endosymbiotic bacteria in many, perhaps most, arthropod systems. Known effects of Wolbachia include cytoplasmic incompatibility (CI), in which matings between uninfected females and infected males produce inviable embryos, and male-killing (MK), in which infected females produce no (or a reduced number of) viable male offspring. These strategies generally increase the reproductive success of infected relative to uninfected matrilines. Perhaps the best known and most frequently reported impact of Wolbachia on its host is CI, in which any zygote formed through fertilization of an uninfected egg with sperm from an infected male dies. This strategy of host manipulation has been remarkably successful, and it has been estimated that as many as 66% of all insect species carry a Wolbachia infection, although Wolbachia incidence is not the same as CI prevalence. This favoring of infected matrilines can also drive a mitochondrial sweep through a population (or species), confounding interpretations of mtDNA divergence among populations as outlined below.
Infections of bacterial endosymbionts could threaten the accuracy of an mtDNA based system of identification and species discovery such as DNA barcoding in any one of four ways:
- Unintended amplification of bacterial COI due to the use of broad, near-universal primer sets and failure to then recognize these sequences as bacterial.
- Conflation or confusion of insect species identifications due to the inclusion of the bacterial endosymbiont COI.
- Lineage disruption via CI as an isolating mechanism leading to the conflation of insect lineages that are infected with different Wolbachia strains within a species (thereby overestimating diversity; i.e. individuals within a population being swept with a mitochondrial type via a Wolbachia infection may appear as different species using mtDNA barcoding).
- Lineage disruption via CI as an isolating mechanism leading to the fixation of one species' mtDNA within a hybridizing species pair for which one carries a Wolbachia infection (underestimating diversity; i.e. hybridization resulting in the replacement of the mitochondria of one species with that of the other).
Wolbachia can be amplified from arthropod total genomic DNA extracts made from somatic tissue (including legs, the most commonly used material for DNA barcoding projects). We have demonstrated this previously utilizing the Wolbachia surface protein coding gene (wsp) assay to test for endosymbiont prevalence within certain groups being assayed for DNA barcode variation (Formicidae, Tachinidae and Braconidae). We have also experienced the unintended amplification of Wolbachia COI from insect genomic DNA extracts. We examined the more than two million insect trace files on the Barcode of Life Datasystem (BOLD) for evidence of unintended amplification of Wolbachia and also conducted more in-depth case studies using more than 95K DNA extracts from three insect orders (Hymenoptera, Diptera, and Lepidoptera) and more than nine families to ask 1) whether these unintended amplifications would compromise our capacity to generate or analyze the barcodes of their insect hosts; 2) whether the observed frequency of Wolbachia COI amplification is a function of Wolbachia prevalence as measured using the wsp PCR assay; and 3) what Wolbachia phylogenetic information can be gleaned from bacterial gene regions generated from insect DNA barcoding surveys.
We conclude that unrecognised amplification of bacterial COI or the confusion of insect identifications due to the inclusion of unanticipated amplification of bacterial COI does not represent a serious impediment for a barcoding survey of a taxon or area. Such incidences are rare and can be easily recognized if queried. Our greatest concern a priori regarding the potential effects of Wolbachia on mtDNA based identifications, and on species discovery, was the potential conflation of infected (and isolated) lineages within species as species – but we have not yet documented such a case. A DNA barcoding survey through a taxon or sampling regime is far from being compromised by the influence of Wolbachia. Rather, these surveys represent an ideal opportunity to explore what relationships actually do exist between different bacterial strains and hosts and between bacteria from different hosts in different geographic regions.
Unanticipated amplification of bacterial COI from insect hosts and primer specificity
The Barcode of Life Data System (BOLD) library of trace files was searched for evidence of Wolbachia (Wolbachia is one of the suite of possible contaminants that all sequences uploaded to BOLD are checked against as a normal quality-control routine). Out of 1.09 million insect specimen trace files searched, generated from extractions principally (but not exclusively) based on somatic tissue, we found evidence of Wolbachia in 1,768 traces (0.16%). Non-specific amplification of Wolbachia was found in multiple insect orders (Table 1) and using multiple primer combinations (Table 2); however, that amplicon was never confused with the COI of the host.
For example, within Lepidoptera there were, at the time of writing, more than 506,297 COI DNA barcodes on BOLD (BOLD Taxonomy Browser, Lepidoptera sequences, June 2011), within which we found only 286 cases where Wolbachia COI was amplified rather than the insect (0.05%). For those Lepidoptera generated as part of the Área de Conservación Guanacaste (ACG) rearing and light-collecting program, we found 186 Wolbachia sequences from the 162,065 specimens of ACG Lepidoptera barcoded (0.11%; BOLD Taxonomy Browser on 11.06.02).
Conflation of insect identifications due to the inclusion of the bacterial endosymbiont COI
On average, there are 167 base pair differences between Wolbachia and their host COI within the barcode region. Bacterial COI GC content does not possess the characteristic insect AT bias (Table 3 - the average GC content of the insect hosts is 13%, while in Wolbachia it is much higher (20%)).
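The two signals described above (a large number of base differences and a shift in base composition) are simple to compute. The following is a minimal Python sketch, not the quality-control routine used by BOLD; the function names and the toy sequences are hypothetical, and the two sequences are assumed to be already aligned to the same length.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of unambiguous bases (A, C, G, T) that are G or C."""
    counted = [b for b in seq.upper() if b in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    return sum(b in "GC" for b in counted) / len(counted)


def count_differences(a: str, b: str) -> int:
    """Count mismatched positions between two pre-aligned sequences.

    Positions where either sequence has a gap or an ambiguity code are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        x != y
        for x, y in zip(a.upper(), b.upper())
        if x in "ACGT" and y in "ACGT"
    )


# Made-up fragments (not real COI data), used only to exercise the functions.
host_like = "ATTATAATTGGAGGATTTGGAAATTGATTA"
query = "ATTGTGATCGGAGGCTTCGGTAACTGGCTA"

print(f"host-like GC = {gc_content(host_like):.2f}")
print(f"query GC     = {gc_content(query):.2f}")
print(f"differences  = {count_differences(host_like, query)}")
```

A sequence that differs from the expected host barcode at many positions and also departs from the host's base composition would be a candidate for a closer look, for example by comparison against a bacterial reference library.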
wsp assay and prevalence
For three sub-sets of data (ants from the south-western Indian Ocean island of Mauritius, and both ants and parasitoid wasps from Churchill, Manitoba, Canada – Table 4) we used the PCR-based wsp assay to test whether the proportion of generated bacterial COI was correlated with the frequency of Wolbachia in the insects themselves. A subset of these bands was sequenced to confirm their identity as the Wolbachia surface protein gene.
In ants collected on the island of Mauritius, we tested 438 ant specimens from 57 species for Wolbachia using the wsp assay and found that roughly a quarter of the specimens and a third of the species tested positive (116/438 specimens = 26.5%, 18/57 species = 31.6%). Of the total ant specimens sequenced from the Mauritius project (1,111), only 4 bacterial COI sequences were recovered (0.36% - Table 4). In a smaller set of ants collected in Churchill, Manitoba, Canada, we found that 178 of 282 DNA extracts, from 5 of 7 species, were infected (63% of extracts, 71% of species); however, we recovered no bacterial COI from this group using standard insect barcoding procedures.
Using a slightly larger set of parasitoid wasps from Churchill, we screened 376 specimens for wsp and found 203 infections (a conservatively estimated rate of infection of almost 54%). However, after sequencing >6,000 parasitoid wasp specimens from Churchill, Wolbachia COI was generated only four times in total (0.067%), and never from the 376 specimens that we scanned using wsp primers.
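The contrast between wsp-based prevalence and the rate of inadvertent bacterial COI recovery can be laid out side by side. The sketch below is illustrative only: the counts are those reported in the text above, the `Survey` class and `percent` helper are hypothetical names, and 6,000 is used as a lower bound for the ">6,000" wasps sequenced, so that rate is an upper bound.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Survey:
    name: str
    wsp_positive: int             # specimens testing positive with the wsp assay
    wsp_tested: int               # specimens screened with the wsp assay
    coi_bacterial: int            # Wolbachia COI sequences recovered during barcoding
    coi_sequenced: Optional[int]  # specimens barcoded; None if not stated in the text


def percent(numerator: int, denominator: int) -> float:
    """Return numerator / denominator expressed as a percentage."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return 100.0 * numerator / denominator


surveys = [
    Survey("Mauritius ants", 116, 438, 4, 1111),
    Survey("Churchill ants", 178, 282, 0, None),
    # ">6,000" wasps were sequenced; 6000 is used here as a lower bound.
    Survey("Churchill parasitoid wasps", 203, 376, 4, 6000),
]

for s in surveys:
    wsp_rate = percent(s.wsp_positive, s.wsp_tested)
    if s.coi_sequenced is None:
        coi_text = "n/a"
    else:
        coi_text = f"{percent(s.coi_bacterial, s.coi_sequenced):.3f}%"
    print(f"{s.name}: wsp prevalence {wsp_rate:.1f}%, bacterial COI {coi_text}")
```

In each dataset the wsp-based prevalence is orders of magnitude higher than the rate at which bacterial COI appears during routine barcoding.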
Comparison to MLST Database
The multilocus sequence typing (MLST) database allelic profile for COI (or coxA) contains 104 sequences (also see the BOLD project, “MLST – Wolbachia from MLST database”). All of the Wolbachia COI sequences that have been inadvertently amplified in the insect species we have barcoded are consistent with infections from Supergroup A strains. This suggests a strong bias in amplification towards this supergroup by the insect COI primers. Within these Supergroup A strains there are four major allele groups present. One is identical in the overlapping region to the MLST allele coxA-1, a second to the overlapping region of the allele coxA-6, and two others represent apparently new allele groups (MAS-2, MAS-1) (Table 5, Figure 1 a,b). In only one genus (Hesperiidae, Urbanus belliDHJ01, U. belliDHJ03) did we amplify gene fragments consistent with Wolbachia Supergroup B (in this case not from COI, but initially using the wsp protocol).
Tips labeled by BOLD process ID and host insect taxonomy (if generated here) or MLST allele group. Branches colored by host insect taxonomy (brown = Tachinidae, dark blue = Braconidae, light blue = Halictidae, pink = Chalcididae, red = Ichneumonidae, green = Formicidae, yellow = Lepidoptera, purple = Agaonidae, black = MLST Wolbachia alleles). Stars indicate the position of Wolbachia from new-world ants.
One of the first criteria involved in determining a standardized gene region appropriate for a DNA barcoding approach is to find conserved primer regions that enable the utilization of universal (or near-universal) primers. This strategy of near-universal primer design could be compromised if variability in the priming region meant that the barcode oligonucleotide had less affinity for the taxon in question than for a bacterial endosymbiont. In an apparent recent example of this, Linares et al. wrote that “… generalized primers led to the inadvertent amplification of the endosymbiont Wolbachia, undermining the use of universal primers and necessitating the design of genus-specific COI primers alongside a Wolbachia-specific PCR assay.” – and further that, “[t]his result underscores a major problem with the widespread application of universal primers for DNA barcoding i.e. non-specific species amplification”.
It is important to note that although Linares et al. refer to LepF1/LepR1 as “Lepidoptera specific” primers, what was originally written was that LepF1/LepR1 was a “primer pair designed for Lepidoptera”. In fact, it is clear from the eight years since the initiating barcoding paper was published, and from one million sequencing reactions at the Biodiversity Institute of Ontario using LepF1 or LepR1, that these primers have broad utility across most insect groups (from the publicly available BOLD website accessed on 11.04.19). Interestingly, Linares et al. noted that, in spite of their concerns following discovery of Wolbachia, they did not find any “obvious association between host lineages and Wolbachia infections” (i.e. infection status did not appear to affect species identification via barcodes).
Conflation of insect identifications due to the inclusion of the bacterial endosymbiont COI
To what degree is the non-specific amplification of Wolbachia COI a problem for the widespread application of DNA barcoding? It is apparent to us that it is exactly because barcoding is frequently successful for species identification that non-target amplification (between insect and bacteria) is not a major concern. It is immediately apparent when an endosymbiont COI fragment is unintentionally amplified from its host through the degree of difference between what was expected and what was generated (Table 3). It is because, vastly more often than not, barcoding can differentiate species that the inadvertent (and therefore mislabeled) inclusion of non-specific bacterial amplicons is not a major problem.
We do note that the majority of these extractions are made not from whole specimens or abdomens but from legs. Although Wolbachia can be found in extractions made from somatic tissue, this is generally presumed to occur at a lower rate than for extractions made from the abdomen (consider, however, that concentrations have been reported that did not differ between reproductive and somatic tissue). Perhaps our extraction protocol produces, on average, more host DNA than the protocol followed by Linares et al. Alternatively, perhaps the high fidelity Taq (Platinum Taq DNA polymerase; Invitrogen) used in the Biodiversity Institute of Ontario permits the critical first stages of PCR to be swamped by the more abundant host DNA rather than that of the endosymbiont.
For example, consider the order Lepidoptera in general, and a specific case study of the ACG Lepidoptera, where we saw a very low rate of Wolbachia amplification. These low rates of unintended amplification have not impeded the production of large numbers of Lepidoptera DNA barcodes, nor have we yet documented a case within the Lepidoptera where either the bacterial COI was confused for that of the insect or differential possession of Wolbachia strain(s) has conflated population- and species-level divisions. Furthermore, the unintended amplification has produced some interesting ancillary findings. For instance, the two distinct Wolbachia COI sequences recovered from ACG Lepidoptera matched those of the MLST alleles coxA-1 and coxA-6, although, since they do not completely overlap with the MLST standard coxA sequence, they may not be identical. The coxA-1 allele was primarily found in large butterflies and moths (Hesperiidae, Notodontidae and Nymphalidae) while the coxA-6 allele was found predominantly in smaller Pyralidae and Elachistidae. In addition, twenty-three (12%) of these bacterial contaminant sequences arose from the same host species (Caligo telamonius Felder, Nymphalidae).
Conflation of infected lineages with species via the effects of Wolbachia
Due to the heightened capacity for these bacteria to fragment the mitochondrial lineages of a species, concern has been expressed regarding what impact their apparent omnipresence has on a mitochondrial system of DNA-based species identification and discovery. Specifically, problems will arise if more lineages than are truly present are flagged as new or different species as a result of Wolbachia-separated mtDNA lineages harbored within a single species (a statistical Type I error: rejecting a true null hypothesis, where the null is that the specimens are of the same species).
Alternatively, Wolbachia infections can sweep away the mitochondrial variation between species if even infrequent hybridization events result in the fixation of the endosymbiont. In one recent example, the lack of within-species monophyly was hypothesized to result from introgressive hybridization associated with Wolbachia infection. Similar patterns of evident interspecific mitochondrial introgression have been noted in sister species of parasitic wasps, butterflies and Drosophila. However, it is not clear from the literature how common this is (e.g. “We see no obvious association between host lineages and Wolbachia infections”). From the perspective of our dataset, we have seen no evidence of this type of between-species mtDNA barcode sharing due to the sharing of Wolbachia infections – with one possible exception.
The one example where there was an apparent mito-nuclear discordance – possibly caused by Wolbachia – was documented in the Costa Rican tachinid fly, Chetogena scutellarisDHJ01. The presumably generalist (polyphagous) tachinid “Chetogena scutellaris” was found to include two barcode groups: C. scutellarisDHJ01 and C. scutellarisDHJ02. Both groups were also supported by divergences within 28S and ITS1. However, within C. scutellarisDHJ01, there was an additional rDNA split that was not apparent in the barcode. Using the wsp assay it was found that nearly ¾ of the specimens of C. scutellarisDHJ01 contained Wolbachia, suggesting that Wolbachia may have swept mtDNA variation from this provisional, morphologically cryptic species that is nonetheless diagnosable with nuclear sequences.
Wolbachia infections can also inflate the estimates of intra-specific diversity if different strains infect different populations or individuals within a population. One example where there were evidently different Wolbachia strains present in different provisional and morphologically cryptic species was described recently. Here, one apparent morphospecies of Pristomyrmex was collected from a threatened population. Specimens from these collections were found to contain deep barcode divergences (15%), suggesting that the morphospecies actually contained multiple cryptic species, or that the population may be a contemporary refuge for two apparently divergent mtDNA lineages. One of two rDNA loci tested revealed corroborating variation and all Pristomyrmex specimens tested positive for Wolbachia. However, each provisional species was infected with different Wolbachia strains, suggesting that the presence of the different strains of endosymbiont alone could have produced the evident patterns of mitochondrial divergence. It is clear that these provisional Pristomyrmex species harbor different Wolbachia, and it is also possible that infection with different strains of Wolbachia has played a role in the evident diversification within these cryptic species. Shoemaker et al. and Sun et al. also discuss speciation events within host insect species of Drosophila and Eupristina that were putatively reinforced by a Wolbachia infection. In the Pristomyrmex case, we observed that the wsp sequences from one provisional species contained multiple peaks, while the wsp from other provisional species had unambiguous base calls. This suggests that rather than unidirectional cytoplasmic incompatibility (CI: prevention of inter-lineage mating through presence/absence of a Wolbachia strain), this Pristomyrmex example may be driven by the prevention of inter-lineage mating through the possession of different strains (bidirectional CI).
In another example, we examined intraspecific divergences in ant species of Mauritius that were infected or uninfected with Wolbachia. The published supporting information file for this dataset contains information regarding the infection status per individual and species based on the wsp assay (http://www.frontiersinzoology.com/content/6/1/31/additional). Using this coding, we searched for infected or uninfected species from the public BOLD project “Ant Diversity of Mauritius (ASMA)” (when a species included both uninfected and infected individuals, the species was coded as infected in this analysis). For each infection status, we then used BOLD to calculate distance summary statistics (Table S1). The Wolbachia-infected species contained less variation, but the difference was slight (the average intra-specific distance for Wolbachia-infected species is 0.824, while for uninfected species it is 0.99). While these results need to be understood as preliminary and ought not to be generalised, as they arise from one taxonomic case on an isolated island, they nevertheless demonstrate the capacity to identify insect species in spite of Wolbachia infection and, furthermore, the capacity to use somatic DNA extractions to study a species' Wolbachia infections (specifically when there are multiple specimens sequenced per species).
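The species-level coding rule used in this comparison (a species counts as infected if any of its individuals tested positive) and the subsequent averaging can be sketched as follows. This is an illustrative Python outline, not the BOLD distance-summary tool; the function names, species labels and distance values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def code_species(records):
    """Collapse per-specimen wsp results to one status per species.

    `records` is an iterable of (species, wsp_positive) pairs. A species is
    coded as infected (True) if at least one of its specimens was positive.
    """
    status = defaultdict(bool)
    for species, positive in records:
        status[species] = status[species] or bool(positive)
    return dict(status)


def mean_distance_by_status(status, distances):
    """Average the per-species intra-specific distances within each status.

    `distances` maps species to its mean intra-specific distance. Species
    without a distance value are ignored; an empty group yields None.
    """
    grouped = {True: [], False: []}
    for species, infected in status.items():
        if species in distances:
            grouped[infected].append(distances[species])
    return {key: (mean(vals) if vals else None) for key, vals in grouped.items()}


# Hypothetical specimens and distances, for illustration only.
specimens = [("sp_A", False), ("sp_A", True), ("sp_B", False), ("sp_C", True)]
intra = {"sp_A": 0.6, "sp_B": 1.0, "sp_C": 1.0}

status = code_species(specimens)
print(status)
print(mean_distance_by_status(status, intra))
```

Coding mixed species as infected is a conservative choice for detecting any effect of infection, at the cost of ignoring within-species variation in infection status.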
Amplification and Primer Design
It is clear that Wolbachia COI can be amplified from DNA extractions of insects made from somatic tissue. However, in our data, the frequency of this occurrence within the case study projects (min of 0%, max 0.61%, mean 0.12% - Table 3) suggests that this does not compromise the barcoding of their arthropod hosts, nor de facto require the design of genus-specific COI primers. While such re-design is not required in general, it may be necessary in some cases. The difference in the proportion of amplification between different groups of insects is of interest. For instance, the halictid bees, including the largest (>1,750 spp) and perhaps most taxonomically challenging bee genus (Lasioglossum), appear to contain a relatively high preponderance of Wolbachia. This may be due to an increased infection load (and therefore increased likelihood of infection due to that ‘super-infection’ load being carried by the individual insect) or, alternatively, lack of fit for the near-universal insect COI primers within this specific family (if the target insect COI is not amplified in the important initial stages of PCR, the proportion of co-amplifying endosymbionts becomes more important). In a subset of the bee data (570 specimens), ten Wolbachia COI sequences were produced using LepF1/LepR1. However, re-amplification from the same extracts using the same primers but paired with degenerate internal primers (C_ANTMR1D and RonMWASPdeg_t1 respectively) produced the bee mtDNA barcode in all cases. While the use of a degenerate primer cocktail does not preclude the amplification of bacterial COI (Table 2), it did reduce the frequency of bacterial amplification for these bees. When the fit of one of the standard near-universal insect primers (LepR1) was compared to Halictidae in GenBank and to the Wolbachia MLST strain database, it was apparent that the LepR1 primer has a much better fit with the bacterial endosymbiont than with the insect host (Figure 2).
Halictidae represents a case where family specific primer design is warranted.
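One simple way to quantify primer fit of the kind compared above is to count mismatches between a primer and each candidate binding site, treating IUPAC ambiguity codes as matching any base they include. The sketch below is a hedged illustration, not the analysis behind Figure 2: the function name is hypothetical, the sequences are made-up placeholders rather than the real LepR1 or COI sequences, and each site is assumed to be written in the same orientation and length as the primer.

```python
# Standard IUPAC nucleotide codes mapped to the bases they represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def primer_mismatches(primer: str, site: str) -> int:
    """Count positions where the primer and the binding site share no base.

    Both strings must have the same length and be in the same orientation.
    """
    if len(primer) != len(site):
        raise ValueError("primer and site must have the same length")
    mismatches = 0
    for p, s in zip(primer.upper(), site.upper()):
        if not set(IUPAC[p]) & set(IUPAC[s]):
            mismatches += 1
    return mismatches


# Placeholder sequences (hypothetical, not real primer or COI data).
primer = "ACGTRACGTY"
sites = {
    "host_site": "ACCTAACTTC",
    "endosymbiont_site": "ACGTGACGTT",
}

for name, site in sites.items():
    print(name, primer_mismatches(primer, site))
```

A fuller treatment would also weight mismatches by their distance from the 3′ end of the primer, since mismatches there are generally more disruptive to extension; that refinement is omitted here.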
wsp assay, COI amplification and Wolbachia prevalence
Standard protocols for Wolbachia screening usually call for fresh abdominal tissue from the insect host, while insect DNA barcoding is more typically done by sampling a leg from a preserved specimen. Due to this difference alone, routine Wolbachia screening on barcoded specimens will likely miss some true infections, and therefore underestimate infection rates. However, our results suggest that integrating the two sampling surveys would likely provide access to an abundance of previously unanticipated diversity.
In all cases (ants from Mauritius and ants and parasitoid wasps of Churchill, Manitoba, Canada) our examples support the hypothesis that many more of these insect specimens and species carry Wolbachia than are apparent by our inadvertent COI bacterial amplification, a finding in agreement with other studies.
In addition to comparing recovered bacterial COI to wsp surveys, for one group we used the literature to calibrate our finding of inadvertent endosymbiotic COI amplification. For fig wasps, the prevalence of Wolbachia COI revealed in the barcoding assay was large compared to the other test datasets analysed here (∼9%). Yet when calibrated to the overall expected prevalence of Wolbachia known from Chinese fig wasps (∼50% in all species) this value appears low. Within one hesperiid genus of ACG Lepidoptera (Urbanus) we amplified a wsp gene fragment that was identified as Supergroup B. Within this genus we have never inadvertently amplified bacterial COI and this case is the only instance where Wolbachia strains from Supergroup B have yet appeared in our data (although it should be noted that, due to recombination, the use of wsp alone to categorise Supergroup ought to be interpreted with caution).
COI allele group diversity
We compared the fragments of isolated Wolbachia COI that we generated to the MLST database for Wolbachia, which includes COI as one of the five loci used for typing the strains of this bacterium (coxA in MLST terminology). However, it is important to note that the accepted MLST COI fragment is in the 3′ region of the gene and has very little overlap (194 bp) with the standard barcode locus. Despite this small degree of overlap, there was sufficient variation to compare the COI alleles from the MLST to the COI fragments fortuitously generated here. We found that the majority of the diversity fell within Supergroup A and, while some strains appear novel, most were associated with existing strain types; however, a thorough comparison of databases would require congruent COI regions.
Geography and Genetic Isolation by Distance
In ants, Wolbachia strains from New World collections were shown to differ from those in ants from elsewhere. When compared across all host families we detected no evident pattern of isolation by distance in the bacterial COI gene (Figure 3). We found slight COI divergences (∼1%) between the Wolbachia from ants across the Old and New World. For instance, Wolbachia COI from a Costa Rican ant differed by 2 and 4 base pairs, respectively, from those of bacteria hosted by ants in Papua New Guinea and Mauritius. While the COI region alone does not appear to have sufficient resolution to observe the previously described patterns of New World/Old World divergence, within the more variable wsp we did see, on average, 17% divergence between Wolbachia from ants of Mauritius and of Churchill, Manitoba, Canada. As a comparison, the Wolbachia wsp of Costa Rican tachinid flies and Mauritius ants was found to be only 9% divergent. Patterns of evident isolation by distance in Wolbachia must be approached with caution and calibrated with information from more than one insect host family.
Our results suggest that insect barcoding is not compromised by the presence of Wolbachia. Insect DNA barcodes are easy to differentiate from the sequences of their bacterial endosymbionts in cases when inadvertent amplification occurs and, based on several hundred thousand amplifications, the bacterial sequences do not occur frequently. However, insect barcoding projects would do well to incorporate additional steps that standardize the collection of the ancillary data present in whole genome extracts, including Wolbachia MLST analyses, and to increase the number of extractions based on abdomens rather than somatic tissue. This would help both to document our expectations regarding the prevalence of this bacterium and to explain unanticipated patterns of mitochondrial sharing or divergence. In addition, the Wolbachia MLST program would benefit from expanding and/or shifting the COI region included in its database to overlap with the large (at writing >1.25 million records) database of eukaryotic COI sequences. Expanding the current MLST standardized selection of COI to align with the eukaryotic DNA barcoding region would permit a more thorough comparison of mitochondrial diversity, even though it is evident that the great majority of Wolbachia infections will go unnoticed in standard COI barcoding protocols. Such standardization would help explain the apparently new allele groups recovered here (particularly when the insect portion of BOLD could be positioned to be a major contributor to the MLST campaign). The Wolbachia COI alleles seen here are broadly distributed geographically and, with some exceptions within the ants, strain type does not appear to be tightly associated with their hosts. While preliminary, our results demonstrate the benefits and potential of integrating Wolbachia surveys into insect DNA barcoding projects.
In understanding the species within ecological communities, we would do well to understand the communities within those species.
After being given special access to all trace files produced by the Canadian Centre for DNA Barcoding on the BOLD database, we scanned nearly 2.2 million trace files for matches to Wolbachia COI sequences by running BLAST searches of the trace sequences against a Wolbachia COI reference library. The reference library was constructed from single representatives of each strain in GenBank for which COI sequences of sufficient length were available. Traces were matched to the reference library based on an e-value threshold of <1e-110 (Figure S1).
A query for Wolbachia COI traces was possible for this survey because BOLD preserves all electropherograms produced for every individual record, even if the sequence itself is identified as a contaminant and excluded from the database. Such exclusions result from the quality-control procedures in place, which include screening for non-target Wolbachia COI amplicons.
All COI fragments were generated using standard extraction and amplification protocols at the Biodiversity Institute of Ontario. The primers used to generate COI are standard barcoding primers and are listed in Table 2.
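The e-value screen described above can be sketched as a filter over BLAST tabular output. This is an illustrative sketch only: it assumes the search results were saved in the standard 12-column tabular format (BLAST+ `-outfmt 6`, with the e-value in the eleventh column), which the text does not specify, and the function name and record identifiers are hypothetical.

```python
E_VALUE_THRESHOLD = 1e-110  # threshold stated in the text


def wolbachia_hits(blast_lines, threshold=E_VALUE_THRESHOLD):
    """Yield (query_id, subject_id, evalue) for hits with e-value below threshold.

    Expects tab-separated lines in the 12-column BLAST tabular layout:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send
    evalue bitscore
    """
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # skip malformed rows
        evalue = float(fields[10])
        if evalue < threshold:
            yield fields[0], fields[1], evalue


# Made-up rows for illustration; the identifiers are hypothetical.
sample_rows = [
    "trace_001\twolb_ref_A\t99.1\t650\t6\t0\t1\t650\t1\t650\t1e-150\t1100",
    "trace_002\twolb_ref_B\t85.0\t300\t45\t0\t1\t300\t1\t300\t1e-50\t200",
]
matches = list(wolbachia_hits(sample_rows))  # only trace_001 passes
```

Writing the filter as a generator lets it run over a very large results file line by line, which matters when millions of traces are screened.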
Wolbachia COI fragments were each assigned a sample ID number that corresponded to the BOLD process ID number of the host DNA extract with a suffix of “.w” attached. Thus, the Nesomyrmex ant sample CASENT0152435-D01 can be accessed through Antweb by this accession, or BOLD as ASANV619-09, while the bacteria associated with the ant specimen can be accessed by ASANV619-09.w. All Wolbachia COI sequences generated here are available on BOLD within the container project: Insect Endosymbionts (ASENZ) and on GenBank. All accession numbers and insect collection details are available in Table S2.
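The naming convention above amounts to a reversible mapping between a host process ID and its endosymbiont sample ID. The following is a minimal sketch of that mapping; the helper names are hypothetical and are not part of BOLD or any published tool.

```python
WOLBACHIA_SUFFIX = ".w"  # suffix described in the text


def endosymbiont_id(host_process_id: str) -> str:
    """Build the Wolbachia sample ID from the host's BOLD process ID."""
    return host_process_id + WOLBACHIA_SUFFIX


def host_process_id(sample_id: str) -> str:
    """Recover the host's BOLD process ID from a Wolbachia sample ID."""
    if not sample_id.endswith(WOLBACHIA_SUFFIX):
        raise ValueError(f"{sample_id!r} does not carry the {WOLBACHIA_SUFFIX!r} suffix")
    return sample_id[: -len(WOLBACHIA_SUFFIX)]


# Example using the identifier given in the text.
wolbachia_sample = endosymbiont_id("ASANV619-09")  # "ASANV619-09.w"
```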
For four subsets of the data, we used the PCR-based wsp assay to determine the proportion of insect specimens that were infected. We compared this wsp-determined prevalence to the rate at which bacterial COI had been produced from insect leg extractions (Table 3). For a subset of these positives, we amplified the wsp product to confirm its identity.
The Mantel test, measuring isolation by distance on bacterial COI, was completed using Arlequin v3, where geographic distances were based on the insect host collection localities.
Measures of diversity at each site within the LepR1 oligonucleotide were calculated using DnaSP.
The number of trace files matching Wolbachia in the BOLD trace library. The value of <1e-110 was chosen as a threshold between the conservative match of a query to known Wolbachia strains and the identification of novel strains.
DNA barcode diversity for ant species of Mauritius coded by Wolbachia infection.
We thank all of our colleagues at the Biodiversity Institute of Ontario for their enthusiasm, diligence and assistance in this project. We wish to make particular mention and thanks to our many colleagues who have donated taxonomic materials, services and expertise for this project. MAS thanks Jack T. Longino for permitting the use of a Pheidole specimen's endosymbiont in these analyses and Clark Smith and Elaine Bazinet Smith for monitoring one of the Ontario Malaise traps. DHJ and WH were assisted by Tanya Dapkey and the parataxonomist team of Area de Conservacion Guanacaste. We thank the anonymous reviewers whose comments helped to improve this manuscript.
Conceived and designed the experiments: MAS. Performed the experiments: MAS CB SR KC KH DS RR XZ. Analyzed the data: MAS SR. Contributed reagents/materials/analysis tools: MAS CB KC ESE JFT BLF JG MH WH KH JH DWH MJ DHJ YL SEM LP DQ SR JR RR MS CS JKS DS JW MW XZ. Wrote the paper: MAS KC ESE JFT BLF JG WH KH JH DWH MJ DHJ JTL SEM LP DQ SR JR RR MS CS JKS DS JW MW XZ.
- 1. Janzen DH, Hallwachs W, Blandin P, Burns JM, Cadiou JM, et al. (2009) Integration of DNA barcoding into an ongoing inventory of complex tropical biodiversity. Molecular Ecology Resources 9: 1–26.
- 2. Rasmussen RS, Morrissey MT, Hebert PDN (2009) DNA Barcoding of Commercially Important Salmon and Trout Species (Oncorhynchus and Salmo) from North America. Journal of Agricultural and Food Chemistry 57: 8379–8385.
- 3. Swartz ER, Mwale M, Hanner R (2008) A role for barcoding in the study of African fish diversity and conservation : review article. South African Journal of Science 104: 293–298.
- 4. Fisher BL, Smith MA (2008) A revision of Malagasy species of Anochetus Mayr and Odontomachus Latreille (Hymenoptera: Formicidae). PLoS ONE 3: e1787.
- 5. Smith MA, Fernandez-Triana J, Roughley R, Hebert PDN (2009) DNA barcode accumulation curves for understudied taxa and areas. Molecular Ecology Resources 9: 208–216.
- 6. Steinke D, Zemlak TS, Hebert PDN (2009) Barcoding Nemo: DNA-Based Identifications for the Ornamental Fish Trade. PLoS ONE 4: e6300.
- 7. Smith MA, Fisher B (2009) Invasions, DNA barcodes, and rapid biodiversity assessment using ants of Mauritius. Frontiers in Zoology 6: 31.
- 8. Virgilio M, Backeljau T, Nevado B, De Meyer M (2010) Comparative performances of DNA barcoding across insect orders. BMC Bioinformatics 11: 206.
- 9. Schmidt BC (2009) Taxonomic revision of the genus Grammia Rambur (Lepidoptera: Noctuidae: Arctiinae). Zoological Journal of the Linnean Society 156: 507–597.
- 10. Smith MA, Rodriguez JJ, Whitfield JB, Deans AR, Janzen DH, et al. (2008) Extreme diversity of tropical parasitoid wasps exposed by iterative integration of natural history, DNA barcoding, morphology, and collections. Proc Natl Acad Sci U S A 105: 12359–12364.
- 11. Smith MA, Wood DM, Janzen DH, Hallwachs W, Hebert PDN (2007) DNA barcodes affirm that 16 species of apparently generalist tropical parasitoid flies (Diptera, Tachinidae) are not all generalists. Proc Natl Acad Sci U S A 104: 4967–4972.
- 12. Whitworth TL, Dawson RD, Magalon H, Baudry E (2007) DNA barcoding cannot reliably identify species of the blowfly genus Protocalliphora (Diptera : Calliphoridae). Proceedings of the Royal Society B-Biological Sciences 274: 1731–1739.
- 13. Linares MC, Soto-Calderon ID, Lees DC, Anthony NM (2009) High mitochondrial diversity in geographically widespread butterflies of Madagascar: A test of the DNA barcoding approach. Molecular Phylogenetics and Evolution 50: 485–495.
- 14. Gompert Z, Forister ML, Fordyce JA, Nice CC (2008) Widespread mito-nuclear discordance with evidence for introgressive hybridization and selective sweeps in Lycaeides. Molecular Ecology 17: 5231–5244.
- 15. Nice CC, Gompert Z, Forister ML, Fordyce JA (2009) An unseen foe in arthropod conservation efforts: The case of Wolbachia infections in the Karner blue butterfly. Biological Conservation 142: 3137–3146.
- 16. Engelstadter J, Hurst GD (2010) The Ecology and Evolution of Microbes that Manipulate Host Reproduction. Annual Review of Ecology and Systematics 40: 127–149.
- 17. Hilgenboecker K, Hammerstein P, Schlattmann P, Telschow A, Werren JH (2008) How many species are infected with Wolbachia? - a statistical analysis of current data. Fems Microbiology Letters 281: 215–220.
- 18. Hurst GDD, Jiggins FM (2005) Problems with mitochondrial DNA as a marker in population, phylogeographic and phylogenetic studies: the effects of inherited symbionts. Proceedings of the Royal Society B: Biological Sciences 272: 1525–1534.
- 19. Raychoudhury R, Grillenberger BK, Gadau J, Bijlsma R, van de Zande L, et al. (2010) Phylogeography of Nasonia vitripennis (Hymenoptera) indicates a mitochondrial-Wolbachia sweep in North America. Heredity 104: 318–326.
- 20. Raychoudhury R, Baldo L, Oliveira D, Werren JH (2009) Modes of acquisition of Wolbachia: Horizontal transfer, hybrid introgression and codivergence in the Nasonia species complex. Evolution 63: 165–183.
- 21. Dobson SL, Bourtzis K, Braig HR, Jones BF, Zhou WG, et al. (1999) Wolbachia infections are distributed throughout insect somatic and germ line tissues. Insect Biochemistry and Molecular Biology 29: 153–160.
- 22. Braig HR, Zhou W, Dobson SL, O'Neill SL (1998) Cloning and characterization of a gene encoding the major surface protein of the bacterial endosymbiont Wolbachia pipientis. Journal of Bacteriology 180: 2373–2378.
- 23. Smith MA, Woodley NE, Janzen DH, Hallwachs W, Hebert PDN (2006) DNA barcodes reveal cryptic host-specificity within the presumed polyphagous members of a genus of parasitoid flies (Diptera : Tachinidae). Proc Natl Acad Sci U S A 103: 3657–3662.
- 24. Ratnasingham S, Hebert PDN (2007) BOLD: The Barcode of Life Data System. Molecular Ecology Notes 7: 355–364.
- 25. Fernandez Triana J, Smith MA, Boudreault C, Goulet H, Hebert PDN, et al. (2011) A poorly known high-latitude parasitoid wasp community: unexpected diversity and dramatic changes through time. PLoS ONE 6: e23719.
- 26. Baldo L, Dunning Hotopp JC, Jolley KA, Bordenstein SR, Biber SA, et al. (2006) Multilocus Sequence Typing System for the Endosymbiont Wolbachia pipientis. Appl Environ Microbiol 72: 7098–7110.
- 27. Meusnier I, Singer GAC, Landry JF, Hickey DA, Hebert PDN, et al. (2008) A universal DNA mini-barcode for biodiversity analysis. BMC Genomics 9: 214.
- 28. Hebert PDN, Cywinska A, Ball SL, DeWaard JR (2003) Biological identifications through DNA barcodes. Proc R Soc Lond B Biol Sci 270: 313–321.
- 29. Hebert PDN, Penton EH, Burns JM, Janzen DH, Hallwachs W (2004) Ten species in one: DNA barcoding reveals cryptic species in the neotropical skipper butterfly Astraptes fulgerator. Proceedings of the National Academy of Sciences of the United States of America 101: 14812–14817.
- 30. Ivanova NV, Dewaard JR, Hebert PDN (2006) An inexpensive, automation-friendly protocol for recovering high-quality DNA. Molecular Ecology Notes 6: 998–1002.
- 31. Johnstone RA, Hurst GDD (1996) Maternally inherited male-killing microorganisms may confound interpretation of mitochondrial DNA variability. Biological Journal of the Linnean Society 58: 453–470.
- 32. Jiggins FM (2003) Male-killing Wolbachia and mitochondrial DNA: Selective sweeps, hybrid introgression and parasite population dynamics. Genetics 164: 5–12.
- 33. Ballard JWO (2000) When one is not enough: introgression of mitochondrial DNA in Drosophila. Molecular Biology and Evolution 17: 1126–1130.
- 34. Shoemaker DD, Katju V, Jaenike J (1999) Wolbachia and the evolution of reproductive isolation between Drosophila recens and Drosophila subquinaria. Evolution 53: 1157–1164.
- 35. Sun X-J, Xiao J-H, Cook J, Feng G, Huang D-W (2011) Comparisons of host mitochondrial, nuclear and endosymbiont bacterial genes reveal cryptic fig wasp species and the effects of Wolbachia on host mtDNA evolution and diversity. BMC Evolutionary Biology 11: 86.
- 36. Weinert LA, Tinsley MC, Temperley M, Jiggins FM (2007) Are we underestimating the diversity and incidence of insect bacterial symbionts? A case study in ladybird beetles. Biology Letters 3: 678–681.
- 37. Chen LL, Cook JM, Xiao H, Hu HY, Niu LM, et al. (2010) High incidences and similar patterns of Wolbachia infection in fig wasp communities from three different continents. Insect Science 17: 101–111.
- 38. Fernandez Triana J, Smith MA, Boudreault C, Goulet H, Hebert PDN, et al. (in review) A poorly known high-latitude parasitoid wasp community: unexpected diversity and dramatic changes through time. PLoS ONE.
- 39. Baldo L, Werren JH (2007) Revisiting Wolbachia supergroup typing based on WSP: Spurious lineages and discordance with MLST. Current Microbiology 55: 81–87.
- 40. Russell JA, Goldman-Huertas B, Moreau CS, Baldo L, Stahlhut JK, et al. (2009) Specialization and Geographic Isolation Among Wolbachia Symbionts from Ants and Lycaenid Butterflies. Evolution 63: 624–640.
- 41. Ferrari J, Vavre F (2011) Bacterial symbionts in insects or the story of communities affecting communities. Philos Trans R Soc Lond B Biol Sci 366: 1389–1400.
- 42. Wenseleers T, Ito F, van Borm S, Huybrechts R, Volckaert F, et al. (1998) Widespread occurrence of the micro-organism Wolbachia in ants. Proc R Soc London Ser B Biol Sci 265: 1447–1452.
- 43. Grasso DA, Wenseleers T, Mori A, Le Moli F, Billen J (2000) Thelytokous worker reproduction and lack of Wolbachia infection in the harvesting ant Messor capitatus. Ethology Ecology & Evolution 12: 309–314.
- 44. Excoffier L, Laval G, Schneider S (2005) Arlequin ver. 3.0: An integrated software package for population genetics data analysis. Evol Bioinform Online 1: 47–50.
- 45. Rozas J, Sánchez-DelBarrio JC, Messeguer X, Rozas R (2003) DnaSP, DNA polymorphism analyses by the coalescent and other methods. Bioinformatics 19: 2496–2497.
- 46. Smith MA, Eveleigh ES, McCann KS, Merilo MT, McCarthy PC, et al. (2011) Barcoding a quantified food web: crypsis, concepts, ecology and hypotheses. PLoS ONE 6: e14424.
- 47. Li YW, Zhou X, Feng G, Hu HY, Niu LM, et al. (2009) COI and ITS2 sequences delimit species, reveal cryptic taxa and host specificity of fig-associated Sycophila (Hymenoptera, Eurytomidae). Molecular Ecology Resources 10: 31–40.