Neotropical Bats: Estimating Species Diversity with DNA Barcodes

DNA barcoding using the cytochrome c oxidase subunit 1 gene (COI) is frequently employed as an efficient method of species identification in animal life and may also be used to estimate species richness, particularly in understudied faunas. Despite numerous past demonstrations of the efficiency of this technique, few studies have attempted to employ DNA barcoding methodologies on a large geographic scale, particularly within tropical regions. In this study we survey current and potential species diversity using DNA barcodes with a collection of more than 9000 individuals from 163 species of Neotropical bats (order Chiroptera). This represents one of the largest surveys to employ this strategy on any animal group and is certainly the largest to date for land vertebrates. Our analysis documents the utility of this tool over great geographic distances and across extraordinarily diverse habitats. Among the 163 included species 98.8% possessed distinct sets of COI haplotypes making them easily recognizable at this locus. We detected only a single case of shared haplotypes. Intraspecific diversity in the region was high among currently recognized species (mean of 1.38%, range 0–11.79%) with respect to birds, though comparable to other bat assemblages. In 44 of 163 cases, well-supported, distinct intraspecific lineages were identified which may suggest the presence of cryptic species though mean and maximum intraspecific divergence were not good predictors of their presence. In all cases, intraspecific lineages require additional investigation using complementary molecular techniques and additional characters such as morphology and acoustic data. Our analysis provides strong support for the continued assembly of DNA barcoding libraries and ongoing taxonomic investigation of bats.


Introduction
DNA barcoding studies employ the mitochondrial cytochrome c oxidase subunit 1 gene (COI) as a tool for species identification and discovery through the comparison of inter- and intraspecific sequence divergences [1]. The effectiveness of this technique has been validated in various animal groups, where most species are characterized by highly similar haplotypes with low intraspecific variation and substantial divergence from closely related taxa [1][2][3][4][5]. In a few cases incomplete lineage sorting or shared barcode haplotypes exist between hybridizing or closely related taxa [5,6], limiting identifications for several groups of species (invariably within a genus). Conversely, most prior barcode studies have generated hypotheses about the existence of cryptic species based on unusually high genetic divergence between intraspecific lineages, some of which have subsequently been recognized as having morphological or ecological differences e.g. [7], supporting the use of barcoding for species discovery.
Assembling a reference database of DNA barcode sequences for mammals is an obvious target for the global DNA barcode of life campaign. Mammals are a large, charismatic and relatively well-studied group of animals, yet with just over 5400 species recognized in 2007 [8] they are a modest objective, making the assembly of a DNA barcoding reference library readily attainable. Despite the popular assumption that most mammals have been described, the rate of species discovery has actually accelerated recently [8], particularly with the aid of new molecular technologies. Bats (order Chiroptera) represented approximately 20% (1116 of 5416) of all mammal species indexed in 2005 [9], but the incidence of overlooked taxa is likely to be particularly high within this group due to their cryptic, nocturnal, volant behaviour and the often subtle morphological differences between species.
Most past DNA barcode studies of mammals have concentrated on local faunas or have had a taxonomically limited scope and include two studies of primates [10,11], one survey of bats [4], one survey of small mammals [12], a methodological study [3] and a taxonomic revision of the bat Myotis phanluongi [13]. Molecular taxonomic surveys of bats using mitochondrial genes other than COI have been conducted in Europe [14] using ND1 and in Central and South America [15] using cytochrome b. In both cases, numerous hypotheses regarding cryptic speciation were advanced. The largest study of bats to date [16] included 1896 specimens representing 157 bat species in South East Asia and speculated that taxonomic richness in this area may be underestimated by more than 50%. Francis et al. [16] also speculate that rates of endemism are much higher than previously recognized by classical morphology, a conclusion which has great conservation implications for the region.
Bradley and Baker [17] derived a set of criteria for evaluating the taxonomic implications of genetic diversity at mitochondrial loci (particularly cytochrome b): values <2% were indicative of intraspecific variation, values between 2 and 11% were often indicative of variation between species (thus species with intraspecific values in this range require additional taxonomic scrutiny) and values >11% invariably indicated the presence of other congeneric species. Baker and Bradley [15] defined a theoretical framework for a genetic species concept for mammals and, using criteria similar to Bradley and Baker [17], evaluated cytochrome b sequences from 718 specimens representing 61 Neotropical mammal species (29 of which were bats). In total, Baker and Bradley [15] identified 32 cases (11 in bats) where a currently recognized species contained "phylogroups" with substantial DNA sequence variation (>5%) suggesting the presence of cryptic species and concluded that the species richness of mammals in Neotropical regions may be significantly underdiagnosed. While this conclusion is similar to that of Francis et al. [16], it is somewhat surprising because, although the Neotropics contain some of the highest bat species diversity in the world [18], they have also received considerable taxonomic scrutiny e.g. [19][20][21][22][23][24]. Given the increasing evidence suggesting that cryptic diversity is prevalent in this region [4,12,15], a comprehensive survey of potential diversity is needed on a scale which is taxonomically diverse, geographically broad, and includes many representatives per species.
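The distance ranges attributed to Bradley and Baker [17] above can be expressed as a simple binning rule. The following is a minimal sketch only: the function name, the category labels and the treatment of values falling exactly on 2% or 11% are assumptions made for illustration, and the ranges were derived for cytochrome b rather than COI.

```python
def classify_divergence(mean_pct: float) -> str:
    """Bin a mean intraspecific divergence (in percent) into the three
    ranges described in the text. Boundary handling is an assumption:
    exactly 2% and exactly 11% are placed in the middle category."""
    if mean_pct < 0:
        raise ValueError("divergence cannot be negative")
    if mean_pct < 2.0:
        return "intraspecific variation"
    if mean_pct <= 11.0:
        return "requires taxonomic scrutiny"
    return "congeneric species likely"


# Example values taken from figures quoted elsewhere in this article.
for value in (1.38, 2.48, 11.79):
    print(value, classify_divergence(value))
```

A rule of this kind looks only at a single summary value per species, so it cannot see how the divergence is distributed among lineages within that species.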
Here we examine patterns of COI sequence divergence in 9076 vouchered specimens from 163 bat species spanning collections from 13 countries across the continental Neotropics. To the best of our knowledge, it is one of the largest molecular surveys of biodiversity ever conducted and certainly the largest for land vertebrates. We evaluate these species with the following goals: 1) to assess genetic variation, 2) to estimate the number of distinct intraspecific mitochondrial lineages and 3) to evaluate the distance-based criteria used by Bradley and Baker [17] to categorize mitochondrial diversity. We use these data to estimate the potential taxonomic richness of the area and to provide a framework for further taxonomic investigation.
Figure 1. A neighbour-joining tree of COI sequence divergence (K2P) in surveyed species in the family Emballonuridae. All currently recognized species are supported by bootstrap values ≥97 (1000 replications). Triangles indicate the relative number of individuals sampled (height) and sequence divergence (width). In two cases, Saccopteryx bilineata and Cormura brevirostris (highlighted in red), deep intraspecific mitochondrial lineages are present which are strongly supported, indicating the need for additional taxonomic scrutiny. The identification of intraspecific lineages can be hindered by small sample sizes from large geographic areas (e.g. Cyttarops alecto) where divergent sequences may represent independent lineages or poorly sampled intraspecific variation. doi:10.1371/journal.pone.0022648.g001

Sample Acquisition
We sampled preserved tissue from 9076 vouchered specimens held at the Royal Ontario Museum, representing 163 species from 65 genera including representatives from all nine bat families present within Central and South America. We followed the taxonomic designations of Simmons [9] with the following exceptions: we retained Artibeus intermedius as distinct from A. lituratus (R.J. Baker, pers. comm.), A. planirostris as distinct from A. jamaicensis following Lim et al. [23], A. bogotensis as distinct from A. glaucus [24], a species of Choeroniscus in the western Amazon distinct from C. minor due to a taxonomic revision in progress, and Molossus sp. as an undescribed species in Guyana following Lim and Engstrom [21] and Clare et al. [4]. Details on all specimens (sampling location, GPS co-ordinates of collection, voucher number etc.) are available within the "Bats of the Neotropics" project in the Barcode of Life Data Systems (BOLD, www.barcodinglife.org). Records from previously published data used here are contained on BOLD within the projects "Bats of Guyana" [4], "BMC Sturnira" [3] and "Small mammal survey in Bakhuis, Suriname" [12]. Our protocols for DNA extraction, amplification and sequencing follow Clare et al. [4], Ivanova et al. [25,26] and Borisenko et al. [12]. GenBank, BOLD and museum accessions for all sequences are located in Table S1.

Data analysis
We aligned sequences using SeqScape v.2.1.1 (Applied Biosystems) and edited them manually. Sequences and original trace files are available in the BOLD projects described earlier. We calculated sequence divergences using the Kimura-two-parameter (K2P) model of base substitution [27] and generated a neighbor-joining (NJ) tree of K2P distances showing intra- and interspecific variation in BOLD (Figure S1). We generated all other trees in MEGA [28] as NJ trees of K2P sequence variation. Given the number of sequences and that phylogeny/branch arrangements were not a goal of this analysis, branch support was calculated on subsets of species for simplicity using 1000 bootstrap replications.
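For readers unfamiliar with the K2P model, the distance between two aligned sequences is d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of sites differing by a transition and by a transversion, respectively. The sketch below illustrates that formula only. It is not the BOLD or MEGA implementation, and the function name and the choice to skip gaps and ambiguous bases are assumptions.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura two-parameter distance between two aligned sequences.
    Sites where either sequence has a gap or ambiguous base are skipped
    (an assumption; real tools offer several deletion options)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    term1 = 1 - 2 * p - q
    term2 = 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term1 * math.sqrt(term2))


# Toy example (hypothetical sequences): one transition in ten sites.
print(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"))
```

Multiplying the result by 100 gives a percentage comparable to the divergence values reported in this article.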

Molecular Taxonomic Identification
Our analysis included a mean of 56 individuals per species (range 1-1013, median = 11) with 147 species represented by multiple samples. The NJ tree of COI sequence divergence for all individuals (Figure S1) demonstrates that only two species, Artibeus lituratus and A. intermedius, are not differentiated by COI sequences. In both species, levels of intraspecific variation are similar to other species in the genus (A. lituratus mean = 0.69% and A. intermedius mean = 0.79%) but the two form a single reciprocally monophyletic cluster with many common haplotypes. Mean intraspecific sequence variation in all species represented by ≥3 sequences was 1.38% (equal weighting regardless of sample size), but varied from 0-11.79%. Mean intraspecific variation was not correlated with sample size (one-tailed test, r = 0.03, p = 0.74 for all species with n ≥ 3). Using the criteria established by Bradley and Baker [17] we observed 107 species with <2% mean sequence divergence which would be classified as intraspecific variation, whereas 29 had between 2 and 11% mean sequence divergence and would be classified as potentially containing cryptic species requiring additional taxonomic scrutiny, and one species contained >11% mean sequence divergence. A visual inspection of the structure of the NJ trees (Figure S1, Figure S2) suggests that at least 44 of the species surveyed may contain distinct intraspecific mitochondrial lineages (e.g. Figure 1) with substantial divergence from other conspecifics, most supported by bootstrap values ≥90 (Tables 1-8). In some cases, these lineages represent a single divergent haplotype in the dataset which may reflect rare mutations within a geographic area (e.g. Pteronotus personatus; Figure S1, Figure S2) rather than distinct lineages. In other cases, small sample sizes from large geographic areas (e.g. Cyttarops alecto; Figure 1) hinder the interpretation of mitochondrial sequence variation because divergent sequences may represent independent lineages or panmictic intraspecific variation that is poorly sampled. Divergent intraspecific lineages are found with both allopatric (e.g. Figure 2a) and sympatric (e.g. Figure 2b) distributional patterns.
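The phrase "equal weighting regardless of sample size" can be illustrated with a short sketch. It assumes that pairwise within-species distances (in percent) have already been computed. The function names and the toy values are hypothetical and are not data from this study.

```python
from statistics import mean


def per_species_means(pairwise_pct, n_sequences, min_n=3):
    """Mean pairwise divergence for each species with at least min_n sequences."""
    return {
        species: mean(distances)
        for species, distances in pairwise_pct.items()
        if n_sequences[species] >= min_n and distances
    }


def equal_weight_mean(species_means):
    """Average of the per-species means: every species counts once."""
    return mean(list(species_means.values()))


def pooled_mean(pairwise_pct, species_means):
    """Average over all pairwise distances: heavily sampled species dominate."""
    return mean([d for sp in species_means for d in pairwise_pct[sp]])


# Hypothetical toy data: n sequences give n(n-1)/2 pairwise distances.
pairwise_pct = {
    "sp_A": [0.0, 1.0, 2.0],                  # 3 sequences
    "sp_B": [3.0, 3.0, 3.0, 3.0, 3.0, 3.0],   # 4 sequences
    "sp_C": [9.0],                            # 2 sequences, excluded
}
n_sequences = {"sp_A": 3, "sp_B": 4, "sp_C": 2}

means = per_species_means(pairwise_pct, n_sequences)
print(equal_weight_mean(means))
print(pooled_mean(pairwise_pct, means))
```

With sample sizes ranging from 1 to 1013 individuals per species, a pooled average would be driven almost entirely by the few most heavily sampled species, which is presumably why an equal-weight mean is reported.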
For twelve species our sampling was extensive with 64-1013 sequences acquired per species from 5-10 countries in both Central and South America (Figure 3). In two of these cases (Artibeus lituratus and Carollia perspicillata) no geographic structuring is evident despite sequence divergences of up to 2.35% and 2.83% respectively. In the remaining ten species substantial mitochondrial structuring was observed. In four cases (Chrotopterus auritus, Saccopteryx bilineata, Anoura geoffroyi, and Sturnira lilium) distinct mitochondrial lineages within each species appear to have allopatric distributions. In Uroderma bilobatum, Central and South American groups are similarly evident except for one sample from Ecuador that groups with Central America (though see [29] for a discussion of U. bilobatum). Within each of the remaining five species (Platyrrhinus helleri, Glossophaga soricina, Desmodus rotundus, Trachops cirrhosus, and Pteronotus parnellii) distinct lineages are found with both allopatric and sympatric (either in whole or in part) distributional patterns. Similarly, C. brevicauda and C. sowelli (formerly included in C. brevicauda but restricted to Central America) have a potential sympatric zone in central Panama (Figure 4). In seven species (C. auritus, S. bilineata, S. lilium, P. helleri, G. soricina, A. geoffroyi and P. parnellii) the Central American specimens form a single group that is distinct from South American groups (Figure 3).

Discussion
To our knowledge, the present study is the largest survey ever conducted of land vertebrate mtDNA diversity. Our results provide further confirmation that DNA barcoding is a powerful tool for species identification in Neotropical bats regardless of geographic scale or sample size. Only two of the 163 species examined in this study (Artibeus intermedius and A. lituratus) share haplotypes and cannot be distinguished via DNA barcoding. The remaining species are distinguishable at this locus and the resulting library of molecular data will be a powerful tool for guiding systematic research and furthering phylogeographic studies. As our sequences are all derived from vouchered specimens the reference database will also be a valuable tool for validating field collections e.g. [12] when vouchering is impractical and the discrimination of some species requires examination of morphological characters which cannot be evaluated on live specimens (e.g. cranial or dental characters). In addition, molecular tools can help to identify partial remains or trace materials from guano when capture, morphological assessment or tissue acquisition are not possible [30,31].

Cryptic Taxa and Estimates of Diversity
DNA barcoding campaigns seek to simplify and aid in the identification of species, and to advance species discovery by using deep intraspecific sequence divergence between mitochondrial lineages as an indication of potential new species. Methods of identifying cryptic lineages are diverse. Distance-based methods are common, particularly using strict thresholds [15,17]. However, thresholds will not necessarily reveal recently diverged species and may inflate or deflate the species count within some genera if not accompanied by analyses of morphological, behavioral and ecological characteristics. Rate heterogeneity and variation in selective pressure on protein evolution in mitochondrial DNA likely contribute to levels of genetic divergence [32] but they also make character-based approaches [33][34][35], the 10x threshold rule [36] and other distance approaches [37,6] unlikely to provide more accurate estimates of cryptic species.
We estimated potential taxonomic richness by visual inspection of trees for distinct lineages that are well supported (most bootstrap values ≥90) and compared these to the criteria described by Bradley and Baker [17] and Baker and Bradley [15]. Only 30 of 137 taxa represented by 3 or more samples contained >2% mean sequence divergence and would be flagged by the Bradley and Baker [17] criteria. In contrast, by visually inspecting the trees for deep, intraspecific, mitochondrial structure we found 44 cases of potential cryptic speciation. In three cases, Furipterus horrens (2.48% mean sequence divergence), Enchisthenes hartii (2.12% mean sequence divergence) and Cyttarops alecto (3.69% mean sequence divergence), species had divergence >2% but no distinct mitochondrial lineages or "phylogroups" as defined in Baker and Bradley [15], though in all three cases determining the pattern of intraspecific divergence is complicated by a small sample size. Maximum sequence divergence was a similarly poor predictor of mitochondrial lineages. It is also interesting to note that one of the best examples to date of cryptic diversity and the genetic species concept in bats, Uroderma bilobatum [38], would not have been flagged for taxonomic reassessment as it had 1.13% mean sequence divergence, though internal mitochondrial structuring was obvious by visual inspection of the tree. It should be noted, however, that cytochrome b evolves at a faster rate than COI [39], so the criteria developed by Bradley and Baker [17] might need to be lowered for COI, though explicit tests of rate heterogeneity have not been made here and variation in selection pressure may alter this pattern. It remains to be seen how many cases of distinct mitochondrial lineages are associated with a cessation of gene flow, an assessment that will require the analysis of nuclear loci.
Even in this relatively well-studied group, our estimates of species richness suggest as much as a 42% increase in species diversity compared to current estimates (Tables 1-8). Though these are rough estimates, and can change depending on how "intraspecific mitochondrial lineages" is defined, they provide a guide for future systematic research and the number of cases is likely to increase with more complete geographic sampling, particularly with the addition of specimens from the Antilles due to the influence of island isolation [40]. In particular, the monotypic genera Desmodus and Trachops may contain as many as 15 intraspecific lineages, any of which may represent cryptic species (Figure 3, Table 4, Table 7) and this observation is in accordance with the high diversity in Desmodus observed by Martins et al. [41,42]. Of the 12 species with extensive geographic and individual sampling (Figure 3) six appear to contain multiple divergent lineages located within the same countries (particularly Ecuador, Guyana, and Suriname) suggesting at least partially sympatric ranges for these lineages and raising questions about modes of reproductive isolation, the role of male-mediated gene flow, and the frequency of hybridization.
Allopatric lineages can be difficult to define as they may appear allopatric due to incomplete sampling. In Saccopteryx bilineata (Figure 2a) our sampling suggests three distinct lineages that are strongly geographically isolated. However, no break in the distribution of S. bilineata is currently recognized, making it impossible to predict whether these lineages would become one hyperdiverse cluster if sampling through Central America and northern South America were increased, or whether the lineages are maintained with allopatric or sympatric distributions. The genus Carollia contains newly described species which were recognized genetically [43,44]. Carollia brevicauda was thought to be distributed in both Central and South America until the Central American lineage was identified as distinct and revised as C. sowelli (Figure 4) by Baker et al. [43]. These species were reported as occupying allopatric distributions [9], but our data (Figure 4) suggest a sympatric zone in central Panama, though it cannot be determined from these data whether these species hybridize or live in reproductive isolation at this location. Previous regional assessments of bat diversity using COI [4,12] identified a number of species which may represent complexes of undescribed taxa, though these were only investigated in small geographic areas. In the continental survey conducted here, lineages proposed by Clare et al. [4] and Borisenko et al. [12] were supported by increased sampling over broader geographic areas.

Future Research Directions
Mean sequence divergence in bats (1.38%) is substantially higher than that observed in birds (0.23%), the only other vertebrate group to have been surveyed across a continent [5]. However, the birds were of North American origin so the effect of locality cannot be separated from that of taxonomy. Similarly, the proportion of distinct lineages reported here is high compared to birds [5], but not dissimilar to estimates provided for mammals by Baker and Bradley [15] and for South East Asian bats by Francis et al. [16]. Several clear research priorities exist to understand the biodiversity of Neotropical bats. First, the nature and extent of intraspecific sequence divergence must be quantified to provide an accurate measure of diversity, and this must be done in the context of selection, rates of mutation, protein evolution and the role of selective sweeps [45,46], particularly in hyperdiverse taxa. For taxonomic assessments, additional gene regions/markers, particularly of nuclear origin, will be required to understand evolutionary patterns e.g. [29]. Directed morphological analysis of species in potential areas of diversity will also help to clarify species boundaries. Because many bats do not rely on vision as a primary means for conspecific identification, they likely use other sensory modalities for mate recognition. Acoustic analysis of echolocation may identify the basis for intra- and interspecific recognition and potential modes of speciation [47]. Alternatively, olfaction also plays a large role in habitat choice (particularly for food) and may also be utilized in intra- and interspecific recognition. For example, many of the "whispering bats" (family Phyllostomidae, widely represented in our dataset) use lower intensity echolocation calls (although see [48,49]) but tend to be frugivorous or nectivorous species which may rely heavily on olfactory cues for both food acquisition and mate recognition. Some insectivores, such as some sac-winged bats (Emballonuridae), also rely heavily on olfaction to attract mates [50]. Alternative isolating cues in these different sensory modalities may evolve faster in species where selection drives non-visual means of inter- and intraspecific recognition. While these traits cannot be evaluated in museum specimens, they may provide a wealth of research opportunities and a method of identifying cryptic modes of assortative mating and prezygotic reproductive isolation.

Supporting Information
Figure S1 A neighbour-joining tree of COI sequence divergence (K2P) in surveyed species. (PDF)
Figure S2 Neighbour-joining trees of COI sequence divergence (K2P) in surveyed species simplified to show current species designations and cases of deeply divergent intraspecific lineages (coloured red) in need of further systematic study. For clarity, trees were generated on subsets of the total dataset. All branch supports represent bootstrap values (1000 replications). (PDF)