The Integrative Taxonomic Approach Reveals Host Specific Species in an Encyrtid Parasitoid Species Complex

  • Douglas Chesters,

    Affiliation Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China

  • Ying Wang,

    Affiliations Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China, Division of Forest Protection, School of Forestry, Northeast Forestry University, Harbin, China

  • Fang Yu,

    Affiliation Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China

  • Ming Bai,

    Affiliation Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China

  • Tong-Xin Zhang,

    Affiliation Ningbo Technology Extension Center for Forestry and Specialty Forest Products, Ningbo, China

  • Hao-Yuan Hu,

    Affiliation College of Life Science, Anhui Normal University, Wuhu, China

  • Chao-Dong Zhu,

    Affiliation Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China

  • Cheng-De Li,

    Affiliation Division of Forest Protection, School of Forestry, Northeast Forestry University, Harbin, China

  • Yan-Zhou Zhang

    Affiliation Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China


Integrated taxonomy uses evidence from a number of different character types to delimit species and other natural groupings. While this approach has been advocated recently, and should be of particular utility in the case of diminutive insect parasitoids, there are relatively few examples of its application in these taxa. Here, we use an integrated framework to delimit independent lineages in Encyrtus sasakii (Hymenoptera: Chalcidoidea: Encyrtidae), a parasitoid morphospecies previously considered a host generalist. Sequence variation at the DNA barcode (cytochrome c oxidase I, COI) and nuclear 28S rDNA loci was compared to morphometric recordings and mating compatibility tests, among samples of this species complex collected from its four scale insect hosts, covering a broad geographic range of northern and central China. Our results reveal that Encyrtus sasakii comprises three lineages that, while sharing a similar morphology, are highly divergent at the molecular level. At the barcode locus, the median K2P molecular distance between individuals from three primary populations was found to be 11.3%, well outside the divergence usually observed between Chalcidoidea conspecifics (0.5%). Corroborative evidence that the genetic lineages represent independent species was found from mating tests, where compatibility was observed only within populations, and morphometric analysis, which found that despite apparent morphological homogeneity, populations clustered according to forewing shape. The independent lineages defined by the integrated analysis correspond to the three scale insect hosts, suggesting the presence of host specific cryptic species. The finding of hidden host specificity in this species complex demonstrates the critical role that DNA barcoding will increasingly play in revealing hidden biodiversity in taxa that present difficulties for traditional taxonomic approaches.


Parasitoids are insects that feed upon arthropod hosts during larval development [1]. They represent a key division of terrestrial food webs [2], [3], [4], and yet knowledge, particularly on their species richness, is severely limited [5], [6]. This situation is understandable given the lack of morphological differentiation in many sibling species, and the methodological difficulties posed in rearing due to the presence of multiple trophic levels and complex life cycles [7], but must be addressed if factual estimates of insect diversity and host-specificity are to be known. Parasitoids represent a substantial proportion of biodiversity, with about 8.5% of described insect species [2], yet this figure does not take into account current thinking on the constraints of host-parasite relationships [3], [8], [9], [10], meaning the diversity of parasitoids may be substantially underestimated.

The discovery of cryptic species is proliferating in no small part due to the adoption of molecular data into taxonomic study. In particular, a new tool has been developed, widely adopted and tested, and is providing invaluable information about species identities in such difficult to study taxa. DNA barcoding typically uses universal primers to sequence a standardized segment of the mitochondrial COI gene [11]. The resulting data can be used in i) assigning taxon names to newly sequenced individuals, by reference to a barcode library, and more controversially, ii) delimiting species boundaries and thus assigning new species. Considerable investment has been made in the barcoding endeavor, with the barcode of life database (BOLD) currently holding over 110,000 species, with the eventual aim to obtain 10× coverage for all ∼ 10 million animal species [12]. The ease and rate at which barcode sequences are being obtained and analyzed mean they have been of great utility in highlighting possible cases of cryptic speciation, often prompting further taxonomic work [13], [14], [15], [16], [17], [18], [19], [20]. In the case of cryptic parasitic species, it is often found that the sibling populations correspond to differing host species [21], [22], [23], suggesting that host generalism has been assumed where it is unwarranted. Theory suggests generalism (host generalism and otherwise) is unlikely to be maintained through speciation [10], meaning apparent examples of generalism are illusory, and thus current biodiversity estimates are an underestimation [24]. Given the breadth of inquiries and biological endeavors that may be sensitive to the accurate description of species, and the power of DNA barcoding to provide extensive divergence information with little expertise or taxon specific knowledge, it seems inevitable that taxonomic description will incorporate barcoding-like approaches, and that patterns in host-parasite relationships will be better resolved.

While DNA barcode datasets sweep through biodiversity, few would advocate replacing current species descriptions with groupings defined by sequence variation from a single fragment of mitochondrial DNA. No particular approach to taxonomy is without complication, and the theoretical causes of incongruence between mitochondrial variation and a species tree are well known [25]. There is intuitive benefit in taking a whole evidence, or ‘integrative’ approach to taxonomy [26], and consulting evidence from different disciplines in order to avoid pitfalls associated with a single approach. Incongruence between methods arises from various aspects. Firstly, while a general consensus is emerging on a definition of the species [27], disagreements remain on the degree of divergence at which separately evolving populations are regarded as different species [28], [29]. In addition, the evolutionary processes resulting in population divergence are heterogeneous [30]. The integrative taxonomy approach uses numerous such lines of evidence to corroborate taxonomic hypotheses, without ruling out that a single delineation criterion may correctly indicate the species [26]. Commonly used delimitation criteria include phenotypic distinctiveness, ecological niche divergence [31], reciprocal monophyly [32] and clustering of molecular data [33]. For example, extensive mitochondrial variation alone cannot be used to infer species, where reproductive compatibility is still present [34]. In the current paper we take an integrative approach to delineate species in the E. sasakii complex. E. sasakii are endoparasitic Hymenoptera belonging to the hyperdiverse wasp family, Encyrtidae (Hymenoptera: Chalcidoidea). The hosts of E. sasakii are scale insects (of the Coccoidea superfamily), specifically, Rhodococcus sariuoni, Takahashia japonica, Eulecanium kuwanai and Eulecanium giganteum [35], [36], [37], [38], [39], [40]. We find evidence of extensive molecular variation at the barcode locus among E. sasakii populations inhabiting different hosts, and find corroboration in the form of reproductive and morphometric characteristics.


Collection of Host Populations

In view of the broad range of hosts recorded for E. sasakii in the literature (see above), a survey of the hosts yielding E. sasakii was carried out during the period 2006–2010. However, only the host species Eulecanium kuwanai (Kuwana), Eulecanium giganteum (Shinji), Takahashia japonica (Cockerell) and Rhodococcus sariuoni generated the E. sasakii parasitoid. These host species are distributed in central and northern China, Japan (T. japonica) and Korea (E. kuwanai). In total, 18 populations of the host species were collected from host plants (Sophora japonica, Loropetalum chinense, Ulmus sp., etc.), throughout their continental range (Figure 1). Twigs from scale insect infested plants were returned to the lab and parasitoids segregated upon emergence. Approximately 2000 E. sasakii individuals were lab reared. Parasitoids were identified by author Yan-Zhou Zhang. The host scale insects were identified by an experienced taxonomist, Professor San-An Wu.

Ethics Statement

No specific permits were required for the described field studies.

DNA Extraction, PCR and Sequencing

DNA was extracted from adult specimens using the DNeasy Blood & Tissue Kit (Qiagen) according to the manufacturer’s protocols. All PCRs were performed on an Eppendorf thermal cycler, using 50 µL reaction volume as follows: 5 µL DNA template, 5 µL 10× Buffer (Takara), 25 mM MgCl2, 2.5 mM dNTP mixture, 10 pmol of each primer, and 1 unit of ExTaq DNA polymerase (Takara). To amplify the 28S ribosomal gene D2 expansion segment, the primers D2-3549 [F] 5′-AGTCGTGTTGCTTGATAGTGCAG-3′ [41] and D2-4068 [R] 5′-TTGGTCCGTGTTTCAAGACGGG-3′ [42] were used. PCR cycles were as follows: 3 min at 94°C; 30 cycles of 1 min at 94°C, 45 s at 58°C, 1 min at 72°C; followed by 6 min at 72°C. The mitochondrial cytochrome oxidase I (COI) gene was amplified using the universal DNA barcoding primers LCO1490 (5′-GGTCAACAAATCATAAAGATATTGG-3′), and HCO2198 (5′-TAAACTTCAGGGTGACCA) [43]. The PCR program was as follows: 1 cycle of 3 min at 94°C, 5 cycles of 1 min at 94°C, 1 min at 45°C, and 1.5 min at 72°C, followed by 30 cycles of 1 min at 94°C, 1 min at 50°C, and 1 min at 72°C, with a final step of 5 min at 72°C. PCR products were electrophoresed through agarose gel (1%) then sequenced using BigDye v3.1 on an ABI PRISM 3730xl DNA Analyzer.

Figure 2. Forewing of E. sasakii, showing positions of the seven landmarks used for morphometric analyses.

Analysis of Molecular Data

Sequence alignment was unambiguous, and carried out manually using BioEdit [44]. Model testing was performed on individual partitions, and the concatenated matrix, using MrAIC v1.4.3 [45] and PhyML v2.4.4 [46]. Phylogenies were then inferred under the optimal evolutionary model using MrBayes v3.1.2 [47]. Evolutionary parameters (state frequencies, substitution rates, alpha and the proportion of invariant sites) were allowed to vary amongst four partitions: 28S, and the three codon positions of COI. Two independent runs were performed, both with one cold and seven heated chains, and sampled at intervals of 10,000. Runs were terminated when the standard deviation of split frequencies dropped below 0.01, then the parameter distributions were checked using Tracer v1.5 [48]. Neighbor joining trees were also generated under the optimal model, using Paup*4b [49]. The branch-lengths on the Bayesian phylogeny and the NJ phylogram were adjusted by non-parametric rate smoothing [50] to form an ultrametric tree for analysis of branch waiting times. Branch rate smoothing was carried out using the r8s program [51], fixing the age of the root node at an arbitrary value of 1.0. The evolutionary units on the ultrametric trees were then inferred using the general mixed Yule coalescent approach (GMYC) [52], with a likelihood ratio test performed of a GMYC model against a null model whereby a single coalescent population was fit upon the tree.

Figure 3. Bayesian consensus phylogeny of E. sasakii.

Node support is indicated by posterior probabilities, and is given where >80. The upper (green), central (red), and lower (blue) clade represent specimens isolated from R. sariuoni, T. japonica, and E. kuwanai/E. giganteum, respectively. First two letters of terminal name indicate sampling locality, where QH = Qinghai, SD = Shandong, BJ = Beijing, SH = Shaanxi, HJ = Heilongjiang, JL = Jilin, JS = Jiangsu, HN = Henan, SX = Shanxi.

Figure 4. GMYC groups on the ultrametric NJ tree, generated from 31 unique haplotypes.

Three clusters (shown in highlighted boxes) and one singleton (BJ0893A) are found as significant GMYC entities.

The molecular distances between individuals from different populations were calculated by the standard K2P measure for DNA barcodes, using Paup*4b, and characters diagnosing the populations identified using the Caos software [53]. The distribution of molecular divergences found between the populations was compared to divergences in Chalcidoidea as a whole, using i) intraspecific divergences, and ii) congeneric divergences. All Chalcidoidea DNA sequences were downloaded from Genbank, and searched locally using software from the Blast+ toolkit [54]. A Chalcidoidea database was created with makeblastdb, and queried using one of the newly sequenced E. sasakii COI sequences (JS06A). The blastn method was used for homology searching, with a strict e-value cutoff of 1e-5, and the tabular output format invoked (option: -outfmt 6) to aid parsing. The hit sequences were then extracted and a fasta file formed, using a Perl script. The COI barcode sequences were then aligned using the protein version of BlastAlign [55], against the translated JS06A sequence. The aligned Chalcidoidea sequences were checked by eye and the edges trimmed, using BioEdit. Where species were fully identified (where the species string in the description line matched the typical binomial format), the K2P distances were calculated as previously. The molecular distances were then split into intraspecific observations, and congeneric observations. The E. sasakii and Chalcidoidea distances were read into R for analysis [56].
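The K2P (Kimura 2-parameter) measure referred to above can be illustrated with a minimal Python sketch. It is not the Paup* implementation used in the study; the function name and the decision to skip gaps and ambiguity codes are assumptions made for illustration. With P the proportion of transitions and Q the proportion of transversions among compared sites, the distance is d = -0.5 · ln((1 - 2P - Q) · sqrt(1 - 2Q)).

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Assumption: sites with gaps or ambiguity codes are ignored.
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        # A transition stays within purines or within pyrimidines.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy example (hypothetical sequences): one transition in ten sites.
print(k2p_distance("AAAAAAAAAA", "AAAAAAAAAG"))
```

Because the correction grows with the observed difference, K2P values are slightly larger than raw proportions of differing sites at the divergences discussed here.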

Table 1. Specimen information for the sequences used in molecular analyses.

Figure 5. Boxplot giving pair-wise molecular distances between, (upper) individuals from different species of the same genus in the Chalcidoidea, (central) different members of the same species in the Chalcidoidea, (lower) individuals belonging to different E. sasakii host-related populations.

Morphometric Analysis

Geometric morphometrics have been used to study various insect taxa ranging from species level to analysis of a superfamily, and have been informative in investigating relationships between members of lower taxonomic levels [57]. In this study, the first application of geometric morphometrics in Encyrtidae was carried out. Although previous taxonomy of the genus Encyrtus [58], [59] has focused on the shape of both the antenna and the forewing, only the forewings are used here, due to high variation in the antennae and the difficulty of preparing slide-mounted antennae. In total, 59 specimens were prepared for geometric morphometric analysis, using individuals randomly selected from those used for DNA extraction, and covering all populations. The specimens were dissected and examined using a Leica MZ12.5 stereoscope. The microphotographs were taken from slide mounted specimens using an EVOS f1 inverted microscope. Seven landmarks were selected to describe variation in wing morphology (Figure 2). The landmarks were as follows: 1, the beginning of submarginal vein; 2, the end of submarginal vein/beginning of marginal vein; 3, the end of marginal vein/beginning of postmarginal vein/beginning of stigmal vein; 4, the end of postmarginal vein; 5, the end of stigmal vein; 6, the tip of forewing; 7, the tip of posterior margin of forewing. Cartesian coordinates of the landmarks were digitized with tps-DIG 2.05 [60]. In order to reduce the measurement error all specimens were digitized twice. The coordinates were analyzed using tps-RELW 1.44 [61] to calculate eigenvalues for each principal warp. Statistical analyses were performed using SPSS version 16.0 for Windows [62].
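As a rough illustration of how landmark coordinates are typically prepared before shape analysis, the following Python sketch removes position and size from a single seven-landmark configuration by centering it and scaling to unit centroid size. This is only the first step of a Procrustes-style superimposition and is not the tps-RELW procedure itself; the function name and the coordinates are hypothetical.

```python
import math


def center_and_scale(landmarks):
    """Center a 2D landmark configuration and scale it to unit centroid size.

    Returns the transformed landmarks and the original centroid size.
    """
    n = len(landmarks)
    cx = sum(x for x, _ in landmarks) / n
    cy = sum(y for _, y in landmarks) / n
    centered = [(x - cx, y - cy) for x, y in landmarks]
    # Centroid size: square root of summed squared distances to the centroid.
    centroid_size = math.sqrt(sum(x * x + y * y for x, y in centered))
    scaled = [(x / centroid_size, y / centroid_size) for x, y in centered]
    return scaled, centroid_size


# Hypothetical digitized coordinates for the seven forewing landmarks.
wing = [(10.0, 52.0), (48.0, 60.0), (61.0, 63.0), (70.0, 66.0),
        (66.0, 55.0), (118.0, 40.0), (20.0, 30.0)]
shape, size = center_and_scale(wing)
print(round(size, 3))
```

After this step, configurations differ only in rotation and shape, so a subsequent rotation step and relative warps analysis can compare shape alone.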

Mating Tests

The courtship and mating behaviors of E. sasakii intrapopulation and interpopulation pairs were observed through reciprocal crosses. Crosses were performed during the period of host emergence overlap (May). Virgin individuals were paired in vials (one male and one female per vial) and observed for 7 days, with 10 replicates performed for each of the nine possible reciprocal population combinations. A solution of bee honey (50%) was provided as food supply during the mating tests.
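The crossing design described above can be enumerated with a short Python sketch. Treating the nine combinations as ordered (female source, male source) pairs of three host-associated populations is an assumption about how the design was counted, and the population labels are shorthand for illustration.

```python
from itertools import product

# Populations labelled by host; labels are illustrative shorthand.
populations = ["R. sariuoni", "T. japonica", "E. kuwanai"]
REPLICATES = 10  # replicates per combination, as stated in the text

# Ordered (female source, male source) pairs: 3 x 3 = 9 combinations.
crosses = list(product(populations, repeat=2))
intra = [c for c in crosses if c[0] == c[1]]
inter = [c for c in crosses if c[0] != c[1]]
total_vials = len(crosses) * REPLICATES

print(len(intra), len(inter), total_vials)
```

Under this reading, three of the nine combinations are intrapopulation controls and six are interpopulation crosses, each direction being tested separately.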


Analysis of Molecular Data

Fragments for COI and 28S were successfully sequenced for 83 E. sasakii specimens from 18 populations, plus the outgroup Encyrtus aurantii, shown as 0704 in Figures 3 and 4 (for details see Table 1). After edge trimming, the data matrix consisted of 631 base pairs for COI and 511 bases for 28S. The 28S gene was virtually invariant for the sequenced specimens; however, it contained a single base substitution (at site 205), with the cytosine character unique to samples obtained from the host R. sariuoni, and thymine for samples obtained from hosts T. japonica and E. kuwanai. Typically for insect mitochondrial genes, the AT content was high (68.8%), although at the lower end of the range compared to other parasitic wasps, e.g. 74.85% in Cynipidae [63], 74.0% in Apocrita [64], 72% in Eulophidae [65], 68% in Braconidae [66].

The degree of genetic divergence in COI was found to be particularly high between the three populations. The mean K2P distance between pairs belonging to different E. sasakii populations was 11.24%, with 1.5% divergence within populations. In order to determine if this was significantly high compared with species in the superfamily as a whole, 2393 Chalcidoidea barcode sequences (225 fully identified species and 77 genera) were downloaded from Genbank and aligned, then K2P distances for two classes (intraspecific and congeneric) were calculated. Figure 5 plots K2P values for the Chalcidoidea, along with the divergences between the three E. sasakii populations. While the E. sasakii molecular divergences do not belong to either the intraspecific or congeneric Chalcidoidea distributions (p<0.001 in both cases, unpaired Wilcoxon signed rank test), the median E. sasakii divergence (0.113) is over an order of magnitude higher than the median Chalcidoidea intraspecific divergence (0.005), and well within the same order of magnitude as the median Chalcidoidea congeneric divergence (0.155), indicating the E. sasakii populations show molecular variation more representative of congeners.
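The division of pairwise distances into intraspecific and congeneric classes depends on the species names attached to each sequence. The Python sketch below shows one way such a classification could work, assuming names have already been extracted from the sequence description lines; the regular expression and function name are illustrative and are not taken from the Perl scripts used in the study.

```python
import re

# Assumption: a "fully identified" name is a plain two-word Latin binomial.
BINOMIAL = re.compile(r"^[A-Z][a-z]+ [a-z]+$")


def classify_pair(name_a, name_b):
    """Return 'intraspecific', 'congeneric', or None for a pair of names."""
    if not (BINOMIAL.match(name_a) and BINOMIAL.match(name_b)):
        return None  # e.g. 'Genus sp.' or voucher-style labels
    genus_a, species_a = name_a.split()
    genus_b, species_b = name_b.split()
    if genus_a != genus_b:
        return None  # different genera are not compared
    return "intraspecific" if species_a == species_b else "congeneric"


# Placeholder binomials, not real records.
print(classify_pair("Aus bus", "Aus bus"))
print(classify_pair("Aus bus", "Aus cus"))
print(classify_pair("Aus sp.", "Aus bus"))
```

Pairs returning None would simply be left out of both distributions.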

Characters diagnostic of the three main populations were identified using Caos. In total, 122 sites (19.4% of the COI positions) were found to diagnose one or more of the populations, where all the characters were classed as simple (non-compound). These 122 sites were subdivided into 73 pure (unique to all members of the clade) and 49 private (present in some clade members but absent from the other clades) positions. Figure 6 gives a graphic illustration of the 73 pure diagnostic characters when isolated from the dataset, and a table giving the total 298 characters (with population identity, diagnostic character state, position and confidence value) is provided in the supplementary file (File S1).
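To make the notion of a pure diagnostic character concrete, the following Python sketch scans an alignment for sites where every member of a group shares one state that occurs in no other group. It is a simplified illustration of the definition given above, not the Caos algorithm, and the toy alignment and group labels are hypothetical.

```python
def pure_diagnostic_sites(alignment, groups):
    """Find pure diagnostic sites per group.

    alignment: dict mapping sequence name -> aligned sequence (equal lengths)
    groups:    dict mapping sequence name -> group label
    Returns {group: [(1-based position, state), ...]}.
    """
    length = len(next(iter(alignment.values())))
    labels = sorted(set(groups.values()))
    result = {g: [] for g in labels}
    for pos in range(length):
        # Collect the set of states seen in each group at this site.
        states = {g: set() for g in labels}
        for name, seq in alignment.items():
            states[groups[name]].add(seq[pos])
        for g in labels:
            others = set().union(*(states[h] for h in labels if h != g))
            # Pure: fixed within the group and absent everywhere else.
            if len(states[g]) == 1 and not (states[g] & others):
                result[g].append((pos + 1, next(iter(states[g]))))
    return result


toy_alignment = {"a1": "ACGT", "a2": "ACGA", "b1": "GCGT",
                 "b2": "GCGT", "c1": "TCTT", "c2": "TCTT"}
toy_groups = {"a1": "A", "a2": "A", "b1": "B",
              "b2": "B", "c1": "C", "c2": "C"}
print(pure_diagnostic_sites(toy_alignment, toy_groups))
```

In the toy data, site 1 is diagnostic for all three groups, site 3 only for group C, and site 4 for none because group A is polymorphic there.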

Figure 6. The 73 pure diagnostic characters isolated from the COI alignment.

Figure 7. Three-D scatter plots constructed from principal component analyses of the landmark data set.

In the scatter plots the first, second and third principal components were plotted on the x (RW1), y (RW2) and z (RW3) axis respectively.

The molecular data were subject to evolutionary analyses using NJ and Bayesian approaches. Due to the low number of parameters and low variation in some partitions (28S in particular), we used the AICc to determine the best fit model for the un-partitioned dataset, which was found to be the general time reversible with gamma distributed rates (d.f. 174, lnL -3156, AICc 6723, wAICc 0.71). Two independent MrBayes runs successfully converged (the standard deviation of split frequencies <0.01) after 12,950,000 generations. The parameters were checked in Tracer, where the effective sample sizes were >200 in virtually all cases. The tree was summarized after discarding the burn-in phase (25%), and is shown in Figure 3. Three monophyletic clades were recovered corresponding to the three host populations, each with high posterior probabilities, and long subtending branch lengths. The host specificity was found to be complete in that all specimens within a clade were reared from the same host, without exception.
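The reported AICc can be checked with the standard small-sample correction, AICc = 2k - 2 lnL + 2k(k + 1)/(n - k - 1). The Python sketch below assumes that the reported d.f. is the number of free parameters k and that the sample size n is the concatenated alignment length (631 + 511 sites); both are assumptions, although together they reproduce the published value.

```python
def aicc(log_likelihood, k, n):
    """Corrected Akaike information criterion.

    log_likelihood: maximized log-likelihood (lnL)
    k: number of free parameters
    n: sample size (here assumed to be the number of alignment sites)
    """
    aic = 2 * k - 2 * log_likelihood
    return aic + (2 * k * (k + 1)) / (n - k - 1)


# Values from the text; n = 631 (COI) + 511 (28S) is an assumption.
value = aicc(log_likelihood=-3156, k=174, n=631 + 511)
print(round(value))
```

The correction term contributes about 63 units here, which is why AICc rather than AIC matters when the parameter count is large relative to the alignment length.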

We next determined whether the pattern of branch lengths in the trees was characteristic of both within and between species branching events. For the Bayesian tree, we found no significant shift between Yule and Coalescent branch waiting times (lnL of GMYC model  = 393.5, lnL of null model  = 392.8, likelihood ratio  = 1.44, p = 0.70). We also performed this analysis on a NJ tree; unique haplotypes were isolated from the dataset, and a NJ tree generated under the GTR gamma model. As shown in Figure 4, the three significant GMYC clusters corresponded to the host associated groups apart from one sequence (BJ0893A) excluded from the E. kuwanai associated cluster (the lower blue colored clade in Figure 4). The GMYC model was a significant improvement in fit, over the null model of a single coalescent cluster (null lnL = 104, GMYC lnL = 110, likelihood ratio = 12, p = 0.007), indicating the shift to longer branches separating the E. sasakii populations is characteristic of a change to interspecies branch waiting times.
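The likelihood ratio tests reported above can be reproduced approximately with a short Python sketch. The statistic is twice the difference in log-likelihood, compared against a chi-square distribution. The use of 3 degrees of freedom is an assumption: it is not stated in the text, but it matches both reported p-values.

```python
import math


def likelihood_ratio(lnl_null, lnl_alt):
    """Likelihood ratio statistic: 2 * (lnL_alt - lnL_null)."""
    return 2.0 * (lnl_alt - lnl_null)


def chi2_sf_3df(x):
    """Upper-tail probability of a chi-square variable with 3 d.f.

    Closed form: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2).
    """
    if x <= 0:
        return 1.0
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 * x / math.pi) * math.exp(-x / 2))


# Bayesian tree: reported likelihood ratio of 1.44.
p_bayes = chi2_sf_3df(1.44)
# NJ tree: null lnL = 104, GMYC lnL = 110.
p_nj = chi2_sf_3df(likelihood_ratio(104, 110))
print(round(p_bayes, 2), round(p_nj, 3))
```

The reported log-likelihoods are rounded, so the statistic recomputed from 393.5 and 392.8 (1.4) differs slightly from the reported 1.44.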

Morphometric Analysis

The relative warps analysis and cluster analysis of forewing shape revealed a trend dividing the populations into three host associated groups (Figure 7). The contribution of the 1st, 2nd, 3rd and 4th canonical variates to the total variance was 26.8, 20.57, 17.4 and 13.02 percent, respectively. To ensure reliability of the results, the first ten canonical variates were used for cluster analysis in SPSS 16.0. Analysis of Variance (ANOVA) tests were performed to determine population differences in forewing shapes. The three host clusters were significantly distinct in the first (p<0.01; F = 43.117; d.f. = 2), second (P<0.05; F = 3.527; d.f. = 2), and third variates (p<0.01; F = 13.56, d.f. = 2).

Mating Test

Courtship and mating behavior were recorded as they occurred, in reciprocal crosses for all combinations of the three E. sasakii populations. Typical receptive behavior consisted of antennal contact followed by copulation [67], and repellence fighting occurred when the female was unreceptive. Courtship and mating behavior were observed in intra-population crosses only, never in inter-population crosses (Table 2), indicating pre-copulatory barriers to gene flow between host-specific populations.


Barcode Divergence, Molecular Delineation and Identification

The likely case of cryptic speciation in E. sasakii was initially made apparent during routine DNA barcode sequencing, and the molecular evidence supporting the promotion of the host-specific populations to species level remains particularly striking. The degree of molecular divergence at the COI barcode locus fell well outside the expected distribution for individuals of the same species (Figure 5). The inter-population divergence (11.24%) was found to be an order of magnitude higher than within-population divergence (1.5%), consistent with the barcode species criterion given by Hebert et al. [13]. But the major advantage of quantifying absolute level of divergence for COI in particular is the comprehensive benchmarks available in the literature. Hebert et al. [11] reported K2P divergence for Lepidoptera families as 0.17–0.33% for within species and 5.8–9.1% within genera. Ball et al. [68] gave 1.1% for within species and 18.1% for congeners in mayflies. Molbo et al. [69] discovered cryptic species where molecular divergence was 4.2–6.6% (amongst other lines of evidence). In a comprehensive analysis of barcode divergence using a number of mined insect datasets, Meier et al. [70] reported mean intraspecific/interspecific divergences as 2/11.2 for Coleoptera, 1.3/10.1 in the Diptera, 1.8/9.3 for Hymenoptera, and 0.7/6.2 for Lepidoptera, amongst others. While the intraspecific and congeneric divergences appear somewhat limited in their ability to vary across taxonomic groups, we thought it prudent to calculate specific values for the inclusive clade in which sufficient data were available. Figure 5 shows that the divergences between populations of E. sasakii are more representative of congeneric Chalcidoidea than intraspecific.

Given the molecular divergences, and the other advantages of barcode identification (e.g. ease of sequencing and non-requirement of taxon specific expertise), we suggest it warrants the adoption of molecular identification in this species complex. It has been demonstrated here that the properties of COI make it amenable to a number of proposed barcoding methods. The structuring of genetic variation makes the COI barcode an ideal marker for identification in this species complex, both due to the amount of divergence (Figure 5), and the robust reciprocal monophyly of the populations (Figure 3). The diagnostic characters given in the File S1 provide the rules for assignment of future query sequences to the newly proposed species. When used with algorithms such as Caos [53], such identification can be rapid and automatable.

Further analysis of the combined molecular data revealed that the three populations were recovered as robust monophyletic groups. Reciprocal monophyly requires fixation of divergent characters, these being typical of the later stages of lineage evolution [29]. However, further analysis of the shape of branching patterns was less clear-cut. The GMYC model tests for the presence of a shift from Yule (between species) to coalescent (within species) branch-lengths in an ultrametric tree, but was found significant for the NJ tree only. However, the choice of tree building method is likely a confounding factor for this test. Monaghan et al. [71] have previously noted the circularity of testing for a shift in branching pattern on a tree that has been inferred under one of the very models being tested for. The imposition of root to tip branch length pattern during a tree search is very apparent using, for example, the Beast software [48], where the default setting for branch-length model is coalescent, with additional options of Yule and birth-death. Preliminary analyses (not shown) were performed using this software, but these models have a clear bias on the resulting tree-shapes. A preferable approach would be tree inference independent of such models. In the current paper we applied the GMYC to a tree inferred under a Bayesian model in which branch lengths were unconstrained (non-clock), which precludes the imposition of root to tip branching model (which in the MrBayes clock trees include uniform, birth-death and coalescent), although the branch-lengths are sampled from a specified distribution (uniform or exponential). In an attempt to avoid all possible imposition of branch length bias we repeated the GMYC using a simple NJ tree, which was found to give significant GMYC groups. The analysis highlighted that where the aim is to analyze shift in these different types of branching it may be advisable to consult simpler tree building approaches, which may avoid some confounding effects.

Integrative Taxonomy

Where the sample is limited (for example considering key species complexes), variance in the pattern of intra/inter species divergence appears greater [72], meaning the host correlated divergence observed in E. sasakii may simply represent a local increase in intraspecific variation. Confirmation that high divergence is the result of independent evolution should be obtained by reference to other character types. If other characters do not covary with molecular divergence, then it is not necessarily the case that multiple species are present [73]. The integrative approach to taxonomy overcomes biases associated with individual lines of evidence and increases the information on which taxonomic hypotheses are tested [74]. Where corroborative evidence has been found from independent sources that support an alternative hypothesis, ‘breaking out’ of the current taxonomy is deemed reasonable [75]. Independent evidence may come from a number of sources, including various forms of molecular data, morphology, ecology, behavior, geography, and reproductive capacity [69], [74], [75], [76], [77]. Here, in addition to the molecular evidence, we show that i) the molecular clusters correspond to three clusters formed from certain morphometric characteristics, ii) these three putative taxonomic units inhabit differing niches (hosts), and iii) individuals from different hosts, when paired, show no mating capacity.

The hypothesis of cryptic species was further tested using morphometrics of the forewing. Forewing shape has been proposed as a morphometric-based population/species diagnostic character in the Hymenoptera, due to ease of slide preparation and high discriminatory power [78], [79], [80], [81], [82], [83]. In the current study we find the phenetic clusters based on the forewing shape are generally consistent with the phylogenetic classification, with both methods indicating differentiation according to host species. The populations isolated from the three hosts showed partially overlapping variation in wing pattern, reflecting the difficulties commonly encountered when analyzing morphological characters in sibling species groups [8]. However, the molecular divergence (Figure 5) and lack of courting or mating behavior between the R. sariuoni and T. japonica populations (Table 2) indicate these entities would be regarded as different species, according to many definitions of the concept. The unified species concept requires any method of delineation conforming to a single species concept in order to infer a species boundary, but where a delineation is congruent under multiple concepts (here for example, certainly the phylogenetic species concept and the biological species concept apply), the hypothesis can only be considered more robust [27].

Cryptic Species and Host Specificity

There is an increasing number of cases where the initial analysis of molecular data has led to the discovery of previously unknown divergent features, but where parasitic taxa are under study, divergent populations usually correspond to host specific races [84]. In E. sasakii, the three divergent genetic clusters (Figure 3) correspond to scale insect hosts, with geographic separation unlikely to have a substantial contribution to the molecular differentiation, since within clades, geographic sampling is widely ranged. For example the basal R. sariuoni associated clade (upper, green clade in Figure 3) contains samples obtained from regions ranging from central to far eastern China, covering areas sympatric with that of E. kuwanai associated parasitoids. This indicates the recent distribution of E. sasakii across much of the sampled range, whereas gene flow is prevented across different host groups, with all members of the basal clade isolated from a single scale insect species. While general conclusions cannot be drawn based on this single species complex, there is a growing body of research indicating such host specificity is much more prevalent than previous diversity estimates suggest [10], [15], [18], [85], [86], [87], [88], [89], [90]. However, the route towards accurate estimates of diversity will be hindered by naive application of molecular sampling. As observed in E. sasakii, the presence of sympatric host races means informed approaches (particularly, using host identities) to the barcode sampling strategy are required to capture the diversity.

Supporting Information

File S1.

COI diagnostic character states for E. sasakii populations. Column ‘group’ gives the population, ‘pos’ is the COI site position, ‘state’ is the diagnostic character state, and ‘conf’ is the confidence value.
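
As a rough illustration of how a table with these four columns might be used, the Python sketch below scores a query COI sequence against each group's diagnostic sites. It rests on several assumptions that are not stated in the source: that the file is comma-delimited with a header row, that ‘pos’ is 1-based within the aligned fragment, and that ‘conf’ can serve as a weight. The function names, group labels and rows are hypothetical placeholders, not values from File S1.

```python
import csv
import io
from collections import defaultdict

# Hypothetical rows in the (group, pos, state, conf) layout.
# These are placeholders for illustration, not data from File S1.
EXAMPLE_TABLE = """group,pos,state,conf
hostA,3,A,1.0
hostA,7,G,0.9
hostB,3,T,1.0
hostB,7,C,1.0
"""


def load_diagnostics(handle):
    """Read the table into {group: [(pos, state, conf), ...]}."""
    table = defaultdict(list)
    for row in csv.DictReader(handle):
        table[row["group"]].append(
            (int(row["pos"]), row["state"].upper(), float(row["conf"]))
        )
    return dict(table)


def score_sequence(seq, diagnostics):
    """Confidence-weighted fraction of each group's diagnostic sites matched.

    Positions are assumed to be 1-based within the aligned COI fragment.
    """
    seq = seq.upper()
    scores = {}
    for group, sites in diagnostics.items():
        total = sum(conf for _, _, conf in sites)
        matched = sum(
            conf
            for pos, state, conf in sites
            if pos <= len(seq) and seq[pos - 1] == state
        )
        scores[group] = matched / total if total else 0.0
    return scores


def best_group(seq, diagnostics):
    """Return the group whose diagnostic states best match the query."""
    scores = score_sequence(seq, diagnostics)
    return max(scores, key=scores.get)


diagnostics = load_diagnostics(io.StringIO(EXAMPLE_TABLE))
query = "CCACTTGAAA"  # made-up aligned fragment
print(score_sequence(query, diagnostics))
print(best_group(query, diagnostics))
```

Weighting by ‘conf’ is one possible design choice; an unweighted count of matching sites would be a simpler alternative.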



Acknowledgments

The authors would like to thank Prof. San-An Wu, Beijing Forestry University, for identifying scale insects, and Dr. De-Yan Ge, for providing useful advice on morphometric analysis. Two anonymous reviewers gave us valuable comments on the first version of our manuscript.

Author Contributions

Conceived and designed the experiments: YZZ DC YW. Performed the experiments: DC YW FY YZZ MB TXZ HYH CDZ CDL. Analyzed the data: DC YW YZZ FY MB. Contributed reagents/materials/analysis tools: YZZ DC YW FY MB TXZ HYH. Wrote the paper: DC YW YZZ FY MB. Designed the software used in analysis: YZZ DC YW.


References

1. Sequeira R, Mackauer M (1992) Nutritional ecology of an insect host parasitoid association – the pea aphid Aphidius ervi system. Ecology 73: 183–189.
2. Godfray HCJ (1994) Parasitoids: Behavioral and evolutionary ecology. Princeton University Press. 520 p.
3. Quicke DLJ (1997) Parasitic wasps. London: Chapman and Hall. 470 p.
4. Hassell MP (2000) The spatial and temporal dynamics of host–parasitoid interactions. Oxford: Oxford University Press. 208 p.
5. Jones OR, Purvis A, Baumgart E, Quicke DLJ (2009) Using taxonomic revision data to estimate the geographic and taxonomic distribution of undescribed species richness in the Braconidae (Hymenoptera: Ichneumonoidea). Insect Conservation and Diversity 2: 204–212.
6. Santos AMC, Jones OR, Quicke DLJ, Hortal J (2010) Assessing the reliability of biodiversity databases: identifying evenly inventoried island parasitoid faunas (Hymenoptera: Ichneumonoidea) worldwide. Insect Conservation and Diversity 3: 72–82.
7. Noyes JS (1982) Collecting and preserving chalcid wasps (Hymenoptera: Chalcidoidea). Journal of Natural History 16: 315–334.
8. Bensch S, Perez-Tris J, Waldenstrom J, Hellgren O (2004) Linkage between nuclear and mitochondrial DNA sequences in avian malaria parasites: Multiple cases of cryptic speciation? Evolution 58: 1617–1621.
9. Westenberger SJ, Sturm NR, Yanega D, Podlipaev SA, Zeledon NR, et al. (2004) Trypanosomatid biodiversity in Costa Rica: genotyping of parasites from Heteroptera using the spliced leader RNA gene. Parasitology 129: 537–547.
10. Loxdale HD, Lushai G, Harvey JA (2011) The evolutionary improbability of ‘generalism’ in nature, with special reference to insects. Biological Journal of the Linnean Society 103: 1–18.
11. Hebert PDN, Cywinska A, Ball SL, deWaard JR (2003) Biological identifications through DNA barcodes. Proceedings of the Royal Society of London, Biological Sciences Series B 270: 313–321.
12. Ratnasingham S, Hebert PDN (2007) BOLD: The barcode of life data system. Molecular Ecology Notes 7: 355–364.
13. Hebert PDN, Penton EH, Burns JM, Janzen DH, Hallwachs W (2004) Ten species in one: DNA barcoding reveals cryptic species in the neotropical skipper butterfly Astraptes fulgerator. Proceedings of the National Academy of Sciences of the United States of America 101: 14812–14817.
14. Hajibabaei M, Janzen DH, Burns JM, Hallwachs W, Hebert PDN (2006) DNA barcodes distinguish species of tropical Lepidoptera. Proceedings of the National Academy of Sciences of the United States of America 103: 968–971.
15. Smith MA, Woodley NE, Janzen DH, Hallwachs W, Hebert PDN (2006) DNA barcodes reveal cryptic host-specificity within the presumed polyphagous members of a genus of parasitoid flies (Diptera: Tachinidae). Proceedings of the National Academy of Sciences of the United States of America 103: 3657–3662.
16. Burns JM, Janzen DH, Hajibabaei M, Hallwachs W, Hebert PDN (2008) DNA barcodes and cryptic species of skipper butterflies in the genus Perichares in Area de Conservación Guanacaste, Costa Rica. Proceedings of the National Academy of Sciences of the United States of America 105: 6350–6355.
17. Yassin A, Capy P, Madi-Ravazzi L, Ogereau D, David JR (2008) DNA barcode discovers two cryptic species and two geographical radiations in the invasive drosophilid Zaprionus indianus. Molecular Ecology Resources 8: 491–501.
18. Smith MA, Rodriguez JJ, Whitfield JB, Deans AR, Janzen DH, et al. (2008) Extreme diversity of tropical parasitoid wasps exposed by iterative integration of natural history, DNA barcoding, morphology, and collections. Proceedings of the National Academy of Sciences of the United States of America 105: 12359–12364.
19. Pfenninger M, Nowak C, Kley C, Steinke D, Streit B (2007) Utility of DNA taxonomy and barcoding for the inference of larval community structure in morphologically cryptic Chironomus (Diptera) species. Molecular Ecology 16: 1957–1968.
20. deWaard JR, Hebert PDN, Humble LM (2011) A comprehensive DNA barcode library for the looper moths (Lepidoptera: Geometridae) of British Columbia, Canada. PLoS ONE 6: e18290.
21. Herre EA (2006) Barcoding helps biodiversity fly. Proceedings of the National Academy of Sciences of the United States of America 103: 3949–3950.
22. Smith MA, Wood DM, Janzen DH, Hallwachs W, Hebert PDN (2007) DNA barcodes affirm that 16 species of apparently generalist tropical parasitoid flies (Diptera, Tachinidae) are not all generalists. Proceedings of the National Academy of Sciences of the United States of America 104: 4967–4972.
23. Zhang YZ, Si SL, Zheng JT, Li HL, Yu F, et al. (2011) DNA barcoding of endoparasitoid wasps in the genus Anicetus reveals high levels of host specificity (Hymenoptera: Encyrtidae). Biological Control 58: 182–191.
24. Bickford D, Lohman DJ, Sodhi NS, Ng PK, Meier R, et al. (2007) Cryptic species as a window on diversity and conservation. Trends in Ecology & Evolution 22: 148–155.
25. Ballard JWO, Whitlock MC (2004) The incomplete natural history of mitochondria. Molecular Ecology 13: 729–744.
26. Padial JM, Miralles A, De la Riva I, Vences M (2010) The integrative future of taxonomy. Frontiers in Zoology 25: 7–16.
27. De Queiroz K (2007) Species concepts and species delimitation. Systematic Biology 56: 879–886.
28. Hey J (2006) On the failure of modern species concepts. Trends in Ecology & Evolution 21: 447–450.
29. Mallet J (2008) Hybridization, ecological races and the nature of species: empirical evidence for the ease of speciation. Philosophical Transactions of the Royal Society B 363: 2971–2986.
30. Wiley EO (1978) The evolutionary species concept reconsidered. Systematic Zoology 27: 17–26.
31. Van Valen L (1976) Energy and evolution. Evolutionary Theory 1: 179–229.
32. Donoghue MJ (1985) A critique of the biological species concept and recommendations for a phylogenetic alternative. The Bryologist 88: 172–181.
33. Mallet J (1995) A species definition for the modern synthesis. Trends in Ecology & Evolution 10: 294–299.
34. Lysyk TJ, Scoles GA (2008) Reproductive compatibility of prairie and montane populations of Dermacentor andersoni. Journal of Medical Entomology 45: 1064–1070.
35. Liao DX, Li XL, Pang XF, Chen TL (1987) Hymenoptera: Chalcidoidea (1). Economic Insect Fauna of China 34: 1–241.
36. Xie YP (1998) The scale insects of the forest and fruit trees in Shanxi of China. Beijing: China Forestry Publishing House. pp. 46–47.
37. Xu ZH, Huang J (2004) Chinese fauna of parasitic wasps on scale insects. Shanghai: Shanghai Scientific & Technical Publishers. 524 p.
38. Lu XP (2006) Research on occurring disciplinarian of scale insect parasitized by parasitical wasp in Sophora japonica. Forest Science and Technology 31(2): 34–36.
39. Xie YP, Fu XH, Xue JL, Zhang XM, Zhang YF (2007) Observation of the morphology during development of Encyrtus sasakii (Hymenoptera: Encyrtidae) in the body of its host scale insect Eulecanium kuwanai. Entomotaxonomia 29(2): 145–151.
40. Lou JX, Fang H, Ding XY (2011) Chalcidoidea and Chrysidoidea fauna in the northeast China. Beijing: Beijing Normal University Publishing Group. 388 p.
41. Campbell BC, Steffen-Campbell JD, Werren JH (1993) Phylogeny of the Nasonia species complex (Hymenoptera: Pteromalidae) inferred from an internal transcribed spacer (ITS2) and 28S rDNA sequences. Insect Molecular Biology 2: 225–237.
42. Campbell BC, Heraty JM, Rasplus JY, Chan K, Steffen-Campbell JD, et al. (2000) Molecular systematics of the Chalcidoidea using 28S-D2 rDNA. In: Austin AD, Dowton M, editors. Hymenoptera, evolution, biodiversity and biological control. CSIRO Publishing, Collingwood, Australia. pp. 59–73.
43. Folmer O, Black M, Hoeh W, Lutz R, Vrijenhoek R (1994) DNA primers for amplification of mitochondrial cytochrome C oxidase subunit I from diverse metazoan invertebrates. Molecular Marine Biology and Biotechnology 3: 294–299.
44. Hall TA (1999) BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symposium Series 41: 95–98.
45. Nylander J (2005) MrAIC 1.4. [Computer software and manual]. School of Computational Science (SCS). Tallahassee, Florida: Florida State University.
46. Guindon S, Gascuel O (2003) A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Systematic Biology 52: 696–704.
47. Ronquist F, Huelsenbeck JP (2003) MRBAYES 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 19: 1572–1574.
48. Drummond AJ, Rambaut A (2007) BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evolutionary Biology 7: 214.
49. Swofford DL (2003) PAUP*. Phylogenetic Analysis Using Parsimony (*and Other Methods). Version 4. Sinauer Associates, Sunderland, Massachusetts.
50. Sanderson MJ (1997) A nonparametric approach to estimating divergence times in the absence of rate constancy. Molecular Biology and Evolution 14: 1218–1231.
51. Sanderson MJ (2003) r8s: inferring absolute rates of molecular evolution and divergence times in the absence of a molecular clock. Bioinformatics 19: 301–302.
52. Pons J, Barraclough TG, Gomez-Zurita J, Cardoso A, Duran DP, et al. (2006) Sequence-based species delimitation for the DNA taxonomy of undescribed insects. Systematic Biology 55: 595–609.
53. Sarkar IN, Planet PJ, DeSalle R (2008) CAOS software for use in character-based DNA barcoding. Molecular Ecology Resources 8: 1256–1259.
54. Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, et al. (2009) BLAST+: architecture and applications. BMC Bioinformatics 10: 1–421.
55. Belshaw R, Katzourakis A (2005) BlastAlign: a program that uses blast to align problematic nucleotide sequences. Bioinformatics 21: 122–123.
56. R Development Core Team (2008) R: A language and environment for statistical computing [Computer software and manual]. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0. Accessed 2011 Jun.
57. Bai M, McCullough E, Song KQ, Liu WG, Yang XK (2011) Evolutionary Constraints in Hind Wing Shape in Chinese Dung Beetles (Coleoptera: Scarabaeinae). PLoS ONE 6: e21600.
58. Sugonjaev ES, Gordh G (1981) Taxonomy and trophic relations of parasitic wasps of the genus Encyrtus Latr. (Hymenoptera; Encyrtidae) of the Holarctic region. Entomological Review, Washington 60: 124–139.
59. Prinsloo GL (1991) Revision of the Afrotropical species of Encyrtus Latreille (Hymenoptera; Encyrtidae). Entomology Memoir of the Department of Agricultural Development of the Republic of South Africa 84: 1–30.
60. Rohlf FJ (2006) tps-DIG, Digitize Landmarks and Outlines, Version 2.05. [Software and Manual]. New York: Department of Ecology and Evolution. State University of New York at Stony Brook.
61. Rohlf FJ (2006) tps-RELW, Relative warps analysis, version 1.44. [Software and Manual]. New York: Department of Ecology and Evolution. State University of New York at Stony Brook.
62. SPSS (2007) SPSS version 16.0 for Windows.
63. Rokas A, Nylander JAA, Ronquist F, Stone GN (2002) A maximum-likelihood analysis of eight phylogenetic markers in gallwasps (Hymenoptera: Cynipidae): implications for insect phylogenetic studies. Molecular Phylogenetics and Evolution 22: 206–219.
64. Dowton M, Austin AD (1995) Increased genetic diversity in mitochondrial genes is correlated with the evolution of parasitism in the Hymenoptera. Journal of Molecular Evolution 41: 958–965.
65. Sha ZL, Zhu CD, Murphy RW, La Salle J, Huang DW (2006) Mitochondrial phylogeography of a leafminer parasitoid, Diglyphus isaea (Hymenoptera: Eulophidae) in China. Biological Control 38: 380–389.
66. Murphy N, Banks JC, Whitfield JB, Austin AD (2008) Phylogeny of the parasitic microgastroid subfamilies (Hymenoptera: Braconidae) based on sequence data from seven genes, with an improved time estimate of the origin of the lineage. Molecular Phylogenetics and Evolution 47: 378–395.
67. Guerrieri E, Pedata P, Romani R, Isidoro N, Bin F (2001) Functional anatomy of male antennal glands in three species of Encyrtidae (Hymenoptera: Chalcidoidea). Journal of Natural History 35: 41–54.
68. Ball SL, Hebert PDN, Burian SK, Webb JM (2005) Biological identification of mayflies (Ephemeroptera) using DNA barcodes. Journal of the North American Benthological Society 24: 508–524.
69. Molbo D, Machado CA, Sevenster JG, Keller L, Herre EA (2003) Cryptic species of fig-pollinating wasps: Implications for the evolution of the fig-wasp mutualism, sex allocation, and precision of adaptation. Proceedings of the National Academy of Sciences of the United States of America 100: 5867–5872.
70. Meier R, Zhang G, Ali F (2008) The use of mean instead of smallest interspecific distances exaggerates the size of the “Barcoding Gap” and leads to misidentification. Systematic Biology 57: 809–813.
71. Monaghan MT, Wild R, Elliot M, Fujisawa T, Balke M, et al. (2009) Accelerated species inventory on Madagascar using coalescent-based models of species delineation. Systematic Biology 58: 298–311.
72. Cognato AI (2006) Standard percent DNA sequence difference for insects does not predict species boundaries. Journal of Economic Entomology 99: 1037–1045.
73. Leo SST, Pybus MJ, Sperling FAH (2010) Deep mitochondrial DNA lineage divergences within Alberta populations of Dermacentor albipictus (Acari: Ixodidae) do not indicate distinct species. Journal of Medical Entomology 47: 565–574.
74. Schlick-Steiner BC, Steiner FM, Seifert B, Stauffer C, Christian E, et al. (2010) Integrative taxonomy: a multisource approach to exploring biodiversity. Annual Review of Entomology 55: 421–438.
75. DeSalle R, Egan MG, Siddall M (2005) The unholy trinity: taxonomy, species delimitation and DNA barcoding. Philosophical Transactions of the Royal Society: B 360: 1905–1916.
76. Roe AD, Sperling FAH (2007) Patterns of evolution of mitochondrial cytochrome c oxidase I and II DNA and implications for DNA barcoding. Molecular Phylogenetics and Evolution 44: 325–345.
77. Daane KM, Barzman MS, Caltagirone LE, Hagen KS (2000) Metaphycus anneckei and Metaphycus hageni: two discrete species parasitic on black scale, Saissetia oleae. BioControl 45(3): 269–284.
78. Mendes MFM, Francoy TM, Nunes-Silva P, Menezes C, Imperatriz-Fonseca VL (2007) Intra-populational variability of Nannotrigona testaceicornis Lepeletier 1836 (Hymenoptera, Meliponini) using relative warps analysis. Bioscience Journal 23: 147–152.
79. Villemant C, Simbolotti G, Kenis M (2007) Discrimination of Eubazus (Hymenoptera, Braconidae) sibling species using geometric morphometrics analysis of wing venation. Systematic Entomology 32: 625–634.
80. Billah MK, Kimani-Njogu SW, Wharton RA, Woolley JB, Masiga D (2008) Comparison of five allopatric fruit fly parasitoid populations (Psyttalia species) (Hymenoptera: Braconidae) from coffee fields using morphometric and molecular methods. Bulletin of Entomological Research 98: 63–75.
81. Francoy TM, Wittmann D, Drauschke M, Müller S, Steinhage V, et al. (2008) Identification of Africanized honey bees through wing morphometrics: two fast and efficient procedures. Apidologie 39: 488–494.
82. Kandemir I, Moradi MG, Özden B, Özkan A (2009) Wing geometry as a tool for studying the population structure of dwarf honey bees (Apis florea Fabricius 1876) in Iran. Journal of Apicultural Research 48: 238–246.
83. May-Itzá WJ, Quezada-Euán JJG, Medina LA, Enríquez E, De la Rúa P (2010) Morphometric and genetic differentiation in isolated populations of the endangered Mesoamerican stingless bee Melipona yucatanica (Hymenoptera: Apoidea) suggest the existence of a two species complex. Conservation Genetics 11: 2079–2084.
84. Heraty JM, Woolley JB, Hopper KR, Hawks DL, Kim JW, et al. (2007) Molecular phylogenetics and reproductive incompatibility in a complex of cryptic species of aphid parasitoids. Molecular Phylogenetics and Evolution 45: 480–493.
85. Locke SA, Daniel McLaughlin J, Marcogliese DJ (2010) DNA barcodes show cryptic diversity and a potential physiological basis for host specificity among Diplostomoidea (Platyhelminthes: Digenea) parasitizing freshwater fishes in the St. Lawrence River, Canada. Molecular Ecology 19: 2813–2827.
86. Whiteman NK, Sánchez P, Merkel J, Klompen H, Parker PG (2006) Cryptic host specificity of an avian skin mite (Epidermoptidae) vectored by louseflies (Hippoboscidae) associated with two endemic Galápagos bird species. Journal of Parasitology 92: 1218–1228.
87. Poulin R, Keeney DB (2008) Host specificity under molecular and experimental scrutiny. Trends in Parasitology 24: 24–28.
88. Emery VJ, Landry JF, Eckert CG (2009) Combining DNA barcoding and morphological analysis to identify specialist floral parasites (Lepidoptera: Coleophoridae: Momphinae: Mompha). Molecular Ecology Resources 9: 217–223.
89. McBride CS, Van Velzen R, Larsen TB (2009) Allopatric origin of cryptic butterfly species that were discovered feeding on distinct host plants in sympatry. Molecular Ecology 18: 3639–3651.
90. Phillips CB, Vink CJ, Blanchet A, Hoelmer KA (2008) Hosts are more important than destinations: What genetic variation in Microctonus aethiopoides (Hymenoptera: Braconidae) means for foreign exploration for natural enemies. Molecular Phylogenetics and Evolution 49: 467–476.