Mosquitoes are insects of the order Diptera, suborder Nematocera, and family Culicidae, some species of which are important disease vectors. Identifying mosquito species based on morphological characteristics is difficult, particularly for specimens collected in the field as part of disease surveillance programs. Because of this difficulty, we constructed DNA barcodes of the cytochrome c oxidase subunit 1 (COI) gene for the more common mosquito species in China, including the major disease vectors. A total of 404 mosquito specimens were collected and assigned to 15 genera and 122 species and subspecies on the basis of morphological characteristics. Individuals of the same species grouped closely together in a Neighbor-Joining tree based on COI sequence similarity, regardless of collection site. COI gene sequence divergence was approximately 30 times higher among species in the same genus than among members of the same species. Divergence in over 98% of congeneric species ranged from 2.3% to 21.8%, whereas divergence in conspecific individuals ranged from 0% to 1.67%. Cryptic species may be common and a few pseudogenes were detected.
Citation: Wang G, Li C, Guo X, Xing D, Dong Y, Wang Z, et al. (2012) Identifying the Main Mosquito Species in China Based on DNA Barcoding. PLoS ONE 7(10): e47051. doi:10.1371/journal.pone.0047051
Editor: João Pinto, Instituto de Higiene e Medicina Tropical, Portugal
Received: March 31, 2012; Accepted: September 10, 2012; Published: October 10, 2012
Copyright: © Wang et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: This work was supported by the Infective Diseases Prevention and Cure Project of China (NO: 2008ZX10402) and (NO: 2012ZX10004219).
Competing interests: The authors have declared that no competing interests exist.
Approximately 41 genera and 3500 species and subspecies of mosquito exist worldwide. Although mosquitoes have been studied more extensively than most other insect groups because of their role as vectors of disease, our taxonomic knowledge of these insects is far from complete. Numerous Chinese taxonomists have worked on mosquito classification since 1932, particularly since Edwards provided the modern mosquito classification system. Feng Lan-Zhou reported 100 Chinese mosquito species in 1938. This number has since increased to approximately 390 described species, and new species are still being identified, particularly within the genera Armigeres, Heizmannia, Topomyia and Uranotaenia.
Some species are vectors of the pathogens that cause medically important diseases such as malaria, dengue fever and Japanese B encephalitis. Species identification therefore constitutes the first step in the surveillance and control of mosquito-borne diseases. Mosquito species are mainly identified on the basis of morphological characteristics. This can be problematic because diagnostic morphological features are often damaged during collection or storage, or are not present in all developmental stages. Moreover, the morphological characteristics used to identify intact adult specimens often vary so little between species that usually only experienced mosquito taxonomists are able to distinguish mosquito species reliably.
DNA analysis provides a more accurate way of identifying species, and the use of molecular data in combination with morphological methods has resolved some long-standing taxonomic questions. The increase in the number of available molecular markers has facilitated the accurate identification of mosquito species, particularly within groups of sibling species. For instance, Anopheles anthropophagus and Anopheles sinensis can be identified more simply, rapidly, and accurately using the ITS2 sequence than on the basis of morphology.
After Tautz proposed using DNA sequences as the main basis of biological classification in 2002, Paul Hebert suggested that sequencing the COI gene could allow DNA barcoding that would facilitate such classification. Many studies have since demonstrated that the COI gene is a valid molecular tool for identifying mosquito species and revealing cryptic species.
Although several studies on the distribution of Chinese mosquito species have been conducted using classical morphology, identifying sibling and cryptic species remains problematic. Here we provide an updated classification of nearly one-third of China’s mosquito species based on a combination of molecular and morphological methods.
A total of 122 mosquito species belonging to 15 genera and three subfamilies were collected from sampling sites in eight Chinese provinces (Figure 1, Table 1). We identified mosquitoes on the basis of diagnostic morphological characteristics of their adult and larval stages and cercopoda, and by using molecular methods to distinguish sibling species.
Site 1: Manzhouli City, Neimeng Province; Site 2: Yili, Kazakh Autonomous Prefecture, Xinjiang Province; Site 3: Taiyuan City, Shanxi Province; Site 4: Golmud River, Qinghai Province; Site 5: Tianmu Mountain, Zhejiang Province; Site 6: Zhenxiong County, Yunnan Province; Site 7: Maolan Natural Reserve, Guizhou Province; Site 8: Ruili City, Yunnan Province; Site 9: Mengla County, Yunnan Province; Site 10: Changjiang County, Hainan Province; Site 11: Limushan Nature Reserve, Hainan Province; Site 12: Mangrove Nature Reserve, Hainan Province.
Each species was represented by one to eight individuals, giving a total of 404 COI sequences representing 122 species and subspecies. We identified and excluded three pseudogenes from further analyses by selecting only sequences without insertions, deletions or stop codons. COI sequences had a high A+T content (average of 69% across all codon positions), particularly at the third codon position (93.4%) (Table S1). There was, however, no G at the third codon position in Orthopodomyia anopheloides and Topomyia houghtoni. As in the case of Drosophila, this strong bias is apparently caused by the relative abundance of iso-accepting tRNA. All sequences contained less T at the first codon position than at the second, whereas the A content of the first codon position was higher than that of the second. The average R-value (transitions/transversions) was 0.7.
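The per-position composition reported above can be illustrated with a short Python sketch. It is not the procedure used in the study (composition was obtained with MEGA); the function name, the `frame` parameter and the toy sequence are assumptions introduced only for illustration.

```python
from collections import Counter


def at_content_by_codon_position(seq, frame=0):
    """Return the A+T percentage at codon positions 1, 2 and 3.

    `frame` (hypothetical parameter) is the 0-based offset of the first
    complete codon in `seq`; it has to match the alignment being analysed.
    """
    seq = seq.upper()
    result = {}
    for pos in range(3):
        # Every third base starting at this codon position; gaps and
        # ambiguous bases are ignored.
        bases = [b for b in seq[frame + pos::3] if b in "ACGT"]
        counts = Counter(bases)
        total = len(bases)
        at = counts["A"] + counts["T"]
        result[pos + 1] = 100.0 * at / total if total else float("nan")
    return result


# Toy sequence (not from the study): codons ATA, GCT, CGT.
composition = at_content_by_codon_position("ATAGCTCGT")
print(composition)
```

In the toy sequence the third codon position holds only A and T, which mirrors, in miniature, the third-position bias described in the text.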
Neighbor-Joining (NJ) Tree
The Neighbor-Joining (NJ) tree method is conceptually related to clustering, but without the assumption of clock-like behavior. COI gene fragments accurately revealed species boundaries and provided a clear phylogenetic signal (Figs. 2 and 3). Most of the major branches on the tree represent distinct taxonomic groups, including all genera and subgenera. Moreover, specimens of the same species always grouped closely together, regardless of collection site, and, except for some specimens from Hainan Island, no obvious geographic differences in sequences within the same species were found.
Sequence analysis was conducted using MEGA version 4.0 software with 1000 bootstrap replications. Most major branches on the tree represent recognized groups, including all genera and subgenera except Anopheles and Culex, which comprise separate subtrees and are shown in detail in Fig. 3.
Combining NJ tree and bootstrap analysis is the most appropriate method for evaluating phylogenetic trees using distance methods. Nodes linking sequences of individuals of the same species had a high bootstrap value (98%–99%), whereas some nodes linking sequences of geographically different individuals had low bootstrap values (6%–99%).
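For readers unfamiliar with the method, the joining step of NJ can be sketched in a few lines of Python. This is a minimal illustration of the standard algorithm of Saitou and Nei, not the MEGA implementation used here; the function name, the `nodeN` labels and the five-taxon toy matrix are assumptions, and no bootstrap resampling is included.

```python
def neighbor_joining(names, matrix):
    """Join taxa pairwise using the NJ criterion.

    Returns (joins, last_branch). Each entry of `joins` is
    (child_a, child_b, length_a, length_b, new_node); `last_branch` is
    (node_a, node_b, length) for the two nodes left at the end.
    """
    nodes = list(names)
    d = {}
    for i in range(len(nodes)):
        for j in range(i + 1, len(nodes)):
            d[frozenset((nodes[i], nodes[j]))] = float(matrix[i][j])

    def dist(a, b):
        return 0.0 if a == b else d[frozenset((a, b))]

    joins = []
    while len(nodes) > 2:
        n = len(nodes)
        totals = {a: sum(dist(a, b) for b in nodes) for a in nodes}
        # Choose the pair that minimises Q = (n - 2) d(a, b) - r(a) - r(b).
        best = None
        for i in range(n):
            for j in range(i + 1, n):
                a, b = nodes[i], nodes[j]
                q = (n - 2) * dist(a, b) - totals[a] - totals[b]
                if best is None or q < best[0]:
                    best = (q, a, b)
        _, a, b = best
        new = "node%d" % len(joins)  # hypothetical label for the new internal node
        # Branch lengths need not be equal: no molecular clock is assumed.
        length_a = 0.5 * dist(a, b) + (totals[a] - totals[b]) / (2.0 * (n - 2))
        length_b = dist(a, b) - length_a
        for c in nodes:
            if c not in (a, b):
                d[frozenset((new, c))] = 0.5 * (dist(a, c) + dist(b, c) - dist(a, b))
        nodes = [c for c in nodes if c not in (a, b)] + [new]
        joins.append((a, b, length_a, length_b, new))
    return joins, (nodes[0], nodes[1], dist(nodes[0], nodes[1]))


# Toy distance matrix for five hypothetical taxa (not data from this study).
toy_names = ["a", "b", "c", "d", "e"]
toy_matrix = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
joins, last_branch = neighbor_joining(toy_names, toy_matrix)
for step in joins:
    print(step)
print(last_branch)
```

The unequal branch lengths assigned to the two members of each joined pair are what distinguish NJ from clock-based clustering such as UPGMA.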
All species had a distinct set of COI sequences. Excluding the Culex mimeticus subgroup and the species listed in Table 2 (see Discussion section), most (98%) conspecific comparisons showed <2% K2P divergence (range = 0% to 1.67%), whereas >98% of interspecific comparisons showed >2% K2P divergence (range = 2.3% to 21.8%). Sequence divergence was even higher among species in different genera, ranging from 10.9% to 21.8% (Fig. 4).
All sequences were grouped with MEGA software; each group includes all species of a particular genus.
Transition and transversion distances varied consistently with sequence divergence (Fig. 5). Transition distance was significantly greater than transversion distance when sequence divergence was <2%. However, transversion distances increased slowly with sequence divergence to eventually exceed transition distances at K2P divergence of ≥6%. Both transition and transversion distances then decreased until K2P divergence reached about 15%. The relationships between transversion distance, sequence divergence, and morphological characteristics are shown in Tables 2 and 3.
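The comparison against the 2% threshold can be sketched as follows. The function name, the dictionary keys and the species labels `sp1` to `sp3` are hypothetical, and the divergence values are invented toy numbers rather than results from this study.

```python
def summarize_divergence(pairs, threshold=2.0):
    """Summarise pairwise divergences (in percent) against a threshold.

    `pairs` holds (species_a, species_b, divergence_percent) tuples; a pair
    counts as conspecific when both labels are identical.
    """
    pairs = list(pairs)
    within = [d for a, b, d in pairs if a == b]
    between = [d for a, b, d in pairs if a != b]

    def share(values, predicate):
        if not values:
            return float("nan")
        return 100.0 * sum(1 for v in values if predicate(v)) / len(values)

    return {
        "conspecific_below_pct": share(within, lambda d: d < threshold),
        "interspecific_above_pct": share(between, lambda d: d > threshold),
        "conspecific_range": (min(within), max(within)) if within else None,
        "interspecific_range": (min(between), max(between)) if between else None,
    }


# Invented toy comparisons, for illustration only.
toy_pairs = [
    ("sp1", "sp1", 0.0), ("sp1", "sp1", 0.4),
    ("sp2", "sp2", 1.6), ("sp3", "sp3", 2.6),
    ("sp1", "sp2", 1.8), ("sp1", "sp3", 6.5),
    ("sp2", "sp3", 11.0), ("sp1", "sp3", 14.2),
]
summary = summarize_divergence(toy_pairs)
print(summary)
```

Pairs that fall on the wrong side of the threshold, such as a conspecific pair above 2%, are the ones that would be flagged for closer inspection as possible cryptic species.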
Accuracy of COI
The primary function of DNA barcoding is accurate species identification. We found that COI sequence differences among congeneric mosquito species were approximately 30 times higher than the average differences within species. Moreover, more than 98% of COI fragments had clear interspecific boundaries, a result consistent with the results of other authors. The average conspecific K2P divergence in this study, 0.39%, is similar to values reported for fish species in Australia and slightly higher than those reported for North American birds (0.27%) and moths (0.25%). It is slightly less than the K2P divergence value reported for Canadian mosquitoes (0.55%).
Transversion Distance and Speciation
Mitochondrial DNA (mtDNA) functions as a molecular clock in that transversions accumulate in a linear fashion over time. Comparison of the molecular and morphological data indicates that the number of transversions may rise to a value of about 7 without apparent or detectable changes in morphology (Fig. 5). Transition distance was significantly greater than transversion distance when sequence divergence was below 2%, at which level there were almost no morphological differences between specimens. At higher levels of sequence divergence, transversion distances slowly increased, eventually exceeding transition distances when sequence divergence reached 6%. Morphological differences were undetectable when sequence divergence was about 2% but were distinct when it reached 6%. Transversion distances increased steadily at sequence divergence levels of 6% to 15%, at which level plesiomorphy also first became evident. Plesiomorphy stabilized at a sequence divergence of 15%. In addition, the vast majority of interspecific distances occurred between sequence divergence levels of 6% and 15%, whereas most intergeneric distances occurred from 15% to 20% (Fig. 4). Very few intraspecific, and no intergeneric, distances occurred between sequence divergence levels of 2% and 6%.
We found that transversion distances indicated a clear boundary between species. The transversion distance between most species was <1.1% at sequence divergence values of less than 2%. There were, however, some exceptions: although the transversion distance between two plesiomorphous species was usually <1.1% (Table 3), some species with anomalous intraspecific COI sequence divergences >2% (Table 2) had intraspecific transversion distances >1.1%. This suggests the presence of cryptic species, which, if confirmed, in turn suggests that transversion distances may be a useful supplement to barcoding information in species identification. Further research on the use of transversion as an additional index of taxonomic similarity is recommended.
Molecular Data Versus Morphology
Sequence divergence values of 14% to 16% were indicative of either interspecific or intergeneric differences. There are two possible reasons for this: temporary substitution saturation of the COI fragment and the limitations of morphological identification.
We found some cases of high intraspecific sequence divergence among Aedes dorsalis, Aedes vexans, Culex modestus, Tripteroides aranoides, and Toxorhynchites splendens (Table 2). Although the degree of niche separation within these species remains unclear, this result suggests the existence of cryptic species. We also detected intraspecific sequence divergence slightly greater than the 2% threshold within Coquillettidia crassipes and Anopheles sinensis (Table 2). Although no morphological differences within these species were observed, differences in feeding habits and habitat have been documented within Anopheles sinensis populations. This, together with the >2% level of COI sequence divergence, suggests the presence of cryptic species. Some cases of low interspecific sequence divergence were found among some pairs of species (Table 3), including Aedes craggi and Aedes annandalei, as well as Culex spiculosus and Culex minor. Although there is no evidence of niche separation between these species, slight morphological differences were observed. This suggests that the taxonomic status of these species should be re-confirmed. Although few doubt that mtDNA barcodes are a valuable molecular tool for matching unidentified specimens to described taxa, there has been relatively little use of barcodes to delimit species. More research on rDNA, morphology, biogeography and ethology is required to improve the applicability of barcoding to species-level taxonomy.
Culex neomimulus was previously classified as Culex mimulus in the Culex mimeticus group. Although our COI data support the previous view, we found that anomalous COI sequence divergence values were relatively common in the Culex mimeticus group, with some morphologically distinct specimens having similar barcodes. This could be due to infection with Wolbachia bacteria. Maternally inherited Wolbachia cause a loss of haplotype diversity in populations by inducing a selective sweep of the initially infected individual's haplotype through a population. We detected Wolbachia infection in Culex mimulus, so it is possible that this may also occur in this species. Although Smith et al. concluded that the presence of Wolbachia DNA in total genomic extracts is unlikely to compromise the accuracy of the DNA barcode library, this is a complex problem that requires further investigation.
The presence of pseudogenes can affect the accuracy of barcoding identification but, since their incidence was <1%, their influence on our data was presumably small. The distinctive characteristics of the COI gene (no insertions, deletions or stop codons) allowed pseudogenes to be easily identified and excluded from the sequences we obtained. Although the leakage of paternal mtDNA may influence the results of barcoding, this phenomenon is only occasionally (<0.004%) found in higher animals.
A total of three pseudogenes were detected. For instance, one of the samples of Aedes dissimilis collected from the same area exhibited high interspecific sequence divergence (3.74%) and transversion divergence (3.00%). A total of 12 different protein sequence sites were observed, which is very rare in the Culicidae. The substitution rate at codon positions 1, 2, and 3 was 1:2:2, very different from the average of 5:1:18. We also amplified the pseudogenes of Uranotaenia lutescens and Culex halifaxia, which have insertions and deletions, respectively. The sequence divergence between pseudogenes and COI fragments in Culex halifaxia was 10.93% and the substitution rate at codon positions 1, 2, and 3 was 5:4:11. The divergence time formula of mtDNA and pseudogenes suggests that the nuclear transfer event occurred 500 million years ago in Culex halifaxia and 170 million years ago in Aedes dissimilis. We found an insertion site at 54 bp in the sequence of Uranotaenia lutescens, with a substitution rate at codon positions 1, 2, and 3 of 7:1:18. Two different protein sequence sites were also observed. These abnormal phenomena disappeared when the inserted site was deleted manually. Therefore, these anomalous sequences were likely caused by frameshift mutations introduced during PCR.
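A screen of this kind can be sketched in Python as below. It is only an illustration of the stated criteria, not the authors' procedure. It assumes that each sequence has been aligned to a reference and trimmed so that any remaining `-` marks an insertion or deletion, that the reading frame is supplied by the user through the hypothetical `frame` parameter, and that stop codons follow the invertebrate mitochondrial genetic code (NCBI translation table 5), in which TAA and TAG terminate translation while TGA encodes tryptophan.

```python
# Stop codons of the invertebrate mitochondrial code (NCBI table 5).
STOP_CODONS = {"TAA", "TAG"}


def passes_pseudogene_screen(aligned_seq, frame=0):
    """Return True if the sequence has no gaps and no in-frame stop codon.

    `frame` (hypothetical parameter) is the 0-based offset of the first
    complete codon in the trimmed alignment.
    """
    seq = aligned_seq.upper()
    if "-" in seq:
        # A gap relative to the reference implies an insertion or deletion.
        return False
    coding = seq[frame:]
    for i in range(0, len(coding) - 2, 3):
        if coding[i:i + 3] in STOP_CODONS:
            return False
    return True


# Toy fragments (not sequences from this study).
print(passes_pseudogene_screen("ATGTGAGGA"))  # TGA is tryptophan in table 5
print(passes_pseudogene_screen("ATGTAAGGA"))  # contains an in-frame TAA
print(passes_pseudogene_screen("ATG-GAGGA"))  # contains a gap
```

A screen like this only removes candidates with obvious coding defects; a nuclear copy that is still in frame would pass and would need other evidence, such as unusual substitution ratios across codon positions.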
Overall, DNA-based species identification systems depend on the ability to distinguish intraspecific from interspecific variation. This analysis of 404 COI sequences from 15 mosquito genera and 122 species and subspecies indicates that >98% of specimens formed distinctive clusters and that barcode divergence was relatively large between these groupings. Although it has limitations, DNA barcode technology has several advantages over traditional taxonomic methods as a tool for species identification. For example, it is unaffected by morphological variation between different life cycle stages. Another benefit is that it allows the homogenization, or calibration, of the taxonomic units identified in different areas. DNA barcode technology generally produces accurate results thereby greatly reducing the need for experienced taxonomists.
In summary, this study provides the first COI barcodes for mosquitoes in China and provides further evidence of the effectiveness of DNA barcoding in identifying recognized species. An insufficient number of specimens prevented in-depth investigation of sibling species complexes but we plan to address this area in the future. Care must be taken to exclude pseudogenes from COI databases to ensure the accuracy of molecular identification. COI databases also need to include specimens of the same species collected from different geographical locations in order to determine the extent of intraspecific variation. A complete evaluation of the effectiveness of DNA barcoding for the Culicidae can be achieved through multinational research.
Materials and Methods
No specific permits were required for this study. All experiments were conducted within state-owned land in China. Therefore, the local ethics committee deemed that approval was unnecessary.
Mosquito specimens used for constructing DNA barcodes were collected from different Chinese provinces in 2009 and 2010. Details of the specimens collected are provided in Fig. 1 and Table 1. Larval and adult mosquitoes were collected in the field. Adults were sampled with CO2-baited miniature light traps. Larvae were reared individually and associated larval and pupal skins were mounted. All specimens were identified using standard taxonomic keys.
Target Gene Preparation
Total DNA (100 µL to 150 µL) was extracted from each specimen using the Universal Genomic DNA Extraction Kit (Invitrogen). PCR was performed to amplify the 5′ COI region of mtDNA using the following program: an initial denaturation of 1 min at 94°C, followed by five cycles of 94°C for 40 s (denaturation), 45°C for 40 s (annealing), and 72°C for 1 min (extension); 30 cycles of 94°C for 40 s (denaturation), 51°C for 40 s (annealing), and 72°C for 1 min (extension); and a final extension at 72°C for 5 min. Each 50 µL PCR reaction contained 0.3 µL of Taq DNA polymerase (5 U/µL), 5 µL of 10×PCR buffer, 5 µL of 2 mmol/L dNTP, 2 µL of 10 µmol/L each of the forward and reverse primers, 5 µL of template DNA and sufficient ddH2O to make up to 50 µL. The primer pair LCO1490 and HCO2198 was used to amplify a 650 bp fragment of COI. The amplified fragments were run on a 1% agarose gel to check their integrity, after which the PCR product was purified with a standard PCR purification kit (Tiangen). Products were sequenced in both directions, using the forward and reverse primers.
DNA sequences were aligned using Clustal X. Sequence analysis and Ts/Tv calculation were conducted using MEGA version 4.0 software. Sequence divergence and Ts and Tv distances among individuals were quantified using the Kimura two-parameter (K2P) distance model. An NJ tree of K2P distances was created to provide a graphic representation of the clustering pattern among different species.
Sequence divergence and nucleotide composition for the mosquito genera. The frequencies of nucleotides in each sequence are presented as the total average values for all codon positions and for each codon position separately, to an accuracy of tenths of a percent. (*) Figures in brackets are the number of mosquito species used to estimate sequence divergence for the genus.
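As a worked check of the reaction set-up, the water volume and the programmed hold time can be computed from the figures given above. The sketch assumes that "2 µL each" means 2 µL per primer, and the hold time ignores ramping between temperatures; the variable names are illustrative only.

```python
# Reagent volumes in microlitres, as listed in the text.
components = {
    "Taq DNA polymerase (5 U/uL)": 0.3,
    "10x PCR buffer": 5.0,
    "2 mmol/L dNTP": 5.0,
    "forward primer (10 umol/L)": 2.0,  # assumes 2 uL per primer
    "reverse primer (10 umol/L)": 2.0,
    "template DNA": 5.0,
}
FINAL_VOLUME_UL = 50.0
water_ul = round(FINAL_VOLUME_UL - sum(components.values()), 1)

# Thermal program as (temperature in C, hold time in s) steps.
initial_denaturation = [(94, 60)]
low_stringency_cycle = [(94, 40), (45, 40), (72, 60)]  # repeated 5 times
main_cycle = [(94, 40), (51, 40), (72, 60)]            # repeated 30 times
final_extension = [(72, 300)]


def seconds(steps):
    return sum(hold for _, hold in steps)


hold_seconds = (
    seconds(initial_denaturation)
    + 5 * seconds(low_stringency_cycle)
    + 30 * seconds(main_cycle)
    + seconds(final_extension)
)

print("ddH2O per reaction (uL):", water_ul)
print("programmed hold time (min):", round(hold_seconds / 60.0, 1))
```

Under these assumptions each reaction needs about 30.7 µL of ddH2O, and the program holds for roughly 88 minutes before ramp times are added.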
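The K2P distance has a closed form, d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of sites that differ by a transition and by a transversion, respectively. The sketch below computes it for one pair of aligned sequences. It is an illustration of the formula rather than the MEGA routine used in this study; the function name is hypothetical, sites with gaps or ambiguous bases are simply skipped, and P and Q are raw proportions, not the corrected Ts and Tv distances reported by MEGA.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Return (d, P, Q) for two aligned sequences of equal length."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1   # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError when the sequences are too divergent
    # (saturated) for the model to yield a distance.
    d = -0.5 * math.log((1.0 - 2.0 * p - q) * math.sqrt(1.0 - 2.0 * q))
    return d, p, q


# Toy sequences (not from this study): one transition in ten sites.
d, p, q = k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")
print(round(100 * d, 2), "% K2P divergence")
```

Multiplying d by 100 gives the percentage divergence values quoted throughout the text, and P/Q gives a raw transition to transversion ratio comparable in spirit to the R-value.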
Conceived and designed the experiments: GW TYZ. Performed the experiments: GW ZZ. Analyzed the data: GW. Contributed reagents/materials/analysis tools: GW YDD CXL XXG Z. Wang YMZ DX MDL HDZ XJZ Z. Wu. Wrote the paper: GW.
- 1. Edwards FW (1932) Genera insectorum: Diptera : Fam. Culicidae: Verteneuil & Desmet.
- 2. Feng LC (1938a) A critical review of literature regarding the records of mosquito in China. Part I. Subfamily Culicinae, tribe Anophelini; Part II. Subfamily Culicinae, tribe Megarhinni and Culicini. Pek nat Hist Bull 12: 169–181; 285–318.
- 3. Bortolus A (2008) Error cascades in the biological sciences: the unwanted consequences of using bad taxonomy in ecology. Ambio 37: 114–118.
- 4. Hanel R, Sturmbauer C (2000) Multiple recurrent evolution of trophic types in northeastern atlantic and mediterranean seabreams (Sparidae, Percoidei). J Mol Evol 50: 276–283.
- 5. Herran de la R, Rejon C, Rejon M, Garrido-Ramos M (2001) The molecular phylogeny of the Sparidae (Pisces, perciformes) based on two satellite DNA families. Heredity 87: 691–697.
- 6. Phuc H, Ball A, Son L, Hanh N, Tu N, et al. (2003) Multiplex PCR assay for malaria vector Anopheles minimus and four related species in the Myzomyia Series from Southeast Asia. MED VET ENTOMOL 17: 423–428.
- 7. Gao Q, Beebe N, Cooper R (2004) Molecular identification of the malaria vectors Anopheles anthropophagus and Anopheles sinensis (Diptera: Culicidae) in central China using polymerase chain reaction and appraisal of their position within the Hyrcanus Group. J Med Entomol 41: 5–11.
- 8. Tautz D, Arctander P (2003) A plea for DNA taxonomy. TRENDS ECOL EVOL 18: 70–74.
- 9. Tautz D, Arctander P (2002) DNA points the way ahead in taxonomy. Nature 418: 479.
- 10. Hebert P, Cywinska A, Ball S, deWaard J (2003a) Biological identifications through DNA barcodes. Proc R Soc Lond B 270: 313–321.
- 11. Hebert P, Ratnasingham S, deWaard J (2003b) Barcoding animal life: cytochrome c oxidase subunit 1 divergences among closely related species. Proc R Soc Lond B (Suppl) 270: S96–S99.
- 12. Remigio E, Hebert P (2003) Testing the utility of partial COI sequences for phylogenetic estimates of gastropod relationships. Mol Phylogenet 29: 641–647.
- 13. Cywinska A, Hunter FF, Hebert PDN (2006) Identifying Canadian mosquito species through DNA barcodes. MED VET ENTOMOL 20: 413–424.
- 14. Kumar NP, Rajavel AR, Natarajan R, Jambulingam P (2007) DNA barcodes can distinguish species of Indian mosquitoes (Diptera: Culicidae). J Med Entomol 44: 1–7.
- 15. Monaghan M, Balke M, Gregory T, Vogler A (2005) DNA-based species delineation in tropical beetles using mitochondrial and nuclear markers. Phil Trans R Soc B 360: 1925–1933.
- 16. Brown JS (2003) Miller (2003) Studies on new Guinea moths description of a new species of Xenothctis meyrick(Lepidoptera:Torthricidae:Archipini). Proc Entomol Soc Wash 105: 1043–1050.
- 17. Hebert P (2004b) Penton (2004b) Ten species in one: DNA barcoding reveals cryptic species in the neotropical skipper butterfly Astraptes fulgerator. Proc Natl Acad Sci 101: 14812–14817.
- 18. Burns JM, Janzen DH, Hajibabaei M, Hallwachs W, Hebert PDN (2007) DNA barcodes and cryptic species of skipper butterflies in the genus Perichares in Area de Conservación Guanacaste, Costa Rica. PNAS 105: 6350–6355.
- 19. Baolin L (1997) FAUNA SINICA, INSECTA Vol.9, Diptera: Culicidae I, II. Beijing: Science Press.
- 20. Akashi H (1994) Synonymous codon usage in Drosophila melanogaster: natural selection and translational accuracy. Genetics 136: 927–935.
- 21. Moriyama EN, Powell JR (1997) Codon usage bias and tRNA abundance in Drosophila. J Mol Evol 45: 514–523.
- 22. Moftah M, Aziz SHA, Elramah S, Favereaux A (2011) Classification of sharks in the Egyptian Mediterranean waters using morphological and DNA barcoding approaches. PLoS ONE 6: 1–7.
- 23. Felsenstein J (1985) Confidence limits on phylogenies: an approach using the bootstrap. Evol 39: 783–791.
- 24. Ward R, Zemlak T (2005) DNA barcoding Australia's fish species. Phil Trans R Soc B: 1–11.
- 25. Hebert P, Stoeckle M, Zemlak T, Francis C (2004a) Identification of birds through DNA barcodes. PLoS Biology 2: e312.
- 26. Papadopoulou A, Anastasiou I, Vogler AP (2010) Revisiting the insect mitochondrial molecular clock: the mid-Aegean trench calibration. Mol Biol Evol 27: 1659–1672.
- 27. Marko PB, Moran AL (2002) Correlated evolutionary divergence of egg size and a mitochondrial protein across the isthmus of panama. Evolution 56: 1303–1309.
- 28. Baker RJ, Bradley RD (2006) Speciation in mammals and the genetic species concept. J MAMMAL 87: 643–662.
- 29. Lohse K (2009) Can mtDNA barcodes be used to delimit species? a response to Pons, et al. (2006). SYSTEMATIC BIOL 58: 439–442.
- 30. Sirivanakarn S (1976) A revision of the subgenus Culex in the Oriental region (Diptera: Culicidae). Contrib Am Entomol Inst 12: 1–272.
- 31. Smith MA, Bertrand C, Crosby K, Eveleigh ES, Fernandez-Triana J, et al. (2012) Wolbachia and DNA barcoding insects: patterns, potential, and problems. PLoS ONE 7: e36514.
- 32. Li W, Gojobori T, Nei M (1981) Pseudogenes as a paradigm of neutral evolution. Nature 292: 237–239.
- 33. Folmer O, Black M (1994) DNA primers for amplification of mitochondrial cytochrome c oxidase subunit I from diverse metazoan invertebrates. Mol Mar Biol Biotech 3: 294–299.
- 34. Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG (1997) The Clustal X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res 24: 4876–4882.
- 35. Kimura M (1980) A simple method for estimating evolutionary rates of base substitution through comparative studies of nucleotide sequence. J MOL EVOL 16: 111–120.
- 36. Saitou N, Nei M (1987) The neighbour-joining method: a new method for reconstructing phylogenetic tree. MOL BIOL EVOL 4: 406–425.