Molecular Species Delimitation and Morphology of Aquatic and Sub-Aquatic Bugs (Heteroptera) in Cameroon

Aquatic and semi-aquatic bugs (Heteroptera) represent a remarkable diversity and a resurging interest has been given to documenting at the species level these insects inhabiting Cameroon in Central Africa due to their potential implication in the transmission of the bacterium Mycobacterium ulcerans, the causal agent of Buruli ulcer, an emerging human disease. A survey was carried out over two years in Cameroon. Morphological analyses were done in two steps. A first step consisted in separating the specimens based on broadly shared characters into morphotypes. The specimens were then separated into two independent batches containing each the same representation of each morphotype. One batch (309 specimens) was used by taxonomy experts on aquatic bugs for species level identification and/or to reconcile nymph with their corresponding adult species. The second batch (188 specimens) was used to define species based on the COI DNA sequences (standard sequence used for “DNA barcoding”) and using the Automatic Barcode Gap Discovery (ABGD) method. The first morphological analysis step separated the specimens into 63 different morphotypes (49 adults and 14 nymphs), which were then found to belong to 54 morphological species in the infra-orders Gerromorpha and Nepomorpha based on the species-level morphological identification, and 41–45 putative molecular species according to the gap value retained in the ABGD. Integrating morphology and “DNA barcoding” reconciled all the specimens into 62 aquatic bug species in Cameroon. Generally, we obtained a good congruence between species a priori identified based on morphology from adult morphotypes and molecular putative species. Moreover, molecular identification has allowed the association of 86% of nymphs with adults. This work illustrates the importance of integrative taxonomy.


Introduction
Aquatic and sub-aquatic true bugs (Heteroptera), comprised in the Leptopodomorpha, the Gerromorpha and the Nepomorpha infra-orders, represent a remarkable species diversity of the aquatic biota with 4,656 species from 326 genera and 20 families found worldwide except in Antarctica [1]. Several surveys of aquatic bugs were carried out in the 1940s in Africa, specifically in West and Central Africa, i.e. the Ivory Coast [2] and Cameroon [3][4][5]. After this period, the aquatic bugs of West Africa were not studied for decades. But since the 2000s, a resurging interest has grown, documenting the species of aquatic and sub-aquatic bugs inhabiting Cameroon because some of them are suspected to be implicated in the transmission of Mycobacterium ulcerans, the causal agent of an emerging human disease named Buruli ulcer [6][7][8][9][10][11].
The current taxonomy of bugs is mostly based on morphological characters of adults, which are more or less reliable because of their intraspecific variability. Moreover, immature forms are difficult to identify based only on morphology because of the lack of discriminating characters [12]. Therefore, complementary approaches must be developed to address taxonomic issues in Heteroptera. In that regard, Wheeler et al. [13] have shown a good congruence between morphological and 18S molecular data to delineate infraorders of Heteroptera but at the species level, more variable sequences than 18S are required. In order to identify taxa which are difficult to separate only on the basis of their morphology, different authors have proposed the "DNA barcoding" which uses a standard region of the mitochondrial gene Cytochrome Oxidase subunit I (COI) [14][15][16]. Several studies have now established that the COI gene is very useful in insect taxonomy including Hemiptera, especially aphids [17][18][19], but also true bugs [17].
Initial work [20,21] reported some limitations using COI-based identification for some Heteropteran groups, but in recent studies, true bug taxa have been identified at both family and species levels, from Asia [17], Korea [22] or Brazil [23].
With the growing implementation of DNA barcoding, it is now possible to not only assign a sample to a pre-existing classification but also to identify unknown species and to decide whether species should be separated or merged using various delimitation methods. Two main classes of methods exist: distance-based methods which consist in clustering sequences in Molecular Operational Units (MOTUs), e.g. the Automatic Barcoding Gap Discovery method (ABGD) [24] and phylogeny-based, such as the Generalized Mixed Yule Coalescent method (GMYC) [25] or the Poisson Tree Processes (PTP) [26].
Here, we integrate both methods based on morphology and molecular data. The ABGD method with COI sequences was applied for the molecular species delimitation. This study aims to contribute to better understand the Cameroon's biodiversity of aquatic and sub-aquatic bugs putatively involved in the transmission to humans of the environmentally-persistent bacteria Mycobacterium ulcerans.
nationwide Decision N°0859/PCBS/MINFOF/SG/DFAP/SDVEF/SC, insects were collected in 10 locations in Cameroon including two existing endemic zones for Buruli ulcer (Akonolinga and Bankim) and eight non-endemic zones with the same ecological characteristics that the two endemic ones (Mbalmayo, Abong-Mbang, Garoua, Tibati, Ngaoundéré, Bamenda, Buéa and Santchou) (Fig 1). A map (Fig 1) was made from field data collected in different sampling sites using Argis software version 10.2.2. Aquatic and sub-aquatic bugs were collected using two sampling methods: directly in aquatic environment by hauling a metallic dip net (32 × 32 cm and 1 mm in mesh size) within a surface of 1 m 2 and at different depth levels (down to a depth of 1 m), and indirectly by using light trapping to capture winged imagoes. Aquatic sampling, which included a large variety of streams, rivers, swamps and flooded areas, was performed in triplicate at each survey session, on three consecutive mornings. Light traps were installed twice at each survey session in each site from 6:30PM to 11:00PM. Two survey periods were realized in 2011 (March to June) and 2012-2013 (September 2012 to February 2013) in the endemic (4 times and 6 times respectively at each survey period) and non-endemic zones (2 times at each survey period). After collection, adults and nymphs (L5) were selected, counted, and preserved in 70% ethanol, which was changed weekly, for morphological identification and molecular analyses. The 22,375 specimens collected were first classified by family using the Heteroptera classification given by Schuh and Slater [27] and then identified as morphotypes in each family. For each morphotype, two independent batches were used. One (n = 309) was used for advanced morphological identification at the "Museum National d'Histoire Naturelle" (MNHN) in Paris, France and at the National Museum of Natural History (NMNH) in Leidenin, The Netherlands. And the other (n = 188) was used for molecular identification in the EGCE laboratory at Gif-Sur-Yvette, France.

COI Amplification
A total of 188 specimens were analyzed among which 45 nymphs and 143 adults. About four individuals per morphotype were sampled for DNA sequence analysis. Total DNA was extracted from legs or full body for small insects, using the NucleoSpin Tissue XS [37] according to the manufacturer's instructions. PCR amplifications were done in 20 μl reaction volumes containing 10 μM of each dNTP (Promega), 10 μM of each primer, 0.5 U of Taq DNA Polymerase (Promega), 1× PCR Buffer (Promega), and DNA extract at about 1/μl. The gene fragments (COI) was amplified using the following pairs of the specific primers LCO 5'-GGTC AACAAATCATAAAGATATTGG-3' and HCO 5'-TAAACTTCAGGGTGACCAAA AAATCA-3' [38].
PCR always started with a denaturation step of 94°C for five min, followed by 25 cycles comprising denaturation at 94°C, for one min, annealing at 50°C for 1.5 min and elongation at 72°C for one min, and ended with a ten min final elongation at 72°C. PCR products were cleaned by Exosap IT [39], a single-step enzymatic clean up that eliminates unincorporated primers and dNTPs. The COI region was sequenced in the cleaned products.

Alignments and Phylogenetic Analyses
The cleaned products were then sequenced to 690 bp nucleotides. Multiple alignments were made using Clustal W according to the default settings: full multiple alignments with bootstrap number equal to 1000. Finally, we obtained a data set of 171 COI homologous sequences of 669 bp. We corroborated the (99%) homology of our sequences with COI sequences obtained from BLAST-GenBank and added to our data set some Genbank reference sequences of COI (see Table 1 for accession numbers).
We ran a Maximum Likelihood analysis within MEGA version 5 [40] adding the outgroup Graphocephala cythura (Table 1). We used the command "best DNA models" in MEGA, which computes the maximum likelihood fits and selects the best model for 24 different nucleotide substitution models. The best score was obtained for GTR+G+I. The evolutionary history was inferred by using the Maximum Likelihood method based on the Tamura-Nei model. The bootstrap consensus tree inferred from 1000 replicates (Felsenstein, 1985) was taken to represent the evolutionary history of the taxa analyzed. Initial tree(s) for the heuristic search were obtained automatically as follows. When the number of common sites was < 100 or less than one fourth of the total number of sites, the maximum parsimony method was used; otherwise BIONJ method with MCL distance matrix was used. A discrete Gamma distribution was used to model evolutionary rate differences among sites (5 categories (+G, parameter = 0.6035)). The rate variation model allowed for some sites to be evolutionarily invariable ([+I], 42.9957% sites). The trees were drawn to scale, with branch lengths measured in the number of substitutions per site. All ambiguous positions were removed for each sequence pair.

Molecular-Based Species Delimitation
Molecular putative species limits were explored with the Automatic Barcode Gap Discovery (ABGD) [24]. This method uses the distribution of pairwise genetic distances to separate samples into putative species. The distribution of pairwise distances is bimodal with intra-specific variation and inter-specific variation separated by the barcode gap that is used as a threshold to delimit species. Alignments were uploaded at http://www.abi.snv.jussieu.fr/public/abgd/abgd. html and ABGD was run with the default settings (Pmin = 0.001, Pmax = 0.1, Steps = 10, X (relative gap width) = 1, Nb bins = 20) and with K2P distances. The data can be partitioned into finer and finer partitions until there is no further partitioning. Tree different partitions were used in this study: p1 (p = 0.0359); p2 (p = 0.0046); p3 (p = 0.0028).

Morphological Identification
Based on morphology, the 309 specimens of aquatic and sub-aquatic bugs studied were firstly grouped into 49 adult morphotypes. After the advanced morphological analysis comparing specimens to the aquatic and sub-aquatic bugs collections preserved in the Museums (MNHN, France; NMNH, The Netherlands), these morphotypes were dispatched in 54 species belonging to 11 different families (Tables 2 and 3). Indeed, each of ten morphotypes were further separated in two distinct species:  (Table 2). Noticeably, for 14 specimens, no species name could be assigned. In addition, MorL14 initially designated as a nymphal morphotype was identified as an adult of the species Neonychia congoensis congoensis Hungerford, 1946. For all remaining nymphal morphotypes it was impossible to attribute a specific name based on the morphology.

Molecular-Based Species Delimitation
Using the ABGD method, we explored the limits between the different species using the COI sequence from 188 specimens including 49 adult and 14 nymphal morphotypes. The COI amplification failed for some adult morphotypes (Mor8, Mor18, Mor25, Mor33), certainly due to DNA degradation. The molecular data set obtained for the 45 adult morphotypes were partitioned into 41 or 45 putative molecular species according to the gap value retained corresponding to distance values of 0.0359 and 0.0028, respectively (Table 4), (S1 and S2 Figs). Three morphotypes (Mor28, 36,42) were each split into two molecular species. If the partition 3 is considered, two additional morphotypes Mor7 and Mor38 would also each split into 2 molecular species (Table 4). In five cases, two or three morphotypes were merged into one molecular species (Mor26-27, Mor31-32; Mor40-41-42, Mor44-48, Mor45-46, Mor49-30) ( Table 4). Table 5 shows the assignment of nymph molecular species to adult molecular species and the a posteriori species names associated to the corresponding adult determined based on morphology. According to the gap-value used, 11 (p = 0.0359) to 17 (p = 0.0028) molecular species could be recognized for the nymph morphotypes (Table 5). If we considered the partition 3 (p = 0.0028), out of the 14 a priori morphological morphotypes, eight could be assigned to a single molecular species (Table 5). The morphotypes, MorL2, MorL4, and MorL8, were split into two or three molecular species (Table 5). Two nymphal morphotypes and MorL10 and MorL11 were merged into one molecular species (No6) ( Table 5). The molecular study of the nymphs sample allowed the association of nymphs with adults for 11 species but not for four species (Table 5).

Morphological and Molecular Aquatic Bugs Species Biodiversity in Cameroon
For the first set of samples (based on morphological criteria), 54 species were determined and for the second set (based on molecular criteria), 52-62 species were putatively delimited using COI molecular marker. Nine species were lacking from the molecular sampling and if we consider the partition p3, 8 putative molecular species have no morphological species associated with them (Tables 4 and 5). But, for the adults, we observed a good congruence between the morphological and the molecular study for the determination of the a priori morphotypes. In some cases, the a priori determination of morphotypes lead to recognize two species while only one was assessed by the two methods as for two cases in Veliidae (Mor44-48, Mor45-46), Notonectidae (Mor26-27) and Hebridae (Mor30-49). By contrast, two species were recognized by the two methods for a single a priori morphological species in Gerridae (Mor31, Mor33), Hydrometridae (Mor38). However, some discrepancies are noticed for Gerridae (Mor31-32) and Mesoveliidae (Mor40-41-42).
If we validate all the morphological species and the molecular species for which no morphological species was associated, the combination of the two dataset (morphological and molecular) yield a total of 62 aquatic bugs species in Cameroon.
The species biodiversity of aquatic bugs varies among identified families. The most diversified families are Notonectidae (13 species), Belostomatidae (11 species), Gerridae (10 species)    (Table 6).  The number of specimens examined is given in parentheses.
doi:10.1371/journal.pone.0154905.t004 Table 5. Assignments of nymph morphotypes to adult morphotypes determined by molecular identification and a posteriori species names associated to the corresponding adult as determined based on morphology (Tables 1 and 2

Discussion
This study complements the work realized by Poisson [3][4][5] on the diversity of aquatic Heteroptera in Cameroon. In view of the rising worry regarding the potential importance of aquatic bugs in the transmission of the emergent disease Buruli ulcer [6][7][8][9][10], this most up-to-date estimation of aquatic bugs biodiversity is especially relevant. Aquatic bugs biodiversity of Cameroon previously reported by Poisson [3-5, 31-33, 41, 42] reached 15 families including 95 species. With our dataset, we reported in this country 9 new genera and 34 new species records including 14 putative new species. Overall, 45 genera and 125 species of true aquatic and subaquatic bugs must be considered in Cameroon (Table 6). This study underlines the difficulty in identifying the right species based on classical literature in the field without access to museum collections and connection with expert taxonomists. There is a divergence between the number of morphotypes determined a priori in the field by non-experts relying on basic identification keys and the morphological analysis carried out by taxonomists from the museum specialized in aquatic bugs. This difference in expertise   explains why different morphological species could not be distinguished in the field as different morphotypes and vice versa. In bugs, identification at the species level is difficult in some families, such as Mesoveliidae and Veliidae, and usually requires the presence of males. Some species have different morphological characteristics in males and females for example at A. sardeus and E. glauca males and females were initially classified into different morphotypes (Table 1). Moreover, taxonomical keys are lacking to differentiate the nymphal stages. Additional specimens from both sexes and further studies, including possible generic revisions, are needed in order to confirm the status of putative new species. The use of molecular species delimitation method allows to estimate the specific richness and noticeably, independently of the life stage or the sex of the sample. Thus, molecular identification allows the association of immature stages to adult stages. All nymphal morphotypes could be assigned to an adult morphotypes using this method except for ML2, ML11 and ML17 (Table 5). Even though in most cases there is congruence between the morphological and molecular species, there are some exceptions. In their work on true water bugs, Park et al. [17] noted, in some cases, a large unusual intra-specific genetic distance and in some others a very small distance between species.
In our study, in the case of a lack of corresponding molecular species in some morphological species, this could be due to the reduced number of specimens (188) used for the molecular analysis compared to the number of specimens (309) used for the morphological analyses. In other words, the molecular samples might not have been as representative a sample as desired. An explanation for sharing the same molecular putative species between different morphological species is that using a single gene could miss instances of very recent speciation events caused by selection at a few number of loci because a drastic change in morphology could be due to a few genes among which COI is not included. In other words, the "neutral" marker COI might not carry any record of species divergence in that case. Another explanation could be the fact that COI is a mitochondrial gene; it could have introgressed into another species after some hybridization events between the two species. A last hypothesis is that the species based on molecular marker are real but several morphotypes are present within one species.
It is better to complement this type of study with phylogenetic analyses to determine relationships between species or group of species, but due to the limited sampling size of our study and the use of a single mitochondrial gene that is not intended for building phylogeny, we performed only the separation of the molecular species within the families of aquatic bugs identified. Previous works based on many loci described the phylogenetic relationships of the aquatic bugs more accurately. Hebsgaard et al. [43] used 16S, 28S and COI/COII, whereas Hua et al. [44] used 37 mitochondrial loci to obtain a well-resolved phylogeny of the true aquatic bugs. But these studies concerned Europe, USA, Australia, Philippines, Madagascar and Vietnam whereas the present study is the first in the Afrotropical region.

Conclusion
This study improves our knowledge on the diversity and distribution of aquatic bugs in Cameroon and confirms that COI can reliably be used to identify species in most families of aquatic bugs described here apart from the few exceptions observed. In the near future, molecular identification could also help to routinely identify aquatic bug species of importance in the transmission of the bacillus causing Buruli ulcer in human in the tropical region.
This pioneering study will be extended to other Afrotropical region to better document the biodiversity of aquatic bugs in this part of the world.
Supporting Information S1 Fig. Molecular putative species delimited by ABGD according to aquatic bug families using partition 1 in Nepomorpha infra-order. Each color represents one ABGD putative specie delimited with associated nymph corresponding to code situated in front of vertical line respectively families' initial name followed by putative specie number and ML followed by putative nymph specie number, before vertical line Mor = adult morphotype code following by morphotype number and individual number, ML = nymph morphotype code following by morphotype number and individual number. (TIFF) S2 Fig. Molecular putative species delimited by ABGD according to aquatic bug families using partition 1 in Gerromorpha infra-order. Each color represents one ABGD putative specie delimited with associated nymph corresponding to code situated in front of vertical line respectively families' initial name followed by putative specie number and ML followed by putative nymph specie number, before vertical line Mor = adult morphotype code following by morphotype number and individual number, ML = nymph morphotype code following by morphotype number and individual number. (TIFF) S1 File. Fasta file of sequences data set underlying the findings in our study in the manuscript with one outgroup sequence. Sequence name corresponds to families initial name followed by individual code number. (FAS)