Browse Subject Areas

Click through the PLOS taxonomy to find articles in your field.

For more information about PLOS Subject Areas, click here.

  • Loading metrics

Uncovering Trophic Interactions in Arthropod Predators through DNA Shotgun-Sequencing of Gut Contents

  • Débora P. Paula ,

    Affiliations Embrapa Genetic Resources and Biotechnology, Parque Estação Biológica, Brasília, Brazil, Department of Life Sciences, Natural History Museum, Cromwell Rd, London, United Kingdom

  • Benjamin Linard,

    Affiliation Department of Life Sciences, Natural History Museum, Cromwell Rd, London, United Kingdom

  • Alex Crampton-Platt,

    Affiliations Department of Life Sciences, Natural History Museum, Cromwell Rd, London, United Kingdom, Department of Genetics, Evolution and Environment, Faculty of Life Sciences, University College London, Gower Street, London, United Kingdom

  • Amrita Srivathsan,

    Affiliations Department of Life Sciences, Natural History Museum, Cromwell Rd, London, United Kingdom, Department of Biological Sciences, National University of Singapore, Singapore, Singapore, Department of Life Sciences, Imperial College London, Silwood Park Campus, Ascot, United Kingdom

  • Martijn J. T. N. Timmermans,

    Affiliations Department of Life Sciences, Natural History Museum, Cromwell Rd, London, United Kingdom, Department of Life Sciences, Imperial College London, Silwood Park Campus, Ascot, United Kingdom, Department of Natural Sciences, Hendon Campus, Middlesex University, The Burroughs, London, United Kingdom

  • Edison R. Sujii,

    Affiliation Embrapa Genetic Resources and Biotechnology, Parque Estação Biológica, Brasília, Brazil

  • Carmen S. S. Pires,

    Affiliation Embrapa Genetic Resources and Biotechnology, Parque Estação Biológica, Brasília, Brazil

  • Lucas M. Souza,

    Affiliation Embrapa Genetic Resources and Biotechnology, Parque Estação Biológica, Brasília, Brazil

  • David A. Andow,

    Affiliation Department of Entomology, University of Minnesota, St. Paul, Minnesota, United States of America

  • Alfried P. Vogler

    Affiliations Department of Life Sciences, Natural History Museum, Cromwell Rd, London, United Kingdom, Department of Life Sciences, Imperial College London, Silwood Park Campus, Ascot, United Kingdom


Characterizing trophic networks is fundamental to many questions in ecology, but this typically requires painstaking efforts, especially to identify the diet of small generalist predators. Several attempts have been devoted to develop suitable molecular tools to determine predatory trophic interactions through gut content analysis, and the challenge has been to achieve simultaneously high taxonomic breadth and resolution. General and practical methods are still needed, preferably independent of PCR amplification of barcodes, to recover a broader range of interactions. Here we applied shotgun-sequencing of the DNA from arthropod predator gut contents, extracted from four common coccinellid and dermapteran predators co-occurring in an agroecosystem in Brazil. By matching unassembled reads against six DNA reference databases obtained from public databases and newly assembled mitogenomes, and filtering for high overlap length and identity, we identified prey and other foreign DNA in the predator guts. Good taxonomic breadth and resolution was achieved (93% of prey identified to species or genus), but with low recovery of matching reads. Two to nine trophic interactions were found for these predators, some of which were only inferred by the presence of parasitoids and components of the microbiome known to be associated with aphid prey. Intraguild predation was also found, including among closely related ladybird species. Uncertainty arises from the lack of comprehensive reference databases and reliance on low numbers of matching reads accentuating the risk of false positives. We discuss caveats and some future prospects that could improve the use of direct DNA shotgun-sequencing to characterize arthropod trophic networks.


Understanding the complexity of trophic interactions and their causes and consequences has been a major focus of ecology since Elton [1] developed the concept, continuing to the present with about 250 scientific papers each year. However, as many trophic interactions are infrequent and hard to see [2], especially for small arthropods, important trophic links are under-represented. Molecular analyses of predator gut contents have become the dominant approach to identify arthropod trophic interactions [36], especially PCR-based methods.

Lately, several studies have combined classic single DNA barcoding with next-generation sequencing, referred to as metabarcoding, which has overcome the limitations of Sanger sequencing when working with mixtures of templates, as those present in the gut [7,8]. However, these PCR-based methods have various limitations that restrict their general applicability and overall scope. Specifically, standard barcode markers (e.g., cox1, rRNAs) are not sufficiently conserved across taxonomic groups to develop universal or generic primers without target match problems [911]. Thus, taxonomic biases and inaccurate estimation of the relative abundance of some taxa are common due to marker choice and variation in primer efficiency [1118].

In addition, in most cases, the feeding by an arthropod on other arthropods limits the use of universal barcode primers. More generally, the selection of primers made by the researcher determines the target organisms and target genes, which constrains the research questions that can be addressed from the outset, e.g. targeting either eukaryotic or bacterial genomes, but not both. This constrains the characterization of complex food webs of generalist predators, as not all prey species can be anticipated and trophic links are often unknown. PCR-based approaches are particularly problematic for the analysis of intraguild predation [19] among closely related species and require the development of stringently diagnostic primers for each species. In these cases of taxonomic proximity between the target prey taxa and the focal predator, sequences of the latter prevail in PCR due to their much higher abundance and lower degradation compared to ingested DNA, unless ‘blocking’ primers can be designed that discriminate efficiently against the host [7,20] or PCR products are sequenced deeply. Together, these various constraints limit the scope of metabarcoding, which performs best on target groups in a narrow taxonomic window within which primer efficiency is relatively uniform, and at the same time the target groups are taxonomically distant from the host organism (e.g. plant chloroplast DNA in the gut of insects; e.g. [21]). Even where these conditions are fulfilled, the PCR-based analysis of ingested (degraded) DNA requires the use of short fragments, which can result in poor taxonomic resolution [22]. Whereas several attempts have been made to circumvent the limitations due to short fragment length (e.g., [2325]), biases associated with PCR still remain [11,18,26].

A more suitable approach to construct complex trophic interaction networks and to study multiple interactions at various trophic levels is needed that combines high taxonomic breadth and resolution [11,27]. Recently, metagenomics pipelines have been developed for studying the arthropod diversity in a specimen mixture [16,28,29]. However, in gut content analysis an additional challenge for using metagenomics is posed by the target environmental sample, which is digested and degraded DNA, precluding the use of existing metagenomic pipelines (assembly > binning > annotation of the metadata [30]). Using feeding trials, Srivathsan et al. [22] and Paula et al. [31] identified diet composition using an alternative metagenomic shotgun-sequencing pipeline of matching unassembled reads from faeces or gut contents with reference databases, and filtering for high overlap length and identity. The sensitivity and specificity of this method depends on the DNA reference databases. As there is no limitation to the identification of taxa imposed by specific molecular markers, the outcome of this approach promises a broader identification of foreign species, including prey, internal parasites and associated microbiomes.

In this work, we aimed to test the power of this alternative DNA shotgun-sequencing pipeline to construct a qualitative trophic interaction network of generalist arthropod predators that are considered to be effective biocontrol agents. These include three species of ladybirds (Coleoptera: Coccinellidae) and an earwig (Dermaptera: Forficulidae), which feed on various herbivorous insects, in particular aphids and moths, and potentially also on the juvenile stages of each other. We used the Illumina platform for shotgun-sequencing of DNA isolated from gut contents of field-collected specimens and an unfed predator as a control. In parallel, we constructed DNA reference databases through mining of publicly available sequence information, including partial and complete mitochondrial and nuclear genomes, and barcode sequences, complemented with sequences of taxa expected to interact with these predators in the sampled habitat. We were able to identify the taxonomic composition of foreign DNA in the guts of these predators. This includes those species directly preyed upon, but also produced a broader picture of the associated organisms, such as parasitoids and microbiomes, for the identification of the trophic interaction links with good taxonomic breadth and resolution. However, the lack of complete reference databases limits the ability to identify all prey taxa, and the low recovery of matched reads limits the sensitivity of this approach and accentuates the need to reduce the risk of false detection of spurious reads possibly generated from sample contamination or highly degraded DNA. This initiative is an effort to develop a satisfactory methodology to determine the targets of various predators that could be used in biological control.

Material and Methods


The predator coccinellids Cycloneda sanguinea (n = 5), Harmonia axyridis (n = 1) and Hippodamia convergens (n = 6), and the dermapteran Doru luteipes (n = 10), were collected at two organic farms in central Brazil (15°58'27.67"S, 47°29'49.94"W, and 15°49'28.01"S, 48°15'9.66"W) during November 2012. The farms produced similar crops in small fields, including cabbage, cassava, lettuce, and tomato, surrounded by leucaena, banana, coffee and timber trees. All specimens were immediately immersed in 100% ethanol (species were kept separate) and stored at -80°C until total DNA extraction. All collections were authorized by SISBIO (authorization number 36950), and access to the genetic heritage and transportation of biological material was authorized by IBAMA (authorization number 02001.008598/2012-42). Field-collected H. axyridis pupae were allowed to emerge in the laboratory, and unfed adults (<24 h) were stored at -80°C in 100% ethanol, and designated as a control.

Shotgun-sequencing of the predator gut contents

The methodological pipeline is illustrated in Figure A in S1 File. To prevent cross contamination among samples, all the preparation procedures (i.e. gut dissection, DNA extraction, quantification) for each predator sample were done on separate days with different tools and materials (tweezers, scalpels, and petri dishes), which were washed with neutral soap, soaked in 5% bleach for 30 min and washed in 70% ethanol. Prior to sample preparation, the bench and instruments were cleaned using 70% ethanol. Guts from the same species were pooled (the control was pooled separately) immediately after dissection in the first buffer of the DNA extraction kit using a DNA-free microtube placed in ice. The DNA extraction, estimation of DNA concentration, and sample quality check were done as described in Paula et al. [31]. The DNA concentration across samples was normalized to 20 ng/μl and TruSeq libraries were constructed. All predator gut contents were sequenced on an Illumina MiSeq for 250 bp single-end reads (300 cycles, mean insert size 300 bp, v2 chemistry, proportion of the flow cell: 16.1% for Cy. sanguinea, 19.8% for D. luteipes, 27.6% for H. axyridis), except for Hi. convergens at 10% of a run and 250 bp paired-end reads (500 cycles, insert size 600–900 bp bp, v2 chemistry) and the control H. axyridis at 17% of a run and 250 bp paired-end reads (500 cycles, insert size 450 bp, v2 chemistry). For these last two samples, library preparations and sequencing were done in a separate flow cell, months apart from that of the other predators, preventing cross-contamination. All the programs and settings used in the bioinformatics analyses were based on Paula et al. [31]. The quality assessment for each dataset was done using FastQC v0.10.1 ( Library index adapters were trimmed using Trimmomatic v0.30 (ILLUMINACLIP:2:30:10) [32]. The reads were passed through a quality control check using PrinSEQ v0.19.2 [33], with a minimum Phred quality score of 20. The retained reads after quality control were converted to Fasta format using a CAT and Perl script [31].

DNA reference databases

Six DNA reference databases (Fasta format) with different taxonomic resolution were constructed for DNA identification: a) Insecta mitogenomes, obtained by downloading all available (at November 2013) insect mitogenomes of 588 species from GenBank (Table A in S1 File), supplemented by sequencing the mitogenomes of 17 potential prey species (Tables B and C in S1 File) with long-range PCR or shotgun methodology [31,34,35]; b) cytochrome oxidase 1 (cox1) barcode sequences of 58,367 Insecta species using a MegaBLAST based pipeline implemented by Hunt et al. [36] and adapted by Srivathsan et al. [22,37]; c) aphid nuclear genomes obtained from AphidBase ( consisting of the preliminary genome assemblies of Myzus persicae and Aphis gossypii, and the complete nuclear genome of Acyrthosiphon pisum (assembly Acyr_2.0; placed and unplaced scaffolds; GenBank Assembly ID: GCA_000142985.2); d) parasitoid barcode sequences (ITS, rRNA genes, elongation factor, and cytb) from known parasitoid genera of the predators and aphids available at GenBank; e) rRNA genes containing data on 3,000,000 bacteria, 150,000 archaea, and 250,000 eukaryote sequences obtained from the SILVA rRNA database (release 115) [38]; f) genomes of bacteria genera known to include insect endosymbionts [31]: Arsenophonus, Blattabacterium, Buchnera, Cardinium, Hamiltonella, Midichloria, Nosema, Regiella, Rickettsia, Rickettsiella, Serratia, Spiroplasma and Wolbachia.

Prey and other foreign DNA detection

After quality control, reads from each predator gut content dataset were matched against the DNA reference databases using BLASTn v2.2.27+ (E-value <1e-5; maximum target sequences 3; no dust) (for Insecta mitogenome, parasitoids, and rRNA databases) and MegaBLAST 2.2.27+ (E-value <1e-9) (for cox1, aphid genomes and bacterial genome databases) [39]. The matched reads were filtered for minimum overlap length of 225 bp and identity of 95% (for the bacterial genome database), 98% (for the mitogenome database) and 99% (for all other reference databases) using a set of Python customized scripts to eliminate false taxon identifications as in Paula et al. [31] and Srivathsan et al. [22,37] ( The mapping region for each matched read was checked manually. Reads matching to the following regions were discarded: a) control region of mitogenomes; b) nuclear repeat regions, such as short sequence repeats-SSR; c) Coccinellini rRNA because this region does not differentiate prey from predator. Using these rigorous criteria, a matched read was assigned to a taxon for the mitogenome, cox1, aphid genomes and bacterial genome databases. For the rRNA and parasitoid databases, taxon assignment was made to the lowest taxonomic level that could be reliably identified.


Prey and other foreign DNA detected in the predator guts

The Illumina sequencing of the libraries of the four predator species generated >2 million reads per species after quality control. Several thousand reads matched across the mitogenome of the respective focal predator (Figures B-E in S1 File) ranging from 0.07% to 0.7% of total reads depending on the library (Table 1). An additional 15 to 239 reads per library were identified as insect mitochondrial DNA but foreign to the focal predator (Table 1). The highest number of foreign insect taxa identified based on these reads was found in Hi. convergens (n = 8, Table 2). For the other libraries, the numbers ranged from two to three, and were not related to the number of pooled predator specimens. In addition, non-insect DNA reads were found using the various databases (Table 1). Doru luteipes had the highest richness of symbionts (nine genera), plants, fungi and non-symbiont bacteria, while Cy. sanguinea had the lowest. We did not detect foreign reads in the gut of the unfed recently emerged (control) H. axyridis, except for one widespread and ubiquitous bacterium (Table 1).

Table 1. Number of Reads and Taxa (in parentheses) in the Gut Content of the Predators.

The taxonomic assignment of the foreign DNA in the gut content of each predator is presented in Table 2, along with the number of reads matched to each of the DNA reference databases. There were fewer matches and fewer species identified using the cox1 database than the mitogenome database, consistent with the ~20x greater size of the mitogenome and the greater number of target genes which were hit apparently randomly (Figures B-E in S1 File). In addition, to search for aphid prey we also used available whole-genome sequences to map the shotgun reads, which revealed the presence of Aphis gossypii in the H. axyridis gut, consistent with the matches to the Insecta mitogenomes (Table 2).

Table 2. Foreign Taxon and Corresponding Number of Reads (in parentheses) Detected in the Gut Content of each Predator using Different DNA Reference Databases.

The most common identified prey were intraguild prey and aphids. Evidence of intraguild predation was detected in all predators. Harmonia axyridis was preyed upon by all other predators, whereas predation on Cy. sanguinea and Hi. convergens was not detected. Hippodamia convergens had the highest intraguild prey richness of three coccinellid species and one non-coccinellid predator (Orius insidiosus). Other extraguild prey detected were the lepidopterans Spodoptera frugiperda, Helicoverpa sp., and Plutella xylostella, the parasitoid Di. coccinellae, and the stink bug Euschistus sp. Plutella xylostella had a population outbreak in the organic cabbage field where D. luteipes was collected.

Non-prey foreign DNA was detected using the parasitoid, nuclear rRNA, and bacteria genome databases. Parasitoids were detected in all predators using the parasitoid reference database. However, at the 99% identity threshold used here, the reads matched the conserved V5 region of 18S rRNA that was identical in closely related species. Consequently, we reported the taxonomic level at which the match provided a reliable taxonomic determination (families of Chalcidoidea and Aphidiinae in Braconidae). For aphid parasitoids, this was sufficient to distinguish hyperparasitoids from primary parasitoids and to identify two broad groups of primary parasitoids. Chalcidoidea parasitoids were detected in all coccinellids, while Aphidiinae were detected in all but Cy. sanguinea.

The nuclear rRNA database revealed that all of the predators harbored a remarkable diversity of foreign DNA (Table 2). Reads matching Spermatophyta and fungal species (Streptophyta, Ascomycota, and Basidiomycota) were found in all of the species, but the dermapteran D. luteipes had by far the highest number of these reads (Table 2). The abundance of plant DNA may be from indirect consumption because P. xylostella caterpillars are typically stuffed with largely undigested leaf tissue in their guts. The large quantity of fungal DNA may be associated with the scavenging behavior of this species.

The bacteria genome database returned several bacterial species associated with insects, including facultative and obligate symbionts of aphids. Three aphid-specific symbionts, Hamiltonella, R. insecticola, and S. symbiotica, were detected in all four predators, but the strict obligate Buchnera, which is a powerful marker of aphid consumption immediately after feeding [31], was not detected in any of the samples. This probably happened because Buchnera DNA detection drops very fast after aphid consumption, while the other prey symbionts keep in the predator guts longer [31]. Doru luteipes was the only predator containing numerous reads of Blattabacterium, which is a mutualistic endosymbiont of cockroaches and some termites [40], and possibly other cryptic arthropods such as Doru. Nosema, an obligate parasitic microsporidia, was found only in D. luteipes.

Predator qualitative trophic network

Based on read matches, a qualitative trophic network was drawn for the four predator species in the Brazilian agroecosystem (Fig 1). Primary predation is straightforwardly established for the herbivorous and sap sucking species of Lepidoptera, Hemiptera and Homoptera. In some cases, the interaction could be established only indirectly through the presence of parasitoids or bacteria that establish an association with a primary predation event on aphids. For example, in Cy. sanguinea, even though no aphid DNA was detected, an aphid symbiont (Hamiltonella) was present. Similarly, an aphid parasitoid (Aphidiinae) and an aphid symbiont (R. insecticola) were found without detection of aphid DNA in Hi. convergens. In contrast, direct evidence for aphid DNA was seen in H. axyridis and D. luteipes, which also contained aphid parasitoid DNA (Aphidiinae and Aphidius sp.) and aphid symbionts (R. insecticola, Hamiltonella and S. symbiotica). Both species also harbored Arsenophonus, Rickettsiella, Spiroplasma and Wolbachia, of which some may be (endo)symbionts of the predators.

Fig 1. Qualitative trophic network for the focal predators using DNA shotgun-sequencing of their gut content.

The focal predators are in black balloons, the prey are in white (extraguild) and grey (intraguild) balloons with black letters and edges, the parasitoids in white balloons with grey edges, in which grey letters are for the known aphid parasitoids, and black letters for other parasitoids. The arrows indicate the flux of biomass, in which black arrows indicate direct predation, and the dashed arrows indicate inferred predation associated with symbionts. The numbers at the origin of each arrow indicate the number of reads supporting the arrow, which for inferred predation is a thesum of the total aphid-specific symbiont reads.

In sum, these foreign DNA detections can be integrated into a network of interactions of predation and parasitism, based on existing biological knowledge about the detected taxa for the indirectly established interactions (Fig 1). The interpretation of the detection of known parasitoids could signal both the direct ingestion by predators or the indirect ingestion of the parasitized hosts, or could even signal the parasitism of the predator itself in the case of the coccinellid specific parasitoid, Di. coccinellae. The total insect trophic links identified for each focal predator using the Insecta mitogenome, cox1 and aphid genome databases was: two (plus one indirect link) for Cy. sanguinea, eight (plus one indirect) for Hi. convergens, two for H. axyridis, and three for D. luteipes, of which 93% were species or genus specific (Table 2).


The shotgun-sequencing of the DNA in the gut of predators enabled the identification of a set of taxonomically and functionally diverse foreign taxa, and this information was used to infer a qualitative trophic interaction network (Fig 1). We detected numerous foreign DNA sequences that could be identified to 15 taxa at the species or genus levels and assigned a clear trophic linkage. The results provided similar taxonomic breadth but higher taxonomic resolution for prey identification compared to other works using multiplex-PCR or metabarcoding on insect gut contents (Table 3). For most components of the trophic network these interactions were established based on just a few reads, although in some situations multiple evidences (e.g. co-detection of aphids and their microbiome) support the interaction. Particularly interesting are predator-on-predator interactions, inferred from the presence of predator-associated reads in a library generated from the gut of a different predator. However, some concerns remain about the minimum number of reads needed to confidently identify a prey species, because low number of reads can indicate not just low biomass of prey ingested or longer elapsed time after predation, but also false detection of spurious reads generated from sample contamination, sequencing or bioinformatics errors [41]. Although our negative control indicated lack of sample cross-contamination in our results, we nevertheless consider the network as drawn (Fig 1) to be no more than a heuristic summary of trophic interactions that require further testing with improved methodology and additional field sampling.

Table 3. Taxonomic Breadth and Resolution of the Foreign Species Identified in Arthropod Gut Contents by DNA-based Molecular Tools.

Biological knowledge about the identified taxa was required to establish the ingestion of aphids indirectly via aphid symbionts and parasitoids, and the ecological role of the linkages with parasitoids remains uncertain (direct or indirect consumption). Taken at face value, the identified linkages reveal predation on prey species known to occur at the sampled sites. Cycloneda sanguinea and Hi. convergens were the most abundant coccinellid species in the sampled organic farms. Although aphid DNA was not detected in either Cy. sanguinea or Hi. convergens, the detection of aphid parasitoid taxa in Hi. convergens, and aphid symbionts in both coccinellids strongly implies the consumption of aphids (Table 2). As the dominant aphid species in these habitats were probably Uroleucon sp. and Brevicoryne sp. [45], neither of which were in the aphid genome or mitogenome reference databases, the probability of detection of aphid DNA was reduced. In the dermapteran D. luteipes the observed direct consumption of aphids was also confirmed by the indirect linkages of an Aphidius parasitoid and three aphid-specific symbionts.

Intraguild predation was commonly detected and was asymmetrical. Hippodamia convergens fed on three other ladybirds, while no other predator species was detected in the gut of H. axyridis. In the temperate zone, H. axyridis is considered to be an aggressive intraguild predator and less commonly an intraguild prey [46]. However, in this Brazilian system, it was commonly detected as the intraguild prey. Finally, the detection of the coccinellid parasitoid, Di. coccinellae, represents an interesting addition to the trophic network with its potential to regulate the species composition of coccinellid assemblages. While we can plausibly assume that the Chalcidoidea and Aphidiinae parasitoids and symbionts of aphids (Table 2) were indirectly preyed upon through predation on their hosts, rather than direct predation on the free-living adults, Di. coccinellae pupates outside its host, and thus can be acquired by direct predation, rather than by feeding on a parasitized immature or adult coccinellid. The latter scenario seems unlikely because the parasitoid prefers adults [47], and intraguild predation is not common on adults [48]. A third scenario is that one of the ladybird specimens used for the gut analysis was itself parasitized by Di. coccinellae, and traces of its non-degraded DNA (from live tissue) was transferred into the gut extraction.

The key step of the DNA shotgun-sequencing analysis is the identification of reads as being foreign to the focal predator genome, which is established based on a stringent match to known barcode and genome sequences in the reference databases. For this, important issues are the completeness of identifying foreign DNA (reducing false negatives, i.e., identifying all of the foreign DNA in the predator gut) and the accurate taxonomic assignments of these DNA sequences (reducing false positives, i.e., not misidentifying taxa).

Regarding the issue of false negative risks, the foreign reads have to be detected among millions of others that are unidentified and presumably mostly originate from the predator nuclear genome. Thus, the taxonomic breadth and resolution of foreign DNA detection is directly related to the comprehensiveness of the DNA reference databases and the diagnostic power of the available genetic markers. Barcode and rRNA data were available in our reference databases for tens of thousands of species, but this may correspond to just a few percent of the total species in existence [49]. Thus, many species represented in the shotgun read mixture may remain unrecognized for the lack of representation in the reference database. Genetic variation and highly degraded prey DNA also add to the probability of missing a species represented in the mixture.

The use of the much more comprehensive cox1 database with 58,367 Insecta species reference sequences did not detect more or unique prey species compared to the mitogenome database (Table 3). This database differs from the mitogenome database in its much greater species representation, but unlike the mitogenome database it was not supplemented with the local pool and this may result in recovery of fewer prey species. Secondly, the ~20x shorter length of the cox1 barcode compared to mitogenome likely resulted in a lower recovery probability. A greater taxonomic coverage of mitogenomes is expected to aid in particular the resolution of parasitoid identifications and the detection of more aphid reads. Parasitoid detection was mainly against the parasitoid database, a collection of widely sequenced genes for diverse aphid parasitoid species, but the reads only matched the conserved regions of the 18S rRNA gene, which were invariable across entire clades and so parasitoid identifications could not be at the species level (Table 2).

The underrepresentation of mitochondrial and nuclear genomes of the aphid and parasitoid species in the DNA reference databases clearly limits the taxonomic resolution and the number of reads recovered. Where nuclear genome sequences available, the detection of a prey is expected with higher read numbers. For example, Paula et al. [31] detected Acyrthosiphon pisum in feeding trials using its nuclear genome at ~50x higher read numbers than by using the mitogenome. The predator gut content DNA datasets obtained here can be reanalyzed using a more comprehensive set of DNA reference databases to improve the taxonomic breadth and resolution of prey and other foreign species. In particular, investments in the construction of local mitogenome and draft nuclear genome databases of the species that share the habitat with the predators are needed because the single-locus cox1 barcode database provides comparatively low read coverage for read matching (Table 2).

Regarding the risk of false positives, this is reduced because a high stringency of overlap length and sequence identity for taxonomic identification was applied, although this approach resulted in a low number of matching reads. For example, when we filtered the Hi. convergens gut content dataset against the Insecta mitogenome database using 90% identity, we obtained 775 reads for 11 foreign species, instead of 198 reads for eight foreign species when using 98% identity (Table 1). In the three species eliminated, all of the reads mapped to the mitochondrial D-loop region. Most of the remaining eliminated reads also mapped to this region. The proportion of D-loop matching reads was unexpectedly high at a 90% identity threshold, and does not give confidence in a prey species assignment. The D-loop matches are probably spurious, perhaps resulting from the high AT content and possibly conserved tandem repeats characteristic of these regions that limit the power for species discrimination. Therefore, the 98% identity threshold for prey identification in the mitogenome database was appropriate to eliminate false positive identifications. In a similar way, we establish identity thresholds for the other reference databases.

Possible ways to address the issue of low number of matching reads could be increasing the sequencing depth and optimizing the library insert size. In our various libraries we used insert sizes between 300 and 900 bp and paired-end sequencing in some cases, but while this produces high-quality reads for secure species-level identification, it probably reduces the number of reads to be detected for the prey DNA in digestion in the gut of the predator. Degraded DNA in the gut is expected to be short (e.g., ≤200 bp [50,51]) depending of the elapsed time since ingestion. Consequently, long library insert size may discriminate against the ingested DNA in favor of the non-degraded DNA from live predator tissue. At the same time, such an insert size can be used to generate longer sequences, which improves the accuracy of taxon identifications. Therefore, a balance between the library insert size and the read length should be considered. Much higher sequencing depth can be achieved at the same cost with single-end sequencing and shorter reads, and using the more powerful HiSeq platforms.

The metabarcoding approach may still be considered, due to the much larger number of foreign reads. However, PCR on degraded DNA is notoriously difficult, and variable PCR efficiency may skew the relative read numbers in the mixture due to imperfect primer target match [13,15,26, 52, 53]. Thus, metabarcoding does not necessarily improve the completeness of species detection [12, 54, 55] for trophic interaction studies of generalist predators. Clarke et al. [17] observed in silico that a set of widely used ‘generic arthropod’ cox1 metabarcoding primers only managed to recover 43 to 64% of the species in a known mixture of arthropod DNA. Ji et al. [56] found that many cox1 metabarcoding identifications are reliable, but were only able to link the most commonly recovered foreign sequences to operational taxonomic units (OTUs), despite the large cox1 reference database. In addition, some of the reads can originate from contamination or chimeras [13, 57]. Metabarcoding studies therefore use an arbitrary threshold for the minimum number of reads per sample (e.g., discarding reads occurring less than four times, as in [42]).

In conclusion, even with modest sampling effort for four arthropod generalist predators, this DNA shotgun-sequencing approach provided similar taxonomic breadth and higher taxonomic resolution for prey identification compared to other works using multiplex-PCR or metabarcoding on insect gut contents (Table 3). At the current stage, however, the number of matching reads needs to be increased and the interpretation of low read matches needs to be validated. Despite this caveat, it is reasonable to conclude that shotgun-sequencing has significant advantages compared to the current methodologies and therefore might emerge as a preferred choice for gut content analysis because of: a) high confidence in the foreign DNA identifications; b) potential to simultaneously provide high taxonomic breadth and resolution; c) its practicality for working even with closely related prey and predator species, i.e. arthropods preyed upon by arthropods and even within a single guild; d) its barcode-free and PCR-free detection, eliminating PCR bias in species detection and measures of abundance; and e) an analysis of a broader spectrum of the ecosystem, including microbial symbionts and commensals, without the need for additional PCR targeting these groups.

Supporting Information

S1 File. Mitochondrial Insecta reference database, baits, elucidated mtDNA, bioinformatics pipeline, and mapping of reads to prey mitogenomes.



We thank Alex Cortes for helping with the insect collections.

Author Contributions

  1. Conceived and designed the experiments: DPP APV ERS CSSP.
  2. Performed the experiments: DPP LMS.
  3. Analyzed the data: DPP BL ACP AS MJTNT DAA APV.
  4. Contributed reagents/materials/analysis tools: APV.
  5. Wrote the paper: DPP BL ACP AS DAA APV.


  1. 1. Elton CS. Animal ecology. London: Sidgwick and Jackson; 1927.
  2. 2. Rosenheim JA, Limburg DD, Colfer RG. Impact of generalist predators on a biological control agent, Chrysoperla carnea: direct observations. Ecol Appl. 1999;9: 409–417.
  3. 3. Brooke MM, Proske HO. Precipitin test for determining natural insect predators of immature mosquitoes. J Natl Malar Soc. 1946;5: 45–56.
  4. 4. Sheppard SK, Harwood JD. Advances in molecular ecology: tracking trophic links through predator-prey foodwebs. Funct Ecol. 2005;19: 751–762.
  5. 5. Sunderland KD, Powell W, Symondson WOC. Populations and communities. In: Jervis MA, editor. Insects as natural enemies: A practical perspective. Berlin: Springer; 2005. pp. 299–434.
  6. 6. Pompanon F, Deagle BE, Symondson WOC, Brown DS, Jarman SN, Taberlet P. Who is eating what: diet assessment using next generation sequencing. Mol Ecol. 2012;21: 1931–1950. pmid:22171763
  7. 7. Deagle BE, Kirkwood R, Jarman SN. Analysis of Australian fur seal diet by pyrosequencing prey DNA in faeces. Mol Ecol. 2009;18: 2022–2038. pmid:19317847
  8. 8. Clare EL, Symondson WOC, Fenton MB. An inordinate fondness for beetles? Variation in season dietary preferences of night-roosting big brown bats (Eptecicus fuscus). Mol Ecol. 2014;23: 3633–3647. pmid:25187921
  9. 9. Ficetola GF, Coissac E, Zundel S, Riaz T, Shehzad W, Bessière J, et al. An in silico approach for the evaluation of DNA barcodes. BMC Genomics. 2010;11: e434.
  10. 10. Ficetola GF, Pansu J, Bonin A, Coissac E, Giguet‐Covex C, De Barba M, et al. Replication levels, false presences and the estimation of the presence/absence from eDNA metabarcoding data. Mol Ecol Resour. 2014;15: 543–556. pmid:25327646
  11. 11. Deagle BE, Jarman SN, Coissac E, Pompanon F, Taberlet P. DNA metabarcoding and the cytochrome c oxidase subunit I marker: not a perfect match. Biol Lett. 2014;10(9): 20140562. pmid:25209199
  12. 12. Bru D, Martin-Laurent F, Philippot L. Quantification of the detrimental effect of a single primer-template mismatch by real-time PCR using the 16S rRNA gene as an example. Appl Environ Microbiol. 2008;74: 1660–1663. pmid:18192413
  13. 13. Schloss PD, Gevers D, Westcott SL. Reducing the effects of PCR amplification and sequencing artifacts on 16S rRNA-based studies. PLoS ONE. 2011;6: e27310. pmid:22194782
  14. 14. Geller J, Meyer C, Parker M, Hawk H. Redesign of PCR primers for mitochondrial cytochrome c oxidase subunit I for marine invertebrates and application in all-taxa biotic surveys. Mol Ecol Resour. 2013;13: 851–861. pmid:23848937
  15. 15. Klindworth A, Pruesse E, Schweer T, Peplies J, Quast C, Horn M, et al. Evaluation of general 16S ribosomal RNA gene PCR primers for classical and next-generation sequencing-based diversity studies. Nucleic Acids Res. 2013;41: e1. pmid:22933715
  16. 16. Zhou X, Li Y, Liu S, Yang Q, Su X, Zhou L, et al. Ultra-deep sequencing enables high-fidelity recovery of biodiversity for bulk arthropod samples without PCR amplification. Gigascience. 2013;2: 4. pmid:23587339
  17. 17. Clarke LJ, Soubrier J, Weyrich LS, Cooper A. Environmental metabarcodes for insects: in silico PCR reveals potential for taxonomic bias. Mol Ecol Resour. 2014;14: 1160–1170. pmid:24751203
  18. 18. Piñol J, Mir G, Gomez-Polo P, Agustí N. Universal and blocking primer mismatches limit the use of high-throughput DNA sequencing for the quantitative metabarcoding of arthropods. Mol Ecol Resour. 2015;15: 819–30. pmid:25454249
  19. 19. Polis GA, Myers CA, Holt RD. The ecology and evolution of intraguild predation: potential competitors that eat each other. Annu Rev Ecol Syst. 1989;20: 297–330.
  20. 20. Vestheim H, Jarman SN. Blocking primers to enhance PCR amplification of rare sequences in mixed samples–a case study on prey DNA in Antarctic krill stomachs. Front Zool. 2008;5: 12. pmid:18638418
  21. 21. Jurado-Rivera JA, Vogler AP, Reid CAM, Petitpierre E, Gómez-Zurita J. DNA barcoding insect–host plant associations. Proc R Soc B. 2009;276: 639–648. pmid:19004756
  22. 22. Srivathsan A, Sha JCM, Vogler AP, Meier R. Comparing the effectiveness of metagenomics and metabarcoding for diet analysis of a leaf feeding monkey (Pygathrix nemaeus). Mol Ecol Resour. 2015;15: 250–261. pmid:25042073
  23. 23. De Barba M, Miquel C, Boyer F, Mercier C, Rioux D, Coissac E, et al. DNA metabarcoding multiplexing and validation of data accuracy for diet assessment: application to omnivorous diet. Mol Ecol Resour. 2014;14: 306–323. pmid:24128180
  24. 24. Leray M, Yang JY, Meyer CP, Mills SC, Agudelo N, Ranwez V, et al. A new versatile primer set targeting a short fragment of the mitochondrial COI region for metabarcoding metazoan diversity: application for characterizing coral reef fish gut contents. Front Zool. 2013;10(1): 1.
  25. 25. Shokralla S, Gibson JF, Nikbakht H, Janzen DH, Hallwachs W, Hajibabaei M. Next-generation DNA barcoding: using next-generation sequencing to enhance and accelerate DNA barcode capture from single specimens. Mol Ecol Resour. 2014;14: 892–901. pmid:24641208
  26. 26. Pinto AJ, Raskin L. PCR biases distort bacterial and archaeal community structure in pyrosequencing datasets. PLoS ONE. 2012;7: e43093. pmid:22905208
  27. 27. Taberlet P, Coissac E, Pompanon F, Brochmann C, Willerslev E. Towards next-generation biodiversity assessment using DNA meta-barcoding. Mol Ecol. 2012;21: 2045–2050. pmid:22486824
  28. 28. Tang M, Tan M, Meng G, Yang S, Su X, Liu S, et al. (2014) Multiplex sequencing of pooled mitochondrial genomes-a crucial step toward biodiversity analysis using mito-metagenomics. Nucleic Acids Res. 2014;42(22): e166. pmid:25294837
  29. 29. Andújar C, Arribas P, Ruzicka F, Crampton‐Platt A, Timmermans MJ, Vogler AP. Phylogenetic community ecology of soil biodiversity using mitochondrial metagenomics. Mol Ecol. 2015:24: 3603–3617. pmid:25865150
  30. 30. Thomas T, Gilbert J, Meyer F. Metagenomics—a guide from sampling to data analysis. Microb Inform Exp. 2012;2(1): 1.
  31. 31. Paula DP, Linard B, Andow DA, Sujii ER, Pires CS, Vogler AP. Detection and decay rates of prey and prey symbionts in the gut of a predator through metagenomics. Mol Ecol Resour. 2015;15: 880–892. pmid:25545417
  32. 32. Lohse M, Bolger AM, Nagel A, Fernie AR, Lunn JE, Stitt M, et al. RobiNA: a user-friendly, integrated software solution for RNA-Seq-based transcriptomics. Nucleic Acids Res. 2012;40: W622–7. pmid:22684630
  33. 33. Schmieder R, Edwards R. Quality control and preprocessing of metagenomic datasets. Bioinformatics,. 2011;27(6): 863–864. pmid:21278185
  34. 34. Timmermans MJTN, Dodsworth S, Culverwell CL, Bocak L, Ahrens D, Littlewood DT, et al. Why barcode? High-throughput multiplex sequencing of mitochondrial genomes for molecular systematics. Nucleic Acids Res. 2010;38(21): e197. pmid:20876691
  35. 35. Crampton-Platt A, Timmermans MJTN, Gimmel ML, Kutty SN, Cockerill TD, Khen CV, et al. Soup to tree: the phylogeny of beetles inferred by mitochondrial metagenomics of a Bornean rainforest sample. Mol Biol Evol. 2015;32(9): 2302–2316. pmid:25957318
  36. 36. Hunt T, Bergsten J, Levkanicova Z, Papadopoulou A, John OS, Wild R, et al. A comprehensive phylogeny of beetles reveals the evolutionary origins of a superradiation. Science. 2007;318: 1913–1916. pmid:18096805
  37. 37. Srivathsan A, Ang A, Vogler AP, Meier R. Fecal metagenomics for the simultaneous assessment of diet, parasites, and population genetics of an understudied primate. Front Zool. 2016;13:17. pmid:27103937
  38. 38. Quast C, Pruesse E, Yilmaz P, Gerken J, Schweer T, Yarza P, et al. The SILVA ribosomal RNA gene database project: improved data processing and web-based tools. Nucleic Acids Res. 2013;41: D590–D596. pmid:23193283
  39. 39. McGinnis S, Madden TL. BLAST: at the core of a powerful and diverse set of sequence analysis tools. Nucleic Acids Res. 2004;32: W20–W25. pmid:15215342
  40. 40. Lo N, Eggleton P. Termite phylogenetics and co-cladogenesis with symbionts. In: Bignell D, Roisin Y, Lo N, editors. Biology of termites: a modern synthesis. Amsterdam: Springer; 2006. pp. 27–50p.
  41. 41. Murray DC, Coghlan ML, Bunce M. From benchtop to desktop: important considerations when designing amplicon sequencing workflows. PLoS ONE. 2015;10: e0124671. pmid:25902146
  42. 42. Harper GL, King RA, Dodd CS, Harwood JD, Glen DM, Bruford MW, et al. Rapid screening of invertebrate predators for multiple prey DNA targets. Mol Ecol. 2005;14(3): 819–27. pmid:15723673
  43. 43. Valentini A, Pompanon F, Taberlet P. DNA barcoding for ecologists. Trends Ecol Evol. 2009;24: 110–7. pmid:19100655
  44. 44. Mollot G, Duyck PF, Lefeuvre P, Lescourret F, Martin JF, Piry S, et al. Cover cropping alters the diet of arthropods in a banana plantation: a metabarcoding approach. PloS ONE. 2014;9(4): e93740. pmid:24695585
  45. 45. Sicsu PR. (2013) Choice of specific oviposition site as a structuring factor of a ladybug (Coleoptera: Coccinellidae) community in agroecosystems in the Federal District, Brazil. M.Sc. Thesis,. University of Brasília. 2013. Available:…/2011a2013/2013/Paula%20Ramos%20Sicsu.pdf.
  46. 46. Evans EW, Soares AO, Yasuda H. Invasions by ladybugs, ladybirds, and other predatory beetles. BioControl. 2011;56: 597–611.
  47. 47. Geoghegan IE, Majerus TMO, Majerus MEN. Differential parasitisation of adult and pre-imaginal Coccinella septempunctata (Coleoptera: Coccinellidae) by Dinocampus coccinellae (Hymenoptera: Braconidae). European Journal of Entomology. 1998;95: 571–579.
  48. 48. Dixon AFG. Insect predator-prey dynamics: Ladybird beetles and biological control. Cambridge: Cambridge University Press; 2000.
  49. 49. Mora C, Tittensor DP, Adl S, Simpson AG, Worm B. How many species are there on earth and in the ocean? PLoS Biol. 2011;9: e1001127. pmid:21886479
  50. 50. Zaidi RH, Jaal Z, Hawkes NJ, Hemingway J, Symondson WOC. Can multiple-copy sequences of prey DNA be detected amongst the gut contents of invertebrate predators? Mol Ecol. 1999;8: 2081–2087. pmid:10632859
  51. 51. Symondson WOC. Molecular identification of prey in predator diets. Mol Ecol. 2002;11: 627–641. pmid:11972753
  52. 52. Wintzingerode FV, Göbel UV, Stackebrandt E. Determination of microbial diversity in environmental samples: pitfalls of PCR-based rRNA analysis. FEMS Microbiol Rev. 1997;21: 213–229. pmid:9451814
  53. 53. Polz MF, Cavanaugh CM. Bias in template-to-product ratios in multitemplate PCR. Appl Environ Microbiol. 1998;64: 3724–3730. pmid:9758791
  54. 54. Acinas SG, Sarma-Rupavtarm R, Klepac-Ceraj V, Polz MF. PCR-induced sequence artifacts and bias: insights from comparison of two 16S rRNA clone libraries constructed from the same sample. Appl Environ Microbiol. 2005;71: 8966–8969. pmid:16332901
  55. 55. Amend AS, Seifert KA, Bruns TD. Quantifying microbial communities with 454 pyrosequencing: does read abundance count? Mol Ecol. 2010;19: 5555–5565. pmid:21050295
  56. 56. Ji Y, Ashton L, Pedley SM, Edwards DP, Tang Y, Nakamura A, et al. Reliable, verifiable and efficient monitoring of biodiversity via metabarcoding. Ecol Lett. 2013;16: 1245–1257. pmid:23910579
  57. 57. Qiu X, Wu L, Huang H, McDonel PE, Palumbo AV, Tiedje JM, et al. Evaluation of PCR-generated chimeras, mutations, and heteroduoplexes with 16S rRNA gene-based cloning. Appl Environ Microbiol. 2001;67: 880–887. pmid:11157258