Fig 1.
The compressed FASTQ files of a metagenome or metatranscriptome sequencing project are (1) compared against a protein reference database such as NCBI-nr using DIAMOND. (2) Taxonomic and functional analysis is then performed on the resulting DIAMOND files using Meganizer. (3) The meganized DIAMOND files remain on the server and are accessed via the MeganServer software. (4) Researchers work interactively with the data using MEGAN CE.
Fig 2.
High-level nodes represent the metagenomic GO-slim [13], whereas low-level nodes are based on InterPro [5]. Here we have uncollapsed the GO “biological process” domain node to show the second-tier nodes attached below it. Each node is labeled by a bar chart representing the number of reads assigned to the node, or to any node below it, for 12 different human stool samples [16]. In this example, 27.5% of 816 million reads are assigned to an InterPro family by MEGAN CE.
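The per-node numbers described in this caption (reads assigned to a node or to any node below it) can be illustrated with a minimal sketch. The tree, the node names `IPR_A` to `IPR_C`, and the counts are hypothetical and chosen only for illustration; this is not MEGAN CE's implementation.

```python
def summarized_counts(children, assigned, root):
    """Return, for every node, the reads assigned to it or to any descendant."""
    totals = {}

    def visit(node):
        # Start with the reads assigned directly to this node.
        total = assigned.get(node, 0)
        # Add the summarized counts of all child subtrees.
        for child in children.get(node, []):
            total += visit(child)
        totals[node] = total
        return total

    visit(root)
    return totals


# Hypothetical two-tier hierarchy with made-up InterPro-like leaves.
children = {
    "biological process": ["metabolic process", "transport"],
    "metabolic process": ["IPR_A", "IPR_B"],
    "transport": ["IPR_C"],
}
# Reads assigned directly to each node (made-up numbers for one sample).
assigned = {"metabolic process": 5, "IPR_A": 40, "IPR_B": 25, "IPR_C": 30}

totals = summarized_counts(children, assigned, "biological process")
print(totals["metabolic process"], totals["biological process"])
```

With one such count per sample, each node would carry 12 values, which is what the bar chart on a node displays.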
Fig 3.
Spreadsheet for entry and analysis of metadata associated with samples.
Fig 4.
A PCoA analysis of 12 human gut samples [16] computed using species-level profiles and Bray-Curtis distances.
Samples are labeled by subject pseudonym, day (0–34), and whether antibiotics were taken (+) or not (-) on the given day. For both subjects, the plot shows that the taxonomic profiles move progressively further from the original profile during the course of antibiotics, but then return close to it by the end of the study. The top five bi-vectors are also shown, labeled by species name.
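For readers unfamiliar with the method, the following is a minimal sketch of a PCoA on Bray-Curtis distances using NumPy. It uses classical PCoA (double-centering of squared distances followed by eigendecomposition), which is assumed here and may differ in detail from what MEGAN CE computes. The function names and the toy count matrix are hypothetical and are not the data of [16].

```python
import numpy as np


def bray_curtis(profiles):
    """Pairwise Bray-Curtis dissimilarity between the rows (samples) of a count matrix."""
    x = np.asarray(profiles, dtype=float)
    n = x.shape[0]
    d = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            num = np.abs(x[i] - x[j]).sum()
            den = (x[i] + x[j]).sum()
            # Two empty profiles are treated as identical.
            d[i, j] = d[j, i] = num / den if den > 0 else 0.0
    return d


def pcoa(dist, k=2):
    """Classical PCoA: double-center the squared distances, then eigendecompose."""
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * centering @ (d ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(b)
    # Keep the k largest eigenvalues; clip small negative ones, which can
    # occur because Bray-Curtis is not guaranteed to be Euclidean.
    order = np.argsort(eigvals)[::-1][:k]
    vals = np.clip(eigvals[order], 0.0, None)
    coords = eigvecs[:, order] * np.sqrt(vals)
    positive = eigvals[eigvals > 0].sum()
    explained = vals / positive if positive > 0 else np.zeros(k)
    return coords, explained


# Toy species-level profiles: 4 samples (rows) by 3 species (columns).
profiles = [[10, 0, 5], [8, 1, 6], [0, 12, 3], [1, 10, 4]]
distances = bray_curtis(profiles)
coords, explained = pcoa(distances, k=2)
print(coords)
print(explained)
```

Each row of `coords` gives the position of one sample in the plot, and `explained` gives the fraction of the positive eigenvalue sum captured by each axis.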
Table 1.
For 12 shotgun metagenome samples [16], we report (a) the number of reads, (b) the wall-clock time required to align the reads against NCBI-nr using DIAMOND, (c) the number of matches obtained, (d) the number of reads that have at least one alignment, and (e) the time required to run Meganizer to perform taxonomic and functional classification of all reads.
The total wall-clock time is 67 hours on a single server with 32 cores.