Accurate Genome Relative Abundance Estimation Based on Shotgun Metagenomic Reads

doi:10.1371/journal.pone.0027992

Figure 1.

The GRAMMy model.

A schematic diagram of the finite mixture model underlying the GRAMMy framework for shotgun metagenomics. In the figure, ‘iid’ stands for “independent and identically distributed”.
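To make the finite mixture idea concrete, the following is a minimal generic sketch of expectation-maximization for the mixing proportions of a mixture whose per-read component likelihoods are already known. It is not GRAMMy's implementation: the function name `em_mixing_proportions`, the `likelihoods` input layout, and the stopping rule are all assumptions made for illustration, and the sketch stops at mixing proportions without any further conversion to genome relative abundance.

```python
def em_mixing_proportions(likelihoods, n_iter=200, tol=1e-10):
    """Estimate mixing proportions of a finite mixture by EM.

    likelihoods[i][j] is the (fixed, assumed known) probability of
    read i under genome j. Hypothetical input layout for illustration.
    """
    n_genomes = len(likelihoods[0])
    pi = [1.0 / n_genomes] * n_genomes  # start from a uniform mixture
    for _ in range(n_iter):
        counts = [0.0] * n_genomes
        # E-step: split each read across genomes by posterior responsibility.
        for row in likelihoods:
            weighted = [p * l for p, l in zip(pi, row)]
            total = sum(weighted)
            if total == 0.0:
                continue  # read matches no genome; skip it
            for j, w in enumerate(weighted):
                counts[j] += w / total
        assigned = sum(counts)
        if assigned == 0.0:
            return pi  # nothing could be assigned; keep current estimate
        # M-step: new proportions are the normalized expected counts.
        new_pi = [c / assigned for c in counts]
        converged = max(abs(a - b) for a, b in zip(pi, new_pi)) < tol
        pi = new_pi
        if converged:
            break
    return pi
```

With three reads that match only the first genome and one that matches only the second, `em_mixing_proportions([[1, 0], [1, 0], [1, 0], [0, 1]])` settles on proportions of 0.75 and 0.25.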

Figure 2.

The GRAMMy flowchart.

A typical flowchart of the GRAMMy analysis pipeline, which employs ‘map’ and ‘k-mer’ assignment.

Table 1.

Comparison of estimation accuracy.

Table 2.

Summary statistics for the metagenomic datasets.

Figure 3.

Frequent species for human gut metagenomes.

The 99 species occurring in at least 50% of the 33 human gut samples with a minimum relative abundance of 0.05% were selected. ‘gut_HGS_90’ indicates that the human gut (‘gut’) read sets were mapped to the reference genome set (‘HGS’) with an identity rate cut-off at 90% (‘90’).
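The selection rule in this caption can be sketched as a simple filter. The sketch assumes one reading of the criterion, namely that a species counts as occurring in a sample when its relative abundance there is at least 0.05%, and that it must do so in at least half of the samples. The function name `frequent_species` and the dictionary input are hypothetical, and abundances are taken as fractions (0.05% is 0.0005).

```python
def frequent_species(abundance, min_abundance=0.0005, min_fraction=0.5):
    """Return species present in enough samples at a minimum abundance.

    abundance maps a species name to its relative abundances (as
    fractions), one value per sample. Thresholds follow the caption;
    the exact interpretation is an assumption.
    """
    selected = []
    for species, values in abundance.items():
        # Count samples in which the species reaches the abundance floor.
        present = sum(1 for v in values if v >= min_abundance)
        if present >= min_fraction * len(values):
            selected.append(species)
    return sorted(selected)
```

For the 33 gut samples, this rule would require a species to reach the abundance floor in at least 17 samples.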

Figure 4.

Heatmap biclustering of human gut metagenomes.

‘gut_HGS_90’ indicates that the human gut (‘gut’) read sets were mapped to the reference genome set (‘HGS’) with an identity rate cut-off at 90% (‘90’). The bottom labels identify the human gut samples. The top-right legend gives the column color-coding, which indicates sample age category and dataset origin. The bottom-right legend gives the row color-coding, which indicates the four most abundant phyla in the human gut. The relative abundance for each sample is normalized by a rank transformation.
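A rank transformation of one sample's abundances might look like the sketch below. The caption does not say how ties are handled or whether ranks are rescaled, so the choices here (1-based ranks, ascending order, tied values sharing their average rank) are assumptions, and `rank_transform` is a hypothetical helper name.

```python
def rank_transform(values):
    """Replace each value by its 1-based ascending rank.

    Tied values receive the average of the ranks they span
    (an assumed convention, not stated in the caption).
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the whole run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks
```

Applied to one sample's abundance vector, this keeps only the ordering of species, so a few dominant species no longer compress the color scale of the heatmap.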

Figure 5.

GRAMMy estimates of GRAs for the acid mine drainage data.

Estimated relative abundance for each strain is shown as a percentage. The first two strains dominate the sample.

Figure 6.

Running time comparison.

GRAMMy has the shortest processing time in all cases compared with MEGAN and GAAS. The BLAT mapping time is excluded for all compared tools.
