Figure 1.

Overview of the HUMAnN method for metabolic and functional reconstruction from metagenomic data.

The HMP Unified Metabolic Analysis Network (HUMAnN) software recovers the presence, absence, and abundance of microbial gene families and pathways from metagenomic data. Cleaned short DNA reads are aligned to the KEGG Orthology [26] (or any other characterized sequence database) using accelerated translated BLAST. Gene family abundances are calculated as weighted sums of the alignments from each read, normalized by gene length and alignment quality. Pathway reconstruction is performed using a maximum parsimony approach followed by taxonomic limitation (to remove false positive pathway identifications) and gap filling (to account for rare genes in abundant pathways). The resulting output is a set of matrices of pathway coverages (presence/absence) and abundances, as analyzed here for the seven primary body sites of the Human Microbiome Project.
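
The gene family step described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the HUMAnN implementation: the function name `gene_family_abundance`, the tuple layout of a hit, and the choice to split each read across its hits in proportion to a positive alignment score are all hypothetical, since the caption does not give the exact weighting.

```python
from collections import defaultdict


def gene_family_abundance(hits):
    """Return relative gene family abundances from translated-search hits.

    hits: iterable of (read_id, family, score, gene_length_bp) tuples.
    Assumption (not from the source): `score` is a positive alignment-quality
    weight, and each read contributes a total weight of 1 split across its hits.
    """
    hits = list(hits)

    # Total score per read, used to share one read among all of its hits.
    score_per_read = defaultdict(float)
    for read_id, _family, score, _length in hits:
        score_per_read[read_id] += score

    # Weighted sum per family, with each contribution divided by gene length
    # so that long genes are not favored simply for attracting more reads.
    abundance = defaultdict(float)
    for read_id, family, score, length in hits:
        weight = score / score_per_read[read_id]
        abundance[family] += weight / length

    # Rescale so that the abundances sum to 1 within the sample.
    total = sum(abundance.values())
    return {family: value / total for family, value in abundance.items()}


example_hits = [
    ("read1", "K00001", 1.0, 1000),
    ("read2", "K00001", 1.0, 1000),
    ("read2", "K00002", 1.0, 500),
]
abundances = gene_family_abundance(example_hits)
```

In this toy input `read2` is shared equally between two families, so the shorter gene (`K00002`) receives a larger per-base contribution from its half read than `K00001` does from the same half.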

Table 1.

Metagenomic samples, functional modules, and metabolic pathways analyzed here and differentially abundant in the human microbiome.

Figure 2.

Accuracy of inferred module abundances and coverages using four synthetic metagenomes.

An evaluation of HUMAnN's performance on a high-complexity mock community with a randomized log-normal distribution of 100 organisms, compared with an approach that takes the single best BLAST hit for each gene family and assigns it directly to metabolic modules. Both A) the correlation of inferred abundances (arcsine square root transformed for variance stabilization) and B) the partial AUC at a false positive rate of 0.1 are high for HUMAnN and exceed those of the single best BLAST hit reconstruction.
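
The two evaluation quantities named in this caption can be illustrated as follows. This is a minimal sketch, not the evaluation code used for the figure: the function names are hypothetical, the partial AUC is left unnormalized (its maximum is `max_fpr`), and the implementation assumes untied scores and at least one positive and one negative label.

```python
import math


def arcsine_sqrt(p):
    """Variance-stabilizing transform for a proportion p in [0, 1]."""
    return math.asin(math.sqrt(p))


def partial_auc(labels, scores, max_fpr=0.1):
    """Area under the ROC curve restricted to false positive rates <= max_fpr.

    labels: 1 for modules truly present, 0 for modules truly absent.
    scores: predicted coverage or confidence, higher meaning "present".
    Assumptions: no tied scores; both classes occur in `labels`.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    ranked = sorted(zip(scores, labels), reverse=True)

    area = 0.0
    tp = fp = 0
    prev_fpr = 0.0
    for _score, label in ranked:
        if label:
            tp += 1  # vertical ROC step, adds no area
            continue
        fp += 1  # horizontal ROC step at the current true positive rate
        fpr, tpr = fp / n_neg, tp / n_pos
        if fpr >= max_fpr:
            # Clip the final step at the false positive rate limit.
            return area + (max_fpr - prev_fpr) * tpr
        area += (fpr - prev_fpr) * tpr
        prev_fpr = fpr
    return area
```

Under these conventions a perfect ranking gives a partial AUC equal to `max_fpr`, and a ranking that places every negative first gives 0.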

Figure 3.

Metabolic modules differentially present or abundant in at least one body habitat of the human microbiome.

Metabolic modules and pathways from the KEGG BRITE hierarchy [26] found to be differentially abundant (inner cladogram) or differentially covered (outer ring, presence/absence) in the human microbiome. The former were determined using LEfSe [23] and the latter by presence in at least 90% of samples with ≥0.9 coverage or absence in at least 90% with ≤0.1 coverage. Differentially abundant modules are colored by their most abundant body habitat. 168 significantly enriched module abundances were detected, in contrast to only 24 differentially covered.
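
The presence/absence thresholds quoted above can be written as a small rule. This is a sketch of the stated cutoffs only; the function name `coverage_call`, the label `"variable"` for modules meeting neither criterion, and the idea of applying the rule to the samples of one body habitat at a time are assumptions made for illustration.

```python
def coverage_call(coverages, cov_hi=0.9, cov_lo=0.1, min_fraction=0.9):
    """Call a module present, absent, or variable within one set of samples.

    coverages: per-sample coverage values in [0, 1] for a single module.
    A module is "present" if at least `min_fraction` of samples have coverage
    >= cov_hi, and "absent" if at least `min_fraction` have coverage <= cov_lo.
    """
    n = len(coverages)
    fraction_high = sum(c >= cov_hi for c in coverages) / n
    fraction_low = sum(c <= cov_lo for c in coverages) / n
    if fraction_high >= min_fraction:
        return "present"
    if fraction_low >= min_fraction:
        return "absent"
    return "variable"
```

For example, a module with coverage 1.0 in nine of ten samples is called present, while one that is fully covered in only half of the samples meets neither criterion.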

Figure 4.

Patterns of abundance of functional modules in 649 metagenomic samples covering seven body habitats.

A heatmap of the first five principal components (together explaining >95% of the variance) of module abundances, averaged and normalized over each of the seven body sites. Cell color indicates positive (yellow) or negative (blue) variation, and the adjacent scree plot shows the total variance in each component. The three most positively and negatively covarying modules contributing to each component are shown. Briefly, the first principal component differentiates the skin and gastrointestinal tract, the second differentiates the vaginal habitat, the third the gut, the fourth the supragingival plaque versus other oral sites, and the fifth the nares versus skin.

Figure 5.

Gene- and module-specific reconstruction of glycosaminoglycan degradation specific to the gut microbiota.

A) Individual gene family abundances for four gut-specific, high-abundance modules: chondroitin, dermatan, and keratan sulfate degradation (glycosaminoglycan degradation, also including heparan sulfate), and uronic acid metabolism (occurring directly downstream in the pentose and glucuronate interconversion pathway). Relative abundance is shown from dark (high) to light (low) green, averaged over 136 stool microbiomes, with enzymes not present in the KEGG Orthology in gray. Heparan degradation is absent specifically due to the lack of heparanase (K07964-5), but no single gene family is otherwise responsible for the high abundances of the remaining four modules in the gut, despite several shared enzymes (e.g. beta-glucuronidase, K01195). B) Relative abundances of all five modules in all body habitats and samples, demonstrating gut-specific prevalence. Despite the close connections among these pathways, they show distinct patterns of relative abundance specific to the gut and covary at very low abundance in the oropharynx.
