Fig 1.
Metacoder has an intuitive and easy-to-use syntax.
The code in this example analysis parses the taxonomic data associated with sequences from the Ribosomal Database Project [9] 16S training set, filters and subsamples the data by sequence and taxon characteristics, conducts in silico PCR, and displays the results as a heat tree. All functions in bold are from the metacoder package. Note how columns and functions in the taxmap object (green box) can be referenced within functions as if they were independent variables.
Table 1.
Primary functions found in metacoder.
Fig 2.
Heat trees allow for a better understanding of community structure than stacked bar charts.
The stacked bar chart on the left represents the abundance of organisms in two samples from the Human Microbiome Project [5]. The same data are displayed as heat trees on the right. In the heat trees, the size and color of nodes and edges are correlated with the abundance of organisms in each community. Both visualizations show communities dominated by Firmicutes, but the heat trees reveal that the two samples share no families within Firmicutes and are thus much more different than the stacked bar chart suggests.
Fig 3.
Heat trees display up to four metrics in a taxonomic context and can plot multiple trees per graph.
Most graph components, such as the size and color of text, nodes, and edges, can be automatically mapped to arbitrary numbers, allowing for a quantitative representation of multiple statistics simultaneously. This graph depicts the uncertainty of OTU classifications from the TARA global oceans survey [2]. Each node represents a taxon used to classify OTUs and the edges determine where it fits in the overall taxonomic hierarchy. Node diameter is proportional to the number of OTUs classified as that taxon and edge width is proportional to the number of reads. Color represents the percent of OTUs assigned to each taxon that are somewhat similar to their closest reference sequence (>90% sequence identity). a. Metazoan diversity in detail. b. All taxonomic diversity found. Note that multiple trees are automatically created and arranged when there are multiple roots to the taxonomy.
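The per-taxon statistics mapped to node size, edge width, and color in this figure can be illustrated with a minimal sketch. It is written in plain Python and is not the metacoder API. The function name `taxon_stats`, the input layout, and the example OTUs are hypothetical, and the sketch assumes that each OTU is counted toward every taxon in its lineage.

```python
from collections import defaultdict

def taxon_stats(otus, threshold=90.0):
    """Summarize OTUs per taxon (hypothetical helper, not part of metacoder).

    otus: iterable of (lineage, reads, identity), where lineage is a tuple of
    taxon names from root to tip and identity is the percent identity to the
    closest reference sequence.
    Assumption: an OTU counts toward every taxon along its lineage.
    """
    n_otus = defaultdict(int)
    n_reads = defaultdict(int)
    n_similar = defaultdict(int)
    for lineage, reads, identity in otus:
        for depth in range(1, len(lineage) + 1):
            taxon = lineage[:depth]  # the full path identifies the taxon
            n_otus[taxon] += 1
            n_reads[taxon] += reads
            if identity > threshold:
                n_similar[taxon] += 1
    return {
        taxon: {
            "otus": n_otus[taxon],    # could be mapped to node diameter
            "reads": n_reads[taxon],  # could be mapped to edge width
            "percent_similar": 100.0 * n_similar[taxon] / n_otus[taxon],  # color
        }
        for taxon in n_otus
    }

# Made-up example data for illustration only
example_otus = [
    (("Eukaryota", "Metazoa"), 120, 95.0),
    (("Eukaryota", "Metazoa"), 30, 85.0),
    (("Eukaryota", "Fungi"), 50, 99.0),
]
stats = taxon_stats(example_otus)
```

Keying each taxon by its full lineage keeps identically named taxa in different parts of the hierarchy separate.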
Fig 4.
Flexible parsing and in silico PCR allow for comparisons of primers and databases.
Shown is a comparison of in silico PCR results for three 16S reference databases. The plots on the left display the abundance of all bacterial 16S sequences. The plots on the right display all taxa with subtaxa not entirely amplified by in silico PCR using universal 16S primers [18]. Node color and size display the proportion and number of sequences not amplified, respectively.
Fig 5.
Scale-independent appearance facilitates complex, composite figures.
This graph uses 16S metabarcoding data from the Human Microbiome Project to show pairwise comparisons of microbiome communities in different parts of the human body. All graph components, including text, have the same relative sizes independent of output size, unlike in most graphics packages in R, making it easier to create composite figures entirely within R. The gray tree on the lower left functions as a key for the smaller unlabeled trees. The color of each taxon represents the log2 ratio of median proportions of reads observed at each body site. Only significant differences are colored, determined using a Wilcoxon rank-sum test followed by a Benjamini-Hochberg (FDR) correction for multiple comparisons. Taxa colored green are enriched in the part of the body shown in the row and those colored brown are enriched in the part of the body shown in the column. For example, Haemophilus, Streptococcus, and Prevotella are enriched in saliva (brown) relative to stool, where Bacteroides is enriched (green).
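The coloring rule described in this caption can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the code used to produce the figure. The function names, the 0.05 cutoff, and the handling of zero medians are hypothetical, and the p-values are assumed to come from a Wilcoxon rank-sum test run elsewhere.

```python
import math
from statistics import median

def log2_median_ratio(props_a, props_b):
    """log2 of (median proportion at site A / median proportion at site B)."""
    med_a, med_b = median(props_a), median(props_b)
    if med_a == 0 or med_b == 0:
        return float("nan")  # assumption: treat an undefined ratio as missing
    return math.log2(med_a / med_b)

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Walk from the largest p-value down, enforcing monotonicity
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, i in enumerate(order):
        rank = m - position  # 1-based rank in ascending order
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def color_value(log_ratio, adjusted_p, alpha=0.05):
    """Keep the log2 ratio only for significant taxa (alpha is an assumption)."""
    return log_ratio if adjusted_p < alpha else 0.0

# Made-up read proportions for one taxon at two body sites
ratio = log2_median_ratio([0.4, 0.2, 0.8], [0.1, 0.05, 0.2])
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Setting non-significant taxa to zero places them at the neutral midpoint of a diverging color scale, which is one way to leave them uncolored.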
Fig 6.
Metacoder can be used with any type of data that can be organized hierarchically.
This plot shows the results of the 2016 Democratic primary election organized by region, division, state, and county. The regions and divisions are those defined by the United States Census Bureau. Color corresponds to the difference in the percentage of votes for candidates Hillary Clinton (green) and Bernie Sanders (brown). Size corresponds to the total number of votes cast. Data were downloaded from https://www.kaggle.com/benhamner/2016-us-election/.
Fig 7.
Another alternative use: visualizing gene expression data in a GO hierarchy.
Shown is the gene ontology for all differentially expressed genes in a study on the effect of a glucocorticoid on airway smooth muscle tissue [19]. Color indicates the sign and intensity of averaged changes in gene expression and size indicates the number of genes classified by a given gene ontology term.