Fig 1.
ShortBRED-Identify creates distinctive markers for protein families of interest. ShortBRED-Quantify maps nucleotide reads to markers and normalizes abundance.
Table 1.
Characteristics of ShortBRED markers used to profile synthetic metagenomes.
Fig 2.
Accuracy of ShortBRED and centroid-based profiling within synthetic metagenomes.
(A, B) ROC curves report the sensitivity and specificity (in terms of TPR and FPR) of the two methods for correctly identifying the presence and absence of protein families of interest in six synthetic metagenomes, spiked with 5%, 10%, and 25% of their material from the ARDB (panel A) and VFDB (panel B). (C, D) Scatterplots of protein family abundance “predicted from mapping” (the values calculated by ShortBRED and by the centroids) versus “expected from gold standard” (the abundance values of the protein families in the 10% synthetic metagenome).
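To make the TPR/FPR axes of panels A and B concrete, the following is a minimal sketch of how ROC points could be derived from per-family abundance scores and a gold-standard presence list. The function name `roc_points`, the dictionary input format, and the thresholding scheme are assumptions made for illustration; they are not taken from the ShortBRED code.

```python
def roc_points(scores, truth):
    """Return (FPR, TPR) pairs, one per distinct score threshold.

    scores: dict mapping protein family -> predicted abundance
            (hypothetical input format, assumed for this sketch)
    truth:  set of families actually present in the synthetic metagenome
    """
    positives = sum(1 for family in scores if family in truth)
    negatives = len(scores) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing is called present
    # Lower the threshold step by step; a family is "called present"
    # when its predicted abundance is at or above the threshold.
    for threshold in sorted(set(scores.values()), reverse=True):
        called = {f for f, s in scores.items() if s >= threshold}
        true_pos = len(called & truth)
        false_pos = len(called - truth)
        tpr = true_pos / positives if positives else 0.0
        fpr = false_pos / negatives if negatives else 0.0
        points.append((fpr, tpr))
    return points


# Toy example with made-up values (not data from the paper).
example_scores = {"famA": 5.0, "famB": 3.0, "famC": 1.0, "famD": 0.0}
example_truth = {"famA", "famC"}
example_curve = roc_points(example_scores, example_truth)
```

Each returned pair is one point on the curve; sweeping the threshold from high to low traces the curve from the origin toward (1, 1).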
Fig 3.
Speed of execution: ShortBRED versus centroid-based profiling.
Results are based on time used by USEARCH in ShortBRED-Quantify.
Fig 4.
Antibiotic resistance in the human gut microbiome.
RPKM values produced by ShortBRED for antibiotic resistance protein families, summed by class of resistance. Samples in the USA-Global, Venezuela, and Malawi cohorts were profiled by mapping reads to centroids due to their lower sequencing depth. Marker information is listed in Table 2. Samples (columns) were clustered according to Canberra distance and antibiotic resistance families (rows) were clustered according to Euclidean distance.
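The quantities named in this caption can be sketched as follows. This assumes the generic definition of RPKM (reads per kilobase of target per million reads) and the standard Canberra and Euclidean distances; ShortBRED's own normalization may differ in detail, and the function names and the choice of nucleotide units for marker length are illustrative assumptions.

```python
import math


def rpkm(hits, marker_length_nt, total_reads):
    """Generic RPKM: hits per kilobase of marker per million reads.

    Assumes marker_length_nt is expressed in nucleotides; this is the
    textbook definition, not necessarily ShortBRED's exact formula.
    """
    kilobases = marker_length_nt / 1000.0
    millions_of_reads = total_reads / 1_000_000.0
    return hits / (kilobases * millions_of_reads)


def canberra(x, y):
    """Canberra distance; terms where both values are zero contribute 0."""
    total = 0.0
    for a, b in zip(x, y):
        denom = abs(a) + abs(b)
        if denom > 0:
            total += abs(a - b) / denom
    return total


def euclidean(x, y):
    """Euclidean distance between two equal-length profiles."""
    return math.dist(x, y)


# Toy values for illustration only (not data from the paper).
example_rpkm = rpkm(hits=10, marker_length_nt=500, total_reads=2_000_000)
example_canberra = canberra([1, 0, 3], [3, 0, 1])
example_euclidean = euclidean([0, 3], [4, 0])
```

Canberra distance weights each difference by the magnitude of the values compared, so low-abundance families influence the sample clustering more than they would under Euclidean distance.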
Fig 5.
Prevalence of antibiotic resistance across bacterial isolate genomes.
Phylogenetic tree of bacterial genomes from IMG [24] overlaid with presence/absence of ShortBRED antibiotic resistance protein families. The outermost ring indicates the share of genes in each species’ genome that mapped to any of the AR protein families. This figure was produced using GraPhlAn [27].
Table 2.
Characteristics of ShortBRED markers used to profile metagenomes and bacterial genomes.