kMermaid: Ultrafast metagenomic read assignment to protein clusters by hashing of amino acid k-mer frequencies

Fig 3

kMermaid sensitivity and resource benchmarking on simulated microbial protein data.

(a) The percentage of reads classified correctly by kMermaid compared with leading read-to-protein mapping tools, averaged across 10 simulated datasets for each combination of read length and mutation rate. (b) The number of reads classified by each tool, normalized by the number of input reads and averaged across 10 simulated datasets for each combination of read length and mutation rate. (c) kMermaid (green) provides up to a 25-fold decrease in runtime (in seconds, log-transformed) compared with BLASTX and has runtimes comparable to those of DIAMOND (blue). The y-axis has been truncated, and tools that exceeded a 24-hour runtime for larger input sizes are denoted with an asterisk. (d) kMermaid (green) requires a fixed, low memory allocation in comparison with other read-to-protein mapping tools. BLASTX was excluded from comparisons with more than 1 million sequences because of its infeasible running times. Methods exceeding 16 GB of RAM are denoted with an asterisk.

doi: https://doi.org/10.1371/journal.pcbi.1013470.g003