Figure 1.
ORF-based approach for the taxonomic assignment of reads of different lengths derived from different regions of the genomic DNA.
Read derived from an intergenic region (A), read containing a small 5′ region of an ORF (B), read containing two partial ORFs at the 5′ and 3′ termini and a complete ORF in the middle (C), read containing only a single complete ORF (D), read containing a long partial ORF at one end (E), read obtained from within an ORF (F), and read with a sequencing error that splits a single ORF into two smaller ORFs (G). X, Y, Z, K, L, and M are the genomes to which the ORFs showed matches. The taxonomic IDs of the species of these genomes are used to make the taxonomic assignments and to create the taxonomic bins.
Figure 2.
Flowchart of the MetaBin algorithm.
ID and POS refer to %Identity and %Positives, respectively, as provided in the Blastx or Blat output. COV refers to the percentage of the query covered by the hit (reference protein).
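The caption does not give the exact formula for COV. One plausible reading, offered here only as a sketch, is the share of the query spanned by the alignment, computed from the query start and end coordinates reported for a hit. The function name `query_coverage` and its parameters are hypothetical and are not taken from MetaBin; the coordinates are assumed to be 1-based and inclusive, and may be reversed for hits on the reverse strand.

```python
def query_coverage(q_start: int, q_end: int, query_len: int) -> float:
    """Return the percentage of the query spanned by one alignment.

    Assumption (not from the source): coverage is the aligned query span
    divided by the full query length. Coordinates are 1-based and inclusive;
    q_start may exceed q_end for reverse-strand hits.
    """
    if query_len <= 0:
        raise ValueError("query_len must be positive")
    aligned_span = abs(q_end - q_start) + 1  # inclusive span on the query
    return 100.0 * aligned_span / query_len


# Example: an alignment spanning query positions 1..300 of a 400-base read.
cov = query_coverage(1, 300, 400)
```

Taking the absolute difference keeps the result the same whether the hit lies on the forward or the reverse strand of the read.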
Table 1.
Summary of results using MetaBin, MEGAN and SOrt-ITEMS on simulated bacterial read datasets for different sequencing technologies.
Table 2.
Summary of results using MetaBin, MEGAN and SOrt-ITEMS on simulated archaeal read datasets for different sequencing technologies.