A Primer on Metagenomics
Figure 3
(A–C) Assembly as a Hamiltonian path. (A) A sequence with overlapping reads. (B) Each read is represented as a vertex, with edges connecting vertices whose reads overlap. (C) The assembly solution is a Hamiltonian path through the resulting graph: every vertex is visited exactly once. (D) For short-read assembly, each vertex is a k-mer (or a hashed collection of k-mers), and the reads are threaded between vertices as edges. The solution is an Eulerian path, in which each edge is visited exactly once. Repeats are merged into a single edge. For detailed algorithms see [49], [50], [53]–[55].
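The two formulations in this figure can be contrasted with a minimal Python sketch. It is an illustration under stated assumptions, not the algorithms of [49], [50], [53]–[55]: the function names (`overlap`, `hamiltonian_assembly`, `eulerian_assembly`), the `min_len` and `k` parameters, and the three toy reads are all hypothetical, the reads are assumed to be error-free and from a single strand, and the Hamiltonian search is brute force, which is only feasible for a handful of reads.

```python
from collections import defaultdict
from itertools import permutations


def overlap(a, b, min_len):
    """Length of the longest suffix of `a` that is a prefix of `b`, or 0 if shorter than `min_len`."""
    for n in range(min(len(a), len(b)) - 1, min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def hamiltonian_assembly(reads, min_len=3):
    """Panels A-C: reads are vertices; find an order that visits every read once.

    Brute force over all read orders (factorial time), for illustration only.
    """
    for order in permutations(reads):
        overlaps = [overlap(a, b, min_len) for a, b in zip(order, order[1:])]
        if all(overlaps):  # every consecutive pair is joined by an edge
            seq = order[0]
            for read, n in zip(order[1:], overlaps):
                seq += read[n:]  # append only the non-overlapping tail
            return seq
    return None


def eulerian_assembly(reads, k=3):
    """Panel D: k-mers are vertices; each observed (k+1)-mer is an edge.

    Identical (k+1)-mers are stored once, so repeats collapse into a single edge.
    The path is found with Hierholzer's algorithm and is assumed to exist.
    """
    edges = {r[i:i + k + 1] for r in reads for i in range(len(r) - k)}
    graph = defaultdict(list)
    indegree = defaultdict(int)
    for edge in sorted(edges):
        graph[edge[:-1]].append(edge[1:])
        indegree[edge[1:]] += 1

    # Start where out-degree exceeds in-degree by one, if such a vertex exists.
    start = next(
        (v for v in list(graph) if len(graph[v]) - indegree[v] == 1),
        next(iter(graph)),
    )

    stack, path = [start], []
    while stack:
        v = stack[-1]
        if graph[v]:
            stack.append(graph[v].pop())  # follow an unused edge
        else:
            path.append(stack.pop())      # dead end: vertex is final
    path.reverse()
    return path[0] + "".join(v[-1] for v in path[1:])


reads = ["GGCGTG", "ATGGCG", "CGTGCA"]  # toy reads, made up for this example
print(hamiltonian_assembly(reads, min_len=3))
print(eulerian_assembly(reads, k=3))
```

In this toy case both functions reconstruct the same sequence. The practical difference is cost: finding a Hamiltonian path is NP-complete in general, whereas an Eulerian path can be found in time linear in the number of edges, which is one reason the k-mer formulation suits large volumes of short reads.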