Figure 1.
Environmental Shotgun Sequencing (ESS).
(A) Sampling from the habitat; (B) filtering particles, typically by size; (C) cell lysis and DNA extraction; (D) cloning and library construction; (E) sequencing the clones; (F) sequence assembly.
Table 1.
Comparison of different sequencing technologies, taken from [34].
Figure 2.
A single-stranded DNA template is first hybridized with the sequencing primer and mixed with the enzymes (DNA polymerase, ATP sulfurylase, luciferase, and apyrase) along with the two substrates adenosine 5′-phosphosulfate (APS) and luciferin. In each cycle, (1) one of the four nucleotides (dTTP, in this case) is added to the reaction. (2) If the nucleotide is complementary to the base in the template strand, the DNA polymerase incorporates it into the growing strand. (3) Pyrophosphate (PPi) is released in an amount equimolar to the incorporated nucleotide and is converted to ATP by ATP sulfurylase in the presence of APS. (4) ATP then serves as a substrate for luciferase, which produces light. The light emitted is proportional to the amount of nucleotide incorporated in a given cycle. (5) The excess nucleotides are degraded by apyrase.
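The cycle described in this caption can be illustrated with a small simulation. The following is a minimal sketch, not the chemistry or software of any real instrument: it assumes an error-free template, an idealized light signal exactly equal to the number of bases incorporated in a flow, and a fixed nucleotide flow order. The names `simulate_flowgram`, `call_bases`, and `flow_order` are hypothetical and chosen only for illustration.

```python
# Idealized pyrosequencing sketch (assumption: noise-free signal, one unit of
# light per incorporated nucleotide, so a homopolymer run of length n gives n).

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def simulate_flowgram(template, flow_order="TACG", max_flows=100):
    """Return a list of (nucleotide, signal) pairs, one per nucleotide flow.

    `template` is given in the order the polymerase reads it, so the
    synthesized strand is simply its base-by-base complement.
    """
    to_incorporate = [COMPLEMENT[base] for base in template]
    pos = 0
    flowgram = []
    for flow in range(max_flows):
        if pos >= len(to_incorporate):
            break  # template fully copied
        nucleotide = flow_order[flow % len(flow_order)]
        signal = 0
        # Steps (2)-(4): every consecutive matching base is incorporated in
        # the same flow, and each incorporation adds one unit of light.
        while pos < len(to_incorporate) and to_incorporate[pos] == nucleotide:
            signal += 1
            pos += 1
        # Step (5): leftover nucleotide is degraded before the next flow,
        # which is modeled here by simply moving on to the next nucleotide.
        flowgram.append((nucleotide, signal))
    return flowgram


def call_bases(flowgram):
    """Reconstruct the synthesized strand from the flowgram signals."""
    return "".join(nucleotide * signal for nucleotide, signal in flowgram)


if __name__ == "__main__":
    flows = simulate_flowgram("AACGT")
    print(flows)
    print(call_bases(flows))
```

In this idealized model a flow with signal 0 means the nucleotide was not complementary, and a signal of 2 means a two-base homopolymer run was incorporated in a single flow. Real signals are noisy, which is why long homopolymer runs are difficult to call accurately.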
Table 2.
The information contained in different lengths of genomic DNA.
Figure 3.
(A–C) Hamiltonian path approach. (A) A sequence with overlapping reads; (B) each read is represented as a vertex, with edges connecting vertices whose reads overlap; (C) the assembly solution is a Hamiltonian path (every vertex is visited exactly once) through the resulting graph. (D) Eulerian path approach. For short-read assembly, each vertex is a k-mer (or a hashed collection of k-mers), and the reads are threaded between vertices as edges. The solution is an Eulerian path, in which each edge is visited exactly once. Repeats are merged into a single edge. For detailed algorithms see [49], [50], [53]–[55].
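The Eulerian formulation in panel (D) can be sketched in a few lines. This is a toy illustration under strong assumptions, not the method of any assembler cited above: reads are error-free, every distinct (k+1)-mer seen in a read becomes exactly one edge between two k-mer vertices, and the resulting graph has an Eulerian path. The function names `build_graph`, `eulerian_path`, and `assemble` are hypothetical; the path is found with Hierholzer's algorithm.

```python
from collections import defaultdict


def build_graph(reads, k):
    """Vertices are k-mers; each distinct (k+1)-mer in a read is one edge."""
    edges = set()
    for read in reads:
        for i in range(len(read) - k):
            edges.add(read[i:i + k + 1])
    graph = defaultdict(list)
    for edge in sorted(edges):  # sorted only to make the output deterministic
        graph[edge[:-1]].append(edge[1:])
    return graph


def eulerian_path(graph):
    """Hierholzer's algorithm; assumes a non-empty graph with an Eulerian path."""
    in_degree = defaultdict(int)
    for targets in graph.values():
        for target in targets:
            in_degree[target] += 1
    vertices = set(graph) | set(in_degree)
    # Start at the vertex with one more outgoing than incoming edge, if any.
    start = next(
        (v for v in sorted(vertices)
         if len(graph.get(v, [])) - in_degree.get(v, 0) == 1),
        min(vertices),
    )
    remaining = {v: list(targets) for v, targets in graph.items()}
    stack, path = [start], []
    while stack:
        vertex = stack[-1]
        if remaining.get(vertex):
            stack.append(remaining[vertex].pop())  # follow an unused edge
        else:
            path.append(stack.pop())  # dead end: vertex is finished
    return path[::-1]


def assemble(reads, k):
    """Spell the sequence along the Eulerian path through the k-mer graph."""
    path = eulerian_path(build_graph(reads, k))
    return path[0] + "".join(vertex[-1] for vertex in path[1:])


if __name__ == "__main__":
    reads = ["ATGGCG", "GGCGTG", "CGTGCA"]
    print(assemble(reads, k=3))
```

Finding an Eulerian path takes time linear in the number of edges, whereas finding a Hamiltonian path is NP-complete in general, which is one reason the k-mer formulation is attractive for large numbers of short reads.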
Figure 4.
Green, most or all species have been sampled; blue, the habitat has not been exhaustively sampled; red, species-rich habitat of which only a small fraction has been sampled.