Figure 1.

Flowchart of reference-free comparative genomic analysis.

Step 1: SRS data from each sample is converted into a kmer frequency table, using any kmer counter. Step 2: Kmers unique to a single genome are filtered into separate subsets, while kmers shared by at least two genomes are merged into a single frequency table. Step 3: The SRS reads containing the kmers for both the unique and shared groups of kmers are clustered into separate subsets, keeping the reads from each genome in a group separate. Step 4: Localized de novo contigs for each genome in each subset are assembled. ‘Tip’ contigs are assembled from kmers unique to each genome, while ‘group’ contigs are assembled from kmers shared by at least two genomes, in separate pipelines as indicated by the dashed vertical line.
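
Steps 1 to 3 of the flowchart can be sketched in a few lines of Python. This is a minimal illustration on toy data, not the authors' pipeline: the function names (`count_kmers`, `partition_kmers`, `reads_with_kmers`) are hypothetical, reads are plain strings held in memory, and reverse complements are not merged. A real analysis would use a dedicated kmer counter, as the caption notes.

```python
from collections import Counter


def count_kmers(reads, k):
    """Step 1 (sketch): build a kmer frequency table from read strings."""
    table = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            table[read[i:i + k]] += 1
    return table


def partition_kmers(tables):
    """Step 2 (sketch): split kmers into unique and shared groups.

    tables maps a genome name to its kmer Counter. Returns (unique, shared):
    unique[name] is the set of kmers seen only in that genome, and shared
    maps each kmer found in at least two genomes to {genome: count}.
    """
    owners = {}
    for name, table in tables.items():
        for kmer in table:
            owners.setdefault(kmer, []).append(name)

    unique = {name: set() for name in tables}
    shared = {}
    for kmer, names in owners.items():
        if len(names) == 1:
            unique[names[0]].add(kmer)
        else:
            shared[kmer] = {n: tables[n][kmer] for n in names}
    return unique, shared


def reads_with_kmers(reads, kmers, k):
    """Step 3 (sketch): keep reads containing at least one kmer of interest."""
    return [
        read for read in reads
        if any(read[i:i + k] in kmers for i in range(len(read) - k + 1))
    ]


if __name__ == "__main__":
    # Toy reads for two hypothetical genomes, with a small k for readability.
    reads = {"A": ["ACGTAC"], "B": ["ACGTTC"]}
    tables = {name: count_kmers(rs, 3) for name, rs in reads.items()}
    unique, shared = partition_kmers(tables)
    print(sorted(unique["A"]), sorted(shared))
```

The reads selected for each genome's unique set would feed the 'tip' assembly, and those selected with the shared table would feed the 'group' assembly (Step 4), which is left to an external assembler here.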

Figure 2.

Relationship between genome size and kmer diversity (k = 21) among 174 chloroplasts.

The solid black line indicates a 1:1 ratio between genome size and kmer diversity while the blue dashed line is fit to the chloroplasts with inverted repeats (kmers = 1801 + 0.82 × genome size). A) 174 chloroplasts, color-coded as indicated in the legend. Several taxa are indicated by the following codes: Rg = Rhizanthella gardneri; Ev = Epifagus virginiana; Cr = Chlamydomonas reinhardtii; Px = Pelargonium_x. The two pairs of Cuscuta taxa are indicated by ellipses, according to whether they are holo- or hemi-parasitic. B) Angiosperm chloroplasts 110–170 kb, color-coded as in the legend. Several taxa are indicated by the following codes: Ce = Cuscuta exaltata; Ms = Monsonia speciosa; Et = Erodium texanum; Ts = Trifolium subterraneum; Io = Illicium oligandrum; Gp = Geranium palmatum.
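
The two quantities plotted in this figure can be sketched as follows. Kmer diversity is taken here to mean the number of distinct kmers in a genome, and the fitted line is copied from the caption. Two things are assumptions of this sketch rather than statements from the source: the sequence is treated as linear, and a kmer and its reverse complement are counted as one (canonical) kmer. The function names are hypothetical.

```python
# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the lexicographically smaller of a kmer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def kmer_diversity(seq, k=21, use_canonical=True):
    """Count distinct kmers in a linear sequence (assumption: no circular wrap)."""
    kmers = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        kmers.add(canonical(kmer) if use_canonical else kmer)
    return len(kmers)


def fitted_diversity(genome_size):
    """Blue dashed line from the caption: kmers = 1801 + 0.82 * genome size."""
    return 1801 + 0.82 * genome_size


if __name__ == "__main__":
    # For a 150 kb genome, compare the fitted line with the 1:1 line.
    size = 150_000
    print(fitted_diversity(size), size)
```

Under the canonical-kmer assumption, a region and its inverted copy contribute the same kmers, so a genome with a large inverted repeat falls below the 1:1 line.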

Figure 3.

Distribution of informative kmers possessing different types of polymorphisms, given k = 17.

The query sequence is bracketed by two lines with the reference sequences above it. The kmers containing the sequence variant are shown below. The black line indicates the coverage of the kmers containing the polymorphism at each base position. A) single nucleotide polymorphism; B) translocation; C) insertion-deletion.
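
Panel A can be reproduced on a toy example: kmers that occur in the query but not in the reference are the informative ones, and stacking them gives the coverage curve. The sketch below is illustrative only. The sequences are made up, k = 5 is used instead of the figure's k = 17 to keep the example small, and the function names are hypothetical.

```python
def kmer_set(seq, k):
    """All distinct kmers of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def informative_kmers(query, reference, k):
    """Return (start, kmer) pairs for query kmers absent from the reference."""
    reference_kmers = kmer_set(reference, k)
    return [
        (i, query[i:i + k])
        for i in range(len(query) - k + 1)
        if query[i:i + k] not in reference_kmers
    ]


def polymorphism_coverage(query, reference, k):
    """Per-base count of informative kmers covering each query position."""
    coverage = [0] * len(query)
    for start, _ in informative_kmers(query, reference, k):
        for position in range(start, start + k):
            coverage[position] += 1
    return coverage


if __name__ == "__main__":
    k = 5
    reference = "ACGTTGCAAGCTTAGGCATCGA"
    # Same sequence with a single substitution (C -> A) at index 10.
    query = reference[:10] + "A" + reference[11:]
    print(len(informative_kmers(query, reference, k)))
    print(polymorphism_coverage(query, reference, k))
```

For an isolated substitution away from the sequence ends, k kmers contain the variant base, so coverage peaks at k on the variant and tapers to 1 at a distance of k - 1 on either side, provided none of those kmers happens to occur elsewhere in the reference.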

Figure 4.

The relationship between local density of polymorphisms and the length of de novo contigs assembled from the reads possessing them.

Each polymorphism is represented by an asterisk and the gray block indicates the window covered by the kmers possessing the polymorphism, as shown in Fig. 3. The reads below each polymorphism contain a kmer that possesses the polymorphism. A) If local density of polymorphisms is low and the reads containing each polymorphism do not overlap significantly, then separate short contigs will be assembled, each containing a single polymorphism. B) If local density of polymorphisms is high, then reads will overlap substantially and even occasionally contain two polymorphisms, thus generating a single de novo contig for the entire region.
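
The contrast between panels A and B can be approximated with a simple interval model. This is a rough sketch under stated assumptions, not the assembly procedure itself: every read that contains a polymorphism is assumed to have the same length `read_len`, reads are assumed to start at every possible offset, and two neighbouring read piles are assumed to assemble into one contig when they overlap by at least `min_overlap` bases. Both parameter names and the function name are hypothetical.

```python
def predict_contigs(positions, read_len, min_overlap):
    """Merge the read spans around each polymorphism into predicted contigs.

    A read of length read_len that contains position p can reach from
    p - read_len + 1 to p + read_len - 1, so that span is used as the
    footprint of the reads recruited by the polymorphism at p.
    Returns a list of (start, end) spans, one per predicted contig.
    """
    spans = sorted((p - read_len + 1, p + read_len - 1) for p in positions)
    contigs = []
    for start, end in spans:
        if contigs:
            current_start, current_end = contigs[-1]
            overlap = current_end - start + 1
            if overlap >= min_overlap:
                # Enough shared sequence: extend the current contig.
                contigs[-1] = (current_start, max(current_end, end))
                continue
        contigs.append((start, end))
    return contigs


if __name__ == "__main__":
    # Panel A: sparse polymorphisms give separate short contigs.
    print(predict_contigs([100, 1000], read_len=100, min_overlap=20))
    # Panel B: dense polymorphisms collapse into one longer contig.
    print(predict_contigs([100, 150, 220], read_len=100, min_overlap=20))
```

In this model the number of contigs falls and their length grows as polymorphisms get closer together, which is the qualitative relationship the figure describes.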

Table 1.

Genomes with unusual tip contig characteristics.

Table 2.

Characteristics of long group contigs in deep taxonomic groups and big families.
