Mind the Gap: Upgrading Genomes with Pacific Biosciences RS Long-Read Sequencing Technology

doi:10.1371/journal.pone.0047768

Table 1.

Gap numbers and size distributions for representative high quality draft assemblies of highly studied species.
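
As an illustration of how gap counts and size distributions like these could be tallied, here is a minimal sketch. It assumes gaps are represented as runs of `N` in the scaffold sequence, which is a common convention but is not stated in this caption. The function names, the bin width, and the toy scaffold are hypothetical.

```python
import re
from collections import Counter

def gap_sizes(sequence):
    """Return the length of every run of N/n in the sequence (assumed gap encoding)."""
    return [m.end() - m.start() for m in re.finditer(r"[Nn]+", sequence)]

def size_histogram(sizes, bin_width=100):
    """Count gaps per size bin, keyed by the lower bound of each bin."""
    return Counter((s // bin_width) * bin_width for s in sizes)

# Toy scaffold with two gaps (hypothetical data, not from the paper).
scaffold = "ACGT" + "N" * 25 + "GGCC" + "N" * 150 + "TTAA"
sizes = gap_sizes(scaffold)
hist = size_histogram(sizes)
print(len(sizes), sizes, dict(hist))
```

For a real assembly the same functions would be applied per scaffold and the counters summed.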

Figure 1.

A schematic of PBJelly's workflow and decision-making.

(A) A flow chart of PBJelly's steps. (B) A schematic describing two hypothetical gaps supported by reads and the classifications used during the Support step. (C) A detailed flow chart for local assembly of PacBio reads in a gap region used during the assembly step.

Figure 2.

Description of sequencing data sets used.

Histograms of read lengths in (A) Dmel, (B) Dpse, (C) Mund, and (D) Caty. Panel (E) contains detailed metrics of each dataset.

Figure 3.

Gap-filling improvements and categories produced by PBJelly.

Histograms showing gap-size distribution in the original and upgraded (A) Dmel, (B) Dpse, (C) Mund, and (D) Caty references, as well as a summary of the upgrade categories for gaps.

Table 2.

Gap Fill Statistics for PBJelly.

Figure 4.

Validation of PBJelly Results.

Using Sanger sequencing of Dpse, we validated 7 negative gap closures (A) and 45 closed gaps (B). We also compared PBJelly's gap-closing sequence with the original Dmel reference (C).

Table 3.

Sanger Validation Results Per Gap.

Figure 5.

Distribution of amount of sequence placed in closed gaps compared to overfilled gaps.

Frequency plots of the absolute difference between the predicted gap size and the amount of sequence placed into the gap, for closed versus overfilled gaps in (A) Dpse, (B) Mund, and (C) Caty. Data for Dmel are not shown because, for synthetically inserted gaps, the predicted gap size matched the amount of sequence that should have been placed into the gap.
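
The quantity plotted here can be sketched as a one-line computation. This is a minimal illustration only: the function name and the example values are hypothetical and are not taken from the paper's data.

```python
def placement_deviation(predicted_size, placed_length):
    """Absolute difference between the predicted gap size and the sequence placed."""
    return abs(predicted_size - placed_length)

# (predicted gap size, bases placed into the gap); made-up values.
gaps = [(100, 95), (500, 620), (50, 50)]
deviations = [placement_deviation(p, f) for p, f in gaps]
print(deviations)
```

A deviation of 0 means the filled sequence matched the predicted size exactly. Larger values show how far the fill departed from the prediction, whether it fell short or overshot.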
