Table 1.
Gap numbers and size distributions for representative high-quality draft assemblies of highly studied species.
Figure 1.
A schematic of PBJelly's workflow and decision-making.
(A) A flow chart of PBJelly's steps. (B) A schematic describing two hypothetical gaps supported by reads and the classifications used during the Support step. (C) A detailed flow chart for local assembly of PacBio reads in a gap region used during the Assembly step.
Figure 2.
Description of sequencing data sets used.
Histograms of read lengths in (A) Dmel, (B) Dpse, (C) Mund, (D) Caty. Panel (E) contains detailed metrics of each dataset.
Figure 3.
Gap-filling improvements and categories produced by PBJelly.
Histograms showing gap-size distribution in the original and upgraded (A) Dmel, (B) Dpse, (C) Mund, and (D) Caty references as well as a summary of the upgrade categories for gaps.
Table 2.
Gap Fill Statistics for PBJelly.
Figure 4.
Results. Using Sanger sequencing of Dpse, we validated 7 negative gap closures (A) and 45 closed gaps (B). We also compared PBJelly's gap-closing sequence with the original Dmel reference (C).
Table 3.
Sanger Validation Results Per Gap.
Figure 5.
Distribution of the amount of sequence placed in closed gaps compared to overfilled gaps.
Frequency plots of the absolute difference between the predicted gap size and the amount of sequence placed into the gap, for closed versus overfilled gaps in (A) Dpse, (B) Mund, and (C) Caty. Data for Dmel are not shown because the gaps were synthetically inserted, so their predicted sizes matched the amount of sequence that should have been placed into them.