Table 1.
Summary of sequence reads.
Table 2.
Assembly results with PacBio subreads and Illumina MiSeq reads.
Table 3.
Detail of corrected positions via MiSeq.
Figure 1.
New BEST195 genome (4,105,380 bp).
Pink and blue indicate forward and reverse genes, respectively. Black lines correspond to regions of gaps in the previous genome, and red lines indicate novel coding sequences (CDSs) found only in the new genome. The second circle from the centre displays the G+C content (window size = 10,000 bp, step size = 200 bp), and the innermost circle displays the GC skew. This genetic map was generated using DNAPlotter [32].
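As an illustration of the sliding-window quantities plotted in the two inner circles, the following minimal sketch computes G+C content and GC skew with the stated window and step sizes. It assumes the common definition of GC skew, (G - C)/(G + C), and treats the sequence as linear. The function name `sliding_gc` is hypothetical, and this is not claimed to be the procedure DNAPlotter uses internally.

```python
def sliding_gc(seq, window=10_000, step=200):
    """Yield (start, gc_content, gc_skew) for each full window along seq.

    Assumptions (not from the source): GC skew is (G - C) / (G + C),
    the sequence is treated as linear, and partial windows are skipped.
    """
    seq = seq.upper()
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        gc = g + c
        content = gc / window
        # Guard against windows with no G or C at all.
        skew = (g - c) / gc if gc else 0.0
        yield start, content, skew


# Toy usage with a tiny window so the values are easy to check by hand.
toy = list(sliding_gc("GGGCAATT", window=4, step=2))
```

For a circular chromosome, one would probably wrap windows around the origin instead of skipping the tail. The sketch omits that for brevity.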
Table 4.
GAII, MiSeq, and PacBio mapping to the new genome and previous genome.
Figure 2.
Example of novel orthologous genes.
BSNT_06684 and BSNT_06685 are orthologous to BSU03480 and BSU03490, respectively. The upper part of this figure shows an anchor alignment within the new genome using Murasaki, a multiple genome comparison program [33], and the bottom part displays the mapping results of Illumina GAII reads, MiSeq reads, and PacBio reads.
Figure 3.
Example of one previous gap region.
The region corresponds to positions 4,003,725 to 4,007,944 bp in the previous genome. The upper part of this figure shows the mapping results of Illumina GAII reads, MiSeq reads, and PacBio reads. The bottom part displays an anchor alignment within the new genome using Murasaki, a multiple genome comparison program [33]. For the anchor alignment, blue regions represent aligned anchors, meaning that the same subsequences are present at other positions in the genome. For the mapping results, reads coloured white were mapped with a mapping quality of zero; that is, such a read is not uniquely mapped to one position. In the Illumina mapping results, the positions where white reads were mapped corresponded to the blue regions in the anchor alignment. Additionally, both ends of the gap region include transposases with repetitive sequences.
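The mapping-quality-zero reads described above could be counted from an alignment file along the following lines. This is a minimal sketch that assumes the alignments are available as SAM text, where the FLAG, reference name, position, and MAPQ are the second to fifth tab-separated columns. The function name `count_mapq_zero` and the reference name `chr` are hypothetical, and a read is assigned to the region by its leftmost mapped position only.

```python
def count_mapq_zero(sam_lines, ref_name, start, end):
    """Return (mapq_zero, total) for mapped reads starting in [start, end].

    Assumption: input is SAM text; a read belongs to the region if its
    leftmost mapped position (1-based POS column) lies inside it.
    """
    mapq_zero = total = 0
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        if flag & 0x4:  # read is unmapped
            continue
        rname, pos, mapq = fields[2], int(fields[3]), int(fields[4])
        if rname != ref_name or not (start <= pos <= end):
            continue
        total += 1
        if mapq == 0:
            mapq_zero += 1
    return mapq_zero, total


# Toy records; the reference name "chr" is a placeholder.
records = [
    "@SQ\tSN:chr\tLN:5000000",
    "r1\t0\tchr\t4003800\t0\t50M\t*\t0\t0\t*\t*",
    "r2\t16\tchr\t4004000\t60\t50M\t*\t0\t0\t*\t*",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
    "r4\t0\tchr\t100\t0\t50M\t*\t0\t0\t*\t*",
]
result = count_mapq_zero(records, "chr", 4003725, 4007944)
```

A high fraction of mapping-quality-zero reads in a window would be consistent with a repetitive region, although how a mapper assigns a quality of zero depends on the tool.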
Figure 4.
GC rate with novel CDSs and regions corresponding to gaps in the previous genome.
GC rate was calculated with a 10,000-bp window. For the GC plot, pink and blue indicate an above-average GC rate and below-average GC rate, respectively. Red boxes indicate novel CDSs in the new genome, and grey boxes correspond to gaps in the previous genome.
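The pink and blue colouring of the GC plot can be illustrated with a short sketch that labels each window by whether its GC rate is above the average. Several details are assumptions, not taken from the source: windows are non-overlapping, the average is the GC rate of the whole sequence, and a window exactly at the average is labelled blue. The function name `colour_gc_windows` is hypothetical.

```python
def colour_gc_windows(seq, window=10_000):
    """Return a list of (start, gc_rate, colour) for consecutive windows.

    Assumptions: non-overlapping windows, the average is the GC rate of
    the whole sequence, and ties with the average are labelled "blue".
    """
    seq = seq.upper()
    if not seq:
        return []
    average = (seq.count("G") + seq.count("C")) / len(seq)
    labelled = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        rate = (chunk.count("G") + chunk.count("C")) / len(chunk)
        colour = "pink" if rate > average else "blue"
        labelled.append((start, rate, colour))
    return labelled


# Toy usage with a 4-bp window; the overall GC rate here is 0.5.
labels = colour_gc_windows("GGCCAATTGCAT", window=4)
```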
Figure 5.
A gene in the Marburg 168 genome that could be improved using the complete BEST195 genome.
The upper part of this figure displays an anchor alignment between the Marburg 168 genome sequence and the complete BEST195 genome sequence, generated using Murasaki. The bottom part shows an alignment of sequences from the Marburg 168 genome, BEST195, and three related species (B. amyloliquefaciens, B. licheniformis, and B. pumilus), generated using CLC Sequence Viewer. The substitution of C for T and the deletions of C and A in the Marburg 168 genome sequence, marked by the red dashed box, were thought to be the cause of the separated genes BSU16890 and BSU16900.