Gene-Boosted Assembly of a Novel Bacterial Genome from Very Short Reads
All contigs are aligned with predicted gene sequences to identify genes that span two or more contigs. The DNA sequence of each spanning gene is extracted from its contigs, with a small buffer on each end. The amino acid translation of each gene fragment is then searched against a translated database of all singleton reads that have not yet been placed in the assembly. Finally, the reads identified by this search are assembled together with the contigs that the gene spans to fill in the gap between them.
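The first two steps of this procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the hit tuple layout `(gene_id, contig_id, start, end)`, the function names `spanning_genes` and `cut_with_buffer`, and the default buffer of 10 bases are all hypothetical, since the text specifies only "a small buffer".

```python
from collections import defaultdict


def spanning_genes(hits):
    """Return genes whose alignments land on two or more contigs.

    `hits` is an iterable of (gene_id, contig_id, start, end) tuples with
    0-based, half-open contig coordinates (an assumed layout).
    The result maps gene_id -> {contig_id: (start, end)}.
    """
    by_gene = defaultdict(dict)
    for gene_id, contig_id, start, end in hits:
        # Merge repeated hits of one gene on one contig into a single interval.
        lo, hi = by_gene[gene_id].get(contig_id, (start, end))
        by_gene[gene_id][contig_id] = (min(lo, start), max(hi, end))
    # Keep only genes that touch at least two distinct contigs.
    return {g: c for g, c in by_gene.items() if len(c) >= 2}


def cut_with_buffer(contig_seq, start, end, buffer=10):
    """Cut contig_seq[start:end] plus `buffer` bases on each side,
    clamped to the contig boundaries. The buffer size is an assumption."""
    lo = max(0, start - buffer)
    hi = min(len(contig_seq), end + buffer)
    return contig_seq[lo:hi]


if __name__ == "__main__":
    hits = [
        ("geneA", "contig1", 90, 100),  # gene runs off the end of contig1
        ("geneA", "contig2", 0, 12),    # and continues at the start of contig2
        ("geneB", "contig1", 10, 40),   # lies entirely within contig1
    ]
    print(spanning_genes(hits))
```

The extracted fragments would then be translated and searched against the translated singleton reads with a protein-level search tool; that step is left out here because the text does not describe its parameters.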