KrillDB: A de novo transcriptome database for the Antarctic krill (Euphausia superba)
Raw Illumina reads were first trimmed to remove adapters and low-quality bases at the 3’ end, then digitally normalized with the khmer software to reduce redundancy. The normalized reads were assembled independently with different assemblers (Oases, Trinity, IDBA and SOAP) and k-mer sizes (23, 33, 43 and 53). Sequences from a previous assembly based on 454 sequencing technology were added to further increase transcriptome coverage. Repeated sequences were identified and removed with RepeatMasker to reduce the number of chimeric misassemblies. All surviving fragments were merged into a single transcriptome using the EvidentialGene pipeline. The resulting transcripts were annotated by sequence homology (BLAST) and protein domain searches (InterProScan).
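
The digital normalization step can be illustrated with a minimal Python sketch of the median k-mer abundance idea behind khmer: a read is kept only if the median count of its k-mers, among the reads already kept, is still below a coverage cutoff. This is a simplified illustration, not khmer's implementation. It uses exact counts in a `Counter` rather than khmer's probabilistic counting structure, it ignores reverse complements, and the function names, the example reads, and the values of `k` and `cutoff` are assumptions chosen for the demonstration, not the settings used for KrillDB.

```python
from collections import Counter
from statistics import median


def kmers(seq, k):
    """Return all overlapping k-mers of a sequence, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def normalize_by_median(reads, k=20, cutoff=20):
    """Keep a read only while its median k-mer count is below `cutoff`.

    Hypothetical helper for illustration: exact counting, single strand.
    """
    counts = Counter()  # k-mer counts over the reads kept so far
    kept = []
    for read in reads:
        read_kmers = kmers(read, k)
        if not read_kmers:
            continue  # read is shorter than k, so it has no k-mers
        if median(counts[km] for km in read_kmers) < cutoff:
            kept.append(read)
            counts.update(read_kmers)
    return kept


# Toy input: one highly redundant read and one unique read (made-up sequences).
reads = ["ACGTTGCAAG"] * 5 + ["GGATCCTAGA"]
kept = normalize_by_median(reads, k=4, cutoff=2)
print(len(reads), "reads in,", len(kept), "reads kept")
```

With these toy settings the redundant read is retained only until its k-mers reach the cutoff, while the unique read is always kept, which is how normalization reduces redundancy without discarding low-coverage transcripts.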