Fig 1.

A schematic of the core Mirage2 algorithm.

Isoforms are first mapped back to their coding exons on the genome. Once all isoforms within a gene family have been mapped, those genome mapping coordinates serve as the basis for intra-species alignment, resulting in an MSA with explicit splice site awareness and exon delineation.
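
The idea of using genome coordinates as the basis for alignment can be illustrated with a toy sketch. This is not Mirage2's implementation: the input layout (one `(residue, coordinate)` pair per amino acid), the function name `align_by_coordinates`, and the coordinates themselves are assumptions made for illustration, and the sketch ignores strand, codons split across exons, and unmapped residues.

```python
from typing import Dict, List, Tuple

# Hypothetical input: each isoform is a list of (residue, genome_coordinate)
# pairs, one per amino acid, as a mapping step might produce.
Mapped = List[Tuple[str, int]]


def align_by_coordinates(isoforms: Dict[str, Mapped]) -> Dict[str, str]:
    """Place residues that share a genome coordinate in the same MSA column."""
    # Every distinct coordinate used by any isoform becomes one column.
    columns = sorted({pos for mapped in isoforms.values() for _, pos in mapped})
    col_index = {pos: i for i, pos in enumerate(columns)}

    msa = {}
    for name, mapped in isoforms.items():
        row = ["-"] * len(columns)  # gap wherever the isoform has no residue
        for residue, pos in mapped:
            row[col_index[pos]] = residue
        msa[name] = "".join(row)
    return msa


# Three toy isoforms over three "exons" (coordinates 100-103, 200-203, 300-303).
isoforms = {
    "iso1": [("M", 100), ("K", 103), ("T", 200), ("L", 203), ("P", 300), ("Q", 303)],
    "iso2": [("M", 100), ("K", 103), ("P", 300), ("Q", 303)],  # skips the middle exon
    "iso3": [("M", 100), ("K", 103), ("T", 200), ("L", 203)],  # lacks the last exon
}

for name, row in align_by_coordinates(isoforms).items():
    print(name, row)
```

Because columns are defined by coordinates rather than by residue similarity, the skipped exon in `iso2` appears as one contiguous gap block.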

Fig 2.

An example of a “species guide” file.

The top line is a Newick-formatted species tree to set the merge order of species during Mirage2’s interspecies alignment phase. Each subsequent line associates a species with the location of its reference genome and a GTF index. Sequences belonging to species that aren’t listed in the species guide are treated as “miscellaneous” and are the last to be integrated into interspecies MSAs.
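
A minimal sketch of reading such a file follows. The caption does not specify the exact field layout, so this assumes three whitespace-separated fields per line (species name, genome path, GTF path); the example paths, the `parse_species_guide` and `group_for` names, and the sample tree are all hypothetical.

```python
import re
from typing import Dict, List, NamedTuple


class SpeciesGuide(NamedTuple):
    newick: str               # the species tree, kept as the original string
    leaves: List[str]         # species names found in the tree, left to right
    genomes: Dict[str, str]   # species -> genome location
    gtfs: Dict[str, str]      # species -> GTF index location


def parse_species_guide(text: str) -> SpeciesGuide:
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    newick = lines[0]
    # A leaf name directly follows "(" or "," and ends before any of "(),:;".
    leaves = re.findall(r"[(,]\s*([^(),:;\s]+)", newick)
    genomes, gtfs = {}, {}
    for ln in lines[1:]:
        # ASSUMED layout: species, genome path, GTF path.
        species, genome, gtf = ln.split()
        genomes[species] = genome
        gtfs[species] = gtf
    return SpeciesGuide(newick, leaves, genomes, gtfs)


def group_for(species: str, guide: SpeciesGuide) -> str:
    """Species missing from the guide fall into the 'miscellaneous' group."""
    return species if species in guide.genomes else "miscellaneous"


# Hypothetical guide contents, for illustration only.
example = """
((human,mouse),rat);
human /data/genomes/human.fa /data/gtf/human.gtf
mouse /data/genomes/mouse.fa /data/gtf/mouse.gtf
rat   /data/genomes/rat.fa   /data/gtf/rat.gtf
"""

guide = parse_species_guide(example)
print(guide.leaves)
print(group_for("zebrafish", guide))
```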

Table 1.

Mirage2’s mapping methods map nearly all SwissProt sequences.

Fig 3.

FastMap and Spaln2 are complementary mapping methods.

The majority of sequences that Mirage2 is able to map back to the genome can be mapped using either FastMap or Spaln2, although one tool or the other is specifically required to map 14.0% of human sequences, 15.0% of mouse sequences, and 12.1% of rat sequences.
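
The breakdown behind a statement like this can be sketched with set operations. The sequence identifiers below are invented, the function name `mapping_breakdown` is hypothetical, and the caption does not state which denominator the percentages use, so the denominator is left to the caller.

```python
from typing import Dict, Set


def mapping_breakdown(fastmap: Set[str], spaln: Set[str], denominator: int) -> Dict[str, float]:
    """Percent of sequences mapped by both tools, or by exactly one of them."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    both = fastmap & spaln
    fastmap_only = fastmap - spaln
    spaln_only = spaln - fastmap
    return {
        "both": 100.0 * len(both) / denominator,
        "fastmap_only": 100.0 * len(fastmap_only) / denominator,
        "spaln_only": 100.0 * len(spaln_only) / denominator,
        # Sequences for which one specific tool is required.
        "one_tool_required": 100.0 * len(fastmap ^ spaln) / denominator,
    }


# Invented identifiers: which sequences each method managed to map.
fastmap_mapped = {"s1", "s2", "s3", "s4", "s5", "s6", "s7", "s8"}
spaln_mapped = {"s2", "s3", "s4", "s5", "s6", "s7", "s8", "s9", "s10"}

# ASSUMPTION: percentages are taken over all 10 input sequences.
print(mapping_breakdown(fastmap_mapped, spaln_mapped, denominator=10))
```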

Fig 4.

Mirage2 MSAs have extremely high percent column identity.

Percent column identity distributions for intra-species Mirage2 multiple-sequence alignments (excluding “alignments” with only 1 sequence) and Mirage2 inter-species alignments for genes present in at least 2 species.
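
The caption does not define the metric, so the following is only a sketch under one common reading: a column counts as identical when all of its non-gap residues are the same, and the percentage is taken over columns that contain at least one residue. The function name and the toy alignment are hypothetical.

```python
from typing import List


def percent_column_identity(msa: List[str], gap: str = "-") -> float:
    """Percent of non-empty columns whose non-gap residues are all identical."""
    if not msa:
        raise ValueError("MSA must contain at least one row")
    width = len(msa[0])
    if any(len(row) != width for row in msa):
        raise ValueError("all MSA rows must have the same length")

    scored = 0      # columns with at least one residue
    identical = 0   # scored columns with exactly one distinct residue
    for column in zip(*msa):
        residues = {c.upper() for c in column if c != gap}
        if not residues:
            continue  # skip all-gap columns
        scored += 1
        if len(residues) == 1:
            identical += 1
    if scored == 0:
        raise ValueError("MSA contains no residues")
    return 100.0 * identical / scored


toy_msa = [
    "MKTL",
    "MK-L",
    "MRTL",
]
print(percent_column_identity(toy_msa))  # 3 of 4 columns are identical: 75.0
```

Under this definition a single-sequence "alignment" always scores 100%, which is consistent with the caption excluding such cases from the distribution.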

Fig 5.

Differences between the percent column identities of Mirage2 MSAs and alignments produced by general-purpose MSA tools.

Values were computed by subtracting the percent column identity of each Mirage2 MSA from the percent column identity of the corresponding MSA produced by an alternative tool.
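
The sign convention is easy to misread, so here is a small sketch of the subtraction as described. The function name and the numbers are illustrative, not values from the paper.

```python
def identity_difference(mirage2_pct: float, alternative_pct: float) -> float:
    """Alternative tool's percent column identity minus Mirage2's."""
    return alternative_pct - mirage2_pct


# Illustrative numbers only: a negative value means the alternative tool's
# MSA has a lower percent column identity than the Mirage2 MSA.
print(identity_difference(mirage2_pct=99.0, alternative_pct=95.5))  # -3.5
```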

Fig 6.

The length compaction factors of alternative alignment methods relative to Mirage2.

Alignment length is defined as the number of columns in an MSA and the compaction factor is computed by dividing the length of an alternative tool’s MSA by the length of the corresponding Mirage2 MSA.
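
As a quick sketch of this definition, the factor can be computed directly from the two alignments. The function names and the toy alignments are hypothetical.

```python
from typing import List


def alignment_length(msa: List[str]) -> int:
    """Number of columns in an MSA (all rows must have equal length)."""
    if not msa:
        raise ValueError("MSA must contain at least one row")
    width = len(msa[0])
    if any(len(row) != width for row in msa):
        raise ValueError("all MSA rows must have the same length")
    return width


def compaction_factor(alternative_msa: List[str], mirage2_msa: List[str]) -> float:
    """Alternative tool's MSA length divided by the Mirage2 MSA length."""
    return alignment_length(alternative_msa) / alignment_length(mirage2_msa)


# Toy alignments of the same two sequences (illustrative only).
mirage2_msa = ["MKTL", "MK-L"]
alternative_msa = ["MKTL-", "MK--L"]

# Values above 1 mean the alternative MSA has more columns than Mirage2's.
print(compaction_factor(alternative_msa, mirage2_msa))  # 1.25
```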

Fig 7.

A partial comparison of the alignments of human DMBT1 sequences produced by Mirage2 and MAFFT.

The underlined segments highlight sequence regions where the tools are generally in agreement, but the segments are spaced significantly further apart in the MAFFT alignment than they are in the Mirage2 alignment. This illustrates how, in cases of erroneous alignment, using the comparative lengths of isoform MSAs can be an imperfect quantification of relative alignment quality.

Table 2.

Mapping performance comparison between Mirage2 and the original Mirage implementation.

Table 3.

Runtime comparison between Mirage2 and alternative MSA tools.
