ToPS: A Framework to Manipulate Probabilistic Models of Sequence Data
Figure 4
GHMM architecture for eukaryotic protein-coding gene prediction.
The states shown in the figure represent: initial exons, indexed by the phase at which the exon ends; internal exons, indexed by the phases at which the exon begins and ends; terminal exons, indexed by the phase at which the exon begins; introns, indexed by phase; intergenic regions; the start codon signal; the stop codon signal; acceptor splice site signals, indexed by phase; and donor splice site signals, indexed by phase. To model the reverse strand, we used the states that begin with the prefix ‘r-’. Squares with a self-transition represent states with a geometric duration distribution. Squares without a self-transition represent states with a non-geometric duration distribution. Ellipses represent states with fixed-length durations.
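
The link between a self-transition and a geometric duration can be illustrated with a minimal sketch. It assumes a state that stays in place with probability p at each step and leaves with probability 1 - p, so the probability of emitting exactly d symbols is p^(d-1)(1 - p). The function names and the value 0.9 are hypothetical, chosen only for illustration; they are not taken from the figure or from the ToPS API.

```python
def geometric_duration_pmf(self_transition_prob, duration):
    """Probability that a state with a self-transition lasts exactly `duration` steps.

    The state stays (duration - 1) times, each with probability
    self_transition_prob, and then leaves once with probability
    (1 - self_transition_prob).
    """
    if duration < 1:
        return 0.0
    stay = self_transition_prob ** (duration - 1)
    leave = 1.0 - self_transition_prob
    return stay * leave


def expected_duration(self_transition_prob):
    """Mean of the geometric duration distribution: 1 / (1 - p)."""
    return 1.0 / (1.0 - self_transition_prob)


# Illustrative value only (an assumption, not a parameter from the paper).
p_stay = 0.9

# The probabilities over all durations should sum to (approximately) one.
total = sum(geometric_duration_pmf(p_stay, d) for d in range(1, 2001))

print(geometric_duration_pmf(p_stay, 1))
print(expected_duration(p_stay))
print(total)
```

A state without a self-transition cannot be described this way, which is why its duration has to be given by an explicit, non-geometric distribution, and a fixed-length state corresponds to a distribution that puts all of its mass on a single duration.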