Global Discriminative Learning for Higher-Accuracy Computational Gene Prediction

doi:10.1371/journal.pcbi.0030054

Figure 1.

Learning Methods: Discriminative versus Generative

Schematic comparison of discriminative (A) and generative (B) learning methods. In the discriminative case, all model parameters were estimated simultaneously to predict a segmentation as similar as possible to the annotation. In contrast, for generative HMMs, signal features and state features were assumed to be independent and trained separately.

Table 1.

Dataset Statistics

Table 2.

Accuracy Results for BGHM953

Table 3.

Accuracy Results for TIGR251

Table 4.

Accuracy Results for ENCODE294

Table 5.

Significance Testing

Figure 2.

F-Score as a Function of Intron Length

Results for all sets combined (A) and for each individual test set (B–D). The boxed number directly above each marker is the total number of introns in that marker's length range. For example, there were 1,475 introns with lengths between 1,000 and 2,000 base pairs for all sets combined (A).
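
The kind of per-length-bin score plotted in this figure can be illustrated with a minimal sketch. It rests on assumptions not stated in this caption: an intron counts as correct only if both of its coordinates match the annotation exactly, "specificity" follows the gene-prediction convention of the fraction of predictions that are correct, and the F-score is the harmonic mean of sensitivity and specificity. The function names and the bin edges are hypothetical and chosen only for illustration.

```python
from bisect import bisect_right
from collections import defaultdict


def f_score(sensitivity, specificity):
    """Harmonic mean of sensitivity and specificity (0.0 if both are zero)."""
    total = sensitivity + specificity
    return 0.0 if total == 0 else 2 * sensitivity * specificity / total


def f_score_by_intron_length(annotated, predicted, bin_edges):
    """Return {bin_index: F-score} for introns grouped by length.

    annotated, predicted: iterables of (start, end) intron coordinates.
    bin_edges: sorted length boundaries; bin i holds lengths in
               [bin_edges[i-1], bin_edges[i]), bin 0 is below bin_edges[0].
    Assumption: each intron is binned by its own length, so a prediction
    with wrong boundaries may land in a different bin than its target.
    """
    ann_bins = defaultdict(set)
    pred_bins = defaultdict(set)
    for start, end in annotated:
        ann_bins[bisect_right(bin_edges, end - start)].add((start, end))
    for start, end in predicted:
        pred_bins[bisect_right(bin_edges, end - start)].add((start, end))

    scores = {}
    for b in sorted(set(ann_bins) | set(pred_bins)):
        ann, pred = ann_bins[b], pred_bins[b]
        correct = len(ann & pred)  # exact-coordinate matches only
        sensitivity = correct / len(ann) if ann else 0.0
        specificity = correct / len(pred) if pred else 0.0
        scores[b] = f_score(sensitivity, specificity)
    return scores


# Toy data, not taken from the paper.
annotated = {(0, 1500), (100, 300), (5000, 5200)}
predicted = {(0, 1500), (100, 350), (5000, 5200)}
scores = f_score_by_intron_length(annotated, predicted, [1000, 2000])
```

With these toy inputs, the short-intron bin contains one correct and one mispredicted intron, while the 1,000 to 2,000 base pair bin contains a single correctly predicted intron.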

Figure 3.

F-Score versus Intron Length for the ENCODE Test Set

Results in subfigures (A) and (B) correspond to the subset of alternatively spliced genes and its complementary subset, respectively.

Figure 4.

Signal Accuracy Improvements

CRAIG's relative improvements in prediction specificity (orange bar) and sensitivity (blue bar) by signal type. In each case, the second-best program was used for the comparison: Genezilla for starts, Augustus for stops, and GenScan++ for splice sites.

Figure 5.

Finite-State Model for Eukaryotic Genes

Variable-length genomic regions are represented by states, and biological signals are represented by transitions between states. Short and long introns are denoted by I^S and I^L, respectively.

Table 6.

State Features for Each Segment Label

Table 7.

Transition Features per Signal Type
