Figure 1.
Flowchart illustrating the experimental procedure.
The protocol consists of three steps: 1) Pre-processing; 2) Supervised analysis; 3) Post-analysis.
Table 1.
Datasets used in this paper.
Figure 2.
A BioHEL classification rule set obtained for the prostate cancer dataset and illustrating different types of rules.
“Exp(x)” is short for “Expression of gene x”, where x is a HUGO gene symbol; “∧” represents the conjunctive AND-operator; “[x,y]” is an interval of expression values in which the value of the attribute must lie to fulfill one premise of the rule; and “->” is a class assignment operator, followed by the output class of the rule. Rule 5 is a default rule that applies if no rule above is matched.
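The rule notation described in this caption can be illustrated with a minimal sketch. It assumes that rules are checked in order, that interval bounds are inclusive, and that a rule with no premises acts as the default rule. The gene names, intervals, class labels, and the function name `classify` are hypothetical placeholders, not values or code from BioHEL or the paper.

```python
from typing import Dict, List, Optional, Tuple

# A premise is (gene symbol, lower bound, upper bound).
# A rule is (list of premises joined by AND, output class).
# An empty premise list stands for the default rule (assumption).
Premise = Tuple[str, float, float]
Rule = Tuple[List[Premise], str]


def classify(sample: Dict[str, float], rules: List[Rule]) -> Optional[str]:
    """Return the class of the first rule whose premises all hold."""
    for premises, label in rules:
        # Every premise must be fulfilled: the gene must be present and
        # its expression value must lie inside the closed interval.
        if all(
            gene in sample and low <= sample[gene] <= high
            for gene, low, high in premises
        ):
            return label
    return None  # no rule matched and no default rule was given


# Hypothetical rule set; names and numbers are placeholders only.
rules: List[Rule] = [
    ([("GENE_A", 0.2, 0.8), ("GENE_B", 1.0, 3.5)], "tumor"),
    ([("GENE_C", -1.0, 0.1)], "normal"),
    ([], "normal"),  # default rule: applies if no rule above matches
]

print(classify({"GENE_A": 0.5, "GENE_B": 2.0}, rules))
```

Because the default rule has no premises, it matches any sample that reaches it, which mirrors the role of Rule 5 in the figure.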
Table 2.
10-fold cross-validation results.
Table 3.
Leave-one-out cross-validation results.
Table 4.
Comparison of prediction results from the literature for the prostate cancer dataset.
Table 5.
Comparison of prediction results from the literature for the lymphoma dataset.
Table 6.
Comparison of prediction methods.
Table 7.
Comparison of feature selection methods.
Table 8.
List of high scoring genes for the prostate cancer dataset.
Table 9.
List of high scoring genes for the lymphoma dataset.
Table 10.
List of high scoring genes for the breast cancer dataset.
Figure 3.
Comparison of text mining scores.
Histogram of text mining scores for randomly chosen gene identifier subsets compared to scores achieved by BioHEL and the ensemble feature selection (FS) approach (prostate cancer dataset).
Figure 4.
Comparison of text mining scores.
Histogram of text mining scores for randomly chosen gene identifier subsets compared to scores achieved by BioHEL and the ensemble feature selection (FS) approach (lymphoma dataset).
Figure 5.
Comparison of text mining scores.
Histogram of text mining scores for randomly chosen gene identifier subsets compared to scores achieved by BioHEL and the ensemble feature selection (FS) approach (breast cancer dataset).
Table 11.
Literature mining significance scores.