Figure 1.
Overall schema of the EPU learning algorithm.
EPU is a framework that uses positive and ‘weighted’ unlabeled examples to build an ensemble classifier for disease gene identification. First, EPU extracts candidate positives (CP) and reliable negatives (RN) from the unlabeled set (U). It then applies a random walk algorithm on genetic networks to weight the remaining unlabeled genes. To obtain a reliable and robust weighting of U, EPU consults three biological networks: a PPI network, a GO similarity network, and a gene expression network. After obtaining the ensemble-weighted genes, EPU builds three PU learning classifiers. Finally, a novel ensemble strategy is applied to combine the outputs of these classifiers into the final predictions.
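The network-weighting step can be illustrated with a minimal sketch. The caption does not say which random walk variant EPU uses, so the code below assumes a random walk with restart seeded at known positive genes; the function name, the restart probability, and the toy graph are all hypothetical and are not taken from EPU itself.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Return a steady-state visiting score for every node.

    adjacency: dict mapping node -> {neighbour: non-negative edge weight};
               every neighbour must also appear as a key.
    seeds:     non-empty set of nodes where the walk restarts
               (assumed here to be the known positive genes).
    """
    nodes = sorted(adjacency)
    # Restart distribution: uniform over the seed nodes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    # Total outgoing weight per node, used to normalise transitions.
    out_weight = {n: sum(adjacency[n].values()) for n in nodes}

    p = dict(p0)
    for _ in range(max_iter):
        # Mass that jumps back to the seeds at this step.
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            if out_weight[u] == 0:
                # Isolated node: return its mass to the seeds.
                for n in nodes:
                    nxt[n] += (1 - restart) * p[u] * p0[n]
                continue
            # Spread the remaining mass along weighted edges.
            for v, w in adjacency[u].items():
                nxt[v] += (1 - restart) * p[u] * w / out_weight[u]
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical toy network: a chain A - B - C - D with gene A as the only seed.
toy_network = {
    "A": {"B": 1.0},
    "B": {"A": 1.0, "C": 1.0},
    "C": {"B": 1.0, "D": 1.0},
    "D": {"C": 1.0},
}
scores = random_walk_with_restart(toy_network, {"A"})
```

In this sketch, unlabeled genes closer to the seed receive higher scores, which is the kind of proximity-based weight the caption describes. Scores computed on each of the three networks would still have to be combined, and the caption does not specify how.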
Figure 2.
Ensemble learning algorithm.
Table 1.
Overall comparison of classification performance among different techniques.
Table 2.
Overall comparison to single-expert classifiers.
Table 3.
Novel cancer-related genes predicted by EPU.