WORMHOLE: Novel Least Diverged Ortholog Prediction through Machine Learning
Fig 7
WORMHOLE SVMs improve prediction of FOSTA FEPs over constituent algorithms and voting to a degree dependent on the evolutionary separation of the compared species.
(A) Precision-recall performance charts for FOSTA FEP predictions made between vertebrate and invertebrate species, separated into categories based on evolutionary distance. Points or lines represent the mean performance of the 17 constituent algorithms (black), BLASTp reciprocal best hits (RBHs) (red), voting (green), WORMHOLE SVMs (blue), or WORMHOLE RBHs (cyan) at predicting FOSTA FEPs. Lines are generated by sampling the complete range of possible threshold values for each confidence score type. Colored points indicate the performance for specified threshold values (blue numbers) on each line. (B) Box-and-whisker plot representing the harmonic mean of precision and recall for each of the 17 constituent WORMHOLE algorithms, voting, BLASTp RBHs, WORMHOLE SVMs, and WORMHOLE RBHs when predicting FOSTA FEPs for each pair of query and target species. Ortholog prediction methods are ordered by median harmonic mean. For voting and SVMs, values represent the maximum harmonic mean for each pair of query and target species (WORMHOLE Score ≥ 0.5).
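As a rough illustration of how a precision-recall line and the harmonic mean in panel B can be computed, the following minimal sketch sweeps a set of confidence thresholds over scored predictions. It assumes the standard definitions (precision = true positives / predicted pairs, recall = true positives / reference pairs); the function name `precision_recall_curve`, the toy scores, and the labels are hypothetical and do not come from the WORMHOLE data or code.

```python
def precision_recall_curve(scores, is_true, thresholds):
    """Return (threshold, precision, recall, harmonic_mean) per threshold.

    scores   -- confidence score for each candidate ortholog pair
    is_true  -- True if the pair is in the reference set (e.g. a FOSTA FEP)
    """
    total_true = sum(is_true)
    rows = []
    for t in thresholds:
        # Pairs scoring at or above the threshold count as predictions.
        predicted = [truth for s, truth in zip(scores, is_true) if s >= t]
        tp = sum(predicted)
        precision = tp / len(predicted) if predicted else 0.0
        recall = tp / total_true if total_true else 0.0
        denom = precision + recall
        # Harmonic mean of precision and recall (the F1 score).
        hmean = 2 * precision * recall / denom if denom else 0.0
        rows.append((t, precision, recall, hmean))
    return rows


# Hypothetical toy data, for illustration only.
scores = [0.9, 0.8, 0.6, 0.4, 0.2]
is_true = [True, True, False, True, False]

curve = precision_recall_curve(scores, is_true, [0.1, 0.5, 0.7])
# Threshold giving the maximum harmonic mean among those sampled.
best = max(curve, key=lambda row: row[3])
```

Each tuple in `curve` corresponds to one point on a precision-recall line, and `best` is the sampled threshold with the largest harmonic mean, analogous to the maximum reported for voting and SVMs.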