Imitating Manual Curation of Text-Mined Facts in Biomedicine
Figure 7
Comparison of a Correlation Matrix for the Features (Colored Half of the Matrix) Computed Using Only the Annotated Set of Data and a Matrix of Mutual Information between All Feature Pairs and the Statement Class (Correct or Incorrect)
The plot indicates that a significant amount of information critical for classification is encoded in pairs of weakly correlated features. The white dotted lines outline clusters of features, suggested by analysis of the annotated dataset; we used these clusters in implementation of the Clustered Bayes classifier.