Semantic Similarity for Automatic Classification of Chemical Compounds
For each dataset, the best-performing metric was stripped of its parameter. The resulting incomplete metric was then completed with each candidate parameter value in turn, and each completed metric was used to measure performance. The figure shows how performance (measured by the Matthews Correlation Coefficient) varies with the parameter value. Every dataset shows a maximum in the plot, which corresponds to the best metric for that dataset: BBB (red open circles), P-gp (green closed circles) and estrogen (blue closed squares).
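
The parameter sweep described in this caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `mcc`, `confusion`, `sweep` and `predict` are hypothetical, the labels and scores are toy data, and a simple score threshold stands in for the metric's parameter.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: MCC is taken as 0 when any marginal is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


def confusion(y_true, y_pred):
    """Count true/false positives and negatives for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    return tp, tn, fp, fn


def sweep(param_values, predict, y_true):
    """Score every parameter value and return the best (value, mcc) pair.

    `predict` is a hypothetical callable mapping a parameter value to a
    list of binary predictions, one per compound.
    """
    scores = [(v, mcc(*confusion(y_true, predict(v)))) for v in param_values]
    best = max(scores, key=lambda pair: pair[1])
    return best, scores


# Toy data (assumed): true class labels and one similarity score per compound.
y_true = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.7, 0.4, 0.2, 0.6, 0.5]


def predict(threshold):
    # Stand-in for "complete the metric with this parameter value and classify".
    return [1 if s >= threshold else 0 for s in toy_scores]


best, scores = sweep([0.1, 0.3, 0.55, 0.8], predict, y_true)
for value, score in scores:
    print(f"parameter={value:.2f}  MCC={score:.3f}")
print("best parameter:", best[0])
```

Plotting the `scores` pairs for each dataset would give one curve per dataset, with its maximum marking the best parameter value.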