Information Content-Based Gene Ontology Functional Similarity Measures: Which One to Use for a Given Biological Data Type?

doi:10.1371/journal.pone.0113859

Table 1.

Summary of different IC-based functional similarity and term semantic similarity measures.

Figure 1.

Performance evaluation in terms of Pearson's correlation values.

The Pearson's correlation values with Enzyme Commission (EC), Pfam and sequence similarity were obtained from the CESSM online tool. In the x-axis labels, the prefixes R, N, L, Li, S, X, A, Z, W, and U denote the approaches Resnik, Nunivers, Lin, Li, Relevance, XGraSM, Annotation-based, Zhang, Wang and GO-universal, respectively. The suffixes GIC, UIC and DIC denote the SimGIC, SimUIC and SimDIC measures, respectively. Where the prefix X is used, it is immediately followed by the prefix of the approach it is applied to. Refer to Tables 2 and 3 for descriptions of these measures.
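
The labelling convention described in this caption can be sketched as a small decoder. This is only an illustrative sketch: the prefix and suffix tables are taken from the caption, but the exact label strings used in the figure are not shown here, so the concatenated form (for example "XRGIC") and the function name `decode_label` are assumptions.

```python
# Prefix and suffix tables as given in the Figure 1 caption.
APPROACHES = {
    "R": "Resnik", "N": "Nunivers", "L": "Lin", "Li": "Li",
    "S": "Relevance", "A": "Annotation-based", "Z": "Zhang",
    "W": "Wang", "U": "GO-universal",
}
MEASURES = {"GIC": "SimGIC", "UIC": "SimUIC", "DIC": "SimDIC"}


def decode_label(label):
    """Split a label into XGraSM flag, approach and measure.

    Assumption: labels are written as [X]<approach prefix>[<suffix>],
    optionally with a '-', '_' or space before the suffix.
    """
    rest = label.strip()

    # A leading X marks XGraSM and is followed by the approach prefix.
    xgrasm = rest.startswith("X")
    if xgrasm:
        rest = rest[1:]

    # Try longer prefixes first so that "Li" is not read as "L".
    approach = None
    for prefix in sorted(APPROACHES, key=len, reverse=True):
        if rest.startswith(prefix):
            approach = APPROACHES[prefix]
            rest = rest[len(prefix):]
            break

    # Whatever remains is the suffix; unknown suffixes are returned as-is.
    rest = rest.lstrip("-_ ")
    measure = MEASURES.get(rest, rest or None)

    return {"xgrasm": xgrasm, "approach": approach, "measure": measure}
```

Under these assumptions, `decode_label("XRGIC")` would be read as XGraSM applied to the Resnik approach with the SimGIC measure. A suffix that is not GIC, UIC or DIC is passed through unchanged rather than guessed at.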

Table 2.

Pearson's correlation values of different measures.

Figure 2.

Performance evaluation in terms of clustering power (RI and NI) and Area Under the Curve (AUC) values.

The x-axis labels are the same as in Fig. 1: the prefixes denote term semantic similarity approaches and the suffixes denote functional similarity measures.

Table 3.

Area under the curve (AUC), Rand Index (RI) and Normalized Mutual Information (NI) values of different measures.

Table 4.

Summary of overall ‘best’ performing measures for different biological data.

Table 5.

Summary of the best performing measures for different applications.
