Advances in gene ontology utilization improve statistical power of annotation enrichment

doi:10.1371/journal.pone.0220728

Fig 1.

GOcats data flow diagram for creating categories of GO.

A) GOcats enables the user to extract subgraphs of GO representing concepts as defined by keywords, each with a root (category-defining) node. B) Subgraphs extracted by GOcats are used to create a mapping from all sub-nodes in a set of subgraphs to their category-defining root node(s). This allows the user to map gene annotations in GAFs to any number of customized categories.
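
The mapping step in panel B can be illustrated with a minimal sketch. This is not the GOcats API: the toy graph, the term names, and the helpers `ancestors`, `build_category_mapping`, and `categorize_annotations` are all assumptions made up for illustration, and gene annotations are held in a plain dictionary rather than parsed from a GAF.

```python
from collections import defaultdict

# Toy child -> parents edges; illustrative only, not taken from GO.
PARENTS = {
    "nuclear envelope": ["nucleus"],
    "nucleolus": ["nucleus"],
    "nucleus": ["cell"],
    "plasma membrane": ["membrane", "cell"],
    "membrane": [],
    "cell": [],
}

def ancestors(term, parents):
    """Return every term reachable from `term` by following child -> parent edges."""
    seen, stack = set(), list(parents.get(term, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, []))
    return seen

def build_category_mapping(parents, roots):
    """Map each term to the set of category roots it equals or falls under."""
    root_set = set(roots)
    mapping = {}
    for term in parents:
        hits = (ancestors(term, parents) | {term}) & root_set
        if hits:
            mapping[term] = hits
    return mapping

def categorize_annotations(gene_annotations, mapping):
    """Replace each gene's fine-grained terms with their category roots."""
    result = defaultdict(set)
    for gene, terms in gene_annotations.items():
        for term in terms:
            result[gene] |= mapping.get(term, set())
    return dict(result)

# Hypothetical category-defining roots and gene annotations.
mapping = build_category_mapping(PARENTS, roots=["nucleus", "membrane"])
genes = {"geneA": ["nuclear envelope"], "geneB": ["plasma membrane", "nucleolus"]}
categorized = categorize_annotations(genes, mapping)
```

In this sketch a term can map to more than one root, which mirrors the caption's "root node(s)"; terms outside every subgraph (here, "cell") are simply left unmapped.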

Fig 2.

The has_part relation creates incongruent paths with respect to semantic scoping.

Some tools may create questionable GO term mappings, e.g. “nuclear envelope” to “plasma membrane,” because has_part edges point from super-concepts to sub-concepts, the reverse of the direction used by scoping relations such as is_a and part_of. GOcats avoids this by re-interpreting has_part edges as part_of_some edges.
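
A minimal sketch of the edge re-interpretation described in this caption follows. It is not GOcats code: the abstract terms `X`, `W`, `P`, `Q` and the helpers `reinterpret_has_part` and `reachable` are hypothetical, and the traversal is a plain graph walk chosen only to show why edge direction matters.

```python
def reinterpret_has_part(edges):
    """Flip each has_part edge so it points from part to whole, like part_of."""
    fixed = []
    for source, relation, target in edges:
        if relation == "has_part":
            fixed.append((target, "part_of_some", source))
        else:
            fixed.append((source, relation, target))
    return fixed

def reachable(start, edges):
    """Return all terms reached from `start` by following edges source -> target."""
    outgoing = {}
    for source, _relation, target in edges:
        outgoing.setdefault(source, []).append(target)
    seen, stack = set(), list(outgoing.get(start, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(outgoing.get(node, []))
    return seen

# Hypothetical edges: X is part of W, W has a part P, and P is a kind of Q.
edges = [
    ("X", "part_of", "W"),
    ("W", "has_part", "P"),
    ("P", "is_a", "Q"),
]

# Following has_part as if it pointed toward broader concepts maps X to Q.
naive = reachable("X", edges)

# After flipping has_part, X only reaches the concept that actually contains it.
scoped = reachable("X", reinterpret_has_part(edges))
```

Here the naive walk links `X` to `Q` through a sibling part of `W`, which is the kind of incongruent path the figure describes; once the has_part edge is flipped, `X` maps only to `W`, while `P` still maps to both `Q` and `W`.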

Table 1.

Frequency of relations in the Gene Ontology and suggested semantic correspondence classes to reduce ambiguity†.

Table 2.

Prevalence of potential has_part relation mapping errors in GO.

Table 3.

Summary of GO term mapping errors resulting from misevaluation of relations with respect to semantic scoping.

Fig 3.

Comparison of adjusted p-values for significantly enriched annotations using GOcats paths vs excluding has_part edges.

In the enrichment analysis of the breast cancer data set used in this investigation, most significantly enriched GO terms had an improved p-value when GOcats re-evaluated has_part edges.
