Fig 1.
GOcats data flow diagram for creating categories of GO.
A) GOcats enables the user to extract subgraphs of GO representing concepts as defined by keywords, each with a root (category-defining) node. B) Subgraphs extracted by GOcats are used to create a mapping from all sub-nodes in a set of subgraphs to their category-defining root node(s). This allows the user to map gene annotations in GAFs to any number of customized categories.
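The two steps in this caption can be illustrated with a minimal sketch. It assumes a toy ontology stored as a plain dictionary of parent-to-children edges and gene annotations given as (gene, term) pairs. The function names, the toy terms, and the data layout are hypothetical and are not the GOcats API.

```python
from collections import defaultdict, deque


def descendants(root, children):
    """Return the root plus every node reachable by following child edges."""
    seen = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in children.get(node, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def build_category_mapping(roots, children):
    """Map every sub-node to the set of category-defining roots above it."""
    mapping = defaultdict(set)
    for root in roots:
        for node in descendants(root, children):
            mapping[node].add(root)
    return dict(mapping)


def categorize_annotations(annotations, mapping):
    """Map each gene to the categories of the terms it is annotated with."""
    result = defaultdict(set)
    for gene, term in annotations:
        result[gene].update(mapping.get(term, ()))
    return dict(result)


# Hypothetical toy ontology: parent -> children (not real GO content).
children = {
    "membrane": ["plasma membrane", "nuclear membrane"],
    "nucleus": ["nuclear membrane", "nucleolus"],
}
mapping = build_category_mapping(["membrane", "nucleus"], children)

# Hypothetical annotations, standing in for (gene, GO term) pairs read from a GAF.
annotations = [
    ("geneA", "nuclear membrane"),
    ("geneB", "nucleolus"),
    ("geneC", "cytosol"),
]
gene_categories = categorize_annotations(annotations, mapping)
```

In this sketch a term that falls under several roots maps to all of them, and a term outside every subgraph leaves its gene with an empty category set.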
Fig 2.
The has_part relation creates incongruent paths with respect to semantic scoping.
Some tools may create questionable GO term mappings, e.g. “nuclear envelope” to “plasma membrane,” because has_part edges point from super-concepts to sub-concepts. GOcats avoids this by re-interpreting has_part edges as part_of_some edges.
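The edge re-interpretation described in this caption can be sketched as follows. The sketch assumes edges stored as (source, relation, target) triples and a naive traversal that treats every edge target as a broader concept. The term names and helper functions are hypothetical and do not reproduce the GOcats implementation.

```python
def reinterpret_has_part(edges):
    """Rewrite each (A, has_part, B) edge as (B, part_of_some, A).

    After the rewrite every edge points from the narrower concept
    to the broader one, like is_a and part_of.
    """
    rewritten = []
    for source, relation, target in edges:
        if relation == "has_part":
            rewritten.append((target, "part_of_some", source))
        else:
            rewritten.append((source, relation, target))
    return rewritten


def ancestors(term, edges):
    """Collect every node reached by following edges from source to target."""
    parents = {}
    for source, _, target in edges:
        parents.setdefault(source, set()).add(target)
    seen = set()
    stack = [term]
    while stack:
        node = stack.pop()
        for parent in parents.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# Hypothetical toy edges (not real GO content).
edges = [
    ("part_x", "is_a", "general_part"),
    ("whole_y", "has_part", "part_x"),
]

# Following has_part as written maps the whole onto its part's category.
naive = ancestors("whole_y", edges)

# After re-interpretation the whole no longer maps to the part's ancestors.
fixed_edges = reinterpret_has_part(edges)
fixed = ancestors("whole_y", fixed_edges)
```

With the raw edges, the traversal from the whole reaches the part and the part's parent, which is the kind of questionable mapping the caption describes. With the rewritten edges, the part maps up to the whole instead.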
Table 1.
Frequency of relations in the Gene Ontology and suggested semantic correspondence classes to reduce ambiguity†.
Table 2.
Prevalence of potential has_part relation mapping errors in GO.
Table 3.
Summary of GO term mapping errors resulting from misevaluation of relations with respect to semantic scoping.
Fig 3.
Comparison of adjusted p-values for significantly enriched annotations using GOcats paths vs excluding has_part edges.
Most significantly enriched GO terms had an improved p-value when GOcats re-evaluated has_part edges for the enrichment of the breast cancer dataset in this investigation.
Table 4.
Uniquely enriched terms between GOcats paths and traditional paths from the breast cancer dataset analysis.
Table 5.
Binomial test results for GOcats vs no_hp enrichment for horse cartilage development time point comparisons.
Table 6.
Neighbor vs extreme time point comparison of enriched terms in horse cartilage development enrichment analyses.
Table 7.
Comparison of equine fetus tissue samples.