Benchmarking Ontologies: Bigger or Better?
The test ontology, X, is represented as a set of concepts and a set of relations, CX and RX respectively, and is compared to a domain-specific reference corpus, T. Our analysis begins by mapping the concepts and relations of X to T using natural language processing tools (step 1). This mapping allows us to estimate from the text the concept- and relation-specific frequency parameters required for computing the Breadth and Depth metrics for X with respect to T (step 2). We then estimate the complete ontology for corpus T, an ideal ontology that includes every concept and every relation mentioned in T (step 3). Given the complete ontology, we can estimate the fittest ontology of the same size as the test ontology X, which is a subset of the complete ontology (step 4), and compute the loss measures for X (step 5). See the Materials and Methods section for precise definitions of the concepts and metrics involved.
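The flow of the five steps can be illustrated with a deliberately simplified sketch, restricted to concepts (relations are omitted). It makes several assumptions that are not taken from this paper: mapping is reduced to case-insensitive whole-word matching rather than a full natural language processing pipeline, the "fittest" ontology is approximated by the k most frequent concepts, and `coverage` and the resulting loss are hypothetical stand-ins for the Breadth, Depth, and loss metrics defined in Materials and Methods. All function names and the toy corpus are illustrative only.

```python
import re
from collections import Counter


def concept_frequencies(concepts, corpus_text):
    """Steps 1-2 (simplified): count whole-word, case-insensitive mentions."""
    lowered = corpus_text.lower()
    counts = Counter()
    for concept in concepts:
        pattern = r"\b" + re.escape(concept.lower()) + r"\b"
        counts[concept] = len(re.findall(pattern, lowered))
    return counts


def top_k_by_frequency(frequencies, k):
    """Stand-in for step 4: keep the k most frequent concepts.

    Ties are broken alphabetically. This is an assumption, not the
    paper's definition of the fittest ontology.
    """
    ranked = sorted(frequencies.items(), key=lambda item: (-item[1], item[0]))
    return {concept for concept, _ in ranked[:k]}


def coverage(concepts, complete_frequencies):
    """Hypothetical score: share of all concept mentions that `concepts` captures."""
    total = sum(complete_frequencies.values())
    if total == 0:
        return 0.0
    return sum(complete_frequencies.get(c, 0) for c in concepts) / total


# Toy corpus T and test ontology X (both invented for illustration).
corpus_t = (
    "The gene encodes a protein. The protein binds another protein. "
    "A cell expresses the gene."
)
test_concepts = {"gene", "enzyme"}
candidate_vocabulary = {"gene", "protein", "cell", "enzyme"}

# Step 3 (simplified): the complete ontology keeps every candidate mentioned in T.
all_frequencies = concept_frequencies(candidate_vocabulary, corpus_t)
complete_frequencies = {c: n for c, n in all_frequencies.items() if n > 0}

# Step 4: best same-size subset; step 5: gap between it and X.
fittest = top_k_by_frequency(complete_frequencies, len(test_concepts))
loss = coverage(fittest, complete_frequencies) - coverage(test_concepts, complete_frequencies)
```

In this toy setting the test ontology is penalized for including a concept the corpus never mentions and for omitting the most frequent one. The actual metrics should be taken from Materials and Methods rather than from this sketch.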