Table 1.
Target class distribution of the 8,066 patents from which the final set was drawn.
Figure 1.
Example patent text with pre-annotations as shown by the Brat annotation tool.
Table 2.
Number of annotated terms and unique terms within the harmonized set prior to disambiguation.
Table 3.
Inter-annotator agreement (F-score) without ambiguity resolution.
Table 4.
The effect of the disambiguation process on the annotations.
Table 5.
Inter-annotator agreement after ambiguity resolution.
Table 6.
Inter-annotator agreement (F-score) between the harmonized set and the annotator groups for the main entity types.
Table 7.
Number of annotated terms and unique terms in the harmonized set and in the full patent set of the gold standard corpus after disambiguation.