Figure 1.
An example of a full-text biomedical article (PMID 12808147) with author-identified links between sentences in the abstract and figures and tables in the body of the article.
Abstract sentences are shown in different colors. Arrows denote the annotated associations and arrow colors correspond to sentence color. To save space, figure captions are truncated and Fig. 2, which is not linked with any sentence, is not shown. (Figures republished with permission from [32], Copyright (2003) National Academy of Sciences, U.S.A.).
Figure 2.
Recall-precision curves for three LMs and the baseline.
The (Fixed Size, Mixture) model is our CompleteLM. The filled circles denote locations of the points.
Table 1.
Performance measures of text-only models.
Figure 3.
Empirical cumulative distribution functions of sentence length for four collections of instances: all 5406 instances, all 947 linked instances, and the top 826 scoring instances from (Fixed Size, Mixture) and (Variable Size, Mixture) language models.
Figure 4.
Results of permutation tests showing article effects on two performance measures: area under the ROC curve (left) and precision (right).
Blue and magenta points show actual performance values for the CompleteLM model calculated with the whole-corpus and per-article methods, respectively. The red line shows a normalized histogram of per-article performance for 1000 random permutations of the associations between articles and abstract sentence/figure instances.
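The permutation procedure described in this caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the choice of within-article precision as the per-article measure, and the flat list representation of instances are all hypothetical.

```python
import random
from collections import defaultdict


def per_article_precision(article_ids, labels, predicted):
    """Mean over articles of the precision computed within each article.

    Assumption: articles with no predicted-linked instances are skipped.
    """
    true_pos = defaultdict(int)
    pred_pos = defaultdict(int)
    for art, y, p in zip(article_ids, labels, predicted):
        if p:
            pred_pos[art] += 1
            true_pos[art] += int(y)
    values = [true_pos[a] / pred_pos[a] for a in pred_pos]
    return sum(values) / len(values) if values else 0.0


def permutation_null(article_ids, labels, predicted, n_perm=1000, seed=0):
    """Null distribution obtained by shuffling which article each instance belongs to."""
    rng = random.Random(seed)
    shuffled = list(article_ids)
    null = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the article/instance association
        null.append(per_article_precision(shuffled, labels, predicted))
    return null
```

Comparing the actual per-article value against a histogram of the returned list gives a picture analogous to the one in the figure: an actual value far in the tail of the null distribution suggests an article effect.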
Table 2.
Percent information gain of non-text features.
Figure 5.
Whole-corpus recall-precision curves.
The solid dots indicate the recall-precision point at which the number of predicted linked instances is equal to the total number of abstract sentences in the corpus.
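The operating point marked by the solid dots can be illustrated with a short sketch. It assumes that instances are ranked by a model score and that the top k are predicted as linked, with k set to the number of abstract sentences in the corpus; the function name and arguments are hypothetical.

```python
def precision_recall_at_k(scores, labels, k):
    """Precision and recall when the k highest-scoring instances are predicted as linked."""
    k = min(k, len(scores))
    total_linked = sum(labels)
    if k == 0 or total_linked == 0:
        return 0.0, 0.0
    # Rank instance indices by descending score and keep the top k.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    true_pos = sum(labels[i] for i in ranked[:k])
    return true_pos / k, true_pos / total_linked
```

For example, with scores `[0.9, 0.8, 0.3, 0.1]`, labels `[1, 0, 1, 0]` and `k = 2`, one of the two top-ranked instances is linked, so both precision and recall are 0.5.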
Table 3.
Performance values for models that use combinations of text, positional and linkage features.
Table 4.
Results for 14 articles with human annotations provided by both authors and non-authors, and computational predictions provided by the CRF (SIS) model.
Table 5.
Survey questions and average response values.
Figure 6.
Example graph and linkage matrix representations for an article with four abstract sentences, three figures and four sentence/figure links.
Combinations of linkages that induce crossing edges in the graph representation are less common, as they are out of keeping with the observed tendency for consistent relative ordering among linked instances.
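The two representations and the notion of crossing edges can be made concrete with a small sketch. The links below are hypothetical and are not the ones drawn in the figure; sentences and figures are assumed to be indexed from 0 in document order, and two links are taken to cross when their sentence order and figure order disagree.

```python
from itertools import combinations


def linkage_matrix(n_sentences, n_figures, links):
    """Binary matrix with rows for abstract sentences and columns for figures."""
    matrix = [[0] * n_figures for _ in range(n_sentences)]
    for s, f in links:
        matrix[s][f] = 1
    return matrix


def crossing_pairs(links):
    """Pairs of links whose edges cross in the bipartite graph drawing."""
    return [
        (a, b)
        for a, b in combinations(links, 2)
        # Opposite sign means the sentence order and figure order disagree.
        if (a[0] - b[0]) * (a[1] - b[1]) < 0
    ]


# Hypothetical article: four sentences, three figures, four links.
links = [(0, 0), (1, 0), (2, 2), (3, 1)]
matrix = linkage_matrix(4, 3, links)
crossings = crossing_pairs(links)
```

Here only the links (2, 2) and (3, 1) cross, because the later sentence is linked to the earlier figure.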
Figure 7.
Example HMM (a) and CRF (b) state transition diagrams using the sentences-in-states construction.
(a) States and transitions for the base HMM for a corpus where the maximum number of abstract sentences in an article is 4. The states and transitions in solid blue are part of the derived HMM for an individual article. (b) CRF states and transitions for an individual article, where the maximum number of sentences per figure is 2.