Figure 1.
An Energy-Minimized Graph Generated from the Full-Text Article PMID 11553787
Blue ellipses represent protein term nodes, green ellipses represent point mutation nodes, and orange ellipses represent organism nodes. Gray triangles represent regular words. An edge connects two nodes whose terms or words occur together as a bigram in the text. For this article, a total of 1,052 terms are contained in 2,287 bigrams.
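As a rough illustration of how such a graph relates to the text, the following minimal sketch builds an undirected bigram graph from a short made-up sentence. The function name `bigram_graph`, the simple regex tokenizer, and the choice of undirected edges are assumptions made for illustration only; the legend does not state how Mutation GraB tokenizes text, recognizes multi-word terms, or whether its edges are directed.

```python
import re

def bigram_graph(text):
    """Return (nodes, edges) of an undirected bigram graph for `text`.

    Hypothetical helper: tokenization here is a plain lowercase regex split,
    not the dictionary-based term recognition used in the paper.
    """
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    nodes = set(tokens)
    # Each pair of adjacent tokens is one bigram; frozenset makes (a, b) and
    # (b, a) the same edge, and immediate repeats of a token are skipped.
    edges = {frozenset(pair) for pair in zip(tokens, tokens[1:]) if pair[0] != pair[1]}
    return nodes, edges

# Made-up example sentence, not taken from the article.
nodes, edges = bigram_graph(
    "the mutant receptor binds the ligand and the mutant receptor signals"
)
print(len(nodes), len(edges))  # 7 distinct words, 8 distinct bigrams
```

Because repeated words collapse into a single node, the number of nodes and edges grows more slowly than the length of the text.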
Figure 2.
A General Overview of the Process Flow of Mutation GraB
Table 1.
Protein Family Literature Sets
Table 2.
Protein Family and Dictionary Information
Table 3.
Mutation GraB Performance on the GPCR Literature Sets
Figure 3.
Examining the Precision of the Graph Bigram and Word Distance Metrics across Different Levels of Possible Protein Associations for the GPCR (A), Protein Tyrosine Kinase (B), and Ion Channel Transporter (C) Literature Sets
These data are for the development and validation sets combined. The yellow bars show the number of point mutations counted at each level of possible protein associations (PPA). The solid blue line represents the precision measured for these point mutations using the graph bigram metric, and the dotted red line represents the precision measured using the word distance metric.
Table 4.
Mutation GraB Performance on the Protein Tyrosine Kinase Literature Sets
Table 5.
Mutation GraB Performance on the Ion Channel Transporter Literature Sets
Table 6.
Mutation GraB versus MEMA Performance
Table 7.
Xylanase Literature Set and the Proteins and Point Mutations within It
Table 8.
Mutation GraB versus Mutation Miner Performance
Figure 4.
Example of a Paragraph of Text Evaluated by the Graph Bigram and Word Distance Metrics
(A) Text is taken from a figure label from the article PMID 10889210.
(B) Graph generated by bigram traversal using the graph bigram method. The point mutation terms are in green, protein terms in blue, and regular words in gray.
(C) Table shows the measurements between some selected words in the text using both the word distance and graph bigram metrics. The word distance measurements are below the diagonal, and the graph bigram measurements are above the diagonal. Two different word pairs are examined, {fig, bars} and {alteration, scatchard}.
The {fig, bars} words are shown in red in (A), the path is colored in red in (B), and the metric measurements are highlighted in red in (C). The {alteration, scatchard} pair is highlighted in blue in the same way.
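The contrast between the two metrics in (C) can be sketched in a few lines. This is an illustrative sketch only: it assumes that the word distance is the smallest gap in token positions between two words and that the graph bigram measurement is the fewest bigram edges on a path between their nodes. Neither definition is stated in this legend, the helper names are hypothetical, and the example sentence is made up rather than taken from PMID 10889210.

```python
import re
from collections import defaultdict, deque

def tokenize(text):
    # Simplified lowercase tokenizer (an assumption, not the paper's method).
    return re.findall(r"[a-z0-9]+", text.lower())

def word_distance(tokens, a, b):
    """Smallest difference in token position between any occurrences of a and b."""
    pos_a = [i for i, t in enumerate(tokens) if t == a]
    pos_b = [i for i, t in enumerate(tokens) if t == b]
    if not pos_a or not pos_b:
        return None
    return min(abs(i - j) for i in pos_a for j in pos_b)

def graph_distance(tokens, a, b):
    """Fewest bigram edges between a and b, found by breadth-first search."""
    if a not in tokens or b not in tokens:
        return None
    adj = defaultdict(set)
    for x, y in zip(tokens, tokens[1:]):
        adj[x].add(y)
        adj[y].add(x)
    dist = {a: 0}
    queue = deque([a])
    while queue:
        node = queue.popleft()
        if node == b:
            return dist[node]
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return None  # a and b are not connected

tokens = tokenize("bars show binding of the mutant and the wild type receptor")
print(word_distance(tokens, "mutant", "receptor"))   # 5
print(graph_distance(tokens, "mutant", "receptor"))  # 4
```

In this toy sentence the repeated word "the" is a single node, so the graph path from "mutant" to "receptor" skips "and" and is one step shorter than the linear word distance. Under the stated assumptions, this is the kind of difference the two halves of the table in (C) compare.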
Table 9.
Mutation GraB Performance on All Protein Family Literature Sets with and without Image Mutations Using the Graph Bigram Metric