Figure 1.
Processing steps for visualizing a large corpus of medical literature based on the self-organizing map method.
The figure also references processing steps taken for the study by Boyack et al. (2011), which was centered on cluster quality.
Figure 2.
Parallelized batch training of the SOM, with 225 parallel processes.
Only the first of a total of 240 sequential batches is shown, with the trained SOM serving as input to the subsequent batch.
Table 1.
Statistics of label placement for the top five term dominance levels.
Figure 3.
Zoomed-out view of the complete map of medical literature, plus detailed views of several regions.
Contents and design as presented to domain experts for qualitative evaluation.
Table 2.
The ten most frequent terms in the input data set, including the depth levels at which each term appears in the MeSH hierarchy.
Table 3.
The ten terms occupying the most space in the map.
Table 4.
The ten terms occurring in the largest number of contiguous patches.
Figure 4.
Transect through the term dominance landscape from “Blood Pressure” to “Exercise” via “Obesity”.
Detailed profiles are shown for the first-, second-, and third-ranked terms for all transected neurons, with the line graph indicating the proportion of neuron vector weights accounted for by a particular label term. The first, last, and pivot neurons are highlighted (see also Table 5).
Figure 5.
Term rank transitions along the 36 neurons transected in Figure 4.
Included are all terms that make it to the first, second, or third term dominance rank in any neuron along the transect.
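The ranking behind Figures 4 and 5 can be illustrated with a minimal sketch. It assumes that term dominance is each term's share of a neuron's total vector weight, which is how the Figure 4 caption describes the line graph. The function names (`term_dominance`, `terms_reaching_top`) and the toy weights are hypothetical and are not taken from the study.

```python
import numpy as np

def term_dominance(weights, terms, top_k=3):
    """For each neuron, return its top_k (term, proportion) pairs.

    Assumption: dominance is a term's share of the neuron's total weight.
    """
    weights = np.asarray(weights, dtype=float)
    totals = weights.sum(axis=1, keepdims=True)
    # Share of each neuron's total weight accounted for by each term;
    # neurons with zero total weight get all-zero proportions.
    proportions = np.divide(
        weights, totals, out=np.zeros_like(weights), where=totals > 0
    )
    # Term indices per neuron, highest proportion first; ties keep term order.
    order = np.argsort(-proportions, axis=1, kind="stable")[:, :top_k]
    return [
        [(terms[j], float(proportions[i, j])) for j in row]
        for i, row in enumerate(order)
    ]

def terms_reaching_top(ranked):
    """Set of terms that reach one of the top ranks in any neuron."""
    return {term for neuron in ranked for term, _ in neuron}

# Hypothetical toy transect: three neurons, four terms (illustrative values).
terms = ["Blood Pressure", "Obesity", "Exercise", "Diet"]
transect = [
    [6.0, 3.0, 1.0, 0.0],
    [2.0, 5.0, 2.0, 1.0],
    [1.0, 2.0, 6.0, 1.0],
]
ranked = term_dominance(transect, terms)
top_terms = terms_reaching_top(ranked)
```

In this toy example the first-ranked term changes from "Blood Pressure" to "Obesity" to "Exercise" along the transect, while "Diet" never reaches the top three and is therefore left out of `top_terms`.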
Table 5.
Top ten terms for four neurons along the transect in Figure 4.
Figure 6.
Base map function of the SOM demonstrated with an overlay of all articles containing MeSH terms “Food Habits” and “Schools”.
Larger circles indicate neurons with a larger number of matching articles. Note the split into three main regions, each visualized at finer scale on the right.
Figure 7.
Digitization of markings made independently by ten subject experts on poster-size map printouts.
Labels were either explicitly stated by experts or extracted from encircled areas of the map. Subjects can be roughly categorized into Science of Science and information science researchers (a, b, c), science analysts (d, e, f), and biomedical domain experts (g, h, i, j).
Figure 8.
Geometric zooming versus semantic zooming.
Juxtaposed are examples of geometric zooming into the static display of multiple levels optimized for preventing label overlaps (top) versus semantic zooming with successive revealing of lower levels of term dominance (bottom).