Visualizing the Topical Structure of the Medical Sciences: A Self-Organizing Map Approach

doi:10.1371/journal.pone.0058779

Figure 1.

Processing steps for visualizing a large corpus of medical literature based on the self-organizing map method.

The figure also references processing steps taken for the study by Boyack et al. (2011), which was centered on cluster quality.

Figure 2.

Parallelized batch training of the SOM, with 225 parallel processes.

Only the first of a total of 240 sequential batches is shown; the SOM trained in each batch serves as input to the subsequent batch.
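
The interplay of parallel processes and sequential batches can be illustrated with a minimal sketch of a generic batch SOM update. This is not the authors' implementation: the function names (`partial_sums`, `train_batch`), the Gaussian neighborhood, the tiny 3 x 3 grid, and the random data are all assumptions made for illustration. Each chunk yields partial sums that could be computed in a separate process, and the weights produced by one batch are passed on as the starting point of the next.

```python
import numpy as np

def partial_sums(weights, grid, chunk, sigma):
    """Partial numerator and denominator of a batch SOM update for one chunk.

    weights: (n_neurons, dim), grid: (n_neurons, 2), chunk: (n_docs, dim).
    """
    # Squared distance from every document to every neuron, then best match.
    d2 = ((chunk[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    bmu = d2.argmin(axis=1)
    # Gaussian neighborhood (an assumed kernel) around each best-matching neuron.
    g2 = ((grid[bmu][:, None, :] - grid[None, :, :]) ** 2).sum(axis=2)
    h = np.exp(-g2 / (2.0 * sigma ** 2))
    return h.T @ chunk, h.sum(axis=0)

def train_batch(weights, grid, chunks, sigma):
    """One batch update; every chunk could be handled by its own process."""
    num = np.zeros_like(weights)
    den = np.zeros(len(weights))
    for chunk in chunks:
        n, d = partial_sums(weights, grid, chunk, sigma)
        num += n
        den += d
    return num / den[:, None]

# Hypothetical toy setup: 9 neurons, 4-dimensional document vectors.
rng = np.random.default_rng(0)
grid = np.array([(r, c) for r in range(3) for c in range(3)], dtype=float)
weights = rng.random((9, 4))
data = rng.random((60, 4))

# Sequential batches: the SOM from one batch is the input to the next.
for batch in np.array_split(data, 3):
    weights = train_batch(weights, grid, np.array_split(batch, 4), sigma=1.0)
```

Because the partial sums are simply added, splitting a batch into more or fewer chunks does not change the result beyond floating-point rounding, which is what makes this kind of update straightforward to parallelize.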

Table 1.

Statistics of label placement for top five term dominance levels.

Figure 3.

Zoomed-out view of the complete map of medical literature, plus detailed views of several regions.

Contents and design as presented to domain experts for qualitative evaluation.

Table 2.

The ten most frequent terms in the input data set, including the depth levels at which each term appears in the MeSH hierarchy.

Table 3.

The ten terms occupying the most space in the map.

Table 4.

The ten terms occurring in the largest number of contiguous patches.

Figure 4.

Transect through the term dominance landscape from “Blood Pressure” to “Exercise” via “Obesity”.

Detailed profiles are shown for the first-, second-, and third-ranked terms for all transected neurons, with the line graph indicating the proportion of neuron vector weights accounted for by a particular label term. The first, last, and pivot neurons are highlighted (see also Table 5).
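
The proportion plotted in the line graph can be sketched as follows, assuming that a term's share is its weight divided by the sum of all weights in the neuron vector. The function name `term_dominance`, the four-term vocabulary, and the weight values are hypothetical and serve only as an illustration.

```python
def term_dominance(weights, terms, top_k=3):
    """Return the top_k (term, share) pairs for one neuron vector.

    share = term weight / sum of all weights in the vector (an assumed definition).
    """
    total = sum(weights)
    if total <= 0:
        return []
    # Rank terms by weight, highest first.
    ranked = sorted(zip(terms, weights), key=lambda tw: tw[1], reverse=True)
    return [(term, w / total) for term, w in ranked[:top_k]]

# Hypothetical neuron vector over a four-term vocabulary.
terms = ["Blood Pressure", "Obesity", "Exercise", "Diet"]
neuron = [8.0, 4.0, 3.0, 1.0]

ranked = term_dominance(neuron, terms)
for rank, (term, share) in enumerate(ranked, start=1):
    print(rank, term, share)
```

Applying the same function to each neuron along a transect would give one profile per dominance rank, which is the kind of data the line graph summarizes.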

Figure 5.

Term rank transitions along the 36 neurons transected in Figure 4.

Included are all terms that make it to the first, second, or third term dominance rank in any neuron along the transect.

Table 5.

Top ten terms for four neurons along the transect in Figure 4.

Figure 6.

Base map function of the SOM demonstrated with an overlay of all articles containing MeSH terms "Food Habits" and "Schools".

Larger circles indicate neurons with a larger number of matching articles. Note the split into three main regions, each visualized at finer scale on the right.

Figure 7.

Digitization of markings made independently by ten subject experts on poster-size map printouts.

Labels were either explicitly stated by experts or extracted from encircled areas of the map. Subjects can be roughly categorized into Science of Science and information science researchers (a, b, c), science analysts (d, e, f), and biomedical domain experts (g, h, i, j).

Figure 8.

Geometric zooming versus semantic zooming.

Juxtaposed are examples of geometric zooming into the static display of multiple levels optimized for preventing label overlaps (top) versus semantic zooming with successive revealing of lower levels of term dominance (bottom).
