Fig 1.

Workflow and architecture of the computational processing framework.

Spouts (tap symbol) emit data (here: web pages), bolts (lightning symbol) process data (i.e. term statistics, readability metrics, vocabulary-based text difficulty, storing results). SVM: Support Vector Machine, R: Readability Metrics.

Fig 2.

Workflow of the processing steps and software components for topic modeling: (1) text material is retrieved from a central relational database; (2) several processing threads perform a collection of pre-processing tasks; (3) LDA is applied to the resulting document vectors. The software takes raw text material as input and outputs n topics, where n is a user-defined input parameter to LDA.
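Step (2) of this workflow can be illustrated with a minimal sketch, assuming a simple bag-of-words representation. The stop-word list, the tokenizer, and the function names are hypothetical and are not taken from the article; database retrieval (step 1) and LDA itself (step 3) are left out.

```python
import re
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stop-word list; the article's actual list is not given here.
STOP_WORDS = {"der", "die", "das", "und", "in", "bei"}


def preprocess(text):
    """Lowercase, tokenize, and drop stop words and very short tokens."""
    tokens = re.findall(r"[a-zäöüß]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS and len(t) > 2]


def to_document_vectors(texts, workers=4):
    """Pre-process texts in parallel threads and build term-count vectors."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        token_lists = list(pool.map(preprocess, texts))
    # One shared, sorted vocabulary defines the vector dimensions.
    vocabulary = sorted({t for tokens in token_lists for t in tokens})
    index = {term: i for i, term in enumerate(vocabulary)}
    vectors = []
    for tokens in token_lists:
        row = [0] * len(vocabulary)
        for term, count in Counter(tokens).items():
            row[index[term]] = count
        vectors.append(row)
    return vocabulary, vectors


# Toy input standing in for raw text material from the database.
texts = [
    "Die Grippe und das Fieber bei Kindern",
    "Fieber senken: Hausmittel bei Fieber",
]
vocabulary, vectors = to_document_vectors(texts)
```

The resulting count vectors are the kind of input a topic model such as LDA expects, together with the user-defined number of topics n.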

Fig 3.

Size-rank plot of degree distribution of the host-aggregated sGHW graph.

Table 1.

Domains of 25 top-ranked web sites for ccTLD “.de” with their respective information provider according to PageRank.

Table 2.

Domains of 25 top-ranked web sites for ccTLD “.at” with their respective information provider according to PageRank.

Table 3.

Domains of 25 top-ranked web sites for ccTLD “.ch” with their respective information provider according to PageRank.

Table 4.

Mapping of readability and vocabulary scales to difficulty classes according to Wiesner et al. [25]: VE, very easy; E, easy; M, moderate; D, difficult; VD, very difficult.

Fig 4.

Distribution of readability values on the Flesch Reading Ease scale for each ccTLD (“.de”, “.at”, “.ch”).

Difficulty is indicated by color, with dark green as the highest readability (90–100) and dark red as the lowest readability (0–10). Note: For consistency reasons, the x axis is reversed and ranges from 100 to 0.
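The banding described in this caption can be sketched as follows. This is an illustrative assumption about how scores fall into ten-point bands, not the authors' plotting code; the function names are hypothetical, and clamping out-of-range scores to 0–100 is an assumption made only for this sketch.

```python
def fre_band(score):
    """Return the ten-point band (low, high) for a Flesch Reading Ease score."""
    # Assumption: scores outside 0-100 are clamped into the plotted range.
    clamped = max(0.0, min(100.0, score))
    # A score of exactly 100 belongs to the top band (90-100).
    low = min(int(clamped // 10) * 10, 90)
    return low, low + 10


def axis_order(bands):
    """Order bands from most readable (90-100) to least readable (0-10)."""
    return sorted(set(bands), reverse=True)


bands = [fre_band(s) for s in (95.2, 4.0, 50.0, 100.0)]
ordered = axis_order(bands)
```

Sorting the bands in descending order mirrors an x axis that runs from 100 down to 0.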

Fig 5.

Distribution of readability values on the Vienna formula scale for each ccTLD (“.de”, “.at”, “.ch”).

Difficulty is indicated by color, with dark green as the highest readability (4–5) and dark red as the lowest readability (14–15).

Fig 6.

Distribution of achieved vocabulary values on the SVM classification scale L for each ccTLD (“.de”, “.at”, “.ch”).

Difficulty is indicated by color, with dark green as the most layman-friendly level (1) and dark red as the highest expert level required (10). SVM: support vector machine.

Fig 7.

Scatter plot of the distributions for FRE, WSTF and L for each ccTLD.

Fig 8.

Perplexity score per number of topics for the 3,747,055 health-related web pages that belong to the 1,000 top-ranked web sites of each of the three ccTLDs in the sGHW.
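For orientation, perplexity is commonly defined as the exponential of the negative per-token log-likelihood of the documents under the model. The sketch below assumes that standard definition; the function name and the toy numbers are hypothetical, and the exact evaluation procedure used for the figure is not stated in this caption.

```python
import math


def perplexity(doc_log_likelihoods, doc_token_counts):
    """Compute exp(-sum of log-likelihoods / total number of tokens)."""
    total_tokens = sum(doc_token_counts)
    if total_tokens == 0:
        raise ValueError("at least one token is required")
    return math.exp(-sum(doc_log_likelihoods) / total_tokens)


# Toy check: a model that assigns probability 1/4 to every token
# has a perplexity of 4, regardless of document lengths.
token_counts = [3, 5]
log_likelihoods = [n * math.log(0.25) for n in token_counts]
score = perplexity(log_likelihoods, token_counts)
```

Lower values indicate that the model predicts the text better, which is why such a score is typically compared across candidate numbers of topics.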

Table 5.

The 50 topics that were identified from the web pages of the top 1000 web sites for each ccTLD.

The sample terms were ordered based on their relevance to the topic.

Fig 9.

Theme distribution per information provider type for the ccTLD “.de”.

Information provider types: GPH: Government, Public Institution or Public Health, NPO: Non-Profit Organization, PO: Private Organization, M: Mainstream or Local News, PC: Pharmaceutical Company, PB: Private Blog, O: Other.

Fig 10.

Theme distribution per information provider type for the ccTLD “.at”.

Information provider types: GPH: Government, Public Institution or Public Health, NPO: Non-Profit Organization, PO: Private Organization, M: Mainstream or Local News, PC: Pharmaceutical Company, PB: Private Blog, O: Other.

Fig 11.

Theme distribution per information provider type for the ccTLD “.ch”.

Information provider types: GPH: Government, Public Institution or Public Health, NPO: Non-Profit Organization, PO: Private Organization, M: Mainstream or Local News, PC: Pharmaceutical Company, PB: Private Blog, O: Other.

Fig 12.

Theme distribution per ccTLD.
