Table 1.
GDELT features included in the analysis, with short descriptions.
Fig 1.
Overview of data aggregation process.
Overview of process to aggregate GDELT data from article- to outlet-level instances, containing themes and their respective average GDELT features.
Fig 2.
Frequency of political lean classes per dataset, on a logarithmic scale.
Fig 3.
Overview of experiments used to test models.
To test the impact of different bias-related data, models were trained on subsets of the data: traditional bias data (features related to tone, polarity, activity, and self/group reference density); alternative bias data (features such as word and article counts, and image or video presence); and the combination of these, the full bias data. An additional experiment tested model performance on the full dataset when supplemented with categorical features from the MBFC data.
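The subset experiments above amount to selecting different column groups from the same feature matrix before training. A minimal sketch of that selection step, using hypothetical feature names (the actual GDELT feature set differs):

```python
import numpy as np

# Hypothetical feature names; the paper's real GDELT feature set differs.
FEATURES = ["tone", "polarity", "activity", "self_ref_density",
            "group_ref_density", "word_count", "article_count",
            "has_images", "has_videos"]

# Feature subsets mirroring the three experiments described above.
SUBSETS = {
    "traditional": ["tone", "polarity", "activity",
                    "self_ref_density", "group_ref_density"],
    "alternative": ["word_count", "article_count",
                    "has_images", "has_videos"],
}
SUBSETS["full"] = SUBSETS["traditional"] + SUBSETS["alternative"]

def select(X, names):
    """Return the columns of X corresponding to the given feature names."""
    idx = [FEATURES.index(n) for n in names]
    return X[:, idx]

rng = np.random.default_rng(0)
X = rng.normal(size=(6, len(FEATURES)))  # toy outlet-level instances
for name, cols in SUBSETS.items():
    print(name, select(X, cols).shape)
```

Each experiment would then train and evaluate the same model class on one of these column subsets, so that performance differences are attributable to the feature group alone.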
Fig 4.
Confusion matrices of the predictions by the best-performing models per task.
Table 2.
Model results per experiment.
Table 3.
Output examples. Examples of domains with corresponding predictions and ground truths. Predictions were made using the best-performing NN model.
Fig 5.
Decision plot of Breitbart, a right-wing political news source.
The twenty most influential features are plotted in descending order of influence. The range at the top of the graph represents the political bias labels as predicted by the model.
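A SHAP decision plot traces the cumulative sum of feature contributions from the model's base value (its expected output) to its final prediction, with features ordered by importance. A minimal NumPy sketch of that accumulation, using made-up SHAP values rather than the paper's model:

```python
import numpy as np

# Illustrative SHAP values for one outlet; names and values are invented.
features = np.array(["tone", "polarity", "activity", "word_count"])
shap_vals = np.array([0.30, -0.10, 0.15, 0.05])
base_value = 0.40  # model's expected output over the background data

# A decision plot orders features by |SHAP value| and accumulates them
# from the base value up to the final model output.
order = np.argsort(-np.abs(shap_vals))
path = base_value + np.cumsum(shap_vals[order])
for f, p in zip(features[order], path):
    print(f"{f:12s} -> {p:.2f}")
print("model output:", round(path[-1], 2))
```

The plotted line in Figs 5 through 10 is this cumulative path; the final point is the model's prediction for the outlet.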
Fig 6.
Decision plot of Forbes, a right-leaning political news source.
Fig 7.
Decision plot of The Economist, a centre-leaning political news source.
Fig 8.
Decision plot of The Guardian, a left-leaning political news source.
Fig 9.
Decision plot of CNN, a left-wing political news source.
Fig 10.
SHAP decision plot of a misclassified web domain.
Example of a misclassified web domain, theconservativetreehouse.com, a right-wing domain that the model incorrectly classified as left-leaning.
Fig 11.
PABS and MBFC label agreement.
A confusion matrix comparison of MBFC labels with those of PABS.
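The agreement matrix in Fig 11 cross-tabulates the two label sets: each cell counts outlets that MBFC places in one class and PABS in another. A small stdlib sketch with illustrative labels (not the paper's data):

```python
from collections import Counter

# Illustrative labels only, not the study's actual annotations.
classes = ["left", "centre", "right"]
mbfc = ["left", "left", "centre", "right", "right", "centre"]
pabs = ["left", "centre", "centre", "right", "left", "centre"]

# Confusion matrix: rows = MBFC label, columns = PABS label.
counts = Counter(zip(mbfc, pabs))
matrix = [[counts[(m, p)] for p in classes] for m in classes]

# Overall agreement is the fraction of outlets on the matrix diagonal.
agreement = sum(m == p for m, p in zip(mbfc, pabs)) / len(mbfc)
for name, row in zip(classes, matrix):
    print(name, row)
print("agreement:", round(agreement, 2))
```

Off-diagonal cells identify the class pairs where the two labeling schemes disagree most often.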