Anatomy of an online misinformation network

doi:10.1371/journal.pone.0196087

Fig 1.

Verification based on a sample of 50 articles.

We excluded six articles with no factual claim. Articles that could not be verified are grouped with misinformation.

Fig 2.

Hoaxy system architecture.

Fig 3.

Screenshots from the user interface of Hoaxy: (a) the user enters a query in the search engine interface; (b) from the list of results, the user selects articles to visualize from low-credibility (purple) and/or fact-checking (orange) sources; (c) a detail from the interactive network diffusion visualization for the query “three million votes aliens”.

Edge colors represent the type of information exchanged. The network shown here displays strong polarization between articles from low-credibility and fact-checking sources, which is typical.

Fig 4.

Usage of Hoaxy in terms of daily volume of queries since the launch of the public Web tool in December 2016.

The two most frequent search terms are shown next to some of the main peaks of user activity.

Table 1.

Summary of the data used in the network analysis.

E_f is the set of edges labeled as ‘fact-check’.

Fig 5.

Fraction of retweets in k-core graph that link to fact-checking vs. core number k.

Fig 6.

k-core decomposition of the pre-Election retweet network collected by Hoaxy.

Panels (a)-(d) show four different cores for k = 5, 15, 25, and 50, respectively. Networks are visualized using a force-directed layout. Edge colors represent the type of article source: orange for fact-checking and purple for low-credibility. The innermost subgraph (d), in which every node has degree at least k = 50, corresponds to the main core. The heat maps show, for each core, the distribution of accounts in the space represented by two coordinates: the retweet ratio ρ_in and the fact-checking ratio ρ_f (see text).
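
To make the notion of k-core, k-shell, and main core used in these captions concrete, the following is a minimal sketch of the standard peeling procedure in plain Python. It assumes the retweet network is treated as an undirected, unweighted simple graph; the function names and the toy edge list are illustrative and do not come from Hoaxy or the article.

```python
from collections import defaultdict

def core_numbers(edges):
    """Return {node: core number} for an undirected simple graph.

    Peeling: repeatedly remove a node of minimum remaining degree.
    A node's core number is the largest minimum degree seen up to
    the moment it is removed.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops (assumption: self-retweets are dropped)
            adj[u].add(v)
            adj[v].add(u)
    degree = {n: len(nbrs) for n, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        node = min(remaining, key=degree.get)
        k = max(k, degree[node])
        core[node] = k
        remaining.remove(node)
        for nbr in adj[node]:
            if nbr in remaining:
                degree[nbr] -= 1
    return core

def k_core(core, k):
    """Nodes whose core number is at least k."""
    return {n for n, c in core.items() if c >= k}

def k_shell(core, k):
    """Nodes whose core number is exactly k."""
    return {n for n, c in core.items() if c == k}

# Toy graph: a 4-clique (a, b, c, d) with a two-node tail (e, f).
toy_edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"),
             ("b", "d"), ("c", "d"), ("a", "e"), ("e", "f")]
toy_core = core_numbers(toy_edges)
main_k = max(toy_core.values())
main_core = k_core(toy_core, main_k)
```

In the toy graph the clique forms the main core, and the tail nodes fall into the 1-shell. The quadratic minimum search is kept for readability; a bucket-based implementation would be preferable on a network of realistic size.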

Fig 7.

Average fact-checking ratio as a function of the shell number k for activities of both primary spreading (‘out’) and secondary spreading (‘in’).

Error bars represent standard error.

Table 2.

Sample of tweets with fact-checking content published by accounts in the main core of the misinformation network.

Fig 8.

Left: Change of main core number k with the evolution of the network. A rolling window of one week is applied to filter fluctuations. The shuffled version is obtained by sampling from the configuration model. This is repeated many times to obtain the 95% confidence interval shown in orange. The inset shows the size of the main core over time. Right: Churn rate (relative monthly change) of accounts in the main core.
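
As an illustration of the shuffled baseline mentioned in this caption, here is a minimal sketch of one way to draw a sample from a directed configuration model: keep every edge's source and randomly permute the targets. This preserves each account's in-degree and out-degree but may create self-loops and repeated edges. The function name, the toy edge list, and the choice of this particular sampling scheme are assumptions for illustration, not the procedure documented in the article.

```python
import random
from collections import Counter

def shuffle_targets(edges, rng):
    """Return a degree-preserving randomization of a directed edge list.

    Sources stay in place and targets are permuted, so every node keeps
    its out-degree (appearances as source) and its in-degree
    (appearances as target).
    """
    sources = [s for s, _ in edges]
    targets = [t for _, t in edges]
    rng.shuffle(targets)
    return list(zip(sources, targets))

# Toy directed retweet edges (hypothetical account names).
toy_directed = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("d", "a")]
shuffled = shuffle_targets(toy_directed, random.Random(0))

out_deg_before = Counter(s for s, _ in toy_directed)
out_deg_after = Counter(s for s, _ in shuffled)
in_deg_before = Counter(t for _, t in toy_directed)
in_deg_after = Counter(t for _, t in shuffled)
```

Repeating the shuffle many times and recomputing the main core number on each sample would give an empirical distribution from which a confidence interval can be read.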

Fig 9.

Retweet network of the stable main core of spreaders of articles from low-credibility sources.

Filtering by in-degree was applied to focus on the 34 accounts that retweet the most other accounts in the core. Node size represents out-degree (number of retweeters) and node color represents in-degree.

Fig 10.

Average bot score for a random sample of accounts drawn from different k-shells of the pre-Election Day retweet network, as a function of k.

Only retweets including links to sources of misinformation are considered. Error bars represent standard errors.

Fig 11.

Left: Distribution of s_in and s_out. Right: The average rank of users in the main core according to each centrality metric. Error bars represent standard errors.

Table 3.

The top ten central accounts, ranked in descending order of centrality, in the article network before the 2016 Election.

Rankings are based on four different centrality metrics.

Table 4.

Annotation of central users.

For categorical questions (1–3), the most frequent label and its frequency are reported. The question about article sharing frequency (4) was on a 5-point Likert scale; we report the mean and standard deviation of the answers.

Fig 12.

Left: Fraction of the retweets remaining vs. number of spreaders disconnected in the network. Right: Fraction of unique article links remaining vs. number of spreaders disconnected in the network. The priority of disconnected users is determined by ranking on the basis of different centrality metrics.
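
The dismantling experiment described in this caption can be sketched as follows. The example ranks accounts by how often they are retweeted (taken here as the out-strength s_out, which is an assumption about the metric's definition), disconnects them one at a time, and records the fraction of retweets that remain. The function name, the input layout, and the decision to compute the ranking once on the full network are all illustrative choices.

```python
from collections import Counter

def dismantle_curve(retweets, max_removed):
    """Fraction of retweets remaining after disconnecting top spreaders.

    retweets: list of (retweeted_account, retweeting_account, count).
    Returns a list whose i-th entry is the remaining fraction after
    the i highest-ranked accounts have been disconnected.
    """
    total = sum(w for _, _, w in retweets)
    s_out = Counter()
    for src, _, w in retweets:
        s_out[src] += w
    # Rank by descending out-strength; ties broken by account name.
    ranking = [a for a, _ in sorted(s_out.items(), key=lambda kv: (-kv[1], kv[0]))]
    removed = set()
    curve = [1.0]
    for account in ranking[:max_removed]:
        removed.add(account)
        kept = sum(w for s, t, w in retweets
                   if s not in removed and t not in removed)
        curve.append(kept / total)
    return curve

# Toy data with hypothetical accounts: "a" is retweeted most often.
toy_retweets = [("a", "b", 5), ("a", "c", 3), ("b", "c", 1), ("d", "a", 1)]
toy_curve = dismantle_curve(toy_retweets, max_removed=2)
```

Swapping the ranking for another centrality metric only requires replacing the `s_out` computation; the same loop could count unique article links instead of retweets if each edge carried the set of links it spread.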
