Users’ polarisation in dynamic discussion networks: The case of refugee crisis in Sweden

doi:10.1371/journal.pone.0262992

Fig 1.

Number of content-creating users per month.

Notes: The line is fitted to the points showing the number of users who posted at least one message in a given month. Total N of users = 9 300.

Fig 2.

The step-wise analysis of the retrieved data.

Fig 3.

Test statistics on the VADER’s accuracy of measuring the tweet tonality.

Notes: The top figure shows the density (Y-axis) of the sentiment values (X-axis) assigned by VADER [103] to each annotated tweet. The colours represent the categories distinguished by the author who labeled the tweets. The bottom figure shows the overlap between the sentiment values of each stance group distinguished by the annotator. The boxplot also provides summary statistics on the distribution of the sentiment values (X-axis) assigned by VADER [103] in the stance categories found by the annotator. N of tweets = 200. N of sentiment groups = 3 (positive, neutral and negative).

Fig 4.

Top 20 words associated with the tweets with positive and negative tonality.

Notes: Term-category association (TCA) [105] analysis was applied to identify the words associated with negative and positive tonality. The list of top 20 words was machine-translated; the list in the original language is provided in S1 File. The Y-axis shows z-scores for each of the identified terms. The X-axis shows the term count. N of tweets = 678 677.

Fig 5.

Statistics on the frequency (upper-left corner) and length (bottom-left corner) of users’ participation in the discussions, and the frequency of users’ community change (right side).

Notes: The upper-left histogram shows the density of the users (on the Y-axis) who participated in online discussions for the number of months (N_{months, when the user was active}) specified on the X-axis. The bottom-left histogram shows the density of the users (on the Y-axis) whose participation in online discussions spanned the number of months specified on the X-axis (T_last − T_first, where T_last is the last and T_first is the first month of the user's activity). The pie chart on the right-hand side shows the proportion of users (i.e., the area of the pie portion) who stayed in their dynamic communities for the proportion of months specified by the colour of the pie portion; this proportion is expressed in terms of n, the number of communities in which the user was active over the examined period. N of users = 8451. The communities were distinguished using iterative detection and matching [98].
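The two participation measures plotted in the histograms can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `participation_stats` and the input layout (pairs of user id and integer month index) are assumptions made for the example.

```python
from collections import defaultdict


def participation_stats(activity):
    """Return {user: (n_active_months, span_in_months)}.

    `activity` is an iterable of (user_id, month_index) pairs, one pair per
    post; the layout is hypothetical and chosen only for this sketch.
    """
    months_by_user = defaultdict(set)
    for user, month in activity:
        months_by_user[user].add(month)

    stats = {}
    for user, months in months_by_user.items():
        n_active = len(months)             # N_months when the user was active
        span = max(months) - min(months)   # T_last - T_first
        stats[user] = (n_active, span)
    return stats


# Toy data: user "a" posts in months 0, 1 and 5; user "b" posts twice in month 3.
example = [("a", 0), ("a", 1), ("a", 5), ("b", 3), ("b", 3)]
stats = participation_stats(example)
```

Here user "a" is active in 3 distinct months over a span of 5 months, while user "b" is active in 1 month with a span of 0.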

Fig 6.

Users’ participation in Twitter discussions in the 4 biggest dynamic communities.

Notes: The Y-axis shows the users' activity over the examined period (X-axis). Each row represents the activity of one user. N of users = 3162.

Fig 7.

Growth of user communities in time.

Notes: The entities show the dynamics of the community size change. The Y-axis shows the number of users in dynamic communities at each time point. N of time points = 96 (months). N of communities = 722. N of users = 5602. The streamgraph visualisation technique was used to present the growing number of users in the dynamic communities [108]. A streamgraph is a type of stacked area graph in which values are plotted around a varying central baseline [109]. Such a visualisation technique allows examining dynamic changes in the data. The colour palette is used to differentiate between the dynamic (i.e., temporal) communities. The communities were distinguished using iterative detection and matching [98].

Fig 8.

Interactions between the users of the 10 biggest communities over time.

Notes: The entities show the users of the 10 biggest dynamic communities as nodes and connections (i.e., replies and mentions) between those users as arcs at three time points. N of users = 989 users. Groeninger’s radial axis layout within the Gephi environment [110] was used to visualise the network. Here, the nodes are grouped according to their dynamic communities, which are presented as whiskers. Each whisker (i.e., community and the nodes within this community) has its own colour to differentiate between the dynamic communities. The arcs have the colour of the users mentioning or replying to a user from another community.

Fig 9.

Ratio of negative and positive tweets.

Notes: The bars show the ratio of the tweets with negative, positive or neutral sentiment. N of tweets = 686 763.

Fig 10.

Mean, standard deviation and kurtosis of the sentiment values.

Notes: The mean, standard deviation and kurtosis are calculated as follows: $\bar{X}_i = \frac{1}{N_i}\sum_{k=1}^{N_i} X_{ki}$; $\sigma_i = \sqrt{\frac{1}{N_i}\sum_{k=1}^{N_i}\left(X_{ki} - \bar{X}_i\right)^2}$; $\gamma_{2i} = \frac{\frac{1}{N_i}\sum_{k=1}^{N_i}\left(X_{ki} - \bar{X}_i\right)^4}{\sigma_i^4}$, where $\bar{X}_i$ is the mean of the sentiment values at month i, σ_i is the standard deviation of the sentiment values and γ_2i is the kurtosis of the sentiment values, N_i is the number of users at month i and X_ki is the sentiment of user k at month i. The regression line (ggplot2 package [111]) is fitted to the points measuring the mean, standard deviation and kurtosis of the sentiment values at each time point. N of time points = 96 (months). N of tweets = 686 763. N of users = 9 300. Valence Aware Dictionary and sEntiment Reasoner (VADER) [103] was applied to the tweet texts to extract sentiment values.
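A minimal sketch of how the three monthly statistics could be computed from the per-user sentiment values X_ki of one month. The function name `monthly_moments` is hypothetical, and two details are assumptions rather than facts from the caption: the population normalisation (division by N_i rather than N_i − 1) and the use of non-excess kurtosis (subtract 3 to obtain excess kurtosis).

```python
import math


def monthly_moments(values):
    """Return (mean, standard deviation, kurtosis) of one month's values.

    Assumptions of this sketch: population normalisation (divide by N)
    and non-excess kurtosis (fourth central moment over sigma^4).
    """
    n = len(values)
    if n == 0:
        raise ValueError("at least one sentiment value is required")
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / n
    std = math.sqrt(variance)
    if std == 0:
        # Kurtosis is undefined when all values are identical.
        return mean, std, float("nan")
    fourth_moment = sum((x - mean) ** 4 for x in values) / n
    return mean, std, fourth_moment / std ** 4


# A fully two-sided toy month: half of the users at -1, half at +1.
mean, std, kurt = monthly_moments([-1.0, -1.0, 1.0, 1.0])
```

For this evenly split sample the mean is 0, the standard deviation is 1 and the kurtosis is 1, the lowest value non-excess kurtosis can take.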

Fig 11.

Density of the sentiment values at 4 time points.

Notes: Density plots were built using the tools of the ggplot2 package [111]. Stacked density plots show the distribution of the sentiment values within the 10 biggest dynamic communities (four figures at the bottom) and in the whole network (four figures at the top). The figures show the distribution of the sentiment values calculated per user in a given month (i.e., $X_{ki} = \frac{1}{n_{ki}}\sum_{l=1}^{n_{ki}} x_{lki}$, where X_ki is the sentiment of user k at month i, n_ki is the number of tweets written by that user in that month and x_lki is the sentiment of the l-th tweet written by that user). N of time slices = 96 (months). N of dynamic communities = 10. N of tweets in the network = 686 763. N of tweets in the communities = 491 891.
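The per-user monthly sentiment can be sketched as below, assuming X_ki is the arithmetic mean of the sentiment values of the n_ki tweets that user k wrote in month i. The function name `user_month_sentiment` and the tuple layout of the input are hypothetical.

```python
from collections import defaultdict


def user_month_sentiment(tweets):
    """Average tweet sentiment per (user, month).

    `tweets` is an iterable of (user_id, month_index, sentiment) tuples;
    this layout is an assumption made for the sketch.
    """
    totals = defaultdict(lambda: [0.0, 0])  # (user, month) -> [sum, count]
    for user, month, sentiment in tweets:
        acc = totals[(user, month)]
        acc[0] += sentiment
        acc[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


tweets = [
    ("u1", 0, 0.5),
    ("u1", 0, -0.5),
    ("u1", 1, 0.25),
    ("u2", 0, -0.75),
]
per_user = user_month_sentiment(tweets)
```

The resulting dictionary holds one value per user and month, which is the unit the density plots are drawn over.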

Fig 12.

Median of the tweet length over time.

Notes: The line is fitted to the points showing the median tweet length in a given month. The length of a tweet is the number of characters in the tweet text. N of tweets = 686 763. N of time slices = 96 (months).

Fig 13.

Distribution of sentiment values in the tweets of less and more than 200 characters.

Notes: Density plots were built using the tools from the ggplot2 package [111]. Three figures at the top show the distribution of sentiment values in the tweets with fewer than 200 characters. Three figures at the bottom show the distribution of sentiment values in the tweets with ≥ 200 characters. N of time slices = 96 (months). N of tweets = 686 763.

Fig 14.

Mean, standard deviation and kurtosis of sentiment values in the biggest dynamic communities.

Notes: The regression line (ggplot2 package [111]) is fitted to the points measuring the mean, standard deviation and kurtosis of the sentiment values within each dynamic community in the examined time period. N of time slices = 96 (months). N of dynamic communities = 10. N of tweets = 491 891.

Fig 15.

Standard deviation of mean sentiment values in the dynamic communities.

Notes: The line (ggplot2 package [111]) is fitted to the points measuring the standard deviation of the mean sentiment values in all of the dynamic communities. N of time slices = 96 (months). N of dynamic communities = 723. N of tweets = 565 888.

Fig 16.

Ratio of homogeneous edges in the network and ratio of network’s communities with the majority of homophilic relationships.

Notes: The line is fitted to the points measuring the specified proportion for each month. Higher values identify the growing number of connections between the users expressing similar views in the network as a whole and in its dynamic communities. Only those communities that consist of more than 3 users were considered when visualising the proportion of the network's communities with the majority of the relationships being homophilic. N of time periods = 96 (months). Total N of edges = 682 821. N of communities = 70.
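The two proportions can be illustrated with a small sketch. The caption does not define when an edge counts as homophilic, so the rule used here (both users have sentiment of the same sign) is an assumption, as are the function names and the dictionary inputs. The minimum community size filter mentioned in the notes is omitted for brevity.

```python
def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def homophilic_ratio(edges, sentiment):
    """Share of edges whose endpoints have same-signed sentiment (assumed rule)."""
    if not edges:
        return float("nan")
    same = sum(1 for u, v in edges if sign(sentiment[u]) == sign(sentiment[v]))
    return same / len(edges)


def majority_homophilic_share(edges, sentiment, community):
    """Share of communities in which most internal edges are homophilic."""
    internal = {}
    for u, v in edges:
        if community[u] == community[v]:
            internal.setdefault(community[u], []).append((u, v))
    if not internal:
        return float("nan")
    majority = sum(
        1 for comm_edges in internal.values()
        if homophilic_ratio(comm_edges, sentiment) > 0.5
    )
    return majority / len(internal)


# Toy month: two communities, four users, four reply/mention edges.
sentiment = {"a": 0.4, "b": 0.2, "c": -0.3, "d": -0.6}
community = {"a": 1, "b": 1, "c": 2, "d": 2}
edges = [("a", "b"), ("a", "c"), ("c", "d"), ("b", "d")]

edge_ratio = homophilic_ratio(edges, sentiment)
community_share = majority_homophilic_share(edges, sentiment, community)
```

In this toy month half of all edges connect like-minded users, while both communities are internally homophilic, which is the pattern the figure tracks over time.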
