Table 1.
Data sets and how we combined them for our analysis.
Fig 1.
Example accounts for each data set.
Fig 2.
Receiver operating characteristic (ROC) curve for Botometer's universal score (average over 3 months for each account).
The x-axis represents the false positive rate and the y-axis represents the true positive rate (sensitivity).
Table 2.
ROC-AUC and PR-AUC for the universal score.
Fig 3.
Receiver operating characteristic (ROC) curve for Botometer's universal complete automation probability (CAP) (average over 3 months for each account).
The x-axis represents the false positive rate and the y-axis represents the true positive rate (sensitivity).
Table 3.
ROC-AUC and PR-AUC for the universal CAP.
Fig 4.
Universal score precision-recall curves for the resampled data sets.
We assume the population baseline on Twitter (15% bots) for the universal Botometer score. Black points indicate the precision and the recall at a Botometer score of 0.76. For the German politicians and bots data set, the accounts identified as bots contain more humans than real bots at almost every threshold level, i.e., precision is below 0.5. The x-axis represents the recall (sensitivity) and the y-axis represents the precision.
Fig 5.
CAP precision-recall curves for the resampled data sets.
We assume the population baseline on Twitter (15% bots) for the universal Botometer CAP. Black points indicate the precision and the recall at a universal CAP of 0.25. For the German politicians and bots data set, the accounts identified as bots contain more humans than real bots at almost every threshold level, i.e., precision is below 0.5. The x-axis represents the recall (sensitivity) and the y-axis represents the precision.
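How the 15% baseline in the precision-recall figures affects precision can be illustrated with a short calculation. The sketch below is not the resampling procedure used for the figures. It assumes that a classifier's true positive rate and false positive rate at some threshold are known, and computes the expected precision for a given share of bots in the population. The function name and the example rates are hypothetical.

```python
def precision_at_prevalence(tpr, fpr, prevalence=0.15):
    """Expected precision given TPR, FPR and the share of bots in the population.

    Assumption: rates are fixed properties of the classifier at one threshold;
    prevalence=0.15 mirrors the 15% bot baseline named in the captions.
    """
    flagged_bots = prevalence * tpr            # bots correctly flagged
    flagged_humans = (1.0 - prevalence) * fpr  # humans wrongly flagged
    total_flagged = flagged_bots + flagged_humans
    if total_flagged == 0:
        return float("nan")  # nothing flagged, precision undefined
    return flagged_bots / total_flagged


# Hypothetical rates: half of all bots found, 10% of humans misclassified.
example_precision = precision_at_prevalence(tpr=0.5, fpr=0.1)
print(round(example_precision, 3))  # about 0.469, i.e. more humans than bots flagged
```

With these illustrative rates, fewer than half of the flagged accounts are bots even though the false positive rate looks small, because humans make up 85% of the population.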
Table 4.
ROC-AUC as well as the PR-AUC English score.
Table 5.
ROC-AUC as well as the PR-AUC for the English CAP.
Fig 6.
Density plots of the per-account standard deviations (SDs), shown for each group.
Left: Botometer universal score; right: Botometer CAP. A bandwidth of 0.015 was used for both the CAP and the universal scores. The x-axis represents the standard deviation and the y-axis represents the density.
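A density of per-account SDs can be sketched as follows. This is an illustration only: it assumes the sample standard deviation per account and a Gaussian kernel whose standard deviation equals the stated bandwidth, neither of which the caption confirms. The account names and scores are hypothetical.

```python
import math
import statistics


def account_sds(scores_by_account):
    """Sample SD of each account's repeated scores (needs at least 2 scores)."""
    return [statistics.stdev(s) for s in scores_by_account.values() if len(s) >= 2]


def gaussian_kde(samples, grid, bandwidth):
    """Gaussian kernel density estimate evaluated at each grid point."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
        for x in grid
    ]


# Hypothetical repeated scores for three accounts.
scores = {
    "account_a": [0.20, 0.40, 0.30],
    "account_b": [0.10, 0.10, 0.10],
    "account_c": [0.70, 0.75, 0.80],
}
sds = account_sds(scores)
step = 0.001
grid = [-0.2 + i * step for i in range(801)]  # covers -0.2 to 0.6
density = gaussian_kde(sds, grid, bandwidth=0.015)
```

Plotting `density` against `grid` for each group separately would give curves of the kind the caption describes.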
Fig 7.
Percentage of accounts (y-axis) whose scores over the three months fall at least once below and at least once above the threshold, for all thresholds between 0 and 1 in steps of 0.05. The x-axis represents the chosen threshold for the Botometer score. Left: universal score; right: universal CAP.
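The quantity on the y-axis can be computed along the following lines. This is a minimal sketch, assuming an account counts as crossing a threshold when its lowest score lies strictly below the threshold and its highest score lies at or above it. The caption does not specify how ties are handled, and the account names and scores are hypothetical.

```python
def crossing_percentages(scores_by_account, thresholds):
    """For each threshold, the percentage of accounts scored both below and at/above it."""
    n_accounts = len(scores_by_account)
    percentages = []
    for t in thresholds:
        crossed = sum(
            1 for s in scores_by_account.values() if min(s) < t <= max(s)
        )
        percentages.append(100.0 * crossed / n_accounts)
    return percentages


# Thresholds 0, 0.05, ..., 1; built from integers to avoid float drift.
thresholds = [i / 20 for i in range(21)]

# Hypothetical repeated scores for three accounts.
scores = {
    "account_a": [0.10, 0.40, 0.30],
    "account_b": [0.80, 0.85, 0.90],
    "account_c": [0.45, 0.55, 0.50],
}
percentages = crossing_percentages(scores, thresholds)
```

An account whose scores all stay on one side of a threshold never changes classification at that threshold, so it does not contribute to the percentage.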
Fig 8.
Density plots for the different data sets.
Left: Density plots for the different combined data sets in our analysis showing the distribution of Botometer’s universal score. We used the resampled data sets with 15% bots and 85% humans with a total n = 100,000 for each data set. Right: Density plots for the human accounts data sets. Lines indicate the median; a bandwidth of 0.04 was used for all data sets. The x-axis represents the Botometer score and the y-axis represents the density.