The authors have declared that no competing interests exist.
When the threat of COVID-19 became widely acknowledged, many hoped that this pandemic would squash “the anti-vaccine movement”. However, when vaccines started arriving in rich countries at the end of 2020, it appeared that vaccine hesitancy might be an issue even in the context of this major pandemic. Does it mean that the mobilization of vaccine-critical activists on social media is one of the main causes of this reticence to vaccinate against COVID-19? In this paper, we wish to contribute to current work on vaccine hesitancy during the COVID-19 pandemic by looking at one of the many mechanisms which can cause reticence towards vaccines: the capacity of vaccine-critical activists to influence a wider public on social media. We analyze the evolution of debates over the COVID-19 vaccine on the French Twittosphere during the first two years of the pandemic, with particular attention to the spreading capacity of vaccine-critical websites. We address two main questions: 1) Did vaccine-critical contents gain ground during this period? 2) Who were the main actors in the diffusion of these contents? First, while debates over vaccines experienced a tremendous surge during this period, the share of vaccine-critical contents in these debates remained stable except for a limited number of short periods associated with specific events. Second, analyzing the community structure of the retweet hyper-graph, we reconstruct the mesoscale structure of the information flows, identifying and characterizing the major communities of users. We analyze their role in the information ecosystem: the largest right-wing community shows typical echo-chamber behavior, collecting all the vaccine-critical tweets from outside and recirculating them inside the community. The smaller left-wing community is less permeable to vaccine-critical contents but has a large capacity to spread them once adopted.
The COVID-19 pandemic emerged at a tumultuous time for vaccination. The past decade had seen a growing realization among public health and political deciders that reticence towards vaccines is widespread [
When the threat of COVID-19 became widely acknowledged, at the beginning of 2020, many tried to see a silver lining and hoped that this pandemic would squash “the antivaccine movement” [
The pandemic does not seem to have squashed “the antivaccine movement” or vaccine hesitancy. But does it mean that the mobilization of vaccine-critical activists on social media is one of the main causes of this reticence to vaccinate against COVID-19? The question of causes is a complex one when it comes to attitudes towards vaccines. They are volatile, complex and highly context-dependent [
In this paper, we wish to contribute to current work on vaccine hesitancy during the COVID-19 pandemic by looking at one of the many mechanisms which can cause reticence towards vaccines. We investigate the evolution of vaccine-critical activists’ capacity to influence a wider public on social media. Our work focuses on contents produced on the French-speaking segment of the social media Twitter between March 2020 and October 2021. We draw on a cartography of the main French and francophone vaccine-critical actors and their websites conducted before the epidemic [
Finally, we show that, despite the high activity of vaccine critics, their place in discussions on vaccines has remained relatively constant across the period and very limited compared to mainstream media. Some events allowed them to reach a wider public, including the intense public debate over the efficacy of hydroxychloroquine as a treatment for COVID-19 that arose in March 2020 and the release in November 2020 of the vaccine-critical documentary “Hold-Up”. But overall, their sphere of influence has mainly been restricted to two communities. The largest one was composed of far-right conspiracy theorists and was rather closed in on itself. The other, much smaller in terms of number of users, was composed of far-left actors and was somewhat more capable of transmitting vaccine-critical contents to a wider public.
Our study combines three independent tweet collections, obtained through a combination of the streaming and search APIs (data collection is performed in real time through the streaming APIs and backward, every week, through the search APIs using [
We merged the datasets filtering each of them with the combined set of query keywords of DataVac and DataCov. After deduplication, our dataset contained 3M tweets, 10M retweets and 840k users. We named this dataset DataCovVac. This is the large-scale dataset on which we build the retweet network and the mesoscale community structure as described hereafter.
In this study we analyze the information flow regarding vaccines and its sources; for this reason, we further filtered the tweets to the subset of posts containing URLs pointing to external websites or blogs. We started from two lists of URLs: the first from [
In order to assess the temporal evolution of users’ engagement on Twitter, we take inspiration from compartmental models. At any time each user can be in any of two states:
In this model (inspired by the SIS model in epidemiology [
Analogously, the number of engaged users decreases with time as they become bored with the discussion or change their mind on the issue:
The global dynamical system obeys the differential equation:
Since we can compute directly from the dataset the quantities
The ratio between
The above can be calculated for the news media URLs as well.
In the present work we select an engagement window of
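As an illustration of the engagement bookkeeping described above, the following minimal Python sketch counts, for each day, the users who shared a tracked URL within a sliding window, together with the users who enter or leave the engaged state. The function names, the `(user, day)` event format and the window value of 3 days used in the example are assumptions made for illustration; they are not taken from the authors' code.

```python
from datetime import date, timedelta


def engaged_users_per_day(events, window_days):
    """Map each day to the set of users with at least one share in the window.

    `events` is a list of (user_id, day) pairs, one per tweet or retweet that
    contains a tracked URL. A user is counted as engaged on the day of a share
    and on the following `window_days - 1` days.
    """
    events = list(events)
    if not events:
        return {}
    first = min(day for _, day in events)
    last = max(day for _, day in events)
    engaged = {
        first + timedelta(days=i): set()
        for i in range((last - first).days + 1)
    }
    for user, day in events:
        for offset in range(window_days):
            current = day + timedelta(days=offset)
            if current in engaged:
                engaged[current].add(user)
    return engaged


def engagement_flows(engaged):
    """Per day: number of engaged users, newly engaged users, and users lost."""
    flows = {}
    previous = set()
    for day in sorted(engaged):
        current = engaged[day]
        flows[day] = {
            "engaged": len(current),
            "new": len(current - previous),
            "lost": len(previous - current),
        }
        previous = current
    return flows


# Toy data (hypothetical users and dates).
events = [
    ("a", date(2020, 11, 9)),
    ("b", date(2020, 11, 9)),
    ("a", date(2020, 11, 10)),
    ("c", date(2020, 11, 13)),
]
flows = engagement_flows(engaged_users_per_day(events, window_days=3))
```

The daily counts of newly engaged and lost users are the empirical quantities from which rates of the compartmental model could then be estimated.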
On Twitter, a discussion is usually triggered by one user posting some content, followed by an avalanche of other users retweeting it and helping its diffusion on the social network. The (possibly weighted) retweet network describes how often two users come into contact through the action of retweeting each other. While this model may capture the structure of user interactions, it fails to distinguish different cascade structures, such as users with a high number of singly retweeted tweets and users with few highly retweeted tweets. Further, the directed nature of the retweeting process is not reflected in the retweet network [
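The limitation of the pairwise retweet network can be made concrete with a toy example. The sketch below (hypothetical user and tweet identifiers, illustrative function names) builds both the weighted retweet network and the per-tweet cascades for two scenarios: one tweet retweeted by three users, and three tweets each retweeted once. The two scenarios yield the same pairwise network but different cascades.

```python
from collections import Counter


def pairwise_retweet_network(retweets):
    """Weighted directed edges (author, retweeter), ignoring the tweet id."""
    return Counter((author, retweeter) for _tweet, author, retweeter in retweets)


def tweet_cascades(retweets):
    """Group retweeters by tweet: {tweet_id: (author, set of retweeters)}."""
    cascades = {}
    for tweet, author, retweeter in retweets:
        cascades.setdefault(tweet, (author, set()))[1].add(retweeter)
    return cascades


# Each record is (tweet_id, author, retweeter).
# Case A: a single tweet by "u" retweeted by three users.
case_a = [("t1", "u", "v1"), ("t1", "u", "v2"), ("t1", "u", "v3")]
# Case B: three tweets by "u", each retweeted once.
case_b = [("t1", "u", "v1"), ("t2", "u", "v2"), ("t3", "u", "v3")]

same_network = pairwise_retweet_network(case_a) == pairwise_retweet_network(case_b)
sizes_a = sorted(len(r) for _, r in tweet_cascades(case_a).values())
sizes_b = sorted(len(r) for _, r in tweet_cascades(case_b).values())
```

Here `same_network` is `True`, while the cascade sizes differ (one cascade of size 3 versus three cascades of size 1), which is the information a representation based on whole cascades retains.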
To model such flow we choose the directed hyper-graph as a tool that can leverage the interaction between one user (the original poster) and their audience (the retweeting users) [
A directed hyper-graph is defined by the pair
Consider a dynamical system evolving on top of the hyper-graph, in particular a random walker. On a directed hyper-graph the random walk can be defined as follows:
the walker resides on a node;
the walker chooses one of the hyper-edges incident to that node on their tail with equal probability;
the walker crosses the hyper-edge and reaches one of its head nodes, with equal probability.
The above dynamics defines a transition pattern between nodes and allows us to write an effective transition matrix:
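The three steps of the walk translate directly into transition probabilities: from node i, each hyper-edge having i in its tail is chosen with probability one over the number of such hyper-edges, and each head node of the chosen hyper-edge is reached with probability one over the size of its head. A minimal Python sketch follows, assuming that hyper-edges are given as `(tail, head)` pairs of node sets with non-empty heads; the function name and data layout are illustrative, not the authors' implementation.

```python
from collections import defaultdict


def transition_probabilities(hyperedges):
    """Return {node: {next_node: probability}} for the hyper-graph random walk.

    Each hyper-edge is a (tail, head) pair of node sets. Nodes that are in no
    tail have no outgoing row (the walker cannot leave them).
    """
    # Collect, for every node, the heads of the hyper-edges it can enter.
    heads_from = defaultdict(list)
    for tail, head in hyperedges:
        for node in tail:
            heads_from[node].append(head)

    transitions = {}
    for node, heads in heads_from.items():
        row = defaultdict(float)
        for head in heads:
            for target in head:
                # 1/len(heads): choice of hyper-edge; 1/len(head): choice of head node.
                row[target] += 1.0 / (len(heads) * len(head))
        transitions[node] = dict(row)
    return transitions


# Toy hyper-graph with hypothetical nodes.
hyperedges = [
    ({"u"}, {"a", "b"}),  # one hyper-edge from u to two head nodes
    ({"u"}, {"c"}),       # a second hyper-edge from u to a single head node
    ({"a"}, {"u"}),
]
T = transition_probabilities(hyperedges)
```

In this toy case the walker on `u` picks each of its two hyper-edges with probability 1/2, so it reaches `a` and `b` with probability 1/4 each and `c` with probability 1/2.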
To detect the community structure of the Twitter user base we leverage a dynamical approach such as the stability algorithm [
In this framework, for any given community
The first question that we addressed is whether the circulation of vaccine-critical information on Twitter increased during the pandemic. To do so, we first globally analyzed the number of tweets concerning the whole vaccine debate in France (from DataVac), in the period between March 2020 and October 2021, and we compared this with the tweets containing a link to a vaccine-critical URL (DataCritical) and with the tweets containing a link to a media URL (DataMedia).
In the lower plot of
Upper plot: Daily fraction of tweets and retweets containing a media URL (orange) or a vaccine-critical URL (blue). Lower plot: number of tweets and retweets in the whole dataset (gray), from media (orange) and vaccine-critical URLs (blue).
The first volume growth is associated with a significant event related to the pandemic: on the 8th of November 2020 Pfizer announced that their vaccine had finished the trial phase and was ready to be distributed. The availability of COVID-19 vaccines shifted the debate from an abstract discussion about potential vaccines to a concrete focus on the actual vaccination campaign.
Notice from
Pre-Pfizer announcement; post-Pfizer announcement; Health pass discussion.
In the second period, the average volume of tweets settles at a stable level around 7 times the volume observed before. This general increase is also visible in the media (DataMedia) and in the vaccine-critical URLs datasets (DataCritical).
The third period, characterized by an even higher volume of tweets, starts at the beginning of June 2021. Discussion focuses on the health pass (
Going back to our initial question on the possible increase of the circulation of vaccine-critical contents, we can observe from the upper plot of
We also addressed the question of the possible increase of vaccine-critical presence in the debate, from the point of view of the users, using the engagement metric and the
Users who tweet or retweet a piece of information show their engagement with the content of that information. In particular, tweets sharing links to well-known web pages with vaccine-critical content show that the original poster and, probably to a lesser degree, all the subsequent retweeters endorse that content.
We analyze the dynamics of the users’ engagement in spreading the information from the vaccine-critical ecosystem, namely the individual propensity to tweet or retweet posts containing a vaccine-critical URL.
Users may endorse vaccine-critical online contents via a constant tweet or retweet flow or with sporadic sharing of information. We define a user as engaged if in the past
Above: evolution of the total number of users engaging with news media (orange) or with vaccine-critical contents (blue). Below: the reproduction number shows peaks of engagement around some events. The greyed area on the top panel is proportional to the daily new cases in France.
The number of engaged users does not strictly follow the volume of the discussion, both for vaccine-critical contents and for the media and in particular we do not observe a striking increase in the third phase. The temporal evolution of the reproduction number shows few events triggering a noticeable response.
The engagement with vaccine-critical information experiences three different growing phases, whose beginning can be identified by relevant peaks of the
List of events that triggered a rapid growth of engagement within Twitter users.
date | topic of the event |
---|---|
2020-02-04 | COVID-19 comes from a lab experiment. |
2020-03-26 | Retraction of hydroxychloroquine treatment. |
2020-06-06 | Adverse events and a death case in COVID-19 clinical trials. |
2020-07-17 | Hydroxychloroquine and Remdesivir instead of Bill Gates’ vaccine |
2020-08-08 | “Il faut refuser ces vaccins contre le COVID-19!” Dr Pierre Cave |
2020-10-11 | First appearance of |
2020-11-11 | Pfizer announcement / “Hold-Up” movie |
2021-03-10 | Suspension of AstraZeneca in Denmark |
2021-03-24 | Further suspensions of AstraZeneca |
2021-04-02 | Other reported cases of “death by vaccine” |
Engagement with media content follows the shape of the pandemic cycles more closely. The exception is the asynchronous behavior during the intermediate period, which is related to the discussion around the introduction of the Pfizer vaccine: together with the increase in discussion volume, this discussion also drove the engagement of new users.
The asynchronicity of the
In the previous paragraph we analyzed which part of the vaccine debate is occupied by vaccine-critical contents. We now look more closely at the structure of the information flows, and study which kinds of actors are present in the debate and their role in spreading patterns.
The distribution of tweets and retweets containing vaccine-critical URLs can be seen as a proxy of the reach of the vaccine-critical discourse on Twitter. The most commonly used structure to represent the information flows on Twitter is the retweet network, representing the directed links among users who retweeted each other. However this structure can hide a potential bias by not allowing to distinguish two crucially different situations: a user who is retweeted
To overcome this problem we use a higher-order network representation introducing a directed hyper-graph structure, composed by nodes and hyper-edges [
We analyze the community structure of Twitter users discussing COVID-19 vaccines through the partition of the hyper-retweet network (from DataCovVac). Such communities represent densely connected groups of users among which information circulates quickly. In particular the information flow over this mesoscopic structure of the network unveils the ability of the vaccine-critical users to reach other users beyond the natural extension of their
The community detection algorithm identified 2991 communities in the largest connected component. However, the 30 largest communities concentrate more than 90% of the users. Let us focus on these largest communities, starting with the analysis of their social composition. Performing a qualitative analysis of the most active user profiles of each community, we can infer a community profile, see
Community | Interpretation |
---|---|
media aggregators or French web influencers. | |
Far right groups. | |
public health institutions, medical doctors and associations | |
French and international news media. | |
Far left and trade unions. | |
government representatives. | |
other French-speaking countries (Canada) | |
other French-speaking countries (Belgium, Morocco, Switzerland,…) | |
non French-speaking countries (India, Israel,…) | |
local French institutions (Nord-Picardie, Loire, …) |
To further characterize the content shared by the different communities we calculated their hashtag preference profiles. We cluster communities and hashtags based on the over-usage of hashtag
Hierarchical clustering of the community hashtag preference profiles.
The configuration emerging from the clustering shows a clear preference of the right-wing communities (
The set of hashtags concerning government and public health are the most spread by the media, public-health actors and by local and national political institutions. Notice in particular the similar use of public-health connected tags among
Notice also that the left-wing community
The general community of web actors,
To better understand the internal structure of the vaccine-critical ecosystem we repeated the previous analysis on use of vaccine-critical URLs. Here we observe a clear bi-partition of these URLs, see
Hierarchical clustering of communities based on their URLs sharing profile and vice versa.
We will now analyze how the information flows between the communities. To estimate this we calculate the probabilities for a random walker to get out of a community (following the retweet hyper-graph structure) and the probability to visit a node of a given community (average visit probability), as defined in Eqs
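One possible reading of these two quantities, given here only as a hedged sketch, takes the walker at stationarity: the escape probability of a community is the probability that a walker currently inside it leaves in one step, and the average visit probability is the stationary probability mass of the community divided by its number of nodes. Both definitions, the function names and the toy transition probabilities below are assumptions made for illustration. The sketch also assumes that every node has outgoing transitions and that the walk can reach every node from every other node.

```python
def stationary_distribution(transitions, iterations=1000):
    """Approximate the stationary distribution by repeated propagation."""
    nodes = sorted(set(transitions) | {j for row in transitions.values() for j in row})
    pi = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        step = {n: 0.0 for n in nodes}
        for i, row in transitions.items():
            for j, p in row.items():
                step[j] += pi[i] * p
        # Lazy update: same fixed point, but it also converges on periodic walks.
        pi = {n: 0.5 * pi[n] + 0.5 * step[n] for n in nodes}
    return pi


def community_flow(transitions, pi, community):
    """Escape probability and average visit probability of a set of nodes."""
    mass = sum(pi[i] for i in community)
    leaving = sum(
        pi[i] * p
        for i in community
        for j, p in transitions[i].items()
        if j not in community
    )
    return {"escape": leaving / mass, "avg_visit": mass / len(community)}


# Toy walk: two tightly knit pairs {a, b} and {c, d}, weakly linked to each other.
toy = {
    "a": {"b": 0.9, "c": 0.1},
    "b": {"a": 0.9, "d": 0.1},
    "c": {"d": 0.9, "a": 0.1},
    "d": {"c": 0.9, "b": 0.1},
}
pi = stationary_distribution(toy)
flow = community_flow(toy, pi, {"a", "b"})
```

In this symmetric toy case the stationary distribution is uniform, so the community {a, b} has an escape probability of about 0.1 and an average visit probability of about 0.25. Other conventions, for example a walker started uniformly at random inside the community, would give different values on less symmetric networks.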
Communities are groups of users with high inner and low outer information flow. This express, on the right plot of
This picture changes completely if we just consider vaccine-critical URLs. In this case we observe a behavior polarization in the two largest communities,
Considering media URLs, the picture reproduces similar but less heterogeneous results. Community
We can also see that the highest polarization in terms of roles in the information ecosystem is not observed among left and right-wing but rather among the right-wing (
We also notice that the escape probability for each community changes in time, in particular in the case of communities in intermediate positions (
Community size is proportional to its number of users. Only the larger communities are reported for the sake of the figure readability.
Similar changes are not found while sharing media URLs nor in the full dataset (central and right part of
The main outcome of this work is that, in the Twitter ecosystem, the relative reach of the vaccine-critical activists has remained constant and limited in comparison to that of the mainstream media.
The first main implication of our results pertains to current discussions of the spread of misinformation on social media. Our results echo the recent works suggesting that initial assessments of the prevalence of fake news, conspiracy theories, misinformation and disinformation on the Internet might have over-estimated the importance of these phenomena [
Symmetrically, this means that vaccine critics can seize the opportunity of public debates arising over issues that do not concern vaccination specifically but do engage with these broader issues. The debate over hydroxychloroquine fits this bill perfectly. France was at the center of the international controversy over this specific drug. The debate raged for weeks in the French mainstream media and took on a very politicized turn with proponents of hydroxychloroquine casting doubt on the way clinical research on COVID-19 is performed, the severity of the disease and the probity of researchers, public agencies and the government [
The second main implication of our results pertains to the relationship between what we can observe on social media and the evolution of public attitudes towards vaccines in the whole population. We found that vaccine critics’ place in overall discussions of vaccines has remained relatively constant across the period. This does not mirror existing data on the French public’s attitudes to the COVID-19 vaccines during the pandemic. Intentions to vaccinate against COVID-19 remained constant at around 75% from March 2020 to May 2020, then decreased steadily until the end of December 2020 when they were as low as 45% before they increased relatively steadily to reach around 80% in July 2021 [
The mismatch between trends on Twitter and data collected via more traditional methods has been underlined in many studies [
PONE-D-22-05617
Assessing the influence of French vaccine critics during the two first years of the COVID-19 pandemic
PLOS ONE
Dear Dr. Faccin,
Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.
Please note that all three reviews recognize the merit and significance of your work -- but they also identify a number of minor weaknesses. In my opinion, these weaknesses can be sufficiently addressed as long as you thoroughly follow the suggestions, or respond to the concerns, of the reviewers.
Please submit your revised manuscript by Jun 27 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at
Please include the following items when submitting your revised manuscript: A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'. A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'. An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.
If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.
If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see:
We look forward to receiving your revised manuscript.
Kind regards,
Constantine Dovrolis
Academic Editor
PLOS ONE
Journal Requirements:
When submitting your revision, we need you to address these additional requirements.
1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at
2. In your Methods section, please include additional information about your dataset and ensure that you have included a statement specifying whether the collection and analysis method complied with the terms and conditions for the source of the data.
3. In your Data Availability statement, you have not specified where the minimal data set underlying the results described in your manuscript can be found. PLOS defines a study's minimal data set as the underlying data used to reach the conclusions drawn in the manuscript and any additional data required to replicate the reported study findings in their entirety. All PLOS journals require that the minimal data set be made fully available. For more information about our data policy, please see
Upon re-submitting your revised manuscript, please upload your study’s minimal underlying data set as either Supporting Information files or to a stable, public repository and include the relevant URLs, DOIs, or accession numbers within your revised cover letter. For a list of acceptable repositories, please see
Important: If there are ethical or legal restrictions to sharing your data publicly, please explain these restrictions in detail. Please see our guidelines for more information on what we consider unacceptable restrictions to publicly sharing data:
We will update your Data Availability statement to reflect the information you provide in your cover letter.
4. Thank you for stating the following in the Acknowledgments Section of your manuscript:
“This research has benefited from the financial support of the Agence Nationale de la Recherche (projects TRACTRUST - ANR-20-COVI-0102 and SLAVACO - ANR 20-COV8-0009-01) and the ANRS-MIE (project MEDIACAM - ANRSCOV24)”
We note that you have provided additional information within the Acknowledgements Section that is not currently declared in your Funding Statement. Please note that funding information should not appear in the Acknowledgments section or other areas of your manuscript. We will only publish funding information present in the Funding Statement section of the online submission form.
Please remove any funding-related text from the manuscript and let us know how you would like to update your Funding Statement. Currently, your Funding Statement reads as follows:
“The author(s) received no specific funding for this work.”
Please include your amended statements within your cover letter; we will change the online submission form on your behalf.
Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.
Reviewers' comments:
Reviewer's Responses to Questions
1. Is the manuscript technically sound, and do the data support the conclusions?
The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.
Reviewer #1: Yes
Reviewer #2: Partly
Reviewer #3: Partly
**********
2. Has the statistical analysis been performed appropriately and rigorously?
Reviewer #1: Yes
Reviewer #2: N/A
Reviewer #3: Yes
**********
3. Have the authors made all data underlying the findings in their manuscript fully available?
Reviewer #1: No
Reviewer #2: No
Reviewer #3: Yes
**********
4. Is the manuscript presented in an intelligible fashion and written in standard English?
PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.
Reviewer #1: Yes
Reviewer #2: Yes
Reviewer #3: Yes
**********
5. Review Comments to the Author
Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)
Reviewer #1: Summary:
In this paper, the authors analyze the debates over the COVID-19 vaccine on the French-speaking segment of Twitter. First, they study the evolution of vaccine-related debates. They show that specific events increase the reach of the vaccine-critical activists on social media, but this information flow is relatively limited compared to mainstream media. Second, they analyze the community structure of discussions and examine how information flow between communities. Authors conclude that the largest far-right community is the echo chamber of conspiracy theorists. In contrast, a smaller community that consists of far-left actors is more capable of communicating vaccine-critical content to a broader public.
They assess the evolution of user engagement using a model similar to the SIS network epidemic model. Here, users that share vaccine-critical information are analogous to infectious individuals in the SIS model. Furthermore, their community analysis relies on hypergraphs derived from retweet data. Hypergraph helps differentiate a user who is retweeted N times for a single tweet and a user whose N tweets are individually retweeted one single time, which is crucial for capturing the dynamics of the information flow.
In conclusion, I think this paper presents a rigorous analysis of vaccine hesitancy on Twitter during the COVID-19 pandemic. I think the methodology is accurate, and the results are significant.
Minor Issues:
Some points regarding data collection are unclear to me:
(1) To the best of my understanding, "vaccine-critical URLs" refer to websites other than mainstream media. For example, websites of prominent actors in vaccine controversies. It may be helpful for the reader if this is reminded in the data subsection.
(2) "we searched our database for those URLs" is the database refer to dataset DataCovVac?
(3) I could not fully follow how the co-occurrence network is used to label URLs automatically? Did you also propagate the labels of media URLs to closest neighbors? It appears it is performed only for 285 vaccine-critical URLs. If this is the case, it is not clear how 382 media URLs were obtained from the initial 50 URLs.
I think that a figure (e.g., a flowchart) that explains the data collection process might be helpful. However, this is not a necessity.
I think the authors could mention the motivation behind using hypergraph instead of a standard retweet network earlier in the method section. I think it is a crucial choice, and the reason is explained in the middle of the results section.
Sometimes the COVID-19 outbreak is mentioned as an epidemic, while it is sometimes referred to as a pandemic. Is there a nuance based on the context, or are the words "epidemic" and "pandemic" used interchangeably? For example, the title says "COVID-19 pandemic" and the abstract says "COVID-19 epidemic" this might confuse the reader.
Reviewer #2: In this study, the authors investigated the influences and spreading of vaccine-critical content on social media. They focused on Twitter data and applied network analysis tools to answer two questions: (1) Did vaccine-critical contents exhibit a "rise" during the COVID breakout? (2) What are the roles that different communities (groups of closely-connected Twitter users) play in the flow of vaccine-critical information.
Generally, the draft is clear about the questions and the general approaches through which these questions can be answered. Nevertheless, it could be improved in its technical soundness and presentation details.
Major comments:
1. In the abstract, one of the questions is formulated as "Who were the central actors in the diffusion of these (vaccine-critical) contents?". It doesn't seem that this question can be fully answered by the corresponding conclusion "the largest right-wing community has typical echo-chamber behavior... The smaller left-wing community is less permeable ..., but has a large capacity to spread it once adopted." For example, are these two communities the central actors? What about the rest of the communities identified? Are they less central? Why? To ensure that the question matches with conclusions, I suggest that this research question be re-formulated.
2. In "Results: The mesoscale structure of the information flow", it is not clear how the two metrics -- the escape probability and the average visit probability, are calculated. This problem is partially due to limited details in the description of behaviors of the random-walker in the Method section. Different definitions could lead to very different interpretations of the results.
a. Especially, the definition of average visit probability -- "the probability (for a random walker) to visit a node of a given community" could have various explanations. For example, do we assume here the random-walker has an equal probability to start from any nodes in the network and take only one step? Or, we let the random-walker randomly walk for a large number of steps (so that the position of the random walker follows a stationary distribution) and aggregate the result from this simulation?
b. The escape probability also suffers (but potentially less) from the same issue. We probably know that the random walker is located in a community. However, do we assume the random walker has an equal chance to start from any nodes in the community? Or node with a higher out-degree in a community has a higher chance to be a starting point?
Minor comments:
1. In "Methods: Measuring the engagement dynamics", Nt is defined as the total number of active users. What's the definition of an active user?
2. In the same section, are there supportive arguments for the specific selection of the engagement window to be 3 days? Would the result significantly change if we slightly vary this parameter?
3. In Fig. 3, the y-axis label on the left doesn't make much sense. I suppose that both of blue and orange curves correspond to the number of users engaging with vaccine-critical content and news media, correspondingly. Given that the orange curve has the label "Media", the blue curve should have the label "Vaccine-critical" instead of "Engaged".
4. In Fig. 6 & 7, what does the size of each circle refers to? size of each community? Please put this information in the caption of the figure to ensure clarity.
Reviewer #3: The manuscript is partly technically sound. The selected data collection mechanism and analysis methods are suitable for addressing the research questions. Some parts in the Data and methods and Results sections need more detailed explanation:
- The APIs used in data collection are not mentioned
- There are not details on what the dataset contains (e.g. tweet ids, tweet text, number of retweets, user ids, etc.)
- The words used for the queries, how you decided to use those and which are the words?
- The qualitative analysis of the users profiles to distinguish to left or right-partite.
- There is no information within the manuscript about wether the dataset, scripts to collect the data, scripts for analysis are available for the replication of the results or future works.
The findings of this work are well supported by the data both from within the text and the figures.
The paper structure, writing style, and language is appropriate for a research manuscript.
Comments to the authors:
1. explain the selection of 3 days time window for τ, in Measuring the engagement dynamics section.
2. add the references for "compartmental models" and SIS model in epidemiology
3. define how they can say that an engaged user looses interest in β calculation
4. explain more the community detection method in the section "Community detection".
5. There is a typo in Results section, first sentence, there is the word "weather" instead of "wether"
**********
6. PLOS authors have the option to publish the peer review history of their article (
If you choose “no”, your identity will remain anonymous but your review may still be made public.
Reviewer #1: No
Reviewer #2: No
Reviewer #3: No
[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]
While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool,
See attached file.
Response to Reviewers’ comments on:
‘Assessing the influence of French vaccine critics during the two first
years of the COVID-19 pandemic’
by M. Faccin, F. Gargiulo, L. Atlani-Duault and JK Ward
June 3, 2022
We thank the editor and the reviewers for their careful evaluation of this manuscript. In particular,
we appreciate their recognition of the manuscript's value and soundness, despite the lack of some
important details. We put considerable effort into clarifying the more obscure passages pointed out by
the reviewers and into extending the deficient parts. Specifically, we extended and clarified the Methods
and Data sections, taking care to report all steps explicitly.
We considered all relevant reviewer comments and replied to each of them in the following. We
attach a document highlighting the differences from the previous submission (as compiled by
latexdiff).
on behalf of the co-authors
Mauro Faccin
Reviewer 1
Comment 1.1.
In this paper, the authors analyze the debates over the COVID-19 vaccine on the French-speaking
segment of Twitter. First, they study the evolution of vaccine-related debates. They show that specific events increase the reach of the vaccine-critical activists on social media, but this information
flow is relatively limited compared to mainstream media. Second, they analyze the community
structure of discussions and examine how information flow between communities. Authors conclude that the largest far-right community is the echo chamber of conspiracy theorists. In contrast,
a smaller community that consists of far-left actors is more capable of communicating vaccine-critical content to a broader public.
They assess the evolution of user engagement using a model similar to the SIS network epidemic
model. Here, users that share vaccine-critical information are analogous to infectious individuals
in the SIS model. Furthermore, their community analysis relies on hypergraphs derived from
retweet data. Hypergraph helps differentiate a user who is retweeted N times for a single tweet
and a user whose N tweets are individually retweeted one single time, which is crucial for capturing
the dynamics of the information flow.
In conclusion, I think this paper presents a rigorous analysis of vaccine hesitancy on Twitter during
the COVID-19 pandemic. I think the methodology is accurate, and the results are significant.
Response 1.1.
We thank the reviewer for seeing an important and rigorous contribution to the literature in this
manuscript.
Comment 1.2.
(1) To the best of my understanding, "vaccine-critical URLs" refer to websites other than mainstream media. For example, websites of prominent actors in vaccine controversies. It may be
helpful for the reader if this is reminded in the data subsection.
Response 1.2.
As suggested by the reviewer we clarified the content of the URLs of interest in the Data section.
Comment 1.3.
(2) "we searched our database for those URLs" is the database refer to dataset DataCovVac?
Response 1.3.
As this passage was unclear, we clarified in the main text which database we refer to (indeed, the
DataCovVac database).
Comment 1.4.
(3) I could not fully follow how the co-occurrence network is used to label URLs automatically? Did
you also propagate the labels of media URLs to closest neighbors? It appears it is performed only
for 285 vaccine-critical URLs. If this is the case, it is not clear how 382 media URLs were obtained
from the initial 50 URLs.
I think that a figure (e.g., a flowchart) that explains the data collection process might be helpful.
However, this is not a necessity.
Response 1.4.
The passage describing the database construction has been clarified and corrected in the main
text. We started from an initial seed of 285 + 50 URLs and computed its dilation in the network
of co-shared URLs. We then visited each website in this set and manually classified it as either
vaccine-critical, news media, or other (which we discarded).
Finally we published the code online at
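The seed-expansion step described in Response 1.4 can be sketched as follows. This is only an illustration under stated assumptions, not the authors' published code: it assumes that two URLs are linked when at least one user shared both, and that the "dilation" of the seed is the seed plus its direct neighbors in that network. The function names and the toy data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def co_shared_network(user_urls):
    """Link two URLs whenever at least one user shared both (assumed definition)."""
    neighbors = defaultdict(set)
    for urls in user_urls.values():
        for a, b in combinations(sorted(set(urls)), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    return neighbors


def dilation(seed, neighbors):
    """Return the seed set together with its direct neighbors (assumed meaning of dilation)."""
    expanded = set(seed)
    for url in seed:
        expanded |= neighbors.get(url, set())
    return expanded


# Hypothetical toy data: which URLs each user shared.
user_urls = {
    "u1": ["seed-a.example", "cand-1.example"],
    "u2": ["seed-a.example", "cand-2.example"],
    "u3": ["other-1.example", "other-2.example"],
}
seed = {"seed-a.example"}
network = co_shared_network(user_urls)
# Candidates are the URLs that still need a manual label.
candidates = dilation(seed, network) - seed
print(sorted(candidates))
```

In this reading, the automatic step only proposes candidate URLs; the final label (vaccine-critical, news media, or other) still comes from the manual inspection described above.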
Comment 1.5.
I think the authors could mention the motivation behind using hypergraph instead of a standard
retweet network earlier in the method section. I think it is a crucial choice, and the reason is
explained in the middle of the results section.
Response 1.5.
As suggested by the reviewer, we added to the Methods section a paragraph on the motivations
for choosing a hyper-graph: the exact structure of the retweet cascades is unknown, and a
hyper-graph makes it possible to distinguish users with many rarely retweeted tweets from users with
a few highly retweeted tweets.
Comment 1.6.
Sometimes the COVID-19 outbreak is mentioned as an epidemic, while it is sometimes referred to
as a pandemic. Is there a nuance based on the context, or are the words "epidemic" and "pandemic" used interchangeably? For example, the title says "COVID-19 pandemic" and the abstract
says "COVID-19 epidemic" this might confuse the reader.
Response 1.6.
To avoid any ambiguity we normalized the use of “pandemic”.
Reviewer 2
Comment 2.1.
In this study, the authors investigated the influences and spreading of vaccine-critical content
on social media. They focused on Twitter data and applied network analysis tools to answer two
questions: (1) Did vaccine-critical contents exhibit a "rise" during the COVID breakout? (2) What
are the roles that different communities (groups of closely-connected Twitter users) play in the
flow of vaccine-critical information.
Generally, the draft is clear about the questions and the general approaches through which these
questions can be answered. Nevertheless, it could be improved in its technical soundness and
presentation details.
Response 2.1.
We thank the reviewer for spotting the manuscript's potential and for helping us to improve its
soundness and rigor.
Comment 2.2.
1. In the abstract, one of the questions is formulated as "Who were the central actors in the
diffusion of these (vaccine-critical) contents?". It doesn't seem that this question can be fully
answered by the corresponding conclusion "the largest right-wing community has typical echo-chamber behavior... The smaller left-wing community is less permeable ..., but has a large capacity
to spread it once adopted." For example, are these two communities the central actors? What
about the rest of the communities identified? Are they less central? Why? To ensure that the
question matches with conclusions, I suggest that this research question be re-formulated.
Response 2.2.
We thank the reviewer for raising this source of confusion. Our aim was to study and analyse the
influence of those communities in the diffusion of vaccine-critical content, without referring to
any notion of centrality (for example, betweenness or random-walk centrality). We
amended all parts of the text referring to these concepts. On the other hand, those two communities are the main actors in the diffusion of such contents: they have the highest probability
of being traversed by a tweet (visiting probability) while displaying, in the left-wing case, a
non-negligible ability to reach other communities (escape probability).
Comment 2.3.
2. In "Results: The mesoscale structure of the information flow", it is not clear how the two metrics
-- the escape probability and the average visit probability, are calculated. This problem is partially
due to limited details in the description of behaviors of the random-walker in the Method section.
Different definitions could lead to very different interpretations of the results. a. Especially, the
definition of average visit probability -- "the probability (for a random walker) to visit a node of a
given community" could have various explanations. For example, do we assume here the random-walker has an equal probability to start from any nodes in the network and take only one step? Or,
we let the random-walker randomly walk for a large number of steps (so that the position of the
random walker follows a stationary distribution) and aggregate the result from this simulation? b.
The escape probability also suffers (but potentially less) from the same issue. We probably know
that the random walker is located in a community. However, do we assume the random walker
has an equal chance to start from any nodes in the community? Or node with a higher out-degree
in a community has a higher chance to be a starting point?
Response 2.3.
The random walk considered in this manuscript is described in the Methods, in particular in the Dynamics and hyper-graphs subsection, and works as follows. At a given
time, the random walker resides on a node (a Twitter user). From this node, the walker
chooses with equal probability one of the hyper-edges incident to the node by its tail (any of
the tweets produced by that user). Once the hyper-edge is selected, the walker selects one
of its head nodes with equal probability. This lets us define the transition probability $p(j \mid i)$;
the visiting probability $p(i)$ of each node is computed at the steady state by an iterative algorithm. The community transition matrix is:

$$p(C' \mid C) = \frac{\sum_{i \in C,\, j \in C'} p(j \mid i)\, p(i)}{\sum_{i \in C} p(i)} \tag{1}$$

From here one can compute the visiting probability of community $C$ as $\sum_{i \in C} p(i)$ and its per-user
average value as $\sum_{i \in C} p(i) / |C|$. The escape probability from community $C$ is defined as

$$p(\bar{C} \mid C) = \sum_{C' \neq C} p(C' \mid C) \tag{2}$$

We have clarified the Methods section to contain these definitions.
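The definitions in Response 2.3 can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: each hyper-edge is a tweet whose tail is the author and whose heads are the retweeting users; the stationary distribution is obtained by plain power iteration; and, so that the iteration converges on any toy input, a small uniform teleportation (`teleport`) is added and users with no retweeted tweet jump uniformly at random. The response does not specify how such cases are handled, so both choices are hypothetical, as are all function names and the toy data.

```python
from collections import defaultdict


def transition_probabilities(hyperedges, nodes, teleport=0.15):
    """p[i][j]: probability that a walker on user i steps to user j."""
    by_tail = defaultdict(list)
    for tail, heads in hyperedges:
        by_tail[tail].append(heads)
    n = len(nodes)
    p = {}
    for i in nodes:
        row = {j: 0.0 for j in nodes}
        tweets = by_tail.get(i, [])
        for heads in tweets:
            for j in heads:
                # Pick a tweet uniformly, then one of its head nodes uniformly.
                row[j] += 1.0 / (len(tweets) * len(heads))
        # ASSUMPTION: teleportation, and a uniform jump for users without tweets.
        walk = 1.0 - teleport if tweets else 0.0
        p[i] = {j: walk * row[j] + (1.0 - walk) / n for j in nodes}
    return p


def stationary(p, nodes, iterations=200):
    """Visiting probability of each node, by power iteration."""
    visit = {i: 1.0 / len(nodes) for i in nodes}
    for _ in range(iterations):
        visit = {j: sum(visit[i] * p[i][j] for i in nodes) for j in nodes}
    return visit


def community_transition(p, visit, source, target):
    """Probability that a walker currently in `source` lands in `target` next."""
    flow = sum(p[i][j] * visit[i] for i in source for j in target)
    return flow / sum(visit[i] for i in source)


def escape_probability(p, visit, community, nodes):
    """Probability of leaving `community` in one step."""
    outside = [j for j in nodes if j not in community]
    return community_transition(p, visit, community, outside)


# Hypothetical toy data: (author, retweeting users) for each tweet.
nodes = ["a", "b", "c", "d"]
hyperedges = [("a", ["b", "c"]), ("a", ["b"]), ("b", ["a"]),
              ("c", ["d"]), ("d", ["c"])]
left, right = ["a", "b"], ["c", "d"]

p = transition_probabilities(hyperedges, nodes)
visit = stationary(p, nodes)
print("visiting probability of left:", sum(visit[i] for i in left))
print("escape probability of left:", escape_probability(p, visit, left, nodes))
```

With only two communities, the escape probability of one community coincides with its transition probability to the other, which gives a quick sanity check on the toy data.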
Comment 2.4.
1. In "Methods: Measuring the engagement dynamics", Nt is defined as the total number of active
users. What's the definition of an active user?
Response 2.4.
We clarify in the text that, in the context of the engagement analysis, active users are those that
tweet or retweet on that day.
Comment 2.5.
2. In the same section, are there supportive arguments for the specific selection of the engagement window to be 3 days? Would the result significantly change if we slightly vary this parameter?
Response 2.5.
We thank the reviewer for raising this question. The results are robust as long as the time window
is kept small (less than a week). Moreover, since the dynamics of a social network are fast-paced,
a short time window better captures how user engagement changes.
We mentioned the robustness of the method in the Methods section.
Comment 2.6.
3. In Fig. 3, the y-axis label on the left doesn't make much sense. I suppose that both of blue
and orange curves correspond to the number of users engaging with vaccine-critical content and
news media, correspondingly. Given that the orange curve has the label "Media", the blue curve
should have the label "Vaccine-critical" instead of "Engaged".
Response 2.6.
We thank the reviewer for spotting this labeling inconsistency which has been amended in the
new version of this manuscript.
Comment 2.7.
4. In Fig. 6 & 7, what does the size of each circle refers to? size of each community? Please put
this information in the caption of the figure to ensure clarity.
Response 2.7.
We have fixed the lack of information spotted by the reviewer.
Reviewer 3
Comment 3.1.
The manuscript is partly technically sound. The selected data collection mechanism and analysis
methods are suitable for addressing the research questions. Some parts in the Data and methods
and Results sections need more detailed explanation:
• The APIs used in data collection are not mentioned
• There are not details on what the dataset contains (e.g. tweet ids, tweet text, number of
retweets, user ids, etc.)
• The words used for the queries, how you decided to use those and which are the words?
• The qualitative analysis of the users profiles to distinguish to left or right-partite.
• There is no information within the manuscript about wether the dataset, scripts to collect
the data, scripts for analysis are available for the replication of the results or future works.
Response 3.1.
We thank the reviewer for recognizing the value of the manuscript.
Following the reviewer's suggestion, we clarified and added details to various passages of the
article.
• We already mentioned that we used both the search and stream APIs; we additionally referenced other papers where an in-depth description of the dataset extraction is discussed, and
uploaded the list of keywords used to a public repository at
• We feel that the list of metadata contained in the extracted dataset would distract the
reader from the main message of the article, so we chose not to report it.
• Concerning the keywords used in the dataset extraction, the complete list has been
uploaded to a public repository and referenced in the main text. Those keywords were
selected based on previous analyses, which have been explicitly referenced in the main text.
• We amended the main text to briefly mention the availability of the dataset and of the software used to extract it. The full dataset cannot be shared due to current Twitter policies,
but a list of tweet IDs can be provided upon request.
Comment 3.2.
The findings of this work are well supported by the data both from within the text and the figures.
The paper structure, writing style, and language is appropriate for a research manuscript.
Response 3.2.
We thank the reviewer for recognizing the soundness of the manuscript.
Comment 3.3.
1. explain the selection of 3 days time window for τ, in Measuring the engagement dynamics
section.
Response 3.3.
We added a deeper discussion of this choice. In particular, we stress the robustness of the analysis
to changes in the temporal window.
Comment 3.4.
2. add the references for "compartmental models" and SIS model in epidemiology
Response 3.4.
We added a reference for the SIS model and related theoretical results.
Comment 3.5.
3. define how they can say that an engaged user looses interest in β calculation
Response 3.5.
We clarified how the calculation works.
In particular, on any day $t$ we can compute the following quantities from the dataset:
• $E_t$, the number of users that shared a link from the set of URLs within the time frame $(t - \tau, t]$;
• $N_t$, the number of users that tweeted anything within the time frame $(t - \tau, t]$;
• $dE_t^{+}$, the number of users that were engaged at time $t - 1$ but uncommitted at time $t$;
• $dE_t^{-}$, the number of users that were uncommitted at time $t - 1$ but engaged at time $t$.
From the above and Eqs. 1 and 2 in the manuscript, one can estimate the engagement and disengagement rates $\alpha_t$ and $\beta_t$ at any time $t$, and consequently the reproduction number $R_t$ from Eq. 4.
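As an illustration of how the four daily quantities listed in Response 3.5 could be extracted from raw activity records, here is a minimal sketch. It assumes a hypothetical record format `(day, user, url_or_None)` and treats a user as "uncommitted" whenever they are not in the engaged set of the corresponding window. It stops at the counts: the rate equations (Eqs. 1, 2 and 4 of the manuscript) are not reproduced in this letter and are therefore not implemented. All names are illustrative.

```python
def window_sets(records, tracked_urls, t, tau=3):
    """Active and engaged users within the time frame (t - tau, t]."""
    active, engaged = set(), set()
    for day, user, url in records:
        if t - tau < day <= t:
            active.add(user)
            if url in tracked_urls:
                engaged.add(user)
    return active, engaged


def engagement_counts(records, tracked_urls, t, tau=3):
    """Daily counts needed to estimate engagement and disengagement rates."""
    active, engaged = window_sets(records, tracked_urls, t, tau)
    _, engaged_before = window_sets(records, tracked_urls, t - 1, tau)
    return {
        "E_t": len(engaged),
        "N_t": len(active),
        # Users not engaged at t - 1 but engaged at t.
        "newly_engaged": len(engaged - engaged_before),
        # Users engaged at t - 1 but no longer engaged at t.
        "newly_disengaged": len(engaged_before - engaged),
    }


# Hypothetical toy records: (day, user, shared URL or None).
records = [
    (1, "u1", "critic.example"),
    (2, "u2", None),
    (4, "u2", "critic.example"),
    (4, "u3", None),
    (4, "u1", None),
]
counts = engagement_counts(records, {"critic.example"}, t=4, tau=3)
print(counts)
```

Changing `tau` in this sketch is the simplest way to probe how sensitive the counts are to the choice of engagement window.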
Comment 3.6.
4. explain more the community detection method in the section "Community detection".
Response 3.6.
We clarified how the communities were computed, focusing on how the stability algorithm was
used in our case.
Comment 3.7.
5. There is a typo in Results section, first sentence, there is the word "weather" instead of "wether"
Response 3.7.
This typo has been fixed.
Submitted filename:
Assessing the influence of French vaccine critics during the two first years of the COVID-19 pandemic
PONE-D-22-05617R1
Dear Dr. Faccin,
We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.
Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.
An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at
If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact
Kind regards,
Constantine Dovrolis
Academic Editor
PLOS ONE
Additional Editor Comments (optional):
Reviewers' comments:
PONE-D-22-05617R1
Assessing the influence of French vaccine critics during the two first years of the COVID-19 pandemic
Dear Dr. Faccin:
I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.
If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact
If we can help with anything else, please email us at
Thank you for submitting your work to PLOS ONE and supporting open access.
Kind regards,
PLOS ONE Editorial Office Staff
on behalf of
Dr. Constantine Dovrolis
Academic Editor
PLOS ONE