Anatomy into the battle of supporting or opposing reopening amid the COVID-19 pandemic on Twitter: A temporal and spatial analysis

  • Lingyao Li,

    Roles Data curation, Formal analysis, Methodology, Software, Visualization, Writing – original draft, Writing – review & editing

    Affiliation Department of Civil and Environmental Engineering, A. James Clark School of Engineering, University of Maryland, College Park, Maryland, United States of America

  • Abdolmajid Erfani,

    Roles Data curation, Formal analysis, Software, Validation, Visualization, Writing – original draft, Writing – review & editing

    Affiliation Department of Civil and Environmental Engineering, A. James Clark School of Engineering, University of Maryland, College Park, Maryland, United States of America

  • Yu Wang,

    Roles Data curation, Formal analysis, Methodology, Validation, Visualization, Writing – original draft, Writing – review & editing

    Affiliation Department of Civil and Environmental Engineering, A. James Clark School of Engineering, University of Maryland, College Park, Maryland, United States of America

  • Qingbin Cui

    Roles Conceptualization, Methodology, Project administration, Supervision, Writing – review & editing

    cui@umd.edu

    Affiliation Department of Civil and Environmental Engineering, A. James Clark School of Engineering, University of Maryland, College Park, Maryland, United States of America

Abstract

Reopening amid the COVID-19 pandemic has triggered a battle on social media. The supporters perceived that the lockdown policy could damage the economy and exacerbate social inequality. By contrast, the opponents believed it was necessary to contain the spread and ensure a safe environment for recovery. Anatomy into the battle is of importance to address public concerns, beliefs, and values, thereby enabling policymakers to determine the appropriate solutions to implement reopening policy. To this end, we investigated over 1.5 million related Twitter postings from April 17 to May 30, 2020. With the aid of natural language processing (NLP) techniques and machine learning classifiers, we classified each tweet into either a “supporting” or “opposing” class and then investigated the public perception from temporal and spatial perspectives. From the temporal dimension, we found that both political and scientific news that were extensively discussed on Twitter led to the perception of opposing reopening. Further, being the first mover with full reopen adversely affected the public reaction to reopening policy, while being the follower or late mover resulted in positive responses. From the spatial dimension, the correlation and regression analyses suggest that the state-level perception was very likely to be associated with political affiliation and health value.

Introduction

A novel SARS-CoV-2 virus (COVID-19) that emerged in December 2019 has spread worldwide and become a pandemic [1]. As of June 10, 2021, more than 33.4 million cases and 598,000 deaths were reported in the United States, and the number of new cases is still high [2]. As the stay-at-home orders took effect in early April 2020, thousands of workplaces were shut down, sports events were canceled, and universities and schools moved online. Beginning in mid-April, news media reported that anti-lockdown protests erupted across U.S. states [3]. Debates over the necessity of lockdown orders and the appropriate time to reopen the country have raged for a long time. Some people were concerned that a prolonged lockdown would damage the economy and exacerbate existing social inequality in education and the workplace [4], while others perceived that a temporary lockdown policy was necessary for slowing the spread of COVID-19 and ensuring a safe environment for economic recovery [5]. Understanding the public’s risk propensity and dissecting the rival perceptions of reopening is important for policymakers who must cope with the challenges of enacting reopening policies.

Policymakers follow various approaches to determine the appropriate time to implement reopening phases. Premature reopening may increase the risk of contracting the virus in communities [6], but individuals may choose to take that risk because of financial pressure. Tracking online perceptions of the reopening policy can provide meaningful insights for policymakers to comprehend how the online public thinks and behaves amid the COVID-19 pandemic. As social media establishes timely channels for online users to communicate information, crowdsourcing through social media plays a significant role in recognizing public opinions [7]. The advantages of leveraging social media to investigate public perceptions are manifold. First, social media offers policymakers opportunities to probe into a wealth of data that reflect people’s emotions and behaviors during a pandemic [8]. Second, social media can serve as a useful instrument to identify emergent responses, so that policymakers can tailor their policies to public demands. Last, a social media-based approach supports timely tracking of online public perceptions, with a timeliness that may exceed that of most conventional survey methods.

Social media data from Twitter, Facebook, and other web platforms have provided a rich source of information within the scientific community to investigate public perceptions relative to the COVID-19 crisis [9–11]. One broad application focuses on the areas of information dissemination and public engagement [11–13]. These studies have revealed that social media can support effective communication channels for government agencies or influencers to communicate important messages to the public. A recent survey based on 645 Italian clinicians reported that 47% of respondents answered that information shared on social media had a consistent impact on their daily practice [14]. Another study conducted by Chen et al. (2020) demonstrated that the dialogic loop on social media could help facilitate engagement through government accounts during the COVID-19 pandemic [12].

Social media postings contain a great deal of textual information. Textual analysis of vocabulary, semantic structures, and other textual features (e.g., text sentiment) conveys information that can be leveraged to assist in attitude surveys, behavior analysis, and mental health detection [15–18]. Iglesias-Sánchez et al. (2020) [19] selected the case of COVID-19 quarantine in Spain and tracked emotion changes based on online postings. Their study implied that isolation measures could have a significant impact on residents’ emotions, in particular arousing a noticeable anger response. Xue et al. (2020) [20] analyzed Twitter data in the early stages of this crisis, and their sentiment analysis revealed that fear of the unknown nature of COVID-19 was dominant on Twitter.

Prior studies on the utility of social media in understanding public opinions show temporal and spatial variations [21–24]. Through an investigation of the geography of Twitter topics in London, Lansley and Longley (2016) [23] found that topics and attitudes expressed through tweets varied substantially across places and were associated with the demographic and socio-economic characteristics of the users. Koylu et al. (2018) [21] investigated the online public discourse and sentiment across space and time towards an immigration policy implemented in 2017, and their study showed that the policy highlighted an important partisan division within U.S. states. The opinion variations on political topics can be attributed to the inferred characteristics of online users. Previous studies have also illustrated that demographic and socio-economic factors, such as age, area of residence, income, educational level, and party affiliation, could influence public awareness [25–27].

Since the outbreak of COVID-19, several studies have analyzed the impacts of government policies on the public [28–30]. For example, Wei et al. (2020) [30] applied interaction strategies and evolutionary game analysis to the actions taken by the government and the public. Their study demonstrated that the emergency response adopted by the government in the early stages of the pandemic could effectively contain the spread. For reopening policy, Nguyen et al. (2020) [31] quantified the effect of state reopening policies on daily mobility, and they observed an increase in mobility during the reopening phase. Kaufman et al. (2020) [32] applied an interrupted time series to compare the rate of growth in COVID-19 cases after reopening to growth prior to reopening. Their results suggested that states should delay further reopening until mask mandates were fully implemented. However, in the reviewed studies, the impacts of reopening policies have not been thoroughly investigated with social media data.

Building on the existing body of knowledge relative to the temporal and spatial analysis for online public opinions, this study aims to explore the potential of social media data (Twitter postings) to investigate online perceptions on reopening policy and demonstrate how the nature of the opinions varies according to the temporal and spatial characteristics. From the perspective of temporal analysis, online opinions on supporting the policy could vary in the appearance of influential news and events and might be affected by the timing (the first mover, follower, or late mover) to reopen the economy. From the perspective of spatial analysis, online perceptions might display significant differences geographically and could be associated with demographic and socio-economic characteristics. This study conducts correlation and regression analyses to unfold the demographic factors that can help explain the discrepancy and the consistency of public perceptions across U.S. states. As discussed, the findings of this study provide meaningful insights to understand how the online public reacted to the policies and further support policymakers to appropriately implement reopening policies.

Materials and methods

Data preparation and model framework

Twitter provides abundant data that can be accessed to capture information on a given topic. We used the Twitter Standard Search API with the search term “reopen” to scrape tweets from April 17 to May 30, 2020. Other search terms, such as “open up” and “shut down,” may also contain information that implies a user’s perception of reopening policies. However, these terms were often used in tweets like “business were shut down,” and “the park will open up next week,” which were not indicative of an inclination to support or oppose reopening policy. As a result, using these terms may bring a large amount of noise into the dataset. More importantly, search terms including “lockdown” and “shut down” are not neutral, as they often appeared in tweets describing negative emotions during the lockdown period. For example, the tweet “I’m so tired of being in lockdown” expresses boredom but does not adequately indicate the user’s inclination to support or oppose reopening policy. Last, using other terms to download the data may result in a large variance of the textual information (e.g., the common topics of “reopen” and “lockdown” could be very different), which makes the machine classification process hard to implement.

For these reasons, we decided to use “reopen” as the search term to download tweet data. Since almost all states were fully reopened after May 30 [33], we ended the search time range on May 30, 2020. Twitter provides two types of geographical data. One is the geo-tagged location, which is available when a user decides to share the location at the time of the tweet. However, only a small portion (<0.1%) of tweets in the dataset were associated with geo-tagged locations. The other type is the registration location based on a user’s profile. Given that the research goal is to investigate online perceptions of U.S. reopening policy, we filtered out records whose registration locations did not indicate a U.S. location. This filtering process resulted in a dataset of 2,407,911 records, of which 760,646 are unique tweets, given that retweets have the same textual content.

Then, we selected the 5,000 most frequently occurring unique tweets and manually classified each of them as support reopen (class 1), oppose reopen (class -1), or unrelated (class 0). Although previous studies show that retweeting behavior is not random [34, 35], we selected these tweets rather than a random subset to build the training dataset for two reasons. First, these 5,000 tweets are the most likely to reflect impactful Twitter content in the dataset, as they received the most retweets. Second, and more importantly, reviewing these 5,000 tweets could ensure more accurate classifications on the whole dataset. It is worth noting that they account for 49.7% (1,196,274) of the full dataset, because retweets have the same textual content. Thus, once the trained model reaches a high training accuracy, retweets of these most frequent tweets are very likely to be correctly identified, which could yield a higher classification accuracy on the whole dataset. Finally, the testing data were randomly selected from the dataset so that the performance of the model (trained on these selected tweets) on the testing set objectively reflects the accuracy on the whole dataset. More details regarding the training and testing process are presented in the S1 Appendix.

In the process of human labeling, two members of the research team first labeled each tweet. If a tweet received the same label from both team members, that label was taken as final. Otherwise, a third team member labeled the tweet, and its class was determined by the majority of the three manual labels. Among the 5,000 selected samples, 1,630, 1,950, and 1,420 samples were classified into class 1, class -1, and class 0, respectively. Meanwhile, we followed the same process to label another 2,339 unique tweets (different from the 5,000 tweets) that were randomly selected from the dataset to build a testing dataset. As a result, 744, 1,060, and 535 tweets were manually labeled as class 1, class -1, and class 0, respectively. Examples of labeled tweets are provided in Table 4 in S1 Appendix.
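The two-annotator procedure with a third tie-breaking label can be sketched as follows. This is a minimal illustration, not the authors' tooling: the function name `adjudicate` is hypothetical, and because the text does not say what happened when all three annotators chose different classes, that case simply returns `None` here.

```python
from collections import Counter

def adjudicate(label_a, label_b, tiebreak_label=None):
    """Return the final label for one tweet (1, -1, or 0).

    If the first two annotators agree, their label is final. Otherwise a
    third label is required and the majority of the three labels wins.
    """
    if label_a == label_b:
        return label_a
    if tiebreak_label is None:
        raise ValueError("annotators disagree; a third label is needed")
    label, votes = Counter([label_a, label_b, tiebreak_label]).most_common(1)[0]
    # Assumption: a three-way split has no majority, so no label is returned.
    return label if votes >= 2 else None

print(adjudicate(1, 1))        # both annotators agree
print(adjudicate(1, -1, -1))   # third annotator breaks the tie
```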

As noted, “class 0” tweets contain the keyword “reopen” but do not imply a perception of supporting or opposing reopening policy. Examples include “national parks are set to reopen,” and “Gyms and fitness centers can reopen on May 26 if they can meet safety protocols.” In subsequent experiments, we found that a multi-class classification that included these class 0 tweets substantially reduced the testing accuracy from 73.0% to 58.3%, possibly because they contain a large variance of textual information. Therefore, we decided to manually extract textual patterns from the 1,420 class 0 samples and use them to remove tweets not informative of reopening perception. For example, we manually collected the word pattern “can reopen” from the tweet “Gyms and fitness centers can reopen on May 26 if they can meet safety protocols,” and used it to remove “class 0” tweets from the dataset. Our manually collected word patterns for filtering tweets are provided in the Data Availability section. As a result, the dataset was reduced from 2,407,911 to 1,591,216 records with 450,450 unique tweets, although a small portion of class 0 tweets might remain in the dataset. Correspondingly, we adjusted the training and testing datasets by filtering out the “class 0” samples. The model framework for the implementation of the proposed method is illustrated in Fig 1.
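A pattern-based filter of this kind might look like the sketch below. Only the pattern “can reopen” is quoted in the text; the second pattern and the names `UNRELATED_PATTERNS` and `is_unrelated` are hypothetical, and the actual pattern list is the one referenced in the Data Availability section.

```python
import re

# Illustrative pattern list. Only "can reopen" comes from the text;
# "set to reopen" is an assumed example.
UNRELATED_PATTERNS = ["can reopen", "set to reopen"]

# One case-insensitive regex that matches any of the literal patterns.
_PATTERN_RE = re.compile(
    "|".join(re.escape(p) for p in UNRELATED_PATTERNS), re.IGNORECASE
)

def is_unrelated(tweet):
    """True if the tweet contains any pattern that marks it as class 0."""
    return _PATTERN_RE.search(tweet) is not None

tweets = [
    "Gyms and fitness centers can reopen on May 26 if they can meet safety protocols.",
    "National parks are set to reopen",
    "It is too soon to reopen the country",
]
kept = [t for t in tweets if not is_unrelated(t)]
print(kept)
```

Literal substring patterns are easy to audit by hand, which fits a manually curated list, but they can also discard opinionated tweets that happen to contain the same phrase.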

Text cleaning and sample balance

Before text augmentation, we applied several steps to clean the tweets, as presented in the box “text cleaning” in Fig 1. We first removed short URLs, @username, RT @username, digits, emojis, and punctuation from each tweet. Then, we stripped stop-words that were not informative, such as “the,” “is,” and “and.” Next, we tokenized each tweet into a list of separate words and characters. Since the words in a tweet can be written in different forms, we converted tokenized words to their base forms (also known as lemmatization). This cleaning process was completed with the aid of the Natural Language Toolkit (NLTK) Python package [36].
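The cleaning steps can be sketched in plain Python as shown below. This is an illustration only: the study used NLTK, whereas here a tiny hand-written stop-word set and a toy lemma table (`STOP_WORDS`, `LEMMAS`, both hypothetical) stand in for NLTK's resources so that the example runs without downloads. For convenience the sketch tokenizes before dropping stop-words.

```python
import re

# Stand-ins for NLTK's stop-word list and lemmatizer (assumptions).
STOP_WORDS = {"the", "is", "and", "are", "on", "a", "to", "of"}
LEMMAS = {"states": "state", "reopening": "reopen", "reopened": "reopen"}

def clean_tweet(text):
    """Return a list of cleaned, lemmatized tokens for one tweet."""
    text = re.sub(r"https?://\S+", " ", text)   # remove URLs
    text = re.sub(r"\bRT\b", " ", text)         # remove the retweet marker
    text = re.sub(r"@\w+", " ", text)           # remove @username mentions
    text = text.lower()
    # Keep only ASCII letters and whitespace: drops digits, emojis, and
    # punctuation (and, as a side effect, apostrophes and accented letters).
    text = re.sub(r"[^a-z\s]", " ", text)
    tokens = text.split()                       # whitespace tokenization
    tokens = [t for t in tokens if t not in STOP_WORDS]
    return [LEMMAS.get(t, t) for t in tokens]   # map to base forms

example = "RT @user: The states are reopening on May 26! https://t.co/abc"
print(clean_tweet(example))
```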

As observed from Fig 1, samples labeled as class 1 and class -1 constitute 45.6% (1,632 out of 3,580) and 54.4% (1,948 out of 3,580) of the training dataset. Imbalance in the training dataset may result in a worse prediction performance for the minority class. Therefore, we utilized a simple text augmentation technique called Easy Data Augmentation (EDA) [37] to balance the distribution of class 1 and class -1 and increase the training data size. This text augmentation technique requires no NLP model to be pre-trained on any external dataset and is capable of improving the performance for a smaller dataset [37]. The EDA uses four operations 1) synonym replacement, 2) random insertion, 3) random swap, and 4) random deletion to increase the volume of labeled data [37]. Specific explanations and examples are presented in the S1 Appendix.

Following the recommendations of the EDA study [37], we set the parameter for each of the four operations to α = 0.1, where α indicates the percentage of the words in a sentence that are changed. However, the length of tweets can vary widely. Longer tweets have more words, so they can absorb more noise while maintaining their original content. To compensate for this, as suggested in [37], the number of words changed in a tweet is defined as n = αl, where l is the length of the tweet. For shorter tweets, the EDA technique ensures that at least one word in the text is changed [37]. Further, we generated 5 augmented versions of each class 1 tweet and 4 augmented versions of each class -1 tweet while retaining the original tweets. As a result, the data size for class 1 and class -1 increased to 9,780 and 9,750, respectively, which made the two classes approximately balanced in the training dataset.
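The rule n = αl and the resulting class sizes can be illustrated with a short sketch. It covers only the two EDA operations that need no synonym dictionary (random swap and random deletion). Rounding αl down with a floor of one, and the function names, are assumptions made for illustration rather than details stated in the text.

```python
import random

def words_to_change(tweet_length, alpha=0.1):
    """n = alpha * l, rounded down, but never fewer than one word."""
    return max(1, int(alpha * tweet_length))

def random_swap(tokens, n, rng):
    """Swap two randomly chosen positions, n times."""
    tokens = list(tokens)
    if len(tokens) < 2:
        return tokens
    for _ in range(n):
        i, j = rng.sample(range(len(tokens)), 2)
        tokens[i], tokens[j] = tokens[j], tokens[i]
    return tokens

def random_deletion(tokens, p, rng):
    """Drop each token with probability p, always keeping at least one."""
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() > p]
    return kept if kept else [rng.choice(tokens)]

def augmented_size(originals, copies_per_tweet):
    """Originals are kept, so each tweet contributes 1 + copies samples."""
    return originals * (1 + copies_per_tweet)

rng = random.Random(0)
tokens = ["too", "soon", "to", "reopen", "the", "country"]
swapped = random_swap(tokens, words_to_change(len(tokens)), rng)
deleted = random_deletion(tokens, 0.1, rng)
print(swapped, deleted)
print(augmented_size(1630, 5), augmented_size(1950, 4))
```

With 1,630 class 1 tweets and 5 augmented versions each, and 1,950 class -1 tweets and 4 augmented versions each, the sizes reproduce the reported 9,780 and 9,750.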

Text vectorization and classification

We applied the Term Frequency-Inverse Document Frequency (TF-IDF) and Word Embedding techniques to convert the tweets in the training dataset into vectors of features. TF-IDF is a popular term weighting method used in text similarity, text classification, and information retrieval [38]. Although TF-IDF cannot capture word positions or semantic meaning in a text, it is an efficient and useful algorithm for handling a broad set of texts due to its simplicity and fast computation [39]. In TF-IDF, TF measures how frequently each word occurs in a document, while IDF is incorporated to reduce the weights of words that are common across the corpus. The goal of using TF-IDF instead of the raw frequencies of words in a text is to scale down the impact of words that occur frequently and are hence empirically less informative. The representation of the TF-IDF method is given below [38]:

$$w(t, d) = f_{t,d} \times \log\left(\frac{D}{d_t}\right) \tag{1}$$

where $w(t, d)$ represents the weight of word $t$ in tweet $d$, $f_{t,d}$ denotes the frequency of word $t$ in tweet $d$, $D$ is the total number of tweets, and $d_t$ is the number of tweets in which word $t$ appears.
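As a worked illustration, the sketch below computes TF-IDF weights directly, assuming the common form w(t, d) = f(t, d) × log(D / d_t), which matches the symbol definitions above. The function name `tfidf_weights` is hypothetical. Library implementations can differ: scikit-learn's `TfidfVectorizer`, for instance, applies a smoothed IDF and L2 normalization by default, so its values would not equal these.

```python
import math
from collections import Counter

def tfidf_weights(tokenized_tweets):
    """Return one {word: weight} dict per tweet, w = f_td * log(D / d_t)."""
    D = len(tokenized_tweets)
    doc_freq = Counter()
    for tokens in tokenized_tweets:
        doc_freq.update(set(tokens))      # count each word once per tweet
    weights = []
    for tokens in tokenized_tweets:
        term_freq = Counter(tokens)
        weights.append(
            {t: f * math.log(D / doc_freq[t]) for t, f in term_freq.items()}
        )
    return weights

docs = [["reopen", "economy"], ["reopen", "testing", "testing"]]
w = tfidf_weights(docs)
print(w)
```

A word that appears in every tweet, such as “reopen” in this dataset, gets weight log(1) = 0 under this unsmoothed form, which shows how the IDF term scales down uninformative words.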

Unlike the TF-IDF method, Word Embedding techniques can help capture the semantic meanings of words in context by converting each word into a pre-trained vector of features. They are often applied to compute text similarity or perform text classification. In this study, we adopted a popular Word Embedding technique called Word2Vec, which was released by the Google research team in 2013 [40]. The Word2Vec model is made up of a group of two-layer shallow neural networks and offers two architectures, continuous Bag-of-Words and Skip-gram, to produce the vector representation for each word [40]. Word2Vec was trained on Google News, and each word vector has 300 dimensions [40].

As each tweet was converted to a vector of features, we applied several classifiers provided by the scikit-learn Python library to build the pipeline for text classification, including Multinomial Naïve Bayes (MNB), Bernoulli Naïve Bayes (BNB), Support Vector Machine (SVM) with Stochastic Gradient Descent (SGD), and Logistic Regression (LR) [41]. Since MNB achieved the highest testing accuracy (Table 1), we explain this algorithm in detail in this section. MNB is a specialized Bayesian method that assumes the data are multinomially distributed [42]. The distribution is parametrized by $\theta_y = (\theta_{y1}, \theta_{y2}, \ldots, \theta_{yn})$ for each class $y$, where $y \in \{-1, 1\}$, and the vectors of features can be obtained based on TF-IDF. $y = 1$ denotes that the tweet implies a perception of supporting reopening, while $y = -1$ denotes a perception of opposing reopening. $n$ is the size of the vocabulary (the number of distinct words appearing in the tweet dataset). $\theta_{yi} = P(x_i \mid y)$ is the probability of word $x_i$ appearing in a tweet given that the tweet belongs to class $y$. Bayes’ theorem defines the following relation given the class $y$ and words $x_1$ through $x_n$ [43]:

$$P(y \mid x_1, \ldots, x_n) = \frac{P(y)\, P(x_1, \ldots, x_n \mid y)}{P(x_1, \ldots, x_n)} \tag{2}$$

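The naive conditional independence assumption states that [43]:

$$P(x_i \mid y, x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n) = P(x_i \mid y) \tag{3}$$

for all $i$. Under this assumption, formula (2) simplifies to [43]:

$$P(y \mid x_1, \ldots, x_n) = \frac{P(y) \prod_{i=1}^{n} P(x_i \mid y)}{P(x_1, \ldots, x_n)} \tag{4}$$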

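As $P(x_1, \ldots, x_n)$ is constant given the inputs, the estimation of $P(y \mid x_1, \ldots, x_n)$ can be written as [43]:

$$P(y \mid x_1, \ldots, x_n) \propto P(y) \prod_{i=1}^{n} P(x_i \mid y) \tag{5}$$

$$\hat{y} = \arg\max_{y} P(y) \prod_{i=1}^{n} P(x_i \mid y) \tag{6}$$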

In this study, $P(y)$ was the relative frequency of class 1 and class -1 in the training dataset, and $P(x_i \mid y)$ was estimated using the TF-IDF technique.
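The decision rule described above, choosing the class y that maximizes P(y) multiplied by the product of P(x_i | y) over the words in a tweet, can be sketched from scratch as follows. This is not the study's pipeline, which used scikit-learn with TF-IDF features. For simplicity the sketch assumes raw word counts with add-one (Laplace) smoothing, and the names `train_mnb` and `predict_mnb` are hypothetical.

```python
import math
from collections import Counter

def train_mnb(docs, labels, smoothing=1.0):
    """Estimate log P(y) and log P(x_i | y) from tokenized, labeled tweets."""
    classes = sorted(set(labels))
    vocab = sorted({t for d in docs for t in d})
    log_prior, log_likelihood = {}, {}
    for y in classes:
        class_docs = [d for d, lab in zip(docs, labels) if lab == y]
        # P(y): relative frequency of the class in the training data.
        log_prior[y] = math.log(len(class_docs) / len(docs))
        counts = Counter(t for d in class_docs for t in d)
        total = sum(counts.values()) + smoothing * len(vocab)
        # Smoothed P(x_i | y), so unseen words never get probability zero.
        log_likelihood[y] = {
            t: math.log((counts[t] + smoothing) / total) for t in vocab
        }
    return log_prior, log_likelihood

def predict_mnb(model, doc):
    """Return argmax_y of log P(y) + sum_i log P(x_i | y)."""
    log_prior, log_likelihood = model
    scores = {}
    for y in log_prior:
        # Words outside the training vocabulary are skipped.
        scores[y] = log_prior[y] + sum(
            log_likelihood[y][t] for t in doc if t in log_likelihood[y]
        )
    return max(scores, key=scores.get)

train_docs = [
    ["reopen", "economy", "now"],
    ["reopen", "jobs", "now"],
    ["too", "soon", "reopen"],
    ["cases", "rising", "too", "soon"],
]
train_labels = [1, 1, -1, -1]
model = train_mnb(train_docs, train_labels)
print(predict_mnb(model, ["reopen", "economy"]))
print(predict_mnb(model, ["too", "soon"]))
```

Working with sums of logarithms rather than products of probabilities avoids numerical underflow on longer tweets and leaves the argmax unchanged.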

Before applying the classifiers to opinion detection, we considered sentiment analysis, given that sentiment techniques can classify text into positive, negative, or neutral emotional categories. However, a prior study illustrated the differences between sentiment identification and opinion detection [44]. In this study, we applied a sentiment tool (the TextBlob Python package [45]) to the testing dataset and found that more than 60% of its classifications did not align with our manual labels. In some cases, sentiment identification is in line with reopening perception. For example, some users posted positive feelings when restaurants reopened. This positive emotion also implies that the user supported the reopening policy. However, in other cases, sentiment identification may contradict opinion detection. For example, many users expressed that they were unhappy about the lockdown extension. This negative sentiment suggests a supportive attitude towards the reopening policy. Similarly, a positive sentiment might imply that online users supported the stay-at-home order, while a negative sentiment might refer to a surge in cases caused by reopening protests, both of which represent an attitude of opposing reopening. In summary, sentiment classifications could not represent users’ opinions towards the reopening policy and therefore were not applied in this study. Specific tweet examples are presented in Table 5 in S1 Appendix.

Performance measurement

Precision, Recall, and F1-score were applied to assess the classification performance. Precision is the fraction of true positive cases among the cases that a model predicts as positive, while Recall is the fraction of true positive cases among all the relevant cases. The F-measure applied in this research is the harmonic mean of Precision and Recall, known as the F1-score, which serves as a combined rating of test accuracy [46]. The mathematical formulas for Precision, Recall, and F1-score are presented in formulas (7), (8), and (9), respectively. Performance of the classification pipelines on the testing samples is shown in Table 1.

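$$\text{Precision} = \frac{TP}{TP + FP} \tag{7}$$

$$\text{Recall} = \frac{TP}{TP + FN} \tag{8}$$

$$F_1 = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{9}$$

where $TP$, $FP$, and $FN$ denote the numbers of true positives, false positives, and false negatives, respectively.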
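For a two-class problem, these metrics are computed once per class by treating that class as the positive one. The sketch below uses the standard definitions in terms of true positives, false positives, and false negatives. The function name and the toy labels are illustrative and are not the study's data.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall, and F1 for the class given as `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if p == positive and t == positive)
    fp = sum(1 for t, p in pairs if p == positive and t != positive)
    fn = sum(1 for t, p in pairs if p != positive and t == positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Toy labels: 1 = supporting reopening, -1 = opposing reopening.
y_true = [1, 1, 1, -1, -1]
y_pred = [1, 1, -1, 1, -1]
print(precision_recall_f1(y_true, y_pred, positive=1))
print(precision_recall_f1(y_true, y_pred, positive=-1))
```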

Overall, models trained on TF-IDF features outperform those trained on Word2Vec, as shown by both higher training accuracy and higher testing accuracy. A possible explanation is that Word2Vec was not pre-trained on COVID-19 related topics, and thus it might not capture the semantic meanings of some words in the dataset. Moreover, we simply averaged the word vectors in a tweet to represent that tweet, so the trained model might ignore the importance of key words and lose information. As a result, models trained on word embeddings might fail to distinguish between tweets in some contexts. Among the four classifiers built on TF-IDF vectors, MNB slightly outperforms BNB and SVM, as shown by a higher F1-score on both classes and a higher testing accuracy. Although the LR classifier achieves the highest training accuracy, it overfits the training data and yields the lowest testing accuracy. For these reasons, we selected TF-IDF + MNB to build the pipeline and applied it to the whole dataset. More sophisticated word embeddings and classifiers could readily be substituted in other cases where their performance warrants it.
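One way to see the information loss from averaging is that a mean of word vectors ignores word order and weights every word equally. The sketch below uses made-up 3-dimensional vectors (real Word2Vec vectors have 300 dimensions), and both the vectors and the function name are hypothetical. Two token sequences with different meanings end up with the same tweet vector.

```python
def average_embedding(tokens, vectors):
    """Mean of the vectors of the known tokens, or None if none are known."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]

# Toy vectors chosen only for illustration.
toy_vectors = {
    "reopen": [1.0, 0.0, 0.5],
    "not": [0.0, 1.0, -0.5],
    "safe": [0.5, 0.5, 1.0],
    "to": [0.0, 0.0, 0.0],
}

a = ["not", "safe", "to", "reopen"]   # reopening is unsafe
b = ["safe", "to", "not", "reopen"]   # staying closed is safe
vec_a = average_embedding(a, toy_vectors)
vec_b = average_embedding(b, toy_vectors)
print(vec_a, vec_b, vec_a == vec_b)
```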

Results

Temporal analysis

Temporal results.

First, we computed the national-level daily perception as the number of tweets supporting reopening divided by the total number of tweets each day. Fig 2 depicts the temporal changes over the study period, including a 5-day moving average that shows a smoother trend. On most days, the perception was below 0.5, meaning that more tweets opposed reopening than supported it (the blue bar is higher than the orange bar). From late April to early May, a larger proportion of online users perceived that it was too soon to reopen the country. However, when states such as Texas and Florida announced their reopening policies around April 25, tweets supporting reopening began to accumulate, and their volume increased gradually. After May 25, when most states had partially or fully reopened, the perception increased again. Overall, the perception after May 5 showed a general increase despite a short downturn from May 21 to May 25. This observation suggests that online users tended to shift toward supporting reopening as the lockdown extended.
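The daily perception and its 5-day moving average can be computed as in the sketch below. The daily counts are invented for illustration, and the use of a trailing window (the current day plus the four preceding days) is an assumption, since the text does not say whether the window is trailing or centered.

```python
def daily_perception(support_counts, oppose_counts):
    """Share of supporting tweets among all classified tweets, per day."""
    return [s / (s + o) for s, o in zip(support_counts, oppose_counts)]

def moving_average(values, window=5):
    """Trailing moving average; None until a full window is available."""
    averaged = []
    for i in range(len(values)):
        if i + 1 < window:
            averaged.append(None)
        else:
            chunk = values[i - window + 1 : i + 1]
            averaged.append(sum(chunk) / window)
    return averaged

# Invented daily tweet counts for six consecutive days.
support = [40, 45, 50, 55, 60, 65]
oppose = [60, 55, 50, 45, 40, 35]
perception = daily_perception(support, oppose)
smoothed = moving_average(perception, window=5)
print(perception)
print(smoothed)
```

A perception below 0.5 on a given day means that opposing tweets outnumbered supporting ones on that day.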

Fig 2. Daily perception and moving average.

a. Daily perception and 5-day moving average for supporting reopening from April 17 to May 30, 2020. A perception < 0.5 indicates that more online users opposed reopening, while a perception > 0.5 implies that the majority supported reopening. The absolute volume of tweets indicating supporting reopening or opposing reopening is also presented. b. Comparison between the 5-day moving average perception with national polling results.

https://doi.org/10.1371/journal.pone.0254359.g002

Analysis of Twitter data provides insights into the online public’s responses to reopening policy. However, prior studies have shown that social media data might overrepresent young, educated, and urbanized populations [47–49]. Specifically, Mislove et al. (2011) [47] raised a concern about whether Twitter could be representative of the overall population. Their research found that Twitter users significantly overrepresent densely populated regions. In subsequent studies, Barbera and Rivero (2015) [49] showed that Twitter users who discussed politics are likely to be male, to live in urban areas, and to have extreme ideological preferences. Mellon and Prosser (2017) [48] also suggested that Twitter and Facebook users are not representative of the general population with respect to politically relevant characteristics, including vote choice, turnout, age, gender, and education.

Therefore, we compared the estimated perception with national polls to examine how Twitter samples are biased in representing the public on reopening policy, as illustrated in Fig 2. We found that the estimated perception differed from national polls between May 5 and May 15. At other times during the study period, the results based on Twitter data were close to the national polls.

Popular news-driven tweets and their effects.

News and events are important drivers of public perceptions and are often discussed in tweets. Numerous studies have shown that media coverage often exerts a significant impact on public perceptions by altering people’s exposure to information [50–52]. Inspired by these studies, we investigated how important news, especially political and scientific news, drove the discussion on Twitter and how it affected public perceptions over time. We examined the 214 most retweeted tweets in the study period and extracted the news content mentioned in them. These 214 tweets were retweeted 483,336 times in the study period (483,336 of 1,591,216, covering 30% of the data). Among them, 152 are news-driven tweets, characterized by a direct reference to a news event or by main text followed by a link to a news article.

In order to identify the popular news-driven tweets and their impacts on public perceptions, we first classified the 152 news-driven tweets into different categories. Specifically, we considered those tweet contents driven by news or reports mentioning political orders, plans, guidelines, statements, or announcements made by the president, governors, or other politicians as “political-news-driven.” In comparison, we considered those tweets containing scientific-related news or evidence, including scientific research findings, experimental data and reports, and guidance from experts, health officials, and research institutes as “scientific-news-driven.” For example, the following tweet is “scientific-news-driven” as it opens up a discussion based on the scientific evidence that the testing kits were not enough to guarantee a safe reopening environment.

“We can’t safely reopen the economy until we can test millions of asymptomatic people and find out who can spread the virus. That requires a massive testing infrastructure and robust contact tracing we don’t yet have. The federal government must lead and stop blaming the states. https://t.co/vYr3kAKMTz”

The following tweet reflects politics-related opinions that the administration shelved CDC guidance on how and when to reopen:

“Reasons why CDC guidance was shelved: 1. Guidelines say states should not reopen while their Covid cases are increasing. 2. Trump admin wants states to reopen regardless. 3. White House does not want to be accountable, and guidelines would make them so. https://t.co/6VXqDaQSFc”

The human labeling process for the types of news was similar to the tweet labeling explained in the “Data preparation and model framework” section. A tweet was first labeled by two team members and checked by a third if there was an inconsistency. As a result, 95 of the 152 tweets were identified as related to political news, while 24 were related to scientific news. Some tweets could be driven by multiple types of news: among the 152 tweets, five referred to both political news and scientific news. In addition to the tweets related to political or scientific news, 38 of the 152 were related to news that reported pandemic facts (e.g., death toll, new cases, testing), social events (e.g., protests), or economic impact (e.g., unemployment). For example,

“14.7% unemployment. It’s time to reopen America. We’re not going to be able to protect our elders or the sick if we have no economy.”

“Heads up re Alabama: ‘Alabama saw its largest single-day increase in new cases Monday, a little more than three weeks after the stay-at-home order expired on April 30 and two weeks after the state allowed restaurants and bars to reopen on May 11.’ https://t.co/TwBHTWXb0L”

Having identified the popular tweets, we summarized typical political (in red boxes), scientific (in blue boxes), and other types (in grey boxes) of news or evidence, as illustrated in Fig 3. We examined the contents and the classifications (automatically assigned by the trained model) of these popular tweets and found that news or opinions related to reopening policy plans, announcements, economic recovery, and controlled outbreaks often resulted in supporting reopening. However, news or opinions related to the pandemic outbreak, limited testing capacity, data manipulation, concerns about increasing cases and deaths, and safe recovery often led to a perception of opposing a premature reopening.

In addition, we noticed that the 95 “political-news-driven” tweets were retweeted 175,908 times, while the 24 “scientific-news-driven” tweets were retweeted a total of 56,573 times. Among the 95 “political-news-driven” tweets, 26 tweets (27.4%) with a total of 36,929 (21%) retweets expressed a positive sentiment of supporting reopening, while 69 tweets (72.6%) with a total of 138,979 (79%) retweets were negative about reopening. Among the 24 “scientific-news-driven” tweets, 3 tweets (12.5%) with a total of 3,108 (5.5%) retweets supported reopening, and 21 tweets (87.5%) with a total of 53,465 (94.5%) retweets opposed reopening. This result shows that the majority of both the political news and the scientific news that were extensively discussed on Twitter resulted in the view of opposing reopening on the temporal horizon. In particular, the scientific news implied opposition to reopening even more strongly than the political news did, for example by warning against premature reopening or highlighting data manipulation issues. Because these tweets were widely retweeted, a substantial number of Twitter users appear to have shared the standpoint of the original tweet. Although this study did not cover all the tweets that contained political and scientific news, we recognized that investigating these popular tweets could help support the analysis of how political and scientific news drove people’s perception of reopening policy in the Twitter community.
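The percentages quoted above follow directly from the reported tweet and retweet counts. As a quick check, the minimal Python sketch below recomputes them. The dictionary layout and the `shares` helper are illustrative assumptions and are not part of the study's pipeline.

```python
# Counts reported in the paragraph above: (number of tweets, number of retweets).
counts = {
    "political": {"support": (26, 36_929), "oppose": (69, 138_979)},
    "scientific": {"support": (3, 3_108), "oppose": (21, 53_465)},
}


def shares(by_stance):
    """Return {stance: (tweet share %, retweet share %)}, rounded to one decimal."""
    total_tweets = sum(tweets for tweets, _ in by_stance.values())
    total_retweets = sum(retweets for _, retweets in by_stance.values())
    return {
        stance: (
            round(100 * tweets / total_tweets, 1),
            round(100 * retweets / total_retweets, 1),
        )
        for stance, (tweets, retweets) in by_stance.items()
    }


for news_type, by_stance in counts.items():
    print(news_type, shares(by_stance))
```

The retweet share is the more informative of the two figures here, since it weights each popular tweet by how widely it spread.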

Be the first mover, follower, or late mover.

The appropriate time of reopening can be an additional driving factor on the temporal dimension that affected the perception. Being the first mover, follower or late mover is a significant question for policymakers to consider. A risk-based decision-making process should be taken into account to determine the appropriate time to reopen the economy [53]. This question has been examined in the literature of business strategy, and researchers raised the concern that being the first mover could result in potential hazards [54]. In this section, we attempted to evaluate the impacts on the public perception under multiple scenarios of reopening policies.

We categorized the reopening policies into three groups based on the date when a reopening policy took effect, including “first mover” (before May 5), “follower” (from May 5 to May 14), and “late mover” (after May 14). Such division is mainly based upon the following two reasons. First, it generates three time periods with almost equal length. Second, we observed that May 5 and May 14 were the two time points that multiple U.S. states altered their reopen statuses [33]. Then we investigated the policy’s effects on the perception level, as exhibited in Table 2. Typical examples of dynamic perception changes are presented in Fig 4. The impact on perception was evaluated based on a 3-day average trend analysis after the implementation of a reopening policy, as presented below.

  • Negative = perception dropped by more than 3 percentage points within 3 days (< -3%)
  • Slight negative = perception dropped by 1 to 3 percentage points within 3 days (-3% to -1%)
  • Neutral = perception changed by less than 1 percentage point within 3 days (-1% to 1%)
  • Slight positive = perception rose by 1 to 3 percentage points within 3 days (1% to 3%)
  • Positive = perception rose by more than 3 percentage points within 3 days (> 3%)
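The grouping by effective date and the five impact labels listed above can be sketched as two small Python functions. This is only an illustration under stated assumptions: the function names are hypothetical, the input is taken to be the 3-day change in perception expressed in percentage points, and the treatment of values that fall exactly on a boundary (1 or 3 points) is a choice made here, since the text does not specify it.

```python
from datetime import date


def mover_group(effective: date) -> str:
    """Assign a reopening policy to a group by the date it took effect."""
    if effective < date(2020, 5, 5):
        return "first mover"
    if effective <= date(2020, 5, 14):
        return "follower"
    return "late mover"


def impact_category(change_pct_points: float) -> str:
    """Map a 3-day change in perception (percentage points) to an impact label.

    Assumption: exact boundary values (-1 and 1) count as neutral and
    exact values of -3 and 3 count as slight changes.
    """
    if change_pct_points < -3:
        return "negative"
    if change_pct_points < -1:
        return "slight negative"
    if change_pct_points <= 1:
        return "neutral"
    if change_pct_points <= 3:
        return "slight positive"
    return "positive"


print(mover_group(date(2020, 5, 1)), impact_category(-3.5))
```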
Fig 4. The impact of reopening policy on perception level.

a. Group a: first mover (partially reopen or fully reopen before May 5). b. Group b: follower (partially reopen or fully reopen between May 5 and May 14). c. Group c: late mover (partially reopen or fully reopen after May 14). The upward arrow corresponds to the time of full reopen, and the downward arrow corresponds to the time of partial reopen.

https://doi.org/10.1371/journal.pone.0254359.g004

For the “first mover” group, the pattern shows that, for 10 out of 21 states, the perception that emerged on Twitter supported an early partial reopening policy (allowing some major sectors to reopen) but reacted adversely to an early full reopening policy (allowing every major sector to reopen). One possible explanation is that Twitter users were aware of the risks of increasing cases that might result from an early full reopening policy, even though such a policy aimed to reinvigorate a slumping economy. Moreover, 8 out of 21 states in the “first mover” group reacted negatively even to an early partial reopening, while only 3 out of 21 states (“AL,” “NE,” and “ND”) displayed a positive reaction to an early full reopening policy. For the “follower” group, the observed pattern appeared to be consistent: a neutral or slightly positive reaction to reopening policies was reported for 16 out of 17 states in this group, the exception being Nevada (“NV”), where an adverse reaction was observed. For the “late mover” group, the overall response was positive: 14 out of 19 states in this group showed a positive sentiment toward the partial or full reopening policy. However, 5 states (“IN,” “WV,” “OH,” “NC,” and “KY”) displayed a negative reaction to a full reopening policy. In conclusion, the perception towards reopening policy shifted from negative to positive as the lockdown extended.

Overall, a partial reopening policy was likely to result in a more or less favorable increase on the perception level, possibly because people were concerned about the economic pressure under the COVID-19 pandemic. The result also suggests that in many U.S. states, the public willingness expressed on Twitter was not inclined to support a swift reopening strategy. By contrast, being the follower or late mover rather than the first mover of reopening policy was likely to be favored by Twitter users. However, the trends of different U.S. states could show variations as the COVID-19 outbreak hit with varying severity and time.

Spatial analysis

Spatial results.

From the spatial perspective, we focused on the analysis of state-level perception. We first binned the data and calculated the state-level perception based on the total number of tweets whose users’ registered locations indicated the same state (e.g., “California, USA,” “Los Angeles,” “California,” and “Santa Monica CA” all indicate the state of California). As shown in Fig 5, the average perception in the study period ranges from the lowest 33.8% to the highest 54.7% across the U.S. states. The five states with the highest perception for supporting reopening policy were West Virginia (WV, 54.7%), Missouri (MI, 54.6%), Tennessee (TN, 54.5%), Idaho (ID, 53.4%), and Oklahoma (OK, 53.1%). In comparison, the five states with the lowest perceptions were Vermont (VT, 33.8%), Washington (WA, 36.7%), Maryland (MD, 37.8%), Oregon (OR, 38.2%), and Massachusetts (MA, 38.2%). Overall, states located in the West, Midwest (especially East North Central), and Northeast regions had a lower perception in comparison to states located in the South and Middle areas. Moreover, we presented the geographical distribution of state-level perception on a weekly basis in Fig 5. The consistent pattern observed in Fig 5 shows that the majority of states located in the South and Midwest, especially the West North Central division, held a higher perception of supporting reopening.
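The binning step described above, where free-text profile locations are mapped to a state before perception is aggregated, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the lookup tables are tiny hypothetical samples, the label coding (1 for supporting, -1 for opposing) is assumed, and perception is assumed to be the share of supporting tweets among the supporting and opposing tweets of a state.

```python
import re
from collections import Counter

# Hypothetical lookup tables; a real table would cover every state and many cities.
STATE_NAMES = {"california": "CA", "maryland": "MD", "vermont": "VT"}
CITY_TO_STATE = {"los angeles": "CA", "santa monica": "CA", "college park": "MD"}
STATE_CODES = set(STATE_NAMES.values())


def state_from_location(location):
    """Return a two-letter state code for a free-text location, or None."""
    text = location.strip()
    lowered = text.lower()
    # 1) Full state name anywhere in the string, e.g. "California, USA".
    for name, code in STATE_NAMES.items():
        if re.search(rf"\b{re.escape(name)}\b", lowered):
            return code
    # 2) Known city name, e.g. "Los Angeles".
    for city, code in CITY_TO_STATE.items():
        if re.search(rf"\b{re.escape(city)}\b", lowered):
            return code
    # 3) Trailing upper-case state code, e.g. "Annapolis MD".
    match = re.search(r"\b([A-Z]{2})\s*$", text)
    if match and match.group(1) in STATE_CODES:
        return match.group(1)
    return None


def state_perception(tweets):
    """tweets: iterable of (location, label); returns {state: share of label 1}."""
    support, total = Counter(), Counter()
    for location, label in tweets:
        state = state_from_location(location)
        if state is None or label not in (1, -1):
            continue  # skip unresolved locations and unrelated tweets
        total[state] += 1
        if label == 1:
            support[state] += 1
    return {state: support[state] / total[state] for state in total}


sample = [("California, USA", 1), ("Los Angeles", -1),
          ("Santa Monica CA", 1), ("California", 1), ("Paris, France", 1)]
print(state_perception(sample))
```

Matching full state names before city names is a design choice that avoids assigning a tweet to the wrong state when a city name is shared by several states.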

Fig 5. State-level average perception.

a. Overall state-level average perception from April 17 to May 30, 2020. b. State-level average perception from April 17 to April 23. c. State-level average perception from April 24 to April 30. d. State-level average perception from May 1 to May 7. e. State-level average perception from May 8 to May 14. f. State-level average perception from May 15 to May 21. g. State-level average perception from May 22 to May 28. The figure was generated using the Python choropleth graphing libraries (the code was released under the MIT license) [55]. Although the figure is similar, it is not identical to the original image and is therefore for illustrative purposes only.

https://doi.org/10.1371/journal.pone.0254359.g005

Correlation analysis.

We extended the spatial analysis to focus on the relations between the state-level perception and geodemographic attributes. This analysis aims to identify geodemographic factors that were associated with the changes of the state-level perception. Previous studies revealed that socio-economic and political factors could affect the public perception, such as age, gender, race, income, educational level, party affiliation, and area of residence [25–27]. Therefore, we first performed a correlation analysis with nine selected geodemographic factors, including educational level (bachelor’s degree %) [56], health (health value 2018) [57], party affiliation (net democratic) [58], household income (average household income 2018) [59], age (median age 2018) [60], gender (male to female ratio 2018) [61], ethnic group (non-white percentage 2018) [62], and two factors related to the pandemic: the reported case rate (as of June 2) [63] and unemployment change (unemployment change from May 2019 to May 2020) [64]. The correlation results are exhibited in Fig 6.

Fig 6. Correlation analysis with selected factors.

a. Correlation with bachelor degree % (R = -0.69, p-value < 0.001). b. Correlation with health value (R = -0.66, p-value < 0.001). c. Correlation with net democratic (R = -0.61, p-value < 0.001). d. Correlation with average household income (R = -0.54, p-value < 0.001). e. Correlation with unemployment change rate (R = 0.35, p-value = 0.013). f. Correlation with case rate (R = -0.27, p-value = 0.060). g. Correlation with median age (R = -0.098, p-value = 0.497). h. Correlation with male to female ratio (R = -0.06, p-value = 0.677). i. Correlation with non-white % (R = -0.037, p-value = 0.801).

https://doi.org/10.1371/journal.pone.0254359.g006

According to Fig 6, the state-level perception exhibited a moderate negative correlation with bachelor degree (R = -0.69, p-value < 0.001), health value (R = -0.66, p-value < 0.001), net democratic (R = -0.61, p-value < 0.001), and average household income (R = -0.54, p-value < 0.001). Twitter users in states with higher health value, higher educational level, higher average household income, and a stronger Democratic lean were less likely to support reopening policy. For the two selected factors directly related to the COVID-19 pandemic, the state-level perception showed a weak correlation with unemployment change (R = 0.35, p-value = 0.013) and case rate (R = -0.27, p-value = 0.060). In states with a higher case rate, Twitter users tended to be less inclined to support reopening policy, although this correlation was not significant at the 0.05 level. The perception was weakly and positively correlated with the unemployment change, indicating that users were more likely to support reopening policy when the state had experienced a larger increase in unemployment. Among the investigated factors, the perception level did not show a significant correlation with median age, male to female ratio, or non-white ratio, as demonstrated by p-values > 0.05. One interesting observation from Fig 6 was that the state-level perception showed a moderate correlation with socioeconomic or political factors (education, party affiliation, health, income) within a state, a weak correlation with the factors related to the pandemic (unemployment change rate, case rate), and no significant correlation with demographic attributes (gender, age, ethnic group).
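For readers who want to reproduce this kind of analysis, the R values reported above are Pearson correlation coefficients between the state-level perception and each factor. The sketch below shows the computation in plain Python, together with the t statistic commonly used to test whether a correlation differs from zero. The function names are illustrative, and the choice of n = 50 in the example assumes one observation per U.S. state.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for H0: no linear correlation (n - 2 degrees of freedom)."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Example: the unemployment-change correlation, assuming 50 state-level points.
print(round(t_statistic(0.35, 50), 2))
```

Under that assumption, the statistic for R = 0.35 is about 2.59, and the two-sided p-value is then read from a t distribution with 48 degrees of freedom.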

Regression analysis.

Some selected geodemographic factors might be inter-correlated (e.g., bachelor degree, health value, household income, and net democratic). We therefore suspected that not all of the selected attributes would be statistically significant in estimating the perception level, and we performed a regression analysis to identify which socioeconomic and political factors (independent variables) could explain the changes of the perception level (dependent variable).

Ordinary Least Squares (OLS) was applied to fit a multiple linear regression model. The OLS model minimizes the sum of the squared differences between the observed values of the dependent variable (perception level) in the dataset and those predicted by the linear function. Since the OLS model assumes no multicollinearity and homoscedasticity, we performed two diagnostic tests on the model: a multicollinearity test and a heteroscedasticity test. Prior to feeding the data into the model, we selected the six features that showed weak to moderate correlations with perception levels (explained in Section 3.2.2. Correlation analysis) and applied min-max scaling to normalize these input values into the range [0, 1], so that features with larger numeric ranges would not dominate those with smaller ranges.
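The two preprocessing and fitting steps described above, min-max scaling followed by an OLS fit, can be illustrated with a short dependency-free Python sketch. It is not the code used in the study: the helper names are hypothetical, the normal equations are solved by plain Gaussian elimination for clarity, and each feature column is assumed to contain at least two distinct values.

```python
def min_max_scale(column):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(column), max(column)
    return [(value - lo) / (hi - lo) for value in column]  # assumes hi > lo


def least_squares(rows, y):
    """Solve the normal equations (X'X) b = X'y by Gaussian elimination."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    m = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, k):
            factor = m[r][col] / m[col][col]
            for c in range(col, k + 1):
                m[r][c] -= factor * m[col][c]
    beta = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(m[i][j] * beta[j] for j in range(i + 1, k))
        beta[i] = (m[i][k] - tail) / m[i][i]
    return beta


def ols(features, y):
    """Fit y = b0 + b1*x1 + ... ; features is a list of feature columns."""
    rows = [[1.0] + [col[i] for col in features] for i in range(len(y))]
    return least_squares(rows, y)


# Toy data (not from the study): y is an exact linear function of two features.
x1 = min_max_scale([1, 2, 3, 4])
x2 = min_max_scale([10, 0, 5, 20])
y = [2 + 3 * a - b for a, b in zip(x1, x2)]
print([round(b, 6) for b in ols([x1, x2], y)])
```

Because both features are scaled to the same range before fitting, the magnitudes of the fitted coefficients become directly comparable across features.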

The Variance Inflation Factor (VIF) quantifies the severity of multicollinearity. It provides an index that measures how much the variance (the square of the estimate’s standard deviation) of an estimated regression coefficient is inflated due to collinearity [65]. A VIF exists for each of the independent variables in a multiple regression model, and the VIF for the ith independent variable is represented as [65]:

$$\mathrm{VIF}_i = \frac{1}{1 - R_i^2} \tag{10}$$

where $R_i^2$ is the R-squared value obtained by regressing the ith independent variable on the remaining independent variables. A VIF of 1 implies that there is no correlation between the ith independent variable and the remaining variables, and thus the variance is not inflated. As a rule of thumb, VIF > 5 is cause for concern, and VIF > 10 indicates a serious collinearity problem [65]. In this study, we performed the VIF analysis on the six selected factors, and the VIF score for each factor is listed in Table 3. The VIF score for the variable “bachelor degree %” is 7.297, which raises a concern about collinearity in the regression model. Therefore, we removed this variable from the subsequent regression analysis.
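The VIF definition above translates directly into code: regress one feature on the others, take the R-squared of that auxiliary regression, and return 1 / (1 - R²). The following sketch does this without external libraries. The helper names and the toy data are illustrative assumptions, and the sketch does not guard against a perfectly collinear feature, for which R² equals 1 and the VIF is undefined.

```python
def least_squares(rows, y):
    """Solve the normal equations (X'X) b = X'y by Gaussian elimination."""
    k = len(rows[0])
    m = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, k):
            factor = m[r][col] / m[col][col]
            for c in range(col, k + 1):
                m[r][c] -= factor * m[col][c]
    beta = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(m[i][j] * beta[j] for j in range(i + 1, k))
        beta[i] = (m[i][k] - tail) / m[i][i]
    return beta


def vif(columns, i):
    """Variance inflation factor of feature i given all feature columns."""
    target = columns[i]
    others = [col for j, col in enumerate(columns) if j != i]
    # Auxiliary regression of feature i on an intercept and the other features.
    rows = [[1.0] + [col[k] for col in others] for k in range(len(target))]
    beta = least_squares(rows, target)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in rows]
    mean = sum(target) / len(target)
    ss_res = sum((t - f) ** 2 for t, f in zip(target, fitted))
    ss_tot = sum((t - mean) ** 2 for t in target)
    r_squared = 1 - ss_res / ss_tot
    return 1 / (1 - r_squared)


# Toy data (not from the study): a and b are uncorrelated, c is nearly a + b.
a = [1, -1, 1, -1]
b = [1, 1, -1, -1]
c = [2.1, 0.1, -0.1, -2.0]
print(vif([a, b], 0), vif([a, b, c], 0) > 10)
```

In the toy data, adding the nearly redundant column `c` pushes the VIF of `a` well above the rule-of-thumb threshold of 10, which mirrors the reasoning used to drop a highly collinear variable.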

Meanwhile, the OLS model assumes that the observations have the same error variance. We performed a heteroscedasticity analysis using the White test [66]. Heteroscedasticity refers to the circumstance in which the conditional variance is not constant (conditional variance is the variability of the dependent variable for each value of the independent variables) [67]. According to the White test result, the F-statistic is 0.909, and the p-value is 0.581. The test is therefore not significant, and we fail to reject the null hypothesis that the residuals are homoscedastic.

Based on the results of these two diagnostic tests, we selected the independent variables, including health value, net democratic, average household income, and unemployment change, to perform the regression analysis. The number of observations is 50 (each U.S. state is considered as a data point in the model). As a result, the R-squared value is 0.55, implying that these independent variables can explain 55% of the variance in the dependent variable (perception level). The specific result of each independent variable is presented in Table 3.

According to Table 3, the t scores and p-values were used for the hypothesis testing of the coefficients. The variables net democratic (p-value = 0.004) and health value (p-value = 0.001) were statistically significant in explaining the state-level perception.

From correlation and regression analyses, it is reasonable to conclude that the state-level perception was likely to be associated with the changes of party affiliation (net democratic) and health condition (health value), as these demographic characteristics within a state could affect its public perception of supporting reopening policy.

Discussion

The COVID-19 pandemic has posed significant health threats to U.S. society and weakened the domestic economy since its outbreak in March 2020 [68]. Reopening the country after the shutdown was a challenging decision for policymakers. Premature reopening might trigger a second wave of widespread infections that could invalidate previous efforts [5], but a prolonged lockdown could dampen the economy and cause severe mental health problems for people [69, 70].

The government’s decision to reopen the country should be informed by public concerns, thoughts, and behaviors. Social media presents a rich source of information for government agencies to detect the impact of their policies on the public. This study anatomizes the debate on Twitter surrounding the reopening policy from temporal and spatial perspectives. The goal of this study is to provide policymakers with insights into the perception that emerged on social media and its association with geodemographic factors.

In this study, we investigated more than 1.5 million tweets and employed NLP and machine learning techniques to classify the tweets into supporting or opposing reopening. With these classifications, we computed the perceptions and conducted the analysis from temporal and spatial dimensions. From the temporal dimension, our results show that popular political-news-driven and scientific-news-driven tweets could result in a view of opposing reopening. On top of that, we divided the reopening policies into three scenarios: first mover (before May 5), follower (May 5 ~ May 14), and late mover (after May 14). The result manifests that an early full reopening policy often exerted a negative influence on supporting reopening, but a late reopening policy or an early partial reopening policy could result in the positive sentiment on supporting reopening.

From the spatial dimension, we explored the correlations between the state-level perception and geodemographic factors. Our findings reveal a significant difference on the average state-level perceptions. The state-level perception showed a moderate negative correlation with socioeconomic and political factors, including education, health, party affiliation, and income. The state-level average perception also showed a weak correlation with factors relative to the COVID-19 pandemic, including the unemployment change rate and reported case rate. However, the perception was unlikely to be correlated with intrinsic demographic attributes on population, such as age, gender, and ethnic groups. More importantly, through the regression analysis, we found that the state-level perception was likely to be associated with the changes of party affiliation and health condition.

In this study, we demonstrate the feasibility of using social media data to track online public perceptions of reopening policy, and present a quantitative process to develop a pipeline to classify the tweets. This social media-based approach can be generalized to quantify the level of online perceptions on a policy or an event and has the advantages of rapidity, quantity, and spatial coverage. From practical perspectives, this study provides an instrument for the government agencies to detect the perception and insights on the public risk propensity, which further supports them to formulate a well-thought-out strategy.

Despite the aforementioned benefits, some limitations need to be highlighted. First, a small portion of unrelated tweets (class 0) might not be fully cleaned from the dataset since some textual patterns were not present in the collected samples. Second, using the key term “reopen” to download the data might result in some information loss. For example, this filtering would eliminate tweets, such as “I don’t think the government should lift the stay-at-home order too soon,” which expresses an opinion towards reopening but does not contain the key word. Third, since the testing accuracy is 73%, misclassifications could lead to biases in the result analysis. However, it was observed that the F1-scores of class 1 and class -1 were close, and therefore the impacts from misclassified tweets might be offset. Moreover, a large set of retweets were aligned with the same manual labels so the actual accuracy of labels could be much higher.

Ongoing and future work will first pay attention to the improvement of text classification models. One possible direction is to apply more sophisticated classifiers, such as deep neural networks. Another piece of future work will incorporate social media data from Facebook or Instagram into current findings and extend this study to establish a public perception tracking system, which may benefit government agencies, health officials, research institutes, and the residents.

Conclusion

This study utilized a social media-based approach to investigate public perceptions towards reopening policy and anatomized the debate surrounding reopening policy on Twitter. This study investigated more than 2 million Twitter postings related to reopening policy in the date range from April 17 to May 30, 2020, and it built a pipeline for text classification using NLP and machine learning techniques. The results were analyzed from both temporal and spatial perspectives. From the temporal horizon, the results suggested that popular tweets mentioning political news and scientific news expressed more negative sentiment on supporting reopening. Moreover, being the first mover to reopen the state was more likely to result in a negative response to reopening, while being the late mover triggered a more positive response. From the spatial horizon, the state-level perception exhibited a moderate and negative correlation with socioeconomic and political factors, including education level, health value, party affiliation, and household income. However, it did not show apparent correlations with intrinsic attributes of the population such as age, gender, or ethnic group. The research findings provide policymakers with meaningful insights to track the public perception and understand how it reacts to and interacts with related policies or news events, and thus enable policymakers to enact appropriate solutions to implement reopening phases.

Acknowledgments

We thank Kunqi Zhang and the students in the ENCE422 (Spring 2020) class at the University of Maryland, College Park, for helping us classify the tweets into supporting or opposing reopening policy.

References

  1. 1. Cucinotta D, Vanelli M. WHO Declares COVID-19 a Pandemic. Acta Bio Medica Atenei Parmensis [Internet]. 2020 Mar 19 [cited 2020 Jul 6];91(1):157–60. Available from: http://doi.org/10.23750/abm.v91i1.9397. pmid:32191675
  2. 2. Coronavirus (COVID-19) [Internet]. Google News. [cited 2020 Jul 23]. https://news.google.com/covid19/map?hlen-US&gl=US&ceid=US:en.
  3. 3. Ferguson N, Laydon D, Nedjati Gilani G, Imai N, Ainslie K, Baguelin M, et al. Report 9: Impact of non-pharmaceutical interventions (NPIs) to reduce COVID19 mortality and healthcare demand [Internet]. Imperial College London; 2020 Mar [cited 2020 Jul 9]. http://spiral.imperial.ac.uk/handle/10044/1/77482.
  4. 4. CNN TJ. Critics say lockdowns will be more damaging than the virus. Experts say it’s a false choice [Internet]. CNN. [cited 2020 Jul 23]. https://www.cnn.com/2020/05/29/europe/lockdown-skeptics-coronavirus-intl/index.html.
  5. 5. López L, Rodó X. The end of social confinement and COVID-19 re-emergence risk. Nat Hum Behav [Internet]. 2020 Jul [cited 2020 Jul 23];4(7):746–55. Available from: http://www.nature.com/articles/s41562-020-0908-8. pmid:32572175
  6. 6. Public Health Principles for a Phased Reopening During COVID-19: Guidance for Governors.: 24.
  7. 7. Lorenz-Spreen P, Lewandowsky S, Sunstein CR, Hertwig R. How behavioural sciences can promote truth, autonomy and democratic discourse online. Nat Hum Behav [Internet]. 2020 Jun 15 [cited 2020 Jul 6]; Available from: http://www.nature.com/articles/s41562-020-0889-7. pmid:32541771
  8. 8. Bavel JJV, Baicker K, Boggio PS, Capraro V, Cichocka A, Cikara M, et al. Using social and behavioural science to support COVID-19 pandemic response. Nat Hum Behav [Internet]. 2020 May [cited 2020 Jul 6];4(5):460–71. Available from: http://www.nature.com/articles/s41562-020-0884-z. pmid:32355299
  9. 9. Sharma K, Seo S, Meng C, Rambhatla S, Liu Y. COVID-19 on Social Media: Analyzing Misinformation in Twitter Conversations. arXiv:200312309 [cs] [Internet]. 2020 May 8 [cited 2020 Jul 6]; http://arxiv.org/abs/2003.12309.
  10. 10. Medford RJ, Saleh SN, Sumarsono A, Perl TM, Lehmann CU. An “Infodemic”: Leveraging High-Volume Twitter Data to Understand Public Sentiment for the COVID-19 Outbreak [Internet]. Health Informatics; 2020 Apr [cited 2020 Jul 6]. Available from: http://medrxiv.org/lookup/doi/10.1101/2020.04.03.20052936.
  11. 11. Lai D, Wang D, Calvano J, Raja AS, He S. Addressing immediate public coronavirus (COVID-19) concerns through social media: Utilizing Reddit’s AMA as a framework for Public Engagement with Science. Fu K, editor. PLoS ONE [Internet]. 2020 Oct 6 [cited 2020 Oct 25];15(10):e0240326. Available from: https://dx.plos.org/10.1371/journal.pone.0240326. pmid:33021985
  12. 12. Chen Q, Min C, Zhang W, Wang G, Ma X, Evans R. Unpacking the black box: How to promote citizen engagement through government social media during the COVID-19 crisis. Computers in Human Behavior [Internet]. 2020 Sep [cited 2020 Oct 25];110:106380. Available from: https://linkinghub.elsevier.com/retrieve/pii/S0747563220301333. pmid:32292239
  13. 13. Chan AKM, Nickson CP, Rudolph JW, Lee A, Joynt GM. Social media for rapid knowledge dissemination: early experience from the COVID-19 pandemic. Anaesthesia [Internet]. 2020 Mar 31 [cited 2020 Oct 25]; Available from: http://doi.wiley.com/10.1111/anae.15057. pmid:32227594
  14. 14. Murri R, Segala FV, Del Vecchio P, Cingolani A, Taddei E, Micheli G, et al. Social media as a tool for scientific updating at the time of COVID pandemic: Results from a national survey in Italy. Di Gennaro F, editor. PLoS ONE [Internet]. 2020 Sep 3 [cited 2020 Oct 25];15(9):e0238414. Available from: https://dx.plos.org/10.1371/journal.pone.0238414. pmid:32881933
  15. 15. Depoux A, Martin S, Karafillakis E, Preet R, Wilder-Smith A, Larson H. The pandemic of social media panic travels faster than the COVID-19 outbreak. Journal of Travel Medicine [Internet]. 2020 May 18 [cited 2020 Jul 6];27(3):taaa031. Available from: https://academic.oup.com/jtm/article/doi/10.1093/jtm/taaa031/5775501. pmid:32125413
  16. 16. Gao J, Zheng P, Jia Y, Chen H, Mao Y, Chen S, et al. Mental health problems and social media exposure during COVID-19 outbreak. Hashimoto K, editor. PLoS ONE [Internet]. 2020 Apr 16 [cited 2020 Jul 6];15(4):e0231924. Available from: https://dx.plos.org/10.1371/journal.pone.0231924. pmid:32298385
  17. 17. Christensen SR, Pilling EB, Eyring JB, Dickerson G, Sloan CD, Magnusson BM. Political and personal reactions to COVID-19 during initial weeks of social distancing in the United States. Santana GL, editor. PLoS ONE [Internet]. 2020 Sep 24 [cited 2020 Oct 25];15(9):e0239693. Available from: https://dx.plos.org/10.1371/journal.pone.0239693. pmid:32970761
  18. 18. Nguyen TT, Criss S, Dwivedi P, Huang D, Keralis J, Hsu E, et al. Exploring U.S. Shifts in Anti-Asian Sentiment with the Emergence of COVID-19. IJERPH [Internet]. 2020 Sep 25 [cited 2020 Oct 25];17(19):7032. Available from: https://www.mdpi.com/1660-4601/17/19/7032. pmid:32993005
  19. 19. Iglesias-Sánchez PP, Vaccaro Witt GF, Cabrera FE, Jambrino-Maldonado C. The Contagion of Sentiments during the COVID-19 Pandemic Crisis: The Case of Isolation in Spain. IJERPH [Internet]. 2020 Aug 14 [cited 2020 Oct 25];17(16):5918. Available from: https://www.mdpi.com/1660-4601/17/16/5918. pmid:32824110
  20. 20. Xue J, Chen J, Chen C, Zheng C, Li S, Zhu T. Public discourse and sentiment during the COVID 19 pandemic: Using Latent Dirichlet Allocation for topic modeling on Twitter. Zhao J, editor. PLoS ONE [Internet]. 2020 Sep 25 [cited 2020 Oct 25];15(9):e0239441. Available from: https://dx.plos.org/10.1371/journal.pone.0239441. pmid:32976519
  21. 21. Koylu C, Larson R, Dietrich BJ, Lee K-P. CarSenToGram: geovisual text analytics for exploring spatiotemporal variation in public discourse on Twitter. Cartography and Geographic Information Science [Internet]. 2019 Jan 2 [cited 2021 Apr 18];46(1):57–71. Available from: https://www.tandfonline.com/doi/full/10.1080/15230406.2018.1510343.
  22. 22. Williams ML, Burnap P, Javed A, Liu H, Ozalp S. Hate in the Machine: Anti-Black and Anti-Muslim Social Media Posts as Predictors of Offline Racially and Religiously Aggravated Crime. The British Journal of Criminology [Internet]. 2019 Jul 23 [cited 2021 Apr 18];azz049. Available from: https://academic.oup.com/bjc/advance-article/doi/10.1093/bjc/azz049/5537169.
  23. 23. Lansley G, Longley PA. The geography of Twitter topics in London. Computers, Environment and Urban Systems [Internet]. 2016 Jul [cited 2021 Apr 18];58:85–96. Available from: https://linkinghub.elsevier.com/retrieve/pii/S0198971516300394.
  24. 24. Han X, Wang J, Zhang M, Wang X. Using Social Media to Mine and Analyze Public Opinion Related to COVID-19 in China. IJERPH [Internet]. 2020 Apr 17 [cited 2021 Apr 18];17(8):2788. Available from: https://www.mdpi.com/1660-4601/17/8/2788. pmid:32316647
  25. 25. Kannan VD, Brown TM, Kunitz SJ, Chapman BP. Political parties and mortality: The role of social status and personal responsibility. Social Science & Medicine [Internet]. 2019 Feb 1 [cited 2020 Jun 30];223:1–7. Available from: http://www.sciencedirect.com/science/article/pii/S0277953619300292. pmid:30684874
  26. 26. Karytsas S, Theodoropoulou H. Socioeconomic and demographic factors that influence publics’ awareness on the different forms of renewable energy sources. Renewable Energy [Internet]. 2014 Nov 1 [cited 2020 Jun 10];71:480–5. Available from: http://www.sciencedirect.com/science/article/pii/S0960148114003346.
  27. 27. Leeper TJ, Slothuus R. Political Parties, Motivated Reasoning, and Public Opinion Formation. Political Psychology [Internet]. 2014 [cited 2020 Jun 30];35(S1):129–56. Available from: https://onlinelibrary.wiley.com/doi/abs/10.1111/pops.12164.
  28. 28. Chen J, Cheng Z, Gong K, Li J. Riding Out the COVID-19 Storm: How Government Policies Affect SMEs in China. SSRN Journal [Internet]. 2020 [cited 2020 Oct 19]; Available from: https://www.ssrn.com/abstract 3660232.
  29. Fang Y, Nie Y, Penny M. Transmission dynamics of the COVID‐19 outbreak and effectiveness of government interventions: A data‐driven analysis. J Med Virol [Internet]. 2020 Jun [cited 2020 Oct 25];92(6):645–59. Available from: https://onlinelibrary.wiley.com/doi/abs/10.1002/jmv.25750. pmid:32141624
  30. Wei J, Wang L, Yang X. Game analysis on the evolution of COVID-19 epidemic under the prevention and control measures of the government. Jiang L-L, editor. PLoS ONE [Internet]. 2020 Oct 23 [cited 2020 Oct 25];15(10):e0240961. Available from: https://dx.plos.org/10.1371/journal.pone.0240961. pmid:33095788
  31. Nguyen T, Gupta S, Andersen M, Bento A, Simon K, Wing C. Impacts of State Reopening Policy on Human Mobility [Internet]. Cambridge, MA: National Bureau of Economic Research; 2020 May [cited 2021 Apr 19] p. w27235. Report No.: w27235. Available from: http://www.nber.org/papers/w27235.pdf.
  32. Kaufman BG, Whitaker R, Mahendraratnam N, Smith VA, McClellan MB. Comparing Associations of State Reopening Strategies with COVID-19 Burden. J Gen Intern Med [Internet]. 2020 Dec [cited 2021 Apr 19];35(12):3627–34. Available from: http://link.springer.com/10.1007/s11606-020-06277-0. pmid:33021717
  33. New York Times. See Coronavirus Restrictions and Mask Mandates for All 50 States—The New York Times [Internet]. [cited 2021 Apr 19]. Available from: https://www.nytimes.com/interactive/2020/us/states-reopen-map-coronavirus.html.
  34. Shi J, Lai KK, Hu P, Chen G. Understanding and predicting individual retweeting behavior: Receiver perspectives. Applied Soft Computing [Internet]. 2017 Nov [cited 2021 Apr 19];60:844–57. Available from: https://linkinghub.elsevier.com/retrieve/pii/S1568494617305264.
  35. Lim Y, Lee-Won RJ. When retweets persuade: The persuasive effects of dialogic retweeting and the role of social presence in organizations’ Twitter-based communication. Telematics and Informatics [Internet]. 2017 Aug [cited 2021 Apr 19];34(5):422–33. Available from: https://linkinghub.elsevier.com/retrieve/pii/S0736585316301186.
  36. Bird S, Loper E, Klein E. Natural Language Processing with Python. O’Reilly Media Inc.; 2009.
  37. Wei J, Zou K. EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) [Internet]. Hong Kong, China: Association for Computational Linguistics; 2019 [cited 2020 Jul 1]. p. 6381–7. Available from: https://www.aclweb.org/anthology/D19-1670.
  38. Ramos J. Using TF-IDF to Determine Word Relevance in Document Queries. 4 p.
  39. Kowsari K, Jafari Meimandi K, Heidarysafa M, Mendu S, Barnes L, Brown D. Text Classification Algorithms: A Survey. Information [Internet]. 2019 Apr 23 [cited 2020 Jun 25];10(4):150. Available from: https://www.mdpi.com/2078-2489/10/4/150.
  40. Mikolov T, Chen K, Corrado G, Dean J. Efficient Estimation of Word Representations in Vector Space. arXiv:1301.3781 [cs] [Internet]. 2013 Sep 6 [cited 2021 Apr 18]; Available from: http://arxiv.org/abs/1301.3781.
  41. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, et al. Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research. 2011;12:2825–30.
  42. Naive Bayes text classification [Internet]. [cited 2020 Jul 9]. Available from: https://nlp.stanford.edu/IR-book/html/htmledition/naive-bayes-text-classification-1.html.
  43. Zhang H. The Optimality of Naive Bayes. 6 p.
  44. Munezero M, Montero CS, Sutinen E, Pajunen J. Are They Different? Affect, Feeling, Emotion, Sentiment, and Opinion Detection in Text. IEEE Trans Affective Comput [Internet]. 2014 Apr 1 [cited 2021 Jun 11];5(2):101–11. Available from: https://ieeexplore.ieee.org/document/6797872/.
  45. Loria S. TextBlob: Simplified Text Processing [Internet]. 2018. Available from: https://textblob.readthedocs.io/en/dev/.
  46. Lever J, Krzywinski M, Altman N. Classification evaluation. Nat Methods [Internet]. 2016 Aug [cited 2020 Jun 25];13(8):603–4. Available from: http://www.nature.com/articles/nmeth.3945.
  47. Mislove A, Lehmann S, Ahn Y-Y, Onnela J-P, Rosenquist JN. Understanding the Demographics of Twitter Users. Proceedings of the Fifth International AAAI Conference on Weblogs and Social Media. 2011;554–7.
  48. Mellon J, Prosser C. Twitter and Facebook are not representative of the general population: Political attitudes and demographics of British social media users. Research & Politics [Internet]. 2017 Jul [cited 2021 Apr 18];4(3):2053168017720008. Available from: http://journals.sagepub.com/doi/10.1177/2053168017720008.
  49. Barberá P, Rivero G. Understanding the Political Representativeness of Twitter Users. Social Science Computer Review [Internet]. 2015 Dec [cited 2021 Apr 18];33(6):712–29. Available from: http://journals.sagepub.com/doi/10.1177/0894439314558836.
  50. Thaker J, Zhao X, Leiserowitz A. Media Use and Public Perceptions of Global Warming in India. Environmental Communication [Internet]. 2017 May 4 [cited 2020 Jul 9];11(3):353–69. Available from: https://www.tandfonline.com/doi/full/10.1080/17524032.2016.1269824.
  51. McCluskey JJ, Kalaitzandonakes N, Swinnen J. Media Coverage, Public Perceptions, and Consumer Behavior: Insights from New Food Technologies. Annu Rev Resour Econ [Internet]. 2016 Oct 5 [cited 2020 Jul 9];8(1):467–86. Available from: http://www.annualreviews.org/doi/10.1146/annurev-resource-100913-012630.
  52. Callanan VJ, Rosenberger JS. Media and public perceptions of the police: examining the impact of race and personal experience. Policing and Society [Internet]. 2011 Jun [cited 2020 Jul 9];21(2):167–89. Available from: https://www.tandfonline.com/doi/full/10.1080/10439463.2010.540655.
  53. Liu P, Zhong X, Yu S. Striking a balance between science and politics: understanding the risk-based policy-making process during the outbreak of COVID-19 epidemic in China. Journal of Chinese Governance [Internet]. 2020 Apr 2 [cited 2020 Jul 23];5(2):198–212. Available from: https://www.tandfonline.com/doi/full/10.1080/23812346.2020.1745412.
  54. Przychodzen W, Leyva‐de la Hiz DI, Przychodzen J. First‐mover advantages in green innovation—Opportunities and threats for financial performance: A longitudinal analysis. Corp Soc Resp Env Ma [Internet]. 2020 Jan [cited 2020 Oct 25];27(1):339–57. Available from: https://onlinelibrary.wiley.com/doi/abs/10.1002/csr.1809.
  55. plotly.express.choropleth—4.14.3 documentation [Internet]. [cited 2021 Jun 11]. Available from: https://plotly.com/python-api-reference/generated/plotly.express.choropleth.
  56. List of U.S. states and territories by educational attainment. In: Wikipedia [Internet]. 2020 [cited 2020 Jun 30]. Available from: https://en.wikipedia.org/w/index.php?title=List_of_U.S._states_and_territories_by_educational_attainment&oldid=965165403.
  57. Findings State Rankings | 2018 Annual Report [Internet]. America’s Health Rankings. [cited 2020 Jun 30]. Available from: https://www.americashealthrankings.org/learn/reports/2018-annual-report/findings-state-rankings.
  58. Gallup Inc. Democratic States Exceed Republican States by Four in 2018 [Internet]. Gallup.com. 2019 [cited 2020 Jun 30]. Available from: https://news.gallup.com/poll/247025/democratic-states-exceed-republican-states-four-2018.aspx.
  59. List of U.S. states and territories by income. In: Wikipedia [Internet]. 2020 [cited 2020 Jun 30]. Available from: https://en.wikipedia.org/w/index.php?title=List_of_U.S._states_and_territories_by_income&oldid=947939623.
  60. List of U.S. states and territories by median age. In: Wikipedia [Internet]. 2020 [cited 2020 Jun 30]. Available from: https://en.wikipedia.org/w/index.php?title=List_of_U.S._states_and_territories_by_median_age&oldid=965073937.
  61. U.S. population: male to female ratio, by state 2018 | Statista [Internet]. [cited 2020 Jul 10]. Available from: https://www.statista.com/statistics/301946/us-population-males-per-100-females-by-state/.
  62. Population Distribution by Race/Ethnicity | KFF [Internet]. [cited 2020 Jul 10]. Available from: https://www.kff.org/other/state-indicator/distribution-by-raceethnicity/?currentTimeframe=0&sortModel=%7B%22colId%22%3A%22Location%22%2C%22sort%22%3A%22asc%22%7D.
  63. Home—Johns Hopkins Coronavirus Resource Center [Internet]. [cited 2020 Jul 6]. Available from: https://coronavirus.jhu.edu/.
  64. State Employment and Unemployment Summary [Internet]. [cited 2020 Jun 30]. Available from: https://www.bls.gov/news.release/laus.nr0.htm.
  65. Menard S. Applied Logistic Regression Analysis [Internet]. Thousand Oaks, CA: SAGE Publications, Inc.; 2002 [cited 2021 Apr 19]. Available from: http://methods.sagepub.com/book/applied-logistic-regression-analysis.
  66. White H. A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity. Econometrica [Internet]. 1980 May [cited 2021 Apr 19];48(4):817. Available from: https://www.jstor.org/stable/1912934?origin=crossref.
  67. Nelson DB. Conditional Heteroskedasticity in Asset Returns: A New Approach. Econometrica [Internet]. 1991 Mar [cited 2021 Apr 19];59(2):347. Available from: https://www.jstor.org/stable/2938260?origin=crossref.
  68. Nicola M, Alsafi Z, Sohrabi C, Kerwan A, Al-Jabir A, Iosifidis C, et al. The socio-economic implications of the coronavirus pandemic (COVID-19): A review. International Journal of Surgery [Internet]. 2020 Jun [cited 2020 Jul 23];78:185–93. Available from: https://linkinghub.elsevier.com/retrieve/pii/S1743919120303162. pmid:32305533
  69. Lippi G, Henry BM, Bovo C, Sanchis-Gomar F. Health risks and potential remedies during prolonged lockdowns for coronavirus disease 2019 (COVID-19). Diagnosis [Internet]. 2020 May 26 [cited 2020 Jul 23];7(2):85–90. Available from: https://www.degruyter.com/view/journals/dx/7/2/article-p85.xml. pmid:32267243
  70. Rahman MA, Zaman N, Asyhari AT, Al-Turjman F, Alam Bhuiyan MdZ, Zolkipli MF. Data-driven dynamic clustering framework for mitigating the adverse economic impact of Covid-19 lockdown practices. Sustainable Cities and Society [Internet]. 2020 Nov [cited 2020 Jul 23];62:102372. Available from: https://linkinghub.elsevier.com/retrieve/pii/S221067072030593X. pmid:32834935