Weather impacts expressed sentiment

We conduct the largest ever investigation into the relationship between meteorological conditions and the sentiment of human expressions. To do this, we employ over three and a half billion social media posts from tens of millions of individuals from both Facebook and Twitter between 2009 and 2016. We find that cold temperatures, hot temperatures, precipitation, narrower daily temperature ranges, humidity, and cloud cover are all associated with worsened expressions of sentiment, even when excluding weather-related posts. We compare the magnitude of our estimates with the effect sizes associated with notable historical events occurring within our data.


Introduction
Mood and emotional state support human physical, psychological, and economic well-being. Positive emotions are associated with improved physiological factors such as cortisol levels and cardiovascular functioning [1] and amplify cognitive performance and mental flexibility [2]. They can also increase social connectedness and perceived social support [3] and may augment income and economic success [4]. Emotional states can also be transmitted through social networks [5,6], amplifying the broad-scale effects of altered individual emotions.
Prior work suggests that environmental factors, and ambient meteorological conditions in particular, may substantively impact emotional state. However, previous empirical investigations of this relationship have found conflicting results. Early studies found large associations between meteorological conditions and mood [7,8] but were limited by small sample sizes and limited generalizability. A number of studies in the most recent decade have found small to negligible associations [9][10][11], while others document associations that vary across individuals [12,13], associations observed at high levels of aggregation [14], or associations that are contingent on other factors [15]. A still more recent large-scale longitudinal analysis reports robust linkages between daily weather variation and reported well-being [16]. Whether, and if so how, meteorological variables shape human emotions remains an open question.
This pattern of divergent results is due in part to a lack of large-scale data on emotional states. To rectify this problem, we employ a correlate of emotional states: the sentiment of human lexical expressions on social media [17]. We report on associations between meteorological conditions and the expressed sentiment of tens of millions of United States residents across 3.5 billion posts on both Facebook and Twitter between 2009 and 2016. This work expands on Baylis (2015), who uses a billion Twitter posts over a shorter time frame to estimate preferences for and valuations of different realizations of temperature in order to project the amenity cost of climate change [18]. In particular, the significantly longer sample period and the inclusion of the more representative, longer-form Facebook data in our analysis allow us to defensibly generalize our findings more broadly than was possible in Baylis (2015).
In this manuscript, we examine three questions. First, do weather conditions associate with changes in the sentiment of human expressions? Second, are these associations robust to excluding discussion about the weather itself? Third, how do the magnitudes of these weather-sentiment associations compare to the effect sizes of other events that alter expressed sentiment?

Data collection procedure
Our social media data consist of 3.5 billion posts in total: 2.4 billion from Facebook and 1.1 billion from Twitter. By using data from both social media platforms, we take advantage of the relative strengths of each as a data source on sentiment: Facebook data is more likely to be representative and to consist of text expressions revealing the user's underlying emotional state, while Twitter data provides for comparison of results across social media contexts. A central benefit of using the Facebook data is that the Facebook population is more likely to reflect the population at large. By the end of our sampling period, nearly 80% of online adults used Facebook, as compared to between 15 and 30% for Twitter, LinkedIn, Pinterest, and Instagram [19]. Surveyed adults also indicated that they use Facebook more frequently than any other social media platform. Of note, the authors complied with all relevant platform terms of service.
To measure expression of sentiment on Facebook, we use data from a previous work [5]. These data are based on "status updates", which are text-based messages that a user's contacts may view on their own Facebook News Feed. Our Facebook dataset starts on January 1st, 2009 and ends on March 31st, 2012, with 1,176 days in total. The Facebook data contain all users on the platform, both public-facing and private accounts, that chose English as their language, selected United States as their country, and could be matched to our selection of metropolitan areas by their IP-based geographic location at time of posting. The Facebook data we use here are described in detail elsewhere [5].
Our Twitter data consist of posts, or "tweets", that are short messages limited to 140 characters and are publicly viewable by default. Our Twitter data cover the period from November 30th, 2013 to June 30th, 2016, with 938 days in total. We collected tweets using Twitter's public Streaming API, placing a bounding box filter over the United States to gather our sample of precisely geo-located tweets. We then assigned tweets falling within a metropolitan area's boundaries to that specific area. This procedure allows for a high level of certainty that the tweet originated within a specific metropolitan area. We exclude re-tweets from our analysis and only consider direct user-generated content.
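The assignment of geo-located tweets to metropolitan areas can be sketched as a point-in-region lookup. The sketch below uses simple rectangular bounding boxes with illustrative coordinates; the actual matching described above used metropolitan area boundaries, so this is only a rough stand-in:

```python
# Hypothetical (lat_min, lat_max, lon_min, lon_max) boxes for two metros;
# the real analysis matched tweets against metropolitan area boundaries.
METRO_BBOXES = {
    "New York": (40.5, 41.0, -74.3, -73.6),
    "Chicago": (41.6, 42.1, -88.0, -87.4),
}

def assign_metro(lat: float, lon: float):
    """Return the metro whose box contains the point, else None (dropped)."""
    for name, (la0, la1, lo0, lo1) in METRO_BBOXES.items():
        if la0 <= lat <= la1 and lo0 <= lon <= lo1:
            return name
    return None  # tweet falls outside the sampled metros

print(assign_metro(40.71, -74.0))  # a point in lower Manhattan -> "New York"
```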
We employ gridded meteorological data from the PRISM Climate Group for our daily maximum temperature, temperature range, and precipitation measures [20]. We also employ cloud cover (a measure of sun exposure) and relative humidity data from the National Centers for Environmental Prediction (NCEP) Reanalysis II project [21], as these measures have been central to previous studies of the relationship between weather and emotional states (as daily NCEP data run only to June 30th 2016 at the time of writing, we use that date as the end date of our analysis) [7,8]. We match daily meteorological variables in a location to the posts of users in that particular location on that day.
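The matching of daily meteorological variables to posts amounts to a keyed join on location and date. A minimal pandas sketch follows; the table contents and column names are hypothetical:

```python
import pandas as pd

# Hypothetical daily weather table keyed by metro area and date.
weather = pd.DataFrame({
    "city": ["NYC", "LA"], "date": ["2015-01-01", "2015-01-01"],
    "tmax": [-2.0, 21.0], "precip": [0.4, 0.0],
})
# Hypothetical post-level (or city-day-level) sentiment table.
posts = pd.DataFrame({
    "city": ["NYC", "NYC", "LA"], "date": ["2015-01-01"] * 3,
    "pos_rate": [0.3, 0.5, 0.6],
})

# Each observation inherits the weather realized in its city on its day.
merged = posts.merge(weather, on=["city", "date"], how="left")
print(merged[["city", "tmax"]].values.tolist())
```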

Measures of expressed sentiment
To determine whether a social media post uses words that express positive or negative sentiment, we rely on the Linguistic Inquiry Word Count (LIWC) sentiment analysis tool (see Table B in S1 File) [22]. LIWC is a highly validated, dictionary-based, sentiment classification tool that is commonly used to assess sentiment in social media posts [5,6,23,24] (of note, our results are similar under the use of alternative sentiment classifiers). In our analysis, we follow the psychological literature and treat positive and negative sentiment as separate constructs [25].
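A minimal sketch of dictionary-based classification in the spirit of LIWC follows. The word lists here are toy stand-ins; the actual LIWC dictionaries are proprietary, far larger, and include wildcard stems:

```python
import re

# Toy positive/negative word lists for illustration only; not the LIWC lexicon.
POSITIVE = {"happy", "love", "great", "nice", "good"}
NEGATIVE = {"sad", "hate", "awful", "bad", "angry"}

def classify_post(text: str) -> dict:
    """Flag whether a post contains any positive / negative dictionary term."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return {
        "positive": any(t in POSITIVE for t in tokens),
        "negative": any(t in NEGATIVE for t in tokens),
    }

print(classify_post("What a great, sunny day"))
# -> {'positive': True, 'negative': False}
```

Treating the positive and negative flags as separate outputs mirrors the paper's treatment of positive and negative sentiment as distinct constructs.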

Dependent variable aggregation
We aggregate both our Facebook and Twitter data to the 75 most populated metropolitan areas in the US. For each post, we calculate whether the post contains either a positive or a negative LIWC term. We then average each user's posts on each day to produce a positive rate and a negative rate for each user. Then, for each city on each day, we average the values of users in that city to calculate our city-level dependent variables. To ensure that our results are not an artifact of ecological inference, we present user-level analyses in Figs A, B, and C and Tables D and F in S1 File.
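The two-stage aggregation described above (post flags, then user-day rates, then city-day averages) might be sketched in pandas as follows, using hypothetical post-level flags:

```python
import pandas as pd

# Hypothetical post-level flags: one row per post.
posts = pd.DataFrame({
    "user": ["u1", "u1", "u2", "u2", "u3"],
    "city": ["NYC", "NYC", "NYC", "NYC", "LA"],
    "date": ["2015-01-01"] * 5,
    "has_pos": [1, 0, 1, 1, 0],
    "has_neg": [0, 1, 0, 0, 1],
})

# Stage 1: average each user's posts per day -> user-level daily rates.
user_day = posts.groupby(["city", "date", "user"], as_index=False)[
    ["has_pos", "has_neg"]].mean()

# Stage 2: average users within each city-day -> city-level dependent variables.
city_day = user_day.groupby(["city", "date"], as_index=False)[
    ["has_pos", "has_neg"]].mean()
print(city_day)
```

Averaging at the user level first prevents prolific posters from dominating a city's daily score.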

Analysis
City-level. To investigate whether weather alters expressed sentiment at the city level, we combine our aggregated city-level positive and negative sentiment scores with our daily meteorological data. We empirically model this relationship as:

Y_jmt = f(tmax_jmt) + g(precip_jmt) + h(μ_jmt) + ν_jm + γ_t + ε_jmt    (1)

In this time-series cross-sectional model, j indexes cities, m indexes unique year-months, and t indexes calendar days. Our dependent variable Y_jmt represents our city-level measure of positive or negative expressed sentiment, respectively.
Our independent variables of interest are maximum temperature (tmax_jmt) and total precipitation (precip_jmt). We also examine the marginal effects of temperature range, percentage cloud cover, and relative humidity, represented via h(μ_jmt). We empirically estimate our relationships of interest using indicator variables for each 5°C maximum temperature and temperature range bin, for each 1cm precipitation bin, and for each 20 percentage point bin of cloud cover and relative humidity (represented here by f(), g(), and h() respectively). This procedure allows for flexible estimation of the association between our meteorological variables and expressed sentiment [26][27][28].
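The binned-indicator construction can be illustrated as follows. The temperature values and bin edges are hypothetical, but the 5°C bin width follows the text:

```python
import numpy as np
import pandas as pd

# Hypothetical daily maximum temperatures (°C) for a few city-days.
tmax = pd.Series([-3.0, 7.5, 22.0, 31.0])

# 5°C bins spanning the support of the data; in estimation, one bin
# (20-25°C for maximum temperature) is later omitted as the baseline.
edges = np.arange(-10, 41, 5)
bins = pd.cut(tmax, edges, right=False)

# One indicator column per bin: the f(tmax) term enters the regression
# as this set of dummies, allowing a flexible (non-linear) response.
dummies = pd.get_dummies(bins)
print(dummies.sum())  # each observation falls in exactly one bin
```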
Unobserved geographic or temporal factors may influence sentiment in a way that correlates with weather. For example, people may be happier on average in cities that have better infrastructure or on days when they are likely to have more leisure time. Further, there may be unobserved, city-specific trends, such as changes in amount of daylight throughout the year or evolution in city-level economic conditions over time, that influence the expressed sentiment of a city. To ensure that these factors do not bias our estimates of the association between weather and expressed sentiment, we include in Eq 1 ν_jm and γ_t to represent city-by-calendar-month and calendar date indicator variables, respectively. These variables control for all constant unobserved characteristics for each city across its seasons and for each unique day in the data [29,30].
We adjust for within-city and within-day correlation in ε_jmt by employing heteroskedasticity-robust standard errors clustered on both city-year-month and day [31]. We exclude non-climatic control variables from Eq 1 because of their potential to generate bias in our parameters of interest [30,32]. Finally, we weight the city-level regression by the number of underlying social media posts for each city-day.
We omit the 20°C-25°C maximum temperature, the 0°C-5°C temperature range, 0cm precipitation, 0-20% cloud cover, and 40-60% humidity indicator variables when estimating Eq 1. We interpret our estimates as the percentage point change in positive or negative expressed sentiment associated with a particular meteorological observation range relative to these baseline categories.
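To illustrate the fixed-effects logic on simulated data, the sketch below absorbs city indicators by within-city demeaning and recovers the coefficient on a freezing-day dummy. This is a deliberately simplified stand-in for Eq 1: one fixed-effect dimension, a single binary regressor, and no clustering or weighting:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Simulated city-day panel: city fixed effects plus a penalty on freezing days.
n_cities, n_days = 10, 200
df = pd.DataFrame({
    "city": np.repeat(np.arange(n_cities), n_days),
    "day": np.tile(np.arange(n_days), n_cities),
})
city_fe = rng.normal(0, 1, n_cities)
df["freezing"] = rng.random(len(df)) < 0.2
df["sentiment"] = (city_fe[df["city"]] - 0.5 * df["freezing"]
                   + rng.normal(0, 0.1, len(df)))

# Within-transformation: demeaning by city absorbs the city indicators,
# mimicking (in one dimension) the fixed effects ν_jm and γ_t in Eq 1.
for col in ["sentiment", "freezing"]:
    df[col + "_dm"] = df[col] - df.groupby("city")[col].transform("mean")

beta = np.linalg.lstsq(df[["freezing_dm"]].astype(float),
                       df["sentiment_dm"], rcond=None)[0][0]
print(round(beta, 2))  # recovers the true effect of -0.5
```

The full specification would instead include the complete sets of temperature, precipitation, cloud cover, and humidity bin dummies, two-way fixed effects, clustered standard errors, and post-count weights.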
Exclusion of weather terms. Our first analyses examine the sentiment of all expressions contained within our data, inclusive of terms that may refer directly to the weather. As weather discussion may not necessarily reflect changes in individuals' underlying emotional states, however, we use a large dictionary of weather terms to filter out posts in our Twitter data that contain a plausible reference to the weather. We then re-run the models in Eq 1 and Equation A in S1 File on the messages that do not contain these weather-related terms (because we no longer have access to the raw Facebook posts, we were unable to re-calculate our non-weather expressed sentiment metrics for our Facebook corpus). Approximately 4% of tweets contained one or more of our weather terms. To investigate the effectiveness of this filter, we manually classified a random sample of 1,000 posts that included a weather term and determined that approximately 28% of that sample were about the weather. We also manually classified a second random sample of 1,000 non-weather-term posts and determined that only 0.2% were about weather-related constructs.
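The weather-term filter can be sketched with a small stem dictionary; the stems below are a hypothetical stand-in for the paper's much larger dictionary:

```python
import re

# Illustrative stems only; matching on stems catches derived forms
# ("snowed", "flooding", "rainy", ...) at the cost of some false positives.
WEATHER_STEMS = ["rain", "snow", "storm", "flood", "sunny", "cloud",
                 "humid", "freez", "heat", "weather"]
pattern = re.compile(r"\b(?:" + "|".join(WEATHER_STEMS) + r")", re.IGNORECASE)

def mentions_weather(text: str) -> bool:
    return bool(pattern.search(text))

posts = ["Ugh, snowed in again", "Great game last night!"]
kept = [p for p in posts if not mentions_weather(p)]
print(kept)  # only the non-weather post survives the filter
```

The manual-validation step described above (hand-labeling samples of filtered and retained posts) is what guards against exactly the kind of over- and under-matching a stem filter can produce.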
Effect sizes in context. To contextualize our estimates, we compare the effect size of the association between below-freezing temperatures and non-weather-related positive expressed sentiment to the effect sizes of a number of plausibly negative events that occurred over the time span of our Twitter data. To do so, we again estimate both Eq 1 and Equation A in S1 File, but in addition to the terms in those models, we include indicator terms for each of the comparison events, estimating these parameters simultaneously alongside our meteorological variables. These indicator terms isolate the specific location and the specific dates of each event so that they are not collinear with the fixed effects in our models. For example, for the effect size of the San Francisco Bay Area earthquake, we create an indicator variable for the San Francisco/Oakland metropolitan area on August 24th 2014, the date of the earthquake.
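Constructing such an event indicator is straightforward; the panel rows below are hypothetical, and the earthquake date is taken from the text:

```python
import pandas as pd

# Hypothetical city-day panel rows.
panel = pd.DataFrame({
    "city": ["San Francisco/Oakland", "San Francisco/Oakland", "Charlotte"],
    "date": pd.to_datetime(["2014-08-24", "2014-08-25", "2014-08-24"]),
})

# Indicator for the Bay Area earthquake: exactly one metro on exactly one
# date, so the dummy is not absorbed by city-month or day fixed effects.
panel["quake"] = ((panel["city"] == "San Francisco/Oakland")
                  & (panel["date"] == "2014-08-24")).astype(int)
print(panel["quake"].tolist())  # [1, 0, 0]
```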

Descriptive statistics
We present the descriptive statistics associated with our main variables in Table 1.

All expressed sentiment
The results of estimating Eq 1 for the association between meteorological conditions and positive and negative sentiment on Facebook indicate that temperature, precipitation, humidity, and cloud cover each significantly relate to expressions of sentiment (see Fig 1 panels (a) and (b) and Table C in S1 File for further details on these results).
Panels (c) and (d) of Fig 1 display the results of estimating Eq 1 on the Twitter city-level data. The nature of the impact of temperature and precipitation on sentiment expression is quite similar to the associations in the Facebook data, though attenuated in magnitude. For example, the association between below-freezing temperatures and city-level positive sentiment expressions on Twitter is approximately 45% of the size of this parameter in the Facebook data. The effect sizes of precipitation, temperature range, cloud cover, and humidity on expressed sentiment in the city-level Twitter data retain statistical significance but are similarly attenuated in magnitude as compared to Facebook (see Table C in S1 File for more details).

Expressed sentiment of non-weather messages
Analyzing all posts provides useful descriptive characteristics of the ways in which aggregate sentiment associates with the weather. However, in order to employ expressed sentiment as a better proxy for underlying emotional states, it is useful to exclude direct references to the weather itself, as more extreme weather conditions significantly alter both the rate of weather discussion and the sentiment of this discussion (see Figs D and E in S1 File). We present the results of estimating Eq 1 on our corpus of Twitter posts excluding weather terms in Fig 2. Panels (a) and (b) of Fig 2 illustrate that the effect sizes associated with maximum temperature and precipitation on non-weather sentiment at the city level are slightly smaller than the effect sizes seen in the all-posts sample (Fig 1 panels (c) and (d)). As an example, the association between below-freezing temperatures and positive sentiment in non-weather-term expressions is approximately 84% of the size of this association in the all-posts model. The effect sizes of temperature range, high humidity, and cloud cover retain significance in this model, though are also attenuated in size compared to the all-posts model (see Table E in S1 File for details).

Effect sizes in context
Fig 2. Twitter analyses for posts without weather terms. Panel (a) depicts the relationship between daily maximum temperatures and the rates of expressed sentiment for non-weather posts, aggregated to the city level; it draws from the estimation of Eq 1 and plots the predicted change in expressed sentiment associated with each maximum temperature bin. Panel (b) depicts the relationship between daily precipitation and the rates of sentiment expression of non-weather posts, also drawing on the estimation of Eq 1. Shaded error bounds represent 95% confidence intervals calculated using heteroskedasticity-robust standard errors clustered on both city-year-month and day.

To understand the relative magnitude of these predicted changes, it can be helpful to look at the effect sizes associated with other types of events on expressed sentiment. We chose five salient events for their theoretically negative association with expressed positive, non-weather sentiment coupled with their diversity of type: 1) the end of daylight saving time in the cities above the median latitude of our sample of cities, 2) the annual anniversary of the September 11th terrorist attacks in the New York City metropolitan area, 3) the December 2015 terrorist attacks in the San Bernardino metropolitan area, 4) the October 2015 flooding in the Carolinas in the Charlotte metropolitan area (the term 'flood' and its derivatives are part of our excluded weather terms), and 5) the San Francisco Bay Area earthquake in August 2014 in the San Francisco/Oakland metropolitan area. Fig 3 indicates that each of these events is significantly associated with reductions in non-weather, expressed positive sentiment in the areas local to the events. Further, a day of below-freezing temperature in our sample is substantively meaningful. For example, the effect size of a day of below-freezing temperature on positive expressed sentiment is 62% of the effect size of the 2015 Carolina floods on aggregate expressed sentiment in Charlotte.
We present the effect of these events on negative sentiment in Fig F in S1 File. Importantly, in conducting this comparison we do not intend to comment on the overall impact of these events on society or on individual well-being. There may be many reasons why, for example, the San Bernardino shootings produced a less substantial reduction in positive expressed sentiment on Twitter than did the SF Bay Area earthquake. Speculation on the underlying factors that drive the association between these specific events and expressed sentiment is beyond the scope of this study.

Discussion
There are several considerations important to the interpretation of our results. First, while we have data on millions of individuals' expressed sentiment as reflected by their social media posts, optimal data would also include these individuals' daily self-reported emotional states. As mentioned above, while sentiment expressions on social media can be reflective of underlying emotions [17], the linguistic measures we employ here represent an imperfect and noisy proxy of emotional states. Future studies are needed to improve the psychometric validity of sentiment metrics.
Second, and relatedly, our chosen LIWC sentiment metrics may imperfectly measure the sentiment of expressions on social media. We examine the robustness of our findings to the use of other sentiment classification tools with our Twitter data in Figs G, H, I, J in S1 File. In those analyses we employ both the SentiStrength and Hedonometer algorithms and find that our results are quite robust across all three of our employed sentiment metrics [33,34]. However, because all three of our metrics likely have idiosyncratic errors associated with them, our measurement of the sentiment of expressions remains imperfect. Future studies should endeavor to investigate the effect of meteorological conditions on specific changes in the distribution of the corpora of human lexical expressions [34][35][36][37][38].
Third, measurement error may exist between observed weather and the weather users actually experience, possibly attenuating the magnitude of our estimates [39]. Issues of right-hand-side measurement error may be particularly salient with respect to our measures of cloud cover and humidity, as they are derived from reanalysis data rather than directly from station observations [30]. Further, there may be automated accounts in our Twitter data that somewhat bias or attenuate our effect estimates for that platform. Future studies are needed to investigate the role of automated Twitter accounts in shaping overall sentiment on the platform.
Fourth, our analysis is conducted on individuals who self-select into participation in social media. Our results may not apply to demographics that do not use either Facebook or Twitter, such as older generations. Because the elderly are less common users of social media and because they may be particularly vulnerable to adverse weather conditions [32], the results we present here may underestimate the true population associations between weather and expressed sentiment.
Fifth, our data are restricted to observations from one country with a predominately temperate climate and with one of the highest rates of air conditioning prevalence in the world [40]. It is critical to repeat this analysis where possible in poorer countries with more extreme climates, as they may see even greater alterations in overall expressed sentiment due to meteorological conditions.

Ultimately, given the ubiquity of our exposure to varying weather conditions, understanding the influence they may have on our emotional states is of high importance. Here we provide a window into this relationship via the measurement of expressed sentiment on social media. We observe that the weather is associated with statistically significant and substantively meaningful changes in expressed sentiment for posts both inclusive of and exclusive of weather-related terms. We find substantial evidence that less ideal weather conditions relate to worsened sentiment. To the extent that the sentiment of expressions serves as a valid proxy for underlying emotions, we find some observational evidence that the weather may functionally alter human emotional states.

Supporting information

Table A in S1 File: Summary statistics of main dependent and independent variables.
Table B in S1 File: Examples of LIWC sentiment encoding.
Table C in S1 File: City-level weather and expressed sentiment, all posts.
Table D in S1 File: User-level weather and expressed sentiment, Twitter all posts.
Table E in S1 File: City-level weather and expressed sentiment, Twitter no weather terms.
Table F in S1 File: User-level weather and expressed sentiment, Twitter no weather terms.
(PDF)