Table 1.
List of key informational attributes accompanying each tweet.
Figure 1.
Daily trends for example sets of commonplace words appearing in tweets.
For purposes of comparison, each curve is normalized so that the count fraction represents the fraction of times a word is mentioned in a given hour relative to a day. The numbers in parentheses indicate the relative overall abundance normalized for each set of words by the most common word. Data for these plots is drawn from approximately 26.5 billion words collected from May 21, 2009 to December 31, 2010 inclusive, with the time of day adjusted to local time by Twitter from the former date onwards. The words ‘food’ and ‘dinner’ appeared a total of 2,994,745 (0.011%) and 4,486,379 (0.016%) times respectively.
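The two normalizations described in this caption can be sketched as below. This is a minimal illustration, assuming the hourly counts for a word have already been aggregated into a 24-element list; the function names `hourly_count_fraction` and `relative_abundance` are hypothetical and do not come from the paper.

```python
def hourly_count_fraction(hourly_counts):
    """Return each hour's share of a word's total mentions across the day."""
    total = sum(hourly_counts)
    if total == 0:
        raise ValueError("word never appears")
    return [count / total for count in hourly_counts]


def relative_abundance(total_counts):
    """Scale each word's total count by that of the most common word in the set."""
    top = max(total_counts.values())
    return {word: count / top for word, count in total_counts.items()}


# Illustrative input only: the two totals quoted in the caption, treated here
# as if they formed one set of words (an assumption).
abundance = relative_abundance({"food": 2994745, "dinner": 4486379})
```

With these inputs, `abundance["dinner"]` is 1.0 and `abundance["food"]` is roughly 0.67.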
Figure 2.
Demonstration of robustness and tunability of our text-based hedonometer, and reasoning for choice of a specific metric.
To measure the happiness of a given text, we first compute frequencies of all words; we then create an overall happiness score, Eq. (1), as a weighted average of subsets of 10,222 individual word happiness assessments on a 1 to 9 scale, obtained through Mechanical Turk (see main text and Methods). In varying word sets by excluding stop words [58], we can systematically explore families of happiness metrics. In plot A, we show time series of average happiness for Twitter, binned by day, produced by different metrics. Each time series is generated by omitting words with $5 - \Delta h_{\rm avg} < h_{\rm avg} < 5 + \Delta h_{\rm avg}$, as indicated in plot B, which shows the overall distribution of average happiness of individual words. For $\Delta h_{\rm avg} = 0$ we use all words; as $\Delta h_{\rm avg}$ increases, we progressively remove words centered around the neutral evaluation of 5. Plot C provides a test for robustness through a pairwise comparison of all time series using Pearson's correlation coefficient. Across a broad band of $\Delta h_{\rm avg}$ values, the time series show very strong mutual agreement. We choose $\Delta h_{\rm avg} = 1$ (black curve in A and F, shown in B, white symbols in C, D, and E) for the present paper because of its excellent correlation in output with that of a wide range of $\Delta h_{\rm avg}$, and for reasons concerning the following trade-offs. In A, we see that as the number of stop words increases, so does the variability of the time series, suggesting an improvement in instrument sensitivity. However, at the same time, we lose coverage of texts. Plot D first shows how the number of individual words for which we have evaluations decreases as $\Delta h_{\rm avg}$ increases. For $\Delta h_{\rm avg} = 1$, we have 3,686 individual words, down from 10,222. Plot E next shows the percentage of the Twitter data set covered by each word list, accounting for word frequency; for $\Delta h_{\rm avg} = 1$, our metric uses 22.7% of all words. Lastly, in plot F (which uses plot A's legend), we show how coverage of words decreases with word rank. When $\Delta h_{\rm avg} = 0$, we incorporate all low rank words, with a decline beginning at rank 5,000. As $\Delta h_{\rm avg}$ increases, we see similar patterns with the maximum coverage declining; for $\Delta h_{\rm avg} = 1$, we see a maximum coverage of approximately 50%.
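The tunable metric described in this caption can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes Eq. (1) is a frequency-weighted average of word scores, and it treats any word whose score lies strictly within an exclusion half-width `delta` of the neutral score 5 as a stop word. The function name and parameter names are hypothetical.

```python
def average_happiness(word_counts, word_scores, delta=1.0, neutral=5.0):
    """Frequency-weighted mean happiness of a text.

    word_counts: dict mapping word -> number of occurrences in the text.
    word_scores: dict mapping word -> average happiness on a 1 to 9 scale.
    delta: exclusion half-width; words scoring strictly between
           neutral - delta and neutral + delta are skipped.
    """
    counted = 0
    weighted_sum = 0.0
    for word, count in word_counts.items():
        score = word_scores.get(word)
        if score is None:
            continue  # no evaluation available for this word
        if abs(score - neutral) < delta:
            continue  # too close to neutral: treated as a stop word
        counted += count
        weighted_sum += score * count
    if counted == 0:
        return None  # no evaluated words survive the exclusion
    return weighted_sum / counted
```

Setting `delta=0` excludes nothing and so uses every evaluated word, while larger values trade coverage for sensitivity, which is the trade-off the caption discusses.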
Figure 3.
Overall happiness, information, and count time series for all tweets averaged by individual day.
A. Average happiness measured over a three year period running from September 9, 2008 to August 31, 2011 (see Sec. 3 for measurement explanation). A regular weekly cycle is clear, with the red and blue of Saturday and Sunday typically the high points (examined further in Fig. 5). Post May 21, 2009 (indicated by a solid vertical line), we use reported local time to assign tweets to particular dates. See also Figs. S1 and S2. B. Simpson lexical size as a function of date, using Simpson's concentration as the base entropy measure (solid gray line; see Sec. 3). The red squares with the dashed line show Simpson lexical size as a function of calendar month. C. The number of words extracted from all tweets as a function of date for which we used evaluations from Mechanical Turk. For both the happiness and Simpson lexical size plots, we omit dates for which we have fewer than 1,000 words with evaluations.
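As a rough guide to the quantity plotted in panel B, the sketch below assumes that Simpson lexical size is the reciprocal of Simpson's concentration (the sum of squared word frequencies), that is, the number of equally likely words that would produce the same concentration. The paper's Sec. 3 remains the authoritative definition; the function name is hypothetical.

```python
def simpson_lexical_size(word_counts):
    """Reciprocal of Simpson's concentration for a word-count dictionary."""
    total = sum(word_counts.values())
    if total == 0:
        raise ValueError("empty text")
    # Simpson's concentration: probability that two randomly drawn
    # word tokens (with replacement) are the same word.
    concentration = sum((count / total) ** 2 for count in word_counts.values())
    return 1.0 / concentration
```

Under this assumption, a text made of four equally frequent words has a lexical size of 4, and a text that repeats a single word has a lexical size of 1.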
Figure 4.
Word shift graph showing how changes in word frequencies produce spikes or dips in happiness for three example dates, relative to the 7 days before and 7 days after each date.
Words are ranked by their percentage contribution to the change in average happiness, $\delta h_{{\rm avg},i}$. The background 14 days are set as the reference text ($T_{\rm ref}$) and the individual dates as the comparison text ($T_{\rm comp}$). How individual words contribute to the shift is indicated by a pairing of two symbols: $+/-$ shows the word is more/less happy than $T_{\rm ref}$ as a whole, and $\uparrow/\downarrow$ shows that the word is more/less relatively prevalent in $T_{\rm comp}$ than in $T_{\rm ref}$. Black and gray font additionally encode the $+$ and $-$ distinction respectively. The left inset panel shows how the ranked 3,686 labMT 1.0 words (Data Set S1) combine in sum (word rank $r$ is shown on a log scale). The four circles in the bottom right show the total contribution of the four kinds of words ($+\downarrow$, $+\uparrow$, $-\uparrow$, $-\downarrow$). Relative text size is indicated by the areas of the gray squares. See Eqs. 2 and 3 and Sec. 4.2 for complete details.
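The per-word contributions ranked in a word shift graph can be sketched as below. Eqs. 2 and 3 are not reproduced in this section, so the sketch assumes the usual decomposition: each word contributes the product of its happiness relative to the reference average and its change in normalized frequency, expressed as a percentage of the magnitude of the total shift. All function names are hypothetical, and the labels `up` and `down` stand in for the arrow symbols.

```python
def normalize(counts):
    """Convert raw counts into relative frequencies."""
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def word_shift(ref_counts, comp_counts, scores):
    """Return {word: (percentage contribution, kind)} for comparison vs reference."""
    # Keep only words that have a happiness evaluation.
    ref = normalize({w: c for w, c in ref_counts.items() if w in scores})
    comp = normalize({w: c for w, c in comp_counts.items() if w in scores})
    h_ref = sum(scores[w] * p for w, p in ref.items())
    h_comp = sum(scores[w] * p for w, p in comp.items())
    total_shift = h_comp - h_ref
    if total_shift == 0:
        raise ValueError("texts have identical average happiness")
    result = {}
    for word in set(ref) | set(comp):
        rel_happiness = scores[word] - h_ref           # sign gives + or -
        freq_change = comp.get(word, 0.0) - ref.get(word, 0.0)
        percent = 100.0 * rel_happiness * freq_change / abs(total_shift)
        # Words exactly at the reference average or with unchanged frequency
        # contribute zero and fall into the "-" / "down" labels by convention.
        kind = ("+" if rel_happiness > 0 else "-") + ("up" if freq_change > 0 else "down")
        result[word] = (percent, kind)
    return result


def ranked(shift):
    """Words ordered by the magnitude of their contribution."""
    return sorted(shift, key=lambda w: abs(shift[w][0]), reverse=True)
```

Under this decomposition the contributions sum to +100 when the comparison text is happier than the reference and to -100 when it is less happy.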
Figure 5.
Average happiness as a function of day of the week for our complete data set.
To make the average weekly cycle more clear, we repeat the pattern for a second week. The crosses indicate happiness scores based on all data, while the filled circles show the results of removing the outlier days indicated in Fig. 3A. The colors for the days of the week match those used in Fig. 3A. To circumvent the non-uniform sampling of tweets throughout time, we compute an average of averages: for example, we find the average happiness for each Monday separately, and then average over these values, thereby giving equal weight to each Monday's score. We use data from May 21, 2009 to December 31, 2010, for which we have a local timestamp.
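The average of averages described in this caption can be sketched as follows, assuming a dictionary that maps each calendar date to that day's already-computed average happiness. The function name and the sample values are hypothetical.

```python
from collections import defaultdict
from datetime import date


def weekday_average_of_averages(daily_happiness):
    """Average the per-day happiness scores separately for each day of the week.

    daily_happiness: dict mapping datetime.date -> average happiness for that date.
    Returns a dict keyed by weekday number (Monday is 0, Sunday is 6).
    """
    by_weekday = defaultdict(list)
    for day, happiness in daily_happiness.items():
        by_weekday[day.weekday()].append(happiness)
    # Each date counts once, regardless of how many tweets were sampled that day.
    return {wd: sum(vals) / len(vals) for wd, vals in by_weekday.items()}


# Made-up scores for two Mondays and one Tuesday.
example = weekday_average_of_averages({
    date(2009, 5, 25): 6.0,
    date(2009, 6, 1): 6.2,
    date(2009, 5, 26): 5.9,
})
```

Because each date enters with equal weight, a Monday with ten times as many tweets as another does not dominate the Monday average, which is the point of the scheme.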
Figure 6.
Evaluations of the individual days of the week as isolated words using Mechanical Turk.
Figure 7.
Average of daily average happiness for days of the week over four consecutive time periods of approximately five months' duration each.
As per Fig. 5, crosses are based on all days, and circles on days excluding the outlier days marked in Fig. 3. The vertical scale is the same in each plot and matches that used in Fig. 5.
Figure 8.
Word shift graph comparing Saturdays relative to Tuesdays.
Each day of the week's word frequency distribution was generated by averaging normalized distributions for each instance of that weekday from May 21, 2009 to December 31, 2010, with outlier dates removed. See Fig. S5 for word shifts based on alternate distributions.
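Averaging normalized distributions, as opposed to pooling raw counts, can be sketched as below. The input format (one word-count dictionary per instance of the weekday) and the function name are assumptions made for illustration.

```python
def average_normalized_distribution(daily_counts):
    """Average the per-day relative word frequencies with equal weight per day.

    daily_counts: list of dicts mapping word -> count, one dict per date.
    """
    days = [counts for counts in daily_counts if sum(counts.values()) > 0]
    if not days:
        raise ValueError("no non-empty days")
    averaged = {}
    for counts in days:
        total = sum(counts.values())
        for word, count in counts.items():
            # Normalize within the day first, then give each day weight 1/len(days).
            averaged[word] = averaged.get(word, 0.0) + (count / total) / len(days)
    return averaged
```

For two days with counts `{"a": 1, "b": 1}` and `{"a": 30, "b": 10}`, this gives "a" a frequency of 0.625, whereas pooling the raw counts would give 31/42, since the busier day would dominate.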
Figure 9.
Simpson lexical size as a function of day of the week.
We compute Simpson lexical size for individual dates as in Fig. 3B, again excluding the outlier dates shown in Fig. 3A, and then average these values. (See also Fig. S20 for the effects of alternate approaches.)
Figure 10.
Average happiness level according to hour of the day, adjusted for local time.
As for days of the week in Fig. 5, each data point represents an average of averages across days. The plot remains essentially unchanged if outlier dates marked in Fig. 3A are excluded; the maximum relative difference between the two plots is 0.08%. The daily pattern of happiness in tweets shows more variation than we observed for the weekly cycle (Fig. 5), here ranging from a low between 10 and 11 pm to a high between 5 and 6 am.
Figure 11.
Normalized distributions of five example common expletives as a function of hour of the day.
Figure 12.
Word shift graph comparing the happiest hour (5 am to 6 am) relative to the least happy hour (10 pm to 11 pm).
Days are given equal weighting, with outlier dates removed. (See Fig. S6 for word shifts based on alternate distributions.)
Figure 13.
Average Simpson lexical size for time of day, corrected according to local time, and computed for each day with outlier days removed, and then averaged across days.
See also Fig. S4 for a demonstration of the robustness of the form of Simpson lexical size throughout the day under alternate averaging schemes.
Figure 14.
Ambient happiness and occurrence frequency time series for some illustrative text elements.
A. Ambient happiness is the average happiness of all words found co-occurring in tweets containing a given text element, with the background average happiness of all tweets removed (n.b., the text element's contribution is excluded). Binning is by calendar month and symbols are located at the center of each month. B. Fraction of tweets containing text elements.
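The definition of ambient happiness in panel A can be sketched as follows. This is a simplified illustration: it assumes tweets are already tokenized, that the text element is a single token, that the background average happiness of all tweets is supplied by the caller, and it omits any stop-word exclusion. The function name is hypothetical.

```python
def ambient_happiness(tweets, element, word_scores, background):
    """Average happiness of words co-occurring with `element`, minus the background.

    tweets: list of token lists.
    element: the text element of interest (a single token in this sketch).
    word_scores: dict mapping word -> average happiness.
    background: average happiness of all tweets over the same period.
    """
    total = 0.0
    n_words = 0
    for tokens in tweets:
        if element not in tokens:
            continue  # only tweets containing the element count
        for token in tokens:
            if token == element:
                continue  # the element's own contribution is excluded
            score = word_scores.get(token)
            if score is not None:
                total += score
                n_words += 1
    if n_words == 0:
        return None  # no evaluated co-occurring words
    return total / n_words - background
```

A negative value indicates that the words surrounding the element are, on average, less happy than the overall background for the same period.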
Table 2.
Selection of 100 text elements ordered by average ambient happiness.
Table 3.
The same keywords and text elements as listed in Table 2 sorted according to the Simpson lexical size for all tweets containing them.
Figure 15.
For the 100 keywords and text elements listed in Table 2, a rank-rank plot of Simpson lexical size versus ambient happiness.
The two quantities show no correlation according to Spearman's correlation coefficient.
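For readers who want to reproduce a rank-rank comparison of this kind, Spearman's correlation coefficient can be computed as Pearson's correlation applied to ranks. The sketch below is a plain Python version with tied values given their average rank; it does not compute a p-value, and the function names are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1, assigning tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation: Pearson's correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

A value near zero, as the caption reports for lexical size versus ambient happiness, means that knowing an element's rank on one measure says little about its rank on the other.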
Figure 16.
Ambient happiness time series and word shift graphs for tweets containing the keywords ‘Tiger Woods’ and ‘BP’.
Ambient happiness of a keyword is for all words co-occurring in tweets containing that keyword, with the overall trend for all tweets subtracted. The word shift graphs are for tweets made during the worst month and the ensuing one: November and December, 2009 for ‘Tiger Woods’ and May and June, 2010 for ‘BP’.
Figure 17.
Time series and word shift graphs for tweets containing the keywords ‘Pope’ and ‘Israel’.
The word shift graphs are for the time periods March and April, 2010 for ‘Pope’ and January and February, 2010 for ‘Israel.’ See Fig. 16's caption for more details.