Local Variation of Hashtag Spike Trains and Popularity in Twitter

We draw a parallel between hashtag time series and neuron spike trains. In each case, the process presents complex dynamic patterns including temporal correlations, burstiness, and other types of nonstationarity. We propose adopting the so-called local variation to uncover salient dynamical properties while properly detrending for the time-dependent features of a signal. The methodology is tested on both real and randomized hashtag spike trains, and shows that popular hashtags present more regular, and thus less bursty, behavior, suggesting its potential use for predicting online popularity in social media.


Introduction
In this paper, we focus on the statistical properties of Twitter and, in particular, on the dynamics and popularity of hashtags. Twitter is a micro-blogging service allowing users to post short messages and to follow those published by other users. Messages often incorporate hashtags, keywords identified by the symbol #, which users can track and respond to, making the platform interactive. Hashtags play a significant role in information diffusion by enhancing information and rumor spreading, and consequently increase the impact of news. Discussions on protests [1,2] and political elections, advertisement of new products in marketing, announcements of scientific innovations [3], panic events such as earthquakes [4], and comments on TV shows are some examples where hashtags are widely used. Additionally, hashtags can even be used to track and locate crises [5], and can spread under the influence of both endogenous factors, that is the propagation between Twitter users following each other, and exogenous sources such as TV and newspapers [6].
The statistical properties of Twitter and, more generally, of human activity, are characterized by a strong heterogeneity in different dimensions. First, human behavior is known to generate bursty temporal patterns, significantly deviating from independent Poisson processes, as a majority of events take place over short time scales while a few events are separated by very long times. This property translates into fat-tailed distributions for the timings Δτ between occurrences of a certain type of event, e.g. between two phone calls or two emails emitted by an individual. For instance, the inter-event time distribution P(Δτ) for the timings between two tweets of a user, or between two uses of a hashtag, is well fitted by a power law such as P(Δτ) ∼ Δτ^α [3]. The deviation from an exponential (uncorrelated) distribution may be driven either by complex decision-making and cascading mechanisms [7][8][9], or by the time dependency of the underlying process, partly because of its intrinsic circadian and weekly rhythms [10,11], as described in Fig 1, or by a combination of these factors [12][13][14][15]. Importantly, the nonstationarity of the signal is known to broaden P(Δτ) and therefore to artificially increase the value of standard metrics, such as the variance or the Fano factor, originally defined for stationary processes.
In addition to temporal heterogeneity in Δτ, online human activity often generates a heterogeneity in popularity [16]. The popularity p of a hashtag is measured by the number of times that it appears in an observation time window. While a majority of hashtags attract no attention, only very few of them propagate heavily [8,17]. Understanding the mechanisms by which certain hashtags or messages gain attention is a central topic of research in the study of online social media [18]. Potential mechanisms for the emergence of this heterogeneity include forms of preferential attachment and competition-induced forces [19][20][21][22] driven by the limited amount of attention of users.
Our main purpose is to explore connections between temporal heterogeneity and heterogeneity in popularity. As a first contribution, we introduce a temporal measure for online human dynamics, suited for the analysis of nonstationary time series to quantify bursts, regularity, and temporal correlations. Originally defined for the study of inter-spike intervals of neurons [23][24][25][26][27], the so-called local variation L_V is then shown to identify deviations from Poisson (uncorrelated) processes and to help characterize successful hashtags.

Data collection and basic overview
The data set has been collected via the publicly open Twitter streaming API between April 30, 2012, 10 pm and May 10, 2012, 10 pm. Only a geographical constraint has been applied: the actions of all Twitter users located in France have been considered, which avoids time differences between countries and regions, and no language filtering has been applied. The time resolution is 1 second and multiple activity can be recorded in the same second. During this time period, two major public events took place: an important political debate held on May 2 and the 2012 French presidential election held on May 6. These events are not the topic of this work, but they are clearly visible in the time series, as shown in Fig 1. The total number of tweets, including retweets, captured during the data collection is 9,747,351. The total number of tweets including at least one hashtag is 2,942,239. Around 30% of the tweets therefore contain a hashtag. Whether a hashtag appears in a regular tweet or in a retweet is not distinguished. Moreover, any message (identical or not) containing at least one hashtag is recorded. Due to the debate and the election taking place during the data collection, the most popular hashtags are related to politics, as seen in Table 1. The hashtag time series studied in this paper are provided in Supporting Information (S1 File). A total of 473,243 individual users has been identified. Among those, 228,525 users published at least one hashtag, i.e. almost half of the social network is associated with hashtag diffusion. To further characterize the importance of hashtags in Twitter activity, we compare the total number of seconds when any action is performed in the data set, 763,262 s ≈ 8.8 days and thus 88% of the total duration, to the number of seconds when at least one hashtag is published, 667,996 s ≈ 7.7 days, that is 77% of the total duration.
In any case, the hashtag data cover a majority of the time window, even during off-peak hours. These numbers confirm the importance of hashtags in the Twitter ecosystem and their prevalence in a variety of contexts.
Any type of human activity is influenced by circadian and weekly cycles. This observation has been verified in recent years in a variety of social data sets, ranging from mobile phone [12] to online social media [13][14][15]. In addition, deviations from these cycles can help detect atypical events such as responses to catastrophes [3][4][5]. Fig 1 in the Introduction shows the total number of tweets per minute over a sub-period of 6 days and confirms these findings, with clear circadian patterns and two peaks during major public events related to the 2012 French presidential election. Besides this smooth periodic behavior, the data also exhibit a noisy signal at a finer time scale, as shown in the inset of Fig 1. In the following, we will analyze the properties of these complex time series by decomposing them into groups of hashtags depending on their popularity, and uncover temporal statistical differences between these groups.

Heterogeneity in popularity of hashtags
The success of a hashtag can be measured by its popularity p, defined as its number of occurrences, and equivalent to its frequency. Fig 2 presents the Zipf-plot and the probability density function (PDF) of p, for the 295,697 unique hashtags observed in the data set. The Zipf-plot [Fig 2(a)] indicates that more than half of the hashtags (≈ 60%) appear just once in the data set, with p = 1. Moreover, around 83% of the hashtags have p < 5, in the pink-colored region in the last (right) rectangle of Fig 2(a). For moderate values of p, between a lower threshold of p = 1000 and an upper bound of p = 25000, only 0.15% of the hashtags fit in the yellow-colored rectangle. Finally, the top hashtags with p > 25000, in the red-colored rectangle, are very rare (≈ 0.0001%), but more frequent than would be expected for values so large as compared to the median. These observations are confirmed in Fig 2(b), where we show the probability distribution of p, P(p), in a log-log plot. P(p) is a clear example of a fat-tailed distribution associated with a strong heterogeneity in the system.
The heterogeneity in p has already been observed [8,11,16,17]. A mechanism proposed for its emergence is the competition between information overload and the limited capacity of each user [19][20][21][22], sometimes coupled with cooperative effects [8,9]. It has also been shown that hashtags having unique textual features become more popular than hashtags presenting common textual features [28]. In this paper, we are not interested in the origin of the heterogeneity, but in its relation with the temporal characteristics of hashtags.

Temporal heterogeneity
We will draw an analogy between hashtag dynamics and neuron spike trains. To this end, we introduce standard methods from spike train analysis into the field of hashtag dynamics. Hashtags are keywords associated with different topics, which can be created, tracked and reused by users. Their popularity and unambiguity make them an essential object for information diffusion in Twitter. In neuroscience, the statistical description of neuron spike sequences is crucial for extracting underlying information about the brain [29]. It was originally believed that in vivo cortical neurons behave as time-dependent Poisson random spike generators, where successive inter-spike intervals are independently chosen from an exponential distribution with a time-dependent firing rate [30]. However, more recent observations have shown that the inter-spike interval distribution exhibits significant deviations from the exponential distribution, which has led to the construction of appropriate tools to describe neuron signals [23][24][25][26][27].
Similarly, a hashtag spike train is defined as the sequence of timings at which the concerned hashtag is observed in Twitter. In this framework, we do not specify the type of dynamics of hashtags, endogenous or exogenous [6], i.e. endogenous hashtag diffusion among members of the social network or exogenous diffusion driven by external factors such as TV and newspapers, and consider only the timings. Each hashtag thus generates a unique hashtag spike train with a characteristic popularity p. As a first basic indicator, in Fig 3(a) and 3(b) we show the inter-hashtag spike interval cumulative and probability distributions, CDF(Δτ) and P(Δτ), respectively. To avoid deforming the distributions artificially because of the heterogeneity in p, we group CDF(Δτ) and P(Δτ) into classes depending on p, illustrated by different colors in Fig 2. We observe similar behavior across the classes, as P(Δτ) deviates strongly from an exponential (Poisson) distribution, P(Δτ) = ξ e^(−ξΔτ), where ξ is the firing rate (the frequency, and hence p, in our context) at which hashtags appear. Instead, we observe fat-tailed distributions [3,7,12,16][31][32][33], as shown in Fig 3(b) for high and moderate p. As mentioned in the Introduction, this deviation may originate either from temporal correlations or from non-stationary patterns, making the system different from a stationary and uncorrelated random signal [34][35][36][37]. Recently, and perhaps unexpectedly, a stochastic model based on Poisson processes has also been shown to yield a broad distribution for the dynamics of brand names in Twitter [15].

Real and randomized data sets
We will analyze two sets of data, which we now describe: The empirical data set, directly coming from the data, and a randomized data set, serving as a null model in our analysis.
The real data set contains one spike train per hashtag, as illustrated in Fig 4(a). The time resolution of the spikes is the same as that of the data set, that is 1 second. In situations when multiple spikes of the same hashtag take place at the same time only one event is considered. The statistics of such events are provided at the end of this subsection. In each spike train, the appearance time of the spikes is ordered from the earliest time to the latest time.
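The construction of the real spike trains can be sketched in a few lines. This is a minimal illustration, not the authors' code: the input format (a list of `(timestamp_in_seconds, hashtag)` pairs), the function name `build_spike_trains`, and the toy records (including the hashtag `vote`) are assumptions made for the example.

```python
from collections import defaultdict


def build_spike_trains(records):
    """Map each hashtag to its sorted spike times, one spike per second.

    `records` is an iterable of (timestamp_in_seconds, hashtag) pairs;
    this input format is an assumption made for the illustration.
    """
    seconds = defaultdict(set)
    for timestamp, hashtag in records:
        # A set keeps a single event when the same hashtag appears
        # several times within the same second.
        seconds[hashtag].add(timestamp)
    # Order each train from the earliest to the latest time.
    return {tag: sorted(times) for tag, times in seconds.items()}


# Toy records with made-up times.
records = [(5, "ledebat"), (3, "ledebat"), (5, "ledebat"), (4, "vote")]
trains = build_spike_trains(records)
print(trains)  # {'ledebat': [3, 5], 'vote': [4]}
```

Collapsing same-second repeats at this stage means every later quantity computed on a train only sees distinct, ordered times.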
The random data set is a randomized version of the real data set, where each spike train of size p generates a spike train of the same size with random times. In practice, we first combine all hashtag spike trains and obtain one merged hashtag spike train, as illustrated in Fig 4(b). This train carries the full history of all hashtags and, importantly, reproduces the nonstationary features of the original data in the presence of temporal correlations, burstiness, and the cyclic rhythm. As before, if two or more spikes are generated at the same time, only one spike is kept at that time in the merged spike train, e.g. see the black spikes in Fig 4(b).
Randomization is performed by permuting elements, as shown in Fig 4(c), for instance by using randperm(T, p) in Matlab. Here, T represents the full array of times in the merged spike train and p is the desired popularity, i.e. the total number of spikes in a train. The permutation procedure draws p distinct, uniformly distributed elements out of T, and these define the artificial spike train . . ., τ^r_{i−1}, τ^r_i, τ^r_{i+1}, . . ., as shown in Fig 4(c). In our data set, p ≪ T is always verified, as the maximum p is 180,900 and the length of T is 667,996. This procedure is applied to each spike train of size p [Fig 4(d)]. Generating independent, yet time-dependent events, the procedure is expected to create time-dependent Poisson random processes, P(Δτ, t) = ξ(t) e^(−ξ(t)Δτ), where the firing rate ξ(t) in this case explicitly depends on the time of the day and of the week.
Fig 3. The cumulative (a), CDF(Δτ), and probability (b), P(Δτ), distributions of the inter-hashtag spike intervals. We observe that P(Δτ), for different classes of hashtags distinguished by their popularity, exhibits non-exponential features. The different colors correspond to those in Fig 2. The legend provides the average popularity ⟨p⟩ in each hashtag class. The dashed lines indicate the positions of 1 day, 2 days, and 3 days, where P(Δτ) peaks for low p (pink symbols). The binning is varied from 8 minutes to 2 hours depending on p, e.g. 8 minutes for high p (red-orange), 1.5 hours for moderate p (yellow-green-blue-purple), and 2 hours for low p (pink). All P(Δτ) present maxima at 1 second, which are not shown in order to display the tails in a larger window.
Statistics of multiple tweets in 1 second. We detect multiple occurrences in 1 second for 6661 hashtags. Fig 5 presents the probability distribution P(c_h) of observing c_h occurrences of a hashtag during one second for the different hashtag popularity classes. Even though c_h > 1 occurs rarely, we observe that this possibility is more probable for popular hashtags (red open circles), as expected. For the most popular hashtag, ledebat, one finds max(c_h) = 40.
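The randomization procedure described in this section (merge all trains, keep one spike per occupied second, then draw p distinct times for each hashtag) can be sketched as follows. This is a hedged Python analogue of the Matlab step, not the authors' implementation: `random.sample` stands in for indexing the merged train with a random permutation, and the function names and toy trains are assumptions made for the example.

```python
import random


def merged_train(trains):
    """Union of all spike times, with one entry per occupied second."""
    return sorted({t for train in trains for t in train})


def randomized_train(merged, p, rng):
    """Draw p distinct times uniformly from the merged train.

    Sampling without replacement from the merged train means that busy
    periods, which contain more merged spikes, receive more random spikes.
    """
    return sorted(rng.sample(merged, p))


# Toy spike trains (made-up times, in seconds).
trains = {
    "tagA": [1, 2, 3, 50, 51],
    "tagB": [2, 40, 51],
}
rng = random.Random(42)  # fixed seed, only for reproducibility
T = merged_train(trains.values())
null_model = {
    tag: randomized_train(T, len(times), rng) for tag, times in trains.items()
}
print(T)  # [1, 2, 3, 40, 50, 51]
print({tag: len(times) for tag, times in null_model.items()})  # {'tagA': 5, 'tagB': 3}
```

Each randomized train keeps the size of its original train, but its spike times are drawn independently of that hashtag's own history.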

Local variation
The time series of spike trains are inherently nonstationary, as shown in Fig 1. For this reason, metrics defined for stationary processes are inadequate and might lead to incorrect conclusions. For instance, the non-exponential shapes of the inter-event time distribution P(Δτ) in Fig 3 might be driven, at least in part, by the nonstationarity of the signal. Similarly, statistical indicators based on this distribution, such as its variance or Fano factor, might be affected in a similar way. For this reason, we consider here the so-called local variation L_V, originally defined to determine intrinsic temporal dynamics of neuron spike trains [23][24][25][26][27].
Fig 5 (caption partially recovered). Apart from the hashtags of Table 1 with ranking 1-11, presented here in red symbols, multiple activity in 1 second is very rare. The different colors correspond to those in Figs 2 and 3. The legend provides the average popularity ⟨p⟩ in each hashtag class.
Unlike quantities such as P(Δτ), L_V compares temporal variations with their local rates and is specifically defined for nonstationary processes [27]:
$$L_V = \frac{3}{N-2} \sum_{i=2}^{N-1} \left( \frac{(\tau_{i+1}-\tau_i) - (\tau_i-\tau_{i-1})}{(\tau_{i+1}-\tau_i) + (\tau_i-\tau_{i-1})} \right)^2 \qquad (1)$$
Here, N is the total number of spikes and . . ., τ_{i−1}, τ_i, τ_{i+1}, . . . represents the successive time sequence of a single hashtag spike train. Eq 1 also takes the form [27]
$$L_V = \frac{3}{N-2} \sum_{i=2}^{N-1} \left( \frac{\Delta\tau_{i+1} - \Delta\tau_i}{\Delta\tau_{i+1} + \Delta\tau_i} \right)^2 \qquad (2)$$
where Δτ_{i+1} = τ_{i+1} − τ_i and Δτ_i = τ_i − τ_{i−1}. Δτ_{i+1} quantifies the forward delay and Δτ_i represents the backward waiting time for an event at τ_i. Importantly, the denominator normalizes the quantity so as to account for local variations of the rate at which events take place. By definition, L_V takes values in the interval [0, 3]. The local variation L_V presents properties making it an interesting candidate for the analysis of hashtag spike trains [23][24][25][26][27]. In particular, L_V is on average equal to 1 when the random process is either a stationary or a non-stationary Poisson process [23], with the only condition that the time scale over which the inverse firing rate 1/ξ(t) fluctuates is longer than the typical time between spikes. Deviations from 1 originate from local correlations in the underlying signal, either in the form of pairwise correlations between successive inter-event time intervals, e.g. Δτ_{i+1} and Δτ_i, which tend to decrease L_V, or because the inter-event time distribution is non-exponential. An interesting case is given by Gamma processes [23,25]
$$P(\Delta\tau, t, \xi, \kappa) = \frac{(\xi\kappa)^{\kappa}\, \Delta\tau^{\kappa-1}\, e^{-\xi\kappa\Delta\tau}}{\Gamma(\kappa)} \qquad (3)$$
where κ is called the shape parameter and determines the shape of the distribution, ξ is the firing rate (frequency) as previously defined, and Γ is the Gamma function. Here, ξ and κ are the two parameters of the Gamma process and both can be time-dependent. While ξ determines the speed of the dynamics, κ controls the burstiness (irregularity) of the spike trains.
Assuming that events are independently drawn, the shape factor is related to L_V as follows [23,25]
$$\langle L_V \rangle = \frac{3}{2\kappa + 1} \qquad (4)$$
Here, the brackets describe the average taken over the given distribution [23]. When κ = 1, an exponential is recovered, and one finds ⟨L_V⟩ = 1 as expected. Smaller values of κ increase the variance in Δτ and therefore its burstiness, making ⟨L_V⟩ larger than 1. On the other hand, larger values of κ decrease the variance of Δτ and the burstiness of the process, making ⟨L_V⟩ smaller than 1 and close to 0 for very large κ.
We measure L_V of hashtag spike trains and group the values depending on the popularity p of their hashtags, as was done in Figs 2 and 3. Fig 6 shows scatter plots of L_V for the real data set (a), the empirical sequence . . ., τ_{i−1}, τ_i, τ_{i+1}, . . ., and the random data set (b), the random sequence . . ., τ^r_{i−1}, τ^r_i, τ^r_{i+1}, . . ., on linear-log plots. Different colors are used to distinguish the different groups and the inset legend provides the average popularity ⟨p⟩ in the groups.
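The measurement of L_V can be illustrated with a short numerical sketch. It assumes the standard definition L_V = 3/(N−2) Σ ((Δτ_{i+1} − Δτ_i)/(Δτ_{i+1} + Δτ_i))^2 for a train of N spikes, and compares simulated Gamma-distributed intervals with the relation ⟨L_V⟩ = 3/(2κ+1) that is standard for Gamma processes. The function names, toy trains, sample size and seed are choices made for the illustration, not part of the original analysis.

```python
import random


def local_variation(intervals):
    """L_V computed from successive inter-event intervals.

    N spikes give N-1 intervals and N-2 neighbouring pairs, so the
    prefactor 3 / (N - 2) equals 3 / (number of pairs).
    """
    pairs = list(zip(intervals[:-1], intervals[1:]))
    if not pairs:
        raise ValueError("L_V needs at least two intervals (three spikes)")
    total = sum(((fwd - bwd) / (fwd + bwd)) ** 2 for bwd, fwd in pairs)
    return 3.0 * total / len(pairs)


def intervals_from_times(times):
    """Inter-event intervals of a sorted train of distinct spike times."""
    return [b - a for a, b in zip(times[:-1], times[1:])]


def gamma_intervals(n, rate, shape, rng):
    """n independent intervals from a Gamma density with the given shape."""
    # random.gammavariate(alpha, beta) has mean alpha * beta, so
    # beta = 1 / (rate * shape) gives a mean interval of 1 / rate.
    return [rng.gammavariate(shape, 1.0 / (rate * shape)) for _ in range(n)]


# Deterministic toy trains (made-up times, in seconds).
print(local_variation(intervals_from_times([0, 10, 20, 30, 40])))  # 0.0: perfectly regular
print(local_variation(intervals_from_times([0, 1, 10, 11, 20])))   # about 1.92: alternating gaps

rng = random.Random(0)  # fixed seed, only for reproducibility
for kappa in (0.5, 1.0, 4.0):
    estimate = local_variation(gamma_intervals(50_000, 1.0, kappa, rng))
    print(f"kappa={kappa}: simulated {estimate:.3f}, 3/(2*kappa+1) = {3 / (2 * kappa + 1):.3f}")
```

Working on intervals rather than on cumulative times avoids rounding issues in long simulated trains; for empirical trains with a 1 second resolution, `intervals_from_times` can be applied directly to the deduplicated spike times.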
A more readable representation is provided in Fig 7, where we show histograms P(L_V) of the values of L_V, for the two data sets and for the distinguished hashtag groups in p. The results clearly show that L_V fluctuates around 1 in the random data set [Fig 7(b)], as expected for a time-dependent Poisson process. On the other hand, L_V systematically deviates from 1 in the original data set [Fig 7(a)], where temporal correlations and bursts are expected to be present.
These observations are confirmed in Fig 8(a), where we plot the mean μ(L_V) of L_V, with error bars, as a function of ⟨p⟩. L_V of the original data (blue circles) indicates that high impact hashtags (high p) are associated with lower values of L_V, suggesting more homogeneous and regular time distributions. The results encourage the potential use of L_V as a metric not only to capture deviations from temporally uncorrelated Poisson processes (red squares), but also to identify distinct statistical properties generated specifically at high p. Moreover, Fig 8(b) presents the statistical differences between the real and the random spike trains in detail. The deviations from Poisson processes, for which μ_0(L_V) = 1, are calculated by
$$z = \frac{\mu(L_V) - \mu_0(L_V)}{\sigma(L_V)/\sqrt{n}}$$
where σ(L_V) is the standard deviation of L_V and n is the number of data points in the distributions of Fig 7. We observe that the z-values for the random spikes (red squares) are almost equal to 0, except at high p, indicating the agreement between Poisson signals and our random spike trains, which is not the case for the real trains (blue circles), giving z ≠ 0 for all ⟨p⟩.
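The z-value used in Fig 8(b) can be computed as in the sketch below, which takes the L_V values of one popularity class and measures the deviation of their mean from the Poisson reference μ_0(L_V) = 1 in units of the standard error. The use of the sample standard deviation (`statistics.stdev`) is an assumption, since the text does not say which estimator was used, and the input values are made up.

```python
import math
import statistics


def z_score(lv_values, mu0=1.0):
    """Deviation of the mean L_V from mu0, in units of the standard error."""
    n = len(lv_values)
    mean = statistics.mean(lv_values)
    # Sample standard deviation (n - 1 in the denominator); this choice
    # of estimator is an assumption of the sketch.
    sigma = statistics.stdev(lv_values)
    return (mean - mu0) / (sigma / math.sqrt(n))


# Made-up L_V values for one hypothetical popularity class.
lv_class = [0.5, 0.6, 0.7, 0.6]
print(round(z_score(lv_class), 2))  # -9.8
```

A strongly negative z, as in this toy class, corresponds to trains that are more regular than a Poisson process, while values close to 0 indicate agreement with the null model.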
To conclude, we perform an analysis to test the persistence through time of the temporal characteristics of the hashtags, as measured by L_V. To do so, we divide each hashtag time series into two equal time series. The resulting values of the local variation are L_V(t_1) for the first half of a spike train and L_V(t_2) for the second half of the train, and we then calculate the Pearson correlation coefficient r(L_V(t_1), L_V(t_2)) between these values [38]. In Fig 9(a), we show the linear relations between L_V(t_1) and L_V(t_2) for different p classes, and Fig 9(b) presents r(L_V(t_1), L_V(t_2)) as a function of the average popularity ⟨p⟩ on a linear-log plot. Both indicate that the values of L_V for the same hashtags at different times are significantly correlated. Interestingly, we observe that while bursty (low p) and regular (high p) signals give small r, the spike trains with moderate p provide the largest values of r, indicating more uniform temporal behavior throughout the individual trains at moderate p.
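The persistence test can be sketched as follows. The text does not specify whether the two halves are equal in duration or in number of spikes; this sketch assumes halves with an equal number of spikes. The function names and the toy trains are hypothetical, and L_V is written directly in terms of spike times.

```python
import math


def local_variation(times):
    """L_V of a sorted train of distinct spike times (at least three)."""
    terms = [
        # Numerator: forward minus backward interval; denominator: their sum.
        ((times[i + 1] - 2 * times[i] + times[i - 1])
         / (times[i + 1] - times[i - 1])) ** 2
        for i in range(1, len(times) - 1)
    ]
    return 3.0 * sum(terms) / len(terms)


def split_half_lv(times):
    """L_V of the first and of the second half of a train.

    Halves are taken by spike count, which is an assumption of this
    sketch; each half must contain at least three spikes.
    """
    mid = len(times) // 2
    return local_variation(times[:mid]), local_variation(times[mid:])


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Made-up trains whose temporal pattern is the same in both halves.
trains = [
    [0, 10, 20, 30, 40, 50],         # regular
    [0, 1, 10, 11, 20, 21, 30, 31],  # alternating short and long gaps
    [0, 2, 10, 12, 20, 22],          # milder alternation
]
first, second = zip(*(split_half_lv(t) for t in trains))
print(first, second)
print(pearson(first, second))
```

In this toy example each train keeps the same pattern in both halves, so the correlation is close to 1; empirical trains would give smaller values when the temporal pattern of a hashtag changes over the observation window.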

Discussion
The main purpose of this paper is to introduce a statistical measure suitable for the analysis of non-stationary time series, which often arise in online social media and communications in social systems. As a test case, we have focused on the dynamics of hashtags in Twitter. However, the same methodology could also be applied to other types of correlated, bursty, and non-stationary signals, for instance the dynamics of cascades in Twitter and Facebook or phone call activity.
Instead of measuring standard statistical properties of noisy hashtag signals such as the inter-event time distribution, the variance or the Fano factor, conventionally applied to characterize the non-stationarity of a signal, we have focused on the local variation L_V, a metric capturing the fluctuations of a signal as compared to a local characteristic time. This measure, previously defined for neuron spike train analysis, uncovers the regularity and the firing rate of the trains [23][24][25][26][27] and so helps to identify local temporal correlations. It is important to stress that the current analysis exclusively focuses on properties of time series and considers neither the mechanisms leading to the observed statistical dynamic properties nor the effects of the underlying topology, e.g. through following-follower relations. Interesting lines of research would study the relation between L_V and the underlying topology [39] and would consider diffusive models, for instance the Hawkes process [40,41]. In addition, both neurons [30] and hashtags can be driven by multiple firing rates, and an L_V analysis associated with Gamma distributions would provide more concrete results on hashtag spike trains, as done for neuron spikes [25].
We should also note that the finite temporal resolution of the data (1 second), which implies that multiple events per time window are neglected, makes L_V artificially small for popular hashtags. In an extreme case, the time series is indeed regular, with events taking place every second. In this work, we have therefore carefully verified that the fluctuations in L_V are not artificially driven by these limitations. To this end, we have compared the values of L_V in the empirical data with those of a null model. We observe a small decay of L_V for popular hashtags in the null model (see Fig 8), but this decay is much more limited than the one observed in the empirical data, e.g. L_V = 0.89 for ⟨p⟩ ≈ 10^5 in the null model while L_V = 0.54 for the real data. In addition, a decay of L_V in the real hashtag data is also present for moderately popular hashtags, where multiple events per second are very rare. An interesting research direction would be to generalize the definition of local variation to allow for the analysis of multiple events per time window, thereby evaluating dense time series more precisely. Finally, in a finite time window, as observed in the empirical data, the statistics of high frequency hashtags are much better than those of low frequency hashtags, simply because the former occur many more times than the latter. For this reason, the measurements of L_V for less popular hashtags are more subject to noise.
The empirical analysis also reveals an interesting pattern observed in the data, as more popular hashtags tend to present more regular temporal behavior. This lack of burstiness ensures that popular hashtags do not disappear from the social network for very long periods of time, consequently allowing for a regular activation of the interest of Twitter users. These findings are reminiscent of a recent observation in numerical simulations showing that burstiness hinders the size of cascades [42], and should be incorporated into the modeling of theoretical information diffusion models, in particular threshold [43] and stochastic [44] models, on temporal networks.