Temporal clustering of disorder events during the COVID-19 pandemic

The COVID-19 pandemic has unleashed multiple public health, socio-economic, and institutional crises. Measures taken to slow the spread of the virus have fostered significant strain between authorities and citizens, leading to waves of social unrest and anti-government demonstrations. We study the temporal nature of pandemic-related disorder events as tallied by the “COVID-19 Disorder Tracker” initiative by focusing on the three countries with the largest number of incidents: India, Israel, and Mexico. By fitting Poisson and Hawkes processes to the stream of data, we find that disorder events are inter-dependent and self-excite in all three countries. Geographic clustering confirms these features at the subnational level, indicating that nationwide disorders emerge as the convergence of meso-scale patterns of self-excitation. Considerable diversity is observed among countries when computing correlations of events between subnational clusters; these differences are discussed in the context of specific political, societal and geographic characteristics. Israel, the most territorially compact country and the one where large-scale protests were coordinated in response to government lockdowns, displays the largest reactivity and the shortest period of influence following an event, as well as the strongest nationwide synchrony. In Mexico, where complete lockdown orders were never mandated, reactivity and nationwide synchrony are lowest. Our work highlights the need for authorities to promote local information campaigns to ensure that livelihoods and virus containment policies are not perceived as mutually exclusive.

From left to right: k-means clustering applied to Israel, India, Mexico. The vertical axis shows WSS(k). Note that the scales reflect the spatial extent of the countries: India, the largest by territorial extent, is associated with the largest WSS(k) range; Israel, the smallest, is associated with the smallest WSS(k) range. The vertical line denotes our elbow-method best estimate for the optimal value $k^*$, which we identify as $k^* = 4$ in all countries.
$$\underset{S}{\arg\min} \sum_{i=1}^{k} \sum_{x \in S_i} \lVert x - \nu_i \rVert^2$$

Here, $S = \{S_1, \dots, S_k\}$ is the partition of the data points into $k$ clusters, and $\lVert x - \nu_i \rVert^2$ is the square of the Euclidean distance between a point $x$ in a given cluster $S_i$ and its centroid $\nu_i$. Procedurally, $k$ centroids $\nu_i$ are initialized and each data point is assigned to its closest centroid. The mean of the positions of all points within a cluster defines the new centroid. An iterative process ensues until the discrepancies between iterations fall below a given threshold. To identify the optimal number of clusters $k^*$ we utilized the heuristic elbow method.
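The iterative procedure described above, together with the WSS(k) quantity inspected by the elbow method, can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used for the analysis: the initialization on randomly chosen data points, the tolerance, the function names and the toy coordinates are all assumptions made for the example.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, tol=1e-12, max_iter=100, seed=0):
    """Lloyd-style k-means; returns (centroids, labels, wss)."""
    rng = random.Random(seed)
    # Assumption: centroids are initialised on k randomly chosen data points.
    centroids = rng.sample(points, k)
    for _ in range(max_iter):
        # Assign every point to its closest centroid.
        labels = [min(range(k), key=lambda i: sq_dist(p, centroids[i]))
                  for p in points]
        # Move each centroid to the mean position of its members.
        updated = []
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                updated.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                updated.append(centroids[i])  # an empty cluster keeps its centroid
        shift = max(sq_dist(a, b) for a, b in zip(centroids, updated))
        centroids = updated
        if shift < tol:  # stop once the centroids no longer move appreciably
            break
    # Within-cluster sum of squared distances to the assigned centroid.
    wss = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, wss

def wss_curve(points, ks):
    """WSS(k) for each k in ks, the curve inspected by the elbow method."""
    return {k: kmeans(points, k)[2] for k in ks}

# Hypothetical toy coordinates: two well-separated groups of three points.
toy = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
       (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
curve = wss_curve(toy, range(1, 4))
```

On such toy data WSS(k) drops sharply from k = 1 to k = 2 and changes little afterwards, which is the kind of elbow the heuristic looks for. With real event coordinates one would typically rerun the algorithm from several initializations, since a single run may settle in a local minimum.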

Here, k-means clustering is applied for several increasing values of k. Once clusters are identified, the sum of the squared distances of each point within a cluster to its centroid is calculated. This k-dependent quantity is termed WSS(k), the within-cluster sum of squares. As k increases, more clusters are available; hence, one may expect WSS(k) to decrease as a function of k, since each point may find a centroid closer to it. However, beyond a critical value $k^*$ the decrease may be marginal, indicating that allowing for extra clusters does not improve the compactness of the clustering. The value beyond which the decrease in WSS(k) levels off marks the elbow and is taken as the optimal value $k^*$. In our work we use 1 < k < 10; as can be seen for all three countries of interest, India, Israel and Mexico, the optimal value is $k^* = 4$.

S1.3 Hawkes process parameter estimation

We use maximum likelihood estimation (MLE) to derive the Hawkes process parameters $\mu$, $\alpha$, $\beta$. These emerge as the ones that maximize the log-likelihood function which, in its standard form for a point process observed over $[0, T]$, is defined as

$$\ln L(\mu, \alpha, \beta) = \sum_{i=1}^{n} \ln \lambda(t_i) - \int_0^T \lambda(t)\, dt,$$

where $\{t_1, \dots, t_n\}$ is the set of the times of occurrence of the given events. The log-likelihood function compares the value of the intensity function of the Hawkes process $\lambda(t)$ at [...]

Figure S3 reveals low values of averaged weekly disorders; however, many outliers emerge, corresponding to the interval between weeks j = 37 and j = 50 mentioned above.

In this section we list the numerical values of the Pearson coefficient $r$ correlating the number of weekly events in pairs of clusters within a given country. If we denote two clusters within a country $C_X$ and $C_Y$, then $r$ is defined as

$$r = \frac{E\left[(X - \mu_X)(Y - \mu_Y)\right]}{\sigma_X \sigma_Y},$$

where $X$, $Y$ are the sets of weekly data in clusters $C_X$ and $C_Y$, respectively, $\mu_X$, $\mu_Y$ their averages, $\sigma_X$, $\sigma_Y$ their standard deviations, and $E[\cdot]$ denotes the average over weeks. Pearson's correlation coefficient ranges from −1 to 1; $r = 1$ implies a perfect, positive, linear relationship between the two sets. As $|r|$ decreases, correlations become weaker, so that $r = 0$ implies that the data points in the two sets are linearly uncorrelated.

Table 3. Pearson's correlation matrices for Mexico (see Fig. 14). Top: Entries represent correlation coefficients $r$ derived from weekly events $\{n_j\}$ for the period January 3rd to December 12th 2020 between the associated clusters. Overall, correlation values are moderately large. The highest, r = 0.826, is observed between the geographically contiguous clusters C2 and C4. The lowest, r = 0.571, is observed between clusters C2 and C4. Bottom: Entries represent correlation coefficients $r$ derived from differenced weekly events $\{\Delta n_j\}$; they show vanishing or even negative correlation, implying a lack of synchrony in the rate of change of the occurrence of events.
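As an illustration of how entries of such correlation matrices might be computed, the sketch below evaluates Pearson's r for two series of weekly counts and for their week-to-week changes. The weekly counts are hypothetical and not taken from the data set; the function names and the convention $\Delta n_j = n_{j+1} - n_j$ are likewise assumptions made for the example.

```python
import math

def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mu_x, mu_y = sum(x) / n, sum(y) / n
    # Covariance and standard deviations share the same 1/n normalisation,
    # so the ratio does not depend on that choice.
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    sigma_x = math.sqrt(sum((a - mu_x) ** 2 for a in x) / n)
    sigma_y = math.sqrt(sum((b - mu_y) ** 2 for b in y) / n)
    return cov / (sigma_x * sigma_y)

def differenced(series):
    """Week-to-week changes (assumed convention: n_{j+1} - n_j)."""
    return [b - a for a, b in zip(series, series[1:])]

# Hypothetical weekly event counts for two clusters (not data from the study).
cluster_x = [2, 5, 9, 14, 20, 23]
cluster_y = [1, 4, 6, 11, 13, 18]

r_counts = pearson_r(cluster_x, cluster_y)
r_changes = pearson_r(differenced(cluster_x), differenced(cluster_y))
```

Two series that both trend upward can be strongly correlated in their raw counts while their week-to-week changes are only weakly related, which is why the top and bottom panels of a table of this kind can tell different stories.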

S1.6 Hawkes process in a restricted time window
In this section we apply the Hawkes process to disorder events recorded by the CDT from January 3rd to October 10th 2020. Similarly to what is observed for the entire data set, the Hawkes process outperforms the Poisson process in all three countries and in all clusters, even in this limited time range. A noteworthy observation is that while the sequence of events in C4 in Israel is appropriately described by a Hawkes process until October 10th 2020 as per Table 5, the sequence of events that extends to December 12th is not.

Table 6. Statistical outcomes of the Hawkes process applied to data from Mexico up to October 10th 2020. The Hawkes process outperforms the baseline Poisson process both nationwide and in each cluster, since the Hawkes AIC is always less than the Poisson AIC. The Hawkes process passes the KS test at the 95% confidence level in all cases.
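The AIC comparison between the Hawkes and Poisson models can be illustrated with a short Python sketch. It assumes an exponential kernel, $\lambda(t) = \mu + \alpha \sum_{t_i < t} e^{-\beta (t - t_i)}$, which is a common choice but is not stated in this section. The event times and parameter values below are hypothetical and are not fitted; in practice the Hawkes log-likelihood would be maximized numerically over $(\mu, \alpha, \beta)$ before computing the AIC.

```python
import math

def hawkes_loglik(times, T, mu, alpha, beta):
    """Log-likelihood on [0, T] for the assumed exponential-kernel intensity
    lambda(t) = mu + alpha * sum_{t_i < t} exp(-beta * (t - t_i))."""
    log_sum, decay_sum, prev = 0.0, 0.0, None
    for t in times:  # times must be sorted in increasing order
        if prev is not None:
            # Recursive update of sum_{t_i < t} exp(-beta * (t - t_i)).
            decay_sum = math.exp(-beta * (t - prev)) * (1.0 + decay_sum)
        log_sum += math.log(mu + alpha * decay_sum)
        prev = t
    # Integral of the intensity over the observation window.
    integral = mu * T + (alpha / beta) * sum(
        1.0 - math.exp(-beta * (T - t)) for t in times)
    return log_sum - integral

def poisson_loglik(times, T):
    """Log-likelihood of a homogeneous Poisson process at its MLE rate n / T."""
    n = len(times)
    rate = n / T
    return n * math.log(rate) - rate * T

def aic(loglik, n_params):
    """Akaike information criterion; lower values indicate a better trade-off."""
    return 2 * n_params - 2 * loglik

# Hypothetical clustered event times (in arbitrary units), window [0, 10].
events = [0.1, 0.2, 0.3, 5.0, 5.1, 5.2]
T = 10.0

# Illustrative, unfitted Hawkes parameters (assumptions for this example).
aic_hawkes = aic(hawkes_loglik(events, T, mu=0.2, alpha=2.0, beta=3.0), 3)
aic_poisson = aic(poisson_loglik(events, T), 1)
```

For strongly clustered event times such as these, the self-exciting model attains a lower AIC than the Poisson baseline despite its two extra parameters. Setting alpha to zero recovers the Poisson log-likelihood with rate mu, which offers a quick consistency check.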