Contagion in Mass Killings and School Shootings

Background: Several past studies have found that media reports of suicides and homicides appear to subsequently increase the incidence of similar events in the community, apparently due to the coverage planting the seeds of ideation in at-risk individuals to commit similar acts.

Methods: Here we explore whether or not contagion is evident in more high-profile incidents, such as school shootings and mass killings (incidents with four or more people killed). We fit a contagion model to recent data sets related to such incidents in the US, with terms that take into account the fact that a school shooting or mass murder may temporarily increase the probability of a similar event in the immediate future, by assuming an exponential decay in contagiousness after an event.

Conclusions: We find significant evidence that mass killings involving firearms are incented by similar events in the immediate past. On average, this temporary increase in probability lasts 13 days, and each incident incites at least 0.30 new incidents (p = 0.0015). We also find significant evidence of contagion in school shootings, for which an incident is contagious for an average of 13 days, and incites an average of at least 0.22 new incidents (p = 0.0001). All p-values are assessed based on a likelihood ratio test comparing the likelihood of a contagion model to that of a null model with no contagion. On average, mass killings involving firearms occur approximately every two weeks in the US, while school shootings occur on average monthly. We find that state prevalence of firearm ownership is significantly associated with the state incidence of mass killings with firearms, school shootings, and mass shootings.

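The contagiousness of an event is assumed to decay exponentially on the days following the event, with mean duration $T_{\mathrm{excite}}$, and is zero otherwise.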
We consider a self-excitation contagion model with an additional baseline (i.e., non-contagion-related) average number of events per day, $N_0(t)$. Taking into account all prior events in some stochastic data realization, the total number of expected events, $N_{\mathrm{exp}}$, on day $t_n$ for that realization is thus the baseline $N_0(t_n)$ plus the contagion contributions summed over all prior events (Equation 1). The parameters of this contagion model are the average number of secondary events inspired by the contagion of a single event, $N_{\mathrm{secondary}}$; the duration of the contagion process, $T_{\mathrm{excite}}$; and whatever parameters are needed to describe the temporal evolution of the baseline number of events, $N_0(t)$.
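One way to write Equation 1 that is consistent with the description above is sketched here. The daily kernel $g$, the event index $j$, and the choice to integrate the exponential decay over whole days are assumptions introduced for illustration; they are not taken from the text.

```latex
% Assumed daily kernel: fraction of one event's secondary events that fall k days later
g(k \mid T_{\mathrm{excite}}) =
\begin{cases}
  e^{-(k-1)/T_{\mathrm{excite}}} - e^{-k/T_{\mathrm{excite}}}, & k = 1, 2, \dots \\
  0, & \text{otherwise}
\end{cases}
\qquad
\sum_{k \ge 1} g(k \mid T_{\mathrm{excite}}) = 1

% Expected number of events on day t_n: baseline plus self-excitation from prior events
N_{\mathrm{exp}}(t_n) = N_0(t_n)
  + N_{\mathrm{secondary}} \sum_{j:\, t_j < t_n} g(t_n - t_j \mid T_{\mathrm{excite}})
```

Because the assumed kernel sums to one over all future days, $N_{\mathrm{secondary}}$ keeps its stated meaning as the average number of secondary events per event.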
Let us refer to the parameters of the self-excitation contagion model for $N_{\mathrm{exp}}(t)$ in Equation 1 as the vector $\theta$ (these parameters include $N_{\mathrm{secondary}}$, $T_{\mathrm{excite}}$, and whatever parameters are needed to parameterize the temporal evolution of the baseline average number of events per day, $N_0$). We find the best-fit model to the data by finding the value of $\theta$ that minimizes the negative Poisson log-likelihood

$$-\ln L(\theta) = \sum_{i=i_0}^{M} \left[ N_{\mathrm{exp}}(t_i \mid \theta) - N_{\mathrm{obs}}(t_i) \ln N_{\mathrm{exp}}(t_i \mid \theta) + \ln N_{\mathrm{obs}}(t_i)! \right],$$

where $N_{\mathrm{obs}}(t_i)$ is the observed number of events on day $t_i$ and the fit is performed to observations $i_0$ to $M$. The data are left-censored, meaning that information about possible earlier events in the self-excitation process is missing. To avoid the bias this censoring might cause in the estimated parameters of the self-excitation process, we choose $i_0$ to be large enough that the calculation of $N_{\mathrm{exp}}$ includes several prior events. We nominally take $i_0 = 365$ days, and cross-check each analysis with $i_0$ equal to two years to ensure that the results do not significantly change.
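The fit described above can be sketched in a few lines of Python. This is a minimal illustration rather than the code used in the study: the function names are hypothetical, the baseline is passed in as a per-day list, and the daily exponential kernel (the integral of an exponential decay over each whole day) is an assumed form, since the exact kernel is not given in this text.

```python
import math


def daily_kernel(lag_days, t_excite):
    """Assumed fraction of an event's secondary events falling `lag_days` later.

    Zero for lags below one day, i.e. on or before the day of the event.
    """
    if lag_days < 1:
        return 0.0
    return math.exp(-(lag_days - 1) / t_excite) - math.exp(-lag_days / t_excite)


def expected_events(counts, day, baseline, n_secondary, t_excite):
    """Baseline on `day` plus contagion contributions from all earlier events."""
    excitation = sum(
        counts[prev] * daily_kernel(day - prev, t_excite)
        for prev in range(day)
        if counts[prev] > 0
    )
    return baseline[day] + n_secondary * excitation


def neg_log_likelihood(counts, baseline, n_secondary, t_excite, i0=0):
    """Negative Poisson log-likelihood over days i0 .. len(counts) - 1.

    Requires a strictly positive expected count on every fitted day.
    """
    total = 0.0
    for day in range(i0, len(counts)):
        mu = expected_events(counts, day, baseline, n_secondary, t_excite)
        # lgamma(n + 1) equals ln(n!), the constant term of the Poisson likelihood
        total += mu - counts[day] * math.log(mu) + math.lgamma(counts[day] + 1)
    return total
```

Setting `n_secondary` to zero gives the likelihood of the no-contagion null model for the same baseline, so the same function can serve both sides of a likelihood ratio comparison. Any general-purpose minimizer, or a simple grid scan over `n_secondary` and `t_excite`, could be used to locate the minimum.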
We simulated data samples with the same sizes and temporal variation (including weekday and seasonal variation) as the samples used in this study to confirm that the fitting procedure produced unbiased and efficient estimates of $T_{\mathrm{excite}}$ and $N_{\mathrm{secondary}}$.
With additional simulation studies, we examined the effect of incomplete observation of the data, in which only a fraction, $f_{\mathrm{observed}}$, of the true number of incidents is actually contained within a particular data set, as would occur, for instance, if some killings or school shootings did not receive media attention. We found that for such data the estimates of $T_{\mathrm{excite}}$ were unbiased and efficient, but that the estimates of $N_{\mathrm{secondary}}$ and $N_0$ were scaled by $f_{\mathrm{observed}}$. Because $f_{\mathrm{observed}}$ cannot exceed one, the estimates of $N_{\mathrm{secondary}}$ obtained from the fits to the data samples used in this analysis are lower bounds on the true values.
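Incomplete observation of this kind can be mimicked in a simulation study by randomly discarding events. The sketch below assumes that each incident is recorded independently with probability `f_observed`; the function name and that independence assumption are illustrative and are not taken from the text.

```python
import random


def thin_counts(counts, f_observed, seed=None):
    """Keep each simulated event independently with probability f_observed."""
    rng = random.Random(seed)
    return [
        sum(1 for _ in range(n_events) if rng.random() < f_observed)
        for n_events in counts
    ]
```

Fitting the contagion model to the thinned series and comparing the result with the fit to the full series is one way to check how the estimates respond to missing incidents.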

Simulation of self-excitation contagion processes
To simulate a self-excitation process under some hypothesis of $N_{\mathrm{secondary}}$ and $T_{\mathrm{excite}}$, along with a functional parameterization of $N_0(t)$, we start with 0 events on day $t_1$ and calculate the expected number of events on the next day, $N_{\mathrm{exp}}(t_2)$, using Equation 1. The number of events on day $t_2$ is then simulated with a random number drawn from the Poisson distribution with mean $N_{\mathrm{exp}}(t_2)$. On each subsequent day, $t_i$, the number of events depends on the timing of the past events, and is simulated with a random number drawn from the Poisson distribution with mean $N_{\mathrm{exp}}(t_i)$. This is repeated for the desired length of the time series.
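The day-by-day procedure above can be sketched as follows. This is an illustration under stated assumptions, not the simulation code of the study: the function names are hypothetical, the baseline is supplied as a per-day list, and the contagion term uses an assumed daily exponential kernel (the integral of an exponential decay over each whole day), which lets the sum over prior events be updated recursively.

```python
import math
import random


def poisson_draw(mean, rng):
    """Draw a Poisson random number with Knuth's method (suited to small means)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate_counts(n_days, baseline, n_secondary, t_excite, seed=None):
    """Simulate daily event counts from a self-excitation process."""
    rng = random.Random(seed)
    decay = math.exp(-1.0 / t_excite)
    counts = [0] * n_days  # the first day (index 0) starts with zero events
    memory = 0.0  # sum over prior events of exp(-(lag - 1) / t_excite)
    for day in range(1, n_days):
        # Age the contribution of older events by one day, then add yesterday's events
        memory = decay * memory + counts[day - 1]
        mean = baseline[day] + n_secondary * (1.0 - decay) * memory
        counts[day] = poisson_draw(mean, rng)
    return counts
```

For example, `simulate_counts(5000, [0.07] * 5000, 0.3, 13.0, seed=1)` would generate one realization with a constant baseline; the numerical values here are placeholders chosen only to show the call.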

Running mean estimation of baseline number of incidents per day
In order to obtain an estimate of the temporal evolution of the baseline (non-contagion-related) number of events per day, $N_0(t)$, from the data itself, we assume that the changes in $N_0(t)$ occur on a longer time scale than the changes due to the self-excitation process, and employ a kernel-weighted running mean with two parameters, a bin-width parameter $\sigma$ and a second parameter $\Delta t$. The parameter $\Delta t$ must be chosen to be large enough that points close in time to $t$ that may be due to self-excitation do not bias the estimate of $N_0$. Similarly, the bin width $\sigma$ must be large enough to avoid similar bias (and to include enough events within the bin width to ensure accurate estimates of the true value of $N_0(t)$), yet small enough that the running mean provides unbiased estimates of the short-term changes in the temporal evolution of the baseline process. Using simulated data sets similar in size to the sets used in these studies, with a simulated self-excitation process under various hypotheses of $T_{\mathrm{excite}}$ between 7 and 28 days and $N_{\mathrm{secondary}}$ between 0 and 0.5, we found that $\sigma = 365/2$ days and $\Delta t = 30$ days yielded unbiased and efficient estimates of $N_{\mathrm{secondary}}$ and $T_{\mathrm{excite}}$ for simulated samples of the same sizes and roughly the same temporal trends as the data sets.
We cross-checked our studies by repeating all of the fits with $\sigma = 365$ days and $\sigma = 365/4$ days, and with $\Delta t = 15$ days and $\Delta t = 45$ days, and found no significant difference in the central values returned by any of the fits.
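A running-mean baseline of this kind might look like the sketch below. The exact estimator is not given in this text, so the form here is an assumption: a Gaussian kernel of width `sigma` applied to the daily counts, with all days within `delta_t` of the target day excluded so that nearby, possibly self-excited, events do not enter the estimate. The function name is hypothetical.

```python
import math


def running_mean_baseline(counts, sigma=365 / 2, delta_t=30):
    """Kernel-weighted mean of daily counts, skipping days within delta_t of t.

    Assumed form: Gaussian weights of width sigma (in days). The double loop is
    quadratic in the series length, which is acceptable for a sketch.
    """
    n_days = len(counts)
    baseline = []
    for t in range(n_days):
        weighted_sum = 0.0
        weight_total = 0.0
        for s in range(n_days):
            if abs(s - t) <= delta_t:
                continue  # exclude days close to t
            weight = math.exp(-0.5 * ((s - t) / sigma) ** 2)
            weighted_sum += weight * counts[s]
            weight_total += weight
        baseline.append(weighted_sum / weight_total if weight_total > 0 else 0.0)
    return baseline
```

Repeating the estimate with other values of `sigma` and `delta_t`, as in the cross-check above, only requires changing the two keyword arguments.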

Model validation
To ensure that the model reliably detects when no contagion is present, and that day-of-week or seasonal effects do not spuriously produce apparently significant contagion when none is in fact present, we generated 100 simulated samples under the null hypothesis of no contagion. The samples had similar size and similar weekday and seasonal variation to the Brady Campaign school shooting data set, which exhibited the most extreme variations by weekday and season of the three data sets considered in this analysis. From this data set we determined the average number of events occurring within each month and on each weekday, and used this as our expected model.
The contagion model fit to these samples yielded values of $N_{\mathrm{secondary}}$ consistent with zero in 98 out of the 100 trials.
In addition, to ensure that the contagion model has good predictive power for our data, for each data set we perform 100 bootstrap iterations in which half of the sample is randomly selected as the training sample and the remainder is used as the test sample. For the Brady Campaign mass shooting data, the full model with parameters fit to the training sample is more likely than the null model for the test sample 58% of the time.

Other cross-checks
In addition to model validation, we perform cross-checks for each data sample in which the date of each incident is randomly shifted by a time $\Delta T$, where $\Delta T$ is sampled from the discrete uniform distribution from −90 to +90 days. Simulation studies show that this shift is large enough to destroy any self-excitation effects with $T_{\mathrm{excite}} < 90/3 = 30$ days, yet short enough to preserve the overall temporal shape of the kernel-weighted running mean of the sample (thus confirming that the overall temporal distribution of the data is not in itself responsible for a spurious finding of significant contagion when none is in fact present). For all three data samples, this cross-check yielded values of $N_{\mathrm{secondary}}$ consistent with zero.
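The date-shifting step can be sketched as below, assuming the incidents are stored as integer day numbers. The function name is hypothetical, and how shifted dates that fall outside the original observation window are handled is left open here.

```python
import random


def shift_event_days(event_days, max_shift=90, seed=None):
    """Shift each incident by an integer drawn uniformly from [-max_shift, +max_shift]."""
    rng = random.Random(seed)
    # random.randint includes both endpoints, matching a discrete uniform on -90..+90
    return sorted(day + rng.randint(-max_shift, max_shift) for day in event_days)
```

The shifted dates can then be rebinned into daily counts and passed through the same fit as the original sample.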