Clustered Desynchronization from High-Frequency Deep Brain Stimulation

While high-frequency deep brain stimulation is a well established treatment for Parkinson’s disease, its underlying mechanisms remain elusive. Here, we show that two competing hypotheses, desynchronization and entrainment in a population of model neurons, may not be mutually exclusive. We find that in a noisy group of phase oscillators, high frequency perturbations can separate the population into multiple clusters, each with a nearly identical proportion of the overall population. This phenomenon can be understood by studying maps of the underlying deterministic system and is guaranteed to be observed for small noise strengths. When we apply this framework to populations of Type I and Type II neurons, we observe clustered desynchronization at many pulsing frequencies.


Introduction
High frequency deep brain stimulation (DBS), a medical treatment in which high-frequency, pulsatile current is injected into an appropriate brain region, is a well established technique for alleviating tremors, rigidity, and bradykinesia in patients with Parkinson's disease [1,2]. While the underlying mechanisms of deep brain stimulation remain unknown, it is well documented that local field potential recordings recorded in the subthalamic nucleus of patients with Parkinson's disease display increased power in the beta range (approximately 13-35 Hz) [3][4][5]. These findings have led to the hypothesis that pathological synchronization among neurons in the basal ganglia-cortical loop contribute to the motor symptoms of Parkinson's disease [6][7][8]. This hypothesis has been supported by findings that when DBS is applied to the STN, abatement of motor symptoms is correlated with a decrease the power in the beta band of the local field potential recorded from STN [9][10][11]. This line of thinking has led researchers to develop new strategies for desynchronizing populations of pathologically synchronized oscillators, [12][13][14], some of which have shown promise as new treatment options for Parkinson's disease in animal and human studies [15,16].
While many factors including the location of the DBS probe, stimulus duration, and stimulus magnitude influence the efficacy of DBS, one factor that is difficult to explain is the strong dependency on stimulus frequency. Low-frequency stimulation ( 50 Hz) is generally ineffective at reducing symptoms of Parkinson's disease while high-frequency stimulation from 70 to 1000 Hz and beyond has been shown to be therapeutically effective [17][18][19]. However, not all high frequency stimulation is equally effective, and clinicians have generally settled on a therapeutic range at about 130-180 Hz. [20,21].
In an effort to provide insight into the frequency dependent effects of DBS, the authors of [22] postulated that specifically tuned pulse parameters might yield chaotic desynchronization in a network of neurons. If desynchronization is the goal of DBS, then achieving it chaotically is a worthwhile objective. However, this can generally only be seen in a small window of pulse parameters and frequencies which may make it difficult to observe in real neurons. Furthermore, in both brain slices and in vivo recordings, individual neuronal spikes have been found to be time-locked to the external high-frequency stimulation [23][24][25][26][27][28] which would be unlikely if the spike times were chaotic.
Here we present a different viewpoint showing that with high frequency pulsatile stimulation, in the presence of a small amount of noise, a population of neurons can split into distinct clusters, each containing a nearly identical proportion of the overall population. We find that the number of clusters, and hence desynchronization, is highly dependent the pulsing frequency and strength. We provide theoretical insight into this phenomenon and show that it can be observed over a wide range of pulsing frequencies and pulsing strengths. This viewpoint merges two seemingly contradictory hypotheses, showing that the therapeutic effect of the periodic pulsing could be to replace the pathological behavior with a less synchronous pattern of activity, even if individual neuronal spikes are phase locked to the DBS pulses. capacitance, I b = 1.93μA/μF is a baseline current chosen so that in the absence of external perturbations and noise the firing rate is 60 Hz, and N is the total number of neurons.
Using phase reduction, [30,31], we can study Eq (1) in a more convenient form: where θ 2 [0, 1) is the phase of the neuron with θ = 0 defined to be the time the neuron fires an action potential, ω is the natural frequency of oscillation, f(θ) is a continuous function which describes the effect of the DBS pulse, τ is a positive constant that determines the period of the DBS input, and Z(θ) is the neuron's phase response curve to an infinitesimal perturbation.
Here we assume that is small enough so that higher order noise terms are negligible (c.f. [32,33]). Fig 1 shows an example charge-balanced pulsatile stimulus. We take the positive portion to be five times larger than the negative portion, with the positive current applied for 100 μs.
The bottom panel shows the function f(θ) for a given stimulus intensity, calculated using the direct method [34]: a pulsatile perturbation is applied to a neuron at a known phase θ p so that f (θ p ) can be inferred by measuring the timing of the next spike. We note that even though the DBS pulse itself is not a δ-function, it is of short enough duration that Eq (2) is an accurate approximation to Eq (1). We simulate Eq (1) with 1000 neurons, taking a pulse strength S = 110μA/μF, and noise strength ¼ ffiffiffiffiffiffiffiffiffi 0:05 p , for various pulsing frequencies, with results shown in Fig 2. After some initial transients, we find the network tends to settle to a state with different numbers of clusters for different pulsing frequencies. From the probability distributions of neural phases ρ(θ), the bottom panels show somewhat surprisingly that once the network settles to a clustered state, each cluster contains a nearly identical portion of the overall population. Also, upon reaching the steady distribution, neurons can still transition between clusters, but on average, the amount that leave a given cluster must be identical to the amount that enter.  individual voltage traces for 50 sample neurons from this population after the network settles to a clustered state. Highlighted traces represent neurons from each cluster. In general, increasing the number of clusters will decrease synchrony in the population. Furthermore, neurons are more likely to transition between clusters as the overall number of clusters becomes larger.

A Theoretical Basis for Clustered Desynchronization
For simplicity of notation, we will take ω = 1 for Eq (2) in the theoretical analysis, but note that any other value could be considered to obtain qualitatively similar results. In the absence of noise, one may integrate Eq (2) for a single neuron θ to yield In this work, we are interested in the state of the system immediately after each pulsatile input. By integrating Eq (2), the system dynamics can be understood in terms of compositions of a map where g(s) = s + f(s + τ) + τ and g (n) denotes the composition of g with itself n times, and θ 0 is the initial state of a neuron. In Eqs (3) and (4), θ(t) and the arguments of f and g are always  evaluated modulo 1. If g (n) has a stable fixed point, then any oscillator which starts within its basin of attraction will approach that fixed point as time approaches infinity [35]. With noise, the dynamics are more complicated. In this case, the phase of each neuron cannot be determined exactly from Eq (2), but rather, follows a probability distribution. For a neuron with known initial phase θ 0 , after mτ has elapsed, the corresponding δ-function distribution δ(θ − θ 0 ) will be mapped to the Gaussian distribution with mean μ = g (m) (θ 0 ) given by Eq (23) and variance ν given by Eq (24). In order to study the infinite time behavior of Eq (2) it can be useful to calculate steady state probability distributions for the population of neurons. To simplify the analysis, we will study Eq (2) as a series of stochastic maps applied to an initial phase density (c.f. [22,33]), where P mτ is the linear Frobenius Perron operator corresponding to evolution for the time mτ, and ρ(θ, t) is the probability distribution of phases at time t. We can approximate P mτ by the matrix P mt 2 R MÂM by using eq (5) to determine each column of the matrix for a set of discretized phases, Δθ = 1/M. In Fig 4 for instance, the map g (2) yields the stochastic matrix P 2t , shown in the panel on the right. The delta function distribution (arrow) is mapped to a Gaussian distribution (solid line). The matrix P mt has all positive entries, and since probability is conserved, the matrix is column stochastic (i.e. the columns of P mt sum to 1). For this class of matrices, the Perron-Frobenius theorem allows us to write [36,37], where v and w are the right and left eigenvectors associated with the unique eigenvalue of 1, and normalized so that w T v = 1. Recalling that P mt is column stochastic, its left eigenvector associated with λ = 1 is 1 T . Therefore, as the map is applied repeatedly, any initial distribution will approach a steady state distribution determined by v. We find that the existence of m fixed points of the underlying map g (m) (θ) provide the basis for the clustered desynchronization seen in Fig 2, with a more formal main theoretical result given below.

Main Theoretical Result
Consider the map g (m) with the following properties: 1. g (m) has exactly m stable fixed points corresponding to a period m orbit of g (1) , 2. g (m) has m unstable fixed points and no center fixed points, Then for a given choice of 1 ( 1, we may choose small enough in eq (2) so that the distribution of phases will asymptotically approach a state with m distinct clusters, each containing 1=m þ Oð 1 Þ of the total probability density. A proof of this statement is given in the Methods Section. In this detailed proof, we find that desynchronization can still be guaranteed even when g (m) is not monotonic as long as a more general set of conditions is satisfied. Note that because Eq (2) does not contain any coupling terms, noise will drive the system to a uniform, desynchronized, distribution in the absence of DBS input. In the sections to follow, we give numerical and theoretical evidence that clustered desynchronization can emerge in a population of pathologically synchronized neurons when the DBS pulses overwhelm the terms responsible for the synchronization.

Arnold Tongues and Lyapunov Exponents
Using the main theoretical result, we can calculate regions of parameter space where we expect clustered desynchronization. The top-left panel of Fig 5 gives regions of parameter space where clustering is expected, giving the appearance of Arnold tongues [35]. White regions in the graph represent regions where either clustering is not guaranteed, or where we expect more than five clusters. However, we do not include these regions in the figure because they only exist for very narrow regions of parameter space. At around 60 Hz, the natural unforced period of the neural population, exactly one cluster is guaranteed, corresponding to 1:1 locking (one DBS pulse per neural spike). This locking corresponds to a highly synchronous state, which we found when forcing the population at 63 Hz in Fig 2. For pulsing frequencies between 80 and 120 Hz, we see prominent tongues corresponding to states with three, four and five clusters, which correspond to the states in Fig 2 where we force at 83 Hz and 94 Hz. A very wide tongue corresponding to 2:1 locking (two DBS pulses per neural spike) exists at frequencies ranging from 120 to 200 Hz, which is where DBS is often seen to be effective. Pulsing in this region manifests in the behavior seen with 120 Hz forcing in Fig 2. To make comparisons with [22] we calculate the average Lyapunov exponent of the resulting steady state distributions using Eq (12). For Lyapunov exponents greater than zero (resp., less than zero), the pulsatile stimulus will, on average, desynchronize (resp., synchronize) neurons which are close in phase, and this has been proposed as an indicator of the overall desynchronization that might be observed in a population of neurons receiving periodic DBS pulses. The Lyapunov exponent is calculated for multiple parameter values for a system with a noise strength = 0.1. Results are given in the bottom-left panel of Fig 5. We note that results are not qualitatively different for different noise strengths. Compared with the Arnold tongues in the top-left panel, we find very narrow regions where the Lyapunov exponent is positive at relatively high stimulus strengths. The top-right panels show the steady state distribution for a population with pulses of strength S = 50μA/μF at a rate of 119 Hz (resp., 180 Hz) corresponding to a two (resp., three) cluster steady state. The bottom-right panels show the steady state distribution for a pulsing strength of S = 208μA/μF at 120 Hz and S = 206μA/μF at 290 Hz corresponding to regions with positive Lyapunov exponents. Even though the clustered states have very negative Lyapunov exponents, they show similar clustering behavior to the states with a positive Lyapunov exponent. However, the clustered desynchronization in the top-right panels can be accomplished using a significantly weaker stimulus and can be observed at a much wider range of pulsing parameters.

Heterogeneous Pulsatile Inputs
Results thus far have focused on populations of neurons receiving homogeneous pulsatile inputs. However, it is well established that voltage fields vary significantly with distance from an external voltage probe [38]. In computational models such heterogeneity has been shown to create complicated patterns of phase locking that are dependent on the stimulation strength [39] and can improve methods designed to desynchronize large populations of neurons [14].
To understand the emergence of clustered synchronization when external inputs are different among neurons, we can modify the stochastic differential eq (2) as follows Here, f i (θ) is calculated based on the pulsatile input to each neuron. For each neuron, we use Eq (7) to calculate its steady state probability distribution. The state of each neuron is independent of the others, so that the average of the individual distributions gives an overall probability distribution for the population.
As an illustrative example, we model 1000 neurons of the form Eq (1) receiving a charge balanced input of the same shape as in . While the main clustering results are guaranteed when the noise strength is small enough, we find that clustering can still occur at higher levels of noise. The steady state probability distribution in corresponding simulations of Eq (1) with heterogeneous pulsing strengths (blue dashed curve) agrees well with the theoretical probability density (red dashed curve) calculated from the average of each black curve in the top-right panel. The bottom panel shows corresponding cumulative distributions for the theoretical (red) and computationally observed (blue) probability densities highlighting that similar numbers of neurons are contained in each cluster.
Similar clustering results can be obtained for different heterogeneous stimulus parameters. For instance, from

Desynchronization of Neural Populations
Our main clustering results are for single population of neurons which do not explicitly take interactions between multiple populations of neurons into account, as is the case for the brain circuit responsible for Parkinsonian tremor. Here, we provide evidence that clustered desynchronization can still emerge when additional forcing terms are much smaller in magnitude than the external DBS pulses.
Populations of coupled oscillators subject to common external forcing have been widely studied in the form of the forced Kuramoto model [40][41][42]. Synchronization can be observed when either external forcing or intrinsic coupling dominate the system dynamics. For intermediate coupling and external forcing strengths, a complicated bifurcation structure emerges and the macroscopic order parameter, describing the overall synchronization of the population, can oscillate. These behaviors have been observed in chemical oscillator systems [43] and have implications to externally forced biological rhythms such as circadian oscillations and neural oscillations [44]. We simulate Eq (1) with an additional external sinusoidal forcing frequency which could represent an aggregate input from a separate, unperturbed neural population. We note that this is not meant to represent a physiologically accurate model of DBS, but instead is meant to illustrate clustered desynchronization in the presence of a common periodic perturbation. Here, we take u(t) = 0.1 sin(ω ext t) + u DBS (t), where ω ext is chosen so that the frequency of oscillation is the same as the natural period of the unperturbed neurons (60 Hz) and u DBS (t) represents the common pulsatile input. For these simulations, N = 1000. As we show in the Methods Section we may write this system in an identical form as Eq (2) We find that the presence of a sinusoidal external stimulus is sufficient to synchronize the network in the absence of DBS forcing. When the DBS is turned on at both 83 and 94 Hz, we see three and four cluster states, respectively, just as we observed in the simulation without external forcing. However, in this simulation, the mean phase of each cluster varies with the external sinusoidal stimulation. Note that 120 Hz stimulation in this network also leads to two cluster desynchronization but is not shown.
When neurons are synchronized through forcing that is not periodic, clustered desynchronization may still emerge when the DBS pulsing overwhelms the stimulation responsible for synchronization. As a second example, we model a network of neurons Eq (1) with an additional synaptic current, with each neuron's transmembrane voltage dynamics taking the form Here, where K determines the magnitude of the synaptic current, V G is the reversal potential of a given neurotransmitter, and s k an additional synaptic variable associated with neuron k that evolves according to (c.f. [30]) _ s k ¼ a 2 ð1 À s k Þð1=ð1 þ expðÀðV k À V T Þ=s T ÞÞÞ À b 2 s k , where α 2 = 2, V T = -37, σ T = 2, and β 2 = 0.1. We simulate the resulting network withV g = 60mV,  Simulations of (2) with additional sinusoidal forcing. In the top panel, snapshots of the probability distribution are taken immediately preceding every third and fourth pulse for the 83 and 94 Hz DBS pulsing, respectively. When there is no DBS pulsing, snapshots are taken at the sinusoidal forcing frequency. As averaging the term associated with the sinusoidal forcing would suggest, we observe desynchronization into three and four clusters for pulsing at 94 and 83 Hz pulsing, respectively. When the DBS pulsing is turned off at t = 8 seconds, the system quickly settles to a synchronized state. The bottom panels show the probability density averaged over multiple snapshots rðyÞ after the initial transient has decayed. Horizontal dashed lines denote troughs of the probability distributions. The probability contained between successive troughs is 0.32, 0.35, and 0.33 in the left panels, and 0.26, 0.26, 0.25, and 0.23 in the right panels. stimulation with S = 200μA/μF, the pulsing quickly overwhelms the synchronizing influence of the coupling, and the population splits into two separate clusters as shown by the probability densities in Panels A and individual voltage traces in panel C. When DBS is applied, we see from the average probability distributions and cumulative distributions in panels F and E, respectively that there are nearly equal proportions of neurons in each cluster. Other computational modeling [29] has suggested that pulsatile DBS may help regulate neural firing patterns, and help alleviate strongly oscillatory synaptic inputs. Panel D shows a similar phenemenon, when DBS is on, high amplitude oscillations in synaptic current are replaced by oscillations with a higher frequency but smaller amplitude. The desynchronization results here can be observed for many choices of parameters provided the pulsatile stimulation is significantly stronger than the synchronizing stimulation and that clustering behavior is expected in the absence of coupling.

Clustered Desynchronization in Type II Hodgkin-Huxley Neurons
Consider a two dimensional reduction of the classic Hodgkin-Huxley equations [45] which reproduce the essential dynamical behavior [46]: Here V i and n i represent the transmembrane voltage and gating variables, respectively. All functions and parameters are identical to those given in [47]. DBS pulses are represented by the external current u(t), which is given identically to each neuron, η i (t) is a white noise process, C = 1μF/cm 2 is the constant neural membrane capacitance, I b = 10μA/μF is a baseline current yielding a firing rate of 84.7 Hz in the absence of external perturbation, and N is the total number of neurons.
Unlike the model for thalamic neurons used in the main text, the Hodgkin-Huxley neuron displays Type II phase response properties, i.e., a monophasic pulsatile input can act to either significantly increase or decrease the phase of the neuron. The top panel of Fig 9 shows an example monophasic stimulus which will be applied to the network Eq (10) at different strengths, S and periods τ. For this example, the pulse duration will be 100 μs, approximately, one percent of the neural firing rate.
For this model, using our main theoretical results, we can also calculate regions of parameter space in which we expect to observe clustered desynchronization, with results shown in the middle panel of Fig 10. The Arnold tongues for clustering greater than five become quite narrow and are not included in this figure. We also calculate the average Lyapunov exponent for the steady state distribution using eq (12) from the main text for a noise strength of = 0.15, with results shown in the bottom panel. We note that unlike for the thalamic neurons, the Lyapunov exponent for the Hodgkin-Huxley network is never positive. We find that regions with the lowest Lyapunov exponents tend to correlate with regions where clustered desynchronization is guaranteed. Even though the Lyapunov exponent might be quite negative, the steady state distribution can still be relatively desynchronized if there are a large number of clusters, as evidenced by the four cluster state in the top panel.
Finally, we simulate Eq (10) with N = 1000 neurons with a pulse strength S = 10μA/μF and ¼ ffiffiffiffiffiffi ffi 0:3 p for pulsing frequencies that are expected to yield clustered desynchronization determined from Fig 10. Results are shown in Fig 11. We find clustered states begin to form almost immediately, and in the bottom panel, after the system has approached the steady state distribution, each cluster contains an approximately identical proportion of the population.

Discussion
While deep brain stimulation is an important treatment for patients with medically intractable Parkinson's disease, its fundamental mechanisms remain unknown. Making matters more complicated, experimental studies have shown that the symptoms of Parkinson's can be alleviated using strategies that seek to desynchronize a population of pathologically synchronized oscillators [15,16], while other seemingly contradictory studies have shown that neurons have a tendency to time-lock to external high-frequency pulses [23][24][25][26][27], supporting the hypothesis that entrainment is necessary to replace the pathological neural activity in order to alleviate the symptoms of Parkinson's disease. In this work we have have shown that these two phenomenon may be happening in concert: in the presence of a small amount of noise, high frequency pulsing at a wide range of frequencies is expected to split a larger population of neurons into subpopulations, each with a nearly equal proportion of the overall population. The number of subpopulations, and hence the level of desynchronization, is determined by phase locking relationships which can be found by analyzing the phase reduced system in the absence of noise. We note that other theoretical [12] and experimental [16] [15] work has yielded control strategies that are specifically designed to split a pathologically synchronized neural population into distinct clusters. The theory presented in this paper suggests that clinical DBS may be accomplishing the same task with a single probe.
The conditions we have developed guarantee clustered desynchronization for small enough noise, but we do not give any a priori estimate of how small the noise needs to be so that distinct clusters can be observed. If the noise is too large, the clusters may start to merge into one another, particularly when there are a large number of clusters (see the bottom left panel of Fig  2). Even in this case, however, we still have discernible clusters, throughout which the overall population of neurons is spread relatively evenly. We also note that this theory does not give any estimates on the time it takes for the system to achieve its steady state population distribution, but this can be calculated for a specific population by examining the second smallest eigenvalue in magnitude, λ 2 , of a given stochastic matrix P mt (c.f. [33]). As λ 2 becomes closer to 1, more iterations of the map P mt will be required for the transient dynamics to die out, and it will take longer for the system to approach the steady state distribution. In general, we find that for a given map, λ 2 becomes closer to one as noise strength decreases, which is consistent with the notion that the average escape time between clusters will increase as the strength of the external noise decreases [48].
For the networks that we have investigated, regions of parameter space which are associated with either clustered desynchronization or positive Lyapunov exponents can display similar levels of desynchronization. However, numerics show that the regions with positive Lyapunov exponents are quite small and may be difficult to find without explicit calculation. In contrast, the regions of parameter space associated with clustered desynchronization are fairly large and are likely to be observed without knowledge of the system properties. If desynchronization of the overall population is an important mechanism of high-frequency DBS, doing so chaotically may be an overly restrictive objective if clustered desynchronization is sufficient to alleviate the motor symptoms of Parkinson's disease.
This study is certainly not without limitations. For instance, the computational neurons considered in this study are based on simple, low-dimensional models of neural spiking behavior. However, we have developed the theory to understand the clustered desynchronization phenomenon in such a way that it can easily be extended to more complicated neural models with more physiologically detailed dynamics provided the neural phase response properties can be measured experimentally in vivo [49]. Furthermore, while we only considered homogeneous populations in this study, the phase response properties and natural frequencies of a living population of neurons will surely have a heterogeneous distribution. In this context, we could still show that clustered desynchronization is expected by applying the theory developed in this work to a family of neurons with different phase response properties and natural frequencies. The expected steady state population could then be obtained as a weighted average of the individual steady state distributions. Numerical results presented here apply to networks for which external DBS perturbations overwhelm the intrinsic coupling between neurons. In this work, we have not considered the complicated interplay between multiple populations of neurons which give rise to the symptoms of Parkinson's disease; more detailed modelling studies would be required to determine the effect of clustered desynchronization on the overall network circuit. Others have studied synchrony and clustering behavior in coupled populations of neurons [50][51][52] and it is possible that our results could be extended to describe clustering for weaker pulsatile stimuli when coupling cannot be neglected.
Our results suggest that high-frequency external pulsing could have the effect of separating a neural population into equal subpopulations in the presence of noise. This viewpoint could help explain the frequency dependent nature of therapeutically effective DBS and could help merge competing hypotheses, as desynchronization and entrainment are not mutually exclusive when even small amounts of noise are present. If clustered desynchronization does provide a mechanism by which the motor symptoms associated with Parkinson's disease can be mitigated, it could provide a useful control objective for designing better open-loop DBS stimuli in order to prolong battery life of the implantable device and to mitigate potential side effects of this therapy.

Average Lyapunov Exponents
To make comparisons with [22] we calculate the average Lyapunov exponent of the resulting steady state distributions, giving a sense of whether, on average, the orbits of the trajectories oscillators from Eq (2) are converging or diverging. For instance, let ϕ denote the phase difference between oscillators θ 1 and θ 2 which are close in phase, i.e. ϕ(t)|θ 1 (t) − θ 2 (t)|. Then from Eq (3), immediately after a DBS pulse occuring at time τ, ðt þ Þ ¼ jf ðy 1 ðt À ÞÞ þ y 1 ðt À Þ À f ðy 2 ðt À ÞÞ À y 2 ðt À Þj; ¼ jf ðy 2 ðt À ÞÞ þ f 0 ðy 2 ðt À ÞÞðt À Þ þ Oððt À Þ 2 Þ þ y 1 ðt À Þ À f ðy 2 ðt À ÞÞ À y 2 ðt À Þj; where 0 d/dθ and θ(τ − ) (resp. θ(τ + )) denotes the limit of θ(t) as t approaches τ from below (resp. above). Note that in the second line, we have used a Taylor expansion of f about θ 2 for small values of ϕ(τ − ). Therefore, the oscillators converge or diverge locally depending upon the derivative of f. For a population of neurons, the stochastic matrix P t for a given pulsing rate can be used to determine the steady state distribution ρ Ã (θ) before each pulse, with an average Lyapunov exponent taken to be (c.f. [22]): For LE > 0 (resp., LE < 0), the pulsatile stimulus will, on average, desynchronize (resp., synchronize) neurons which are close in phase, and this has been proposed as an indicator of the overall desynchronization that might be observed in a population of neurons receiving periodic DBS pulses.

Expected Value and Variance of a Noisy Neuron with External Pulses
For a single neuron with a known initial phase θ evolving according to the stochastic differential Eq (2), we calculate the expected value and variance with a strategy that is similar to the one employed in [33]. We first asymptotically expand θ(t) in orders of , with θ 0 (0) = θ(0), and θ 1 (0) = 0. Substituting Eq (13) into Eq (2) and taking ω = 1 for simplicity of notation, allows us to write Integrating eq (14) for a time τ yields, This relationship can be used to write θ 0 in terms of compositions of a map: where g(s) = s + f(s + τ) + τ and g (n) denotes the composition of g with itself n times. In Eqs (16) and (17), θ(t) and the arguments of f and g are always evaluated modulo 1.

Proof of Clustered Desynchronization in the Limit of Small Noise
Suppose that conditions 1 and 2 from our main theoretical results are satisfied for g (m) with m stable fixed points. Consider the corresponding stochastic matrix P mt . Denote the stable and unstable fixed points as θ s i and θ u i , respectively. To begin, it will be convenient to define so that F ðyÞ g ðmÞ ðyÞ À y. With this definition, F ðyÞ ¼ 0 corresponds to the fixed points of the map g (m) (θ). We subdivide θ 2 [0, 1) into into 4m disjoint subregions with the following procedure: Choose α > 0 and define the region s i near each stable fixed point so that Likewise, define the region u i near each unstable fixed point so that u i ¼ ½y See Figs 12 and 13 for an example of how the matrix P mt is partitioned into submatrices according to this procedure.
In the analysis to follow we will show that the steady state probability distribution v lim k!1 P k mt rðy; 0Þ exists and is invariant to ρ(θ, 0). We define subregions of v a ; a ¼ s i ; u i ; n þ i ; n À i to represent the subset of v contained in the region a. We have carefully defined the matrix partition in Fig 13 so that, for instance, jjP s i !n þ i mt v s i jj 1 corresponds to the amount of probability that is mapped from the region s i into to the region n þ i when P mt is applied to v.   (1) and g (2) and P 2t . Note that g (2) has two stable and two unstable fixed points. The partition of P 2t shown in the right panel with dashed lines can be determined from a given choice of α. We note that the matrix has been flipped in the vertical direction to emphasize the correspondence between P 2t and g (2) (θ) so that in this visual example, matrix multiplication would not be performed in the usual way. From left to right, the submatrices denoted with asterisks are defined to be P n þ 2 !u 1 2t , P u 2 !u 1 2t , and P n À 1 !u 1 2t . From top to bottom, the regions submatrices denoted with dots are defined to be P . The steady state distribution calculated as the right eigenvector of P 2t associated with λ = 1 is shown to the right of P 2t , along with the associated partition. Note that because the conditions guaranteeing clustered desynchronization are satisfied, by taking small enough, the difference between the amount of the steady state population found in s 1 and s 2 can be made arbitrarily small. Suppose that there exists α for the resulting partition such that, 3a. for all y 2 n þ i (resp., n À i ), g ðmÞ ðyÞ 2 n þ i (resp., n À i ) [s i 3b. for all θ 2 s i , g ðmÞ ðyÞ 2s i 3c. for all θ 2 u i , d dy g ðmÞ j y > 1 and g ðmÞ ðyÞ= 2 u j for i 6 ¼ j Note here that for a given set p,p denotes its interior, and p denotes its closure. Furthermore, we will also assume, without loss of generality, that in the absence of noise, upon successive iterations of the map g (1) the period m orbit is θ s 1 ! θ s 2 ! Á Á Á ! θ s m ! θ s 1 . Let γ = mod(i, m) + 1. Suppose then that 3d. for all θ 2 s i , g (1) (θ)2 the interior of n À g [ s g [ n þ g Then for any choice of 1 ( 1, we may choose (the noise strength) small enough in eq (2), so that as time approaches infinity, regardless of initial conditions, the population will be split into m distinct clusters, and to leading order in 1 , each cluster will contain an equal portion of the population. We note that if conditions 1 and 2 from our main theoretical result are satisfied and g (m) is monotonic, then we will be guaranteed to be able to choose and α so that the remaining conditions 3a-d are satisfied.
In order to prove the main theoretical result presented earlier, we will first show that for small enough, as time tends toward infinity, the probability of finding an oscillator far from any of the stable fixed points is Oð 1 Þ. We will then show that as time approaches infinity, the chance of finding a randomly chosen oscillator near any of the stable fixed points is identical to leading order 1 .
Throughout this proof, we are interested in the unique steady state solution which solves v ¼ P t v. We note that for any positive integer k, P kt ¼ P k t so that v ¼ P kt v, i.e. the unique steady state solution of P kt is the same for any choice of k. Using Eqs (23) and (24), we will assume that is taken small enough so that errors in the approximations of all necessary stochastic matrices P kt are negligible.
Bounding near unstable fixed points. To begin, for the stochastic matrix P mt , we intend to bound the row sums of the submatrix P u i !u i mt , the portion of the matrix P mt which maps u i Fig 14. Mapping probability between regions. In the limit as time approaches infinity, the probability that a neuron will be found in the region s a for a ¼ s i ; u i ; n þ i ; n À i is given by jjv s a jj 1 . After the time mτ has elapsed, the probability that a randomly chosen neuron started in s a and was mapped to s b for b ¼ s i ; u i ; n þ i ; n À i is given by jjP s a !s b mt v s a jj 1 . Based on these relations, the above diagram gives an example characterizing how probability initially found in the region s i is mapped to all other regions after the time mτ has elapsed. back to u i (see Fig 13 for an example of this notation). Let μ θ and σ θ correspond to the expected value and standard deviation of any oscillator with initial condition θ, respectively. Let θ L and θ R correspond to the left and right boundaries of u i . We assume that phase space is partitioned into equal bins of length Dy ¼ 1=M ¼ Oð 1 Þ, where P mt 2 R MÂM . For row j of P u i !u i mt corresponding to θ j , its row sum R j can be calculated as R j ¼ N ðm y L À y j ; s y L Þ þ N ðm y L þDy À y j ; s y L þDy Þ þ . . . h þN ðm y R ÀDy À y j ; s y R ÀDy Þ þ N ðm y R À y j ; s y R Þ

À Á
is the equation for a Gaussian curve. As we showed in Eq (23), μ θ = g (m) (θ). Let y E ¼ arg min y2½y L ;y R ðjm y À y j jÞ correspond to the closest to θ j any oscillator in u i is expected to map to. Taylor expanding around θ E yields Similarly, we Taylor expand σ θ around θ E as Substituting Eqs (27) and (28) into Eq (26) yields, . Therefore, if we choose small enough, σ θ E will be small enough so that the remaining Taylor expanded terms contribute at most Oð 1 Þ error. When we do this we may rewrite eq (29) as Recall that μ(θ E ) − θ j is Oð 1 Þ so that it can be lumped with the other Oð 1 Þ terms upon Taylor expansion. Notice that Eq (30) can be written as a Riemann sum approximation to The integral in Eq (31) is less than one because g 0(m) (θ k )>1 so that it falls off more quickly than a Gaussian distribution. Therefore, if we choose Δθ and 1 small enough, R j < 1, which implies jjP u i !u i mt jj 1 < 1. Because the column sums of the stochastic matrix P mt are all equal to one, and each entry is positive, we may use Gershgorin disks to show all eigenvalues are less than or equal to one. Using the Perron-Frobenius theorem, we know that there is exactly one eigenvalue equal to one, so that the steady state solution, v, solves v ¼ P mt v [36]. Recall that v s i is defined as the subset of v contained in s i . Then, because each element of the steady state vector and stochastic matrix is positive, The individual terms of Eq (32), for instance, P n þ i !u i mt v n þ i represent the steady state probability density that is mapped from n þ i to u i upon one iteration of the stochastic map. We note that from the conditions 3a, 3b, and 3c, only oscillators which start in u i can map to the interior of u i . This implies that we may choose small enough so that the probability of transitioning from anywhere outside of u i to u i will be Oð 1 =nÞ where n is the maximum length over all i of u i . This implies jjP a!u i mt jj 1 Oð 1 =nÞ for a 6 ¼ u i . Finally, using properties of matrix norms with Eq (32), we have jjv u i jj 1 jjP u i !u i mt v u i jj 1 þ Oð 1 =nÞ: jjP u i !u i mt jj 1 Á jjv u i jj 1 þ Oð 1 =nÞ ) jjv u i jj 1 Oð 1 =nÞ 1 À jjP u i !u i mt jj 1 : Note that the final line results from rearranging the second line using the fact that jjP u i !u i mt jj 1 < 1. It follows immediately from properties of matrix norms that jjv u i jj 1 Oð 1 Þ; ð34Þ in other words, given 1 ( 1, we may choose small enough so that once the distribution reaches its steady state, the probability that an oscillator can be found in a given region u i is of order 1 . Bounding between stable and unstable fixed points. For y i 2 ðn þ 1 [ n À 1 [ Á Á Á [ n þ k [ n À k Þ, let b ¼ min jF ðy i Þj, i.e., if an oscillator is expected to be found in either n þ i or n À i , it will move at least β closer to s i . Let k ¼ max i ðdðx; s i Þjx 2 n þ i [ n À i Þ where d(a, b) is the distance between the sets a and b. Consider oscillator j with initial condition y j ð0Þ 2 n þ i [ n À i [ s i . Let c dκ/βe. From eq (23), E[θ j (cmτ)] = g (cm) (θ j (0)). The oscillator can be at most κ away from s i and will either move at least β closer to s i , or will move inside s i upon each iteration of g (m) . Thus, after c iterations of g (m) , the oscillator is expected to be in s i , and by condition 3b, E[θ j ((c + 1)mτ)] will be in the interior of s i . From Eq (24), the variance is proportional to 2 , and therefore, we may then choose small enough so that an oscillator starting in n þ i or n À i will be mapped with probability 1 À Oð 1 Þ to a location in the interior of s i . Furthermore, from condition 3b, any oscillator starting in s i will be expected to be found in the interior s i , we can choose small positive, we may write Using an identical procedure, we can formulate the bound Oð 1 Þ ! jjP s i !n À g t v s i jj 1 . With these bounds we may rewrite Eq (36) as jjv s g jj 1 ¼ jjv s i jj 1 þ Oð 1 Þ: ð38Þ Applying eq (38) to each region s i yields jjv s 2 jj 1 ¼ jjv s 1 jj 1 þ Oð 1 Þ jjv s 3 jj 1 ¼ jjv s 2 jj 1 þ Oð 1 Þ . . . jjv s 1 jj 1 ¼ jjv s k jj 1 þ Oð 1 Þ: Here, the sum of the Oð 1 Þ terms represents the probability that a given oscillator will be found in a region without a stable fixed point of g (m) . Finally, using Eq (39), and the fact that v is positive, one can show that for any v s i and v s j , jjv s i jj 1 À jjv s j jj 1 Oð 1 Þ; ð40Þ which completes the proof.

Clustered Desynchronization with Common Periodic Perturbations
We consider a modified version of Eq (2) where each neuron feels a small, common periodic perturbation _ y i ¼ o þ f ðy i Þdðmodðt; tÞÞ þ pðo 1 tÞZðy i Þ þ Z i ðtÞZðy i Þ þ Oð 2 Þ; i ¼ 1; . . . ; N: ð41Þ Here, p(ω 1 t) is a periodic perturbation with period T 1 = 1/ω 1 common to each oscillator. This perturbation may represent the effect of coupling from an external population of neurons or coupling between neurons in the population under study. Here we give conditions for which Eq (41) exhibits clustered desynchronization. Defining ϕ i θ i − ω 1 t allows us to selectively average the term associated with p(ω 1 t) from Eq (41) [54], c.f. [55]: where GðφÞ ¼ R T 1 0 ½pðo 1 tÞZðφ i þ o 1 tÞdt and φ i = ϑ i − ω 1 t. Here ϑ i is a close approximation to θ i so that φ i % ϕ i . Noting that G is also T 1 periodic, we selectively average Eq (42) to yield Here, Θ i % ϑ i and K ¼ R T 1 0 Gðφ i À o 1 tÞdt. Notice that Eq (43) is in an identical form as Eq (2) for which our main theoretical result still holds.