Linking structure and activity in nonlinear spiking networks

Recent experimental advances are producing an avalanche of data on both neural connectivity and neural activity. To take full advantage of these two emerging datasets we need a framework that links them, revealing how collective neural activity arises from the structure of neural connectivity and intrinsic neural dynamics. This problem of structure-driven activity has drawn major interest in computational neuroscience. Existing methods for relating activity and architecture in spiking networks rely on linearizing activity around a central operating point and thus fail to capture the nonlinear responses of individual neurons that are the hallmark of neural information processing. Here, we overcome this limitation and present a new relationship between connectivity and activity in networks of nonlinear spiking neurons by developing a diagrammatic fluctuation expansion based on statistical field theory. We explicitly show how recurrent network structure produces pairwise and higher-order correlated activity, and how nonlinearities impact the networks’ spiking activity. Our findings open new avenues to investigating how single-neuron nonlinearities—including those of different cell types—combine with connectivity to shape population activity and function.


Introduction
A fundamental goal in computational neuroscience is to understand how network connectivity and intrinsic neuronal dynamics relate to collective neural activity, and in turn drive neural computation. Experimental advances are vastly expanding both the scale and the resolution with which we can measure both neural connectivity and neural activity. Simultaneously, a wealth of new data suggests a possible partitioning of neurons into cell types with both distinct dynamical properties and distinct patterns of connectivity. What is needed is a way to link these three types of data: How is it that patterns of connectivity are translated into patterns of activity through neuronal dynamics?
Any model of neural activity should also capture the often-strong variability in spike trains across time or experimental trials. This variablity in spiking is often coordinated, or correlated, across cells which has a variety of implications. First, correlations play an essential role in plasticity of network structure [1][2][3][4]. Theories that describe spiking correlations allow for a self-consistent description of the coevolution of recurrent network structure and activity [5,6]. Second, correlations between synaptic inputs control their effect on postsynaptic neurons: inputs that arrive simultaneously can produce stronger responses than those arriving separately. This has been referred to as "synergy" or "synchronous gain" in early work [7], and the magnitude of this synergy has been measured in the LGN by Usrey, Reppas & Reid [8] and cortex by Bruno & Sakmann [9] (but see [10]). Indeed, the level of correlation in an upstream population has been shown to act as a gain knob for firing rates downstream [11]. Finally, correlated fluctuations in activity can impact the fidelity with which populations can encode information [12,13]. Importantly, the coding impact depends on a subtle interplay of how signals impact firing rates in a neural population and of how noise correlations occur across the population [14][15][16][17][18]. An accurate description of how network connectivity determines the individual and joint activity of neural populations is thus important for the understanding of neural activity, plasticity and coding.
Many studies of collective activity in spiking systems can be traced to the early work of Hawkes on self-or mutually-exciting point processes [19,20]. The Hawkes model is also closely related to the linear response theory that can be used to describe correlations in integrate-and-fire networks [21,22]. Here, each neuron and synapse is linearized around a central "operating point", and modes of collective activity are computed around that point [23][24][25]. Including a nonlinear transfer of inputs to rates in the Hawkes model gives a generalized linear model, which has been applied with considerable success to multi-neuron spike train data [26].
While analyses based on computing modes of collecting activity based on linearized dynamics have led to significant insights, they also impose a limitation. While shifts of the operating point can modulate the linearized dynamics of biophysical models [27], this approach cannot capture the impact of nonlinear neural dynamics at the operating point.
Here, we present a systematic method for computing correlations of any order for nonlinear networks of excitatory and inhibitory neurons. Nonlinear input-rate transfer couples higher-order spike train statistics to lower-order ones in a manner that depends on the order of the nonlinearity. In its simplest form, this coupling shows how pairwise-correlated inputs modulate output firing rates. This generalizes the effects of pairwise correlations on neural gain in single-neurons [7,8,11] and feedforward circuits [28][29][30][31] to networks with high levels of recurrence and feedback.
We begin with simple models and progress to nonlinearly interacting networks of spiking neurons. Our method is diagrammatic, in the sense that the interplay of network connectivity and neural dynamics in determining network statistics is expressed and understood via a systematic series of graphical terms. Such graphs are commonly referred to as "Feynman diagrams" for Richard Feynman, who invented them. We use this diagrammatic expansion to make and explain three main scientific points. First, we show how neural dynamics lead to spike correlations modulating firing rates in a recurrent, nonlinear network. Second, we illustrate an additional role of the prominent 'heavy-tailed" feature of neural connectivity where some neurons have many more connections than others, and some connections are much stronger than others. We show how this feature interacts with nonlinearities to control network activity. And third, we show how different single-neuron nonlinearities affect the dependence of firing rates on correlations.

Diagrammatic expansion for spike train statistics
We will show that any coupled point process model, even one with nonlinearities or negative interactions, has an associated expansion for all spike train cumulants organized by the strength of coupling of higher statistical moments with lower ones (e.g. the influence of the two-point correlation on the mean). The full model we aim to describe is one where each neuron generates a spike train which is conditionally renewal with intensity: Here g ij (t) is a matrix of interaction filters, λ i (t) is the baseline point process drive to neuron i and * denotes convolution: (g * f )(t) = ∞ t0 dt g(t − t )f (t ) (with the integral starting at the initial time for the realization). φ i is the transfer function of neuron i. Neuron j's spike train is dNj dt = k δ(t − t k j ), a sum over Dirac deltas at each of the k spike times. We will take the spike trains to be conditionally Poisson given the input, so that in each time window (t, t + dt), the probability of neuron i generating m spikes is (r i (t)dt) m /m! exp (−r i (t)dt). This corresponds to a generalized linear point process model (GLM), or nonlinear multivariate Hawkes process [32]. In contrast to biophysical or integrate-and-fire models in which spike trains are generated deterministically given the membrane potential (which might, however, depend on noisy input), this model with "escape noise" generates spike trains stochastically with a rate that depends on the "free membrane potential" (i.e. with action potentials removed) [33].
Current methods for the analysis of single-and joint spiking statistics rely on linear response techniques: using a self-consistent mean field theory to compute mean firing rates, and then linearizing the spiking around those rates to determine the stability and correlations. We begin with a simple example highlighting the need to account for nonlinear effects. We take an excitatory-inhibitory network of N E = 200 excitatory (E) neurons and N I = 40 inhibitory (I) neurons, all with threshold-quadratic transfer functions φ i (x) ≡ φ(x) = α x 2 + . In this example we took network connectivity to be totally random (Erdös-Rényi), with connection probabilities p EE = 0.2 and p EI = p IE = p II = 0.5. For simplicity, we took the magnitude of all connections of a given type (E − E, etc.) was taken to be the same. Furthermore, the time course of all network interactions is governed by the same filter g(t) = t τ 2 exp (−t/τ ) (with τ = 10 ms), so that g ij (t) = W ij g(t). W is a matrix of synaptic weights with units of mV, so that the input to φ can be interpreted as the free membrane potential. We set the strength of interactions such that the net inhibitory input weight on to a neuron was, on average, twice that of the net excitatory input weight so that for sufficiently strong interactions, the network was in an inhibitory-stabilized regime [34].
We examined the magnitude and stability of firing rates as we increase the strength of synaptic coupling. We used mean-field theory to predict the firing rates, and predicted their linear stability by the standard measure of the spectral radius of the stability matrix Ψ ij = φ (1) i g ij . (φ (1) i denotes the first derivative of neuron i's transfer function with respect to its input.) As the strength of interactions increases, the mean field prediction for the firing rates loses accuracy ( Fig. 1 A). This occurs well before the mean field theory crosses the stability boundary |Ψ| = 1 (Fig. 1B). Examining simulations as the weights are increased reveals an even more fundamental failure of the theory: before the synaptic weights are strong enough for the mean field theory to be unstable, the simulations reveal divergent firing rates ( Fig. 1C; the raster stops when the instantaneous rates diverge). Spectral radius of the stability matrix of mean field theory as synaptic weights are scaled. Stars indicate the weight values for the simulations below. C) Example realizations of activity for three different interaction strengths. As synapses become stronger, correlated activity becomes apparent. When synapses are strong enough the activity becomes unstable, even though the mean field theory is stable. All plotted firing rates in A) are averaged over the time period before the rates diverged (if they did Rather than restricting theoretical analysis to regimes of extremely low coupling strength or linear models, we here develop a framework for activity cumulants that can apply to models with nonlinear input-spike transfer. This will allow us to properly predict both the spiking cumulants and the location of the rate instability in the nonlinear network above. Thus, we develop a framework for activity cumulants that can apply to models with nonlinear input-spike transfer for strongly coupled networks. The mean field and linear response predictions for spiking cumulants correspond to the lowest order terms of this expansion, and are good whenever that lowest-order truncation is valid.

Input
We will build this framework systematically: We begin with statistics of the drive λ i (t), then consider a filtered point process, g * λ(t). In these simple models we will introduce the method and terminology which we will use to understand the more complicated models. We continue by considering the linearly self-exciting Hawkes process, taking a single neuron so g = g and φ(x) = x, before proceeding to arbitrary nonlinearities φ. Finally, we introduce an arbitrary network structure g. This model is intimately related to the popular generalized linear models (GLMs) for spiking activity, where the nonlinearity φ is commonly taken to be exponential, refractory dynamics can be embedded in the diagonal elements of g, and λ corresponds to the filtered stimulus [26]. The common use of GLMs it to fit them to recorded spike trains, and then ask about the structure of the inferred functional connectivity g. In contrast, we interpret g as reflecting the structural connectivity and synaptic and membrane dynamics of a specified model network and aim to compute statistics of its neurons' spike trains. The derivation given here will be heuristic. A more rigorous derivation of the expansion is given in Methods: Path integral representation.

Introduction to the general framework: Poisson process
An inhomogeneous Poisson process generates counts within a window dt independently with a Poisson distribution at rate λ(t). A spike train produced by this process is where t k is the kth spike time, and N (t) is the spike count. The mean and autocovariance for this process are given by the familiar formulas: where angular brackets denote the expectation across realizations and the subscript c denotes a cumulant, not the moment (i.e. we have subtracted all terms which factor into products of lower moments) [35]. The delta function arises because the process is independent at each time step, so that there is no correlation between events from one time t and any other time t . In fact, because the events are generated independently at each time point, all of the cumulants of this process can be written as where integrating out one of the delta functions puts the second cumulant in the above form. We can interpret this equation as describing a source of events appearing at rate λ(t) at time t that propagate to times t i . In this case, because the events are generated independently at each t, the events only propagate to the same time t. For a general point process, cumulants measured at the collection of times {t i } could be affected by events occurring at any past time, so that we would have to account for how events propagate across time.
The expansion for cumulants we will develop has a natural construction in terms of graphs ("Feynman diagrams"), wherein components of the graph represent factors in each term. A set of defined rules dictate how each term in the expansion is formed from those graphs. While this graphical representation is not necessary to understand the inhomogeneous Poisson process, we describe it in detail in this simple case to develop intuition and introduce terminology. We use cumulants in this construction because they provide the fundamental building blocks of moments; any n-th order moment can be constructed from up to n-th order cumulants). This also simplifies the developed expansion.
To begin, the nth cumulant has n arguments {t i , i = 1, . . . , n}, one for each factor of dN dt in Eq. (5). We represent each of these by an open white vertex in the graph labeled with the time point, t i , and we represent the source term λ(t) with a gray vertex. The white vertices are called "external" vertices whereas the gray vertex is called "internal".
The internal gray vertex represents the intensity of the underlying stochastic process generating events with rate λ(t). The white external vertices represent the spike trains whose statistics we measure at times {t i }. For each delta function, δ(t − t i ), in Eq. 5, we place an edge drawn as a dotted line from the gray vertex to the white vertex, i.e. from the event-generating process to the spike train's measurement times (Fig. 2). More generally, events generated by the source propagate through the graph to affect the external vertices.
In order to construct the term associated with each diagram, we multiply the factors corresponding to edges (delta functions linking t and t i ) or the internal vertex (λ(t)), and then integrate over the time t associated with the internal vertex. This links the generation of events by λ(t) to their joint measurement at times {t i } through the propagator (here δ(t − t i )). For the diagrams shown in Fig. 2, these rules reproduce the cumulant terms in Eq. 5. Note that these graphs are directed, since we only consider causal systems where measured cumulants are only influenced by past events.
In general, a given moment will be the sum of terms associated with many different graphs. For example the second moment is given by . Each term on the right hand side will have a corresponding graph. Moreover, the graph for the second term will include two disconnected components, one for each factor of the mean rate, which appears as in Figure 2. The graphs for the cumulant will always be described by connected graphs.

Filtered Poisson process
We proceed to a simple model of synaptic input: a presynaptic Poisson spike train, with count N (t) and intensity λ(t), drives postsynaptic potentials with shape g(t): Feynman diagrams for the first three cumulants of the inhomogeneous Poisson process. Each dotted edge corresponds to a delta-function connecting the time indices of its two nodes. White nodes denote the measurement times, while gray nodes denote the times at which spikes are generated. The cumulants are constructed by convolving the term corresponding to the gray node with the product of all outgoing edges' terms (Eq. (5)).
where * denotes convolution: (g * f )(t) = ∞ t0 dt g(t − t )f (t ) (with the integral starting at the initial time for the realization). We assume g is normalized, ∞ −∞ g(t)dt = 1, so that gives the magnitude of the filtering.
The cumulants of the postsynaptic potential ν(t) can be calculated directly. In general, they are given by: where the input spikes are generated at times t, arrive at times given by the delta functions and influence the cumulant measured at {t i } through g. Eq. 7 is the same as that for the inhomogeneous Poisson process but with factors of g * . This provides a simple interpretation of Eq. 7: cumulants of the filtered Poisson process are given by taking the cumulants of the underlying Poisson process and examining how they can be filtered through the system at hand.
Similarly to the case for the Poisson process, we can represent the cumulants graphically. We again represent each measurement time on the left hand side of Eq. (7) by an external vertex (Fig. 3a). The convolution of δ and g in Eq. 7 corresponds to an internal time point which we integrate over (denoted by primes). We also represent these internal time points with white vertices that carry a factor of , the magnitude of the filter. We represent the source term λ(t) with a gray vertex. All vertices that are not "external" are called internal. Every internal vertex also carries its own time index, t .
The internal gray vertex again represents the intensity of the underlying stochastic process, λ(t). The white external vertices represent the processes whose statistics we measure at times {t i }. For each delta function, δ(t − t), in Eq. 7, we place an edge drawn as a dotted line from the gray vertex to the white vertex, i.e. from the event-generating process to the arrival time of the event t . In this example an event "arrives" as soon as it is generated. A wavy edge corresponds to the filter, g, and represents the effect of a spike arriving at time t on the output process measured at time t i (Fig. 3b). Events generated by the source thus propagate through the graph to affect the observed, external vertices.
In order to construct the expression associated with each diagram, we again multiply the factors corresponding to each edge in the diagram (e.g. δ(t − t) or g(t i − t )) or internal vertex ( or λ(t)), and then integrate over the times associated with the internal vertices. Note that integration over the internal times t , t , etc, results in the convolutions (g * δ)(t i − t). Integration over the time t associated with the source term corresponds to the outermost integral in Eq. (7) This links the generation of events by λ(t) to their joint measurement at times {t i } through their arrival times (via δ(t − t )), and temporal filtering (g(t i − t )). For the diagrams shown in Fig. 3, these rules reproduce the cumulant terms in Eq. 7. Note that the graphs are directed, as for the expansion we describe the "propagator" term will be causal.
We can simplify the cumulants of this process (and the corresponding diagrammatic representations) by considering the propagator of ν(t) (also known as the linear response or impulse response). The propagator measures the change in ν(t) in response to the addition of one input spike in N (t). We can compute it by taking a functional derivative with respect to the input intensity λ(t): Since the dynamics are linear, this is also equivalent to the change of the expected rate with the addition of one spike to the input, i.e. taking λ(t) ← λ(t) + δ(t − t) and ν(t) c ← ν(t) c + ∆ * δ(t) (or equivalently the Green's function of the expected rate). This allows us to rewrite the cumulants in terms of the input rate and the propagator: which can be represented graphically by introducing a solid, directed edge for ∆(t, t ) (Fig. 3c). The propagator will be a central feature of the expansion for cumulants in more complicated models involving connections among neurons.
Impact of self-excitation on activity statistics of any order: linearly self-exciting process In order to generalize the graphical representation of Poisson cumulants, we begin with a linearly self-exciting process as considered by Hawkes [19]. Let the rate be a linear function of the instantaneous event rate (that is to say the firing rate conditioned on a particular realization of the event history) We assume that g(τ ) and λ(t) are such that r(t) > 0, and ∞ −∞ dτ g(τ ) = 1. If < 1, then an event will generate less than one event on average, and the rate will not diverge. The history dependence of the firing rate will now enter into our calculations. We can compute the expected rate using the self-consistency equation: We provide an alternate derivation of this result that will prove useful below: We construct a perturbative expansion of the mean firing rate and show how this expansion can be re-summed to yield the full rate of the self-exciting process. This procedure can also be applied to obtain cumulants of arbitrary order for this process.
We will begin with a recursive formulation of the self-exciting process. In contrast to the filtered Poisson process of the previous section, here the process with count N generates events, which then influence its own rate, dN/dt. Each event can thus generate others in turn. In the case of a linear filter, g, the following approach is equivalent to the Poisson cluster expansion [36][37][38] and similar to the construction of Feynman diagrams for the first three cumulants of the filtered inhomogeneous Poisson process. A) Cumulant corresponding to the graph. B) Diagrammatic expressions using the filter and the underlying Poisson process. Each dotted edge corresponds to a delta-function connecting the time indices of its two nodes. Each wavy edge corresponds to the filter g connecting the time indices of its two nodes. C) Diagrammatic expressions using the propagator. In all graphs, external white nodes (leaves of the graph) denote measurement times. Gray nodes denote the times at which spikes are generated in the input spike train. Internal white nodes (with time indices t ) denote the times at which input spikes arrive at the postsynaptic neuron. The cumulants are constructed by convolving the term corresponding to the gray node with the product of all outgoing edges' terms (Eq. (7)).
previous linear response theories for spike train covariances [22]. Define the nth order self-exciting process, N n (t), to be the inhomogeneous Poisson process given by: where N 0 (t) and M n (t) are inhomogeneous Poisson processes with rates λ(t) and ν n (t), respectively, where , so ν 0 (t) = g * dN0 dt (t). M n (t) is a process with intensity that depends on a stochastic realization of N n (t); M 0 (t) is a "doubly-stochastic" process. We can generate these processes recursively: To generate N n (t), we use a realization of N n−1 (t) to compute the rate ν n−1 and generate a realization of M n−1 (t). These are added to events generated from the original inhomogeneous Poisson process with rate λ(t) to produce N n (t). We can use this recursive procedure to develop an expansion for the cumulants of the process at a given order in (thus a given order in the self-convolution of g).
Let us compute the value of dN dt (t) in powers of using our recursive approach. The zeroth order solution, dN0 dt (t) , is the rate of the inhomogeneous Poisson process λ(t). At order n, we compute dNn dt (t) using the (n − 1)st order solution in the right hand side of Eq. (12). At first order, using the Poisson solution for dN0(t) dt we get = λ(t) + (g * λ)(t) At second order we similarly arrive at At higher orders we would obtain further terms with additional convolutions with g.
It will be useful to write these expansions in another way, which will allow their form to generalize to non-linear processes: we will construct the cumulants from the baseline rate and the propagator. We can always replace resulting in Fig. 4a shows the graphical representation of this expansion. As before, the order of the moment is given by the number of external vertices and each external vertex carries a measurement time t i . We have three types of internal vertices: two open white vertices that carry factors of (one type has one wavy incoming and one wavy outgoing linethe other has one incoming dotted line and one wavy outgoing line) and one gray vertex (that has one outgoing dotted line). As before, each gray internal vertex corresponds to the source term, and thus represents the factor λ(t). The white internal vertices, and their edges represent how the events generated by the source are propagated through the filter g. Each white vertex corresponds to a possible past event time, t . To construct the cumulant corresponding to a diagram we integrate over all these possible internal times, weighting each by their influence on the current spiking rate. These weights are given by the filters, g, represented by the wavy edges. The graphical representation of dN1 dt (t) (using the delta function as in Eq. 22) is shown in Figure 4a.
We can compute the firing rate of the self-exciting processr(t) as the limit of the nth order self-exciting processes, continuing the process outlined for Eq. 12: where g (n) is the n-fold convolution of g with itself and g (0) (t) = δ(t). Indeed, we can see that this expression forr(t) yields the same recursive self-consistency condition as above:r We can also represent this recursive relation graphically as in Fig. 4b, using a black vertex to denote the mean-field rater(t). The infinite sum defined by Eq. (12) has a specific graphical representation: Notice that the leftmost vertex and wavy line in the right-hand side of Figure 4b (top) can be detached and factored, with the remaining series of diagrams corresponding exactly to those of the mean. This series of subgraphs on the right hand side sums to dN dt (t) , leading to the recursion relation in Eq. 24 (Fig.  4b). This graphical representation is equivalent to the recursion relation.
The propagator, ∆(t, t ), measures the fluctuation in the expected rate (around the mean-field value) in response to the addition of one spike at time t to the drive λ(t).
where for convolutions involving ∆(t, t ), we use the notation As with the expected rater(t), we can examine the propagators of the n-th order self-exciting processes. For the first-order process N 1 (t), Diagrammatic expansion for the mean firing rate and linear response of the self-exciting process. A) First-order approximation of the firing rate. B) Diagrams corresponding to the re-summing of the expansion of the mean field rate (Eq. 24), which is represented by the black dot. C) Diagrams corresponding to the re-summing calculation of the propagator (Eq. 27), which is represented by the solid edge. In all diagrams, time indices associated with internal vertices have been suppressed.
The first term is the propagator of the inhomogeneous Poisson process. The second term of ∆ 1 is the propagator of the filtered Poisson process, Eq. 8. This equation can be represented by the same type of graphs as for the expected rate ( Fig. 4c top), but stand for functions between two time points: the initial time t of the perturbation, and the later time t, at which we are computing the rate of the process. We don't represent these initial and final points as vertices, because the propagator is a function that connects two vertices. However, we still integrate over the times corresponding to the internal vertices since the propagator accounts for the total effect of a perturbation of the source on the observed activity.
In general, the propagator for the nth-order self-exciting process can be computed by taking a functional derivate of the rate with respect to the input rate λ: This recursion relation can be expressed graphically just as for the mean rate ( Fig. 4c, top). Factoring out g * corresponds to popping off an edge and vertex from the series (Fig. 4c, middle). Taking the limit n → ∞ in Eq. 27 yields the self-consistency condition for the full propagator ∆(t, t ) given by Eq. 25, and indicated by the solid black line in Fig. 4c (bottom).
These diagrammatic expansions may seem cumbersome for so simple a model. Even for the self-exciting Hawkes process, however, they allow the fast calculation of any order of spike train cumulant. Let us begin with the second cumulant of the instantaneous firing rate. Again we will construct an expansion in , i.e. powers of g. To zeroth order, this is the inhomogeneous Poisson solution. To first order in we have The first term on the second line is the second cumulant of the inhomogenous Poisson process. The other terms arise from the dependency of the processes M 0 (t) and N 0 (t).
The expectation over M 0 (t) must be performed first, followed by that over N 0 (t), because the process M 0 (t) is conditionally dependent on the realization of N 0 (t), having intensity g * dN0 dt (t) (Eq. (12)). This decomposition relies on the linearity of the expectation operator.
We can construct diagrams for each of these terms using the same rules as before, with the addition of two new internal vertices (Fig. 5a). These new vertices are distinguished by their edges. The first has two outgoing dotted lines representing the zeroth-order propagator δ(t − t ), as in the second cumulant of the inhomogeneous Poisson process. It represents events that are generated by the drive λ(t) and propagate jointly to the two measurement time points, The second new vertex has the same two outgoing lines and one incoming wavy line for the filter g(t, t ) -it represents the fourth term on the right hand side of Eq. (28). This vertex carries a factor of and represents the filtering of past events that then propagate to the two measurement time points.
Continuing the computation of the second cumulant to any order in will result in higher order terms of the linear response and expected rate being added to the corresponding legs of the graph. At a given order n, one leg of each diagram will be associated with a particular term in the expansion, to order n, of the expected rate or the linear response. The second cumulant of dN 2 /dt would thus add diagrams with two filter edges to the diagrams of Fig. 5a, either both on the same leg of the graph or distributed amongst the graph's three legs.
As with the filtered Poisson process, we can simplify this sum of four diagrams for the second cumulant of the first-order self-exciting process. Examining subgraphs of each term on the right-hand side of Fig. 5A reveals a connection to the linear response and mean rate of the first-order self-exciting processes. On each leg emanating from the internal branching vertex, the four terms sum to the product of two copies of the linear responses of the first-order self-exciting process N 1 (t) (compare subgraphs on the top branch of the diagrams in Fig. 5a with Fig. 4a). Similarly, the sum of the legs coming into the branching vertex is the firing rate of N 1 (t) (compare to Fig. 4b). So, we will group the terms on the legs of the graph into contributions to the linear response and the mean (Fig. 5b middle).
When we add the diagrams of up to order n together, we can separately re-sum each of these expansions because of the distributivity of the expectation. So, we can replace the entire series to all orders in with simpler diagrams using the full representations for the linear response and expected rate (Fig. 5b). This can be proved inductively, or by rigorously deriving the Feynman rules from the cumulant generating functional (Methods: Path integral representation). This yields the following result for the second cumulant, which corresponds to the final graph at the bottom far right of Fig. 5b: This is the complete analytic result for the second cumulant of the self-exciting process for fluctuations around the mean field solutionr(t) [19]. It can be represented by the single term on the right-hand side of Eq. 29 and the corresponding single diagram (Fig.  5b, right). Compare this with the filtered Poisson process, which has a diagram of the same topology but with different constituent factors (Fig. 3C, middle row). The Feynman diagrams capture the form of the re-summed perturbative expansions for the cumulants, while the definitions of the vertices and edges capture the model-specific rate,r(t), and propagator, ∆(t, t ).
One might think that the higher cumulants are generated as simply by replacing each leg of the filtered inhomogeneous Poisson process with the correct propagator, along with the rater(t). This would mean that the general cumulant term would be given by: This is incorrect, as many important terms arising from the self-interaction would be lost.
The reason this naive generalization fails is that it neglects the higher-order responses to perturbations in the event rate. For example, the second cumulant responds to perturbations in the rate; this quadratic response impacts the third cumulant. We can see this in the third cumulant of the first-order self-exciting process: The first term is the third cumulant of the inhomogeneous Poisson process. The second and third are generalizations of the terms found in the second cumulant (we have used (t ↔ t ↔ t ) to denote "all other permutations of t, t , t "). These terms are part of the naive expression in (30). The last term is the novel one that arises due to the "quadratic response". It appears when we compute We have to take into account that the process dN0 dt (t) is correlated with the rate of the process dM0 dt (t) (since one is a linear function of the other!). This produces a "cascade" effect that results in the quadratic response. For the first-order process, only one step in the cascade is allowed. By introducing branching internal vertices similar to those in Fig. 5, we can express these somewhat unwieldy terms with diagrams. These are shown in Fig. 6. The cascade of one source spike producing three spikes in the first-order process is represented by the second diagram of Fig. 6a and the cascade of one source spike producing two spikes, one of which then produces another two spikes in the first-order process, is represented by the last diagram of Fig. 6a.
As before, continuing to higher orders in the recursively self-exciting process would add diagrams with additional filter edges along the legs of the graphs in Fig. 6a, corresponding to additional steps in the potential cascades of induced spikes. For example, the third cumulant of the second-order process, dN2 dt (t) dN2 dt (t ) dN2 dt (t ) c , would add diagrams with two filter edges to those of Fig. 6a, with those two filter edges appearing either sequentially on the same leg of the graph or distributed amongst the legs of the graph. We can then use the same ideas that allowed us to resum the graphs representing the second cumulant. As before, we identify the expansions of the mean-field rate,r, and the linear response, ∆, along individual legs of the graph and use the multilinearity of cumulants to resum those expansions to give the diagrams at the bottom of Fig. 6.
Considering the resummed graphs, we have the following result for the third cumulant: The types of diagram developed for up to the third cumulant encompass all the features that occur in the diagrammatic computations of cumulants of linearly self-exciting processes. The general rules for diagrammatically computing cumulants of this process are given in Fig. 7. They are derived in general in Methods: Path integral representation. The graphs generated with this algorithm correspond to the re-summed diagrams we computed above.
For the nth cumulant, i dN dt (t i ) c , begin with n white external vertices labelled t i for each i. Construct all fully connected, directed graphs with the vertex and edge elements shown in Figure 7. For each such fully connected directed graph constructed with the component vertices and edges, the associated mathematical term is constructed by taking the product of each associated factor, then integrating over the time points of internal vertices. The nth cumulant is the sum of these terms. This produces cumulants of up to third order, as recently shown by Jovanović, Hertz & Rotter [38], as well as cumulants of any order. As we show next, this procedure can also be generalized to calculate cumulants in the presence of a nonlinearity, including both thresholds enforcing positive activity (as commonly disregarded in studies of the Hawkes process) and any nonlinear input-rate transfer function.
Nonlinearities impose bidirectional coupling between different orders of activity: nonlinearly self-exciting process Now we include a nonlinearity in the firing rate, so that the process produces events dN/dt with a rate given by We begin by considering the mean-field solutionr which, if it exists, is self-consistently given byr(t) = φ ((g * r)(t) + λ(t)). Thus, as always the mean-field solution is given by neglecting second and higher-order cumulants of the spiking process. Next, we consider the propagator, which as above is the linear response of the rate around the mean field, given by expanding Eq. 34 around the mean-field solutionr(t) and examining the gain with respect to a perturbation of the rate. This propagator obeys: where φ (1) is the first derivative of φ with respect to the input, evaluated at g * r + λ.
We will first develop a recursive formulation of the mean-field rate and propagator, which will be required for calculating cumulants of the full process. First, we Taylor expand φ about λ. For simplicity, consider a quadratic φ so that: II. Construct all directed, connected graphs such that each vertex from I. has a single incoming propagator (∆) edge, using the vertices and edges below. The time variables for each propagator or filter edge should match those of its vertices. Filter edges can only impact internal vertices. Each internal vertex has a unique associated time variable, t .
III. To construct the cumulant: for each graph, multiply the vertex or edge factors together and integrate over the times for all internal vertices. Add the terms so obtained for each graph together.
Internal Vertex or Edge Factor Feynman rules for the self exciting process. These rules provide an algorithm for computing the expansion of the cumulants around the mean field solutionr(t). The dots between the legs of the first two vertices indicate that there are such vertices with any number of outgoing legs greater than or equal to two.
where k is the kth Taylor coefficient, evaluated at λ(t). We now develop the point process dN/dt recursively at each order of the nonlinearity: where M m,n is an inhomogeneous Poisson process with rate 1 g * dNm,n dt (t) and P m,n is an inhomogeneous Poisson process with rate 2 g * dNm,n dt 2 (t) and N 0,0 (t) is an inhomogeneous Poisson process with rate λ(t). To generate a set of events in N at order m in the linear term of φ and order n in the quadratic term of φ, we take events at each previous order, (m − 1, n) and (m, n − 1) and use those to develop M m−1,n (t) and P m,n−1 (t). These, together with N 0,0 (t), give dNm,n dt (t). In contrast to the linear self-exciting process, the quadratic process here is recursively defined on a lattice.
Similar to the case of the linearly self-exciting process, we can use this recursive definition to develop an expansion for the mean-field rate and propagator in powers of 1 and 2 . When we calculate higher-order cumulants, we will identify the expansions of the mean-field firing rate and propagator which will allow us to use them to simplify the resulting diagrams. The mean-field rate to finite order in m, n is once again given by neglecting second and higher-order cumulants of N n,m which allows us to take an expectation inside the quadratic term of Eq. (37). Taking the expectation of both sides of this equation in the mean field approach then yields: For example,r Similarly, the propagator (for the dynamics of the recursive process, linearized around zero) is, to finite order in m, n: To zeroth order in 2 , this yields an expansion of the mean-field rater(t) which takes the same form as the expansion of the rate of the linearly self-exciting process, Eq. 12 and admits the same graphical representation (Fig. 4b). Similarly, a perturbative expansion of the linear response about the mean-field rate to zeroth order in 2 takes the same form as for the linearly self-exciting process (Eq. 27) and admits the same graphical representation (Fig. 4c).
To account for the nonlinear terms arising at first order and greater in 2 , we will need to add another type of internal vertex in diagrammatic descriptions of the cumulants. These vertices, carrying factors of 2 , will have two incoming edges and any number of outgoing edges. Each incoming edge carries the operator g * and the number of incoming edges corresponds to the order in the Taylor expansion of the nonlinearity.

A)
Expansion of the mean firing rate and propagator for the nonlinearly self-exciting process. A) One of the first nonlinear terms of the expansion of the mean-field firing rate, to first order in the quadratic term of the nonlinearity and zeroth order in the linear term. The two diagrams shown correspond to the two terms in (41). B) First nonlinear terms of the expansion of the propagator around the mean-field firing rate.
(It also corresponds to the order of cumulant influencing that vertex's activity. The number of outgoing edges corresponds to the order of cumulant being influenced, locally in that subgraph.) The factor ofr(t) that appears in other vertices is modified to be consistent with the mean firing rate under the quadratic nonlinearity, and will thus obey Eq.(34) above.
The mean-field rate and propagator, to first order and greater in 2 , can be represented diagrammatically using the new vertex (e.g. Fig. 8a, b). Notice that these directed graphs are tree-like, but with their leaves in the past. Repeating these calculations to the next order in 2 can be accomplished by taking the basic structure of e.g. Fig. 8 and, along each leg entering the new vertex for 2 , inserting the previous-order graphs (Figs. 4a, 8a). Including higher-order terms in 1 would require inserting those graphs along the -carrying vertices of Fig. 4a.
We next consider the fluctuations of the firing rate around the mean-field: Again taking a quadratic nonlinearity, we have: where k = φ (k) /k!, evaluated at (g * r)(t) + λ(t). Similarly to before, we recursively define dN/dt: Once again, M m,n is an inhomogeneous Poisson process with rate 1 g * dNm,n dt (t) and P m,n is an inhomogeneous Poisson process with rate 2 g * dNm,n dt 2 (t). N 0,0 (t) now A) Corrections to the mean-field firing rate of the nonlinearly self-exciting process, to quadratic order in the nonlinearity φ. A) The correction to mean-field theory for the firing rate to first order in the quadratic term of φ. B) The full one-loop correction to the mean-field rate.
has rate δλ(t) = λ(t) −r(t). We will begin calculating cumulants of dN/dt to finite order in m, n. The first nonlinear correction to the firing rate appears at first order in 2 : which can be represented diagrammatically using the new vertex (Fig. 9a). Notice that in contrast to the corresponding graph for the mean-field expansion (Fig. 8a), this diagram has a "loop" (a cycle were it an undirected graph). This reflects the dependence of the rate on the second cumulant of the baseline process N 0,0 . This dependence of the firing rate on higher-order spiking cumulants is a fundamental feature of nonlinearities.
Proceeding beyond the first order in both 1 and 2 , we see that the expansion of each term of the nonlinearity depends on the other: so that at each order in 2 we must insert the solution at the previous order in 2 and the same order in 1 (and vice versa). This recursive mixing of expansions between the linear and nonlinear terms of φ seems intractable. However, this joint expansion can be re-summed to obtain the full correction to the firing rates [39]; Methods: Path integral representation. The Feynman rules for the re-summed diagrams of the nonlinearly self-exciting process are given in Fig. 10. For a quadratic nonlinearity, this yields the one-loop correction to the firing rate: where we relabel r as r 1 to denote that this is the one-loop correction and φ (2) is evaluated at (g * r)(t) + λ(t).
Similarly to the tree-level propagator, the propagator with an arbitrary nonlinearity φ can be calculated to one loop by taking the rate dynamics to one loop and expanding to linear order around the mean-field solution: Providing a perturbation to the rate fluctuation, r(t) → r(t) + (t) and differentiating with respect to (t ) yields: where only keeping the first two terms defines the tree-level propagator,∆(t, t ) and the third term is the one-loop correction to the propagator.
The appearance of a loop in the Feynman diagram for the mean rate of the quadratically self-exciting process is a general feature of nonlinearities. It represents the influence of higher-order spike train cumulants on lower order ones. In order to measure that dependency, we can count the number of loops in the graphs. To do this, we add a bookkeeping variable, h. We count the number of loops by multiplying factors of h and 1/h. Each internal vertex adds a factor of 1/h and each outgoing edge a factor of h. In this way every vertex with more than one outgoing edge will contribute a factor of h for every edge beyond the first. h thus effectively counts the order of fluctuations contributed by the vertex. For example, the mean for the linear self-exciting process has a graph with a single internal vertex and a single internal edge, so it is zeroth order in h (Fig. 4b). The two point function, however, having two edges and one internal vertex (Fig. 5b), is first order in h. Similarly, the tree-level diagrams will always contribute a total of h n−1 , where n is the order of the cumulant.
In terms of powers of h, a graph for a nth order cumulant with one loop will be equivalent to a graph for a n + 1st order cumulant with one less loop. Consider cutting one of the lines that form the loop in Fig. 9b at the internal vertex and leaving it hanging. Now the graph for the one-loop correction to the mean rate appears to be a graph for a second cumulant -it has two endpoints. The power counting in terms of h, however, has not changed. The one-loop correction to the mean is of the same order in h as the tree-level second cumulant. In general, we will have that the order h m will be given by where n is the number of external vertices and l is the number of internal loops. The number of loops thus tracks the successive contributions of the higher order fluctuations. This expansion is called the "loop" expansion and is equivalent to a small-fluctuation expansion. If one can control the size of the fluctuations, one can truncate the loop expansion as an approximation for the statistics of the system. One way of doing this with networks is to insure that the interactions are O(1/N ) so that h ∝ 1/N and the expansion becomes a system size expansion. II. Construct all directed, connected graphs such that each vertex from I. has a single incoming propagator (∆) edge, using the vertices and edges below. The time variables for each propagator or filter edge should match those of its vertices. Filter edges can only impact internal vertices. Each internal vertex has a unique associated time variable, t .
III. To construct the cumulant: for each graph, multiply the vertex or edge factors together and integrate over the times for all internal vertices. Add the terms so obtained for each graph together.
Internal Vertex or Edge Factor The Taylor expansion of an arbitrary nonlinearity φ could have infinitely many terms. This would lead, in the recursive formulation, to infinitely many processes {M, P, . . .}. Even after re-summing the recursive formulation, this would leave an infinite number of diagrams corresponding to any given cumulant. There are two ways to approach this. The first is to insist on a perturbative expansion in the non-linear terms, e.g. only consider terms up to a fixed order in the Taylor expansion of the nonlinearity φ.
The second approach to controlling the order of the loop expansion is to consider a regime in which mean field theory is stable as this will also control the fluctuations, limiting the magnitude of the loop contributions [39]. The expansion then breaks down in the regime of a bifurcation or "critical point". In this case, the linear response diverges, causing all loop diagrams to similarly diverge. This is a fluctuation-dominated regime in which mean field theory, along with the fluctuation expansion around it, fails. In that case, renormalization arguments can allow discussion of the scaling behavior of correlations [40].
Interaction between single-neuron nonlinearities and network structure No new concepts are required in moving from a nonlinear self-exciting process to a network of interacting units. Each external and internal vertex must now be associated with a unique neuron index i and the integrations over time for the internal vertices must now be accompanied by summations over the indices of the internal vertices. In addition, the filter g(τ ) must be expanded to include coupling across units. In general, this is given by g ij (τ ) for the coupling from neuron j to neuron i. We will consider the general model of a network of units that generate conditionally Poisson-distributed events, given an input variable. The conditional rate for unit i is given by Similarly, the propagator now obeys These dynamics are qualitatively the same as those of the nonlinearly self-exciting process (Eq. 34 but replace the neuron's own rate with the sum over its presynaptic inputs). Introducing these sums over neuron indices yields the complete set of rules for generating Feynman diagrams for the cumulants of this model, shown in Figure 11. II. Construct all directed, connected graphs such that each vertex from I. has a single incoming propagator (∆) edge, using the vertices and edges below. The time variables and neuron indices for each propagator or filter edge should match its vertices. Filter edges can only impact internal vertices. Each internal vertex has a unique associated time variable and neuron index.
III. To construct the cumulant: for each graph, multiply the vertex or edge factors together, integrate over the times and sum over neuron indices for all internal vertices. Add the terms so obtained for each graph together.
Internal Vertex or Edge Factor Mid-course summary: diagrammatic expansion reveals interplay of network structure and neural nonlinearities in driving neural activity To summarize, we now have in hand a set of tools-a fluctuation expansion-to compute spike train cumulants of arbitrary order in networks of linear-nonlinear-Poisson neurons. This expansion provides a systematic way to account for the synergistic dependence of lower-order activity on higher-order activity through the spiking nonlinearity, and naturally incorporates the full microstructure of the neuronal network. The order of the nonlinearity (for non-polynomials, the order of its Taylor expansion) determines the single-neuron transfer gain (Fig. 12 left). It also determines how activity propagates through the network (Fig. 12). Fluctuation expansion links single-neuron nonlinearities and network structure to determine network activity. The linear response for linear neurons depends on the network structure both explicitly and implicitly, through the mean-field rates. The first nonlinear correction brings in additional explicit and implicit dependencies on the connectivity. We next provide an example of using this fluctuation expansion to compute a cumulant of spiking activity: the first nonlinear correction to the second cumulant. We will compute these using the Feynman rules (Fig. 11) to construct the corresponding diagrams, from which we will write the corresponding equation.
We begin by placing external vertices corresponding to the measurement times, each with one propagator edge coming into it. For the two-point cumulant there are two external vertices (Fig. 13a). The propagators coming into those external vertices can, according to the Feynman rules, arise from either a source vertex or from an internal vertex with incoming filter edges (Fig. 11). If both propagators arise from a source vertex, we arrive at the tree-level diagram of Fig. 5b, which provides the linear prediction for the two-point cumulant. To obtain the first nonlinear correction, we will begin by adding an internal vertex. There are two ways we can do this: with one internal vertex providing the propagators for both external vertices or with the internal vertex providing the propagator of just one external vertex (Fig. 13b, top and bottom respectively).
We next proceed to add another layer of vertices. The internal vertices added in each diagram of Fig. 13b have incoming propagator edges. Those edges could emanate from other internal vertices or from source verticies. We will start by finishing the diagrams where they emanate from source vertices; placing these yields the top two, final diagrams of Fig. 13c. We then continue constructing the diagram with an internal vertex providing the two propagators (Fig. 13c, bottom). Note that if we added an internal vertex to the propagator hitting the t 2 external vertex, it would require at least two incoming filter edges (in order to obey a + b ≥ 3 per the rules of Fig. 11) which would give rise to a second loop in the graph.
This last diagram has a hanging filter edge, which must arise from an internal vertex with one incoming propagator edge. We finish the diagram with that internal vertex and the source vertex providing its propagator (Fig. 13d). We could not add additional internal vertices along that path, since they would either violate n + m ≥ 3 or give rise to more than one loop in the diagram (and thus belong at a higher order in the loop expansion).
Following the rules of Fig. 11, while restricting ourselves to graphs with a certain number of loops (here one) thus allowed us to construct the diagrams correspond to the first nonlinear correction to the two-point cumulant. We next write the equation corresponding to these diagrams. For each of the complete diagrams, we begin at the external vertices, and proceeding from left to right in the graph, multiply the factors corresponding to each edge and vertex together. We finish by summing over indices for all internal vertices and integrating over all internal time points. The contributions from each diagram are then added together. This yields: Note: this construction neglects several diagrams (see Correction (2020)). In the networks we study here, those neglected terms are small (of third order in the coupling strength). Demonstrating the interplay between single-neuron transfer, connectivity structure, and network dynamics Our methods predict how spike time statistics of all orders emerge from the interplay of single-neuron input-output properties and the structure of network connectivity. Here we demonstrate how these methods can be used to predict key phenomena in recurrent spiking networks: the fluctuations and stability of population activity, and stimulus coding. First, we isolate the contributions of nonlinearities in single-neuron dynamics to network activity and coding as a whole. We do so by computing "one-loop" correction terms; these correspond to the first structures in our diagrammatic expansion that arise from nonlinear neural transfer. The one-loop corrections provide for the dependence of nth order spiking cumulants on n + 1st order cumulants. Predictions that would be made by linearizing neural dynamics, as in classic approaches for predicting pairwise correlations [19,22,41] and recent ones for higher-order correlations [38], are described as "tree-level." We show how these one-loop corrections, which give new, explicit links between network structure and dynamics (Fig. 12), predict spiking statistics, stability, and the accuracy of coding in recurrent networks.

Recurrent spike-train correlations drive firing statistics and stability in nonlinear networks
In our analysis of the impact of nonlinear neural transfer on network dynamics, a principal finding was that spike correlations could affect firing rates, as described by the one-loop correction to the mean-field firing rates. In this section we illustrate the importance of this effect in a class of networks under intensive study in neuroscience: randomly connected networks of excitatory and inhibitory cells. We began with a network for which we expect classical theoretical tools to work well, taking the neurons to have threshold-linear transfer functions φ(x) = α x . Here, as long as the neurons do not receive input fluctuations that push their rates below this threshold, the "tree-level" theory that takes transfer to be entirely linear should work well. We then move on to consider nonlinear effects.
As in our original motivational example, we took network connectivity to be totally random (Erdös-Rényi), with p EE = 0.2 and p EI = p IE = p II = 0.5. The magnitude of all connections of a given type (E − E, etc.) was taken to be the same and the time course of all network interactions was governed by the same filter g(t) = t τ 2 exp (−t/τ ) (with τ = 10 ms), so that g ij (t) = W ij g(t). (The matrix W contains synaptic) The net inhibitory input weight on to a neuron was, on average, twice that of the net excitatory input weight.
We examined the spiking dynamics as the strength of synaptic weights proportionally increased (Fig. 14A), and studied network activity with using both theory and direct simulation. Due to the high relative strength of inhibitory synapses in the network, firing rates decreased with synaptic weight (Fig. 14D). The magnitude of spike train covariances (reflected by the integrated autocovariance of the summed excitatory population spike train) increased (Fig. 14E). These changes were also visible in raster plots of the network's activity (Fig. 14B,C).
At a critical value of the synaptic weights, the mean-field theory for the firing rates loses stability (Fig. 14F). The location of this critical point is predicted by the linear stability of the dynamics around the mean-field rate; the spectrum of the propagator ∆(ω) = I − φ (1) g(ω) −1 diverges when the spectral radius of φ (1) g is ≥ 1. (This is also the point where the spectral radius of the inverse propagator crosses zero.) Until that critical point, however, the "tree-level" predictions for both firing rates and spike train covariances (i.e. mean-field theory and linear response theory) provided accurate predictions (Fig. 14D,E). This combination of mean-field theory for firing rates and a linearization around it to predict spike train covariances has a long history in theoretical neuroscience (e.g. [21-23, 41, 42]).
We next give a simple example of how nonlinearities in neural transfer cause this standard tree-level theory (mean-field formulas for rates and linear response theory for covariances) to fail -and how tractable "one-loop" corrections from our theory give a much-improved description of network dynamics. We take the same network as above, but replace neurons' threshold-linear transfer functions with a rectified power law φ(x) = α x p (Fig. 15A). This has been suggested as a good description of neural transfer near threshold [43][44][45][46][47]. For simplicity, we take the power law to be quadratic (p = 2). As we increased synaptic weights, the tree-level theory qualitatively failed to predict the magnitude of spike train covariances and firing rates (Fig. 15D,E black curve vs dots). This occurred well before the mean-field firing rates lost stability (Fig.  15F, black).
Higher-order terms of the loop expansion described above (Nonlinearities impose bidirectional coupling between different orders of activity) provide corrections to mean-field theory for both firing rates and spike train correlations. These corrections represent coupling of higher-order spike train cumulants to lower order cumulants. In the presence of an input-rate nonlinearity, for example, synchronous (correlated) presynaptic spike trains will more effectively drive postsynaptic activity [7,48]. This effect is described by the one-loop correction to the firing rates (Fig. 9).
The one-loop correction for the mean field rate of neuron i in a network is given by the same diagram as the one-loop correction for the nonlinearly self-exciting process, Fig. 9, but interpreted using the network Feynman rules (Fig. 11). This yields: where t 0 is the initial time. This correction was, on average, positive (Fig. 15D for excitatory neurons; also true for inhibitory neurons). Similarly to firing rates, the loop expansion provides corrections to higher-order spike train cumulants. The one-loop correction to the spike train covariances (Eq. (55), derived in Fig. 13) accounts for the impact of triplet correlations (third joint spike train cumulants) on pairwise correlations and provided an improved prediction of the variance of the population spike train as synaptic weights increased (Fig. 15E).
Since the one-loop correction to the firing rates could be large, we also asked whether it could impact the stability of the firing rates -that is, whether pairwise correlations could, through their influence on firing rates through the nonlinear transfer function, induce an instability. This is a question of when the eigenvalues of the propagator diverge-or equivalently, when the eigenvalues of the inverse propagator cross zero. The inverse of the one-loop propagator is given by the "proper vertex" obtained by amputating the outside propagator edges of the one-loop correction to the propagator; or equivalently, calculated from the Legendre transform of the cumulant-generating functional [39]. We can heuristically derive the one-loop stability correction as follows.
The full propagator, ∆, obeys the expansion where∆ is the tree-level propagator, ∆ 1 is the one-loop correction, two-loop corrections are collected in ∆ 2 , and so on (Fig. 16A). The one-loop correction is of the form ∆Γ 1∆ ; the diagram begins and ends with the tree-level propagator, and we label the loop Γ 1 . The first two-loop correction is a chain of loops (Fig. 16A), and so can also be factored as∆Γ 1∆ Γ 1∆ . We can represent this factorization diagrammatically by pulling out the tree-level propagator and the loop Γ 1 (Fig. 16B). Just as at two-loop , and spectral radius of the stability matrix of mean-field theory (F) vs excitatory-excitatory synaptic weight. While excitatory-excitatory weight is plotted on the horizontal axis, all other synaptic weights increase proportionally with it. Black line: tree-level theory. Red line: one-loop correction accounting for impact of the next order (pairwise correlations' influence on mean and triplet correlations' influence on pairwise). Dots: simulation. All dots after the one-loop spectral radius crosses 1 represent results averaged over the time period before the activity diverges. order we were able to factor out a factor∆Γ 1 and obtain the expansion of the propagator to one loop, continuing to higher-orders in the loop expansion of the full propagator would all the rest of the full propagator with factors of∆Γ 1 in front. The remaining terms would have factors starting with the two-loop correction, and so forth.
Pulling out all terms of Eq. (57) that begin with∆Γ 1 and summing them allows us to write (Fig. 16B): We now truncate at one loop, and operate on both sides with the inverse of the tree-level propagator: , revealing that −Γ 1 is the one-loop correction to the inverse propagator. From the Feynman rules (Fig. 11), that factor is: where φ (2) j denotes the second derivative of the transfer function of neuron j, evaluated at its mean-field input (and similar for φ . The eigenvalues of this provide a correction to the stability analysis based on the tree-level propagator. This predicted that the firing rates should lose stability significantly before the bifurcation of the mean-field theory (Fig. 15F, red vs black). Indeed, we saw in extended simulations that the spiking network could exhibit divergent activity even with synaptic weights which the mean-field theory predicted should be stable (Fig. 1C). In summary, mean-field theory can mis-predict the bifurcation of the rate of spiking models since it fails to capture the impact of correlations on firing rates through nonlinear transfer functions.

Impact of connectivity structure on correlation-driven instabilities in nonlinear networks
Recent work has shown that cortical networks are more structured than simple Erdős-Rényi networks (e.g. [49][50][51][52][53][54]). One feature of cortical networks is a broad spread of neurons' in-and out-degree distributions (i.e., the distributions of the number of synaptic inputs that each neuron receives or sends); another is broadly spread synaptic weights. These network properties, in turn, can have a strong impact on population activity [55][56][57][58][59]. Here, we illustrate the link between network structure and activity in the presence of nonlinear neural transfer. To generate structured networks, we began with the type of excitatory-inhibitory networks discussed in the previous section, but took the excitatory-excitatory coupling to have both heavy-tailed degree and weight distributions. Specifically, we took it to have truncated, correlated power law in-and out-degree distributions (Methods: non-Erdős-Rényi network model). We then took to the synaptic weights to be log-normally distributed [49,60]. For simplicity, we took the location and scale parameters of the weight distribution to be the same.
We then examined the network dynamics as the location and scale of the excitatory-excitatory synaptic weights increased. For each mean weight, we sampled the excitatory-excitatory weights from a lognormal distribution with that mean and variance. The excitatory-inhibitory, inhibitory-excitatory and inhibitory-inhibitory weights remained delta-distributed. Each such network specified a weight matrix W, which allowed the methods described previously for computing tree-level and one-loop rates, covariances and stability to be straightforwardly applied. For strong and broadly distributed synaptic weights (Fig. 17B), the network exhibited a similar correlation-induced instability as observed in the Erdős-Rényi network (Fig. 17C) even though mean-field theory predicted that the firing rates should be stable (Fig. 17F, black vs. red curves). As synaptic weights increased from zero, the mean-field theory for firing rates provided a misprediction (Fig. 17D, black line vs dots) and the linear response prediction for the variance of the population spike train also broke down (Fig.  17E, black line vs dots). The one-loop corrections, accounting for the impact of pairwise correlations on mean rates and of triplet correlations on pairwise correlations, yielded improved predictions (Fig. 17D, E red lines) and a much more accurate prediction for when firing rates would lose stability (Fig. 17C, F). These effects were similar to those seen in Erős-Rényi networks (Fig. 15), but the transition of the firing rates occurred sooner, both for the mean field (because of the effect of the weight and degree distributions on the eigenvalues of the weight matrix) and one loop theories (because of the impact of the correlations on the firing rates).

Exponential single-neuron nonlinearities
In the previous section, we investigated how a non-Erdős-Rènyi network structure could amplify the one-loop corrections by increasing spike train correlations. We now examine a different single-neuron nonlinearity: φ(x) = αe x , which is the canonical link function commonly used to fit GLM point process models to spiking data [26]. The exponential has arbitrary-order derivatives, so there is no reason to expect the one-loop description to be a sufficient correction to mean-field theory.
As before, we take the mean synaptic weight onto each neuron in the network to be 0. First, we take excitatory and inhibitory neurons to have the same baseline drive, λ E = λ I = −1.5. As we scale synaptic weights, we see that the one-loop correction is small compared to the tree-level theory for the firing rates, population variances and stability analysis (Fig. 18A-C, red vs. black lines). It nevertheless provides an improved correction for the variance of the excitatory population spike train (Fig. 18B, between 1.5 and 2 mV synaptic weights). The bifurcation of the one-loop theory is close to the bifurcation of the mean-field theory, and before that point the mean-field theory and one-loop corrections both lose accuracy (Fig. 18A,B). This makes sense: when the mean-field theory fails, the only reason that the one-loop correction to the rates would be accurate is if all third-and higher-order spike train cumulants are small. Those higher-order correlations are not small near the instability.
Next, we broke the symmetry between excitatory and inhibitory neurons in the network by giving inhibitory neurons a lower baseline drive (λ I = −2.). This shifted the bifurcation of the mean-field theory and the one-loop correction to much higher synaptic weights (Fig. 19C). For intermediate synaptic weights, we saw that the one-loop correction provided a better match to simulations than the tree-level theory (Fig. 19A, B, between 1 and 1.5 mV synaptic weights). For stronger synapses, however, the simulations diverged strongly from the tree-level and one-loop predictions (Fig. 19A,B, around 1.5 mV synaptic weights). In principle, we could continue to calculate additional loop corrections in an attempt to control this phenomenon. The exponential has arbitrary-order derivatives, however, so we might need infinitely many orders of the loop expansion-suggesting a renormalization approach [39], which is beyond the scope of this article. In sum, with an exponential transfer function we saw that for intermediate synaptic weights, the one-loop correction improved on the tree-level theory. For strong enough synaptic weights, however, both failed to predict the simulations. How soon before the mean-field bifurcation this failure occurred depended on the specific model.

Discussion
Joint spiking activity between groups of neurons can control population coding and controls the evolution of network structure through plasticity. Theories for predicting Using this expansion, we investigated how nonlinear transfer can affect firing rates and fluctuations in population activity, imposing a dependence of rates on higher-order spiking cumulants. This coupling could significantly limit how strong synaptic interactions could become before firing rates lost stability.

Convergence and truncation of the loop expansion
The loop expansion is organized by the dependence of lower-order activity cumulants to higher-order ones. The first-order ("tree level") description of the nth activity cumulant does not depend on higher-order cumulants. One loop corrections correspond to dependence of the order n cumulants on the tree-level n + 1-order cumulants, two-loop corrections correspond to dependence of the order n cumulant on the tree-level n + 1 and n + 2 order cumulants, and so on. This coupling arises from the nonlinearity of the single-neuron transfer function φ (Results: Nonlinearities impose bidirectional coupling between different orders of activity: nonlinearly self-exciting process; Methods: Path integral representation). When the transfer function is linear at the mean-field rates, the tree-level theory provides an accurate description of activity so long as the network is stable (Fig. 14). This corresponds to the 2nd and higher-order derivatives of the transfer function φ with respect to the total input, evaluated at the mean-field rates, being zero. When φ has non-zero 2nd or higher derivatives at the mean-field rates, orders of the loop expansion corresponding to those order derivatives can be important (with one loop corresponding to the second derivative, two loops corresponding to the third derivative, etc.) The magnitude of the n-loop correction depends on two things: the magnitude of the n + 1-order tree-level activity cumulant and the magnitude of the n + 1st derivative of φ at the mean-field rates (i.e. the strength of the coupling to that Recent work has shown that the the magnitude of order-n activity cumulants depend on the motif structure of the network ( Fig. 17; [41,56,57,61]), as well as on the correlation structure of the inputs it receives. We also used a particular form of the interaction kernel, g(t) and assumed that it had unit integral over t; any continuous g(t) with finite integral over t should work as well. This criterion on the integral of g is necessary for the stability of mean-field theory with only excitatory interactions. If g(t) did not have a well-defined integral, an appropriate balance between excitation and inhibition could perhaps still ensure a stable mean-field solution (similar to [62]). If the system lies close to a bifurcation of the mean-field theory, so that the eigenvalues of the propagator diverge, then the mean field theory and this expansion around it can also fail. In that case, renormalization arguments can allow the discussion of the scaling behavior of correlations [40].

Relationship to other theoretical methods
A classic and highly influential tool for analyzing the dynamics of neural rate models with Gaussian-distributed synaptic weights is dynamical mean field theory, which reveals a transition to chaotic rate fluctuations in networks of rate units with Gaussian connectivity [63]. Dynamical mean field theory proceeds, briefly, by taking the limit of large networks and replacing interactions through the quenched heterogeneity of the synaptic weights by an effective Gaussian process mimicking their statistics. Recent extensions of dynamical mean field theory have incorporated a number of simple biological constraints, including positive-valued firing rates [64][65][66] and certain forms of cell type-specific connectivity [64,67,68]. In this framework, spiking is usually only described in the limit of slow synapses as additive noise in the rates which can shift the transition to chaotic rate fluctuations to higher coupling strengths and smooth the dynamics near the transition [64,69].
An alternative approach to dynamical mean-field theory is to start from the bottom up: to posit an inherently stochastic dynamics of single neurons and specify a finite-size network model, and from these derive a set of equations for statistics of the activity [70,71]. This approach provides a rigorous derivation of a finite-size rate model as the mean field of the underlying stochastic activity, as well as the opportunity to calculate higher-order activity statistics for the activity of a particular network [40,[72][73][74]. This is the approach taken here with the popular and biologically motivated class of linear-nonlinear-Poisson models. A similar approach underlies linear response theory for computing spike train covariances [21,75], which corresponds to the tree level of the loop expansion presented here. For integrate-and-fire neuron models receiving Poisson inputs, Fokker-Planck theory for the membrane potential distributions can be used to calculate the linear response function of an isolated neuron [76], which together with the synaptic filter and weight matrix determines the propagator.

Dynamics and stability in spiking networks
Fluctuations in large spiking networks Networks of excitatory and inhibitory neurons with instantaneous synapses have been shown, depending on their connectivity strengths and external drive, to exhibit a variety of dynamics, including the "classical" asynchronous state, oscillatory population activity, and strong, uncorrelated rate fluctuations [24,77,78]. The classical asynchronous state and oscillatory regimes exist in the presence of Poisson-like single-neuron activity, either due to external white noise or to internally generated high-dimensional chaotic fluctuations [79,80]. Transitions between these modes correspond to bifurcations in which a given state loses stability. The present results allow one to compute these transition points with greater accuracy, by explicitly computing correlations of arbitrary order and, crucially, how these correlations "feed back" to impact firing rates and the stability of states with different rates.
Inhibitory-stabilized and supralinear-stabilized networks Beyond the overall stability of network states, an important general question is how firing rates depend on inputs in recurrent neural networks. "Inhibitory-stabilized" networks can have surprising dependencies on inputs, with rates trending in opposite directions from what one would at first expect [34]. Supra-linear input-rate transfer in inhibitory-stabilized networks can explain a variety of physiological observations [81][82][83]. Our results are therefore useful in predicting how correlations emerge and couple to firing rates in these networks. The impact of cell type-specific dynamics on dynamics and coding remains to be fully elucidated [84].
A new potential impact of correlations on population coding Many studies have examined the impact of "noise correlations" on population coding, examining the vector of neural responses. If all responses are conditionally independent given the stimulus, the distribution of responses to a particular stimulus is spherical. The discriminability of the responses to two stimuli corresponds to the area of overlap of those multi-neuron response distributions. To tree level in the loop expansion of the population responses, correlations stretch that response distribution. These correlations can either improve or lower coding performance, depending on how they relate to the stimulus-evoked responses [12,[16][17][18]85]. In the presence of a nonlinear transfer function, a further potential impact of correlations is to change neurons' mean activities (Fig. 15. This corresponds to a translation of the multi-neuron response distributions (Fig. 20, bottom) which could, in principle, either increase or decrease their discriminability (Fig. 20, bottom).

Independent Poisson responses
Tree-level theory: Correlations stretch response distribution

non-Erdős-Rényi network model
For Fig. 17, we generated the excitatory-excitatory connectivity with a truncated power-law degree distribution. The marginal distributions of the number of excitatory-excitatory synaptic inputs (in-degree) or outputs (out-degree) obeyed: where d is the in-or out-degree. Parameter values are contained in Table ??; C 1 and C 2 are normalization constants to make the degree distribution continuous at the cutoff L 1 .
The in-and out-degree distributions were then coupled by a Gaussian copula with correlation coefficient ρ to generate in-and out-degree lists. These lists generated likelihoods for each possible connection proportional to the in-degree of the postsynaptic neuron and the out-degree of the presynaptic neuron. We then sampled the excitatory-excitatory connectivity according to those likelihoods.

Path integral representation
Here we outline the derivation of a path integral formalism for a network of processes with nonlinear input-rate transfer, following methods developed in nonequilibrium statistical mechanics [86][87][88][89][90][91]. We will begin by developing the formalism for a simple model, where a spike train is generated stochastically with intensity given by some input ν(t). We will specify the cumulant generating functional of the spike train given ν(t) below.
The general strategy is to introduce an auxiliary variable, called the "response variable", whose dynamics will determine how information from a given configuration (e.g. the spike counts n(t)) at one time will effect future configurations of the dynamics. Introducing the response variable allows us to write the probability density functional for the process in an exponential form. The integrand of that exponential is called the "action" for the model, which can then be split into a "free" action (the piece bilinear in the configuration and response variables) and an "interacting" one (the remainder). Cumulants of the process can then be computed, in brief, by taking expectations of the configuration and response variables and the interacting action against the probability distribution given by the free action.
Let n(t) be the number of spike events recorded since some fiducial time t 0 . In a time bin dt, ∆n events are generated with some distribution p(∆n) and added to n(t). Let the events generated in any two time bins be conditionally independent given some inhomogeneous rate ν(t), so that p(∆n) = p(∆n|ν). So, assuming that initially n(t 0 ) = 0, the probability density functional of the vector of events over M time bins is: where P (ñ i |ν i ) is the Laplace transform of p(∆n i |ν i ) and W [ñ i |ν i ] is the cumulant generating functional for the auxiliary variable. In the third step we have written the distribution of p(∆n i ) as the inverse Laplace transform of the Laplace transform. The Laplace transform variableñ i is our auxiliary response variable. In the fourth step we identified the Laplace transform of the probability density functional as the moment generating functional, so that W [ñ i |ν i ] is the cumulant generating functional of the spike count. Note that these are complex integrals. The contour for the integration over n i is parallel to the imaginary axis.
Taking the continuum limit M → ∞, dt → 0 then yields the probability density functional of the spike train processṅ: where dñi 2πi andṅ = dn dt and we suppress the conditional dependence ofñ(t) on ν(t). In the continuum limit the integral is a functional or path integral over realizations ofñ(t). We will call the negative exponent of the integrand in Eq. 64 the action: We have slightly abused notation here in that a factor of 1/dt has been absorbed into W [ñ]. We will justify this below.
We have not yet specified the conditional distribution of the events given the input ν(t), leaving W [ñ(t)] unspecified. Here, we will take the events to be conditionally Poisson [92], so that (In the continuum limit, the rate ν(t) allowed us to absorb the factor of 1/dt into W . A finite size time bin would produce ν(t)dt events in bin dt.) This representation of the probability density functional yields the joint moment generating functional (MGF) of n andñ: (67) and the moment generating functional ofṅ: The above strictly applies only to the inhomogeneous Poisson process. This formalism is adapted to the self-exciting process by introducing conditional dependence of the rate ν(t) on the previous spiking history. In the discrete case, before taking the limit M → ∞, we say that the rate ν i = φ[n i− ], where φ is some positive function and n i− indicates all spiking activity up to but not including bin i. This requirement is equivalent to an Ito interpretation for the measure on the stochastic processṅ(t). Because of this assumption, the previous derivation holds and we can write whereṅ(< t) =ṅ(s) : s < t. In the continuum limit, there is an ambiguity introduced by the appearance of the time variable t in bothñ(t) andṅ(t). This is resolved in the definition of the measure for the functional integral, and affects the definition of the linear response (below). Again, this is a manifestation of the Ito assumption for our process.
The specific model used in this paper assumes a particular form for the argument of φ. We assume that the input is given by where g(t) is a filter that defines the dynamics of the process in question and λ(t) is an inhomogeneous rate function. The result is that the action for non-linearly self-exciting process is given by The only extension required to move from the above action to the network model is to introduce indices labelling the neurons and couplings specific for each neuron pair. Nothing of substance is altered in the above derivation and we are left with Mean field expansion and derivation of Feynman rules.
We could use the above action in order to derive Feynman rules for these processes. The expansions so described would be equivalent to our initial expansions before resumming (the sets of diagrams that use dashed lines). These would describe an expansion aboutṅ(t) = 0. We can arrive at this expansion by separating the action into two pieces, called the "free" action and the "interacting" action: for some operator K ij (t, t ). Define Taking the expectation with respect to the probability density given by the free action yields so that K is the operator inverse of ∆ under the free action. That expectation can be computed via the moment generating functional for the free action (which we denote Z 0 [J, J]), and then completing the square in order to compute the integral. This leaves which implies that ṅ i (t)ñ j (t ) = ∆ ij (t, t ). We have used the fact that Z 0 [J, J] = 1.
Computing moments requires functional integrals of the form We Taylor expand each neuron's nonlinearity φ (around its λ i (t)) and expand the exponential arising from the cumulant generating functional of the spike counts (that in eñ − 1 ) around zero. We then collect the terms with one power ofñ i and ofṅ i in the free action. This leaves the interacting action S V [ñ,ṅ] as: Note that at each term in this expansion, each of the p factors ofñ i and the q factors of j g ij * ṅ j carries its own time variable, all of which are integrated over; we have suppressed these time variables and their integrals. Now the action can be written as: where we have suppressed the sums over neuron indices and all time integrals. We have defined the "vertex factor" V i pq = φ (q) i /q! (the index p recalls which power ofñ it arrived with). Note that we have defined vertex factors with a minus sign relative to S V [ñ,ṅ]. Introducing the shorthand (g * ṅ) i = j g ij * ṅ j , and then again suppressing neuron indices, we write the moment in Eq. (77) as: ṅ pñq = DñDṅṅ pñq e −S = DñDṅṅ pñq eñ Kṅ+ p,q 1 p! Vpqñ p (g * ṅ) q where we denote the expectation with respect to the free action S 0 [ñ,ṅ] by 0 . Expectations with respect to the free action are determined by its generating functional, Eq. (76). Due to Wick's theorem, any moment will decompose into products of expectation values ∆ ij (t, t ) = ṅ i (t)ñ j (t ) 0 , according to all possible ways of partitioning the operators intoñ,ṅ pairs, i.e.
where the indices i, j, t, t are determined by the partitioning. For the terms in the expansion (82), each term will be decomposed into a sum over ways in which factors of n can be paired with factors ofṅ [93].
We can represent each term in this sum diagrammatically by associating each of the p factors ofṅ and q factors ofñ from the moment with external vertices with a single outgoing or exiting line, respectively. Each vertex factor V pq gets a vertex with p lines exiting to the left (towards the future) and q wavy lines entering from the right (from the past). The partitions of pairingñ and n are determined by connecting outgoing lines to incoming lines. The terms in the expansion with l powers of a vertex factor will also appear l! times in the partitioning. As such, the sum over partitions will result in the cancellation of the factor of l! for vertex factor V pq . All such terms from a vertex factor V pq with p outgoing lines will generate p! copies of the same the same term which will cancel the factor of p!, justifying our definition. Each vertex factor also carries a sum over neuron indices i and an integral over internal time variable which must be performed to compute the moment; these are the sums and integrals we suppressed in Eq. (79).
Thus, in order to compute the terms in the expansion for a moment 1) each factor of n orñ gets an external vertex, 2) every graph is formed using the vertices associated with the vertex factors V nm by constructing the available partitions with all possible vertices, 3) For each vertex, contribute a factor of V nm , 4) for each line contribute a factor of ∆ ij , 5) contribute an operation g * for each wavy line (operating on the term associated with the attached incoming line) and finally 6) all integrals and sums are performed. Note that some of these terms will produce disconnected graphs. These correspond to factorizable terms in the moment.
The rules derived using the action above will produce the initial expansions that we demonstrated about the n = 0 configuration. The "resummed" rules that we present in the Results arise from first performing a slight change of variables in the integrand. Instead of considering the fluctuations about n(t) = 0, we shift the configuration and response variables by their mean field solutions. Definingr i (t) = ṅ i (t) 0 and r i (t) = ñ i (t) 0 , these are determined bỹ We shift by these solutions by defining This leaves us with the action Now we can develop the rules for the expansion we provide in the text using the same procedure outlined above. The only difference is that ∆ ij (t, t ) will be replaced by the linear response around mean-field theory and the vertex factors will be determined by an expansion around the mean field solution. The rules otherwise remain the same. The rules so derived are shown in Figure 11. An expansion around the true mean ṅ(t) would lead to the "effective action", the expansion of which gives rise to the proper vertex factors definiing the different orders of stability correction.
Counting powers of the vertex factors allows one to compute a "weak coupling" expansion. Alternatively, the fluctuation expansion is determined by the topology of graphs and is equivalent to a steepest descent evaluation of the path integral. This allows us to truncate according to the number of loops in the associated graphs and is the approach we use in this paper. The approach here is a standard device in field theory and can be found many texts, for one example see [39].

Correction (2020)
We were recently made aware of a mistake in this article, "Linking structure and activity in nonlinear spiking network" [94]. In Figure 13 we constructed the one-loop correction to the two-point correlation for multivariate generalized linear point process models (also called nonlinear Hawkes processes). Drs. Kordovan and Rotter recently pointed out that we neglected several contributions to that loop correction [95]. Inhibitory-excitatory connection probability 0.5 p IE Excitatory-excitatory connection probability 0.5 p II Excitatory-excitatory connection probability 0.5 τ Time constant for postsynaptic potentials 10 ms g(t) Shape of postsynaptic potentatials (t/τ 2 ) exp (−t/τ ) L 1 Left cutoff for power-law degree distribution 0 L 2 Right cutoff for power-lay degree distribution N E γ 1 Rising exponent for power-law degree distribution .8 γ 2 Falling exponent for power-law degree distribution -1.5 ρ Correlation of Gaussian copula for degree distributions .8 The Feynman diagrams corresponding to these contributions for networks with threshold-quadratic transfer functions are shown in Fig. 21. In the networks we studied, these corrections provide small contributions (Fig. 22), which can be expected since they are of at least third order in the coupling strength.
If the transfer function has higher than second order derivatives, there is an additional contribution to the one-loop correction for the two-point correlation ( [95]; Fig. 23). For the networks with exponential transfer functions, the additional contributions to the one-loop correction for the two-point function were also small (Fig.  24). Dr. Todorov also pointed out that our simulation code for the networks with exponential transfer functions calculated firing rates using the full simulation length (even when rates diverged before the simulation finished); that bug correction improves the match between theory and simulation for the networks with exponential transfer functions ( Fig. 24 vs Figs. 18, 19).
Kordovan & Rotter also suggested an alternative scheme for constructing diagrams at a given loop order by starting from the loops rather than from external vertices, and discuss the possibility of automated algebraic programs for constructing and solving for loop corrections, comparable to systems that exist for particle physics. We agree that this would be quite useful for the future application of these tools to neuroscience models.   Corrected one-loop approximation of the two-point correlation for the excitatory-inhibitory networks with exponential transfer functions of Fig. 18 (left) and Fig. 19 (right) in [94].