Dynamic Finite Size Effects in Spiking Neural Networks

We investigate the dynamics of a deterministic finite-sized network of synaptically coupled spiking neurons and present a formalism for computing the network statistics in a perturbative expansion. The small parameter for the expansion is the inverse number of neurons in the network. The network dynamics are fully characterized by a neuron population density that obeys a conservation law analogous to the Klimontovich equation in the kinetic theory of plasmas. The Klimontovich equation does not possess well-behaved solutions but can be recast in terms of a coupled system of well-behaved moment equations, known as a moment hierarchy. The moment hierarchy is impossible to solve but in the mean field limit of an infinite number of neurons, it reduces to a single well-behaved conservation law for the mean neuron density. For a large but finite system, the moment hierarchy can be truncated perturbatively with the inverse system size as a small parameter but the resulting set of reduced moment equations that are still very difficult to solve. However, the entire moment hierarchy can also be re-expressed in terms of a functional probability distribution of the neuron density. The moments can then be computed perturbatively using methods from statistical field theory. Here we derive the complete mean field theory and the lowest order second moment corrections for physiologically relevant quantities. Although we focus on finite-size corrections, our method can be used to compute perturbative expansions in any parameter.


Introduction
Realistic models of neural networks in the central nervous system are analytically and computationally intractable, presenting a challenge to our understanding of the highly complex spiking dynamics of neurons. Consequently, some degree of simplification is necessary for theoretical progress and there is a corresponding spectrum of models with a range of complexity. ''Mean Field'' models represent the highest degree of simplification and classically consider the evolution of an ''activity'' variable which is some average of the output of a population of neurons. Early examples of mean field models are those of Wilson-Cowan [1,2], Cohen and Grossberg [3], and Amari [4]. These models have proven to be useful in studies of neural dynamics such as in pattern formation and visual hallucinations [5][6][7]. However, because of the nature of the activity variables as averages, they necessarily neglect individual neuron dynamics as well as population level effects of phase information and synchrony. Additionally, it is not clear how the time scales in the equations of mean field models are related to the response properties of the constituent neurons [8].
The next level of model complexity requires relating population level activity to single neuron dynamics. This is a question explored by Knight [9,10], who noted in particular that although the population firing rate may track an external stimulus, the single neuron firing rate need not and generally does not. The important conceptual feature introduced was that a population of neurons, each of which has some potential variable, v, can be replaced with a density, r(v), which counts the fraction of neurons whose potential lies within the infinitesimal range (v,vzdv). The firing rate of the population is then the current density of the population at the threshold potential. In the limit of an infinite population of neurons, one can introduce a continuity equation derived from the single neuron dynamics, producing what can be called density mean field theory. The density mean field approach to analyzing coupled networks has been pursued by Desai and Zwanzig [11], Strogatz and Mirollo [12][13][14], Treves [15], Abbott and van Vreeswijk [16] and others [17][18][19][20][21][22][23][24][25][26]. The spike response formalism considers an integral formulation of the continuity equations [27]. These density mean field approaches have been recently put on a mathematically rigorous footing using results from probability theory [28][29][30][31].
Neuronal firing is inherently variable and the source of this variability has been subject to much study and debate [32][33][34]. Incorporating neuronal variability into theories is another level of complexity. Activity mean field models have been shown to exhibit complex dynamics with high variability when coupled with highly variable connectivity [35][36][37][38], but this is independent of single neuron dynamics. It is not clear in the context of the density mean field approach how to quantify the fluctuations arising from the interactions of discrete neurons in a finite-sized network, where the fluctuations are not suppressed by averaging over an infinite pool of neurons. Ad hoc attempts at quantifying finite-size effects include driving the system with external noise [13], introducing a selfconsistent noise from neural firing [39], or assuming Poisson firing rates of the neurons within the population [17,22,40]. However, a systematic means of handling fluctuations due to the finite size of a population of neurons remain lacking.
Here, we present a systematic expansion around the density mean field behavior that quantifies the finite-size fluctuations and correlations of a population of neurons in terms of the interactions in the network. The expansion utilizes a kinetic theory approach adapted from plasma physics [41][42][43][44][45][46]. Because we are interested specifically in intrinsic fluctuations which arise across the population evolving via deterministic dynamics, we do not include any external ''noise'' or internal stochasticity. The network variability is thus entirely due to the fact that many possible neuron initial conditions and parameters are consistent for a given network, which implies that a given network is selected from an ensemble of networks. One should think of this ensemble as the ensemble of networks consistent with an initial experimental setup, or of those networks which are consistent with the experimentally accessible quantities in the network. In particular, we show that fluctuations and correlations and their effect on population behavior can be quantified in a fully deterministic dynamical system by considering the ensemble of system histories given a distribution of initial conditions and network parameters. In the finite size case, the density r(v) will not represent the fraction of neurons in the network with potential in the interval (v,vzdv) (as it is in the infinite neuron case), but will represent the fraction of networks in the ensemble for which there is a neuron within the interval (v,vzdv). In the cases we consider, there is a ''typical'' system in the large neuron limit, so that the two are nearly identical. To a given order in the network size N, one can derive a moment hierarchy of differential-integral equations for the statistical moments of the density r(v). The calculations are facilitated by transforming the moment hierarchy into a functional or path integral expression of the moment generating functional from which a perturbative expansion can be derived. We show this for two synaptically coupled neural networks in the Results and provide some guidance on generalization to other models in the Discussion.
Our approach is thus in the spirit of Gibbs' view of statistical mechanics [47]. Like Gibbs, we do not rely on ergodicity or make any claims about time averages of the dynamics. The systems we study do not obey detailed balance and thus there will not be a necessary correspondence between time averaging and the ensemble averages we study. Nonetheless, we obtain useful results for characterizing the fluctuations and correlations in a network. We consider a specific example with global coupling where these correlations will have well-defined expansions in terms of the inverse systems size 1=N and we refer to them as ''finite-size'' effects. However, we wish to stress that our approach is not restricted to a finite-size expansion in 1=N per se. Our main result is to provide a systematic framework to ''average'' over unknown or unessential degrees of freedom.

The density description of neural networks
We present a formalism to analyze finite-size effects in a network of N synaptically coupled spiking neurons. Under fairly generic conditions, such a system can be reduced to a set of phase variables with a set of ancillary variables (such as those representing synaptic input) [48][49][50][51]. We consider the phase dynamics of a set of N phase neurons obeying _ u u(t)~{bu(t)zbn(t) ð2Þ where each neuron has a phase h i that is indexed by i, u is a global synaptic drive, n is the population firing rate of the network and t l j is the lth firing time of neuron j and a neuron fires when its phase crosses p. The frequency function F (h,u,c)[R N depends on all the phases h[R N and a set of m parameters c[R m , that can be distinct for each neuron i. The neuron can be in an oscillatory or excitable regime. We will develop our theory for a general frequency function F and apply it to the specific cases of a simple phase oscillator where F (h,u,c)~I(t)zau(t) [10] and the theta model where F (h,u,c)~1{coshz(I(t)zau(t))(1zcosh), where c~(I(t),a), I(t) is an external input and a is a parameter that can be neuron dependent. The theta model is the normal form of a Type I neuron near the bifurcation to firing and is equivalent to a quadratic integrate-and-fire neuron [52]. For some neural networks, a phase reduction of this sort results in a phase coupled model, such as the Kuramoto model (e.g. Hansel and Golomb [53]), which we have previously analyzed [41,42]. In the present paper, we consider all-to-all or global coupling through a synaptic drive variable u(t). However, our basic approach is not restricted to global coupling.
Our goal is to derive the fluctuation and correlation effects beyond mean field theory for the system. For global coupling, these effects arise from the finite number of neurons N in the network. We calculate the effects of finite N on the dynamics of the system as a perturbation expansion in 1=N around the mean field limit of N??. In particular, we will compute the fluctuations and correlations of the synaptic drive u and network firing rate n, defined as the variability over instances of the network given initial conditions as well as neuron and network parameters. We will do this through a probability density functional description of the neuron firing histories. Before we introduce our density functional approach, we describe the Klimontovich description of many-body systems. This description allows us to introduce the fundamental degrees of freedom in a straightforward manner without recourse to the statistical field theory formalism used in the density functional approach. While we focus on finite size effects in

Author Summary
One avenue towards understanding how the brain functions is to create computational and mathematical models. However, a human brain has on the order of a hundred billion neurons with a quadrillion synaptic connections. Each neuron is a complex cell comprised of multiple compartments hosting a myriad of ions, proteins and other molecules. Even if computing power continues to increase exponentially, directly simulating all the processes in the brain on a computer is not feasible in the foreseeable future and even if this could be achieved, the resulting simulation may be no simpler to understand than the brain itself. Hence, the need for more tractable models. Historically, systems with many interacting bodies are easier to understand in the two opposite limits of a small number or an infinite number of elements and most of the theoretical efforts in understanding neural networks have been devoted to these two limits. There has been relatively little effort directed to the very relevant but difficult regime of large but finite networks. In this paper, we introduce a new formalism that borrows from the methods of many-body statistical physics to analyze finite size effects in spiking neural networks.
Finite Size Effects in Spiking Neural Networks this paper, our method could also be used to generate perturbation expansions in other parameters.
Klimontovich description. We adapt the methods of the kinetic theory as applied to gas and plasma dynamics to create a probabilistic description of the network dynamics [45,46]. The approach will allow us to calculate the corrections to mean field theory due to correlations in the firing times of neurons. In particular, we employ a Klimontovich description, which considers the probability density of the phases of a population of neurons (i.e. the density of the empirical measure) where d( : ) is the Dirac delta functional, and h i (t) and u(t) are the solutions to system (1)-(3). The neuron density gives a count of the number of neurons with phase h and parameters c at time t. We have included the parameter vector c i in the neuron density. Hence, neurons are characterized by their phase and parameter values. For systems that obey exchange symmetry or exchangeability (i.e. the system remains unchanged statistically after a relabeling of the neurons), the neuron density in (4) gives a complete description of the system. In systems without exchangeability, the neuron density will still capture the complete dynamics of the system if it includes labels for the information attached to individual neurons. Using the fact that the Dirac delta functional in (3) can be expressed as P l d(t{t l j )~_ h h j d(p{h j ) the population firing rate can be rewritten as The neuron density formally obeys the conservation equation which is known as the Klimontovich equation in kinetic theory and is only valid in the weak or distributional sense since g is not differentiable. The Klimontovich equation, the equation for the synaptic drive (2), and the firing rate expressed in terms of the neuron density (5), fully define the system. For the systems defined above, we expect that in the limit of a large number of neurons the ensemble of networks will converge to a ''typical'' network. In the infinite neuron limit, this will give the density equations of mean field theory, whereas for finite but large N, there will be some variation in systems around the mean field solution. For this reason, we consider taking expectations of the Klimontovich equation (6) over initial conditions and neuron parameters, which produce smooth moment functions for the density. Because the interacting dynamics have a non-trivial effect on the distribution functions, computing this average is not always simple. In the next section, we will formally derive an expression for the measure or density functional P½g,u over which these averages are taken. Denoting averages over initial conditions and neuron parameters (i.e. those over P½g,u) by S : T, the average of (6) yields the equation where Sg(h,c,t)T~r 1 (h,c,t):r(h,c,t) is the first moment of g and called the one-neuron distribution function, which will depend on higher order moments since F is a function of u and hence g.
Equations for the higher order moments can be constructed from (6) by multiplying by factors of g. However, each moment will depend on yet higher moments, resulting in a system of coupled moment equations called the BBGKY hierarchy. Solving the entire BBGKY hierarchy is equivalent to solving the original system and thus provides no computational advantage. However, perturbative solutions in a small parameter such as 1=N can be obtained by truncating the hierarchy and solving the truncated system. This has been the traditional approach in kinetic theory but is generally difficult to do. In the next section, we present a computational formalism where moments for the firing rate and synaptic drive are computed directly from a probability density functional of the neuron density. Mean field theory is obtained by neglecting all correlations and higher order cumulants. Thus, setting SF gT~SF TSgT gives the self-consistent mean field system Higher order moments (distribution functions) r n are likewise defined. The second moment (2-neuron distribution function), r 2 (h,c,t; h',c',t'), is the fraction of networks in the ensemble for which there is a neuron of type c at (h,t) and another of type c' at (h',t'). It is given by We have implicitly defined the function C using the fact that if the neurons are prepared identically and independently, then C~0. We call C the connected contribution and the product of r's the disconnected contribution. These labels are equivalent to whether the contribution can be factored into products of lower moments. The two-neuron density function has connected, disconnected and finite-size (those with factors of 1=N) contributions. The finite-size contributions arise from the deviations in the ensemble average due to finite sample size. There are two types of finite size correction. There is a ''sampling'' correction because of the ''diagonal'' contribution where the indices i,j from the two factors of the neuron density g (4) coincide. Since r 2 represents the joint probability density function of two neurons drawn from the population, there is a finite-size correction due to the fact that once a neuron has been drawn from the population, that neuron's phase h is fixed and the probability density for that neuron is a point mass at that phase. Thus, the sampling finite size term consist of removing 1=Nth of the joint probability mass from p 2 and adding it back as the one-neuron density multiplied by the Dirac delta functional. In the infinite N case, the probability of drawing a strictly identical neuron twice is zero. The second type of finite size effect is due to the coupling and is contained in C (it will be proportional to 1=N). For uncoupled neurons, if the neurons are not prepared such that C=0, then no such correlations will be generated by the dynamics. Note that integration of C over h,c (or h',c') gives 0. One can derive similar expressions for the higher moments, i.e. for the n-neuron densities. There will be connected terms which cannot be factored into products of lower moments, there will be disconnected terms which can be so factored, and there will be finite size corrections given by the combinatorics of drawing n neurons from a population of size N.

Density functional description
We have shown that one tractable approach for incorporating fluctuations and correlations is to truncate the BBGKY hierarchy. However, solving such truncated systems for any model of reasonable complexity quickly becomes unwieldy. For this reason, we adapt the density functional formalism developed for statistical field theory to obtain a formal expression for the probability density functional of the neuron density and synaptic drive P½g,u. The fundamental degrees of freedom in this approach reflect the moments of g, albeit in a more compact and manageable form. The measure P½g,u is a distribution over the possible network realizations. The ''variance'' of this distribution (represented by the two-neuron distribution function) provides an indication of the extent to which different realizations of the network will differ from each other. For the systems we consider, the estimates of the n-neuron distribution functions behave as a power of 1=N. This has the side benefit of demonstrating that there is a limit in which the ensemble converges to a ''typical'' system described by the 1neuron distribution function, r(h,c,t), i.e. the mean field theory. For the same reason, at large N, we can use the n-neuron distribution functions as estimates of the fluctuations in the density for a single system. Because these fluctuations vanish in the limit of large N, we term them ''finite-size'' effects. In the examples below, we concentrate on computing the 2-neuron distribution function to lowest order in 1=N, which gives estimates of fluctuations of the network coupling variables and the firing rate.
In this section we present only final results, the complete derivation and description of the computational method can be found in the Methods. The essential element of the field theoretic method is that the density functional be expressed in the form P½g,u~e {NS½g,u , where S½g,u is called the action. Given this density functional, moments can be obtained by integrating over this density. For example, the second moment of u is given by a functional or ''path'' integral where the measure in the integral is over functions of g and u in some appropriate functional space. A generating functional for all the moments or cumulants can be similarly defined (see Methods). The strategy of field theory is to exploit the fact that Gaussian integrals have closed form expressions in an arbitrary (including infinite) number of dimensions. Hence, the path integrals can be performed using Laplace's method or the method of steepest descents to obtain an asymptotic series expression for the integrals in terms of a small parameter, which in this case will be 1=N.
In general, the action S½g,u is not expressible in simple form. This is overcome by augmenting the system with an auxiliary set of imaginary response functionsg g andũ u and defining an expanded action S½g,g g,u,ũ u. The action can then be Taylor expanded around a critical or saddle point where dS dX~0 (where X [(g,g g,u,ũ u)), which produces an expansion of moments of a ''Gaussian'' distribution, in this case arising from the terms bilinear in the auxiliary variables (g g,ũ u) and the configuration variables (for some variables (z,z z)) have closed form expressions in terms of linear response functions or propagators D and are nonzero only if m~n. This path integral identity can be used to formulate an explicit set of rules to obtain expressions for each term of the perturbation expansion. The computation is simplified by encapsulating the rules for constructing the terms in the expansion into diagrams (i.e. Feynman diagrams, see Methods). The variables in the action can be compared to those in a stochastic differential equation. The original variables (without a tilde, e.g. z(t)) denote the configuration variables, while the auxiliary variables (with a tilde, e.g.z z(t)), denote stochastic or noise forcing terms although in our case the noise is imposed by the uncertainty in the initial conditions and heterogeneity in a fully deterministic network. Finally, the method does not compute the action directly in terms of the neuron density g but rather transforms it to a new set of neuron density variables Q andQ Q through the transformation Q~ge {g g andQ Q~eg g {1. This transformation renders the action to be more amenable to analysis in a way that is similar in spirit to how the Cole-Hopf transformation reduces the nonlinear Burger's equation into the linear heat equation [54]. Specifically it removes the Poisson-like counting noise from the definitions of the moments. As an example, whereas As discussed in Methods, the population level coupling implies that the desired quantities will have an expansion in powers of 1=N.
We describe basic results of this approach on two particular example networks: the phase model and the quadratic integrateand-fire model. For each model, we describe mean field theory, the linear response of the population, and all the correlation functions involving the population and the synaptic drive. Each quantity is calculated to lowest non-trivial order. Phase model. We first apply the formalism on the simple phase model defined by where a is the magnitude of the coupling of a given neuron to the global activity u(t) and V indexes the input. (In analytical terms, V is an element of the sigma algebra representing the realizations of the inputs I(t), for example an instance of Brownian motion input).
The action for the phase model as derived in the Methods has the form represents the contribution of the transformed neuron density to the action and represents the global synaptic drive. The action (11) contains all the information about the statistics of the network. Given the action, mean field theory and a perturbative expansion around mean field theory can be derived using standard methods developed in field theory. The mean field equations, which are given by a critical point of the action, are given by (8), which for parameters a and V are rewritten as For the phase model, we can solve (12) directly for r(h,a,V,t) to obtain where r 0 is the initial distribution. In this case, the functional form given above is also the general (non-mean field) solution, upon replacing r 0 with g 0 . Recall that r 0 is the population distribution averaged over the ensemble of prepared networks. If the neurons are distributed uniformly in phase, then r 0 ! 1 2p . In this case, the global activity does not affect the phase distribution. On the other hand, if the neurons are always prepared at the same phase, then r 0 !d(h{h 0 ), where h 0 is the prepared phase. In this case the neurons will remain in phase.
Solving for r allows us to write a closed integro-differential equation for the synaptic drive Note that as long as r 0 is known, this mean field equation reduces the system from a partial differential equation to a two dimensional ODE, namely: The population behavior is reduced to the synaptic drive dynamics along with the dynamics of a fictitious oscillator H(t). This is the result of the fact that the only important dynamical quantity is the overall phase shift of each neuron from its initial phase and that this quantity is the same for each neuron. Knowing the initial distribution of states is therefore enough to reduce the dimensionality of the system. The steepest descent expansions to the path integrals will be expressed in terms of the propagators or linear response functions D, which appear as the inverses of the integral kernels of the bilinear terms in the actions. The linear response can be derived to order 1=N by linearizing about the solutions of the mean field equation. Because there are two fields in the action (synaptic drive and density), there are four separate propagators: where D a b describes the response in the quantity a to a perturbation in the quantity b, y~Q{r denotes perturbations around the mean field solution, x~(h,a,V,t) and x p~( p,a,V,t). The equations for b~y reflect a perturbation that consists of adding a single neuron to the population with the specified initial condition and parameters.
If we assume a constant input I V (t)~I V then in order to have a steady-state, the mean field must satisfy for a fixed parameter probability density g(a,V), where e~1{ a a 2p , and a a and I I V are the means of a and I V under the distribution g(a,V). The linear response around this solution is which we can immediately solve in closed form to obtain where t k are the firing times of the fictitious oscillator H(t) with initial condition H(t')~h', and is determined by . D u u is the expected form of the linear response upon perturbing the synaptic drive u(t), i.e. exponential decay. The response of u(t) to the population density y, D y u , is a series of exponential pulses at the firing times of the additional neuron, which is what we would expect if we added a single neuron at a given phase. The other propagators govern the response of the population. Since the distribution is uniform and the firing rate does not depend upon phase, perturbing the synaptic drive only makes the entire population fire faster, but does not change the relative phase, thus D u y~0 . On the other hand, adding a single neuron adjusts the population density by a single delta function at the location of the new neuron, hence the form of D y y . The fact that single oscillator perturbations are not damped away by the linear response is an indication that the stationary state is marginally stable. We expect that finite size effects at the next order will stabilize these marginal modes assuming there is some degree of heterogeneity similar to what happens in the Kuramoto model [41,42].
As described in Methods, the expansion of any n-neuron correlation function in powers of 1=N can be computed from the linear response and the ''vertices'' derived from the action S.
Here we give the lowest order contribution to the 2-neuron correlation functions. In addition, this will give us the firing rate fluctuations. For the fluctuations in the synaptic drive du(t) about an arbitrary mean field state u u(t), the diagrams at tree level (O(1=N)) give where ds p~d V''da''dt''. Inserting the expressions for the linear response in the stationary state (15) we obtain : for twt', where the dt k are determined by 2pk~(I V' za' u u 0 )Dt k and d k,0 is the Kronecker delta. The reason for the Kronecker delta term is to account for the limiting process which defines the interaction vertex. Essentially only half the neurons within the vicinity of firing will contribute to the first cycle of firing (about half are above threshold, half under). On subsequent cycles, all neurons will contribute. This issue arises here because of an ambiguity of the continuum representation we are using. The vertex only measures those neurons which have passed threshold, whereas the linear response from (15) considers the limiting behavior of neurons initially configured in the neighborhood of some phase h' (consider the last equation in (15)). If the distribution g(a,V) is smooth, it is more convenient to compute Performing the time integration gives The equal time correlation function has a simpler form: This correlation function quantifies the fluctuations in the global coupling variable u(t) as a function of time. Recall that we defined an initial state in which each neuron is statistically independent in phase and parameters. The time t is the interval elapsed since the network was in that initial state. Sdu(t) 2 T is a measure of the expected variance of the synaptic drive from the mean u(t) at time t. As mentioned above, due to the fact that higher moments of u(t) will be suppressed by higher powers of 1=N, this is also an estimate of the variance of the global coupling as a function of time t from a known mean field configuration. Because the linear response has a Finite Size Effects in Spiking Neural Networks spectrum which includes the spectrum of the single neuron activity, we expect behavior characteristic of the time scales of single neuron dynamics to appear. We now turn to the correlations in the density variable g(h,V,a,t). As discussed in the Methods section, these are given by (let twt' and dg~g{r) The first term is given by expressions derived above. The second term is of the same form as the correlation of the synaptic drive variable.
The above is the general expression. For the fluctuations about steady state, D u y~0 from (15), giving the simple relation which is just the negative of the product of the mean field steady state solutions at each argument x,x' times a factor of 1 N . This term is due to the factor 1{ 1 N from the sampling correction in the two-neuron distribution function (see equation (9) and below). The 2-neuron distribution function is given by At equal times (t~t') we have which shows that (22) is the correction term for the normalization of the two-neuron distribution function. So for the case of the simple phase model, the fluctuations in the density about steady state are given by the sampling fluctuations from the steady state distribution. Note that for large N this means that the variance of the number of neurons at firing (h~p) is equal to the mean times a factor of 1 N , which is equivalent to the Poisson counting assumption of Brunel-Hakim [17]. As we will show in the next section, this will not hold in general. Note the form of the linear response (15) for the term D u y . The fact that the linear response D u y~0 , eliminated the first term in (21), which is the contribution to the fluctuations from the coupling. Comparing to the general form of the linear response (13), we see that the equation for D u y has a source term proportional L h r. Because the phase model has a uniform steady state, this source term is zero. For a model with a non-uniform steady state (such as the quadratic integrate-and-fire model, which we examine in the next section) this will not be the case, and there will be further corrections to the fluctuations in g. It occurs in the phase model because perturbations in the synaptic drive do not perturb the density in steady state. Thus the only fluctuations of the density in steady state are from the sampling fluctuations.
The correlation function between the global coupling and the density is given by (with twt').
Again, the first term is composed of factors derived above. The remaining unique term is given by |r(p,a 00 ,V 00 ,t 00 )zb In steady state, this term is where t m~m in(t,t'), the t k are defined such that h'{p{½I V 'za' u u 0 (t'{t k )~2pk, and k max is the largest k such that t k vt m . The firing rate of the population is given by The mean field solution for this is and in steady state we have Finite Size Effects in Spiking Neural Networks The second moment of the firing rate is given by Using our expression for the variance of g in steady state, we have where dn~n{ n n and the Dt k are such that 2pk~½I V za u u 0 Dt k . At equal time we have the simple form which is equivalent to the Poisson finite size ansatz. The delta function evaluated at zero is a singularity which arises upon attempting to isolate a counting process at a single point on the real line. This can be regularized by considering an estimate of this quantity in a time interval Dt. The variance in the counts will vary as where Dn~ð tzDt=2 t{Dt=2 dn(t') dt'. This indicates that the population firing rate will appear as that from a population of independent Poisson neurons even though the individual neurons are regular. For intuition as to why this is the case, consider dividing up the interval ({p,p) into bins of equal size and distributing N neurons into these bins. This is the initial state of the network when initialized in steady state. The distribution of the neuron counts in each bin will follow a hypergeometric distribution. In the limit of small bin size and large N, the number of neurons in each bin will approximate a Poisson distribution. The factor of 1 N arises from normalizing the coupling by 1 N . Recall that the absence of any other correction is an artifact of the uniformity of the steady state of the phase model. This will not be the case for the quadratic integrateand-fire model. Figure 1 shows comparisons between our analytical predictions and numerical simulations. In (a) through (d), the network the parameters I V and a are constant and homogeneous (i.e. g(a,V)~d(a{ a a)d(I V { I I)). Figure 1 (a)-(c) shows examples of the variance of the synaptic drive as a function of time. As seen in the figures, the correlation function has contributions that appear at the firing times of the fictitious oscillator H(t) (Recall that H(t) is a function which parameterizes the linear response). Each such ''firing event'' produces a new positive transient response in the correlation function. As t??, each firing event produces ever smaller perturbations as the correlation approaches steady state. Note also in those figures that the analytic computation at order 1=N becomes better as N grows larger, and that the overall magnitude scales as 1=N. Deviations are observable for small N, particularly for the case I~b~a~1:00. Note also the firing rate of the fictitious oscillator increases as the population input increases. Comparison of numerical and analytic results for NSdn(t) 2 T is shown in Figure 1 (d). We measured this quantity by binning the firing counts in a time window Dt and have also subtracted the ''Poisson'' contribution. The analytic result is the first term from equation (33). Figure 1 (e) shows the two-time correlation function Sdu(t)du(t')T, where we have fixed t'~100. As expected by our prediction in equation (18), the oscillations are much more pronounced. Figure 1 (f) shows the effects of heterogeneity on the synaptic drive. The drive distribution was chosen to be uniform, with inputs to each neuron chosen from the interval ½0:5,1:5). The oscillations in the synaptic drive are damped by the heterogeneity and there is an effective increase in the mean drive fluctuations as expected from the theory. In this case the heterogeneity clearly dominates as a contribution to the fluctuations, as can be seen by comparing figures 1 (a) and 1 (f), which differ by close to a factor of four in steady state.

The quadratic integrate-and-fire model
The second model we analyze is the quadratic integrate-and-fire model, whose single neuron dynamics are given by This model exhibits a finite-time blow-up that is considered to be ''firing'' at which point the neuron's membrane potential v i (t) is reset to {?. We couple the neurons in the same manner as in the phase model with the synaptic drive u(t). Ermentrout and Kopell mapped this model to an oscillator using the transformation This form of the model is often called the theta model [55]. Hence, the function F is given by: A convenient feature of this model is that neurons cross the firing phase h~p at a constant rate _ h h~2. Defining the neuron density in the same way as before the continuity equation is The action, constructed according to the procedure outlined in the Methods section, is where the population part of the action is Finite Size Effects in Spiking Neural Networks Mean field theory is given by Note that because the firing rate is constant at h~p, the input to the synaptic drive is only dependent upon r(p,V,a,t) and not directly on the synaptic drive itself. It is useful to examine the steady state of this model in some detail. For a constant drive I V (t)~I V , the steady state obeys For (I V za u u 0 )w1, this solution is a unimodal distribution peaked at h~p whose width narrows in proportion to the size of the input. Conversely, for (I V za u u 0 )v1, the peak is at h~0. The higher the input, the more likely it is that any given neuron will be found near the firing phase, h~p. The synaptic drive variable must satisfy a consistency condition: This equation can be viewed as the steady state solution to a Wilson-Cowan type rate equation. The firing rate for the quadratic integrate-and-fire model is given, in the mean field approximation, by n n(t)~2 ð dVdar(p,V,a,t) ð44Þ In steady-state we have n n(t)~2 so that we can identify ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi ffi I V za u u 0 p as the ''gain'' function for the neurons of type (V,a).
The linear response for the coupled theta model is given by the equations: where again x~(h,V,a,t) and x p~( p,V,a,t).
Consider the steady state and transform the angle variable for each a,V with Then we have dw dh~2 This change of variables makes the steady state uniform in h for each V,a. The equations for the linear response in steady state in terms of w are where y~(w,V,a,t) and y p~( p,V,a,t) (note that w(p)~p). The linear response for the theta model is most easily expressed in terms of the Laplace variable s and is given by D u u is similar to the linear response of the synaptic drive in the phase model with the addition of the feedback response of the population through the filter Q(s). D y u is the same as in the phase model with this transformation. It is a series of pulses with the pulse shape given by the linear response and the pulse times determined by the firing times of a fictitious oscillator driven at rate n(a,V,u 0 ).
We also have

Finite Size Effects in Spiking Neural Networks
These results produce the primary qualitative difference between the phase and the theta models. The first term in D y y is analogous to the phase model calculation. It represents a perturbation of adding a single oscillator with initial coordinate w' evolving at rate n(a,V,u 0 ). The second term and the non-zero value of D u y arise from the non-uniform distribution of the steady state, which arises from the functional dependence on h of the neural input function. This term produces deviations from the ''Poisson'' behavior of the firing rate fluctuations.
We can use these expressions to compute the tree level correlations with: with t m~m in(t,t'). The other correlation functions are given by {N ð dw 00 dV 00 da 00 D y y (x; w 00 ,V 00 ,a 00 ,t 0 )r w,V 00 ,a 00 ,t 0 ð Þ Þ | ð dw 00 dV 00 da 00 D y y (x 0 ; w 00 ,V 00 ,a 00 ,t 0 )r(w,V 00 ,a 00 ,t 0 ) and Sdu(t)y(w',V',t')T~2b These are more difficult to put in closed form, other than in terms of the response function for the synaptic drive. Instead we show numerical results. We can use the linear response formulas above to compute analytic formula for steady state. Changing coordinates and using the steady state mean field values we have where the Laplace transform of U(t) is given by and n n 0~1 2p is the firing rate of the population in steady state. The correlations in the synaptic drive variable has the same basic form as that of the phase model. Because of the structure of D y u it will also have the same pulse behavior at an interval defined by a fictitious oscillator evolving according to the population activity. The primary difference is the replacement of the response function for the synaptic drive with the response for the theta coupling and the firing rate with the theta model firing rate.
The two-neuron density function, by contrast, is different by virtue of the non-uniform nature of the steady state. In this case, D u y =0 so there will be a contribution at first order in the perturbation expansion (i.e. tree level) to the density fluctuations.
Similarly, there is an extra term for the correlation function Sdu(t)y(x)T. Each of these correlation functions is only computable in closed form in terms of the response functions, which we compute numerically.
The firing rate fluctuations for the theta model are simpler than the phase model because the input for each neuron is the constant 2 at h~p. For the firing rate obeying the second moment of the firing rate is for twt'. The equal time second moment is given by where the d(0) term has the same meaning as in the phase model. In the phase model case, the analogous expression to the first term on the right hand side was zero, and the population firing rate appeared to be the firing rate of the average of N Poisson firing neurons. In the theta model case, however, there is a correction of order 1=N. From (52), it is simple to show that the firing rate fluctuations in a bin of size Dt obey Comparisons between analytic and numerical results for the quadratic integrate-and-fire model are given in Figure 2. In (a) through (e), the parameters I V and a are constant and homogeneous. One can see the qualitative similarity between the phase and quadratic integrate-and-fire models in the behavior of the activity correlations, Sdu(t) 2 T. Both share the same pulsatile behavior driven by the fictitious oscillator, i.e. both show the spectral characteristics inherited from the single neuron dynamics.

Finite Size Effects in Spiking Neural Networks
The density fluctuations, however, have an effect on the fluctuations in the firing rate. These effects can be seen in Figures 2 (c), (d). In addition to the nontrivial firing rate fluctuation dynamics, the quadratic integrate-and-fire model also shows nearcritical behavior, owing to the phase transition between the ''asynchronous state'' and synchronous firing. For a population with no external drive, this transition occurs at a~0. With I~0, as in Figure 2 (c,) this represents a configuration in which the system is usually not firing, but with the occasional neuron moving across threshold. The reader is encouraged to draw an analogy with ''avalanche'' dynamics, in which the population will briefly fire in bursts and then go silent. While there is a small but fixed average firing rate, the fluctuations are large owing to this transient behavior. Even a small drive will regularize the system, as in Figure 2 (d). The finite size expansion is expected to break down near a phase transition, accordingly here it is expected to breakdown at the onset of synchrony. The breakdown of the expansion is evident in Figure 2 (c), where one can see enormous discrepancy between the analytic and numerical computations. Figure 2 (e) shows the two-time correlation function Sdu(t)du(t')T where t'~100. Figure 2 (f) shows the effects of heterogeneity on the synaptic drive, where the drive distribution was chosen to be uniform, with inputs to each neuron chosen from the interval ½0:5,1:5). The oscillations in the synaptic drive are damped and there is an effective increase in the mean drive fluctuations as expected from the theory. Again the heterogeneity is the dominant contribution to the fluctuations, as can be seen by comparing figures 2 (a) and 2 (f), which differ by close to a factor of six in steady state. Figure 3 shows a comparison of the firing rate fluctuations. In contrast to the Phase Model, there is non-trivial temporal behavior owing to the phase dependence of the neuron dynamics.

Discussion
We have constructed a system size expansion for the density formulation of spiking neural networks and computed the fluctuations and correlations of network variables to lowest order. In particular, we explicitly calculate two-neuron and higher order moments in the network. We have demonstrated our method in globally coupled networks with two different neuron types. We note that all the fluctuations and correlations are ''finite-size'' effects, i.e. they do not exist in mean field theory. There will also be finite-size effects on the mean firing rate and synaptic drive, which could also be calculated using our methods. However, in the systems we studied, the finite-size corrections to the mean field density in the steady state are necessarily zero by neuron conservation. The steady state is uniform and the fluctuation effects will not (for these models) break the symmetry.
The method is based on the Klimontovich equation, which is an exact formal continuity equation for the finite-size neuron density. Solutions to the Klimontovich equation only exist in the weak or distributional sense because the neuron density is a collection of Dirac delta functionals and is not differentiable. In the limit of infinite system size, it can be shown that under some conditions, the neuron density becomes a smooth function that obeys a strong continuity equation called the Vlasov equation [45,46] that describes the mean field dynamics of the system. Previous work on large networks of coupled oscillators took the infinite system size limit immediately and started with the Vlasov equation [11,15,16]. If the oscillators are subjected to white noise, then the Vlasov equation becomes the McKean-Vlasov equation [12][13][14]54,56], which has sometimes been erroneously called a nonlinear Fokker-Planck equation. Recent work has put these density mean field methods onto a rigorous mathematical footing [28][29][30][31]. These authors prove that under reasonable assumptions, a network of stochastically coupled neurons under various conditions conditions will obey the McKean-Vlasov equation (Vlasov equation with diffusion) in the mean field limit. The network obeys the ''propagation of chaos'' property where neurons that are initially statistically independent will remain independent and the fluctuations are purely Gaussian. They also show that a self-consistent set of moment equations for the mean and variance when stochastically forced.
Our approach is based on the traditional Gibbs picture of statistical mechanics, to wit: the variability in the dynamics (in the absence of externally supplied noise or explicitly probabilistic dynamics) is a reflection of the distribution of ''microscopic dynamics'' which are consistent with the ''macroscopic dynamics'', population level variables such as the global coupling, u(t). The fluctuations in the firing rate n(t) arise from the variability across neuron distributions which are approximately consistent with the mean field value. Those variables which converge to well defined values as N?? define the set of ''macroscopic'' variables. In the examples we have shown, the global coupling u(t) and the population density r(h) are considered macroscopic. In a more general network, such as one with heterogeneous coupling, the identification of macroscopic variables is likely to be a more complex issue. Put another way, in our simple cases there is a clear sense of the ''typical'' system for large N to which all initial conditions and parameters approach. There is no general requirement that ''typical'' systems exist.
The Gibbs picture is realized by taking the ensemble average of the Klimontovich equation, which leads to a moment hierarchy where lower ordered moments (or cumulants) of the neuron density depend on higher order moments. The moment hierarchy is an exact ensemble averaged description of the finite-size system. However, in general, solving the moment equations is as difficult if not more difficult than integrating the original system directly. For systems with a well defined large N limit, the moments, such as the two-neuron correlation function, represent the finite size effects. Estimates for the moments can be obtained by truncating the moment hierarchy and solving a reduced system of equations, wherein 1=N is a natural expansion parameter.
A truncated moment hierarchy is still unwieldy to solve. Our approach is to compute the moments directly by constructing a formal expression for the probability density functional of this distribution. This density functional is a ''doubly'' infinite dimensional object since its elements are infinite dimensional functions. Its formal construction hinges on the fact that it is proportional to a point mass (in an infinite dimensional functional space) located at a population density function that obeys the Klimontovich equation. Intuitively, this can be thought of as a Dirac delta functional with the Klimontovich operator as an argument. This expression is rendered computationally useful by noting that a Dirac delta functional in infinite dimensions has a Laplace transform representation where the integration is over a space of functions or fields and a set of imaginary response fields corresponding to the Laplace transform variables. The exponent of the integrand is called the action and fully specifies the distribution over the neuron density and synaptic drive.
Methods developed in quantum and statistical field theory are then employed to construct perturbative expansions for desired quantities such as moments. The expansions use the infinite dimensional analogue of the method of steepest descents. The action is expanded around a critical point at which the gradient is zero. The critical point condition yields mean field theory. The first order correction, or tree level, expands the action to quadratic  order yielding an infinite dimensional Gaussian integral. The integral has a closed form expression in terms of the inverse of the Hessian matrix, which is analogous to the inverse of the covariance matrix of a finite dimensional normal distribution. Just as in a finite dimensional steepest descent expansion, the terms in the perturbative series will be in terms of the elements of the inverse of the Hessian matrix, which in our case correspond to the equations satisfied by the linear responses. Hence, the perturbative expansion of the time dependent moments of the coupled network will be in terms of the linear responses.
Previously, we applied this strategy to the Kuramoto model where oscillators are coupled directly through their phase differences. The corresponding action is a function of the population density together with the response field. The linear response satisfies the linear Vlasov equation. The tree level expression for the second moment of the population density, which captures the fluctuations due to finite-size effects, is identical to a solution of the truncated moment hierarchy known as the Lenard-Balescu solution in plasma physics [45]. Here, we consider a network of neurons coupled via synapses that are triggered whenever a given neuron fires. Hence, the field theory now involves the density and synaptic drive fields with their auxiliary fields. There are now four linear response functions, which makes the computations more complex.
Finite size effects were considered by Brunel and Hakim [17]. They assumed that the connections were sparse enough so that the arrival times of synaptic events at a given neuron would be uncorrelated. They then assumed that these inputs could be modeled by a Poisson process that was scaled by the number of inputs. We considered the opposite regime of a fully connected network. We find that for the phase model, the Poisson ansatz is essentially correct to order 1=N. The theory of coupled diffusions in probability theory provides an explanation called ''propagation of chaos'' where the uncertainty in the initial conditions is propagated forward by the deterministic dynamics of the system [54,56,57].
Our approach generates a natural explanation for Poisson like firing rates in a population of neurons. Indeed, it is a natural consequence of the neurons firing in a stable asynchronous state. The number of neurons firing is the number of neurons out of N randomly chosen that fall into a small bin of size dh around the firing threshold. In the limit of large N, this should follow a Poisson distribution. For this reason, Poisson firing of the population is a natural assumption. However, as we have shown, if the neurons have some phase dependence in their voltage evolution, this will produce fluctuations in the firing rate beyond the simple sampling induced Poisson fluctuations.
The mean field theory for our system is comparable to a differential equation form of the spike response theory [27]. The use of phase oscillators allows for a continuity equation without a jump condition at the boundaries in a threshold crossing integrate-andfire neuron. It may be possible to perform a similar finite size expansion within the spike response theory by incorporating the boundary conditions. The mean field equations have Wilson-Cowan rate equation form in that all the inputs to population activity enters through the firing rate function. This arises because of our choice of the global synaptic drive dynamics where the synaptic inputs of the population are first summed and then ''filtered''. If we had instead chosen synaptic dynamics such that the synaptic inputs are first filtered and then summed, we would arrive at the ''Amari'' formulation of the mean field equations in which the external inputs to the activity equation lie outside of the rate function.
We considered the example of an all-to-all network. In this case, the N-neuron joint distribution for the network obeys exchangeability, which means that the marginalization of the distribution over any set of N{1 neurons yields the same distribution. For such a system, the neuron density function is a complete description of the network. However, we can always write down a neuron density function for any network even if it does not posses an exchange symmetry. For such a situation, the density function still captures useful global dynamics of the network. In the case of heterogeneous neuron parameters, as we considered here, the network is exchangeable in the infinite N mean field limit and close to exchangeable for large but finite N. Hence, our formalism is directly applicable in this case. Such networks are said to be ''self-averaging'' in that the large network can be divided into sub networks, whose average behavior mirrors the full network. However, the situation with heterogeneous connection weights is more complicated. In such a system, it is not certain that the network is self-averaging in the infinite N limit. If so then the mean field equations are not a useful description of the system. An analogy can be drawn to spin glasses, where depending on parameters, the system may or may not be self-averaging. The conditions under which a heterogeneous network of spiking neurons is self-averaging is a question that we wish to pursue in the future.
However, even in the case of a heterogeneous network without self-averaging, we can still apply our formalism if we consider the network to be comprised of local populations which exhibit exchangeability [28][29][30][31]43]. In this case, each local population would be represented by its own neuron density, which are then coupled to other neuron densities. Each local density would obey its own Klimontovich equation and corresponding moment hierarchy. If the local populations are sufficiently large then the hierarchies can be truncated in a finite-size expansion as shown here. However, even if the local populations are not large or even consisting of a single neuron, our formalism could still be applied. A moment hierarchy or density functional for the entire system could still be constructed. Although a perturbation expansion cannot be constructed using the inverse system size as a small parameter, an expansion could still be constructed using some other small parameter such as the inverse of a slow synaptic time constant or the inverse of the number of connections. The mean field limit would consist of a network of coupled local activity fields. This could then be generalized to a network of coupled moment equations such as the activity and correlations. We had previously derived generalized activity equations for an abstract spike count model [43].
There is always a tension in computational neuroscience between detailed realistic models versus simpler reduced models. The main purpose of this work is to build quantitative tools to bridge the gap between the two approaches. We have developed a principled method of coarse-graining a neural system that is relatable to experimentally accessible quantities. Even with the exponential increase in available data and computational power, detailed realistic modeling will still have limitations. For one, a large scale simulation of the brain may not necessarily be easier to understand than the brain itself. An exhaustive exploration of parameter space will be intractable even if Moore's law holds up for centuries. Thus, there will always be a role for theoretical analysis of simple models. However, one of the criticisms of reduced models is that they are ad hoc and cannot be easily linked to the underlying physiology. Hence, there is a need for methods to derive reduced models directly from detailed models. Additionally, one would also like to derive reduced models that can incorporate single neuron effects such as synchronization and correlated firing, which are lost in classical mean field models. This motivated our desire to derive generalized activity equations that include such discrete neuron effects. Applications for generalized activity equations include studying the effects of correlation-based learning rules as seen in spike-timing dependent plasticity, understanding the role of oscillations in motor and sensory processing, and probing the neurophysiological basis of cognitive disorders by analyzing how perturbations to neural parameters affect cortical circuit function.

Action and generating functionals
The population statistics of the network is encoded in a hierarchy of moment functions of the population density, g and the synaptic drive u(t). We now show that these moments can be systematically encoded into a generating functional specified by an action, from which each can be calculated via perturbation theory. The system is fully specified by Equations (2), (5), and (6), which we rewrite as where we have substituted (5) into (2) and the equations are subject to appropriate initial and boundary conditions. We wish to derive a probability density functional P½g,u for the dynamical variables u(t) and g(h,c,t), from which we can derive all statistical measures for the network dynamics. We can factorize the density functional into P½uDgP½g and compute the probabilities separately. The density functionals are usually represented in terms of an action, which for the networks we consider are given by (11) and (39).
The derivation is applicable to any dynamical system, so we derive it for a generic variable x(t) that is governed by the differential equation with an initial probability density for x 0 , P(x 0 ). The dynamical system is fully deterministic and the density functional will describe the ensemble of many such systems starting from different initial conditions. Given the probability density at time t', the probability density at a later time t can be written as where X (t) is the solution of the dynamical system (54) with fixed initial condition X (t')~x(t'). The generating function for the moments of P(x(t)) is given by the Laplace transform of P(x(t)): where the variablex x(t) is called the ''response field''. The moments are obtained from the generating function by taking derivatives with respect tox x(t') and settingx x(t') to zero. The natural log of the generating function is called the cumulant or connected generating function. Derivatives of ln Z generate cumulants, i.e. those contributions to the moments which cannot be factored into products of smaller moments. The nonlinear terms in ln Z therefore represent the ''noise'' or correlations in the distribution being represented. For example, the connected generating function for a Gaussian with mean m and variance s 2 is ln Z½l~mlz 1 2 s 2 l 2 and for a Poisson distribution with mean m it is ln Z½l~m e l {1 À Á . Inserting (55) into (56) yields where we have used the inverse Laplace transform for P(x(t')).
Setting t~t'zDt and taking Dt to be much smaller than any time scale in (54) allows us to write the solution as an Euler step which leads to Any given time interval t[½t 0 ,T) can be divided into M subintervals of length Dt. Repeated application of (58) then expresses the generating functional at time t~T as where t j~t0 zjDt. Taking the M?? limit gives the functional or path integral where the measure is defined as with thex x(t i ) integrations following a contour parallel to the imaginary axis and the x(t j ) integrations following a contour parallel to the real axis. The action S½x x(t),x(t) is where we have integrated by parts and expressed the initial generating functional in terms of the cumulant generating functional W ½x x(t 0 )~ln Z½x x(t 0 ). Note that the bracketed term is the left hand side of the differential equation (54). This property is generic and provides a short cut for deriving the action. Because the initial distribution is normalized we have The path integral thus defines a normalized measure wheñ x x(T)~0. The generating function for the synaptic drive u(t) will directly follow this prescription, where the initial probability density P(u(0)) is for similarly prepared networks. The action will have the form of (59), with (29) replacing the ODE for x. For the population density g, the generating function becomes a generating functional and the expectation value which defines it is a functional integral over the possible values of the field g(h,V,t). Again we introduce a response fieldg g(h,V,t) in order to define where the expectation value is taken over the ensemble of similarly prepared networks. The experimental preparation of the network is equivalent to choosing the initial network configuration from some ensemble distribution. We will address the exact form of this distribution below, but for now it will suffice to note that this implies an initial generating functional for the initial time t 0 . We focus on the time evolution of Z here. The derivation above in terms of the single variable x(t) works equally well in the network case (consider the arguments to the field as indices for a configuration vectorx x(t)). The probability density functional of g at a t is given by where g(h,t) is the solution to the Klimontovich equation (6). This produces the probability density functional P½g,g g~e {S½g,g g ð63Þ with action S½g,g g~N The action completely defines the system and all moments of the ensemble distribution can be computed from it. However, in general, closed form expressions will not be possible and thus perturbation theory is used. The appearance of the factor of N tells us immediately how to calculate finite size corrections to the infinite N network in terms of a perturbative expansion in 1=N (cf. a steepest descent evaluation of a standard integral with integrand !e Nf (x) for large N). For the coupled system, we have both the synaptic variable u(t) as well as the population density g. The action for the coupled system is just the sum of the actions for the variables u(t) and g.
S~S½ũ u(t),u(t)zS½g g(h,c,t),g(h,c,t) ð 65Þ where coupling terms have been included implicitly in the dependence of u and g upon each other. Initial distributions. The initial values of the generating functions will be determined by the ensemble distribution of the Finite Size Effects in Spiking Neural Networks initial state. In the simplest case, we will consider that the initial state of the synaptic drive variable u(t 0 ) is fixed; furthermore, we will choose it to be fixed at u(t 0 )~0. This means that W ½ũ u(t 0 )~ũ u(t 0 )u 0~0 The initial state of the population density g is imposed by the Nneuron distribution of the initial state of the network, r(h h,c c) (note that we use the terminology of plasma physics where the ensemble distribution is equivalent to an N variable joint probability density function). In order to compute the initial state of Z, we must compute the following ensemble average over this distribution.
Using the definition of the population density g (equation (4)), we can write where the index i runs over the neurons in the network. This means that the initial generating functional Z is equivalent to which is the generating function for the ensemble distribution specified by r(h h,c c). Consider an initial distribution that is independent for each neuron, which means that r factors into a product over all neurons in the network. Thus where r 1 is the one-neuron distribution function marginalized from the N-neuron distribution. We choose the notational convention r~r 1 . The first term of an expansion of the logarithm in (69) about r 1~0 gives precisely the term which would appear in a Poisson distribution. The remaining terms account for the sampling corrections to the n-neuron distributions due to a finite number of neurons. If the neurons are not prepared independently then the expressions for the connected n-neuron distributions (such as C) will appear as coefficients of powers ofg g in the exponent along with a combinatoric factor of N n , e.g.
Doi-Peliti-Janssen transformation. Just as the nonlinear terms in the cumulant generating function ln Z½l are the ''noise'' terms, the nonlinear terms in the response fieldsũ u andg g in the actions S determine the correlations in the fields u(t) and g. Since the dynamics are deterministic, the initial distribution for the ensemble is the only part of the action which provides non-trivial correlations. This follows because the only introduction of ''noise'' per se has been through the the fact that the initial conditions and parameters of the network are drawn from a distribution. However, once those are decided, the dynamics are fixed and completely deterministic. It is difficult to compute the effects of fluctuations due to the initial state because the term eg g {1 that appears in the initial generating functional. This term is ''linearized'' by a transformation similar to a Cole-Hopf transformation, which we call the Doi-Peliti-Jannsen transformation [58], given by Q(h,c,t)~g(h,c,t)e {g g(h,c,t) The form ofQ Q is specified by the Poisson distribution, while the form of Q is derived by imposing the requirement that the transformation preserves bilinear derivative forms, i.e. g gL t g?Q QL t Qzboundary terms ð72Þ These boundary terms do not contribute to the moments and can be ignored. The Doi-Peliti-Jannsen transformation replaces the Poisson term eg g(h,c,t) {1 in the action withQ Q(h,c,t). Hence, an action which is bilinear in Q,Q Q represents a Markov counting process whose solution is a Poisson distribution with mean Q~SgT.
In a more general case, the Doi-Peliti Janssen transformation provides an elegant means of expanding around Poisson solutions and is thus useful for models whose statistics should be near Poisson, such as population densities in networks, in which the statistics are essentially coupled counting processes, though not simple ones. The moments of the variables Q are the joint distributions of g with the finite size sampling corrections removed. We call these moments factorial moments or normal ordered moments, borrowing the terminology from the field theory literature. The moments of Q do not include the effects of coincident indices, which is to say they are moments from a distribution without replacement (i.e. there is no probability of drawing the same neuron twice). The distribution implied by the moments of g is derived from drawing with replacement.

Feynman Rules for neural models
The neural models we describe have two different fields, one for the synaptic drive variable and one for the density variable (along with the response field counterparts). As above, the class of models we consider is given by Each term in the action (post expansion) containing anything other than precisely one response field and one configuration field is called a ''vertex'' term because these terms constrain the types of vertices for our diagrams. The terms with one response field and one configuration field are linear responses and correspond to edges of the graphs. For our models, the linear responses are the  Finite Size Effects in Spiking Neural Networks