Estimation of neural network model parameters from local field potentials (LFPs)

Most modeling in systems neuroscience has been descriptive where neural representations such as ‘receptive fields’, have been found by statistically correlating neural activity to sensory input. In the traditional physics approach to modelling, hypotheses are represented by mechanistic models based on the underlying building blocks of the system, and candidate models are validated by comparing with experiments. Until now validation of mechanistic cortical network models has been based on comparison with neuronal spikes, found from the high-frequency part of extracellular electrical potentials. In this computational study we investigated to what extent the low-frequency part of the signal, the local field potential (LFP), can be used to validate and infer properties of mechanistic cortical network models. In particular, we asked the question whether the LFP can be used to accurately estimate synaptic connection weights in the underlying network. We considered the thoroughly analysed Brunel network comprising an excitatory and an inhibitory population of recurrently connected integrate-and-fire (LIF) neurons. This model exhibits a high diversity of spiking network dynamics depending on the values of only three network parameters. The LFP generated by the network was computed using a hybrid scheme where spikes computed from the point-neuron network were replayed on biophysically detailed multicompartmental neurons. We assessed how accurately the three model parameters could be estimated from power spectra of stationary ‘background’ LFP signals by application of convolutional neural nets (CNNs). All network parameters could be very accurately estimated, suggesting that LFPs indeed can be used for network model validation.


Introduction
The traditional physics approach to modeling typically involves four steps: (i) A hypothesis is formulated in terms of a candidate mechanistic mathematical model, that is, a model based on interactions between building blocks of the system, (ii) predictions of experimentally measurable quantities are calculated from the model, (iii) the predictions are compared with experiments, and (iv) if necessary, the hypothesis is adjusted, that is, a new candidate model is proposed. In neuroscience, a descriptive or statistical approach has been more common, in particular in systems neuroscience aiming to understand neural network behaviour in vivo.
Here statistical techniques are used to look, for example, for correlations between measured neural activity and sensory stimuli presented to the animal to estimate receptive fields [1,Ch. 2]. While descriptive models can inform us about neural representations in various brain areas, mechanistic models may explain the biological mechanisms underlying these representations [2].
Starting with Hodgkin and Huxley's development of a mechanistic model for action-potential generation and propagation in squid giant axons [3], mechanistic modeling of neurons is now well established [1,4,5]. Numerous biophysically detailed neuron models tailored to model specific neuron types have been constructed, for example, for cells in mammalian sensory cortex [6,7], hippocampus [8] and thalamus [9,10]. Further, simplified mechanistic point-neuron models of the integrate-and-fire type excellently mimicking experimental data, have been constructed [11,12]. At present, mechanistic network models mimicking specific neural circuits are scarce, however. For small networks like the circuit in the crustacean stomatogastric nervous system comprising a few tens of neurons, some excellent models have been developed [13]. For cortical networks important pioneering efforts to construct comprehensive networks with tens of thousands of neurons mimicking cortical columns in mammalian sensory cortices, have been pursued, e.g., [7,[14][15][16][17]. These models were found to predict spiking activity in rough qualitative accordance with some observed population phenomena (spiking statistics, spike oscillations, . . .). Fitting of cortical network models to trial-averaged multiunit activity (MUA) recorded in somatosensory cortex has been pursued for population firingrate models [18]. However, we do not yet have validated, general-purpose network models that accurately predict experimentally recorded neural activity both in the various 'background' states and as a response to sensory stimulation.
The cortical models above have been compared with experimental spiking activity, that is, the high-frequency part of extracellular electrical potentials. The low-frequency part, the local field potential (LFP), in contrast largely reflects how synaptic inputs are processed by dendrites in the populations of neurons surrounding the electrode contacts [19][20][21]. Several methods for analysis of cortical LFP signals have been developed, see [20,21] for reviews. LFP signals have also been used to validate network models both in cortex [22][23][24] and hippocampus [25][26][27]. However, a systematic exploration of how informative the LFP signal is for network model validation, has not yet been pursued.
In the present work we explore to what extent the LFP signal generated by a neuronal network model can be used to extract the connectivity parameters of the same network. As a model network we consider the so-called Brunel network comprising an excitatory and an inhibitory population of recurrently connected integrate-and-fire (LIF) neurons [28]. Point neurons do not generate extracellular potentials, however, and to compute corresponding LFPs we use a hybrid LFP scheme [29]: First the spiking activity is computed by use of the simulator NEST [30], and next the computed spikes are replayed as presynaptic spikes onto biophysically detailed multicompartmental neuron models to compute the LFP using LFPy [31,32]. The LFP generated by a network depends crucially on the level of temporal correlations of synaptic input onto the neurons [29,[33][34][35]. Thus the LFPs generated by the Brunel network will, as the spiking activity, vary strongly between the different network states as obtained for different choices of network model parameters.
We assess how well network model parameters can be estimated from the stationary 'background' LFP signal. For this, we first train convolutional neural nets (CNNs) [36] with LFP training data for which the underlying model parameters are known, and then test the accuracy of parameter estimation on a separate set of LFP test data. As it turns out, a relatively simple CNN is sufficient for the task and is indeed found to accurately estimate the network model parameters. Thus for the present example, the LFP signal contains sufficient information to accurately recover the underlying model parameters. This suggest that not only spiking data, but also LFPs, can be used to validate candidate network models.

Point-neuron network model
The Brunel network [28] consists of two local populations, one with excitatory and one with inhibitory neurons. These populations of size N E and N I , respectively, consist of leaky integrate-and-fire (LIF) neurons interconnected with current-based delta-shaped synapses. Inputs from external connections are modeled as uncorrelated excitatory synaptic input currents with activation governed by a fixed-rate Poisson process with rate ν ext .
The sub-threshold dynamics of the point-neurons obey a first-order differential equation, cf. Equation (1) and Equation (2) in Table 1. When the membrane potential of a neuron reaches its firing threshold θ, the neuron emits a spike, the synapses onto all its postsynaptic neurons are activated after a time delay t d , and the neuron's membrane potential is clamped to a potential V reset for a refractory period of t ref . Each neuron receives a fixed number of incoming connections (fixed in-degree) from a fraction � of all other local neurons in the network in addition to the external input. The synaptic connection strengths are constant for each population, for excitatory neurons and external input it is given by J E = J and for inhibitory neurons J I = −gJ. The amount of input the local neurons receive from the external population is determined by the parameter η = ν ext /ν thr , where ν thr = θ/(Jτ m ) is the minimum constant rate input that by itself will drive a neuron to its firing threshold, and τ m is the membrane time constant. A complete description of the point-network model is given in Table 1, with specific parameter values given in Table 2.

Forward-model predictions of LFPs
In order to compute local field potentials (LFPs) from the point-neuron network, we utilized the recently introduced 'hybrid LFP scheme' [29] (github.com/INM-6/hybridLFPy), illustrated in Fig 1. The scheme allows for the decoupling of the simulation of spiking dynamics (here computed using point neurons) and predictions of extracellularly recorded LFPs. The latter part relies on reconstructed cell morphologies and multicompartment modeling in combination with an electrostatic forward model. As the complete description of the scheme (including the biophysics-based forward model) and its application with a cortical microcircuit model [15] is given in [29], we here only briefly summarize the main steps taken to predict LFPs from the two-population network described above: To represent each network population we chose one layer-4 pyramidal neuron and one interneuron reconstruction for the excitatory and inhibitory populations, respectively ( Fig 1B). The corresponding morphology files L4E_53rpy1_cut.hoc and L4I_oi26rbc1.hoc were also used in [29] (cf. their Table 7), but the apical dendrite of the pyramidal neuron was cut to make it shorter to better fit our smaller column. The somatic positions of all N E + N I neurons were drawn randomly with homogeneous probability within a cylinder with radius r = 564 μm and height Δz = 100 μm, z between -450 and -350 μm (Fig 1B). This depth spread of 100 μm is in qualitative agreement with what can be expected for a laminarly organized population in rodent cortices. Each excitatory cell where τ m is the membrane time constant, V the membrane potential, R m the membrane resistance, and I the synaptic inputs. • Reset + refractoriness:

Type
Delta-shaped postsynaptic current where the first sum is over all the presynaptic neurons j, including the external ones, and the second sum is over the spike times of those neurons. t j l is the lth spike of presynaptic neuron j, and t d is the synaptic delay. δ denotes the Dirac delta function. morphology was oriented with their apical dendrite pointing upwards in the direction of the positive z-axis and rotated with a random angle around that axis, while inhibitory neurons were rotated randomly around all three axes. The membranes of each morphology were fully passive, with the same membrane time constant τ m as in the point-neuron network.
In the present hybrid scheme the activity in the LFP-generating populations of multicompartment neurons are obtained by mapping spikes generated by individual LIF neurons in the point-neuron network to synapse activation times at specific positions on their equivalent multicompartment neurons. To obtain the synaptic connectivity onto the different positions on the morphologies of the multicompartment neurons, we defined an 'upper' and 'lower' layer (homologous to e.g., layer 2/3 and 4) on the depth intervals [0, z 1 ) and [z 1 , z 2 ), see Fig 1B. The layer-specific connection probability [29, p. 4470-4473] was set so that half of the excitatory synapses onto excitatory neurons were in the upper layer and half in the lower layer. In contrast all inhibitory synapses and excitatory synapses onto inhibitory neurons were positioned in the lower layer. Within each layer, the probabilities for synaptic connections were proportional to the surface area of each compartment normalized by the total compartment surface area within the layer. Only inhibitory synapses were allowed on the soma compartments. The per-neuron synaptic in-degrees were preserved from the network. Since the singular delta-shaped postsynaptic currents (PSCs) cannot be exactly represented in the multicompartment neuron modeling as in the point-neuron network simulations, alpha-function shaped PSCs (Equation (7) in Table 3) with synaptic time constant τ s were used instead. The amplitude of the PSCs was chosen so that the total transferred charge is equal for both synapse types (thus preserving the total synaptic input current between the network and multicompartment neurons). A full description of the multi-compartment neuron model is given in Tables  3 and 4. The presently used choice of current-based synapses and morphologies with passive membranes in the multicompartment neuron models introduces a linear relationship between any presynaptic spike event and contributions to the LFP resulting from evoked currents in all postsynaptic multicompartment neurons. Thus the LFP contribution � j Y ðr; tÞ at position r from a single presynaptic point-neuron neuron j in population Y can, in general, be calculated by the convolution of its spike train n j Y ðtÞ � tÞ. This kernel encompasses effects of the postsynaptic neuron morphologies and biophysics, the electrostatic forward model, the synaptic connectivity pattern, conduction delay and PSCs. (Note that to be in accordance with the notation used in [28], we here use a different notation than in [29] where instead i and X denoted presynaptic neurons, and j and Y denoted postsynaptic neurons.) The resulting LFP due to spikes in a presynaptic population Y is then given by [29] � Y ðr; tÞ ¼ The evaluation of this sum is computationally expensive for large population sizes. For our purposes where the calculation of LFP signals lasting seconds must be repeated tens of thousands of times to have training and test data for the CNNs, this scheme is not feasible. Following [29, Fig 13] we instead use a firing-rate approximation and compute the LFP by a convolution of population firing rates n Y ðtÞ � P j2Y n j Y ðtÞ and averaged kernels � As in [29], these averaged kernels � H Y ðr; tÞ were computed using the full hybrid-scheme. This was done by computing the LFP resulting from a fully synchronous activation of all the outgoing synapses from all neurons in the presynaptic population. Thus for the computation of the Table 3. Description of multi-compartment neuron populations.

Synapse model
Current-based, α-function shaped, fixed strength for each population

Topology
Cylinder of 1 mm 2 cross-section with somas of both populations positioned in single layer of thickness 0.1 mm.

Neuron models Type
Reconstructed multi-compartment morphologies with passive electrical properties Description For each neuron, the membrane potential V n of compartment n connected to m other compartments k, with a surface area a n , length l n and diameter d n is given by: C mn ¼ c m a n ð4Þ Þ=ð4r a ðl n þ l k ÞÞ ð5Þ where for compartment n, C mn is the membrane capacitance, g akn the axial conductance from compartment k, I mn the membrane current, g Ln the membrane leak conductance, E L the extracellular reversal potential, and I jn the synaptic current from presynaptic neuron j.

Synapse type
α-function shaped postsynaptic current Here t a is the activation time of the synapse, J the synaptic strength, and τ s is the synaptic time constant. C is a constant chosen so that JC LFP kernel, we have n j Y ðtÞ � dðt À t Y Þ where t Y is the timing of the synchronous event in population Y. In the application of Eq 10, this computed kernel is then convolved with the population firing rates measured in the point-neuron simulations to compute the LFP.
By using this kernel approach the computational resources needed to run LFP simulations are reduced by several orders of magnitude compared to direct use of Eq 9. To test the accuracy of the approximation of using Eq 10 instead of Eq 9, we compared their LFP predictions for a set of example parameter sets and found in general excellent agreement between the resulting power spectra. A comparison is shown in the lower panels of the first figure in Results.
The kernel � H Y will scale linearly with the postsynaptic strengths of population Y, and is therefore dependent on the parameters J for Y 2 {E, I} and g for Y 2 {I}. The kernels were thus computed only once for a set of reference values for J and g, and for each simulation these reference kernels were scaled accordingly to the particular values of J and g. The LFP was computed across depth through the center of the cylindric volume with a spatial resolution d as illustrated in Fig 1B for the same duration as the network simulations.

Statistical methods
LFP spectral analysis. The power spectral densities P ϕ (r, f) of LFPs ϕ(r, t) in each location r were estimated using Welch's average periodogram method [38]. For this we used the implementation from the Python SciPy package [39] (scipy.signal.welch), with parameters listed in Table 5.
Statistical measures of activity. Two statistical measures were employed to probe the spiking network activity in the different regions of the parameter space. Simulations of 30.5 seconds of the activity were run and used to calculate the statistics, where the first 500 ms of the simulations were discarded. These simulations were run in addition to the ones entering the training and testing of the convolutional neural network (described in the next section), in order to have longer data sets for calculating reliable statistics characterizing the parameter regimes.
The mean network firing rate, including both the excitatory and inhibitory populations was calculated as over all neurons i and their spikes l at spike times t i l . The coefficient of variation (CV) of the inter-spike intervals (ISI) of individual neurons was used as a measure of the irregularity of firing [40]. The presently used mean CV was defined as averaged over all neurons i. As a measure of the degree to which the LFP power spectrum is spread out over different frequencies, we employed the entropy of the normalized power spectrum of the LFP measured in the uppermost channel, defined as whereP � ðf n Þ is the power spectrum of the LFP ϕ(r, t) at frequency f n normalized to unity. Since the power spectrum is computed numerically using Welch's method, this introduces a discretization in frequency space.

Simulation of training and test data
Two different sets of training and test data were created for this study. The first data set is from a large parameter space with η 2  (Table 2) were kept constant across simulations, which were run for a duration of T sim = 3 s. Start-up transients with duration T transient = 150 ms were discarded. LFP signals for all spiking output were computed as outlined above, and as final training and test data we estimated the power spectrum P ϕ (r, f) in each LFP channel.

Parameter estimation by convolutional neural networks
The CNN architecture is illustrated in Fig 3 and fully described in Table 6, and was set up using the Keras machine learning framework [41] running on top of TensorFlow [42]. It consisted of three convolutional layers with 20 filters, each followed by max pooling layers, and two fully connected layers before the output layer. The rectified linear unit (ReLU) function f (x) = max(0, x) was used as the activation function for all layers apart from the output layer, and biases were only used in the fully connected layers. As input, it took the PSD of each LFP channel, a 6 by 151 matrix. The convolutions were done in one dimension, with kernels extending over all LFP channels. There were two fully connected layers, with 128 nodes each, before the output layer consisting of 3 nodes. Each node in the output layer corresponded to a single parameter η, g and J.
The LFP PSD was normalized for each channel by the mean of the sum of the PSD over all frequencies, serving to diminish the variation in amplitude across the different LFP PSDs input to the network, while keeping the variation in amplitude across channels for each single LFP PSD. Each parameter range was linearly mapped to the interval [0, 1] when used as target  quantities during training and inference. The network was trained by batch gradient descent on 40000 of the simulated LFPs, while the final 10000 simulated LFPs were reserved for testing. To train the CNN, we required a loss function which was minimized during training. We defined the loss as the mean squared error of the estimator where â is the estimate (output from the CNN) and a true is the truth ('ground-truth' value) of any vector of normalized network parameters a. The Adam optimizer [43] was used, with a batch size of 100, learning rate of 0.001 and β 1 , β 2 and � Adam parameters of 0.9, 0.999 and 10 −8 respectively. The networks were trained for 400 epochs, and the network weights with the lowest test loss were saved.

Effect of duration of LFP signals
It was a priori not known what duration of the data are required to obtain stable results. To test this, the duration of each LFP simulation was successively extended, the PSD of the LFP was computed using the Welch method, and the CNN was trained with the data to predict the three parameters simultaneously. The test loss during training is shown in Fig 4A. Overall, the loss decreased with training duration and reached a plateau after a certain amount of training epochs. Increase of the duration of each simulation led to a decrease in the loss. This is due to lower noise in the estimated PSDs when it is computed over larger time windows.
The results in the figure suggested that a simulation duration of about 1800 ms would be a good choice, as shorter simulation times decreased the performance. Fig 4B shows the scaling of the minimal test loss (that is, loss obtained in the limit where more training epochs do not improve results) as a function of simulation duration. The � 1= ffi ffi t p least squares fit was motivated by the scaling of the error of the mean, which gives the square root dependence of the standard error of the mean. This scaling assumes uncorrelated experiments, which is not the case when using Welch's method as we do. Nevertheless, the fact that this scaling was observed to hold in our estimations suggested that the error indeed is largely explained by limited amounts of data and that the error decreases as more data is used.

Technical details
Reproducibility. The simulations were done using Python v2.7.12. All point-network simulations were done with the NEST simulator v2.12.0 [30]. The forward-modeling of the LFP was done using hybridLFPy v0.1.3 [29], with NEURON v7.5 [44]. All simulations were run on the Stallo high-performance computing cluster consisting of 2.6 GHz Intel Xeon E5 2670 and 2.8 GHz Intel Xeon E5 2680 CPUs. ffi ffi t p shape was fitted to the data to illustrate that the scaling is explained by limited amount of data and that the error decreases as more data is used. The R 2 score was 0.994. The convolutional neural networks were trained using Python v3.5.2 using Keras v2.2 with TensorFlow v1.10.0 as backend. All simulation and analysis code for this study is available on Github: https://github.com/CINPLA/Skaar_et_al_2020_PLoS_Comput_Biol.

Results
The aim of this study is to investigate the possibility of estimating network model parameters for the Brunel two-population spiking-network model [28] from the stationary 'background' LFP signal. We start by describing this spiking model and its salient dynamical properties and further describe how the resulting spikes can be used in a hybrid scheme to calculate associated LFPs [29]. Then we discuss the estimation performance of a convolutional neural network (CNN) to predict network parameters of the Brunel network model based on LFP data only.

Network model and LFPs
The presently used Brunel network produces four different network states dependent on the post-synaptic potential amplitude of excitatory connections J, the ratio of inhibitory to excitatory connection strength g, and the strength of the external input η relative to the threshold rate, see Fig 2. In the synchronous regular (SR) state the neurons fire regularly and in synchrony. In the asynchronous irregular (AI) state individual neurons have irregular firing and very little synchronization. The synchronous irregular (SI) state, characterised by oscillatory population firing rates, displays highly irregular firing of individual neurons. Example spike raster plots and population firing rates for SR, AI and SI states of the network are shown in the top rows of Fig 5. AI states are commonly believed to be realised in most healthy neural networks in vivo, often characterized by low average pairwise spike-train correlations (see e.g., [45]) and irregular spike trains (see e.g., [46]).
To generate training and test data for the convolutional neural net (CNN), we simulated the network for many combinations of parameters (η 2 [0. 8,4], g 2 [3. 5,8] and J 2 [0.05, 0.4] mV). This parameter space includes parameter combinations giving three of the states described above: AI, SI, and SR (see orange rectangle in Fig 2). For details of the simulation procedure, see Methods.
The LFPs were simulated using the so-called hybrid scheme introduced by [29]. In this scheme, neuronal network activity is predicted by a point-neuron network (here the Brunel network), and the corresponding LFPs are estimated in a subsequent step by 'playing back' spike times as activation times of synapses distributed across reconstructed neuron morphologies representative for each population type. The LFP is then computed from the resulting transmembrane currents combined with an electrostatic forward model derived from volumeconductor theory, as detailed in Methods. An overview over the hybrid scheme, including the geometrical organisation of the 'cortical column' used in the LFP-generating step, is shown in Fig 1. Example LFPs for different network states. In the presently used model set-up, the LFP is linearly dependent on the point-neuron network spiking activity. Any network parameter change affecting ongoing spiking activity will therefore directly affect the LFP. The panels in the lower two rows of Fig 5 show the resulting LFP and corresponding LFP power spectra for five different network parameter combinations of η, g and J.
An example synchronous regular state (SR) is shown in panel A. The simulation showed high regularity and synchrony of the individual spike trains and a strongly oscillating population firing rate. The corresponding LFP generally had a similar time course over all channels, though with opposite phases for the topmost and lower recording channels. The power spectral density (PSD) of the LFP showed a decrease in power with increasing frequency, though with a clear peak at around 330 Hz. This peak was also seen in the PSD of the firing-rate, reflecting the tight relationship between spikes and LFPs.
Two examples of the synchronous irregular state (SI) are illustrated in Fig 5B and 5C, characterised by synchrony of the firing of neurons while individual neurons fire irregularly. In panel B an example with high firing and fast oscillations is shown. Here the power spectrum of the LFP showed two peaks at around 175 and 350 Hz, respectively. Again, the same peaks are found also in the firing rate spectra. In contrast, panel C shows a low-firing SI state with more slowly varying population firing rates, and with a small peak around 30 Hz in the firing-rate and LFP power spectra.
Two examples of the asynchronous irregular state (AI) are shown in the last two panels ( Fig  5D and 5E). As suggested by the name, this state is defined by lack of synchrony between different neurons and irregular firing patterns of each neuron. For the example in panel D, the firing-rate PSD exhibited three high-frequency peaks, with the peak at the highest frequency (*200 Hz) being highest. The same three peaks were found also in the LFP PSD, but now the  , g and J). For each simulation, A-E, the first row shows spike trains from 100 randomly selected neurons across both populations. The second and third row show the population firing rate (including both the excitatory and inhibitory neurons) and its power spectral density (PSD). The final two rows show the LFP signal from all six channels and the PSD of channel 1, respectively. The dashed red lines in the lowest panel shows the LFP PSD computed from spikes in individual neurons (Eq 9) rather than with the presently used population firing-rate approach (Eq 10, black lines) which is computationally much less demanding. In general, the agreement is seen to be very high, the only discrepancy is seen for the SR-state example where the height of the peak around 300 Hz differs. The network states for the five examples (SR/SI(fast)/SI(slow)/AI, see text) are indicated at the top.
https://doi.org/10.1371/journal.pcbi.1007725.g005 peak at the lowest frequency (*70 Hz) was highest. This reflects low-pass filtering effects of the LFP from synaptic and intrinsic dendritic filtering [34,47]. In panel E the recurrent excitation J is much increased compared to the example in panel D. This combined with a reduction of the relative inhibition g, gave much larger LFP signals, as reflected in the high power of the LFP seen for the low frequencies in the LFP PSD. For this parameter set neither the firing-rate PSD nor the LFP PSD exhibited any notable peaks.
Model behaviour across parameter space. To extract network model parameters from recordings of neural activity such as the LFP, the network model parameters must necessarily be reflected in these recordings. After the qualitative discussion above, we proceed to discuss how the network behaves over the entire parameter space. We therefore give an overview of how different spike-and LFP-based measures of neural activity vary across parameter space. Spikes.
Panel A in Fig 6 shows how the mean network firing rate varied over the parameter space. The overall trend was that with increasing g, the firing rate decreased since inhibition was  , g and J). A, Average population firing rates, that is, average firing rate over all neurons and times. The red dots show the parameter values of the examples in Fig 5. B, Mean coefficient of variation (Eq (12)) of the inter-spike intervals over all neurons as a measure of the spiking irregularity. C, Square root of the variance of the LFP signal integrated over time for the topmost channel (channel 1). This measure corresponds to the square root of integral of the power spectrum of the LFP over all frequencies [33], and is referred to as the standard deviation of the LFP (LFP STD). D, LFP Entropy, cf. Eq 13.
https://doi.org/10.1371/journal.pcbi.1007725.g006 increased. The transition at g = 4 resulted from the fact that there are four times more excitatory neurons than inhibitory, and thus for g < 4 excitation dominates network behaviour. For J ≳ 0.15, three separable regions with smooth transitions emerged: A region of high firing rate on the left border of the plots (g ≲ 4), a region of low firing rate on the bottom of the plots (g ≳ 5, η ≲ 1), and a region of intermediate firing rate in the upper right of the parameter space. For smaller values of J (J ≲ 0.1) the transition between the high-firing region and the intermediate-firing region became smoother. Thus, large values of J amplified the differences between the regions. These distinct regions in the firing-rate phase diagram correspond well with the phase diagrams derived by [28], see Fig 2. Panel B in Fig 6 correspondingly displays the parameter dependence of the average coefficient of variation (CV) of the inter-spike intervals. Similar to the population firing rate, one can see a boundary at about g � 4 over a large part of the considered parameter range of J and η. In the region with low η and high g (η ≲ 1.5, g ≳ 5) there was also a distinct area with low CV, reflecting the expected lower CV of the slow-oscillation SI state compared to the AI state.
For small values of J (J ≲ 0.1), there was a region of larger CV visible in the upper right corner of the parameter space (g ≳ 6 and η ≳ 3.6). This region overlaps with fast-oscillation SI state described by [28] (see phase diagram in Fig 2). Panel C shows a measure of the overall signal power of the LFP signal, that is, the standard deviation (STD) for the topmost channel (channel 1). This measure corresponds to the square root of the variance of the LFP signal integrated over all frequencies [33]. In panel C a first observation was that large values of the excitatory weight J led to higher values of the LFP STD, not surprising given the stronger excitatory synaptic inputs. Likewise, it was seen that the LFP STD generally decreased when inhibition, that is, g, increased. Interestingly, despite the very high firing activity for values of g smaller than 4, the LFP STD was small for these parameter values. This can be understood by inspection of panel A in Fig 5 which shows results for an example state with g = 3.5: Even if there are strong bombardments of synaptic inputs onto the LFP-generating excitatory neurons, the input is so clock-like and regular that there is little power in the LFP signal at the lower frequencies. The only strong LFP signal contribution was obtained for frequencies over *300 Hz, corresponding to the peak seen in the firing rate PSD.
The LFP STD in panel C measures the square root of the squared LFP signal across time or equivalently, according to Parceval's theorem, essentially the square root of the integral of the LFP PSD across frequency. In contrast the measure labeled 'LFP Entropy' in panel D measures how much the overall LFP power is spread across the different frequencies, cf. Eq 13 in Methods. The largest entropy value was observed for the smallest excitatory weight (J = 0.05 mV), but the detailed parameter dependence of the LFP entropy was not the main point here. The most important observation was that the parameter dependence of LFP Entropy was qualitatively different from the parameter dependence of LFP STD. This illustrates that the frequencyresolved PSD likely contains more information regarding the underlying network parameters than either the overall amplitude (LFP STD) or the frequency spread (LFP Entropy) alone.

Network parameters are accurately estimated from LFP
After this rough survey over how the LFP for the Brunel network model varies across parameter space, we now ask the question: Can the network parameters be estimated from this LFP by use of Convolutional Neural Networks (CNNs)? We chose to use CNNs because they do not rely on manual feature extraction, and our analysis thus does not depend on any assumption of how the model network parameters are reflected in the LFP. Further, we used the power spectral density (PSD) of the LFP for this analysis, that is, used the PSD as input to the CNNs. This approach removes phase information in the LFP. However, since we only considered LFP data from stationary network activity, the hypothesis was that most of the available relevant information regarding network parameters should the contained in the PSD.
Our CNN consisted of three convolutional layers followed by two fully connected layers. An illustration can be seen in Fig 3, and detailed specifications are given in Methods. We generated several pairs of training and testing data sets for different scenarios. The parameter space was both sampled randomly and on a regular grid. We also generated a training and test data set on a subset of the parameter space, but with the equal amount of simulations.
While several approaches were tested and compared, we defined the following set-up as the standard set-up: The data was simulated using randomly distributed parameters η, g and J with a simulated duration of 2.85 seconds for each trial. From the simulated LFP, the power spectral densities (PSD) for six recording channels were computed and used as input to the CNN. Then, a single CNN network was trained to predict the parameter vectorp ¼ ðZ; g; JÞ simultaneously, and all three parameters were set to contribute equally to the loss function, Eq 14. To achieve this, the parameter ranges of η, g and J were all scaled to the unit interval [0, 1] for the considered part of the parameter space.
To quantify and illustrate the accuracy of the parameter estimation we used the estimation error â − a true where a true was the true value and â the estimated value. Fig 7 (orange lines) shows the accuracy of the three network parameters when considering the full parameter space (η 2 [0.8, 4), g 2 [3.5, 8) and J 2 [0.05, 0.4) mV). As observed, the estimation errors are in all cases generally smaller than 5%. Further, the estimates had small biases, that is, the mean errors were close to zero (see also Table 7).
The full parameter space considered above covered four of the characteristic network states seen for the Brunel network, see orange rectangle in Fig 2. Here the network-generated LFP can be expected to vary substantially across parameter space making the CNN estimation easier. We thus next explored to what extent CNNs could estimate network parameters within a particular state, that is, the AI state which is thought to be most relevant for cortex.
Training and validation of the CNN were repeated using a second data set, fully contained within the AI region (η 2 [1.5, 3), g 2 [4.5, 6) and J 2 [0.1, 0.25) mV), see blue rectangle in Fig  2. The same amount of training and test data were used as for the full parameter space, so effectively the restricted parameter space was more densely sampled. Estimation errors are shown in Fig 7 (blue lines). With a same-sized data set containing only the AI state, the observed error was even smaller than for the full parameter space. Thus focusing on a single network state within which there expectedly is less variation in the LFP, increased the accuracy. However, when using the CNN trained with the data from the full parameter space, the estimation accuracy for a restricted test set containing only the AI state, was reduced (Fig 7, purple lines). The accuracy was still better than when estimating parameters across the full parameter space, though, that is, the purple line was always positioned between the yellow and blue lines in the cumulative plots in Fig 7B. Further, independent of which data set was used, the g parameter was always the one with the largest prediction accuracy compared with η and J.

Highest prediction accuracy of network parameters in AI state
Next, the variation of the parameter estimation errors across the full parameter set was investigated (Fig 8A-8C). The estimation of η (panel A of Fig 8) was less reliable in the region of low g (g < 4) which corresponds to the SR state of the network model [28]. The estimation performance of J (panel C) was instead worse for the smallest values of η, that is, in and around the region of parameters where the network model is in an SI state. The estimation of g was generally very accurate for all states of the network (panel B of Fig 8). Taken together this implies that the highest prediction accuracy of the three network parameters is obtained for the AI state.
We next considered the estimation accuracies across the restricted parameter space corresponding to the AI network state only (η 2 [1.  Table 7 shows the bias and standard deviation for each of the data sets and estimated parameters. B, Cumulative error distributions, the proportion of absolute errors that fall below a given value, also with all parameters rescaled to [0,1]. This can be understood as the fraction of the data points which are reconstructed better than a specific error. The dashed black lines indicate the 90% coverage interval. https://doi.org/10.1371/journal.pcbi.1007725.g007 Table 7. Bias and standard deviation (std) of estimators of network parameters. The table shows the bias and width (std) of the estimator distributions in Fig 7. Note that the biases are all close to zero, indicating that our estimator is essentially unbiased. see Fig 8D-8F. Also within the AI state, g was predicted with the highest accuracy, and J had the lowest estimation accuracy. Further, while the estimation accuracy of g and η was almost constant across the restricted parameter space, the estimation of J became worse with increasing values of J and g (panel F of Fig 8).

Predicting all parameters at once almost as good as using individually trained CNNs
In the above application all three network parameters were predicted by a single convolutional neural net (CNN). We next investigated to what extent the estimation accuracy changed when CNNs were trained to estimate each parameter separately. The results when considering the full parameter space are shown in Fig 9. As expected the estimation accuracy was always better for these 'single-prediction' CNN networks: The error distribution of the η prediction was more centered, that is, less biased, for a single prediction network, compared to the 'combined-prediction' network (left panel). For the estimation of g, the single-prediction network displayed a narrower peak, also highlighting a slightly better performance. For J, the two approaches gave very similar results. Overall, we conclude that merely small gains are achieved for the present application in terms of estimation accuracy by training a separate CNN for each of the three network parameters.

Randomly sampled training-data preferable
The above estimations were based on CNNs trained by LFPs with random network parameters drawn from uniform distributions. To test if the way the parameter space was sampled had an effect on the accuracy of the estimator, we also generated the same amount of training data on a regular grid, spanning the same parameter space and repeated the training. The estimation accuracy was then computed using a randomly generated test data set, and results are shown in Fig 10. The bias and standard deviation of the estimators are listed in Table 8.
For the prediction of η, there was almost no difference in performance between the CNNs trained with grid-sampled and randomly sampled data (left panel in Fig 10). For g, however, the grid-trained data showed a substantial bias towards lower values of g (middle panel). Such a bias was also seen in the estimation of J, but not so pronounced (right panel).  Table 8.  Table 8. We speculate that training on grid-sampled data introduces a certain lower resolution to the CNN estimators. Randomly sampled data does not contain such a grid scale and eventually (with sufficient training data) enables the network to learn to interpolate on arbitrarily small scales. This intrinsic scale of the grid data might thus be the explanation for the poorer performance of the CNN trained with randomly sampled data.
For this study we only used two simple sampling techniques, grid sampling and random sampling, which were still feasible for the case of only three parameters. However, for further improvement, it might be beneficial to change to more advanced and better performing sampling methods like Latin hypercube sampling [48].

Robustness of parameter estimation
In the above testing we found that the three network parameters η, g, and J generally could be excellently predicted by the LFP signal in the background state. However, in this testing, the other model parameters, that is, the synaptic delay t d and the parameters describing the neuron dynamics (cf. Table 1), are kept identical to the values used in the training of the CNN. Here we explore the robustness of this performance when this is no longer the case, that is, how does the accuracy of the CNN estimation of synaptic weight parameters change when these other parameters are different in the training and test data sets, mimicking an uncertainty on the true parameter at the stage of training the estimator.
In the example in Fig 11 we show in panels A-C how the estimation errors increase when the synaptic delay t d is no longer constant in the network, but rather has a Gaussian distribution about the fixed value t d = 1.5 ms used in the training of the CNN. As expected the estimation errors increase with increasing spread σ of the Gaussian. When only the AI states are considered, the estimation error remains small (less than 0.1) even for very large values of the spread where t d is essentially uniformly distributed between 0 and 3 ms (see panel G). When the full parameter space is considered, the errors remain smaller than 0.1 for values of σ less than 0.1. For larger values of σ, the error increases, mostly so for η and least so for g.
The corresponding results when the membrane time constant τ m is spread around the single value τ m = 20 ms used in the training phase, are shown in Fig 11D-11F. Here the estimation errors are very low when only the AI state is considered, generally much smaller than 0.1, even in the extreme case when τ m is almost uniformly spread between 0 and 40 ms. While the estimation errors increase when the full parameter space is considered, the errors remains smaller than *0.1 when predicting g and J (panels E, F). For η, the prediction errors gets much larger for large values of σ though (panel D).
Also with the firing threshold θ spread around the fixed training-phase value θ = 20 mV (panels G-I), small errors are seen for σ less than 0.1, although larger errors are seen in the Table 8. Bias and standard deviation (std) of estimators of network parameters. The table shows the bias and width (std) of the estimator distributions in Figs 9 and 10 for single and multi-variable prediction and two different sampling methods. Note that the biases are all close to zero, indicating that our estimator is essentially unbiased. prediction of η and J for larger values of σ. Note that it follows from IF-model as described in Table 1 that with all other parameters fixed, the network dynamics will only depend on the ratio between J and θ. Thus the observed effect of the spread in θ would be directly related to the effect of spread in J in the test data. The effect on the estimation errors from the spread of the value of the refractory period around the fixed training-phase value t ref = 2.0 ms is seen to be generally small, however (panels J-L).

Mean and standard deviation of estimators
While the case with heterogeneity in model parameters likely better mimics the situation in brain networks, we also explored in Here the error is not only generally larger, but the error is also larger for the AI case alone than when the full parameter space is considered. The accuracy in the predictions of η is in general less than for g and J, and the errors are particularly large for shifted values of t d , τ m and θ. In general, the estimation errors always remain low when shifting t ref .
We did not explore in detail why, for example, the prediction of η in general is less robust to differences of t d , τ m and θ in the training and test data. However, it is not unexpected that the error in general is larger when the full parameter space is considered rather than only the AI-part of the parameter space. For example, in the SR state the detailed frequency position of the dominant peak in the power spectra (Fig 5) is very sensitive to the values of the synaptic delay parameter t d . If so, this will increase the estimation error for the full parameter space compared to for the AI part alone.

Discussion
In the present work we have investigated to what extent the local field potential (LFP), that is, the low-frequency part of an extracellular electrical signal, can be used to extract information about synaptic connection weights and external inputs in the underlying network. As a model we considered the well-known and thoroughly analysed Brunel network comprising an excitatory and an inhibitory population of recurrently connected integrate-and-fire (LIF) neurons [28]. Here only three parameters, η, J, and g, describe the strength of the recurrent synaptic connections and the strength of the external synaptic drive: Despite its simplicity, this model exhibits a high diversity of network dynamics, that is, regular or irregular spiking patterns of individual neurons and synchronous or asynchronous spiking across populations.
The LFP generated by the network was computed using a hybrid scheme [29]: Spikes computed by the point-neuron Brunel network were replayed as presynaptic spikes onto biophysically detailed multicompartmental neuron models to compute the LFP as predicted by volume-conductor theory [31,32]. We then assessed how well the values of the three network parameters could be estimated from the power spectrum of the stationary 'background' LFP signal by application of a convolutional neural net (CNN) [36] and indeed found that all parameters in general could be very accurately estimated. This was the case even when LFPs stemmed from networks in different dynamic states (Figs 7 and 8), but even more so when the LFPs stemmed from the asynchronous irregular (AI) state only (Fig 8). The accuracy of the estimation of the three network parameters was found to be particularly high when the same values of the remaining model parameters (synaptic delay and parameters describing the neurons) were used for the training data and test data. However, the estimates were generally also quite accurate when these parameters were different in the test and training data (Figs 11 and 12).
For the present situation we found that the necessary information for predicting network parameters was contained in the power spectrum (PSD) of the LFP of the background state. While of course this information is also available in the time domain, we found that it was easier to train the CNN using the power spectrum, particularly due to reduced overfitting to the training data. Using Welch's method, the averaging over frequency windows likely reduces the seed-specific variations in the LFPs, leaving less information to overfit to. It is possible that the information contained in the full time series can provide increased accuracy, but it would require more complex networks and possibly much more training data to overcome the issues of overfitting, so we found that in practice using the PSD gave the best results.

Generalization to more complex network models
An obvious question is whether the present successful estimation of network parameters from LFPs will extend to more complex network models with more than three network parameters like in the Brunel network. Of particular interest here are multilayered cortical network models where several neuronal populations contribute to the LFP signal [24,29,[49][50][51].
The estimation problem will likely become more difficult as the number of parameters to estimate increases. However, in the present application we only used the power-spectral density (PSD) of the LFP signals from the stationary background state in the parameter estimation. A 'richer' LFP signal which may separate the LFP signals for different parameters better, can be obtained by also including the phase information of the LFP Fourier components, but maybe more importantly by also using stimulus-evoked transient LFP signals. Further, in the present application, the parameters were estimated by LFPs from six channels spanning a depth of 0.5 mm. With only a single population contributing to the LFP as in the present case, fewer channels would in fact have sufficed. When several cortical neuronal populations positioned at different depths contribute to the LFP, the spatial variation of the signal contains more information on the network activity. Here the use of a larger number of channels, spanning all cortical layers, should expectedly improve parameter estimation.
In the present study the network-parameter estimation was found to be particularly successful when the same values of the other model parameters were used for the training and test data. However, the estimation accuracy was generally quite good when these other parameters were different in the test and training data (Figs 11 and 12). This accuracy could likely have been further improved by use of more extensive training data, for example, including LFPs from models with a wider spread of these other parameters in the training data. However, we did not pursue this here.
In the presently studied Brunel network model [28] the external spiking inputs (parameterised by the variable η) were assumed to be stationary and Poissonian. In real networks this external input will generally be more patterned, and a future study could be to investigate how well network parameters can be estimated from LFPs by machine learning techniques in this setting. Further, in the Brunel network the three network parameters are fixed to a single value. In other more comprehensive network models such as the Potjans-Diesmann model [15], however, the network weights are instead described by statistical distributions, rather than individual weights. An interesting line of inquiry would be to explore the use of the present approach to attempt to estimate such distribution parameters from LFPs.
The exact number of unknown model parameters that can be estimated from LFPs alone in different situations, is difficult to predict a priori. In the present case for the Brunel network we observed that the three network parameters could be well estimated from the PSD in the stationary background state alone. For more complex networks described by more parameters, one can imagine using temporally resolved stimulus-evoked LFPs in addition. Combined with the background LFP one would expect that this reduces the above-mentioned 'redundancy effect' and allows for 'disambiguation' of more candidate network models differing in their parameter values. Even more disambiguation should be possible if the LFPs are supplemented by simultaneously measured spikes and both signals are used as input for CNNs or other machine-learning tools.
Another direction would be to explore the effects of active dendritic conductances such as the I h channel [52] on parameter estimatability.

Computation of LFPs
To compute the three-second long LFP signals 50000 times to train and test the CNNs in the present study, it was computationally unfeasible to explicitly sum over LFP contributions from each individual presynaptic neuron. Instead we used the approximate formula in Eq 10 based on population firing rates to compute the LFPs, reducing the required computer time by several orders of magnitude. The accuracy of this approximation for the present network was demonstrated for a set of representative examples (Fig 5). In [29] where the eight-population Potjans-Diesmann [15] cortical network model was considered, the same approximation was seen to give fairly accurate LFPs as well [29, Fig. 13], although not as accurate as in the present case as judged by the example tests. Thus the use of the approximation in Eq 10 to compute the LFPs in future applications should be tested on a case-to-case basis.

Choice of machine-learning method
The choice of using convolutional neural networks (CNNs) within the Keras framework [41] for doing the parameter estimation was made out of convenience. Other machine learning techniques, see [53] for a recent review, could likely have done equally well, or even better. Further, the architecture of the CNNs was not optimized in any systematic way. A systematic study of the best machine learning method to use for LFP-based parameter estimation for more complex network models should be pursued, but is beyond the scope of the present paper.

Implications for analysis of LFPs
For single neurons, biophysics-based modeling is well established [1,4,5] and numerous biophysically detailed models with anatomically reconstructed dendrites have been made by fitting to experimental data, for example, [6][7][8]10]. These models have mainly been fitted to intracellular electrical recordings, but extracellular recordings [54] and calcium concentrations [55] can also be used.
Until now the analysis of LFPs has largely been based on statistical methods [20,21]. An overall goal of the present project is to contribute to the investigation of to what extent LFPs also can be used to develop and validate network models in layered brain structures such as cortex and hippocampus. Spikes have already been used to distinguish candidate network models in cortex [18,56], and LFPs recorded in vitro have been used to fit hippocampal network models [27]. There is expectedly a clear link between the accuracy with which a parameter can be (i) estimated from and (ii) fitted to LFP signals. Thus the present observation that network parameters for the Brunel network can be accurately estimated from the background LFPs suggests that the same LFP signal also could be used to accurately fit the same network parameters given that the model structure was known a priori. This link between 'estimatability' and 'fitability' should be properly investigated, not only for the Brunel model, but also for more complex network models. However, such a study is beyond the present scope. A related question that also should be investigated is to what extent LFPs can be used to distinguish between candidate models with a different network structure.

Outlook
The recording of single-unit and multi-unit activity (MUA) from the high-frequency part of the extracellular potential, has historically been the most important method for studying in vivo activity in neurons and neural networks. However, the interest in the low-frequency part, the LFP, has seen a resurgence in the last decades. One key reason is the development of new multicontact electrodes allowing for high-density electrical recordings across laminae and areas (as well as computers and hard drives allowing for the storage and analysis of the LFP signals). Another reason is the realisation that the LFP offers a unique window into how the dendrites of neurons integrate synaptic inputs for populations of thousands or more neurons [33]. In contrast, the MUA measures the output resulting from this dendritic integration, that is, spikes from a handful of neurons around the electrode contact [57]. Thus spikes and LFPs offer complementary information about network activity. Since both signals are produced from the same network model, the combined use of spikes and LFPs appears particularly promising for estimation of network model parameters, or for assessing the merit of candidate network models. Such combined use of spikes and LFPs has been shown to be beneficial in identifying laminar neural populations and their synaptic connectivity patterns from multielectrode cortical recordings [51,58]. Thus combined use of spikes and LFPs in the estimation of model parameters should be explored in projects with more complex network models where, unlike for the presently considered Brunel network model, the LFP signal is insufficient to alone allow for accurate parameter estimation.
Further, many new optical techniques for probing cortical activity have also been developed and refined, for example, two-photon calcium imaging [59], and voltage-sensitive dye imaging (VSDI), measuring population-averaged membrane potentials [60]. Further, at the systems level one has methods such as electroencephalography (EEG) [61]), which measures electrical potentials at the scalp, and magnetoencephalography (MEG) [62]) which measures the magnetic field outside the head. These measures can be computed from the activity of candidate network models [63], and tools to facilitate this have been developed [31,32,64,65]. They can all be used to constrain and validate candidate network models, and used in combination they will likely be particularly powerful.