Model-Free Reconstruction of Excitatory Neuronal Connectivity from Calcium Imaging Signals

A systematic assessment of global neural network connectivity through direct electrophysiological assays has remained technically infeasible, even in simpler systems like dissociated neuronal cultures. We introduce an improved algorithmic approach based on Transfer Entropy to reconstruct structural connectivity from network activity monitored through calcium imaging. We focus in this study on the inference of excitatory synaptic links. Based on information theory, our method requires no prior assumptions on the statistics of neuronal firing and neuronal connections. The performance of our algorithm is benchmarked on surrogate time series of calcium fluorescence generated by the simulated dynamics of a network with known ground-truth topology. We find that the functional network topology revealed by Transfer Entropy depends qualitatively on the time-dependent dynamic state of the network (bursting or non-bursting). Thus by conditioning with respect to the global mean activity, we improve the performance of our method. This allows us to focus the analysis to specific dynamical regimes of the network in which the inferred functional connectivity is shaped by monosynaptic excitatory connections, rather than by collective synchrony. Our method can discriminate between actual causal influences between neurons and spurious non-causal correlations due to light scattering artifacts, which inherently affect the quality of fluorescence imaging. Compared to other reconstruction strategies such as cross-correlation or Granger Causality methods, our method based on improved Transfer Entropy is remarkably more accurate. In particular, it provides a good estimation of the excitatory network clustering coefficient, allowing for discrimination between weakly and strongly clustered topologies. Finally, we demonstrate the applicability of our method to analyses of real recordings of in vitro disinhibited cortical cultures where we suggest that excitatory connections are characterized by an elevated level of clustering compared to a random graph (although not extreme) and can be markedly non-local.


Introduction
The identification of the topological features of neuronal circuits is an essential step towards understanding neuronal computation and function. Despite considerable progress in neuroanatomy, electrophysiology and imaging [1][2][3][4][5][6][7][8], the detailed mapping of neuronal circuits is already a difficult task for a small population of neurons, and becomes impractical when accessing large neuronal ensembles. Even in the case of cultures of dissociated neurons, in which neuronal connections develop de novo during the formation and maturation of the network, very few details are known about the statistical features of this connectivity, which might reflect signatures of self-organized critical activity [9][10][11].
Neuronal cultures have emerged in recent years as simple, yet versatile model systems [12,13] in the quest for uncovering neuronal connectivity [14,15] and dynamics [16][17][18][19]. The fact that relatively simple cultures already exhibit a rich repertoire of spontaneous activity [18,20] make them particularly appealing for studying the interplay between activity and connectivity.
The activity of hundreds to thousands of cells in in vitro cultured neuronal networks can be simultaneously monitored using calcium fluorescence imaging techniques [14,21,22]. Calcium imaging can be applied both in vitro and in vivo and has the potential to be combined with stimulation techniques like optogenetics [23]. A major drawback of this technique, however, is that the typical frame rate during acquisition is slower than the cell's firing dynamics by an order of magnitude. Furthermore the poor signalto-noise ratio makes the detection of elementary firing events difficult.
Neuronal cultures are unique platforms to investigate and quantify the accuracy of network reconstruction from activity data, extending analysis tools initially devised for the characterization of macro-scale functional networks [24,25] to the micro-scale of a developing local circuit.
Here we report a new technique based on information theory to reconstruct the connectivity of a neuronal network from calcium imaging data. We use an extension of Transfer Entropy (TE) [26][27][28] to extract a directed functional connectivity network in which the presence of a directed edge between two nodes reflects a direct causal influence by the source to the target node [29][30][31]. Note that ''causal influence'' is defined operationally as ''improved predictability'' [32,33] reflecting the fact that knowledge of the activity of one node (putatively pre-synaptic) is helpful in predicting the future behavior of another node (putatively postsynaptic). TE has previously been used to study gene regulatory networks [34], the flow of information between auditory neurons [35], to infer directed interactions between brain areas based on EEG recordings [36] or between different LFP frequency bands [37], as well as for the reconstruction of the connectivity based on spike times [38,39]. Importantly, our data-driven TE approach is model-independent. This is in contrast with previous approaches to network reconstruction, which were most often based on the knowledge of precise spike times [40][41][42][43][44][45], or explicitly assumed a specific model of neuronal activity [43,44].
A problem inherent to the indirect algorithmic inference of network connectivity from real data is that the true target topology of the network is not known and that, therefore, it is difficult to assess the quality of the reconstruction. In order to characterize the behavior of our algorithm and to benchmark its potential performance, we resort therefore to synthetic calcium fluorescence time series generated by a simulated cultured neural network that exhibits realistic dynamics. Since the ''ground truth'' topology of cultures in silico is known and arbitrarily selectable, the quality of our reconstruction can be evaluated by systematically comparing the inferred with the real network connectivities.
We use a simplified network simulation to generate surrogate imaging data, improving their realism with the reproduction of light scattering artifacts [6] which ordinarily affect the quality of the recording. Our surrogate data also reproduce another general feature of the activity of neuronal cultures, namely the occurrence of temporally irregular switching between states of asynchronous activity, with relatively weak average firing rates, and states of highly synchronous activity, commonly denoted as ''network bursts'' [20,46,47]. This switching dynamics poses potentially a major obstacle to reconstruction, since directed functional connectivity can be very different during bursting and inter-burst phases and can bear a resemblance to the underlying structural (i.e. synaptic) connectivity only in selected dynamical regimes in which causal influences reflect dominantly mono-synaptic interactions. To restrict our analysis to such ''good'' regimes, we resort to conditioning with respect to the averaged fluorescence level, as an indirect but reliable indicator of the network collective dynamics. Appropriate conditioning -combined with a simple correction coping with the poor time-resolution of imaging data-allows the method to achieve a good topology reconstruction performance (assessed from synthetic data), out-performing other standard approaches, without the need to infer exact spike times through sophisticated techniques (as is required, on the contrary, in [43,45]).
Finally, we apply our algorithm -optimized through modelbased validation-to the analysis of real calcium imaging recordings. For this purpose, we study spontaneously developing networks of dissociated cortical neurons in vitro and we address, as a first step toward a full topology reconstruction, the simpler problem of extracting only their excitatory connectivity. Early mature cultures display a bursting dynamics very similar to our simulated networks, with which they also share an analogous statedependency of directed functional connectivity.
Our generalized TE approach thus identifies network topologies with characteristic and non-trivial features, like the existence of non-local connections, a broadened and strongly right-skewed distribution of degrees (although not ''scale free'') and a moderate but significant level of clustering.

Results
The Results section is organized as follows. After a brief presentation of the qualitative similarity between real calcium fluorescence data from neuronal cultures and simulated data (see Figure 1), we introduce numerical simulations showing that networks with very different clustering levels can lead to matching bursting dynamics (see Figure 2). We then develop our reconstruction strategy, based on a novel generalization of TE, and examine the different elements composing our strategy, namely ''same-bin interactions'' and conditioning with respect to the average fluorescence level. We show that only signals recorded during inter-burst periods convey elevated information about the underlying structural topology (see Figure 3). After a discussion of criteria guiding the choice of the number of links to include in the reconstructed network, we illustrate specific examples of reconstruction (see Figure 4 and 5), contrasting systematically TE with other standard linear and nonlinear competitor methods (see Figure 6) and analyzing factors affecting its performance (see Figure 7). Finally, we apply our reconstruction algorithm to biological recordings and infer topological features of actual neuronal cultures (see Figure 8).

Real and surrogate calcium fluorescence data
In this study, we consider recordings from in vitro cultures of dissociated cortical neurons (see Materials and Methods). To illustrate the quality of our recordings, in Figure 1A we provide a bright field image of a region of a culture together with its associated calcium fluorescence. As previously anticipated, to simplify the network reconstruction problem, experiments are carried out with blocked inhibitory GABA-ergic transmission, so that the network activity is driven solely by excitatory connections. We record activity of early mature cultures at day in vitro (DIV) 9-12. Such young but sufficiently mature cultures display rich bursting events,

Author Summary
Unraveling the general organizing principles of connectivity in neural circuits is a crucial step towards understanding brain function. However, even the simpler task of assessing the global excitatory connectivity of a culture in vitro, where neurons form self-organized networks in absence of external stimuli, remains challenging. Neuronal cultures undergo spontaneous switching between episodes of synchronous bursting and quieter inter-burst periods. We introduce here a novel algorithm which aims at inferring the connectivity of neuronal cultures from calcium fluorescence recordings of their network dynamics. To achieve this goal, we develop a suitable generalization of Transfer Entropy, an information-theoretic measure of causal influences between time series. Unlike previous algorithmic approaches to reconstruction, Transfer Entropy is data-driven and does not rely on specific assumptions about neuronal firing statistics or network topology. We generate simulated calcium signals from networks with controlled ground-truth topology and purely excitatory interactions and show that, by restricting the analysis to inter-bursts periods, Transfer Entropy robustly achieves a good reconstruction performance for disparate network connectivities. Finally, we apply our method to real data and find evidence of non-random features in cultured networks, such as the existence of highly connected hub excitatory neurons and of an elevated (but not extreme) level of clustering.
combined with sparse irregular firing activity during inter-burst periods (cfr. Discussion).
In Figure 1B (left panel) we show actual recordings of the fluorescence traces associated to five different neurons. The corresponding population average for the same time window is shown in Figure 1C (left panel). In these recordings, a stable baseline is broken by intermittent activity peaks that correspond to synchronized network bursts recruiting many neurons. The bursts display a fast rise of fluorescence at their onset followed by a slow decay. In addition, during inter-burst periods, smaller modulations above the baseline are sometimes visible despite the poor timeresolution of a frame every few tens of milliseconds.
In order to benchmark and optimize different reconstruction methods, we also generate surrogate calcium fluorescence data (shown in the right panels of Figures 1B and 1C), based on the activity of simulated networks whose ground truth topology is known. We simulate the spontaneous spiking dynamics of networks formed by N~100 excitatory integrate-and-fire neurons, along a duration of 60 minutes of real time, matching typical lengths of actual recordings. Calcium fluorescence time series are then produced based on this spiking dynamics, resorting to a model introduced in [43] and described in the Materials and Methods section.
Although over 1000 cells are accessible in our experiments, we observed in the simulations that N~100 neurons suffice to reproduce the same dynamical behavior observed for larger network sizes, while still allowing for an exhaustive exploration of the entire algorithmic parameter space. Furthermore, despite their reduced density, we maintain in our simulated cultures the same average probability of connection as in actual cultures, where this probability (p~0:12, see Materials and Methods) is an estimate based on independent studies [15,48].
The fluorescence signal of a particular simulation run or experiment can be conveniently studied in terms of the distribution of fluorescence amplitudes. As shown in Figure 1D for both simulations and experiments, the amplitude distributions display a characteristic right-skewed shape that emerge from the switching between two distinct dynamical regimes, namely the presence or absence of bursts. The distribution in the low fluorescence region assumes a Gaussian-like shape, corresponding to noise-dominated baseline activity, while the high fluorescence region displays a long tail with a cut-off at the level of calcium fluorescence of the highest network spikes. As we will show later, qualitative similarity between the shapes of the simulated and experimental fluorescence distributions will play an important guiding role for an appropriate network reconstruction.

Different network topologies lead to equivalent network bursting
Neurons grown in vitro develop on a two-dimensional substrate and, hence, both connectivity and clustering may be strongly sensitive to the physical distance between neurons. At the same time, due to long axonal projections [13,49], excitatory synaptic connections might be formed at any distance within the whole culture and both activity and signaling-dependent mechanisms might shape non-trivially long-range connectivity [50,51].
To test the reconstruction performance of our algorithm, we consider two general families of network topologies that cover a wide range of clustering coefficients. In a first one, clustering occurs between randomly positioned nodes (non-local clustering). In a second one, the connection probability between two nodes decays with their Euclidean distance according to a Gaussian distribution and, therefore, connected nodes are also likely to be spatially close. In particular, in this latter case, the overall level of clustering is determined by how fast the connection probability decays with distance (local clustering). Cortical slice studies revealed the existence of both local [52,53] and non-local [7,54] types of clustering.
We will later benchmark reconstruction performance for both kinds of topologies and for a wide range of clustering levels, because very similar patterns of neuronal activity can be generated by very different networks, as we now show. Figure 2 illustrates the dynamic behavior of three networks (in this case from the non-local clustering ensemble). The networks are designed to have different clustering coefficients but the same total number of links (see the insets of Figure 2B for an illustration). The synaptic coupling between neurons was adjusted in each network using an automated procedure to obtain bursting activities with comparable bursting rates (see Materials and Methods for details and Table 1 for the actual values of the synaptic weight). As a net effect of this procedure, the synaptic coupling between neurons is slightly reduced for larger clustering coefficients. The simulated spiking dynamics is shown in the raster plots of Figure 2A. These three networks display indeed very similar bursting dynamics, not only in terms of the mean bursting rate, but also in terms of the entire inter-burst interval (IBIs) distribution, shown in Figure 2B. In the same manner, we constructed and simulated local networks -with a small length scale corresponding to high clustering coefficients and vice versa-and obtained the same result, i.e. very similar dynamics for very different decay lengths (not shown).
We stress that our procedure for the automatic generation of networks with similar bursting dynamics was not guaranteed to converge for such a wide range of clustering coefficients. Thus, the illustrative simulations of Figure 2 provide genuine evidence that the relation between network dynamics and network structural clustering is not trivially ''one-to-one'', despite the fact that more clustered networks have been shown to have different cascading dynamics at the onset of a burst [42].

Extraction of directed functional connectivity
We focus, first, on the reconstruction of simulated networks, taken from the local and non-local ensembles described above. We compute their directed functional connectivity based on simulated calcium signals. Synthetic fluorescence time series are preprocessed only by simple discrete differentiation, such as to extract baseline modulations associated to potential firing. These differentiated signals are then used as input to any further analyses.
Generalized TE and directed functional connectivity. We resort to a modified version of TE that includes two novel features (described in detail in the Materials and Methods section), namely the treatment of ''same bin interactions'' and the ad hoc selection of dynamical states.
The original formulation of TE was designed to detect the causal influence of events in the past with events at a later time. Practically, since calcium fluorescence is sampled at discrete times, standard TE evaluates how events occurring in time bin t are influenced by events occurring in earlier time bins t{1, t{2, . . .. By including same bin interactions in TE estimation, we also consider potential causal interactions between events that occur within the same time-bin t. This is important when dealing with experimental data of real neuronal cultures since the image acquisition rate is not sufficiently high to establish the temporal order of elementary spiking events.
On the other hand, the selection of dynamical states is crucial to properly capture interactions between neurons which lead to different activity correlation patterns in different dynamical regimes. Both simulated and real neuronal cultures indeed show a dynamical switching between two distinct states (bursting and non-bursting) that can be separated and characterized by monitoring the average fluorescence amplitude and restricting the analysis only to recording sections in which this average fluorescence falls in a predetermined range. Selection of dynamical states is discussed in the next section.
Once TE functional connectivity strengths have been calculated for every possible directed pair of nodes, a reconstructed network topology can be obtained by applying a threshold to the TE values at an arbitrary level. Only links whose TE value is above this threshold are retained in the reconstructed network topology.
Choosing a threshold is equivalent to choosing an average degree. As a matter of fact, selecting a threshold for the inclusion of links corresponds to setting the average degrees of the reconstructed network. Intuitively, and as shown in Figure S1A, a linear correlation exists between the number of links and the average degree. Because of this relation, an expectation about the probability of connection in the culture, and hence, its average degree, can directly be translated into a threshold number of links to include.
Based on the aforementioned estimations of probability of connection and taken into account the different sizes of our (smaller) simulated network and of our (larger) experimental cultures, threshold values are roughly selected to include the top 10% of links, for reconstructions of simulated networks, and to include the top 5% of links, for reconstructions from actual . Dependence of the directed functional connectivity on the dynamical state. A The distribution of averaged fluorescence amplitudes is divided into seven fluorescence amplitude ranges. The functional connectivity associated to different dynamical regimes is then assessed by focusing the analysis on specific amplitude ranges. B Quality of reconstruction as a function of the average fluorescence amplitude of each range. The blue line corresponds to an analysis carried out using the entire data sampled within each interval, while the red line corresponds to an identical number of data points per interval. C Visual representation of the reconstructed network topology (top 10% of the links only), together with the corresponding ROC curves, for the seven dynamical regimes studied. Edges marked in green are present in both the reconstructed and the real topology, while edges marked in red do not match any actual structural link. Reconstructions are based on an equal number of data points in each interval, therefore reflecting the equal sample size performance (red curve) in panel B. Interval I corresponds to a noise-dominated regime; intervals II to IV correspond to inter-burst intervals with intermediate firing rate and provide the best reconstruction; and intervals V-VII correspond to network bursts with highly synchronized neuronal activity. Simulations were carried out on a network with local topology (l~0:25 mm) and light scattering in the fluorescence dynamics. The results were averaged over 6 network realizations, with the error bars in B and the shaded regions in C indicating a 95% confidence interval. doi:10.1371/journal.pcbi.1002653.g003 biological recordings. These choices are such to lead, in both cases, to reconstructed networks with comparable probability of connection, as previously mentioned. The (limited) impact of a ''wrong'' threshold selection on the inference of specific topologic features, like the clustering coefficient, will be discussed in later sections.

Network reconstruction depends on the dynamical states
Immediately prior to the onset of a burst the network is very excitable. In such a situation it is intuitive to consider that the directed functional connectivity can depart radically from the structural excitatory connectivity, because local events can potentially induce changes at very long ranges due to collective synchronization rather than to direct synaptic coupling. Conversely, in the relatively quiet inter-burst phases, a post-synaptic spike is likely to be influenced solely by the presynaptic firing history. Hence, the directed functional connectivity between neurons is intrinsically state dependent (cfr. also [55]), a property that must be taken into account when reconstructing the connectivity.
We illustrate here the state dependency of directed functional connectivity by generating a random network from the local clustering ensemble and by simulating its dynamics, including light scattering artifacts to obtain more realistic fluorescence signals. The resulting distribution of fluorescence amplitudes is divided into seven non-overlapping ranges of equal width, each of them identified with a Roman numeral ( Figure 3A). Finally, TE is computed separately for each of these ranges, based on different corresponding subsets of data from the simulated recordings.
For simulated data, the inferred connectivity can be directly compared to the ground truth, and a standard Receiver-Operator Characteristic (ROC) analysis can be used to quantify the quality of reconstruction. ROC curves are generated by gradually moving a threshold level from the lowest to the highest TE value, and by plotting at each point the fraction of true positives as a function of the fraction of false positives. The quality of reconstruction is then summarized in a single number by the performance level, which, following an arbitrary convention, is measured as the fraction of true positives at 10% of false positives read out of a complete ROC curve. . TE-based network reconstruction of non-locally clustered topologies. A ROC curve for a network reconstruction with generalized TE of Markov order k~2, and with fluorescence data conditioned at gvg g~0:112. The shaded area depicts the 95% confidence intervals based on 6 networks. B Comparison between structural (shown in blue) and reconstructed (red) network properties: clustering coefficients (top), degree distribution (center), and distance of connections (bottom). C Reconstructed clustering coefficients as a function of the structural ones for different reconstruction methods. Non-linear causality measures, namely Mutual Information (MI, red) and generalized Transfer Entropy (TE, yellow), provide the best agreement, while a linear reconstruction method such as cross-correlation (XC, blue) fails, leading invariably to an overestimated level of clustering. The error bars indicate 95% confidence intervals based on 3 networks for each considered clustering level. All network realizations were constructed with a clustering index of 0.5, and simulated with light scattering artifacts in the fluorescence signal. doi:10.1371/journal.pcbi.1002653.g004 We plot the performance level as a function of the average fluorescence amplitude in each interval, as shown by the blue line of Figure 3B. The highest accuracy is achieved in the lowest fluorescence range, denoted by I, and reaches a remarkably elevated value of approximately 70% of true positives. The performance in the higher ranges II to IV decreases to a value around 45%, to abruptly drop at range V and above to a final plateau that corresponds to the 10% performance of a random reconstruction (ranges VI and VII).
Note that fluorescence values are not distributed homogeneously across ranges I-VII, as evidenced by the overall shape of the fluorescence distribution in Figure 3A. For example, the lowest and highest ranges (I and VII) differ by two orders of magnitude in the number of data points. To discriminate unequal-sampling effects from actual state-dependent phenomena, we studied the performance level using an equal number of data points in all ranges. Effectively, we restrict the number of data points available in each range to be equal to the number of samples in the highest range, VII. The quality of such a reconstruction is shown as the red curve in Figure 3B. The performance level is now generally lower, reflecting the reduced number of time points which are included in the analysis.
Interestingly, the ''true'' peak of reconstruction quality is shifted to range II, corresponding to fluorescence levels just above the Gaussian in the histogram of Figure 3A. This range is therefore the most effective in terms of reconstruction performance for a given data sampling.
For the ranges higher than II, the reconstruction quality gradually decreases again to the 10% performance of purely random choices in ranges VI and VII. The effect of adopting a (shorter) equal sample size is particularly striking for range I, which drops from the best performance level almost down to the baseline for random reconstruction. As a matter of fact, range I is the one for which the shrinkage of sample length due to the constraint for uniform data sampling is most extreme (see later section on dependence of performance from sample size).
The above analysis leads to a different functional network for each dynamical range studied. For the analysis with an equal number of data point per interval, the seven effective networks are drawn in Figure 3C (for clarity only the top 10% of links are shown). Each functional network is accompanied with the corresponding ROC curve.
The lowest range I corresponds to a regime in which spikingrelated signals are buried in noise. Correspondingly, the associated functional connectivity is practically random, as indicated by a ROC curve close to the diagonal. Nevertheless, information about structural topology is still conveyed in the activity associated to this regime and can be extracted through extensive sampling.
At the other extreme, corresponding to the upper ranges V to VII -associated to fully developed synchronous bursts-the functional connectivity has also a poor overlap with the underlying structural network. As addressed later in the Discussion section, functional connectivity in regimes associated to bursting is characterized by the existence of hub nodes with an elevated degree of connection. The spatio-temporal organization of bursting can be described in terms of these functional connectivity hubs, since nodes within the neighborhood of the same functional hub provide the strongest mutual synchronization experienced by an arbitrary pair of nodes across the network (see Discussion and also Figure S2).
The best agreement between functional and excitatory structural connectivity is clearly obtained for the inter-bursts regime associated with the middle range II, and to a lesser degree in ranges III and IV, corresponding to the early building-up of synchronous bursts.
Overall, this study of state-dependent functional connectivity provides arguments to define the optimal dynamical regime for network reconstruction: The regime should include all data points whose average fluorescence across the population g t is below a ''conditioning level''g g, located just on the right side of the Gaussian part of the histogram of the average fluorescence (see Materials and Methods). This selection excludes the regimes of highly synchronized activity (ranges III to VII) and keeps most of the data points for the analysis in order to achieve a good signal-to-noise ratio. Thus, the inclusion of both ranges I and II combines the positive effects of correct state selection and of extensive sampling.
The state-dependency of functional connectivity is not limited to synthetic data. Very similar patterns of state-dependency are observed also in real data from neuronal cultures. In particular, in both simulated and real cultures, the functional connectivity associated to the development of bursts displays a stronger clustering level than in the inter-burst periods. An analysis of the topological properties of functional networks obtained from real data in different states (compared with synthetic data) is provided in Figure S3. In this same figure, sections of fluorescence timeseries associated to different dynamical states are represented in different colors, for a better visualization of the correspondence between states and fluorescence values (for simplicity, only four fluorescence ranges are distinguished).

Analysis of two representative network reconstructions
Our generalized TE, conditioned to the proper dynamic range, enables the reconstruction of network topologies even in the presence of light scattering artifacts. For non-locally clustered topologies we obtain a remarkably high accuracy of up to 75% of true positives at a cost of 10% of false positives. An example of the reconstruction for such a network, with CC~0:5, is shown in Figure 4A. For locally-clustered topologies, accuracy typically Figure 6. Dependence of performance level on network clustering, conditioning level and light scattering artifacts. The color panels show the overall reconstruction performance level, quantified by TP 10% (black, 0% true positives; white, 100% true positives), for different target ground-truth clustering coefficients and as a function of the used conditioning level. Five different reconstruction algorithms are compared: crosscorrelation (XC), Granger Causality (GC) with order k~2, Mutual Information (MI), and Transfer Entropy (TE) with Markov orders k~1,2. The top row corresponds to simulations without artifacts, and the bottom row to simulations including light scattering. Reconstructions with linear methods perform well only in the absence of light scattering artifacts. TE reconstruction with k~2 shows the best overall reconstruction performance, even with light scattering and for any target clustering coefficient. An optimal reconstruction is obtained in a narrow range surrounding the conditioning value ofg g^0:2. doi:10.1371/journal.pcbi.1002653.g006 reaches 60% of true positives at a cost of 10% of false positives, and an example for l~0:25 mm is shown in Figure 5A.
In both topologies, we observe that for a low fraction of false positives detection (i.e. at high thresholds H TE ) the ROC curve displays a sharp rise, indicating a very reliable detection of the causally most efficient excitatory connections. A decrease in the slope, and therefore a rise in the detection of false positives and a larger confidence interval, is observed only at higher fractions of false positives. The confidence intervals are broader in the case of locally-clustered topologies because of the additional network-tonetwork variability that results from the placement of neurons (which is irrelevant for the generation of the non-locally clustered ensembles, see Materials and Methods).
Non-local clustering ensemble. To address the reconstruction quality of the network topology, we focus first on the results for the non-local clustered ensemble. For a conditioning level which corresponds to the right hand side of the Gaussian in the fluorescence amplitude histogram (g g^0:2), we consider three main network observables, namely the distributions of local clustering coefficients, in-degrees, and the distances of connections. As shown in Figure 4B, we obtain a reconstructed network that reproduces well the ground truth properties, with similar mean values and distributions for all three observables considered. We observe, however, a small shift towards lower clustering indices ( Figure 4B, top panel) and especially towards lower average distances (bottom panel) for this highly clustered network.
Despite this underestimation bias for instances with high clustering, Figure 4C shows the existence of a clear linear correlation between the real average clustering coefficient and that of the topology reconstructed with generalized TE (Pearson's correlation coefficient of r~0:92). Such linear relation allows, notably, a reliable discrimination between networks with different levels of clustering but very similar bursting dynamics. Note that this linear relation between real and reconstructed clustering coefficient is robust against misestimation of the expected average degree, or, equivalently, of the number of links to include, as highlighted by Figure S1B.
TE-based reconstructions also yield estimates of the average distance of connection -constant and not correlated with the clustering level-with reasonable accuracy as shown in Figure  S4A.
Local clustering ensemble. For this ensemble (see Figure 5), the quality of reconstruction can be assessed even visually, due to the distance-dependency of the connections, by plotting the network graph of reconstructed connections. In Figure 5B   Dependence of reconstruction quality on TE formulations and recording length. A ROC curves for network reconstructions of non-locally clustered (left panel) and locally-clustered topologies (right), based on three TE formulations: conventional TE (blue), generalized TE with same bin interactions only (red) or also including optimal conditioning (yellow). The vertical lines indicate the performance level at TP 10% , and provide a visual guide to compare the quality of reconstruction between different formulations. B Decay of the reconstruction quality as measured by TP 10% for the two topology ensembles and for generalized TE with conditioning, as a function of the data sampling divisor s. A full simulated recording of 1 h in duration provides a data set of length S 1h , corresponding to a data sampling fraction of s~1. Shorter recording lengths are obtained by shortening the full length time-series to a shorter length given by S'~S 1h =s, with s~1,:::, 19. The insets show the same results but plotted as a function of S' in semi-logarithmic scale. For both A and B, the panels in the left column correspond to the non-locally clustered ensembles (cfr. Figure 4), while the panels in the right column correspond to the locally-clustered ensemble (cfr. Figure 5). Shaded regions and error bars indicate 95% confidence intervals based on 6 networks. doi:10.1371/journal.pcbi.1002653.g007 Again, reconstructed network properties correlate with real properties. The reconstructed distribution of connection distances displays a reduced right-tail compared to the real one. A tendency to estimate a more local connectivity is evident also from a marked overestimation of local clustering coefficients. We attribute such a mismatch to light scattering artifacts that increase local correlations in a spatial region matching the length scale of real structural connections. This is confirmed by the fact that the length scale is correctly inferred in simulations without the light scattering artifact (not shown).
Note, that there is again a very good linear correlation (Pearson's correlation of r~0:97) between the actual and reconstructed (spatial) average connection length, as shown in Figure 5D. Similarly, the reconstructed average clustering coefficient is linearly correlated with that of the ground truth (r~0:98), as shown in Supplementary Figure S4B.
Sensitivity to reconstruction approaches. Overall, TE of Markov order k~2 (i.e. taking into account multiple time scales of interaction, see Eq. (11) in Materials and Methods section) achieved a performance level ranging between 40% and 80% at a level of 10% of false positives, for any clustering type and level.
In Figure 4C and Figure 5D we compare the performance of generalized TE with other reconstruction strategies, respectively for the non-local and for the local clustering ensemble. We considered, as competitors, crosscorrelation (XC), Granger Causality (GC) or Mutual Information (MI) metrics. All of these methods have previously been used to study the connectivity in neural networks [40,[56][57][58][59][60][61]. Detailed definitions of these methods and of the adaptations we introduce for a fair comparison with generalized TE are provided in the Materials and Methods section. When using these alternative metrics, functional networks were extracted exactly as when using generalized TE. The only Only the top 5% of connections are retained in order to achieve, in the final reconstructed network, an average connection degree of 100 (see Results). B Properties of the network inferred from TE reconstruction method (top panels) compared to a cross-correlation (XC) analysis (bottom panels). The figure shows reconstructed distributions for the in-degree (left column), the connection distance (middle column), and the clustering coefficient (right column). In addition to the actual reconstructed histograms (yellow), distributions for randomized networks are also shown. Blue color refers to complete randomizations that preserves only the total number of connections, and red color to partial randomizations that shuffle only the target connections of each neuron in the reconstructed network. doi:10.1371/journal.pcbi.1002653.g008 difference consisted in evaluating functional coupling scores based on XC, GC or MI for each directed edge.
We observe then that MI-based reconstructions yield linear correlations between real and reconstructed clustering coefficient and length scales as well. For the adopted optimal conditioning level, MI can actually out-perform generalized TE (cfr. Supplementary Figure S4), probably due to the smaller sample size required for its estimation. On the contrary, XC-based reconstructions fail in all cases to reproduce these linear correlations, yielding a constant value independently from the ground-truth values. For the non-local clustering ensemble, it distinctly overestimates average clustering level; for the local clustering ensemble, it severely underestimates the average length of connections. Therefore, in XC-based reconstructions, all information on the actual degree of clustering in the network is lost and a high clustering level is invariably inferred.
GC-based reconstructions display the same error syndrome (not shown), which indicates that capturing non-linear correlations in neural activity -as MI and TE can do, but XC and GC cannotis crucial for the inference of the clustering level.
We would like to remind that XC-, GC-and MI-based methods, analogously to the generalized TE approach, include, as generalized TE, the possibility of ''same-bin interactions'' (zerolag). Furthermore, we have modified them to also include optimal conditioning in order to make the comparison between different methods fair. The forthcoming section gives more details about comparison of performance for different conditions and methods.

Performance comparison: Role of topology, dynamics and light scattering
The performance level (fraction of true positives for 10% of false positives, denoted by TP 10% ) provides a measure of the quality of the reconstruction, and allows the comparison of different methods for different network topologies, conditioning levels, and external artifacts (i.e. presence or absence of simulated light scattering). We test linear methods, XC and GC (of order 2; the performance of GC of order 1 is very similar and not shown), and non-linear methods, namely MI and TE (of Markov orders 1 and 2; see Materials and Methods for details). XC and MI are correlation measures, while GC and TE are causality measures. Note that, for each of these methods, we account for state dependency of functional connectivity, performing state separation as described in the Materials and Methods section.
In the case of the non-local clustering ensemble and without light scattering ( Figure 6, top row), even a linear method such as XC achieves a good reconstruction. This success indicates an overlap between communities of higher synchrony in the calcium fluorescence, associated to stronger activity correlations, and the underlying structural connectivity, especially for higher full clustering indices.
GC-based reconstructions have an overall worse quality, due to the inadequacy of a linear model for the prediction of our highly nonlinear network dynamics, but they show similarly improved performance for higher CC.
In a band centered around a shared optimal conditioning level g g^0:2, both MI and generalized TE show a robust performance across all clustering indices. This value is similar to the upper bound of the range II depicted in Figure 3A, i.e. it lies at the interface between the bursting and silent dynamical regimes. In particular for TE and in the case of low clustering indices (which leads to networks closer to random graphs), conditioning greatly improves reconstruction quality. At higher clustering indices the decay in performance is only moderate for conditioning levels above the optimal value, indicating an overlap between the functional connectivities in the bursting and silent regimes. Note, on the contrary, that the performance of MI rapidly decreases if a non-optimal conditioning level is assumed.
The introduction of light scattering causes a dramatic drop in performance of the two linear methods (XC and GC), and even of MI and TE with Markov order k~1. The performance of TE at Markov order 2 also deteriorates, but is still significantly above the random reconstruction baseline in a broad region of parameters. Interestingly, for the optimal conditioning levelg g^0:2 the performance of the TE for k~2 does not fall below TP 10% *40% for any clustering level or l value. It is precisely in this optimal conditioning range that we obtain the linear relations between reconstructed and structural clustering coefficients, for both the non-local and the local clustering ensembles.
A similar trend is obtained when varying the length scale l in the local ensembles (see Supplementary Figure S5). For very local clustering and without light scattering, both XC and TE achieve performance levels up to 80%. The introduction of light scattering, however, reduces the performance of all measures except for MI (but only in the narrow optimal conditioning range) and for TE of higher Markov orders (robust against non-optimal selection of conditioning level). Overall, the performance of the reconstruction for the local clustering ensembles is lower than for the non-locally clustered ensembles. This is also true, incidentally, in absence of light scattering since networks sampled from this ensemble tend to be very similar to purely random topologies (of the Erdös-Rényi type, see e.g. [62]) as soon as the length scale is sufficiently long, and for which performance is generally poorer (cfr. top row of Figure 6, for weak clustering levels).

Contributions to the performance of generalized TE
Our new TE method significantly improves the reconstruction performance compared to the original TE formulation [Eq. (9)]. As shown in Figure 7A for both the local and the non-local clustered networks, reconstruction with the original TE formulation ( Figure 7A, blue line) yields worse results than a random reconstruction, as indicated by the corresponding ROC curves falling below the diagonal. Such a poor performance is due in large part to ''misinterpreted'' delayed interactions. Indeed, by taking into account same bin interactions, a boost in performance is observed (red line). Figure 7A also shows that an additional leap in performance is obtained when the analysis is conditioned (i.e. restricted) to a particular dynamical state of the network, increasing reconstruction quality by 20% (yellow line in Figure 7A). The determination of the optimal conditioning level is discussed later and takes into account the considerations introduced above (cfr. Figure 3). Note that the introduction of same bin interactions alone (red color curves) or conditioning on the dynamical state of network alone (yellow color curves) already brings the performance to a level well superior to random performance. However, at least for our simulated calcium-fluorescence time series, a remarkable boost in performance is obtained only when the inclusion of same-bin interactions and optimal conditioning are combined together (green color curves). Although, in principle, conditioning is enough to indirectly select a proper dynamical regime, the poor timeresolution of the analyzed signals (constrained not only by the frame-rate of acquisition but also intrinsically by the kinetics of the dissociation reaction of the calcium-sensitive dye [63]) also requires the potential consideration of causally-linked events occurring in the same time-bin.
A different way to represent reconstruction performance are ''Positive Precision Curves'' (as introduced in [56] and described in the Materials and Methods section), obtained by plotting, at a given number of reconstructed links, the ''true-false ratio'' (TPR), which emphasizes the probability that a reconstructed link is present in the ground truth topology (true positive). For the same networks and reconstruction as above, we plot the PPCs in Figure S6 (for the ROC curves see Figure 7A). Over a wide range of the number of reconstructed links (TFS), the PPC displays positive values of the TFR, indicative of a majority of true positives over false positives. For both the locally and non-locally clustered networks, the PPC reaches a maximum value of the TPR about 0.5 and remains positive up to about 18% of included links for the clustered topology, or up to 12% in the case of the local topology.

Recording length affects performance
In Figure 7B, we analyze the performance of our algorithm against changes of the sample size. Starting from simulated recordings lasting 1 h of real time (corresponding to about 360 bursting events) and with a full sample number of S 1h , we trimmed these recordings producing shorter fluorescence time series with S'~S 1h =s samples, with s being a divisor of the sample size. For both network topology ensembles, we found that a reduction in the number of samples by a factor of two (corresponding to 30 minutes or about 180 bursts) still yields a performance level of *70%. By further reducing the sample size, we reach a plateau with a quality of *30% for about 40 bursts (corresponding to 6 minutes).
All the experiments analyzed in this work are carried out with a duration between 30 and 60 minutes. Since conditioning, needed to achieve high performance, requires one to ignore a conspicuous fraction of the recorded data, we expect long recordings to be necessary for a good reconstruction, albeit the fact that it is possible to increase the signal-to-noise ratio by increasing the intensity of the fluorescent light. However, the latter manipulation has negative implications for the health of neurons due to photodamage, limiting our experimental recordings to a maximum of 2 hours.

Analysis of biological recordings
We apply our analysis to actual recordings from in vitro networks derived from cortical neurons (see Materials and Methods). To simplify the network reconstruction problem, experiments are carried out with blocked inhibitory GABA-ergic transmission, so that the network activity is driven solely by excitatory connections. This is consistent with previously discussed simulations, in which only excitatory neurons were included.
We consider in Figure 8 a network reconstruction based on a 60 minutes recording of the activity of a mature culture, at day in vitro (DIV) 12, in which N~1720 active neurons were simultaneously imaged. A fully analogous network reconstruction for a second, younger dataset at DIV 9 is presented in Supplementary Figure S7. In general, fluorescence data neither affected by photobleaching nor by photo-damage during this time, as proved by the stability of the average fluorescence signal shown in the Supplementary Figure S8A.
The probability distribution of the average fluorescence signal is computed in the same way as for the simulated data. Neuronal dynamics and the calcium fluorescence display the same bursting dynamics that are well captured by the simulations, leading to a similar fluorescence distribution ( Figure 1D). Thanks to this similarity we can make use of the intuition developed for synthetic data to estimate an adequate conditioning level. We select therefore a conditioning level such as to exclude the right-tail of high fluorescence associated to to fully-developed bursting transient regimes. We have verified, however, that the main qualitative topological features of the reconstructed network are left unchanged when varying the conditioning level in a range centered on our ''optimal'' selection. More details on conditioning level selection are given in the Materials and Methods section.
Reconstruction analysis is carried out for the entire population of imaged neurons. We analyze a network defined by the top 5% of TE-ranked links, as discussed in a previous section. Such choice leads to an average in-degree of about 100, compatible with average degrees reported previously for neuronal cultures of corresponding age (DIV) and density [14,64].
Comparison with randomized networks. The ground truth excitatory connectivity is obviously not known for real recordings and performance cannot be assessed by means of an ROC analysis. However, we can compare the obtained reconstruction to randomized variations to identify non-trivial topological features of the reconstructed network. We perform two kinds of randomizations: In a first one, randomization is full and only the number of network edges is preserved. Comparison with such fully randomized networks detects significant deviations of the reconstructed network from an ensemble of random graphs in which the degree follows the same prescribed Poisson distribution for each node (Erdös-Rényi ensemble, see e.g. [62]). In a second randomization, both the total edge number and the precise outdegrees of each node are preserved. The comparison of the reconstructed network with such a partially randomized ensemble detects local patterns of correlations between in-and out-degrees -including, notably, clustering-which do not arise just in virtue of a specific distribution of out-degrees. Comparison with partial randomizations is particularly important when skewed distributions of degrees are expected.
Topology retrieved by TE. Our analysis shows that the resulting reconstructed topology is characterized by markedly nonlocal structures, as visible in the portion of the reconstructed network in Figure 8A. Distributions of degree, distance of connection and local clustering coefficients inferred by TE are shown in the top row of Figure 8B (yellow histograms). The degree distribution is characteristically broadened and distinctly rightskewed, deviating from the Poisson distribution associated to Erdös-Rényi random graphs (the histogram for fully randomized networks shown in blue). Note that, for partial randomizations (histograms shown in red), we have randomized the out-degree of each node but plotted here the resulting in-degree distribution (the distribution of out-degrees would be, by construction, unchanged).
While the distribution of connection distances matches the one of randomized networks, TE detects clustering at a level which is moderate (CC^0:09) but significantly larger than for random networks. Note that this larger clustering cannot be ascribed to the broadened degree distribution since both full and partial (red histograms) randomizations lead to consistently smaller clustering levels (CC^0:05).
Topology retrieved by XC. By analyzing a network reconstruction based on cross-correlation (XC), we find differences to TE (bottom row of Figure 8B). In particular, XC infers a distribution of distances markedly more local than for full and partially randomized network instances and, correspondingly, markedly higher clustering coefficient (CC^0:17). Distribution of degrees inferred by XC is on the contrary random-like.
As a matter of fact, remarkably similar patterns of discrepancy between reconstruction results based on TE and on XC are also robustly present in synthetic data. Synthetic data analyses consistently show the superior performance of TE compared to XC. Furthermore, these analyses identify a tendency of XC to infer an artificially too local and too clustered connectivity. Therefore, we believe that the topology of the neuronal culture inferred by XC is not reliable, and is biased by the aforementioned systematic drifts.

Relation to state of the art
We have introduced a novel extension of Transfer Entropy, an information theoretical measure, and applied it to infer excitatory connectivity of neuronal cultures in vitro. Other studies have previously applied TE (or a generalization of TE) to the reconstruction of the topology of cultured networks [39,56]. However, our study introduces and discusses important novel aspects, relevant for applications.
Model independency. Our algorithm is model-independent and it is thus not constrained to linear interactions between nodes. This absence of a parametric model can be advantageous not only conceptually, where we hope to make the least amount of assumptions necessary, but also practical for applications to real data. Due to its generality, it can be used virtually without modifications even for the reconstruction based on spike trains or voltage traces. This is important, since massive datasets with modalities beyond calcium fluorescence imaging might become available in a near future, thanks to progresses in connectomics research. Model-independence is also important to avoid potential artifacts due to a too restrictive or inappropriate choice of model for neuronal firing or for network topology. Therefore, it constitutes a major advantage with respect to regression methods or even more elaborated Bayesian approaches, as the one considered in [45]. Both regression and Bayesian techniques indeed assume specific models of calcium fluorescence and neuronal firing dynamics, either explicitly (in the case of the Bayesian framework) or implicitly (assuming a linear dynamical model in the case, e.g., of XC or GC). Note that we use here dynamical network models to benchmark our reconstruction quality. However, our method still remains model-free, because knowledge about these models is not required for reconstruction.
No need for spike times. Competitor approaches used [56] or put emphasis on the need of reconstructing exact spike times [45] with sophisticated deconvolution techniques [43], as a preprocessing step before actual topology reconstruction.
As we have shown here, acquiring such difficult-to-access information is unnecessary for our method, which performs efficiently even for slow calcium fluorescence acquisition rate and operates directly on imaging time series. This is a crucial feature for applications to noisy data, which remains useful even whenas for the data analyzed in this study-the signal-to-noise ratio is sufficiently good to allow sometimes the isolation of individual firing events (cfr. Figure 1B).
Robustness against bursting. We optimized our algorithm to infer excitatory connectivity based on time series of calcium fluorescence with a complex nonlinear dynamics, capturing the irregular bursting and the corresponding time-dependent degree of synchronization observed in cultured networks in vitro. To our knowledge, no previous study about algorithmic connectivity reconstruction has tackled with simulated dynamics reaching this level of realism. We have here identified a simple and conceptually elegant mean-field solution to the problem of switching between bursting and non-bursting states, based just on conditioning with respect to the average level of fluorescence from the whole culture.
A feature of our model network dynamics, and one that is crucial to reproduce temporally irregular network bursting, is the inclusion of short-term depressing synapses. Remarkably, other studies [39], which have modeled explicitly more complex forms of spike-time dependent synaptic plasticity, neglect completely this short-term plasticity, failing correspondingly to generate a realistic model of spontaneous activity of an in vitro culture. Yet, our network model remains very simplified, although the use of networks of integrate-and-fire neurons to generate surrogate data is widespread [39,45,46]. Several features of real cultured neurons are not explicitly included, like heterogeneity in synaptic conductances and time-constants, slow NMDA excitatory currents or distance-dependent axonal delays. However, time-series from more complex models could be used to benchmark our algorithm, without need of introducing any change into it, due to its modelfree nature.
Robustness against light scattering. We have found that, among the tested methods, only generalized TE of at least Markov order k~2 with a proper conditioning allows to distinguish random from clustered topologies and local from long-range connectivities in a reliable manner, in the presence of light scattering artifacts. These artifacts indeed lead to the inference of spurious interactions between the calcium signal of two nodes, reducing the performance of linear causality measures like XC or GC to a random level. Note that this is very likely a similar effect as in [36], where reconstruction with TE is still possible despite cross-talk between EEG electrodes.
Low computational complexity. Finally, we note that our algorithm is computationally simple and relatively efficient. Preprocessing of time series is a simple discrete differentiation. State selection is achieved via conditioning data on a range, which requires only to read once the input time series and compare them with a threshold. The inference procedure itself is not iterative (unlike in [42,45]) and evaluation of TE is done through simple ''plug-in'' estimation, which is fast (unlike elementary steps in, e.g., [45]). The principal determinant of computational complexity is therefore the growing number of putative links (which grow as O(N 2 )) and, hence, of TE scores that must be computed.
For the recording duration of one hour considered in this work, we typically found a computation time of approximately T comp &60ms Ã N 2 including pre-processing on a 2.67 GHz Intel Xeon processor. Reconstructions of networks with 100 nodes required roughly ten minutes; with 1000 nodes roughly half a day. Note however that as the computation of TE for two distinct links is (after pre-processing and conditioning) computationally inde-pendent, it can be easily parallelized, reducing the computation time by a factor equal to the number of CPUs.

Good reconstruction of topological features
Good reconstruction of clustering. Based on synthetic time series of calcium fluorescence, we have studied the relation between ground truth topological features and their reconstructed counterparts. By restricting directed functional connectivity estimation to a proper dynamical regime through conditioning, we found strong linear correlations between real and reconstructed topological properties, both for the average Euclidean distance of connections and the clustering coefficient (CC). Note that deviations from this linear relationship at higher CC values, visible in Figure 4C, are in part due to the fact that we have used a constant conditioning level for all reconstructions, while the optimal conditioning level increases slightly for high CCs.
Clustering coefficient is the most widespread measure of higher order topological features, going beyond characterization of neighbors in a network, but analyzing correlation between local neighborhoods of different nodes. We have adopted the so-called ''full'' clustering index to measure clustering in our directed networks. However, several other definitions of clustering coefficient for directed network exist, emphasizing the contribution to the clustering phenomenon of different topological motifs such as cycles or ''middleman'' loops [65]. We have checked that the linear relationship between real and reconstructed clustering indices hold for all the clustering index types defined in [65], with an almost identical degree of correlation. This hints at a good capacity of our algorithm to reconstruct different classes of graph topology motifs.
Good reconstruction of connectivity motifs. We have focused on the performance of our algorithm in reconstructing specific topological motifs involving more than two nodes. Considering for instance the network whose reconstruction is illustrated in Figure 5, we identified in the ground truth topology of this network all occurrences of shared source motifs, i.e. motifs in which a node A was connected to a node B (A?B) and to a node C (A?C) but in which there was no direct connection between B and C in the reconstructed as well as the ground truth topology. Such a motif might be harmful for reconstruction, because the existence of a shared input A might be mistaken for a reciprocal causal interaction between nodes B and C. Nevertheless, only in a minority of cases (about 20%), spurious links B?C or C?B where erroneously included in our reconstruction. Equivalently, in the case of embedded chains, i.e. motifs in which a node A is connected to B, connected on its turn to C, without a direct link from A to C (A?B?C), the presence of a link A?C (reflecting potentially indirect causation, mistaken for direct) was erroneously inferred only in, once again, about 20% of cases. Note that other studies have investigated the performance of various metrics in the reconstruction of specific small network motifs [66,67], but it is not clear that the efficiency of reconstruction quantified for such small networks continues to hold when these motifs are embedded in larger networks with hundreds of nodes or more.
Good reconstruction of higher-order topological features is important since these features (like e.g. the tendency of existing links to form chains) are known to affect the synchronizability of networks, as stressed by a recent study [68]. Our algorithm was not specifically optimized to infer higher order structures. Analyses based on k-ples rather than on pairs of nodes might lead to a better description of higher order connectivity structures, at a price, however, of a more severe sampling problem. In this sense, conditioning transition probabilities for a specific edge to the ongoing mean-field activity, constitute already a first compromise, allowing to extend the analysis beyond a mere pairwise approach.
Overestimation of bidirectional links. In [68], the role played by the fraction of bidirectional links (i.e. situations in which there is a connection A?B, but also a connection B?A) was also explored.
In our case, approximately 60% of the bidirectional links present in the ground truth topology were reconstructed as bidirectional in the reconstruction of Figure 4. However, this reconstruction exhibited as well a severe overestimation of the number of bidirectional links. Unidirectional in the ground-truth topology were more difficult to detect, such as only 4% of them were ranked among the top 10% of TE scores. Furthermore, when such unidirectional links A?B were reconstructed, the presence of a link B?A was also in most of the cases (85%) spuriously inferred, i.e. included unidirectional links were often mistaken for bidirectional links.
It is likely that an improvement in the inference of directionality might be achieved by increasing the time-resolution of the recordings beyond the present limits of calcium imaging techniques.
Good reconstruction of connectivity length scale. Finally, we have also analyzed the distance dependence of the reconstructed probability of connection in the system. We have found a good agreement between real and reconstructed average length scale of connections, albeit we did find more local connections than present in the ground truth topology (see for example Figure 4B, bottom panel). This is an artifact due to light scattering, as it is suggested by the underestimated peak of the reconstructed connection distance histogram, which matches the characteristic length scale of simulated light scattering l sc .

Functional connectivity hubs
As shown in Figure 3 for simulated data and in Supplementary Figure S3 for actual culture data, epochs of synchronous bursting are associated to functional connectivity with a stronger degree of clustering and a weaker overlap with the underlying structural topology. This feature of functional connectivity is tightly related to the spatio-temporal organization of the synchronous bursts.
In Figure S2A, we highlight the position of selected nodes (of the simulated network considered in Figure 3), characterized by an above-average in-degree of functional connectivity (see Materials and Methods for a detailed definition in terms of TE scores). We denote these nodes -different in general for each of the dynamic regimes numbered from I to VII-as (state-dependent) functional connectivity hubs.
Given a specific hub, we can then define the community of its first neighbors in the corresponding functional network. Consistently across all dynamic ranges (but the noise-dominated range I) we find that synchronization within each of these functional communities is significantly stronger than between the communities centered on different hubs ((pv0:01, Mann-Whitney test, see Materials and Methods). The results of this comparison are reported in Figure S2B, showing particularly marked excess synchronization during burst build-up (ranges II, III and IV) or just prior to burst waning in the largest-size bursts (VII). Therefore, functional connectivity hubs reflect foci of enhanced local bursting synchrony.
Other studies (in brain slices) reported evidence of functional connectivity hubs, whose direct stimulation elicited a strong synchronous activation [69,70]. In [69], the functional hubs were also structural hubs. In the case of our networks, however only functional hubs associated to ranges II and III have an in-degree (and out-degree) larger than average as in [69]. In the other dynamic ranges, this tight correspondence between structural and functional hubs does not hold anymore. Nevertheless, in all dynamic ranges (but range I), we find that pairs of functional hubs have an approximately three-times larger chance of being structurally connected than pairs of arbitrarily selected nodes (not shown).
The timing of firing of these strong-synchrony communities is analyzed in Figure S2C. There we show that the average time of bursting of functional communities for different dynamic ranges is shifted relatively to the average bursting time over the entire network (details of the estimation are provided in Materials and Methods). This temporal shift is negative for the ranges II and III (indicating that functional hubs and related communities fire on average earlier than the rest of the culture) and positive for the ranges V to VII (indicating that firing of these communities occurs on average later than the rest of the culture). The highest negative time delay is detected in range III, such that the communities organized around its associated functional hubs can be described as local burst initiation cores [71,72].

Purely excitatory networks
In this study we did not consider inhibitory interactions, neither in simulations nor in experiments (GABAergic transmission was blocked), but we attempted uniquely the reconstruction of excitatory connectivity.
We would like to point out that this is not a general limitation of TE, since the applicability of TE does not rely on assumptions as to the specific nature of a given causal relationship -for instance about whether a synapse is excitatory or inhibitory. In this sense, TE can be seen as a measure for the absolute strength of a causal interaction, and is able in principle to capture the effects on dynamics of both inhibitory and excitatory connections. Note indeed that previous studies [39,56] have used TE to infer as well the presence of inhibitory connections. However, TE alone could not discriminate the sign of the interaction and additional post-hoc considerations had to be made in order to separate the retrieved connections into separate excitatory and inhibitory subgroups (e.g., in [39], based on the supposedly known existence of a difference in relative strength between excitatory and inhibitory conductances).
A simpler complex problem. By focusing on excitatory connectivity only, our intention was to simplify the full problem of network reconstruction, aiming as a first step to uncover systematically the strongest excitatory links in the network. Such simplification allows indeed to remove potential causes of error, like, e.g. disynaptic inhibition or synchronous excitatory and inhibitory inputs to a same cell, that might result in failed inference of both the excitatory and the inhibitory connections. Nonetheless, this simpler problem remains very difficult because we attempt to reconstruct not only few connections but an entire adjacency matrix for the excitatory subnetwork.
Excitation shapes spontaneous activity. The inference of excitatory connectivity is by itself a very relevant issue. Excitatory recurrent connections are indeed a strong -if not the maindeterminant of spontaneous activity. They act as an ''amplifier'', propagating to the network locally generated firing, and guaranteeing thus that the level of spontaneous activity of the culture remains elevated. Furthermore, modeling studies (see e.g. [9,73]) suggest that excitatory connections only are sufficient to obtain network bursts with experimentally observed statistics [74,75].
Sharper signals. Second, when moving to experimental data, neurons fire more strongly when inhibition is blocked, which makes the identification of firing occurrences more accurate. The amplitude of the fluorescence signal increases by a factor two or more when inhibition is blocked (see e.g. [64]). In Figure S8B we show an example of the fluorescence signal for the same neuron, before (blue) and after (black) blocking inhibitory synaptic transmission. Therefore the reconstruction problem of purely excitatory networks becomes simpler, not really from the algorithmic side (since the algorithm is potentially ready to cope with inhibition), but rather from the experimental side, because of an improved signal-to-noise ratio. Thus, when we generate synthetic data for algorithmic benchmarking, we aim at reproducing these same optimal experimental conditions.
We also note that the distribution of fluorescence values would be different for recordings with and without inhibition. Nevertheless, our method might still be applied, without need of qualitative changes, and state selection might still be performed. The optimal range for conditioning would be different and should again be estimated through a model-based benchmarking, but this would lead only to quantitative adjustments.
A two-steps strategy? A possible strategy to extend our method to the reconstruction of inhibitory connections could be to follow a two-steps experimental approach: first, reconstruct only excitatory connectivity, based on recordings in which inhibitory transmission is blocked; and second, reconstruct only inhibitory connectivity, based on recordings in which, after the wash-out of the culture, excitatory transmission is blocked. In such an experiment, when recurrent excitation is blocked, the spontaneous level of firing activity should be restored by chemical non-synaptic activation. We note however, that although complete blockade of excitation combined with drug-induced network activation is a standard protocol in slice studies (cfr. e.g. [76]), such an approach has never been attempted in cultures of dissociated neurons.
A compromise might be, therefore, to systematically compare the reconstructed connectivity before and after the wash-out of the inhibition blocker thereby collecting indirect evidence about the existence of inhibitory connection, through an analysis of the modulatory action they exert on the causal strengths of previously inferred excitatory connections.

Experimental paradigm
Connectivity in cultures vs. connectivity in vivo. We have here used our current TE algorithm to the inference of connectivity from calcium fluorescence recordings of in vitro cultured networks of dissociated cortical neurons. However, such studies of cultures do not yield direct information about the connectivity of cortical tissues in vivo. Several factors contribute to shape the neuronal circuits in an intact living brain, including the adequate orientation of neurons, dendrites, and axons, the biochemical guidance of processes towards their targets, and the refinement of circuitry through activity. Ultimately, development leads to a complex cortical structure organized both in layers and in columns, and with many particular topological features such as clusters or hierarchical organization [5].
All these structures -which are preserved in slices [7,53,77,78] -are completely lost during the dissociation of the embryonic brain that precedes the preparation of in vitro cultures. Neuronal cultures form new connections ''from scratch'', with a combination of short and long length connections, leading to circuits that have orders of magnitude less neurons and synapses. Nevertheless, understanding how neurons wire together spontaneously in a controlled medium is also of utmost importance, because it allows separating endogenous and exogenous driving components of neuronal wiring. Furthermore, cultures and slices share similar spontaneous bursting dynamics. If this observation alone should not be used to support the existence of shared topological features (cfr. Figure 2), it is true that the selforganization principles underlying the development of networks up to a bursting dynamical state may be common in all living neuronal networks.
Calcium imaging vs. multielectrode arrays. Our algorithm has been optimized for the application to real calcium imaging data, by determining an optimal conditioning range based only on qualitative features of the distribution of the average fluorescence in the network (very similar for synthetic and real data). Other studies have however used electrophysiological recordings from cultures grown on multielectrode arrays (MEAs) [12,15,17,18,74] as a basis for their topology reconstruction strategies (see e.g. [39,56]).
MEAs provide excellent temporal resolution and, to a certain extent, also the possibility to resolve individual spikes. However, MEAs have a limited number of electrodes and often neurons are not precisely positioned on an electrode but at its vicinity, which requires complex processing of the data to identify the source of a given spike. Additionally, the electrodes have to be in contact or embedded into the neuronal tissue, limiting its use to mostly cultures and brain slices.
Calcium imaging, in contrast, offers important advantages. First, the technique provides access to the activity of thousands of neurons in large fields of view for several hours [13,14], and with a time resolution proven to be sufficient for reconstruction in the present study. Second, calcium imaging with superior temporal and spatial resolution [22,79] is a technique that combines perfectly with new breakthroughs in experimental neuroscience, particularly optogenetics [23,80] and genetically encoded calcium indicators [80,81], technologies that are under expansion both in vitro [82] and in vivo [83]. Our reconstruction method can promptly be readapted for the analysis of other calcium imaging datasets. The analysis of in vitro cultures of dissociated neurons is just a first proof-of-concept of the applicability of generalized TE to real data.
As a matter of fact, extended TE might even be applied to multielectrode array data with very few modifications, along lines analogous to [39]. The advantages of an increased time-resolution might then be combined with the efficacy of the state-selection through conditioning concept.
Age of neuronal cultures. In our study we recorded from early mature (DIV 9-12) instead of fully mature cultures. Young but sufficiently mature cultures have rich bursting events while preserving some isolated activity. On the contrary, fully mature cultures show strong synchronized bursting dynamics with very little isolated neuronal activity [84]. In this sense, therefore, young cultures emerge as a model system which better matches the features of non pathologic cortical tissue activity. At the same time, conditions are ideal for an analysis focusing on inter-burst periods, as our one (different, in this sense, from an alternative approach focusing on burst initiation as [42].) Several authors have studied the process of maturation of neuronal cultures, and characterized their spontaneous activity along days or weeks [11,85]. Some studies have identified a stage of full maturation and stable bursting dynamics at the third week, a stage that can last for months (e.g. [84]). However, depending on neuronal density [14,18,20], glial density [86], and the gestation time of the embryo at the moment of dissection [14], spontaneous activity with rich episodes of population bursts can emerge as early as 20]).
On the other hand, GABA switch (the developmental event after which the action of GABAergic synapses become inhibitory as in fully developed networks and stops being excitatory [87]) in cultures similar to ours occurs at around day 7 [14,20]. Therefore, it has already taken place in early mature cultures at DIV 9 (such as the one analyzed in Figure S7). This is confirmed by the fact that the blockade of inhibition by bicuculline leads to an increase of the amplitude of the fluorescent signal by a factor 2-3 (see Figure S8B), as expected after GABA switch has occurred.

Non-local and moderately clustered connectivity in cultures
Evidence for long-range connections. Our TE-based algorithm applied to the reconstruction of the topology of neuronal cultures in vitro have inferred the existence of direct connections between neurons separated by a considerable distance. Indeed, the reconstructed distribution of connection distances peaks at a remarkably high value and is markedly highskewed ( Figure 8B).
Experimental studies showed that cortical slices have a maximum probability of connection at much shorter distances [52]. We note, however, that the density of cells in our culture is considerably more diluted than in normal cortical tissues. Furthermore, cortical developmental processes strongly dictate in vivo (and slice) connectivity [88], while a larger freedom to establish connections exist potentially in cultures in vitro. The growth process and the final length of the processes depend not only on the density of the culture and the population of glia, but also on the chemical properties of the substrate where neurons and process grow [89]. The lack of restrictions during the development of neuronal processes (axons and dendrites) leads to longer axonal lengths in the culture on average, explaining the high connection distance obtained in the reconstruction.
Such long axons in cultures have been reported in other studies [15,49,90], providing broad distributions of lengths with an average value of 1:1mm. Using Green Fluorescence Protein transfection [91], we have directly visualized axonal processes as long as 1:8mm in cultures of comparable age and density to the ones that we used for the fluorescence imaging recordings ( Figure  S8C). Such long axons would be required to mediate the longrange causal interactions captured by the TE analysis. As a matter of fact, as revealed by the randomization of the reconstructed connectivity, long average connection lengths might simply be expected to exist as a result of the bi-dimensional distribution of neurons on a substrate in combination with long individual axonal lengths (i.e. spanning the entire field of view).
For a younger culture at 9 DIV ( Figure S7), on the other hand, we retrieved a probability of connection at short distances higher than expected from randomized networks. This over-connectivity at short distances can be ascribed to the fact that young cultures have less developed axons and therefore a connectivity favored towards closer neurons. However, more exhaustive studiesgoing beyond the scope of the present work-would be needed to assess the full dependence of reconstructed network topology on culture age, neuronal density and spatial distribution, as well as on connection length.
Evidence for moderate clustering. Our TE-based analysis suggest an enhanced tendency to clustering of connections in the analyzed neuronal cultures. Indeed, although the average clustering coefficient is moderated, it is significantly higher than what might be expected based on randomized networks.
Neurons in culture aggregate during growth, giving rise to fluctuations in neuronal density. Since denser areas in the culture have been reported to have a higher connectivity [14], some level of clustering is naturally expected in the real network, as detected by the reconstruction. We could not however identify in our reconstructed topology a statistically significant correlation between the local density of neurons -quantified by the number of cells within a radius of 50mm centered on each given neuron-and the local degree of clustering (rv0:01 for both considered cultures).
This finding suggests the existence of a mixture of local and non-local clustering in the culture and indicates that network clustering in a sparse culture is not a mere byproduct of the inhomogeneous density of cells, but might reflect activitydependent mechanisms for synaptic wiring.
A bias of cross-correlation methods? XC-based reconstructions inferred a much more local average distance of connection, significantly smaller not only than the inference based on TE, but also than what expected from randomized networks. At the same time, XC-based methods inferred an average clustering coefficient almost twice as large as TE approaches.
As a matter of fact, the analyses of Figure 4 and 5 suggest that the connectivity of real cultures inferred by XC is prone to include artifactual features, such as an exceedingly local connectivity, paired to an overestimated level of clustering with respect to ground-truth (unknown in cultures).
Interestingly (and possibly indicative of biased estimations), the discrepancies between TE-based and XC-based inferences of mean connection distance and clustering level are paralleling both the systematic deviations between XC-bases and TE-based reconstructions identified in applications to surrogate data.
No evidence for scale free connectivity. Despite the observation that the distribution of degrees inferred by TE is characteristically broadened and has a conspicuous right tail, we have found no evidence of a scale free degree distribution (i.e. following a power-law [92]). A similar conclusion was also reached for cultured neuronal networks in Ref. [93].
On the contrary, studies like [42] hinted at a large overlap in cultures between the structural connectivity and the retrieved scale free functional connectivity, meeting other studies that identified power-law degree distributions in the functional connectivity of cultures [46] or of slices [70].
Whether connectivity in neural networks, in vitro or in vivo can be characterized as scale free or not is a highly debated issue (see e.g. [94]), and, especially for self-organized networks of dissociated neurons, it is likely to depend on details of how the culture is grown. The connectivity retrieved from our calcium recordings might possibly be better described as ''small-world'' [95], due to the existence of hub nodes with very high out-and in-degree. However, due to the large uncertainty in the reconstruction, we did not attempt any systematic assessment of the average path length between nodes (since this is a quantity very sensitive to error), and prefer to describe the reconstructed degree distribution just as ''right-skewed''.
How reliable are absolute values of reconstructed properties? Through our reconstructions of culture connectivity, we have provided actual absolute values for properties such as average degree, average clustering or average connection length. We note that such estimates are affected by a large uncertainty that goes beyond the variability described by their reconstructed distributions. Indeed, these estimates come from a reconstruction based on a specific choice of the number of links to include. This choice, as highlighted by Figure S1 for synthetic data, corresponds to selecting directly a specific value of the average culture in-and out-degree. Therefore, as previously discussed, the included link number should be determined by our expectations on the average degree of the network to reconstruct, based on independent experimental evidence or on extrinsic guiding hypotheses.
The inference of average clustering or of average distance is robust even against relatively large mistakes in the initial guess for the average degree (see e.g. Figure S1, for clustering estimation). In the case of the DIV 12 network reconstructed in Figure 8, doubling the threshold of included links from 5% to 10% (and therefore adopting a twice as large guess for the average culture degree) changes the inference for the average clustering coefficient from 0:090+0:010 to 0:157+0:011 and the average connection distance from 852+463mm to 855+464mm. Nevertheless, the precise values obtained do depend on the number of reconstructed links.
Due to the lack of information, a better strategy might be, rather than focusing on absolute estimated values, to focus on comparisons with fully and partially randomized networks with analogous average degree or degree distribution, respectively. Such comparisons indeed can convey qualitative information about the occurrence of non-trivial deviations from chance expectations, which are likely to be more reliable than quantitative assessments.

Conclusions and perspectives
In summary, we have developed a new generalization of Transfer Entropy for inferring connectivity in neuronal networks based on fluorescence calcium imaging data. Our new formalism goes beyond previous approaches by introducing two key ingredients, namely the inclusion of same bin interactions and the separation of dynamical states through conditioning of the fluorescence signal. We have thoroughly tested our formalism in a number of simulated neuronal architectures, and later applied it to extract topological features of real, cultured cortical neurons.
We expect that, in the future, algorithmic approaches to network reconstruction, and in particular our own method, will play a pivotal role in unravelling not only topological features of neuronal circuits, but also in providing a better understanding of the circuitry underlying neuronal function. These theoretical and numerical tools may well work side by side with new state-of-theart techniques (such as optogenetics or high-speed two-photon imaging [96][97][98][99][100]) that will enable direct large-scale reconstructions of living neuronal networks. Our Transfer Entropy formalism is highly versatile and could be applied to the analysis of in vivo voltage-sensitive dye recordings with virtually no modifications.
On a shorter time-scale, it would be important to extend our analysis to the reconstruction of both excitatory and inhibitory connectivity in in vitro cultures, which is technically feasible, and to compare diverse network characteristics, such as neuronal density or aggregation. Our algorithm could be used to systematically reconstruct the connectivity of cultures at different development stages in the quest for understanding the switch from local to global neuronal dynamics. Another crucial open issue is to design suitable experimental protocols allowing to confirm the existence of at least some of the inferred synaptic links, in order to validate statistically the reconstructed connectivity. For instance, the actual presence of directed links to which our algorithm assigns the largest TE scores might be systematically probed through targeted paired electrophysiological stimulation and recording. Furthermore, GFP transfection or inmunostaining might be used to obtain actual, precise anatomical data on network architecture to be compared with the reconstructed one. Finally, it might be interesting to reconstruct connectivity of cultured networks before and after physical disconnection of different areas of the culture (e.g. by mechanical etching of the substrate or by chemical silencing). These manipulations would provide a scenario to verify whether TEbased reconstructions correctly capture the absence of direct connections between areas of the neuronal network which are known to be artificially segregated.

Network construction and topologies
We generated synthetic networks with N~100 neurons, distributed randomly over a squared area of 0.5 mm lateral size. We chose p~0:12 as the connection probability between neurons [48], leading to sparse connectivities similar to those observed in local cortical circuits [53]. We used non-periodic boundary conditions to reproduce eventual ''edge'' effects that arise from the anisotropic cell density at the boundaries of the culture.
We considered two general types of networks: (i) a locally-clustered ensemble, where the probability of connection depended on the spatial distance between two neurons; and (ii) a non-locally clustered ensemble, with the connections engineered to display a certain degree of clustering.
For the case of a non-local clustering ensemble, we first created a sparse connectivity matrix, randomly generating links with a homogeneous probability of connection across pairs of neurons. We next selected a random pairs of links and ''crossed'' them (links A?B and C?D became A?D and C?B). We accepted only those changes that updated the clustering index in the direction of a desired target value, thereby maintaining the number of incoming as well as outgoing connections of each neuron. The crossing process was iterated until a clustering index higher or equal to the target value was reached. The overall procedure led to a full clustering index of the reference random network of 0:120+0:004 (mean and standard deviation, respectively, across 6 networks). After the rewiring iterations, we then achieved standard deviations from the desired target clustering value smaller than 0.1% for all higher clustering indices.
We measured the full clustering index of our directed networks according to a common definition introduced by [65]: The binary adjacency matrix is denoted by A, with A ji~1 for a link j?i, and zero otherwise. The adjacent matrix provides a complete description of the network topological properties. For instance, the in-degree of a node i can be computed as d in i~Pj A ji , and the out-degree as d out i~Pj A ij . The total number of links of a node is given by the sum of its in-degree and its out-degree (d tot i~d in i zd out i ). The number of bidirectional links of a given node i (i.e. links between i and j so that i and j are reciprocally connected by directed connections) is given by ii . The adjacency matrix did not contain diagonal entries. Such entries would correspond ''toautaptic'' links that connect a neuron with itself. Note that our directed functional connectivity analysis is based on bivariate time series, and therefore it would be structurally unfit to detect this type of links.
For the case of the local clustering ensemble, two neurons separated a Euclidean distance r were randomly connected with a distance dependent probability described by a Gaussian distribution, of the form p 0 (r)~exp({(r=l) 2 ), with l a characteristic length scale. To guarantee that a constant average number of links C was present in the network, this Gaussian distribution was rescaled by a constant pre-factor, obtained as follows. We first generated a network based on the unscaled kernel p 0 (r) and computed the resulting number of links C'. With this value we then generated a final network based on the rescaled kernel p(r)~C=C' exp({(r=l 2 )).

Simulation of the dynamics of cultured networks
The dynamics of the generated neuronal networks was studied using the NEST simulator [101,102]. We modeled the neurons as leaky integrate-and-fire neurons, with the membrane potential V i (t) of a neuron i described by [103,104]: where g l~5 0pS is the leak conductance and t m~2 0ms is the membrane time-constant. The term I syn accounts for a timedependent input current that arises from recurrent synaptic connections. In the absence of synaptic inputs, the membrane potential relaxes exponentially to a resting level set arbitrarily to zero. Stimulation in the form of inputs from other neurons increase the membrane potential, and above the threshold V thr~2 0mV an action potential is elicited (neuronal firing). The membrane voltage is then reset to zero for a refractory period of t ref~2 ms. The generated action potential excites post-synaptic target neurons. The total synaptic currents are then described by where A is the adjacency matrix, and t s~2 ms is a synaptic time constant. The resulting excitatory post-synaptic potentials (EPSPs) have a standard difference-of-exponentials time-course [105]. Neurons in culture show a rich spontaneous activity that originates from both fluctuations in the membrane potential and small currents in the pre-synaptic terminals (minis). The latter is the most important source of noise and plays a pivotal role in the generation and maintenance of spontaneous activity [106]. To introduce the spontaneous firing of neurons in Eq. (3), each neuron i was driven, through a static coupling conductance with strength a ext~4 :0pA, by independent Poisson spike trains (with a stationary firing rate of n ext~1 :6Hz, spikes fired at stochastic times ft l ext,i g).
Neurons were connected via synapses with short-term depression, due to the finite amount of synaptic resources [104]. We considered only purely excitatory networks to mimic the experimental conditions in which inhibitory transmission is fully blocked. Concerning the recurrent input to neuron i, the set ft k j g represents times of spikes emitted by a presynaptic neuron j, t d is a conduction delay of t d~2 ms, while a int sets a homogeneous scale for the synaptic weights of recurrent connections, whose timedependent strength a int E ji (t) depends on network firing history through the equations dR ji (t) dt~{ In these equations, E ji (t) represents the fraction of neurotransmitters in the ''effective state'', R ji (t) in the ''recovered state'' and I ji (t)~1{R ji {E ji in the ''inactive state'' [103,104]. Once a pre-synaptic action potential is elicited, a fraction U~0:3 of the neurotransmitters in the recovered state enters the effective state, which is proportional to the synaptic current. This fraction decays exponentially towards the inactive state with a time scale t inact~3 ms, from which it recovers with a time scale t rec~5 00ms. Hence, repeated firing of the presynaptic cell in an interval shorter than t rec gradually reduces the amplitude of the evoked EPSPs as the synapse is experiencing fatigue effects (depression). Random networks of integrate-and-fire neurons coupled by depressing synapses are well-known to naturally generate synchronous events [104], comparable to the all-or-none behavior that is observed in cultured neurons [46,47]. To obtain in our model a realistic bursting rate [47], the synaptic weight of internal connections was set to result into a network bursting of 0:10+0:01Hz for all the network realizations that we studied, and in particular for any considered (local or non-local) clustering level. Therefore, after having generated each network topology, we assigned the arbitrary initial value of a int~5 :0pA to internal synaptic weights and simulated 200 seconds of network dynamics, evaluating the resulting average bursting rate. If it was larger (smaller) than the target bursting rate, then the synaptic weight a int was reduced (increased) by 10%. We then iteratively adjusted a int by (linearly) extrapolating the last two simulation results towards the target bursting rate, until the result was closer than 0.01 Hz to the target value. The resulting used values of a int are provided in Table 1.
Note that we defined a network burst to occur when more than 40% of the neurons in the network were active within a time window of 50 ms. Such a criterion does not play any role in the reconstruction algorithm itself, where state selection is achieved through conditioning, but is only used for the automated generation of random networks with a prescribed bursting rate. Typically, for a fully developed burst, more than 90% of the neurons fire within a 50 ms bin, while, during inter-burst intervals, less than 10% do. Due to the clear separation between these two regimes, our burst detection procedure does not depend significantly on the precise choice of threshold within a broad interval.

Model of calcium fluorescence signals
To reproduce the fluorescence signal measured experimentally, we treated the simulated spiking dynamics to generate surrogate calcium fluorescence signals. We used a common model introduced in [43] that gives rise to an initial fast increase of fluorescence after activation, followed by a slow decay (t Ca~1 s). Such a model describes the intra-cellular concentration of calcium that is bound to the fluorescent probe. The concentration changes rapidly by a step amount of A Ca~5 0mM for each action potential that the cell is eliciting in a time step t, of the form where n t is the total number of action potentials. The net fluorescence level F associated to the activity of a neuron i is finally obtained by further feeding the Calcium concentration into a saturating static non-linearity, and by adding a Gaussian distributed noise g t with zero mean: For the simulations, we used a saturation concentration of K d~3 00mM and noise with a standard deviation of 0.03.

Modeling of light scattering
We considered the light scattered in a simulated region of interest (ROI) from surrounding cells. Denoting as d ij the distance between two neurons i and j and by l sc~0 :15 mm the scattering length scale (determined by the typical light deflection in the medium and the optical apparatus), the resulting fluorescence amplitude of a given neuron F sc i,t is given by A sketch illustrating the radius of influence of the light scattering phenomenon is given in Figure S9. The scaling factor A sc sets the overall strength of the simulated scattering artifact. Note that light scattered, according to the equation shown above, could be completely corrected using a standard deconvolution algorithm, at least for very large fields of view and a scattering length known with sufficient accuracy. In a real setup however, the relatively small fields of view (on the order of 3mm 2 ), the inaccuracies in inferring the scattering radius l sc , as well as the inhomogeneities in the medium and on the optical system, make perfect deconvolution not possible. Therefore, artifacts due to light scattering cannot be completely eliminated [107,108]. The scaling factor A sc , that we arbitrarily assumed to be small and with value A sc~0 :15, can be seen as a measure of this residual artifact component.

Generalized Transfer Entropy
In its original formulation [26], for two discrete Markov processes X and Y (here shown for equal Markov order k), the Transfer Entropy (TE) from Y to X was defined as: where n is a discrete time index and x (k) n is a vector of length k whose entries are the samples of X at the time steps n, n{1, …, n{k. The sum goes over all possible values of x nz1 , x (k) n and y (k) n . TE can be seen as the distance in probability space (known as the Kullback-Leibler divergence [109]) between the ''single node'' transition matrix P(x nz1 Dx (k) n ) and the ''two nodes'' transition matrix P(x nz1 Dx (k) n ,y (k) n ). As expected from a distance measure, TE is zero if and only if the two transition matrices are identical, i.e. if transitions of X do not depend statistically on past values of Y , and is greater than zero otherwise, signaling dependence of the transition dynamics of X on Y .
We use TE to evaluate the directed functional connectivity between different network nodes. In a pre-processing step, we apply a basic discrete differentiation operator to calcium fluorescence time series F sc x,t , as a rather crude way to isolate potential spike events. Thus, given a network node x, we define x n~F sc x,nz1 {F sc x,n . This pre-processing step also improves the signal-to-noise ratio, thus allowing for a better sampling of probability distributions with a limited number of data points. To adapt TE to our particular problem we need to take into account the general characteristics of the system. We therefore modified TE in two crucial aspects: 1. We take into account that the synaptic time constants of the neuronal network (*1ms) are much shorter than the acquisition times of the recording (*10ms). We therefore need to account for ''same bin'' causal interactions between nodes, i.e. between events that fall in the same time-bin. Slower interactions with longer lags are still taken into account by evaluating TE for a Markov order larger than one (in time-bins units). 2. We consider the possibility that the network dynamics switches between multiple dynamical states, i.e. between bursting and inter-bursting regimes. These regimes are characterized by different mean rates of activity and, potentially, by different transition matrices. Hence, we have to restrict the evaluation of TE to time ranges in which the network is consistently in a single dynamical state. The separation of dynamical states can be achieved by introducing a variable g t for the average signal of the whole network, We then include all data points at time instants in which this average fluorescence g t is below a predefined threshold parameter g g, i.e. we consider only the time points that fulfill ft : g t vg gg. We only make an exception that corresponds to the simulations of Figure 3 and Figure S2, where we considered time points that fall within an interval bounded by a higher and a lower thresholds, i.e. ft :g g low vg t vg g high g.
Using these two novel aspects, we have extended the original description of Transfer Entropy [Eq. (9)] to the following form TE Ã Y ?X (g g)X P(x nz1 ,x (k) n ,y (k) nz1 jg nz1 vg g)log P(x nz1 jx (k) n ,y (k) nz1 ,g nz1 vg g) P(x nz1 jx (k) n ,g nz1 vg g) : Probability distributions have to be evaluated as discrete histograms. Hence, the continuous range of fluorescence values (see e.g. the bottom panels of Figure 1) is quantized into a finite number B of discrete levels. We typically used a small B~3, a value that we justify based on the observation that the resulting bin width b is close to twice the standard deviation of the signal. The presence of large fluctuations, most likely associated to spiking events, is then still captured by such a coarse, almost nonparametric description of fluorescence levels.

Network reconstruction
Generalized TE values are obtained for every possible directed pair of network nodes, and using a fixed threshold levelg g. The set of TE scores are then ranked in ascending order and scaled to fall in the unit range. A threshold TE thr is then applied to the rescaled data, so that only those links with scores above TE thr are retained in the reconstructed network.
A standard Receiver-Operator Characteristic (ROC) analysis is used to assess the quality of the reconstruction by evaluating the number of true positives (reconstructed links that are present in the actual network) or false positives (not present), and for different threshold values TE thr [110]. The highest threshold value leads to zero reconstructed links and therefore zero true positives and false positives. At the other extreme, the lowest threshold provides both 100% of true positives and false positives. Intermediate thresholds give rise to a smooth curve of true/false positives as a function of the threshold. The performance of the reconstruction is then measured as the degree of deviation of this curve from the diagonal, and that corresponds to a random choice of connections between neurons.
To provide a simple method to compare different reconstructions, we arbitrarily use the quantity TE 10% , defined as the fraction of true positives for a 10% of false positives, as indicator for the quality of the reconstruction.
An alternative to the ROC is the Positive Prediction Curve [56], plotting the ''true-false ratio'' (TFR) against the number of reconstructed links, called ''true-false sum'' (TFS). The TFR represents the fraction of true positives relative to the false positives. Denoting by #TP the absolute number of true positives and by #FP the number of false positives, TFR is therefore defined in [56] in the following way: The case TFR~0 corresponds to the case that, for any given reconstructed link, it is on average equally likely that it is in fact a true positive rather than a false positive.

Alternative reconstruction methods
To gain further insight into the quality of our reconstruction method, we compare reconstructions based on TE with three other reconstruction strategies, namely cross-correlation, mutual information, and Granger causality.
Cross-correlation (XC) reconstructions are based on standard Pearson cross-correlation. The score assigned to each potential link is given by the largest cross-correlogram peak for lags between 0 and t max~6 0ms, of the form XC Y ?X~m ax Dt~0:::tmax corr x (S{Dt) S ,y (S{Dt) In a similar way, the scores for Mutual Information (MI) reconstructions are evaluated as MI Y ?X~m ax Dt~0:::tmax X P(x n ,y n{Dt jg n vg g) n log P(x n ,y n{Dt jg n vg g) P(x n jg n vg g)P(y n{Dt jg n vg g) Analogously to TE, the sum goes over all entries of the joint probability matrix.
For the reconstruction based on Granger causality (GC) [33] we first model the signal x t by least-squares fitting of a univariate autoregressive model, obtaining the coefficients a 0 k and the residual g 0 , In a second step, we fit a second bivariate autoregressive model that includes the potential source signal y t , and determine the residual g 1 , Note that in the latter bivariate regression scheme we take into account ''same bin'' interactions as for Transfer Entropy (index of the second sum starts at m~0). Given C 0 , the covariance matrix of the univariate fit in Eq. (15), and C 1 , the covariance matrix of the bivariate fit in Eq. (16), GC is then given by the logarithm of the ratio between their traces: GC analyses were performed at an order k~2. Analyses at k~1 yielded however fully analogous performance (not shown). We note that the same pre-processing used for TE is also adopted for all the other analyses. The same holds for conditioning on the value of the average fluorescence g n , which can be applied simply by only including the subset of samples in which g t vg g.

Hubs of (causal) connectivity
Connectivity in reconstructed networks is often inhomogeneous, and groups of nodes with tighter internal connectivity are sometimes visually apparent (see e.g. reconstructed topologies in Figure 3C). We do not attempt a systematic reconstruction of network communities [111], but we limit ourselves to the detection of ''causal sink'' nodes [112], which have a larger than average indegree. We define this property in terms of the sum of TE from all other nodes to one particular node ( P j TE j?i ), choosing the top 20 nodes for each particular network as selected ''hub nodes''.
We then analyze the dynamics of these selected hub nodes and of their neighbors. Specifically we define as C the subgraph spanned by a given hub node and by its first neighbors. We analyze then the cross-correlogram of the average fluorescence of a given group C with the average fluorescence of the whole culture: The D-notation indicates that we correlate discretely differentiated average fluorescence time series, rather than the average time series themselves. Indeed, cross-correlograms for these differentiated time series are well modeled by a Gaussian functional form, due to the slow change of the averaged fluorescence compared to the sampling rate. Therefore, we fit a Gaussian to the cross-correlogram y(t): determining thus a cross-correlation amplitude A C , a crosscorrelation peak lag t C and the standard deviation s C . The cross-correlation peak lag t C indicates therefore whether nodes in a given local hub neighborhood C fire on average earlier or later than other neurons in the network.
Relative strength of synchrony within a local hub neighborhood C can be analogously evaluated by computing XCs, as defined in Eq. (13), for all the links within C and comparing it with peak XCs over the entire network.

Experimental preparation
Primary cultures of cortical neurons were prepared following standard procedures [14,113]. Cortices were dissected from Sprague-Dawley embryonic rat brains at 19 days of development, and neurons dissociated by mechanical trituration. Neurons were plated onto 13 mm glass cover slips (Marienfeld, Germany) previously coated overnight with 0.01% Poly-l-lysine (Sigma) to facilitate cell adhesion. Neuronal cultures were incubated at 37 0 C, 95% humidity, and 5%CO 2 for 5 days in plating medium, consisting of 90% Eagle's MEM -supplemented with 0.6% glucose, 1% 1006 glutamax (Gibco), and 20mg=ml gentamicin (Sigma) -with 5% heat-inactivated horse serum (Invitrogen), 5% heat-inactivated fetal calf serum (Invitrogen), and 1mg=ml B27 (Invitrogen). The medium was next switched to changing medium, consisting of of 90% supplemented MEM, 9.5% heat-inactivated horse serum, and 0.5% FUDR (5-fluoro-deoxy-uridine) for 3 days to limit glia growth, and thereafter to final medium, consisting of 90% supplemented MEM and 10% heat-inactivated horse serum. The final medium was refreshed every 3 days by replacing the entire culture well volume. Typical neuronal densities (measured at the end of the experiments) ranged between 500 and 700neurons=mm 2 . Cultures prepared in these conditions develop connections within 24 hours and show spontaneous activity by day in vitro (DIV) 3-4 [14,18,20]. GABA switch, the change of GABAergic response from excitatory to inhibitory, occurs at DIV 6-7 [14,20].
Neuronal activity was studied at day in vitro (DIV) 9-12. Prior to imaging, cultures were incubated for 60 min in pH-stable recording medium in the presence of 0.4% of the cell-permeant calcium sensitive dye Fluo-4-AM (Invitrogen).
The culture was washed off Fluo-4 after incubation and finally placed in a chamber filled with fresh recording medium. The chamber was mounted on a Zeiss inverted microscope equipped with a 56 objective and a 0.46 optical zoom.
Neuronal activity was monitored through high-speed fluorescence imaging using a Hamamatsu Orca Flash 2.8 CMOS camera attached to the microscope. Images were acquired at a speed of 100 frames/s (i.e. 10 ms between two consecutive frames), which were later converted to a 20 ms resolution using a sliding window average to match the typical temporal resolution of such recordings [13,43,45,90,114]. The recorded images had a size of 648|312 pixels with 256 grey-scale levels, and a final spatial resolution of 3:4 mm=pixel. This settings provided a final field of view of 2:2|1:1 mm 2 that contained on the order of 2000 neurons.
Activity was finally recorded as a long image sequence of 60 minutes in duration. We verified that the fluorescence signal remained stable during the recording, as shown in Supplementary Figure S8A, and we did not observe neither photo-bleaching of the calcium probe nor photo-damage of the neurons.
Before the beginning of the experiment, inhibitory synapses were fully blocked with 40 mM bicuculline, a GABA A antagonist, so that activity was solely driven by excitatory neurons.
Since cultures were studied after GABA switch, the blockade of inhibition resulted in an increase of the fluorescence amplitude, which facilitated the detection of neuronal firing, as illustrated in Figure S8B.
The image sequence was analyzed at the end of the experiment to identify all active neurons, which were marked as regions of interest (ROIs) on the images. The average grey-level on each ROI along the complete sequence finally provided, for each neuron, the fluorescence intensity as a function of time. Each sequence typically contained on the order of a hundred bursts.
Examples of recorded fluorescence signal for individual neurons are shown in Figure 1B.

Analysis of experimental recordings
The fluorescence data obtained from recordings of neuronal cultures was analyzed following exactly the same procedures used for simulated data (e.g. processed in a pipeline including discrete differentiation, TE or other metrics evaluation, ranking, and final thresholding to maintain the top 10% of connections).
Due to the lack of knowledge of the ground-truth topology, optimal conditioning level cannot be known. However, based on the similarity between experimental and simulated distributions of calcium fluorescence, we select a conditioning level such to exclude the high fluorescence transients associated to fullydeveloped bursting transients while keeping as many data points as possible. Concretely, this is achieved by taking a conditioning level equal to approximately two standard deviations above the mean of a Gaussian fit to the left peak of the fluorescence histogram. Such a level coincides with the point where, when gradually increasing the conditioning level, the reconstructed clustering index reaches a plateau, i.e. matches indicatively the upper limit of range II in Figure 3.
To check for robustness of our reconstruction, we generated alternative reconstructions based on different conditioning levels. For the selected conditioning value, and for both the experimental datasets analyzed (Figures 8 and S7), we verified that the inferred topological features, including notably the average clustering coefficient and connection distance, were stable in a range centered on the selected conditioning value and wide as much as approximately two standard deviations of the fluorescence distribution.
To identify statistically significant non-random features of the real cultured networks in exam, we compared the reconstructed topology to two randomizations.
A first one consisted in a complete randomization that preserved only the total number of connections in the network, but scrambled completely the source and target nodes. The resulting random ensemble of graphs was an Erdös-Rényi ensemble (see, e.g. [62]) in which each possible link exists with a uniform probability of connection p~C=(N(N{1)), where C is the total number of connections in the reference reconstructed network.
A second partial randomization preserved the in-degree distributions only, and was implemented by shuffling the entries of each row of the reconstructed adjacency matrix, internally rowby-row. In this way, the out-degrees of each node were preserved. In both randomization processes, we disallowed diagonal entries.
For both randomizations we calculated the in-degree, the distance of connections and the full clustering index for each node, leading to distributions of network topology features that could be compared between the reconstructed network and the randomized ensembles, to identify significant deviations from random expectancy. Figure S1 Dependence of reconstructed degree and clustering coefficient on the fraction of included links. We report here analyses for three representative non-locally clustered networks with clustering coefficients 0.1, 0.3 and 0.5 (as in Figure 2). A The average degree, by construction, varies linearly with the fraction of included links (threshold). B The reconstructed clustering coefficient shows a steep rise at low threshold values, and describe broad hill-like profiles between approximately 5% and 10% of reconstructed links, which peaks around CC values approximately matching the ground truth clustering value of the network. For higher values of the threshold, the dependency of the reconstructed clustering coefficient on the threshold is closer to linear. Note that by definition of the ''full'' clustering coefficient, measured here (see Materials and Methods section), a random network would show a perfect linear correlation. (EPS) Figure S2 Hubs of functional connectivity. A Plot of the spatial position of neurons of an example simulated network, highlighting functional connectivity hubs (red dots) for each of the dynamic intervals I-VII depicted in Figure 3. B For the corresponding dynamical regimes, the average cross-correlation between calcium signals of first-neighbor nodes of functional connectivity hubs (blue bars, see Materials and Methods) is compared to the average cross-correlation between nodes inside the latter local groups and the rest of the population (green). Significance analysis is carried out using a Mann-Whitney test (n.s.: not significant). The results show that the degree of synchronization within the neighborhood of functional connectivity hubs is significantly higher than across groups. C Cross-correlogram time-shift of the average activity of the same local groups with respect to the whole network average (see Materials and Methods, cfr. also Figure S3). Negative shifts denote ''bursting earlier'' and positive shifts ''bursting later''. The analysis shows that the calcium fluorescence signal in the neighborhood of functional connectivity hubs for the dynamic ranges II-IV is ''leading'', i.e. has a negative lag in relation to the average population. Such functional connectivity hubs correspond therefore to foci of burst initiation. (EPS) Figure S3 State dependency in simulated and real data. Two experiments are considered, ''A'' being the one at DIV 12 ( Figure 8) and ''B'' corresponding to the one at DIV 9 ( Figure S7), together with a simulated data set, corresponding to the network from the local clustering ensemble (studied in Figure 5). For the three cases, we separate the fluorescence signal into four ranges and identify to each a different dynamical regime (encoded by different colors, top row). We then compute, in each one of these regimes, the clustering index (central row) and the average connection distance (bottom row). While the exact reconstructed values of the clustering index and the average connection distance are of course different, all three datasets share their main qualitative features, e.g. the fact that the reconstructed clustering index is peaked in range ''III'' (marked in yellow in the top panels) where their average connection distance is lowest. (EPS) Figure S4 Analysis of additional topological features. A In non-locally clustered topologies, the non-linear causality measures, Mutual Information (MI, red) and Transfer Entropy (TE, yellow), correctly estimate the average connection distance, while a linear measure such as cross-correlation (XC, blue) fail to do so. The latter invariably under-estimates this distance, as it does for the locally clustered ensemble (cfr. Figure 5D). Note that nonlocally clustered topologies are random by construction in terms of the spatial distribution of connections, and, therefore, they display a virtually identical average connection length, independent of the clustering coefficient. B In locally-clustered topologies, the nonlinear measures, MI (red) and TE (yellow), also provide a good estimation of the resulting clustering coefficient, in contrast with cross-correlation, XC (blue). The latter invariably overestimates the clustering level, as it does for the non-locally clustered ensemble (cfr. Figure 4D). In both plots, the dashed line corresponds to perfect reconstruction. (EPS) Figure S5 Dependence of performance on characteristic length scale, conditioning level and light scattering. The color panels show the overall reconstruction performance level, quantified by TP 10% (black, 0% true positives; white, 100% true positives), for different target ground-truth clustering coefficients and as a function of the used conditioning level. Five different reconstruction algorithms are compared: cross-correlation (XC), Granger Causality (GC) with order k~2, Mutual Information (MI), and Transfer Entropy (TE) with Markov orders k~1,2. The top row corresponds to simulations without artifacts, and the bottom row to simulations including light scattering. TE of order k~2 and with light scattering provides a fair reconstruction quality at any length scale for a conditioning valueg g^0:1. (Note that the scale bar of reconstruction performance is different from the one in studied at day in vitro 9. Only the top 5% of connections are shown. C Properties of the network inferred from TE reconstruction method (top panels) compared to a cross-correlation (XC) analysis (bottom panels). The figure shows reconstructed distributions for the in-degree (left column), the connection distance (middle column), and the clustering coefficient (right column). In addition to the actual reconstructed histograms (yellow), distributions for randomized networks are also shown. Blue color refers to complete randomizations that preserves only the total number of connections, and red color to partial randomizations that shuffle only the target connections of each neuron in the reconstructed network.