Information Flow in Networks and the Law of Diminishing Marginal Returns: Evidence from Modeling and Human Electroencephalographic Recordings

We analyze simple dynamical network models which describe the limited capacity of nodes to process the input information. For a proper range of their parameters, the information flow pattern in these models is characterized by an exponential distribution of the incoming information and a fat-tailed distribution of the outgoing information, as a signature of the law of diminishing marginal returns. We apply this analysis to effective connectivity networks from human EEG signals, obtained by Granger causality, which has recently been given an interpretation in the framework of information theory. The distributions of the incoming and outgoing values of the information flow show that the incoming information is exponentially distributed whilst the outgoing information has a fat tail. This suggests that overall brain effective connectivity networks may also be considered in the light of the law of diminishing marginal returns. Interestingly, this pattern is reproduced locally but with a clear modulation: a topographic analysis, considering the distribution of incoming and outgoing values at each electrode, suggests a functional role for this phenomenon.


Introduction
Most social, biological, and technological systems can be modeled as complex networks, and display substantial non-trivial topological features [1,2]. Moreover, time series of simultaneously recorded variables are available in many fields of science; inferring the underlying network structure from these time series is an important problem that has received great attention in recent years. A method based on chaotic synchronization has been proposed in [3], and a method based on model identification has been described in [4]. The use of a phase slope index to detect the directionality of interactions has been proposed in [5].
The inference of dynamical networks is also related to the estimation, from data, of the flow of information between variables, as measured by the transfer entropy [6,7]. Wiener [8] and Granger [9] formalized the notion that, if the prediction of one time series can be improved by incorporating the knowledge of past values of a second one, then the latter is said to have a causal influence on the former. Initially developed for econometric applications, Granger causality has also gained popularity among physicists (see, e.g., [10-15]) and eventually became one of the methods of choice to study brain connectivity in neuroscience [16]. Multivariate Granger causality may be used to infer the structure of dynamical networks from data, as described in [17]. It has recently been shown that for Gaussian variables Granger causality and transfer entropy are equivalent [18], and this framework has also been generalized to other probability densities [19]. Hence a weighted network obtained by Granger causality analysis can be given an interpretation in terms of the flow of information between different components of a system. This way of looking at information flow is particularly relevant for neuroscience, where it is crucial to shed light on the communication among neuronal populations, which is the mechanism underlying information processing in the brain [20]. Furthermore, recent studies have investigated the economic implications of several network types mapping brain function [21,22].
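For reference, the equivalence invoked above can be written compactly. The notation below is chosen here for illustration and is not taken from the cited works: X is the target series, Y the source, X^- and Y^- their past values, and ε the residual of the corresponding linear regression.

```latex
\begin{align}
  F_{Y \to X} &= \ln
    \frac{\operatorname{var}\!\left(\varepsilon_{X \mid X^{-}}\right)}
         {\operatorname{var}\!\left(\varepsilon_{X \mid X^{-},\, Y^{-}}\right)}, \\
  T_{Y \to X} &= \tfrac{1}{2}\, F_{Y \to X}
    \qquad \text{(jointly Gaussian variables)} .
\end{align}
```

Under this reading, a Granger causality value differs from the corresponding transfer entropy only by a constant factor, so the two give the same ranking of links in a weighted network.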
In many situations it can be expected that each node of the network may handle a limited amount of information. This structural constraint suggests that information flow networks should exhibit some topological evidence of the law of diminishing marginal returns [23], a fundamental principle of economics which states that when the amount of a variable resource is increased, while other resources are kept fixed, the resulting change in the output will eventually diminish [24,25]. The purpose of this work is to introduce a simple dynamical network model where the topology of connections, assumed to be undirected, gives rise to a peculiar pattern of the information flow between nodes: a fat-tailed distribution of the outgoing information flows, while the average incoming information flow does not depend on the connectivity of the node. In the proposed model the units, at the nodes of the network, are characterized by a transfer function that allows them to process just a limited amount of the incoming information. We show that a similar behavior is observed in another network model, which describes the law of diminishing marginal returns in a different fashion. Moreover, we also propose an Ising model on sparse networks, exactly solvable in the limit of an infinite number of nodes, whose behavior may be seen in the light of the law of diminishing marginal returns. Finally, we show that this relevant topological feature is also found in real neural data.

Materials and Methods
We implement three models on different network structures. Then we analyze human EEG data.

Model 1
The first model we propose is as follows. Given an undirected network of n nodes with symmetric connectivity matrix A_ij ∈ {0,1}, to each node we associate a real variable x_i whose evolution, at discrete times, is given by eqs. (1), where ξ_i are unit-variance Gaussian noise terms whose strength is controlled by σ, and F is a transfer function, eq. (2), with threshold value θ. This transfer function is chosen to mimic the fact that each unit is capable of handling only a limited amount of information. For large θ our model becomes a linear map. At intermediate values of θ, the nonlinearity connected to the threshold will mainly affect the most connected nodes (hubs): the input Σ_j A_ij x_j to nodes with low connectivity will typically remain sub-threshold in this case. We consider hierarchical networks generated by the preferential attachment mechanism [26]. From numerical simulations of eqs. (1), we evaluate the linear causality pattern for this system as the threshold is varied. We verify that, in spite of the threshold, the variables are nearly Gaussian, so that we may identify the causality with the information flow between variables [18].
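A minimal simulation sketch of a model of this kind is given below. It assumes an update of the form x_i(t+1) = F(Σ_j A_ij x_j(t)) + σ ξ_i(t) and a transfer function that is linear with slope a for inputs below the threshold θ in absolute value and constant beyond it. The slope a, the function names, and the network generator are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def preferential_attachment(n, m, rng):
    """Symmetric 0/1 adjacency matrix grown by preferential attachment.

    Hypothetical helper: each new node attaches to m existing nodes,
    chosen with probability proportional to their current degree.
    """
    A = np.zeros((n, n), dtype=int)
    # Start from a fully connected core of m + 1 nodes.
    A[:m + 1, :m + 1] = 1 - np.eye(m + 1, dtype=int)
    for new in range(m + 1, n):
        deg = A[:new, :new].sum(axis=1).astype(float)
        targets = rng.choice(new, size=m, replace=False, p=deg / deg.sum())
        A[new, targets] = 1
        A[targets, new] = 1
    return A

def saturating(u, a, theta):
    """Assumed transfer function: slope a for |u| < theta, constant beyond."""
    return a * np.clip(u, -theta, theta)

def simulate_model1(A, a, theta, sigma, steps, rng):
    """Iterate the assumed map; returns an array of shape (steps, n)."""
    n = A.shape[0]
    x = np.zeros((steps, n))
    for t in range(steps - 1):
        noise = sigma * rng.standard_normal(n)
        x[t + 1] = saturating(A @ x[t], a, theta) + noise
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = preferential_attachment(30, 2, rng)
    # theta = 0.012 is the value quoted in the text; a and sigma are assumed.
    x = simulate_model1(A, a=0.1, theta=0.012, sigma=0.1, steps=10000, rng=rng)
    print(x.shape)
```

In this sketch hubs receive the largest summed input and therefore reach the saturated regime first, which is the mechanism described in the paragraph above.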

Model 2
We also analyze the following model: to each node of an undirected network we associate the variable x_i whose evolution is given by eqs. (3), where j(t) is a node chosen at random, at each time t, from the set of neighbors of i. Equations (3) implement, in a different way from eqs. (1), the fact that nodes may handle only a limited amount of incoming information: at each time, each node is influenced by just one other node.
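The sketch below illustrates a dynamics of this type under a stated assumption: the update is taken to be x_i(t+1) = a x_j(t)(t) + σ ξ_i(t), with a a coupling constant and ξ_i unit-variance Gaussian noise. This functional form and all names are hypothetical; the text above only fixes that each node listens to a single, randomly chosen neighbor at each time step.

```python
import numpy as np

def simulate_model2(neighbors, a, sigma, steps, rng):
    """Simulate the assumed single-input dynamics.

    neighbors[i] is an array with the indices of the nodes adjacent to i.
    Returns an array of shape (steps, n).
    """
    n = len(neighbors)
    x = np.zeros((steps, n))
    for t in range(steps - 1):
        # Each node picks one neighbor uniformly at random at this time step.
        chosen = np.array([rng.choice(nb) for nb in neighbors])
        x[t + 1] = a * x[t, chosen] + sigma * rng.standard_normal(n)
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    # Toy undirected ring of six nodes, used only to exercise the function.
    ring = [np.array([(i - 1) % 6, (i + 1) % 6]) for i in range(6)]
    x = simulate_model2(ring, a=0.5, sigma=1.0, steps=4000, rng=rng)
    print(x.shape)
```

Because only one neighbor contributes at a time, the instantaneous input to a node does not grow with its degree, whereas a hub is selected as input by many neighbors.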

Model 3
As another example we consider a diluted Ising model on a directed network [27,28], constructed as follows. The model is made of N Ising spins σ_i = ±1, each connected (with coupling J) to k input spins, chosen at random among the N - 1 remaining spins. The number of incoming links for each spin, the in-degree k, is independently sampled with probability P_in(k), k = 1, ..., K_max, K_max being the maximum value that k may assume. The dynamics of the system corresponds to parallel updating of the Ising variables {σ_i}, i = 1, ..., N, where the local field acting on σ_i is a sum over its input spins and J is the positive coupling. We will consider the limit K_max ≪ ln N: it is well known that input spins may be treated as independent stochastic variables in this limit, which simplifies the numerical evaluation of T_E(k), the transfer entropy from one input spin to a target spin of connectivity k (see, e.g., [29]). For N → ∞ the out-degree distribution of spins, P_out(k), is a Poisson distribution. We denote by c_in(k) the input flow of information for a spin with in-degree k, by c_out(k) the average information flow outgoing from a spin of out-degree k, and by ρ_in(c) the distribution of c_in in the whole system.
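The construction of the directed network can be checked numerically with the short sketch below. It samples in-degrees from a power law, wires each spin to randomly chosen inputs, and compares the resulting out-degrees with the Poisson expectation (variance equal to mean). Since every incoming link of one spin is an outgoing link of another, the mean out-degree necessarily equals the mean in-degree. The function names and the finite N used here are illustrative assumptions; at this size the condition K_max ≪ ln N is not met, so the example only illustrates the wiring, not the exact solution.

```python
import numpy as np

def sample_in_degrees(n, alpha, k_max, rng):
    """Draw n in-degrees from P_in(k) proportional to k**(-alpha), k = 1..k_max."""
    k = np.arange(1, k_max + 1)
    p = k ** (-float(alpha))
    return rng.choice(k, size=n, p=p / p.sum())

def out_degrees(in_deg, rng):
    """Wire each spin to in_deg[i] distinct random inputs; count outgoing links."""
    n = len(in_deg)
    out = np.zeros(n, dtype=int)
    for i, k in enumerate(in_deg):
        inputs = rng.choice(n - 1, size=k, replace=False)
        inputs[inputs >= i] += 1  # shift indices so that spin i is never its own input
        out[inputs] += 1
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    k_in = sample_in_degrees(2000, alpha=1.5, k_max=100, rng=rng)
    k_out = out_degrees(k_in, rng)
    # For a Poisson distribution the variance-to-mean ratio is 1.
    print(k_in.mean(), k_out.mean(), k_out.var() / k_out.mean())
```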

Human EEG Data
As a real example we consider electroencephalogram (EEG) data. We used recordings obtained at rest from 10 healthy subjects. During the experiment, which lasted for 15 min, the subjects were instructed to relax and keep their eyes closed. To avoid drowsiness, every minute the subjects were asked to open their eyes for 5 s. EEG was measured with a standard 10-20 system consisting of 19 channels [5]. Data were analyzed using the linked mastoids reference, and are available from [30].

Results

Model 1
Concerning the first model, we compute the incoming and outgoing information flow to and from each node, c_in and c_out, by summing respectively over all the sources for a given target and over all the targets for a given source. We then evaluate the standard deviations of the distributions of c_in and c_out, varying the realization of the preferential attachment network and running eqs. (1) for 10000 time points.
In figure 1 we depict R, the ratio of the standard deviation of c_out to that of c_in, as a function of the threshold θ. As the threshold is varied, we encounter a range of θ for which the distribution of c_in is much narrower than that of c_out. In the same figure we also depict the corresponding curve for deterministic scale-free networks [31], which exhibits a similar peak, and for homogeneous random graphs (or Erdős-Rényi networks [32]), with R always very close to one. The discrepancy between the distributions of the incoming and outgoing causalities thus arises in hierarchical networks. We remark that, in order to quantify the difference between the distributions of c_in and c_out, here we use the ratio of standard deviations, but qualitatively similar results would be obtained using other measures of discrepancy.
In figure 2 we report the scatter plot in the c_in-c_out plane for preferential attachment networks and for some values of the threshold. The distributions of c_in and c_out, with θ equal to 0.012 and corresponding to the peak of figure 1, are depicted in figure 3: c_in appears to be exponentially distributed around a typical value, whilst c_out shows a fat tail. In other words, the power-law connectivity of the underlying network influences only the distribution of outgoing causalities.
In figure 4 we show the average value of c_in and c_out versus the connectivity k of the network node: c_out grows uniformly with k, thus confirming that its fat tail is a consequence of the power law of the connectivity. On the contrary, c_in appears to be almost constant: on average the nodes receive the same amount of information, irrespective of k, whilst the outgoing information from each node depends on the number of neighbors.
It is worth mentioning that, since a precise estimation of the information flow is computationally expensive, our simulations are restricted to rather small networks; in particular, the distribution of c_out appears to have a fat tail but, due to our limited data, we cannot claim that it corresponds to a simple power law.
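The quantities c_in, c_out and R described above can be computed from a matrix of pairwise information flows as in the following sketch. The convention that entry C[i, j] holds the flow from source i to target j, and the function name, are assumptions made for illustration.

```python
import numpy as np

def flow_ratio(C):
    """Return (c_in, c_out, R) for a matrix of pairwise information flows.

    Assumed convention: C[i, j] is the flow from node i (source) to node j
    (target). Diagonal entries are ignored.
    """
    C = np.asarray(C, dtype=float)
    off = C - np.diag(np.diag(C))
    c_out = off.sum(axis=1)  # sum over all targets for each source
    c_in = off.sum(axis=0)   # sum over all sources for each target
    R = c_out.std() / c_in.std()
    return c_in, c_out, R

if __name__ == "__main__":
    C = [[0.0, 1.0, 2.0],
         [0.5, 0.0, 0.5],
         [1.0, 1.0, 0.0]]
    c_in, c_out, R = flow_ratio(C)
    print(c_in, c_out, R)
```

A value of R well above one indicates that outgoing flows are much more widely spread across nodes than incoming flows.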

Model 2
A fat tail in the distribution of c_out is also observed in model 2: in figure 5 we depict R as a function of a, for preferential attachment networks of different sizes. The discrepancy between the distributions of c_in and c_out increases as the size of the network grows while a is kept fixed.

Model 3
As already stated, model 3 is exactly solvable in the limit N → ∞. In figure 6 we depict c_in and c_out versus k, for a power-law in-degree distribution P_in(k) ∝ k^(-α), with α = 1.5, K_max = 100 and J = 0.5. The incoming information flow tends to saturate for spins with large in-degree.
In figure 7 we depict ρ_in(c) for several values of J, for a power-law in-degree distribution with α = 1.5. For low J the distribution of c_in appears to be a power law, like the in-degree distribution: ρ_in(c) ∝ J^(2(α-1)) c^(-α) at small J. As J increases, the distribution tends to become exponential, in spite of the power law of the input connectivity. These results are robust with respect to changes in the exponent α.

EEG Data
For each subject we considered several epochs of 4 seconds in which the subjects kept their eyes closed. For each epoch we computed multivariate Kernel Granger Causality [15] using a linear kernel and a model order of 5, determined by leave-one-out cross-validation. We then pooled all the values for information flow towards and from any electrode and analyzed their distribution.
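As an illustration of how a matrix of pairwise causalities can be obtained from a multichannel epoch, the sketch below uses ordinary least-squares conditional Granger causality with model order 5. This is a plainly swapped-in technique, not the kernel method of [15]: the log-ratio index, the function names, and the synthetic data are assumptions chosen for simplicity.

```python
import numpy as np

def lagged(X, p):
    """Row r holds X[t-1], ..., X[t-p] for t = p + r, all channels side by side."""
    T = X.shape[0]
    return np.hstack([X[p - k:T - k] for k in range(1, p + 1)])

def residual_variance(y, Z):
    """Mean squared residual of a least-squares fit of y on Z plus an intercept."""
    Z1 = np.column_stack([np.ones(len(Z)), Z])
    beta, *_ = np.linalg.lstsq(Z1, y, rcond=None)
    return np.mean((y - Z1 @ beta) ** 2)

def granger_matrix(X, p=5):
    """G[i, j] = ln(var without the past of i / var with it), for target j.

    X has shape (time, channels); the past of all other channels is always
    kept in the regression, so the index is conditional (multivariate).
    """
    T, n = X.shape
    G = np.zeros((n, n))
    full = lagged(X, p)
    for j in range(n):
        y = X[p:, j]
        v_full = residual_variance(y, full)
        for i in range(n):
            if i == j:
                continue
            # Column c of the lagged matrix belongs to channel c % n.
            keep = [c for c in range(full.shape[1]) if c % n != i]
            v_reduced = residual_variance(y, full[:, keep])
            G[i, j] = np.log(v_reduced / v_full)
    return G

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.standard_normal((2000, 3))
    for t in range(1, 2000):
        X[t, 1] += 0.8 * X[t - 1, 0]  # synthetic data: channel 0 drives channel 1
    print(np.round(granger_matrix(X, p=5), 3))
```

Summing the rows and columns of such a matrix, pooled over epochs, gives the outgoing and incoming values whose distributions are analyzed in what follows.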
In figure 8 we plot the incoming versus the outgoing values of the information flow, as well as the distributions of the two quantities: the incoming information seems exponentially distributed whilst the outgoing information shows a fat tail. These results suggest that overall brain effective connectivity networks may also be considered in the light of the law of diminishing marginal returns.
More interestingly, this pattern is reproduced locally but with a clear modulation: a topographic analysis has also been made considering the distribution of incoming and outgoing causalities at each electrode. In figure 9 we show the distributions of incoming and outgoing connections corresponding to the electrode locations on the scalp, and in figure 10 the corresponding map of the parameter R; the law of diminishing marginal returns seems to affect mostly the temporal regions. This well-defined pattern suggests a functional role for the distributions. It is worth noting that this pattern has been reproduced in other resting-state EEG data from 9 healthy subjects, collected for another study with different equipment.

Discussion
In this work we have pointed out that the pattern of information flow among the variables of a complex system is the result of the interplay between the topology of the underlying network and the capacity of nodes to handle the incoming information. Implementing two simple toy models on different network structures, we have shown that they may exhibit the law of diminishing marginal returns for a suitable choice of parameters: the presence of nodes with different in-degree is a fundamental ingredient for these phenomena. Our simulations for these two models are restricted to rather small networks, due to the computational burden. However, to address this issue we have also proposed an Ising model on a sparse network, which can be exactly solved in the limit of an infinite number of nodes. The analysis of a real EEG data set has shown that similar patterns exist for brain signals and could have a specific functional role. We remark that the distribution of in-degree in resting-state fMRI directed networks has been observed to fit an exponentially truncated power law [33]; in the same study the architecture of directed networks was presented as a complement to similar work performed on anatomical and functional connectivity. Apart from fMRI, there is an increasing interest in investigating resting-state networks from EEG recordings [34]. The findings of our study could then represent an additional feature to consider in these networks.
The study of information flow mechanisms is crucial in brain research, and effective methods to mine the information flow pattern from data have recently been introduced. Interesting contributions towards a better understanding of communication in the brain have also recently been provided [35]. Our results may thus be relevant for a better characterization of the topology of brain networks. In general, evidence of the law of diminishing marginal returns is related to the presence of units which are close to receiving the maximal amount of information that they can process. A similar interpretation may apply in neuroscience. Indeed, the brain is an expensive part of the human body, and the organization of brain networks can be explained by a parsimonious drive; it has been proposed that connectome organization corresponds to a trade-off between minimizing costs and the emergence of functional connectivity between multiple neural populations [22]. This economical principle in brain networks may also be connected to the presence, under particular circumstances, of brain units receiving the maximal amount of input information. Such situations will display evidence of the law of diminishing marginal returns and should be highlighted by the proposed analysis.
We should also mention that there are other measures of directed brain connectivity, such as the Directed Transfer Function, Partial Directed Coherence and the Phase Slope Index, for which the interpretation in terms of information flow is still debated [36]. On the other hand, we verified that a significant discrepancy between the distributions of incoming and outgoing connectivities also holds for these methods. Furthermore, bivariate measures do not display this asymmetry between the distributions of c_in and c_out; this is not surprising, since it is well known that bivariate causality also accounts for indirect interactions (see, e.g., [17]). Here we limited ourselves to linear information flow; the amount of nonlinear information transmission and its functional roles are not clear [37]. It will be interesting to investigate these phenomena in the nonlinear case as well.