Mapping the Voxel-Wise Effective Connectome in Resting State fMRI

A network approach to brain and dynamics opens new perspectives towards understanding of its function. The functional connectivity from functional MRI recordings in humans is widely explored at large scale, and recently also at the voxel level. The networks of dynamical directed connections are far less investigated, in particular at the voxel level. To reconstruct full brain effective connectivity network and study its topological organization, we present a novel approach to multivariate Granger causality which integrates information theory and the architecture of the dynamical network to efficiently select a limited number of variables. The proposed method aggregates conditional information sets according to community organization, allowing to perform Granger causality analysis avoiding redundancy and overfitting even for high-dimensional and short datasets, such as time series from individual voxels in fMRI. We for the first time depicted the voxel-wise hubs of incoming and outgoing information, called Granger causality density (GCD), as a complement to previous repertoire of functional and anatomical connectomes. Analogies with these networks have been presented in most part of default mode network; while differences suggested differences in the specific measure of centrality. Our findings could open the way to a new description of global organization and information influence of brain function. With this approach is thus feasible to study the architecture of directed networks at the voxel level and individuating hubs by investigation of degree, betweenness and clustering coefficient.


Introduction
Resting-state functional magnetic resonance imaging (rs-fMRI) is increasingly being used to investigate brain dynamics [1]. The dynamical integration between brain areas, evidencing neuronal communications beyond the underlying anatomical structure, is investigated by functional and effective connectivity. Functional connectivity (FC) measures statistical dependencies of time-series between distinct units; while effective connectivity (EC) investigates the influence one neuronal system exerts over another, by means of predictive models [2]. The former has been comprehensively described and integrated in the functional connectome of the human brain [3]. Nevertheless, only a few studies have investigated the large-scale directed influence brain network based on EC [4,5], though not yet at the voxel level.
Once that the architecture of a neural network is known, it is possible to identify its functional hubs and critical nodes, determining preferred pathways of neuronal communication and estimating the controllability of a system [6], or to use the graph structure as a decoding tool for brain states [7]. A graphtheoretical approach to whole brain functional connectivity, based on the count of the number of functional connections per voxel (edges in graph) has been successfully applied [8][9][10][11][12][13][14][15] allowing to identify the distribution of functional hubs. Prominent functional hubs were identified in the default mode network as well as in dorsal, parietal and prefrontal regions.
A significant advance in the understanding of brain function could come from the investigation of directed networks of information transfer, such as those based on effective connectivity. The models on which effective connectivity is based can either be physiologically motivated, such as dynamical causal models, or purely data-driven such as in Granger causality (GC) analysis (for an extensive review see [16]). GC [17], which evaluates whether the prediction error on one variable is significantly reduced by including another variable in the autoregressive (AR) model, has been used to identify the effective connectivity of blood-oxygenlevel-dependent (BOLD) fMRI signals [18][19][20]. It is worth to note that the application of GC to fMRI is controversial [21,22], especially for resting-state fMRI [16]. Nonetheless, the analogies and differences between network architectures of functional connectivity and GC-based effective connectivity have been investigated [5,23,24]. Those studies are based on coarse-grained parcellations from anatomically based brain atlases. Little is known on the functional hubs in voxel-wise EC network. The main issue arising when applying Granger causality to high dimensional networks, such as voxel time series from the whole brain, is the curse of dimensionality in the conditioning variables.
To cope with redundancy and dimensionality issues in evaluating multivariate GC, it has recently been proposed [25] that conditioning on a small number of variables, chosen as the most informative ones for each given driver, can be enough to recover a network of effective connectivity eliminating spurious influences in particular when the connectivity pattern is sparse. We refer to this approach as the partially conditioned GC (PCGC). Another issue related with the recovery of EC networks from BOLD signal is the possibly confounding effect of the hemodynamic response. In order to decouple the neuronal activity and the hemodynamic responses, we applied a blind deconvolution procedure, based on the detection of pseudo-events, to the BOLD signal [26].

Subjects and Data Acquisition
The resting-state fMRI dataset used in this study has been publicly released under the '1000 Functional Connectomes Project' (http://fcon_1000.projects.nitrc.org, accessed March 2012).and has been collected at the State Key Laboratory of Cognitive Neuroscience and Learning at Beijing Normal University (n = 197, 122 females; age: 21.261.8 years). All participants had no history of neurological and psychiatric disorders. Written informed consent was obtained from each participant, and the study was approved by the local Institutional Review Board. During the resting state, participants were instructed to keep still with their eyes closed but not to fall asleep, remaining as motionless as possible. The fMRI images were acquired by using single-shot gradient echo planar imaging (EPI) sequence (repetition time (TR): 2000 ms; echo time: 30 ms; axial slices: 33; thickness: 3 mm; inter-slice gap: 0.6 mm; field of view: 2006200 mm 2 ; in-plane resolution: 64664; flip angle: 90u). For each subject, a total of 225-volumes were acquired, resulting in a total scan time of 450 s.

Data Preprocessing
Preprocessing of resting-state images was performed using SPM8: data underwent slice timing correction, realigning of all the images to the first image using six degrees of freedom rigid body transformations, spatial normalization into the Montreal Neurological Institute template then resampling to 3-mm isotropic voxels, and spatial smoothing using a 6-mm full-width halfmaximum Gaussian kernel. Recently, small head movements have been identified as an important confounding factor for resting state fMRI studies [27][28][29]. To limit the impact of micro-movements artifacts on these data, we implemented a 'scrubbing' procedure as part of data preprocessing. An estimate of head motion at each time point was calculated as the frame-wise displacement (FD) (mean absolute FD across all subject = 0.10460.045 mm), using six displacements from rigid body motion correction procedure mentioned above [27]. Following [30], any image with FD.0.5 mm was removed and replaced by a linear interpolation.
Additional parameters were used to remove possible spurious variances from the data through linear regression. These were 1) six head motion parameters obtained in the realigning step, 2) signal from a region in cerebrospinal fluid, 3) signal from a region centered in the white matter, 4) global signal averaged over the whole brain. Time series were linearly detrended and temporally band-pass filtered (0.01-0.08 Hz). We then generated a studyspecific functional volume mask that included only voxels present in all participants.

Spontaneous Point Event Detection and HRF Deconvolution
Previous studies have shown that the hemodynamic processes are inhomogeneous across the whole brain [31]; in order to maximally eliminate the effect of hemodynamic response which may disturb the inference of temporal precedence [32], we employed a blind deconvolution technique developed for restingstate BOLD-fMRI signal [26], starting from the idea that the resting-state BOLD spikes are due to spontaneous point events, based on the increasing evidence of non-random patterns of BOLD spike that govern the dynamics of the brain at rest [33][34][35]. These spontaneous events can be detected by point process analysis (PPA), picking up BOLD fluctuations of relatively large amplitude [36,37]. After detecting these resting-state BOLD transients, the neural event onsets are stored for further HRF reconstruction. Voxel-specific HRF is obtained by fitting raw BOLD signal with canonical HRF and its time derivative, in order to finally recover signals at the neural level by Wiener deconvolution (Matlab code is available at http://users.ugent. be/,dmarinaz/code.html) [38].

Partially Conditioned Granger Causality
Partially conditioned Granger causality (PCGC) was originally proposed in [25] as a technique able to compute the GC conditioned to a small number of variables in the framework of information theory. The idea is that conditioning on a small number of the most informative variables for the candidate driver variable is sufficient to remove indirect interactions especially for sparse connectivity patterns. Here we briefly report the foundations of the approach.
Let's consider n covariance-stationary variables fx i (t)g i~1,ÁÁÁ,n ; the state vectors, representing the past realizations up to a lag q are denoted as X i (t)~x i (t{q), Á Á Á ,x i (t{1) ½ . The multivariate Granger causality from variable b to variable a is defined as the logarithm of the ratio of e(x a DX ), the mean squared error prediction of x a on the basis of all the vectors X , and e(x a DX \X b ), the mean squared error prediction of x a on the basis of the past of all variables but b. What was proposed is a reduction of the number of variables to be included in the conditioning dataset.
The PCGC index PCGC(b?a)is defined as follows: where Z~fX i 1 , Á Á Á ,X in d gis a set of the n d variables, in X\X b , In order to choose the first variable of the subset, the mutual information between the candidate driver variable and each of the other variables is estimated; the second variable of the subset is selected among the remaining ones, as those that, jointly with the previously chosen variable, maximize the mutual information with the driver variable. Then, one keeps adding the rest of the variables by iterating this procedure. This is repeated until the addition of another variable does not result in a substantial information gain.
The model order for PCGC analysis can be chosen by standard methods such as the Akaike information criterion, the Bayesian information Criterion or leave-one-out cross validation. In the following analysis we set q~1, as in other fMRI studies [18]. From now on we refer to this data-driven method as PCGC d .
The statistical significance of Granger causality value was estimated under the null hypothesis of zero influence, with a standard F-test on the restricted and unrestricted AR model [39].
In order to cope with extra-large data sets, such as voxel-wise fMRI data, an additional strategy to reduce the number of conditioning variables is in order. In this study it is proposed to make use of the community structure of the data. This procedure, indicated as PCGC t , exploits a hierarchical partition, at two resolutions, of the brain signal. It consists of the following steps: [1]. Considering each potential driver voxel b, the whole ensemble of voxels (excluding b) S is divided into N systems: S 1 ,S 2 Á Á Á ,S N , such as the signal for the N systems is obtained aggregating voxels inside each system S k resulting in Z S~f Z Z S1 , Á Á Á , Z Z SN g. [2]. Each system is further partitioned into subsystems S k1 , Á Á Á ,S kd , such that now the signal within the subsystems of S k is given by i the mean signal of the variables X j belonging to the subsystem S ki . [ This strategy is justified by the following assumptions: Let us consider PCGC t in the restricted and unrestricted regression models: where and I S j i are the index of S, system S j and subsystem S j i respectively; and Y j~½ Y 1 j , Á Á Á ,Y d j . For voxel-wise analysis, excluding the special case in which Z is a small subset containing all the informative variables, the observation is always much smaller than the number of predictors in Eq. 2, resulting in a singular matrix in the computation of the regression coefficients. Moreover, predictors will also face a high degree of multicollinearity (predictors too are redundant). As a consequence estimation of regression coefficients in CGC may change erratically in response to small changes in the data.
According to our algorithm, the coefficients of X ji (j i [I Sj ,j=g) will have the same given weight; different weights will be assigned to the coefficients of where 6 denotes the Kronecker product and e~1, Á Á Á ,1 ½ [< 1|t , t is changed according to the dimension of Y k g and Y j . So, in the proposed algorithm, even if we only consider a few conditioning variables Z~fZ s \ Z Z S g ,Z Sg g, we are potentially taking into account all the information needed to partial out possible indirect causal influences, and avoiding multicollinearity in regression analysis models.
In order to achieve effectiveness and feasibility of the proposed scheme, the predictors should be reasonably aggregated into groups, ensuring that they contribute with approximately equal weights to the dependent variable. Since the construction of a pairwise correlation matrix will yield indications on the likelihood that predictor variables are multicollinear/redundant, we can group the predictors after detecting community structure from the correlation matrix. We then average the predictors which contain the redundant information about the dependent variable to avoid overfitting in regression analysis model. Considering that spatially connected voxels will most likely display similar BOLD signal, we can find community structure on a coarse resolution under the local mean-field assumption.

Detection of the Conditioning Dataset
Community detection. In order to reduce the dimensionality of the set Z of variables to include in the conditional analysis we explored its community distribution.
First, the preprocessed functional images were parcellated into 90 (45 for each hemisphere) non-cerebellar anatomical regions of interest (ROIs) using automated anatomical labeling (AAL) template [40]. This parcellation scheme is referred to as AAL-90. Considering that the range of nodal scale and the difference in template parcellations may affect the results of community detection [41], we also used a high-resolution parcellation scheme with 512 and 1024 micro ROIs [42,43]. Specifically, we generated smaller ROIs of approximately identical size across both hemispheres by subdividing each region of the low-resolution AAL-90 template into a set of sub-regions. These parcellation schemes are referred to as AAL-512 and AAL-1024. The studyspecific functional volume mask was superposed to the AAL-90/ 512/1024 templates.
Then, the time series from each ROI i and j were used to calculate the pairwise Pearson correlation matrix R = (r ij ) for each subject. This matrix was averaged across all subjects and its community structure was explored. As negative weights play a controversial role in network organization [44], for this study the absolute values of the averaged matrix were considered. The Louvain algorithm for modularity detection was run 10 4 times, and the solution producing the highest Q was selected as the representative modular partition, where modularity Q was defined as [45]: Where k i~P j r ij , c i is the community to which vertex i is assigned, d(c i ,c j ) is the Kronecker delta, and m~1 2 P ij r ij . According to the PCGC t algorithm, these large modules were further divided into smaller sub-modules according to the strategy described above.
Statistical analysis of Z. Following the identification of modules from the mean correlation matrix (see Fig. 1), further analysis was performed on the distributions of Z according to modular structure. The distributions of the first n d variables (obtained from greedy algorithm) in the partitioned module are reported in Fig. 2 in which it's evident that the most informative variables for each candidate driver come mainly from the same partition, but also, with no major differences in proportion, from the other modules.
Effect of including the driver variable in Z. The formulation of PCGC t requires that the driver variable b is excluded before partitioning the system. While this step is absolutely necessary at large scale, when working with time series from individual voxels one can suppose that the results will not be dramatically affected since its effect will be most likely averaged out. Including the driver variable is computationally very advantageous, saving time in the partition step.
To validate this hypothesis, we propose a test to evaluate how the presence of the candidate driver variable affects the result of the voxel-wise PCGC t analysis. Firstly, the correlation between the average signal Z Z S k i of subsystem S ki and its individual voxels is computed (see the distribution of these values in Fig. S1). Then, for every subsystem, a driver voxel b yielding the maximum value of r is chosen, and PCGC t is computed including it in the subsystem Z. This modified approach is called PCGC ti .  Seed-based Granger Causality As a representative example, medial prefrontal cortex (mPFC, MNI coordinates [0, 52,26] with sphere 6mm diameter, see Fig. 3) was used as the seed ROI. This choice is motivated by the evidence of it being a hub sending out information in default mode network [46]. The set Z of conditioning variables was chosen on AAL-1024 template. Causal interaction was investigated by mapping the influence from the source to voxels in the rest of the brain. Indirect influences will be misleadingly considered as direct in the traditional pairwise GC analysis, which was computed for comparison and validation.

Voxel-wise Granger Causality
To construct the voxel-wise Granger causality network, the PCGC conditioning variables Z were individuated using the AAL-1024 based community structure. Specifically, the time series for each voxel were extracted from the HRF-deconvolved rs-fMRI data to calculate a PCGC matrix G~(a ij ),1ƒi,jƒN (N is the number of voxels), where a ij is the GC value between the iand jth voxels. A visualization of group level voxel-wise directed graph resconstructed by PCGC is reported in Fig. 4. Considering that the graph G is directed, all topological properties were calculated on both incoming and outgoing matrix. Graph theoretical analyses were carried out on the EC network using the MatlabBGL package (https://code.launchpad.net/matlab-bgl).

Centrality Indices
Degree centrality (DC) is the sum of the weights of edges connected to a node, i.e. DC(i)~P j a ij . Nodes with high DC can be considered as hubs for information integration.
Betweenness centrality (BC) is a measure based on shortest paths, widely used in complex network analysis. Nodes with high BC are important in managing the flow of information in the graph due to the fact that they have a high probability to occur on a randomly chosen shortest path between two randomly chosen nodes.
Clustering coefficient (CC) is defined as the number of connections among the neighbors of a particular node. It reflects the local efficiency of information transfer in the graph. A high CC along with a small characteristic path length indicates ''smallworld'' architecture, reflecting regional hubs with long-distance connections and high clustering within each of them.
Normalized nodal parameters. We calculated the normalized nodal parameters as in the following formula [47]: where p node (i,k) is an integrated nodal parameter (BC, CC and DC) of node i in the network of subject k, M is the number of networks included in the analysis(M = 197) and N is the number of nodes.
Identification of hubs. The hubs for each node in the brain network were identified according to the following criteria: (1) Node i is a BC-hub if BC norm (i) .mean+SD. (2) Node i is a CChub if CC norm (i) .mean+SD. (3) Node i is a DC-hub if DC norm (i) .mean+SD. To each node was assigned a score between 0 and 3, determined by the total number of hub criteria fulfilled. Voxels showing a hub-score of 2 or 3 (i.e. which were designed hubs for at least two measures) were marked as hub nodes.

Validations: Simulated Data
The reliability of PCGC t was validated using simulated data. A benchmark dataset was created based on the following AR(1) model: where j are i.i.d. unit variance Gaussian variables. By construction, y?g,g?c and m?u. A system of 6k time series, where k = 10 or 20 was constructed as follows. For i~1, Á Á Á ,k: where r and c are i.i.d. Gaussian variables, r are zero mean and unit variance, c is generated from a Gaussian distribution with mean 0.3 and variance 0.3. Note that the first k variables share the same information corresponding to y (Module 1), whilst the second k variables share the information corresponding to g (Module 2). The variables x i (Module 3), with i~2kz1, Á Á Á ,3k, form a group of variables with correlations at equal times, similarly to the group of variables with i~3kz1, Á Á Á ,4k (Module 4) and i~4kz1, Á Á Á ,5k (Module 5). The variables x i , with i~5kz1, Á Á Á ,6k (Module 6), correspond to pure noise. We generated a data set of 5000 time points (in order to get robust statistical significance in the next analysis). Then we evaluated the element-wise GC/PCGC for all pairs of maps. We repeated the simulation 100 times with random values of j, r and c to generate a null distribution; Wilcoxon signed rank test was employed to

Results
Seed-based and voxel-wise Granger causality were evaluated. In the latter case, conditioning variables were obtained after partitioning the data in high-resolution functional connectivity communities. We further report the centrality analyses based on binary directed influence network at voxel-level.
The conditional variables Z were detected in functional connectomes of different spatial scale, constructed using AAL-90, AAL-512 and AAL-1024 templates. On average 6 communities were detected in each functional connectome (Fig. 1). These results are consistent with previous findings [48,49]. Further analysis was performed on the distribution of the variables in Z across the modules. The distribution of Z n d (n d = 10, Fig. 2) according to the partitioned community organization shows that the highest fraction of the predictors in Z n d come from the same module of the driver variable, and contributions from other modules are relatively equally distributed.

Seed-based Granger Causality Mapping
The reproducibility of directed influence from mPFC (seed-tovoxel causality mapping) across all subjects is shown in Fig. 3. The reproducibility is given by the number of subject which showed a significant F value, divided by the total number of subjects, for a given voxel. The outgoing information values retrieved with pairwise GC and PCGC t were relatively consistent (r = 0.43). Compared to pairwise GC, the PCGC t displayed higher reproducibility in medial frontal gyrus, superior frontal gyrus, inferior frontal gyrus, middle temporal gyrus, anterior prefrontal cortex, anterior cingulate, dorsolateral prefrontal cortex, posterior cingulate, precuneus and lower in occipital lobe, cuneus under the same statistical significancep [10 {3 .
Moreover, the first 10 most informative voxels for mPFC are shown in Fig. S2, with size proportional to their reproducibility across all subjects. It can be observed that these voxels are distributed not only in proximity of the zone of interest but across the brain, consistently with findings reported for Z derived when voxel time series were averaged according to AAL-90/512/1024 templates [50].
Concerning the effect of including the driver voxel in Z, we found that PCGC ti is highly correlated with PCGC t (minimum correlation 0.993 across all subsystems and subjects, Fig. S3), especially for the statistical significant values, thus indicating that this approximate step has a negligible influence on the accuracy of the method.

Voxel-wise Granger Causality Network
In Fig. 4 the voxel-wise PCGC t network is represented using a network layout at a ij ]0:3 for each subject. This network of directed information is divided in modules which are then mapped on the brain. For a better visualization, only nodes with degree .11 are reported in the figure. The purple cluster, containing the posterior regions of the default mode network is intensely interconnected to other modules. In particular it appears to send directed information to the anterior regions (pink cluster) rather than receiving, providing additional details to previous results on the directionality of information flow in the default mode network [46]. The salmon cluster, containing the thalamus and the putamen, does not have strong connections to the other modules. These results are consistent with those reported in [51], in which it was shown that all midline cortical rich-club nodes (i.e., bilateral precuneus, superior frontal, superior parietal) are connector hubs, playing an important role in between-module connectivity, while subcortical rich-club regions (bilateral thalamus, putamen) play an important role in module structure.
Considering that the graph we focused on is directed, each node's incoming degree and outgoing degree must also be considered separately [24]. Incoming degree and outgoing degree represent the total number of connections incoming to a node and outgoing from the same node, respectively [52].
Here only binary graph results with fixed threshold a ij ]0:3 and a minimum cluster size of 27 contiguous voxels were reported. The spatial distributions of the weighted graphs are similar (Fig. S4). Based on normalized nodal parameters, some consistent regions are identified as hubs (voxel hub-score of 2 or 3) at the same time in the incoming and outgoing directed influence network (Fig. 5 These results are in line with previous reports studying brain anatomical, functional connectivity networks [15,53].
Some regions were consistently identified as hubs of outgoing directed influence: superior temporal gyrus, postcentral gyrus, middle occipital gyrus, transverse temporal gyrus, precentral gyrus, and primary auditory cortex.

GCD vs. FCD
In addition, we compared DC in voxel-wise Granger causality network versus voxel-wise functional connectivity network. For voxel-wise functional connectome, DC was referred to as global FC density (FCD) in previous studies [13]. The FCD map (binary graph at fixed significant threshold p[10 {6 ) is consistent with previous functional connectivity studies [15,54]. The incoming and outgoing GCD maps (binary graph results with fixed significant threshold p [10 {6 ) are shown in Fig. 6. The regions showing high DC both for EC (Incoming/Outgoing) and FC are located in middle frontal gyrus, superior frontal gyrus, dorsal frontal cortex, superior temporal gyrus, angular gyrus, supramarginal gyrus, dorsal posterior cingulate cortex, anterior prefrontal cortex, primary auditory cortex, precuneus, insula, posterior cingulate cortex; most of them are part of the DMN system.

Simulated Validations
We simulated data according to Eq.2 with k = 10, 20. The resulting modules when k = 20 are reported in Fig. 7 (similar results are obtained with k = 10). Pairwise GC and PCGC analysis were performed with model order equal to 1. PCGC t and PCGC d all successfully revealed the ground truth in both cases, while pairwise GC detected false positives from Module 2 to Module 1, and from Module 1 to Module 3. The n d = 10 for PCGC d analysis is determined by the knee of the curve of the information gain when an additional variable is used for conditioning (Fig. S5 right).

Discussion
Large-scale integration of information across brain regions is investigated by both functional and anatomical connectomes. In this study, to extend human brain connectomic repertoire, we first constructed the effective connectivity network using voxel-wise Granger causality on resting-state fMRI data. To cope with dimensionality issues for voxel-wise Granger causality and to decouple the neuronal activity and hemodynamic responses of resting-state fMRI, we proposed the partially conditioned Granger causality (PCGC) and blind deconvolution using the spontaneous events detected in BOLD signal. The convergence and divergence of hub regions between functional and effective connectivity network were documented.

Directed Network Centrality Mapping
Specific network centrality measures have been primarily focused on the identification of the human brain hubs at regional [9,55] and voxel level [13,15,54,[56][57][58]. Brain hubs take a central position in a network and play a crucial role in fast transfer and efficient integration of information across the human connectome [3]. In this study, hubs of directed brain network were generally identified by high levels of degree centrality, betweenness centrality, and clustering coefficient [3,58]. As an addition to previous findings in structural and functional connectomes, here for the first time the voxel-wise centrality-based characteristics of information flow in the human brain directed network was reported. Some regions have been found to be consistently hubs across various modalities (e.g., fMRI vs. DTI) and different dynamical connectivity approaches (FC vs. EC), such as posterior  cingulate cortex, precuneus, medial prefrontal cortex, lateral parietal and temporal cortex, insula. Also, some regions displayed remarkable differences (e.g., cuneus), due to the specific measure of centrality [15], the parcellation scale [41] and brain connectivity definition [53] employed. Nonetheless, our findings suggest that higher order cortical association regions acted as pivotal incoming or outgoing hubs, maintaining information flow even in resting state.
Although pivotal hubs have already been found within single resting-state network [46], among multiple networks [5,19], and even in large-scale whole brain network [4,24], uncovering voxelwise centrality hubs on directed networks is particularly challenging. Efficient algorithms to estimate voxel-wise centralities are still under development [59], while computation of the intermediate directed connectivity matrix (,10 9 elements) involves accuracy and efficiency problems. In the present work, we proposed a novel approach, PCGC t , to remove indirect interactions in large multivariate datasets.

Partial Conditioning Technique
It has been recently proposed [25] that partial conditioning on a small number of the most informative variables for the driver node is sufficient to obtain a reliable estimate of the directed connectivity, especially when the pattern of causalities is sparse. This approach not only allows a much faster calculation of Granger causality matrix, but also a more accurate one, where a fully multivariate approach would incur in curse of dimensionality and in underestimation of influences due to the presence of redundancy. Anatomical studies have shown that axonal connectivity of the cortex is generally sparse [53], functional connectivity studies have been shown that the human brain is a highly clustered and redundancy complex system. Furthermore, the information gain plots reflect that the most informative variables for driver node were confined to small number of nodes or components. These evidences provide the idea to construct voxel-wise EC network by uses of partial conditioning technique.
In a recent study we have shown that the relative information gain (and thus the number of variables to condition on) is not affected by the time between successive scans (TR) [26], even though data with shorter TR contain more absolute information.
Here we further examined how template size affects the information gain [60]. However, with lower scale template (such as AAL 512 and AAL 1024, and in general when the number of variables is larger than the number of samples), the residual redundancy will prevent a further decrease of the information gain after a local minimum. On the other hand, when Z is built from the aggregated signal according to community structure, this phenomenon disappears (Fig. S5).
The statistical analysis of Z n d provides the evidence that the most informative variables for the candidate driver mostly come from the community to which it belongs and are uniformly distributed within the rest of communities. This may give an additional explanation for the number of variables for which the curve of the information gain shows a knee, corresponding to the case in which relevant information is picked across all the communities. The joint information collected from the information gain curve, and the sensitivity and specificity of the greedy searching approach, one can choose the most convenient number of variables to include in the conditioning dataset. In the present study we set n d = 10.
PCGC d method is similar to LASSO based full-brain AR model [61][62][63], only including a few variables to predict the other ones. Compared to PCGC d , PCGC t uses all the information from the conditional variables, and a proportional distribution of weight values for conditional variable in AR model are fixed a priori according to the community parcellation results.

Methodological Considerations and Limitations
On average 6.7 min/subject were required to complete a network, running on Windows 7 (64 bit), Processor: Intel(R) Core(TM) i5-2400 CPU @ 3.10GHz, Installed memory (RAM): In the simulated model, we did not consider the effect of time series length. We only chose a fixed value of the data length which ensured a robust significant causal inference. In addition, the simulated is not meant to reproduce complex brain activity, it is rather a controlled benchmark to be used for a proof of concept.
Community structure revealed by grouping the first 10 most informative contribution regions across all subjects at large scale parcellation (AAL-90) shows that there is a well distributed spatial organization of the set of conditioning variables Z [50]. Based on the above evidence, the distribution the variables in Z was further explored in the current study, and community organization derived from correlation matrix was reported as stable across three parcellations with increasing spatial resolution (AAL-90, AAL-512, AAL-1024), but it still remains to be validated how the performance of PCGC t is affected by inter-and intra-subject variability of the community structure [64].
It is also worth to note that apart from directed connectivity, the problem of conditional dependencies affects as well correlationbased undirected (functional) connectivity, and a generalization of the approach proposed here to the latter case could be in order, and straightforward.
Here we reported the findings based on binary network, such as FCD. However, given that weighted networks contain information about connection strength that reflects heterogeneity in capacity and intensity of connections, these latter could be more indicated for brain connectome representation. For a cross-validation of our results, we additionally used Granger causality strength to identify brain hubs based on weighted effective connectivity network (see Figs. S4, S6 and S7). These results are in accordance with the ones described in the main text.
Finally, for cross-validation of threshold selection, we used additional thresholds to evaluate the stability of the hubs organization in the effective networks (see Figs. S6 and S7), obtaining a general consistence across all the values.
To summarize, we proposed a an approach to perform partially conditioned Granger causality rooted in information theory and graph-theory analysis, coupled to a blind deconvolution technique based on point process analysis to reconstruct the voxel-wise effective connectome of the human brain. We put in evidence for the first time the voxel-wise hubs of incoming and outgoing information, as a complement to previous results on functional and anatomical connectomes. Analogies and differences with these networks have been presented and discussed. Our findings could open the way to a new description of global organization and information influence of brain function in terms of the Granger causality density. Figure S1 Distribution of Pearson correlation r between each voxel and the mean signal of its community (according to the community structure retrieved from AAL-1024). (TIF) Figure S2 Spatial distribution of the n d = 10 most informative voxels for seed region mPFC (MNI coordi-nate: [0 52 26], 6mm-diameter sphere, blue). The size and color of the sphere denote the relative frequency with which a given voxel was selected. (TIF) Figure S3 Log-log plot of PCGC ti and PCGC t . Inset, linear plot. (TIF) Figure S4 The spatial distribution of hub voxels of the weighted graph obtained keeping all the weights higher than a threshold of 0.3, with their value, and setting the rest to zero. In the top sagittal views, red indicates the incoming network hubs, blue the outgoing network hubs, while green the common hubs of incoming and outgoing network. Concerning the axial views, 1-3 rd (5-7 th ) rows indicate the BC/CC/DC incoming network hubs. In 4 th (8 th ) row, yellow indicates incoming (outgoing) regions which are hubs for one measure (hub-score of 1), red indicates incoming (outgoing) regions which are hubs for two measures (hub-score of 2), while green indicates regions which are hubs for all three measures (hub-score of 3). The last row indicates the regions that are at the same time hubs for incoming and outgoing network with hub score of at least 2.

Supporting Information
(TIF) Figure S5 The mutual information gain (Dy), when the (n d +1)-th variable is included, is plotted versus n d . The information gain is averaged over all the variables. Left: the conditioning set Z nd is calculated from the raw signal extracted from AAL-90/512/1024 template; Top right: Z n d is calculated on the signal extracted from each community; Right: curves for the simulated dataset; (TIF) Figure S6